Arabic Letter shapes

Source publication

Multiclass Classification of Unconstrained Handwritten Arabic Words Using Machine Learning Approaches

Article

Full-text available

Sep 2009

In this paper, we propose and describe efficient multiclass classification and recognition of unconstrained handwritten Arabic words using machine learning approaches which include the K-nearest neighbor (K-NN) clustering, and the neural network (NN). The technical details are presented in terms of three stages, namely preprocessing, feature extrac...

Context 1

... the difference here is that each of the 28 letters of the Arabic alphabet has either two or four shapes depending on its position in the written text, and the whole text is written from right to left in a cursive way. Each letter may have up to four different shapes according to its position in the word, i.e. at the start, in the middle, at the end, or alone [1], as shown in Table 1. For example, the letter Ayn (ع) has the following shapes: start (عـ), middle (ـعـ), end (ـع), and alone (ع). ...
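The positional shapes described in this excerpt can be inspected through Unicode, which encodes them as separate characters in the Arabic Presentation Forms-B block. The following is a minimal sketch, not taken from the source publication; the helper name `positional_forms` is hypothetical, and the character names follow the Unicode naming pattern "ARABIC LETTER <NAME> <POSITION> FORM".

```python
import unicodedata

# The four possible positions of a letter within a word.
POSITIONS = ("ISOLATED", "INITIAL", "MEDIAL", "FINAL")


def positional_forms(letter_name):
    """Return the presentation forms Unicode defines for one Arabic letter.

    letter_name is the Unicode name of the letter, e.g. "AIN" or "DAL".
    (Hypothetical helper, for illustration only.)
    """
    forms = {}
    for position in POSITIONS:
        name = f"ARABIC LETTER {letter_name} {position} FORM"
        try:
            forms[position] = unicodedata.lookup(name)
        except KeyError:
            # Letters that do not join to the following letter
            # have no initial or medial form.
            pass
    return forms


ayn = positional_forms("AIN")   # four shapes
dal = positional_forms("DAL")   # two shapes

for position, glyph in ayn.items():
    # Compatibility normalization folds every shape back to the base letter.
    base = unicodedata.normalize("NFKC", glyph)
    print(position, glyph, "->", base)
```

Ayn yields four entries and Dal only two, which matches the "two or four shapes" statement in the excerpt.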

A novel minimal Arabic script for preparing databases and benchmarks for Arabic text recognition research

Article

Full-text available

Characterization of ancient document images composed by Arabic and Latin scripts

Article

Full-text available

Apr 2011

In this paper we characterize Arabic and Latin ancient document images. The main criticism of existing works is that most of them address only the characterization of Latin historical documents, and so far few methods can discriminate between old document images in these different languages. Regions of images hav...

A Lexicon Based System with Multiple HMMs to Recognise Typewritten and Handwritten Arabic Words

Article

Full-text available

Mohammad Khorsheed

A new method to recognise words in Arabic handwritten manuscript is presented. The method injects the spectral features extracted from an input word image to a group of previously trained word models. Each word model is a single hidden Markov model. The likelihood probability of the input pattern is calculated against each model and the pattern is...

The framework of the classification on the encrypted database

The case study of the classification framework on the encrypted database

Sample of encrypted and non-encrypted images over MNIST database. a...

Sample of encrypted and non-encrypted images over NIST19 database. a...

Encrypted image classification based on multilayer extreme learning machine

Article

Full-text available

Jul 2017

Nowadays, numerous corporations (such as Google, Baidu, etc.) require an efficient and effective search algorithm to crawl out the images with queried objects from databases. Moreover, privacy protection is a significant issue such that confidential images must be encrypted in corporations. Nevertheless, decrypting and then classifying millions of...

Sample of handwritten and printed Arabic text

Sample of a printed text at different video frames

Ayn letter (ع) at different locations in the word

Arabic Optical Character Recognition: A Review

Article

Full-text available

Nov 2022

Salah Alghyaline

This study aims to review the latest contributions in Arabic Optical Character Recognition (OCR) during the last decade, which helps interested researchers know the existing techniques and extend or adapt them accordingly. The study describes the characteristics of the Arabic language, different types of OCR systems, different stages of the Arabic...

Structural and Statistical Feature Extraction Methodology for the Recognition of Handwritten Arabic Words

Chapter

Dec 2018

Knowledge concerning the topography of Arabic letters, as well as the structural characteristics between background regions and character components is investigated as a novel approach for Arabic recognition. The suggested feature extraction method reduces the classifier input data to only the most significant and essential.

Combination of multiple classifiers for off-line handwritten arabic word recognition

Article

Full-text available

Sep 2017

This study investigates the combination of different classifiers to improve Arabic handwritten word recognition. Features based on Discrete Cosine Transform (DCT) and Histogram of Oriented Gradients (HOG) are computed to represent the handwritten words. The dimensionality of the HOG features is reduced by applying Principal Component Analysis (PCA). Each set of features is separately fed to two different classifiers, Support Vector Machine (SVM) and Fuzzy K-Nearest Neighbor (FKNN), giving a total of four independent classifiers. A set of different fusion rules is applied to combine the outputs of the classifiers. Evaluation of the proposed scheme on the IFN/ENIT database of Arabic handwritten words reveals that combining the classifiers results in improved recognition rates which, in some cases, outperform state-of-the-art recognition systems.
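The abstract does not say which fusion rules were used. As a rough illustration of the general idea only, the sketch below applies three common fixed rules (sum, product, max) to per-class scores from four classifiers. The function name `fuse` and the score values are invented for the example and are not results from the paper.

```python
from math import prod


def fuse(posteriors, rule="sum"):
    """Combine per-class scores from several classifiers into one decision.

    posteriors: one row per classifier, one column per class.
    Returns the index of the winning class.
    """
    combine = {"sum": sum, "product": prod, "max": max}[rule]
    # zip(*rows) regroups the scores by class instead of by classifier.
    scores = [combine(column) for column in zip(*posteriors)]
    return scores.index(max(scores))


# Made-up scores for 4 classifiers (e.g. 2 feature sets x 2 classifiers)
# over 3 candidate word classes.
scores = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.6, 0.1],
    [0.5, 0.4, 0.1],
]

for rule in ("sum", "product", "max"):
    print(rule, "->", fuse(scores, rule))
```

In this toy case all three rules pick class 1 even though two of the four classifiers individually prefer class 0.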

A Novel Two-Stage Spectrum-Based Approach for Dimensionality Reduction: A Case Study on the Recognition of Handwritten Numerals

Article

Full-text available

May 2014
J Appl Math

Dimensionality reduction (feature selection) is an important step in pattern recognition systems. Although there are different conventional approaches for feature selection, such as Principal Component Analysis, Random Projection, and Linear Discriminant Analysis, selecting optimal, effective, and robust features is usually a difficult task. In this paper, a new two-stage approach for dimensionality reduction is proposed. This method is based on one-dimensional and two-dimensional spectrum diagrams of standard deviation and minimum-to-maximum distributions for the initial feature vector elements. The proposed algorithm is validated in an OCR application, using two large standard benchmark handwritten OCR datasets, MNIST and Hoda. Initially, a 133-element feature vector was selected from the most commonly used features proposed in the literature. The size of the initial feature vector was then reduced from 100% to 59.40% (79 elements) for the MNIST dataset and to 43.61% (58 elements) for the Hoda dataset. Meanwhile, the accuracies of the OCR systems improved by 2.95% for the MNIST dataset and 4.71% for the Hoda dataset. The achieved results show an improvement in the precision of the system in comparison to the rival approaches, Principal Component Analysis and Random Projection. The proposed technique can also be useful for generating decision rules in a pattern recognition system using rule-based classifiers.
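The retained-size percentages quoted in this abstract follow directly from the element counts it gives. A short check, with a hypothetical helper name `retained_percent`:

```python
def retained_percent(kept, initial):
    """Share of the initial feature vector that survives the reduction, in %."""
    return 100 * kept / initial


INITIAL_SIZE = 133  # elements in the initial feature vector (from the abstract)

for dataset, kept in {"MNIST": 79, "Hoda": 58}.items():
    print(f"{dataset}: {kept}/{INITIAL_SIZE} = "
          f"{retained_percent(kept, INITIAL_SIZE):.2f}%")
```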

Printed Persian Subword Recognition Using Wavelet Packet Descriptors

Article

Full-text available

Jan 2013

In this paper, we present a new approach to offline OCR (optical character recognition) for printed Persian subwords using the wavelet packet transform. The proposed algorithm is used to extract font-invariant and size-invariant features from 87804 subwords in 4 fonts and 3 sizes. The feature vectors are compressed using PCA. The obtained feature vectors yield a pictorial dictionary in which each entry is the mean of a group consisting of the same subword in 4 fonts and 3 sizes. These feature sets are combined with the dot features for the recognition of printed Persian subwords. To evaluate the feature extraction results, the algorithm was tested on a set of 2000 subwords in printed Persian text documents. An encouraging recognition rate of 97.9% is obtained at the subword level.

Improved Arabic Word Classification using Spatial Pyramid Matching Method

Conference Paper

Full-text available

Nov 2011

In recent years, rapidly developing handwritten word recognition techniques have attracted researchers' attention to Arabic word classification. The Arabic language has a cursive style of writing, so it needs a special framework for classification. In this paper, a precise framework for Arabic word classification is presented, which uses sparse coding with the spatial pyramid matching (SPM) algorithm and a linear support vector machine classifier. SPM maps each feature set to a multi-resolution histogram that preserves the individual features at the finest level. The histogram pyramids are then compared using a weighted histogram intersection algorithm. Our proposed framework is evaluated on four publicly available datasets: IFN/ENIT, PATS-A01, IFHCDB and ISI Bangla numeral. Experimental results show that the proposed framework outperforms state-of-the-art methods used for Arabic word classification. Keywords: Arabic Character Recognition; linear support vector machine (LSVM); Spatial Pyramid Matching (SPM); Scale invariant feature transform (SIFT).
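The weighted histogram intersection mentioned in this abstract can be sketched as follows. The abstract does not give the weights; the sketch assumes the weighting commonly used in spatial pyramid matching, where level 0 gets weight 1/2^L and level l ≥ 1 gets 1/2^(L-l+1). The function names and toy histograms are illustrative, and each level is flattened to a single list for brevity.

```python
def intersection(h1, h2):
    """Histogram intersection: total mass the two histograms share."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def pyramid_match(pyramid1, pyramid2):
    """Weighted sum of per-level intersections, coarse (level 0) to fine.

    Assumed weights: 1/2^L at level 0 and 1/2^(L - level + 1) above it,
    so matches found at finer resolutions count more.
    """
    top = len(pyramid1) - 1  # index of the finest level, L
    total = 0.0
    for level, (h1, h2) in enumerate(zip(pyramid1, pyramid2)):
        weight = 1 / 2 ** top if level == 0 else 1 / 2 ** (top - level + 1)
        total += weight * intersection(h1, h2)
    return total


# Toy two-level pyramids: one global bin, then two spatial cells.
word_a = [[4], [2, 2]]
word_b = [[4], [3, 1]]
print(pyramid_match(word_a, word_b))
```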

Enhancement of Moment Invariants calculation for Arabic Handwriting recognition

Article

Jun 2011

Moment Invariants (MI) have frequently been used as features for shape recognition. These features are invariant to several deformations such as rotation, scaling and translation. However, they are sensitive to distortions that primarily affect the 'centre of gravity' of the image. Images of an Arabic word might have different centroids because the word might be written in different handwriting styles. In this paper we examine the effect of replacing the image centroid with the center of the image as the reference point in Moment Invariants (MI). The new descriptor set was tested on recognizing Arabic words from the IFN/ENIT database, which consists of 26459 words written by 411 different writers. A Back Propagation Neural Network was used as the classifier. Experimental results show that using the new descriptors increases the average recognition accuracy by 18.38%.
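To make the change of reference point concrete, the sketch below computes a moment of the form mu_pq = Σ (x - x0)^p (y - y0)^q I(x, y) about either the centroid or the geometric center of the image. It covers only this basic building block, not the paper's descriptor set; the function names and the toy image are assumptions made for illustration.

```python
def moment_about(img, p, q, ref):
    """Moment of order (p, q) taken about the reference point ref = (x0, y0)."""
    x0, y0 = ref
    return sum(
        (x - x0) ** p * (y - y0) ** q * value
        for y, row in enumerate(img)
        for x, value in enumerate(row)
    )


def centroid(img):
    """Centre of gravity of the pixel intensities."""
    m00 = sum(map(sum, img))
    m10 = sum(x * v for row in img for x, v in enumerate(row))
    m01 = sum(y * v for y, row in enumerate(img) for v in row)
    return m10 / m00, m01 / m00


def image_center(img):
    """Geometric centre of the pixel grid, independent of the ink."""
    return (len(img[0]) - 1) / 2, (len(img) - 1) / 2


# Toy binary image with all the ink in the right half.
img = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]

print(centroid(img), image_center(img))
print(moment_about(img, 1, 0, centroid(img)))      # vanishes by construction
print(moment_about(img, 1, 0, image_center(img)))  # reflects the ink offset
```

The first-order moment about the centroid is always zero, while the same moment about the image center records where the ink sits in the frame.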

Offline Automatic Segmentation based Recognition of Handwritten Arabic Words

Article

Full-text available

Jan 2011

The world heritage of handwritten Arabic documents is huge; however, only manual indexing and retrieval techniques for the content of these documents are available. To facilitate automatic retrieval of such handwritten Arabic documents, a number of automatic recognition systems for handwritten Arabic words have been proposed. Nevertheless, these systems suffer from low recognition accuracy due to the peculiarities of the handwritten Arabic language. Thus, in this paper we propose a segmentation-based recognition system for handwritten Arabic words. We divide a handwritten word into smaller pieces, and these pieces are then segmented into candidate letters. The candidate letters are converted into their corresponding chain-code representation. Thereafter we extract discrete, statistical and structural features for classification. Additionally, we introduce a novel active-contour-based feature to increase the recognition accuracy of strongly deformed Arabic letters. We also use a decision tree to reduce the number of potential classes. We then use a neural network to compute weights for all statistical features and use them as input for a k-NN classifier. Our experiments show that the features extracted by our technique achieve higher recognition accuracy than other features.
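The abstract does not specify its chain-code convention. As an illustration only, the sketch below assumes the widely used 8-direction Freeman code (0 = east, numbered counter-clockwise, with the image y axis pointing down) and encodes an ordered list of contour points. The names `DIRECTIONS` and `chain_code` are hypothetical.

```python
# Step (dx, dy) between neighbouring contour pixels -> Freeman code.
# Image coordinates are assumed: x grows to the right, y grows downward.
DIRECTIONS = {
    (1, 0): 0,    # east
    (1, -1): 1,   # north-east
    (0, -1): 2,   # north
    (-1, -1): 3,  # north-west
    (-1, 0): 4,   # west
    (-1, 1): 5,   # south-west
    (0, 1): 6,    # south
    (1, 1): 7,    # south-east
}


def chain_code(points):
    """Encode an ordered sequence of 8-connected pixels as direction codes."""
    return [
        DIRECTIONS[(x2 - x1, y2 - y1)]
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]


# A short stroke: right, down-right, down, then left.
stroke = [(0, 0), (1, 0), (2, 1), (2, 2), (1, 2)]
print(chain_code(stroke))
```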

Off-line handwritten arabic words segmentation based on structural features and connected components analysis

Article

Full-text available

Jan 2011

Precise and efficient segmentation of handwritten Arabic text is a vital prerequisite for the accuracy of the subsequent recognition phase. In this paper, we present a dual-phase segmentation approach. The proposed approach starts by detecting and resolving overlapping sub-words; then a segmentation based on topological features is applied by means of a set of heuristic rules. Because of its crucial importance, the segmentation phase is preceded by a handwriting-specific preprocessing phase that handles issues such as word skew and slant correction. The proposed approach has been successfully tested on a database of handwritten Arabic words that contains more than 3000 word images. The results are very promising and indicate the efficiency of our approach.

Automatic recognition of handwritten Arabic using maximally stable extremal region features

Article

Jan 2020
OPT ENG
