
Study of Different Features on Handwritten Devnagari Character

January 2010

DOI:10.1109/ICETET.2009.215

Source
IEEE Xplore

Conference: Emerging Trends in Engineering and Technology (ICETET), 2009 2nd International Conference on

Authors:

Sandhya Arora

Cummins College of Engineering for Women

Debotosh Bhattacharjee

Jadavpur University

Mita Nasipuri

Jadavpur University




Abstract: In this paper a scheme for offline handwritten Devnagari character recognition is proposed, which uses different feature extraction and recognition algorithms. The proposed system assumes no constraints on writing style, size or variations. First the character is preprocessed, and three kinds of features, namely chain code histogram, four side views and shadow based features, are extracted and fed to multilayer perceptrons (MLPs) as a preliminary recognition step. Finally the results of all MLPs are combined using a weighted majority scheme. The proposed system is tested on a database of 1500 handwritten Devnagari characters collected from different people. It is observed that the proposed system achieves a recognition rate of 98.16% for the top 5 results and 89.58% for the top 1 result.

Keywords: Classification, Multilayer Perceptron, Feature Extraction, Weighted Majority Scheme

I. INTRODUCTION

Although the first research report on handwritten Devnagari characters was published in 1977 [4], not much research work was done after that. Researchers have recently started to work on handwritten Devnagari characters again, and a few research reports have been published. Hanmandlu and Murthy [5, 14] proposed a fuzzy model based recognition of handwritten Hindi numerals and characters and obtained 92.67% accuracy for handwritten Devnagari numerals and 90.65% accuracy for handwritten Devnagari characters. Bajaj et al. [6] employed three different kinds of features, namely density features, moment features and descriptive component features, for classification of Devnagari numerals. They proposed a multi-classifier connectionist architecture for increasing the recognition reliability and obtained 89.6% accuracy for handwritten Devnagari numerals. Kumar and Singh [7] proposed a Zernike moment feature based approach for Devnagari handwritten character recognition, using an artificial neural network for classification. Sethi and Chatterjee [8] proposed a decision tree based approach for recognition of constrained hand printed Devnagari characters using primitive features. Bhattacharya et al. [9] proposed a Multi-Layer Perceptron (MLP) neural network based classification approach for the recognition of Devnagari handwritten numerals and obtained 91.28% accuracy. N. Sharma and U. Pal [1] proposed a quadratic classifier based on directional chain code features and obtained 80.36% accuracy for handwritten Devnagari characters and 98.86% accuracy for handwritten Devnagari numerals. In most of the works reported above, multiple classifier combination has not been reported for handwritten Devnagari characters; most of them are based on a single classifier or address handwritten Devnagari numerals only. In this paper we present the results of various feature extraction techniques applied to handwritten Devnagari characters. The features are evaluated individually using MLP classifiers, and the results of all MLPs are then combined using a weighted majority scheme.

Our feature set is obtained from chain code histogram, shadow and view based features. Chain code histogram features are extracted from the scaled contour of the image. Shadow features are extracted from the scaled image, and view based features are extracted from the scaled and thinned character image. These features are then fed to the multilayer perceptrons for recognition.

The rest of the paper is organized as follows. Section 2 discusses the peculiarities of Devnagari script. Feature extraction techniques are reported in section 3. Section 4 deals with the classifiers used for recognition. The experimental results are discussed in section 5.

II. PECULIARITIES OF DEVNAGARI SCRIPT

Devnagari script differs from Roman script in several ways. The script has a two-dimensional composition of symbols: core characters in the middle strip, with optional modifiers above and/or below the core characters. Two characters may be in the shadow of each other. While line segments (strokes) are the predominant features of English, most characters in Devnagari script are formed by curves and holes as well as strokes. Devnagari script has no concept of upper-case and lower-case characters. However, the alphabet itself contains more symbols than that of English.

Devnagari script has around 14 vowels and 33 consonants, resulting in a total of 47 or even more basic characters. Vowels occur either in isolation or in combination with consonants. Apart from the vowels and consonants, called basic characters, there are compound characters in the Devnagari alphabet, which are formed by combining two or more basic characters. The shape of a compound character is usually more complex than that of its constituent basic characters. In addition, each of the 33 consonants in Devnagari script has more than twelve forms, whose modified shapes depend on whether the vowel is placed to the left, right, top or bottom of the character. These are called modified characters. The net result is that there are several thousand different shapes or patterns, which may in addition be connected with each other without any visible separation. This makes Devnagari OCR more difficult to develop.

S. Arora1, D. Bhattacharjee2, M. Nasipuri2, D.K. Basu2, M. Kundu2, L. Malik3
1Meghnad Saha Institute of Technology, Kolkata-107, India
Email: sandhyabhagat@yahoo.com
2Department of Computer Science and Engg, Jadavpur University, Kolkata, India
3G.H. Raisoni College of Engineering, Nagpur, India

Figure 1: Samples of handwritten Devnagari (a) vowels, (b) consonants, (c) compound characters

III. FEATURE EXTRACTION

In the following we give a brief description of the feature sets used in our proposed system. Chain code histogram features are extracted by chain coding the contour points of the scaled character bitmap image. View based features are extracted from the scaled, thinned, one pixel wide skeleton of the character image. Shadow features are extracted from the scaled character image.

A. Shadow Features of character

For computing shadow features [13], the rectangular boundary enclosing the character image is divided into eight octants. For each octant, the shadow of the character segment is computed on two perpendicular sides, so a total of 24 shadow features are obtained. A shadow is basically the length of the projection on a side, as shown in figure 2. These features are computed on the scaled image.

Figure 2. Shadow features
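The notion of a shadow as a projection length can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the authors' implementation: it works on one rectangular segment given as a 2-D NumPy array with nonzero ink pixels, it does not build the octant partition, and it normalizes each length by the side length (the normalization is an assumption). The function name `shadow_lengths` is hypothetical.

```python
import numpy as np

def shadow_lengths(segment):
    """Return (horizontal, vertical) shadow lengths of a binary segment.

    Each value is the fraction of the corresponding side that is covered
    by the projection of the ink pixels onto that side.
    """
    seg = np.asarray(segment, dtype=bool)
    height, width = seg.shape
    # A column casts a shadow on the horizontal side if it contains any ink.
    horizontal = np.count_nonzero(seg.any(axis=0)) / width
    # A row casts a shadow on the vertical side if it contains any ink.
    vertical = np.count_nonzero(seg.any(axis=1)) / height
    return horizontal, vertical

segment = np.array([[0, 0, 0, 0],
                    [0, 1, 1, 0],
                    [0, 1, 0, 0],
                    [0, 0, 0, 0]])
print(shadow_lengths(segment))  # (0.5, 0.5)
```

In a full feature extractor the same projection would be applied to each octant of the bounding rectangle and the resulting lengths concatenated into the shadow feature vector.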

B. Chain Code Histogram of Character Contour

Given a scaled binary image, we first find the contour points of the character image. We consider a 3 × 3 window around each object point of the image. If any of the 4-connected neighbor points is a background point, then the object point (P), as shown in figure 3, is considered a contour point.

Figure 3. Contour point detection
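The contour point test can be sketched as follows. This is a minimal illustration rather than the authors' code. It assumes the character is a 2-D NumPy array with nonzero object pixels, and it treats pixels outside the image as background, which the text does not specify. The function name `contour_points` is hypothetical.

```python
import numpy as np

def contour_points(image):
    """Return a boolean mask of object pixels with a 4-connected background neighbor."""
    img = np.asarray(image, dtype=bool)
    # Pad with background so that border pixels also have four neighbors
    # (assumption: everything outside the image is background).
    padded = np.pad(img, 1, constant_values=False)
    up = padded[:-2, 1:-1]
    down = padded[2:, 1:-1]
    left = padded[1:-1, :-2]
    right = padded[1:-1, 2:]
    # True where at least one of the four neighbors is background.
    has_background_neighbor = ~(up & down & left & right)
    return img & has_background_neighbor

block = np.zeros((5, 5), dtype=int)
block[1:4, 1:4] = 1                      # a filled 3 x 3 square
mask = contour_points(block)
print(int(mask.sum()))                   # 8: every square pixel except the center
```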

The contour following procedure uses a contour representation called "chain coding", proposed by Freeman [15] and shown in figure 4a. Each pixel of the contour is assigned a code that indicates the direction of the next pixel belonging to the contour. The chain code gives the position of the points relative to one another, independent of the coordinate system. By chain coding the connected neighboring contour pixels, both the points and the outline coding are captured. The contour following procedure may proceed in a clockwise or counterclockwise direction. Here, we have chosen to proceed in a clockwise direction.

Figure 4. Chain coding: (a) direction of connectivity, (b) 4-connectivity, the next-in-line pixel

The chain code for the character contour yields a smooth, unbroken curve as it grows along the perimeter of the character and completely encompasses it. When there is multiple connectivity in the character, there can be multiple chain codes to represent the contour of the character. We chose to move with the minimum chain code number first.

We divide the contour image into 5 × 5 blocks. In each of these blocks, the frequency of each direction code is computed and a histogram of chain codes is prepared. Thus for 5 × 5 blocks we get 5 × 5 × 8 = 200 features for recognition.
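The block-wise histogram can be sketched as below. This is a simplified illustration under several assumptions, not the authors' procedure: instead of following the contour clockwise, each contour pixel is given the lowest-numbered direction code that leads to another contour pixel, and the direction numbering in `DIRECTIONS` is an assumed Freeman-style convention because figure 4a is not reproduced here. The name `chain_code_histogram` is hypothetical.

```python
import numpy as np

# Assumed 8-direction numbering as (row, col) offsets: code 0 points east and
# the codes advance counterclockwise. The paper's own numbering may differ.
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def chain_code_histogram(contour, grid=5):
    """Return grid * grid * 8 direction counts for a boolean contour mask."""
    contour = np.asarray(contour, dtype=bool)
    height, width = contour.shape
    features = np.zeros((grid, grid, 8), dtype=int)
    for r, c in zip(*np.nonzero(contour)):
        for code, (dr, dc) in enumerate(DIRECTIONS):
            nr, nc = r + dr, c + dc
            if 0 <= nr < height and 0 <= nc < width and contour[nr, nc]:
                # Count the lowest-numbered direction that reaches a contour
                # pixel, in the block that contains the current pixel.
                features[r * grid // height, c * grid // width, code] += 1
                break
    return features.ravel()

outline = np.zeros((10, 10), dtype=bool)
outline[2:8, 2:8] = True
outline[3:7, 3:7] = False                # keep only the one pixel wide border
features = chain_code_histogram(outline)
print(features.shape)                    # (200,)
```

A true contour follower would visit the pixels in order and record the direction of each step; the simplification above keeps the example short while producing a feature vector of the same size.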

C. View based features

This method is based on the fact that, for correct character recognition, a human usually needs only partial information about the character: its shape and contour. This feature extraction method examines four "views" of each character and extracts from them a characteristic vector that describes the given character. A view is a set of points that plot one of four projections of the object (top, bottom, left and right). It consists of pixels belonging to the contour of the character and having extreme values of one of their coordinates. For example, the top view of a letter is the set of points having the maximal y coordinate for a given x coordinate. Next, characteristic points are marked out on the surface of each view to describe the shape of that view (figure 5). The method of selecting these points and their number may vary from one letter to another. In the considered examples, eleven uniformly distributed characteristic points are taken for each view.

Figure 5. Selecting characteristic points for four views

The next step is calculating the y coordinates of the points on the top and bottom views, and the x coordinates of the points on the left and right views. These quantities are normalized so that their values lie in the range [0, 1]. From the 44 values obtained, the characteristic vector is created to describe the given character; it is the basis for further analysis and classification.
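A possible way to compute the 44 view values is sketched below. It is an illustration under assumptions, not the authors' implementation: the character is a 2-D NumPy array with nonzero ink pixels, each view value is the distance from the image border to the first ink pixel divided by the image size, and a line without ink is given the value 1. Array rows grow downward, so the top view uses the smallest row index. The name `view_features` is hypothetical.

```python
import numpy as np

def view_features(image, points_per_view=11):
    """Return 4 * points_per_view normalized view values (top, bottom, left, right)."""
    img = np.asarray(image, dtype=bool)
    height, width = img.shape
    # Uniformly spaced sample positions along each axis.
    cols = np.linspace(0, width - 1, points_per_view).round().astype(int)
    rows = np.linspace(0, height - 1, points_per_view).round().astype(int)

    def first_ink(line):
        # Index of the first ink pixel along the line, or its length if empty.
        hits = np.flatnonzero(line)
        return hits[0] if hits.size else len(line)

    top = [first_ink(img[:, c]) / height for c in cols]
    bottom = [first_ink(img[::-1, c]) / height for c in cols]
    left = [first_ink(img[r, :]) / width for r in rows]
    right = [first_ink(img[r, ::-1]) / width for r in rows]
    return np.array(top + bottom + left + right, dtype=float)

glyph = np.zeros((20, 20), dtype=bool)
glyph[5:15, 5:15] = True                 # toy character: a filled square
vector = view_features(glyph)
print(vector.shape)                      # (44,)
```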

IV. CHARACTER RECOGNITION

We used three different MLPs, each with 3 layers including one hidden layer, for the three feature sets consisting of 200 chain code histogram features, 24 shadow features and 44 view based features. The experimental results obtained while using these features for recognition of handwritten Devnagari characters are presented in section 5. At this stage all characters are non-compound, single characters, so no segmentation is required.

Each MLP is trained with the backpropagation learning algorithm with momentum [9]. It minimizes the sum of squared errors for the training samples by conducting a gradient descent search in the weight space. The sigmoid function is used as the activation function. The learning rate and momentum term are set to 0.8 and 0.7 respectively. The numbers of neurons in the input layers of the MLPs are 200, 24 and 44 for chain code histogram, shadow and view based features respectively. The number of neurons in the hidden layer is not fixed; we experimented with values between 20 and 50 to get the optimal result, and it was finally set to 50, 30 and 40 for chain code histogram, shadow and view based features respectively. The output layer contains one node for each class, so the number of neurons in the output layer is 20.
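The training rule described above can be sketched in a few lines of NumPy. This is a minimal sketch, not the authors' implementation. It assumes full-batch updates, a bias unit in each layer, uniform initial weights in [-0.5, 0.5] and a factor of 1/2 in the squared error, none of which is stated in the text. The names `sigmoid` and `train_mlp` are hypothetical; the learning rate 0.8 and momentum 0.7 are the values given in the text.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_mlp(inputs, targets, hidden=50, lr=0.8, momentum=0.7, epochs=500, seed=0):
    """Train a one-hidden-layer sigmoid MLP; return (weights, loss per epoch)."""
    rng = np.random.default_rng(seed)
    n_in, n_out = inputs.shape[1], targets.shape[1]
    w1 = rng.uniform(-0.5, 0.5, (n_in + 1, hidden))    # last row holds the biases
    w2 = rng.uniform(-0.5, 0.5, (hidden + 1, n_out))
    dw1, dw2 = np.zeros_like(w1), np.zeros_like(w2)
    ones = np.ones((inputs.shape[0], 1))
    x = np.hstack([inputs, ones])
    losses = []
    for _ in range(epochs):
        h = np.hstack([sigmoid(x @ w1), ones])         # hidden activations plus bias
        out = sigmoid(h @ w2)
        err = out - targets
        losses.append(0.5 * np.sum(err ** 2))          # sum of squared errors
        delta_out = err * out * (1.0 - out)
        delta_hid = (delta_out @ w2[:-1].T) * h[:, :-1] * (1.0 - h[:, :-1])
        # Momentum rule: step = -lr * gradient + momentum * previous step.
        dw2 = -lr * (h.T @ delta_out) + momentum * dw2
        dw1 = -lr * (x.T @ delta_hid) + momentum * dw1
        w2 += dw2
        w1 += dw1
    return (w1, w2), losses

# Toy problem (logical OR with one-hot targets), only to exercise the code.
toy_inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
toy_targets = np.array([[1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
(w1, w2), losses = train_mlp(toy_inputs, toy_targets, hidden=4)
print(w1.shape, w2.shape)                # (3, 4) (5, 2)
```

For the networks in the text, `inputs` would have 200, 24 or 44 columns and `targets` would be one-hot vectors of length 20.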

A. Classifier Combination

The ultimate goal of designing a pattern recognition system is to achieve the best possible classification performance. This objective traditionally led to the development of different classification schemes for any pattern recognition problem to be solved. The results of an experimental assessment of the different designs would then be the basis for choosing one of the classifiers as the final solution to the problem. It has been observed in such design studies that, although one of the designs would yield the best performance, the sets of patterns misclassified by the different classifiers would not necessarily overlap. This suggests that different classifier designs potentially offer complementary information about the pattern to be classified, which can be harnessed to improve the performance of the selected classifier. So instead of relying on a single decision making scheme, we can combine classifiers.

We have three neural network classifiers as discussed above, which are trained on 200 chain code, 24 shadow and 44 view based features respectively. The outputs are confidences associated with each class. As these outputs cannot be compared directly, we used an aggregation function for combining the results of all three classifiers. Our strategy is based on a weighted majority voting scheme as described below.

If the decision of the kth classifier to assign the unknown pattern to the ith class is denoted by $O_{ik}$, with $1 \le i \le m$ and m being the number of classes, then the final combined decision $d_i^{com}$ supporting assignment to the ith class takes the form

$$ d_i^{com} = \sum_{k=1}^{3} \omega_k \, O_{ik}, \qquad 1 \le i \le m $$

The final decision $d^{com}$ is therefore

$$ d^{com} = \max_{1 \le i \le m} d_i^{com} $$

with the weights

$$ \omega_k = \frac{d_k}{\sum_{j=1}^{3} d_j} $$

where m = 20 and $\omega_1$, $\omega_2$ and $\omega_3$ are 0.384, 0.354 and 0.262 respectively, as $d_1 > d_2 > d_3$:

$d_1$ = 88.19%, result of the classifier trained with chain code histogram features
$d_2$ = 81.25%, result of the classifier trained with shadow features
$d_3$ = 60.07%, result of the classifier trained with view based features
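The weights quoted above follow from normalizing the three individual accuracies by their sum, and the combination itself is a weighted sum of the per-class confidences. The sketch below illustrates both steps; the function name `combine` and the three-class toy confidences are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

# Individual accuracies of the chain code, shadow and view based MLPs (from the text).
accuracies = np.array([88.19, 81.25, 60.07])
weights = accuracies / accuracies.sum()
print(np.round(weights, 3))              # [0.384 0.354 0.262]

def combine(confidences, weights):
    """Return (winning class index, combined scores) for one pattern.

    confidences has shape (number of classifiers, number of classes).
    """
    confidences = np.asarray(confidences, dtype=float)
    combined = weights @ confidences     # weighted sum over the classifiers
    return int(np.argmax(combined)), combined

# Hypothetical confidences for a toy problem with 3 classes.
toy_confidences = [[0.6, 0.3, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.1, 0.8, 0.1]]
label, combined = combine(toy_confidences, weights)
print(label)                             # 1
```

In the toy example the first classifier alone would choose class 0, but the weighted sum favors class 1 because the other two classifiers agree on it.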

V. RESULTS

The experimental evaluation of the above technique was carried out using isolated Devnagari characters collected from different people. A total of 1500 samples of Devnagari basic characters (vowels as well as consonants) are used for our experiment, of which 65% are used for training and the rest for testing. The recognition accuracies obtained separately from the classifiers discussed above are shown in table I. Three MLPs are designed, one each for chain code histogram based, four side views based and shadow based features. The results of the three MLPs are combined using the weighted majority scheme discussed above. The combined MLP gives 98.61% accuracy when the top 5 choices are considered.
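A top-k result counts a sample as correct when its true class is among the k classes with the highest combined score. A minimal sketch of that measure follows; the function name `top_k_accuracy` and the toy scores are assumptions made for illustration.

```python
import numpy as np

def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    scores = np.asarray(scores, dtype=float)   # shape (samples, classes)
    labels = np.asarray(labels)
    # Class indices sorted by descending score, truncated to the first k.
    top_k = np.argsort(-scores, axis=1)[:, :k]
    hits = (top_k == labels[:, None]).any(axis=1)
    return hits.mean()

toy_scores = [[0.1, 0.7, 0.2],
              [0.5, 0.3, 0.2],
              [0.2, 0.3, 0.5]]
toy_labels = [1, 1, 0]
for k in (1, 2, 3):
    print(k, top_k_accuracy(toy_scores, toy_labels, k))
```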

We applied 3-fold cross validation. We divided the whole dataset into three parts. In the first fold, the first two parts are used for training and the third part for testing. In the second fold, the first and third parts are used for training and the second part for testing. In the third fold, the second and third parts are used for training and the first part for testing. The average error across all three trials is computed. The advantage of this method is that it matters less how the data gets divided: every data point is in the test set exactly once and in the training set the remaining times. We compared our results with those of existing work. Detailed comparative results are given in table III.
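The three folds can be generated as in the sketch below. It is an illustration only: shuffling the sample indices before splitting and the name `three_fold_splits` are assumptions, since the text does not say how the three parts were formed.

```python
import numpy as np

def three_fold_splits(n_samples, seed=0):
    """Yield (train indices, test indices) for the three folds."""
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(n_samples), 3)
    # Fold 1 tests on part 3, fold 2 on part 2, fold 3 on part 1.
    for test_part in (2, 1, 0):
        train = np.concatenate([parts[i] for i in range(3) if i != test_part])
        yield train, parts[test_part]

folds = list(three_fold_splits(1500))
for train, test in folds:
    print(len(train), len(test))         # 1000 500 for every fold
```

The per-fold error of a classifier trained on `train` and evaluated on `test` would then be averaged over the three folds.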

Table I. Results of three different MLPs

MLP                                  Input layer neurons   Hidden layer neurons   Output layer neurons   Result
Chain code histogram feature based   200                   50                     20                     88.19%
Shadow features based                32                    15                     20                     81.25%
View based feature based             44                    30                     20                     60.07%

Table II. Top choices results

No.   Proposed method result   Accuracy obtained
1     Top 1 choice             89.58%
2     Top 2 choices            94.79%
3     Top 3 choices            97.57%
4     Top 4 choices            98.26%
5     Top 5 choices            98.61%

Table III. Comparison of results

No.   Method proposed by                                 Accuracy
1.    Kumar and Singh [7]                                80%
2.    N. Sharma, U. Pal, F. Kimura, and S. Pal [1]       80.36%
3.    M. Hanmandlu, O.V. R. Murthy, V.K. Madasu [14]     90.65%
5.    Proposed method                                    98.61%

VI. CONCLUSION

India is a multi-lingual and multi-script country comprising eleven different scripts. Devnagari is the third most widely used script; it is used for several major languages such as Hindi, Sanskrit, Marathi and Nepali, and by more than 500 million people. However, not much work has been done on off-line handwriting recognition of Devnagari script. In this paper we present a technique for recognition of offline handwritten Devnagari characters using MLPs. In future we plan to experiment with other feature extraction methods to get higher recognition accuracy from our system.

ACKNOWLEDGMENT

The authors are thankful to the "Centre for Microprocessor Application for Training Education and Research" and the "Project on Storage Retrieval and Understanding of Video for Multimedia" at the Department of Computer Science and Engineering, Jadavpur University, Kolkata-700032, for providing the necessary facilities for carrying out this work. The first author gratefully acknowledges the support of the Meghnad Saha Institute of Technology for carrying out this research work.

REFERENCES

[1] N. Sharma, U. Pal, F. Kimura, and S. Pal, "Recognition of Off-Line Handwritten Devnagari Characters Using Quadratic Classifier", ICVGIP 2006, LNCS 4338, pp. 805-816, 2006.

[2] U. Pal and B.B. Chaudhuri, "Indian script character recognition: A Survey", Pattern Recognition, Vol. 37, pp. 1887-1899, 2004.

[3] B.B. Chaudhuri and U. Pal, "A complete printed Bangla OCR system", Pattern Recognition, Vol. 31, pp. 531-549, 1998.

[4] I.K. Sethi and B. Chatterjee, "Machine Recognition of constrained Hand printed Devnagari", Pattern Recognition, Vol. 9, pp. 69-75, 1977.

[5] M. Hanmandlu and O.V. Ramana Murthy, "Fuzzy Model Based Recognition of Handwritten Hindi Numerals", Intl. Conf. on Cognition and Recognition, pp. 490-496, 2005.

[6] Reena Bajaj, Lipika Dey, and S. Chaudhury, "Devnagari numeral recognition by combining decision of multiple connectionist classifiers", Sadhana, Vol. 27, Part 1, pp. 59-72, 2002.

[7] Satish Kumar and Chandan Singh, "A Study of Zernike Moments and its use in Devnagari Handwritten Character Recognition", Intl. Conf. on Cognition and Recognition, pp. 514-520, 2005.

[8] I.K. Sethi and B. Chatterjee, "Machine Recognition of constrained Hand printed Devnagari", Pattern Recognition, Vol. 9, pp. 69-75, 1977.

[9] U. Bhattacharya, B.B. Chaudhuri, R. Ghosh and M. Ghosh, "On Recognition of Handwritten Devnagari Numerals", In Proc. of the Workshop on Learning Algorithms for Pattern Recognition (in conjunction with the 18th Australian Joint Conference on Artificial Intelligence), Sydney, pp. 1-7, 2005.

[10] S. Arora, D. Bhattacharjee, M. Nasipuri, L. Malik, "Classification Of Gradient Change Features Using MLP for Handwritten Character Recognition", Emerging Applications of Information Technology (EAIT), Kolkata, India, 2006.

[11] S. Arora, D. Bhattacharjee, M. Nasipuri, L. Malik, "A Novel Approach for Handwritten Devnagari Character Recognition", International Conference on Signal and Image Processing (ICSIP), Hubli, Karnataka, India, 2006.

[12] S. Arora, D. Bhattacharjee, M. Nasipuri, L. Malik, "A Two Stage Classification Approach for Handwritten Devanagari Characters", International Conference on Computational Intelligence and Multimedia Application (ICCIMA07), Sivkasi, Tamil Nadu, India, 2007.

[13] S. Basu, N. Das, R. Sarkar, M. Kundu, M. Nasipuri, D.K. Basu, "Handwritten Bangla alphabet recognition using MLP based classifier", NCCPB, Bangladesh, 2005.

[14] M. Hanmandlu, O.V. Ramana Murthy, Vamsi Krishna Madasu, "Fuzzy Model based recognition of Handwritten Hindi characters", IEEE Computer Society, Digital Image Computing Techniques and Applications, 2007.

[15] H. Freeman, "On the Encoding of Arbitrary Geometric Configurations", IRE Trans. on Electr. Comp. or TC(10), No. 2, June 1961, pp. 260-268.

933

Rotation Invariance in Transform Features for Handwritten Devanagiri Character Recognition

Conference Paper

Oct 2018

ASPECTS OF ENGINEERING AND TECHNOLOGY IN HUMAN LIFE (ISBN NO: 978-81-947377-7-3) Saliha Publications

Chapter

Full-text available

Dec 2020

Natural language processing is a branch of computer science and artificial intelligence which is concerned with interaction between computers and human languages. Natural language processing is the study of mathematical and computational modelling of various aspects of language and the development of a wide range of systems. These include the spoken language systems that integrate speech and natural language. Natural language processing has a role in computer science because many aspects of the field deal with linguistic features of computation. Natural language processing is an area of research and application that explores how computers can be used to understand and manipulates natural language text or speech to do useful things. The applications of Natural language processing include fields of study, such as machine translation, natural language text processing and summarization, user interfaces, multilingual and cross language information retrieval (CLIR), speech recognition, artificial intelligence (AI) and expert systems.

Handwritten Devanagari Character Recognition Using Modified Lenet and Alexnet Convolution Neural Networks

Article

Full-text available

Jan 2022
WIRELESS PERS COMMUN

Despite many advances, Handwritten Devanagari Character Recognition (HDCR) remains unsolved due to the presence of complex characters. For HDCR, the traditional feature extraction and classification techniques are limited to the datasets developed in the respective laboratory that are not available publicly. A standard benchmarking dataset is not available for HDCR that helps to develop deep learning models. To progress the performance of HDCR, in this study, we produced a dataset of 38,750 images of Devanagari numerals, and vowels are generated and made publicly available for fellow researchers in this domain. This data is collected from more than 3000 subjects of different age groups. Each character is extracted by a segmentation technique proposed here, which is limited to this application. Experiments are conducted on the dataset; three different Convolution Neural Networks (CNN) architecture is developed. 1. CNN, 2. Modified Lenet CNN (MLCNN) and 3. Alexnet CNN (ACNN). A Modified LCNN is proposed by changing the architecture of Lenet 5 CNN. Regular Lenet 5 has \(\mathrm{tanh}(x)\) as its activation function. Since the Devangari characters are nonlinear, non-linearity is introduced in the Networks by using Rectified Linear Unit. This solves the problem of vanishing gradient problem by \(\mathrm{tanh}(x)\). We achieved a recognition rate of 96% on training data and 94% on unseen data using CNN. MLCNN obtained an accuracy rate of 99% and 94% with less computational cost. Whereas, ACNN attained a recognition rate of 99% and 98% on unseen data. A series of experiments were conducted on the data with different combination splits of data and found a minimum loss of 0.001%. Such developments fill a significant percentage of the huge gap between real-world requirements and the actual performance of Devanagari recognizers.

Handwritten Hindi Character Recognition Using Layer-Wise Training of Deep Convolutional Neural Networks

Article

Oct 2020

Manually written character acknowledgment is as of now getting the consideration of scientists in view of potential applications in helping innovation for dazzle and outwardly hindered clients, human–robot collaboration, programmed information passage for business reports, and so on. In this work, we propose a strategy to perceive transcribed Devanagari characters utilizing profound convolutional neural organizations (DCNN) which are one of the ongoing procedures embraced from the profound learning network. We tested the ISIDCHAR information base gave by (Information Sharing Index) ISI, Kolkata and V2DMDCHAR information base with six distinct structures of DCNN to assess the exhibition and furthermore research the utilization of six as of late created versatile inclination strategies. A layer-wise method of DCNN has been utilized that assisted with accomplishing the most noteworthy acknowledgment exactness and furthermore get a quicker union rate. The consequences of layer-wise-prepared DCNN are great in correlation with those accomplished by a shallow strategy of high quality highlights and standard DCNN

Handwritten Devanagari Character Recognition using Convolutional Neural Network

Conference Paper

Full-text available

Oct 2018

Machine and Deep Learning Classifiers for Indic Scripts Recognition: Challenges and Research Perspective

Conference Paper

Oct 2022

Feature extraction and classification techniques for handwritten Devanagari text recognition: a survey

Article

Full-text available

Jun 2022
MULTIMED TOOLS APPL

The character recognition system is a vital area in the field of pattern recognition. One interesting, complex, and challenging task is handwritten character recognition because of various writing styles of individuals. The accuracy of such systems highly depends upon the extraction and selection of features. Many researchers proposed a variety of feature extraction and classification methods for various scripts including Devanagari. In view of that, this article presents a broad study of feature extraction and classification methods considered so far for online and offline Handwritten Character Recognition (HCR) for Devanagari script, which is essential in Optical Character Recognition (OCR) research. This article presents techniques used by authors, the dataset used, the accuracy achieved by the methods of the work already available for the OCR research. This article is depicting the latest studies, research gaps, challenges and future perspectives for the researchers working in the Devanagari text recognition domain. Moreover, methods developed for feature extraction and classification in the area of Devanagari character recognition are presented in a systematic way as an assistance for future researchers. It has been gathered that traditional feature extraction and classifications methods are being replaced with deep learning methods to achieve higher recognition accuracy in this area.

Handwritten Marathi Consonants Recognition using Multilevel Classification

Article

Full-text available

Jan 2016

This paper presents approach for the recognition of handwritten Marathi consonants. In order to recognize handwritten Marathi consonants, a database of handwritten Marathi consonants is developed to carry recognition experiments. Problem of handwritten Marathi consonant recognition is simplified using multilevel classificationwhich improves recognition rate. Total 36 Marathi consonants are transformed using instance simplification technique into six sub classesdepending on special property of consonants. Suitable features are extracted from different sub classes and further classification is carried out using SVM and k-NN classifiers.We have used database of 7920 characters for testing and found recognition accuracy 78.27% using SVM classifier and 73.29% using k-NN classifier.

Multiclass Recognition of Offline Handwritten Devanagari Characters using CNN

Article

Dec 2020
IJMEMS

The handwriting style of every writer consists of variations, skewness and slanting nature and therefore, it is a stimulating task to recognise these handwritten documents. This article presents a study on various methods available in literature for Devanagari handwritten character recognition and performs its implementation using Convolutional neural network (CNN). Available methods are studied on different parameters and a tabular comparison is also presented which concludes superiority of CNN model in character recognition task. The proposed CNN model results in well acceptable accuracy using dropout and stochastic gradient descent (SGD) optimizer.

A Survey on Devanagari Character Recognition

Chapter

Jan 2020

Machine recognition of constrained hand printed devanagari

Article

Full-text available

Jul 1977
PATTERN RECOGN

A method is presented for the machine recognition of constrained, hand printed Devanagari characters. A set of very simple primitives is used, and all the Devanagari characters are looked upon as a concatenation of these primitives. Most of the decisions are taken on the basis of the presence/absence or positional relationship of these primitives; and the decision process is a multistage process, where each stage of decision making narrows down the choice regarding the class membership of the input token.

Recognition of Off-Line Handwritten Devnagari Characters Using Quadratic Classifier

Conference Paper

Full-text available

Jan 2006

Recognition of handwritten characters is a challenging task because of the variability involved in the writing styles of different individuals. In this paper we propose a quadratic classifier based scheme for the recognition of off- line Devnagari handwritten characters. The features used in the classifier are obtained from the directional chain code information of the contour points of the characters. The bounding box of a character is segmented into blocks and the chain code histogram is computed in each of the blocks. Based on the chain code histogram, here we have used 64 dimensional features for recognition. These chain code features are fed to the quadratic classifier for recognition. From the proposed scheme we obtained 98.86% and 80.36% recognition accuracy on Devnagari numerals and characters, respectively. We used five- fold cross-validation technique for result computation.

A Study of Zernike Moments and Its Use in Devanagari Handwritten Character Recognition

Conference Paper

Dec 2005

Satish Kumar

Devnagari numeral recognition by combining decision of multiple connectionist classifiers

Article

Feb 2002

This paper is concerned with recognition of handwritten Devnagari numerals. The basic objective of the present work is to provide an efficient and reliable technique for recognition of handwritten numerals. Three different types of features have been used for classification of numerals. A multi-classifier connectionist architecture has been proposed for increasing reliability of the recognition results. Experimental results show that the technique is effective and reliable.

On the Encoding of Arbitrary Geometric Configurations

Article

Jul 1961

Herbert Freeman

A method is described which permits the encoding of arbitrary geometric configurations so as to facilitate their analysis and manipulation by means of a digital computer. It is shown that one can determine through the use of relatively simple numerical techniques whether a given arbitrary plane curve is open or closed, whether it is singly or multiply connected, and what area it encloses. Further, one can cause a given figure to be expanded, contracted, elongated, or rotated by an arbitrary amount. It is shown that there are a number of ways of encoding arbitrary geometric curves to facilitate such manipulations, each having its own particular advantages and disadvantages. One method, the so-called rectangular-array type of encoding, is discussed in detail. In this method the slope function is quantized into a set of eight standard slopes. This particular representation is one of the simplest and one that is most readily utilized with present-day computing and display equipment.

A complete printed Bangla OCR system

Article

Mar 1998
PATTERN RECOGN

A complete Optical Character Recognition (OCR) system for printed Bangla, the fourth most popular script in the world, is presented. This is the first OCR system among all script forms used in the Indian sub-continent. The problem is difficult because (i) there are about 300 basic, modified and compound character shapes in the script, (ii) the characters in a word are topologically connected and (iii) Bangla is an inflectional language. In our system the document image captured by a flatbed scanner is subjected to skew correction, text graphics separation, line segmentation, zone detection, and word and character segmentation using some conventional and some newly developed techniques. From zonal information and shape characteristics, the basic, modified and compound characters are separated for the convenience of classification. The basic and modified characters, which are about 75 in number and which occupy about 96% of the text corpus, are recognized by a structural-feature-based tree classifier. The compound characters are recognized by a tree classifier followed by a template-matching approach. The feature detection is simple and robust, and preprocessing steps such as thinning and pruning are avoided. Character unigram statistics are used to make the tree classifier efficient. Several heuristics are also used to speed up the template-matching approach. A dictionary-based error-correction scheme has been used where separate dictionaries are compiled for root words and suffixes that contain morpho-syntactic information as well. For single-font clear documents, 95.50% word-level (which is equivalent to 99.10% character-level) recognition accuracy has been obtained. Extension of the work to Devnagari, the third most popular script in the world, is also discussed.

Fuzzy model based recognition of handwritten numerals

Article

Jun 2007
PATTERN RECOGN

This paper presents the recognition of handwritten Hindi and English numerals by representing them in the form of exponential membership functions which serve as a fuzzy model. The recognition is carried out by modifying the exponential membership functions fitted to the fuzzy sets. These fuzzy sets are derived from features consisting of normalized distances obtained using the Box approach. The membership function is modified by two structural parameters that are estimated by optimizing the entropy subject to the attainment of membership function to unity. The overall recognition rate is found to be 95% for Hindi numerals and 98.4% for English numerals.

Chaudhuri, B.B.: Indian script character recognition-A survey. Pattern Recognition 37, 1887-1899

Article

Sep 2004
PATTERN RECOGN

Intensive research has been done on optical character recognition (OCR) and a large number of articles have been published on this topic during the last few decades. Many commercial OCR systems are now available in the market. But most of these systems work for Roman, Chinese, Japanese and Arabic characters. There has not been a sufficient amount of work on Indian language character recognition, although there are 12 major scripts in India. In this paper, we present a review of the OCR work done on Indian language scripts. The review is organized into 5 sections. Sections 1 and 2 cover the introduction and properties of Indian scripts. In Section 3, we discuss different methodologies in OCR development as well as research work done on Indian script recognition. In Section 4, we discuss the scope of future work and further steps needed for Indian script OCR development. In Section 5 we conclude the paper.

Handwritten Bangla Alphabet Recognition using an MLP Based Classifier

Article

Mar 2012

The work presented here involves the design of a Multi Layer Perceptron (MLP) based classifier for recognition of the handwritten Bangla alphabet using a 76-element feature set. Bangla is the second most popular script and language in the Indian subcontinent and the fifth most popular language in the world. The feature set developed for representing handwritten characters of the Bangla alphabet includes 24 shadow features, 16 centroid features and 36 longest-run features. Recognition performances of the MLP designed to work with this feature set are experimentally observed as 86.46% and 75.05% on the samples of the training and the test sets, respectively. The work has useful application in the development of a complete OCR system for handwritten Bangla text.

Classification Of Gradient Change Features Using MLP For Handwritten Character Recognition

Article

Jun 2010

A novel, generic scheme for recognizing off-line handwritten English alphabet character images is proposed. The advantage of the technique is that it can be applied in a generic manner to different applications and is expected to perform better in uncertain and noisy environments. The recognition scheme uses a multilayer perceptron (MLP) neural network. The system was trained and tested on a database of 300 samples of handwritten characters. For improved generalization and to avoid overtraining, the whole available dataset has been divided into two subsets: a training set and a test set. We achieved 99.10% and 94.15% correct recognition rates on the training and test sets, respectively. The proposed scheme is robust with respect to various writing styles and sizes as well as the presence of considerable noise.
