Engineering, 2013, 5, 84-87
doi:10.4236/eng.2013.55B017 Published Online May 2013 (http://www.scirp.org/journal/eng)
Handwriting Classification Based on Support Vector
Machine with Cross Validation
Anith Adibah Hasseim, Rubita Sudirman, Puspa Inayat Khalid
Faculty of Electrical Engineering, Universiti Teknologi Malaysia, Johor Bahru, Johor, Malaysia
Email: rubita@fke.utm.my
Received 2013
ABSTRACT
Support vector machine (SVM) is successfully applied for classification in this paper. The paper first discusses the basic principle of the SVM; SVM classifiers with a polynomial kernel and with the Gaussian radial basis function (RBF) kernel are then chosen to identify pupils who have difficulties in writing. The 10-fold cross-validation method for training and validation is introduced. The aim of this paper is to compare the performance of support vector machines with RBF and polynomial kernels when used for classifying pupils with or without handwriting difficulties. Experimental results showed that the performance of the SVM with RBF kernel is better than that of the one with polynomial kernel.
Keywords: Support Vector Machine; Handwriting Difficulties; Cross-Validation
1. Introduction
The field of handwriting has been of interest from a variety of aspects: its entity, indications and aesthetics. In the beginning, the development of handwriting and the factors that affect handwriting performance were investigated [1,2]; later, whole words were addressed. Most
of the systems reported in the literature until today in-
volved screening measures in identifying pupils who are
at risk of handwriting difficulties and also addressed the
absence of an appropriate tool for monitoring beginning
handwriting development. More importantly, automated
handwriting analysis has been given more attention in the
hunt for quantitative features and key indicators in
monitoring beginning handwriting skill development.
Such automated handwriting analyses include recognizing the writer (e.g. [3]), the text written (e.g. [4]), movement and procedure (e.g. [5,6]), or even the semantic content of the text (e.g. [7]). Nearly all of these issues can be, and have been, investigated either offline or online, depending on the available data.
Up to sixty percent of children’s typical school day is
allocated to fine motor activities, with writing being the
predominant task during these time periods [8]. These
tasks all require the foundational skill of basic handwrit-
ing proficiency to allow teachers to accurately assess
students’ understanding and comprehension of instruc-
tional material. If students do not possess basic hand-
writing proficiency, it can limit their ability to success-
fully complete a majority of classroom tasks. In addition,
it has also been suggested that students with handwriting
problems need to focus more attention on the physical
process of writing, thus limiting use of higher order cog-
nitive skills, planning and generation of content [9]. Thus,
handwriting proficiency is an important foundation upon
which success with later writing tasks depends. Given the number of everyday school tasks that involve writing, unsuccessful mastery of handwriting skill can negatively influence later success in school.
1.1. Support Vector Machine
Support Vector Machine (SVM) is a classification technique based on the statistical learning theory proposed by Vapnik in 1995 [10]. It can successfully cope with over-fitting and local-optimum problems and is especially suitable for small-sample, high-dimensional, nonlinear cases. It has already shown good results in medical diagnostics, optical character recognition, electric load forecasting and other fields.
Kernel Function
In general, the radial basis function is one of the most popular kernels and a reasonable first choice, because this kernel nonlinearly maps samples into a higher-dimensional space.
Given a linearly separable sample set (x_i, y_i), i = 1, …, n, and taking the simplest case of two-class classification, x ∈ R^n and y ∈ {+1, −1} is the class label. The common form of the linear decision function is:
f(x) = w . x + b (1)
Sometimes linear classifiers are not complex enough; therefore SVM maps the data into a higher dimensional space, unlike the linear kernel, which cannot handle the case
Copyright © 2013 SciRes. ENG
A. A. HASSEIM ET AL. 85
when the relation between class labels and attributes is
nonlinear [11]. Formally, pre-process the data with:
x →φ(x) (2)
and then learn the map from φ(x) to y:
f(x) = w. φ(x) + b (3)
However, the dimensionality of φ(x) can be very large,
making w hard to represent explicitly in memory, and
hard to solve. The Representer theorem (Kimeldorf &
Wahba, 1971) shows that (for SVMs as a special case):
w = Σ_{i=1}^{m} α_i φ(x_i) (4)
for some variables α. Instead of optimizing w directly, we can thus optimize α. The decision rule is now:
f(x) = Σ_{i=1}^{m} α_i (φ(x_i) . φ(x)) + b (5)
If the dot product φ(x_i) . φ(x) is replaced by the kernel function K(x, x_i), the optimal decision function is as follows:
f(x) = Σ_{i=1}^{m} α_i K(x, x_i) + b (6)
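A minimal numeric sketch of the kernelized decision rule in Eq. (6) follows; the coefficients α_i, the bias b, the support vectors and the query points are all hypothetical illustrations, not values from the paper's experiments:

```python
import math

def decision(x, support_vectors, alphas, b, g):
    # Eq. (6): f(x) = sum_i alpha_i * K(x, x_i) + b, here with a Gaussian RBF kernel
    total = b
    for xi, ai in zip(support_vectors, alphas):
        sq_dist = sum((p - q) ** 2 for p, q in zip(x, xi))
        total += ai * math.exp(-sq_dist / g ** 2)
    return total

# Hypothetical two-support-vector model; the alphas carry the class signs
svs = [[0.0, 0.0], [1.0, 1.0]]
alphas = [1.0, -1.0]
# A point near the positive support vector gets a positive decision value
label = 1 if decision([0.1, 0.1], svs, alphas, b=0.0, g=1.0) > 0 else -1
print(label)  # 1
```

The sign of f(x) gives the predicted class, which is how a two-class SVM turns the real-valued decision function into a {+1, −1} label.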
In this project, two kinds of common kernel function are used. The first one is the Gaussian radial basis function (RBF):
K(x, x_i) = exp(−‖x − x_i‖^2 / g^2) (7)
and the other one is the polynomial kernel:
K(x, x_i) = (x . x_i + 1)^d (8)
Classical techniques utilizing radial basis functions employ some method of determining a subset of centres; typically, a clustering method is first employed to select this subset. An attractive feature of the SVM is that this selection is implicit, with each support vector contributing one local Gaussian function, centered at that data point.
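The two kernels in Eqs. (7) and (8) can be written directly in code. The following is a small sketch (function names and the sample vectors are illustrative, not from the paper):

```python
import math

def rbf_kernel(x, xi, g):
    # Gaussian RBF kernel, Eq. (7): K(x, x_i) = exp(-||x - x_i||^2 / g^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq_dist / g ** 2)

def poly_kernel(x, xi, d):
    # Polynomial kernel, Eq. (8): K(x, x_i) = (x . x_i + 1)^d
    dot = sum(a * b for a, b in zip(x, xi))
    return (dot + 1.0) ** d

# Identical points: the RBF kernel equals exp(0) = 1 regardless of g
print(rbf_kernel([0.5, 1.0], [0.5, 1.0], g=0.1))  # 1.0
# (0.5*0.5 + 1.0*1.0 + 1)^2 = 2.25^2 = 5.0625
print(poly_kernel([0.5, 1.0], [0.5, 1.0], d=2))   # 5.0625
```

Note how each kernel takes the role of the implicit dot product φ(x_i) . φ(x), so the mapping φ itself never has to be computed.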
1.2. Cross Validation (CV)
Currently, cross-validation is widely used for estimating the performance of neural networks and of other methods such as support vector machines and k-nearest neighbors. Cross-validation is a statistical method of evaluating and comparing learning algorithms. Its basic idea is to split the available training data into two sets: the first set is used to train the model, while the other is used to evaluate the performance of the trained model. In typical cross-validation, the training and validation sets cross over in successive rounds such that each data point has a chance of being validated against. The basic form of cross-validation is k-fold cross-validation; other forms are special cases of k-fold cross-validation or involve repeated rounds of it.
Advantages of this method are as follows: 1) the average classification accuracy of the k SVM classifiers is used to evaluate the performance of the SVM classifier parameters, which can improve the generalization ability of the SVM classifier with the optimized parameters; 2) the k-fold cross-validation method ensures that all the sample data are involved in SVM classifier training and validation, making full use of the limited sample data; 3) no matter how the data gets divided, every data point is used in a test set exactly once and appears in a training set k − 1 times. The disadvantage of this method is that the training algorithm has to be rerun from scratch k times, which means it takes k times as much computation to make an evaluation.
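The splitting procedure just described can be sketched as follows; the function name and the sequential, unshuffled split are assumptions for illustration, not the authors' implementation:

```python
def k_fold_indices(n_samples, k=10):
    # Partition sample indices into k roughly equal folds; in round r,
    # fold r is the validation set and the remaining folds form the training set.
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    folds, start = [], 0
    for r in range(k):
        size = fold_size + (1 if r < remainder else 0)
        folds.append(indices[start:start + size])
        start += size
    splits = []
    for r in range(k):
        valid = folds[r]
        train = [i for f in folds[:r] + folds[r + 1:] for i in f]
        splits.append((train, valid))
    return splits

splits = k_fold_indices(120, k=10)           # 120 samples, as in the paper's dataset
print(len(splits))                           # 10 rounds
print(len(splits[0][0]), len(splits[0][1]))  # 108 train, 12 validation
# Every sample is validated exactly once across the 10 rounds
assert sorted(i for _, v in splits for i in v) == list(range(120))
```

With 120 samples and k = 10, each round trains on 108 samples and validates on the remaining 12, and each sample appears in a training set exactly 9 (k − 1) times.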
2. Methodology
The data were obtained from Khalid et al. [13]. The dataset is composed of 120 samples, each described by two features (the standard deviation of pen pressure when drawing RU, with p-value < 0.0001 and z-value = −4.319, and the ratio of time taken to draw HR to that taken to draw HL, with p-value < 0.0001 and z-value = −5.205) and divided into two groups of writers (below-average printers, the test group, and above-average printers, the control group).
Firstly, the data is partitioned into k equal-sized segments, or folds. In this project, we used 10-fold cross validation (k = 10), as it is the most commonly used setting in data mining and machine learning. As shown in Figure 1, the darker sections of the data are used for training, while the remaining, lighter sections are used to validate the model. This process is repeated 10 times until all sections have been validated.
Model Parameter Selection
Two models, an SVM with polynomial kernel and an SVM with RBF kernel, are chosen for performance comparison. The performance of the SVM depends on the choice of parameters, and the optimal selection of these parameters is a nontrivial issue. For the RBF kernel, the important task is to find the parameters C and g; the SVM with polynomial kernel instead requires choosing the parameters C and d. The penalty factor C is used to improve generalization capability as it increases, while g and d are the adjustable parameters of the learning machine in the experiment and are used to adjust the empirical error. The parameters only slightly influence the classification result when a smaller amount of training samples is used [12].
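Selecting three values per parameter, as done below in Tables 1 and 2, amounts to a small grid search. A sketch follows; the `evaluate` function here is only a placeholder standing in for the 10-fold cross-validated accuracy of a trained SVM, and its form is an assumption for illustration:

```python
from itertools import product

def grid_search(evaluate, grid):
    # Try every combination of parameter values and keep the best-scoring one.
    best_params, best_score = None, float("-inf")
    for combo in product(*grid.values()):
        params = dict(zip(grid.keys(), combo))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# RBF-kernel grid from the experiments: g in {0.01, 0.1, 1}, C in {1, 10, 100}
rbf_grid = {"g": [0.01, 0.1, 1], "C": [1, 10, 100]}

# Placeholder objective; in practice this would run 10-fold CV with an SVM
def evaluate(params):
    return -abs(params["g"] - 0.1) - 0.001 * params["C"]

best, score = grid_search(evaluate, rbf_grid)
print(best)  # {'g': 0.1, 'C': 1}
```

The same routine covers the polynomial kernel by passing a grid over C and d instead of C and g.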
After training the SVM, the best values of C and g can be used to classify children with handwriting problems. For
Figure 1. Procedure of 10-fold cross validation.
Table 1. Accuracy of Prediction based on SVM with RBF
Kernel.
the SVM with polynomial kernel, there are two parameters: C and d. The SVM with RBF kernel also has two parameters: g and C. In order to see the different performance each parameter produces in the output, we select three values for each parameter, much as one would choose the number of hidden nodes in a neural network.
Accuracy of Predictions (%)

Feature 1:
  C \ g    0.01     0.1      1
  1        83.33    91.67    83.33
  10       93.33    91.67    91.67
  100      91.67    91.67    91.67

Feature 2:
  C \ g    0.01     0.1      1
  1        91.67    92.80    91.67
  10       91.67    91.67    83.33
  100      91.67    86.67    83.33
3. Results and Discussion
Table 1 and Table 2 present the recognition results using the SVM with RBF kernel and with polynomial kernel, respectively. A classification was considered correct if the output of the model matched the judgement made by the teachers (using the Handwriting Proficiency Screening Questionnaire (HPSQ)). In this paper, we used the classification error (rejection of the genuine category) as the metric.
Table 2. Accuracy of Prediction based on SVM with Poly-
nomial Kernel.
As can be seen in Table 1, the percentage of correct predictions for feature 1 increases as g varies from 0.01 to 0.1, and moves in the reverse direction as g varies from 0.1 to 1. The results confirm that the best value of g is near 0.1. When the penalty coefficient C is increased, the accuracy of prediction tends to decrease. Different from feature 1, feature 2 shows a decreasing percentage of correct predictions both when g varies from 0.01 to 1 and when C increases in value.
Accuracy of Predictions (%)

Feature 1:
  d \ g    0.01     0.1      1
  3        86.67    86.67    91.67
  5        83.33    86.67    83.33
  10       66.67    71.67    83.33

Feature 2:
  d \ g    0.01     0.1      1
  3        86.67    86.67    86.67
  5        83.33    83.33    86.67
  10       61.67    71.67    86.67
On the other hand, the results in Table 2 differ from those in Table 1. When the parameter varies from 0.01 to 1, the percentage of correct predictions for both feature 1 and feature 2 tends to increase, while when d varies from 3 to 10, the accuracy of prediction decreases. This exhibits the good generalization performance of the SVM.
The results reported here have shown that the performance of the SVM with RBF kernel is better than that of the SVM with polynomial kernel. We use the SVM (RBF kernel) with varying C and g to simulate and to classify children with and without handwriting problems based on drawing tasks.
4. Conclusions
SVMs with RBF and polynomial kernels have been used in this study to select those who are at risk of handwriting difficulty due to the improper use of graphic rules. The cross-validation method is adopted to choose the parameters in order to obtain a preferable classification result. In this paper, we have verified that the performance of the SVM with RBF kernel is better than that of the one with polynomial kernel. Simulation results indicate that the average testing classification accuracy of the SVM-RBF algorithm reaches more than 93%, noticeably higher than that of the SVM polynomial algorithm.
5. Acknowledgements
This work was supported by the Malaysia Ministry of
Higher Education and Universiti Teknologi Malaysia
under Vote Q.J130000.2623.09J28.
REFERENCES
[1] V. Berninger, A. Cartwright, C. Yates, H. L. Swanson
and R. Abbott, “Developmental Skills Related to Writing
and Reading Acquisition in the Intermediate Grades:
Shared and Unique Variance,” Reading and Writing: An
Interdisciplinary Journal, Vol. 6, 1994, pp. 161-196.
doi:10.1007/BF01026911
[2] S. Graham, V. W. Berninger, N. Weintraub and W.
Schafer, “Development of Handwriting Speed and Legi-
bility in Grades 1-9,” Journal of Educational Research,
Vol. 92, 1997, pp. 42-52.
doi:10.1080/00220679809597574
[3] Z. Yong, T. Tan and Y. Wang, “Biometric Personal Iden-
tification Based on Handwriting,” Pattern Recognition,
Proceedings. 15th International Conference on, Vol. 2,
2000, pp. 797-800.
[4] L. M. Lorigo and V. Govindaraju, “Offline Arabic Hand-
Writing Recognition: A Survey,” Pattern Analysis and
Machine Intelligence, IEEE Transactions on, Vol. 28, 2006,
pp. 712-724. doi:10.1109/TPAMI.2006.102
[5] H. Ishida, et al., “A Hilbert Warping Method for Hand-
Writing Gesture Recognition,” Pattern Recogn., Vol. 43,
2010, pp. 2799-2806. doi:10.1016/j.patcog.2010.02.021
[6] H. Bezine, A. D. Alimi and N. Sherkat, “Generation and
Analysis of Handwriting Script with the Beta-Elliptic
Model,” Proceedings of the Ninth International Work-
shop on Frontiers in Handwriting Recognition, 2004, pp.
515-520. doi:10.1109/IWFHR.2004.45
[7] S. Srihari, J. Collins, R. Srihari, H. Srinivasan, S. Shetty
and J. B. Griffler, “Automatic Scoring of Short Hand-
Written Essays in Reading Comprehension Tests,” Artifi-
cial Intelligence, Vol. 172, 2008, pp. 300-324.
doi:10.1016/j.artint.2007.06.005
[8] K. McHale and S. A. Cermak, “Fine Motor Activities in
Elementary School: Preliminary Findings and Provisional
Implications for Children with Fine Motor Problems,”
American Journal of Occupational Therapy, Vol. 46,
No.10, 1992, pp. 898-903. doi:10.5014/ajot.46.10.898
[9] V. Berninger, “Coordinating Transcription and Text Gen-
eration in Working Memory During Composing: Auto-
matic and Constructive Processes,” Learning Disability
Quarterly, Vol. 22, 1999, pp. 99-112.
doi:10.2307/1511269
[10] V. N. Vapnik, “The Nature of Statistical Learning The-
ory,” Springer, New York, 1995.
doi:10.1007/978-1-4757-2440-0
[11] M. Hong, G. Yanchun, W. Yujie and L. Xiaoying, “Study
on Classification Method Based on Support Vector Ma-
chine,” 2009 First International Workshop on Education
Technology and Computer Science, Wuhan, China, 7-8
March 2009, pp.369-373.
[12] S. S. Keerthi and C. J. Lin, “Asymptotic Behaviors of
Support Vector Machines with Gaussian Kernel,” Neural
Computation, Vol. 15, No. 7, 2003, pp. 1667-1689.
doi:10.1162/089976603321891855
[13] S. S. Keerthi and C. J. Lin, “Asymptotic Behaviors of
Support Vector Machines with Gaussian Kernel,” Neural
Computation, Vol. 15, No. 7, 2003, pp. 1667-1689.
doi:10.1162/089976603321891855