Figure 21 - uploaded by Badre-Eddine el Kessab
The Multi-layer Perceptron 

Source publication
Article
This paper deals with an optical character recognition (OCR) system for handwritten digits that uses a neural network (a multilayer perceptron, MLP) together with a feature extraction method based on the shape of the digit. The method is tested on the MNIST database of isolated handwritten digits (60,000 training images and 10,000 test images). This...

Similar publications

Preprint
Technological advancement has made it effortless to digitize hard copies of media with optical character recognition (OCR) systems. As OCR systems are used constantly, converting printed or handwritten documents and books has become simple and time efficient. To be a fully functional structure, a Bengali OCR system needs to overcome some constr...
Preprint
With the increasing prevalence of video recordings there is a growing need for tools that can maintain the privacy of those recorded. In this paper, we define an approach for redacting personally identifiable text from videos using a combination of optical character recognition (OCR) and natural language processing (NLP) techniques. We examine the...
Article
This paper explores the contribution of authenticity to digitized handwritten signatures using a deep learning-based approach, implementing ResNet-50 and optical character recognition (OCR). Signature authentication is a crucial issue in various fields, such as transaction security, protection of official documents, and fraud prevention. Our approa...
Article
This paper comprehensively assesses the application of active learning strategies to enhance natural language processing-based optical character recognition (OCR) models for image-to-LaTeX conversion. It addresses the existing limitations of OCR models and proposes innovative practices to strengthen their accuracy. Key components of this study incl...
Article
An optical character recognition (OCR) system segments the character from the given document before recognizing it. The recognition of such character images requires the class labels to be associated with each character sample in the training set, and this requires the placing of all the samples of each segmented character in various distinct folde...

Citations

... Except for the number and size of the convolution kernels, the next convolution layer and pooling layer work similarly to the preceding ones. Finally, the output layer is a fully connected layer, and the classifier's result is the output neuron with the largest value [2]. ...
... Next comes another set of convolution and pooling [17] layers, which operate in a similar way except for the number and size of the convolution kernels. The final layer is the output layer, a fully connected layer that, as the name suggests, combines all the neurons to produce an output; the result of the classifier is the output neuron with the maximum value [2]. ...
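The decision rule described in the two snippets above (pick the output neuron with the largest value) can be sketched in a few lines. This is only an illustration; the function name `predict_class` and the example activations are hypothetical and do not come from the cited works.

```python
def predict_class(outputs):
    """Return the index of the output neuron with the largest activation.

    `outputs` is a list of output-layer values, one per class
    (ten values for the digits 0-9). Ties resolve to the lowest index.
    """
    return max(range(len(outputs)), key=lambda i: outputs[i])


# Hypothetical activations for a four-class toy example.
example_outputs = [0.10, 0.05, 0.70, 0.15]
predicted = predict_class(example_outputs)
```

For a digit classifier the returned index is read directly as the recognized digit.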
Chapter
The goal of this analysis is the development of handwritten digit recognition using the MNIST dataset. Recently, the identification of handwritten digits has become a challenging research topic in machine learning, because handwritten digits vary in length, width, orientation, and position. It can be used in several ways, such as reading the amount and signature on bank checks, the location on postal and tax papers, and so on. This research used a CNN for recognition. Four steps were followed: pre-processing, feature extraction, CNN training, and classification and recognition. Besides its higher accuracy, the CNN outperforms other methods in detecting essential characteristics without the need for human intervention. On top of that, it incorporates distinct levels of convolution and pooling operations. With the CNN, 97.78% accuracy was obtained.
... [15]. Again, the traditional recognition processes are also time-consuming because of their several processing stages [8] [37]. Fig. 7 shows the validation performance of the five proposed CNN architectures; among them, architecture 5 resulted in the highest accuracy. ...
Article
Background: Handwriting recognition has become a notable research area because of its important practical applications, but the variety of writing patterns makes automatic classification a challenging task. Classifying handwritten digits with higher accuracy is needed to overcome the limitations of past research, which mostly used deep learning approaches. Objective: The two most noteworthy limitations are low accuracy and slow computational speed. The current study models a Convolutional Neural Network (CNN) that is simple yet more accurate in classifying English handwritten digits across different datasets. The novelty of this paper is to explore an efficient CNN architecture that can classify digits of different datasets accurately. Methods: The author proposed five different CNN architectures for training and validation tasks with two datasets. Dataset-1 consists of 12,000 MNIST samples and Dataset-2 consists of 29,400 digit samples from Kaggle. The proposed CNN models first extract the features and then perform the classification tasks. For performance optimization, the models used the stochastic gradient descent with momentum optimizer. Results: Among the five models, one was found to be the best performer, with validation accuracies of 99.53% and 98.93% for Dataset-1 and Dataset-2, respectively. Compared to the Adam and RMSProp optimizers, stochastic gradient descent with momentum yielded the highest accuracy. Conclusion: The best proposed CNN model has the simplest architecture. It provides higher accuracy for different datasets and takes less computational time. The validation accuracy of the proposed model is also higher than those in past works.
... The static representation of a digitized document is used in offline digit recognition systems, examples of which are check form, mail, or document processing. In contrast, an online system depends on the information acquired during the production of the handwriting [2], [3], [4]. ...
... In this research, four features were selected (vertical, horizontal, diagonal, and inverse-diagonal histograms) for the feature extraction stage. The vertical histogram, as in (1), generates 28 elements; the horizontal histogram, as in (2), also generates 28 elements; the diagonal histogram (3), (4) generates 55 elements; and the inverse-diagonal histogram (5), (6) generates 55 elements too. All these elements were joined in one feature vector of 166 elements. ...
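The four histograms described in the snippet above can be sketched as follows. This is a minimal illustration, not the cited authors' equations (1) to (6), which are not reproduced on this page. It assumes that the vertical histogram sums each column, the horizontal histogram sums each row, and the two diagonal histograms sum along every diagonal and anti-diagonal of a square image. The function name `histogram_features` is hypothetical. For a 28 x 28 image these assumptions give 28 + 28 + 55 + 55 = 166 values.

```python
def histogram_features(image):
    """Concatenate four pixel histograms of a square image.

    `image` is a list of n rows, each a list of n pixel values.
    Assumed definitions (not taken from the cited paper):
      vertical   -> sum of each column        (n values)
      horizontal -> sum of each row           (n values)
      diagonal   -> sum of each diagonal      (2n - 1 values)
      inverse    -> sum of each anti-diagonal (2n - 1 values)
    """
    n = len(image)
    vertical = [sum(image[r][c] for r in range(n)) for c in range(n)]
    horizontal = [sum(row) for row in image]
    diagonal = [0] * (2 * n - 1)
    inverse = [0] * (2 * n - 1)
    for r in range(n):
        for c in range(n):
            diagonal[r - c + n - 1] += image[r][c]  # constant r - c per diagonal
            inverse[r + c] += image[r][c]           # constant r + c per anti-diagonal
    return vertical + horizontal + diagonal + inverse


# A blank 28 x 28 image with every pixel set to 1, used only to check sizes.
features = histogram_features([[1] * 28 for _ in range(28)])
feature_count = len(features)
```

Each pixel contributes to exactly one bin in each of the four histograms, so the values of the full vector always sum to four times the total pixel intensity.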
Research
Handwritten digit recognition has been proposed using different techniques that were implemented over the available datasets. Although existing systems have reached high recognition accuracy, more effort regarding speed and memory allocation is required. In this research, we examine the impact of image resolution reduction on recognition accuracy for handwritten digits. A set of features was extracted, including histograms of pixels for the horizontal, vertical, diagonal, and inverse-diagonal orientations. The feature vector is constructed by joining these features. Then, a support vector machine is applied for classification. Rescaled handwritten digit images were evaluated for recognition accuracy, speed, and memory. The MNIST database of handwritten digits is used for implementation. Results showed that reducing the size of the feature vector by rescaling the image to a quarter of the original size had only about a 1% accuracy degradation impact.
... Some of them are shown to make this research reliable. Kessab et al. [7] presented an approach on the MNIST database based on a feature extraction method. The dataset was taken from the web (LeCun, 1998). ...
Article
With the expansion of Artificial Neural Networks (ANN), Deep Learning (DL) has brought an interesting turn to the various fields of Artificial Intelligence (AI) by making it smarter and more efficient than what we had even in 10-2 years back. DL has been in use in various fields due to its versatility. The Convolutional Neural Network (CNN) is at the major point of advancement that brings together the ANN and innovative DL techniques. In this research paper, we have devised a multi-layer, fully connected neural network (NN) with 10 and 12 hidden layers for handwritten digit (HD) recognition. The testing is performed on the publicly available MNIST handwritten database. We selected 60,000 images from the MNIST database for training and 10,000 images for testing. Our multi-layer ANN (10), ANN (12), and CNN are able to achieve an overall accuracy of 99.10%, 99.34%, and 99.70%, respectively, while determining digits using the MNIST handwriting dataset.
... In this paper we propose using SVM optimized by the latest GFWA swarm intelligence algorithm for handwritten digit recognition intentionally using weak features with which other approaches would not give good results. Our proposed algorithm was tested on standard MNIST dataset for handwritten digit recognition and performance was better than other approaches from literature [9][10][11]. ...
... Rather simple features were used in order to prove the quality of the proposed classifier. Our proposed algorithm was compared with [9], where the bat algorithm was used for the SVM optimization using the same features, and with [11], where projection histograms combined with a zoning technique were used as features, while a multi-layer perceptron neural network was used as the classifier. ...
... It has been proven earlier that for handwritten digit recognition SVM with parameters tuned by the GFWA provides better results than grid search. Moreover, in [9] results were compared with results from [11], which are also included in this paper. The method from [9] outperformed the method proposed in [11], and our proposed algorithm outperformed both. ...
Chapter
Handwritten digit recognition is an important subarea of object recognition research. Support vector machines represent a very successful recent binary classifier. Basic support vector machines have to be improved in order to deal with real-world problems. The introduction of a soft margin for outliers and misclassified samples, as well as a kernel function for non-linearly separable data, leads to the hard optimization problem of selecting parameters for these two modifications. Grid search, which is often used, is rather inefficient. In this paper we propose the use of one of the latest swarm intelligence algorithms, the fireworks algorithm, for tuning the support vector machine parameters. We tested our approach on the standard MNIST database of handwritten images, and with a selected set of simple features we obtained better results compared to other approaches from the literature.
... However, they severely reduced the accuracy (down to 96%) and they cut all fully connected layers. In [16], the authors implemented a number recognition based on MNIST using a multilayer perceptron. However, they only achieved 80% accuracy while we achieved a much higher accuracy above 99%. ...
Conference Paper
In this paper, a comparative software analysis of two neural network models is presented: the Convolutional Neural Network (CNN) and the Long Short-Term Memory (LSTM) neural network. The evaluation is performed using the well-known deep learning database MNIST to check the accuracy, model size, speed, and complexity of the two models for future digital realization on reconfigurable hardware. In addition, we optimize the size of the two models by quantizing the weight width to 8 bits instead of 32 bits. The results show an extensive reduction in the size of each model (by 10X) with a slight drop in accuracy. The results also show that the CNN is more accurate and much faster than the LSTM, making it the best model to be implemented on reconfigurable hardware.
Keywords: Convolutional neural network; Long Short Term Memory Recurrent Neural Networks; MNIST
I. INTRODUCTION
Deep Neural Networks have proven to be increasingly useful and more accurate than ever. The task of a classifier is to obtain knowledge from a known example set with labels, so that it can correctly label data outside of the set [1]. Convolutional Neural Networks, which are a type of deep neural network, have shown outstanding performance, especially in computer vision. Long Short-Term Memory (LSTM) neural networks have also been on the rise, as they have shown excellent results compared with older recurrent networks. Applications of deep neural networks, or deep learning, are wide-ranging, from image recognition [2] and speech applications to medical diagnosis using DNA and other medical data [3]. Use cases of CNNs include image recognition [2] and many more. Some use cases of LSTMs include document modeling [4] and speech recognition [5][6]. Our work here is a comparison between two neural network models in terms of the tradeoff between complexity and accuracy. It takes into account the accuracy, speed, and size of the two inherently different models, namely the convolutional neural network and the long short-term memory recurrent neural network. We aim to show the tradeoff between accuracy and model size and computational cost, or complexity. The aim is to find an efficient model among the LSTM and CNN to be realized in the future on reconfigurable hardware such as FPGAs.
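The abstract above mentions quantizing weights to 8 bits instead of 32 bits but does not say which scheme was used. As an illustration only, the sketch below assumes symmetric linear quantization with one shared scale per weight list; the function names `quantize_int8` and `dequantize` are hypothetical and are not taken from the cited paper.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] with one shared scale.

    Assumed scheme: symmetric linear quantization, where the largest
    absolute weight is mapped to 127. Returns (integers, scale).
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the 8-bit signed range.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the 8-bit integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, used only to show the round trip.
original = [0.5, -1.0, 0.25, 0.0]
q_weights, q_scale = quantize_int8(original)
restored = dequantize(q_weights, q_scale)
```

Under this scheme the round-trip error of each weight is at most half a quantization step, which is one way to reason about the slight accuracy drop that the abstract reports.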
... Recently, many new classifiers and feature extraction methods have been proposed and tested for handwritten numeral recognition. The techniques proposed and developed provide high classification accuracies [6]-[10]. However, there is still room for new techniques that are efficient and provide high-quality features. ...
... Kussul and Baidyk use the classifier Limited Receptive Area (LIRA) to classify MNIST database producing very high accuracies [14]. Kessab et al. presented a novel system of handwritten digit recognition using Multi-Layer Perceptron (MLP) and a novel method for features extraction, and a recognition rate of 80% is acquired [10]. ...
Article
Arabic handwritten digit recognition is the science of recognition and classification of handwritten Arabic digits. It has been a subject of research for many years, with rich literature available on the subject. Handwritten digits written by different people are not of the same size, thickness, style, position, or orientation. Hence, many different challenges have to be overcome to solve the problem of handwritten digit recognition. The variation in the digits is due to the writing styles of different people, which can differ significantly. Automatic handwritten digit recognition has wide applications, such as automatic processing of bank cheques, postal addresses, and tax forms. A typical handwritten digit recognition application consists of three main stages, namely feature extraction, feature selection, and classification. One of the most important problems is feature extraction. In this paper, a novel feature extraction approach for off-line handwritten digit recognition is presented. Wavelet-based analysis of image data is carried out for feature extraction, and then classification is performed using various classifiers. To further reduce the size of the training data set, high-entropy subbands are selected. To increase the recognition rate, individual subbands providing high classification accuracies are selected from the over-complete tree. The extracted features are also normalized to standardize the range of independent variables before providing them to the classifier. Classification is carried out using k-NN and SVMs. The results show that the quality of the extracted features is high, as almost equally high classification accuracies are acquired for both classifiers, i.e., k-NN and SVMs.
... The RBF layer is connected to a shallow Multi-Layer Perceptron (MLP) [32] to perform classification. The MLP is among the most useful types of neural networks, with an ability to learn the representation of data and relate it to the output, increasing the overall accuracy. ...
Article
This paper introduces the "Projectron" as a new neural network architecture that uses Radon projections to both classify and represent medical images. The motivation is to build shallow networks which are more interpretable in the medical imaging domain. Radon transform is an established technique that can reconstruct images from parallel projections. The Projectron first applies global Radon transform to each image using equidistant angles and then feeds these transformations for encoding to a single layer of neurons followed by a layer of suitable kernels to facilitate a linear separation of projections. Finally, the Projectron provides the output of the encoding as an input to two more layers for final classification. We validate the Projectron on five publicly available datasets, a general dataset (namely MNIST) and four medical datasets (namely Emphysema, IDC, IRMA, and Pneumonia). The results are encouraging as we compared the Projectron's performance against MLPs with raw images and Radon projections as inputs, respectively. Experiments clearly demonstrate the potential of the proposed Projectron for representing/classifying medical images.
... The RBF layer is connected to a shallow Multi-Layer Perceptron (MLP) [32] to perform classification. The MLP is among the most useful types of neural networks, with an ability to learn the representation of data and relate it to the output, increasing the overall accuracy. ...
Preprint
This paper introduces the "Projectron" as a new neural network architecture that uses Radon projections to both classify and represent medical images. The motivation is to build shallow networks which are more interpretable in the medical imaging domain. Radon transform is an established technique that can reconstruct images from parallel projections. The Projectron first applies global Radon transform to each image using equidistant angles and then feeds these transformations for encoding to a single layer of neurons followed by a layer of suitable kernels to facilitate a linear separation of projections. Finally, the Projectron provides the output of the encoding as an input to two more layers for final classification. We validate the Projectron on five publicly available datasets, a general dataset (namely MNIST) and four medical datasets (namely Emphysema, IDC, IRMA, and Pneumonia). The results are encouraging as we compared the Projectron's performance against MLPs with raw images and Radon projections as inputs, respectively. Experiments clearly demonstrate the potential of the proposed Projectron for representing/classifying medical images.
... Dilation is one of the basic operations in mathematical morphology [2][3][4][5][6][7][8]. Originally developed for binary images, it was first extended to grayscale images. ...
Article
Isolated handwritten character recognition with multiple styles is a challenging research problem. In this paper, we propose a novel feature extraction method for character recognition based on mathematical morphology and histogram techniques in the vertical, horizontal, diagonal, and anti-diagonal directions, given that feature extraction is an important step in many image processing tasks. In this context, we present two comparisons for isolated handwritten Greek character recognition. The first comparison is between the hybrid feature extraction methods, which combine mathematical morphology with the histogram method. The second comparison determines which of three types of distance used in classification is the most powerful: the Euclidean, Manhattan, and Minkowski distances. For this purpose, we pre-processed each character image with different techniques. Furthermore, in the experimental results we provide extensive comparisons which demonstrate that our method achieves better recognition of different characters; the results we obtained demonstrate, on the one hand, the performance of the novel feature extraction method and, on the other hand, that of the Euclidean distance in classification.
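The three distances named in the abstract above are related: the Minkowski distance of order p reduces to the Manhattan distance for p = 1 and to the Euclidean distance for p = 2. The sketch below illustrates this, together with a nearest-template classifier. The abstract does not describe the classification rule in detail, so `nearest_label` and the toy templates are hypothetical, shown only to indicate how a distance could drive classification.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def manhattan(a, b):
    """Manhattan distance: Minkowski distance with p = 1."""
    return minkowski(a, b, 1)


def euclidean(a, b):
    """Euclidean distance: Minkowski distance with p = 2."""
    return minkowski(a, b, 2)


def nearest_label(sample, templates, distance):
    """Return the label of the template closest to `sample`.

    `templates` maps a label to a reference feature vector
    (a hypothetical setup, not taken from the cited paper).
    """
    return min(templates, key=lambda label: distance(sample, templates[label]))


# Toy feature vectors, used only to show the calls.
templates = {"alpha": [0.0, 0.0], "beta": [3.0, 4.0]}
label = nearest_label([2.5, 3.5], templates, euclidean)
```

Swapping `euclidean` for `manhattan`, or for a lambda that fixes another value of p, is enough to compare the distances on the same feature vectors.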