The Multi-layer Perceptron

Source publication

Extraction Method of Handwritten Digit Recognition Tested on the MNIST Database

Article

Jan 2013

This paper deals with an optical character recognition (OCR) system for handwritten digits that uses a multilayer perceptron (MLP) neural network and a feature extraction method based on the digit form. The method is tested on the MNIST database of isolated handwritten digits (60,000 training images and 10,000 test images). This...

Constraints in Developing a Complete Bengali Optical Character Recognition System

Preprint

Mar 2020

Technological advancement has led to digitizing hard copies of media effortlessly with optical character recognition (OCR) systems. As OCR systems are being used constantly, converting printed or handwritten documents and books has become simple and time efficient. To be a fully functional structure, a Bengali OCR system needs to overcome some constr...

To show or not to show: Redacting sensitive text from videos of electronic displays

Preprint

Aug 2022

With the increasing prevalence of video recordings there is a growing need for tools that can maintain the privacy of those recorded. In this paper, we define an approach for redacting personally identifiable text from videos using a combination of optical character recognition (OCR) and natural language processing (NLP) techniques. We examine the...

CONTRIBUTION TO THE AUTHENTICITY OF DIGITIZED HANDWRITTEN SIGNATURES THROUGH DEEP LEARNING WITH RESNET-50 AND OCR

Article

Mar 2024

This paper explores the contribution of authenticity to digitized handwritten signatures using a deep learning-based approach, implementing ResNet-50 and optical character recognition (OCR). Signature authentication is a crucial issue in various fields, such as transaction security, protection of official documents, and fraud prevention. Our approa...

Advancing OCR Accuracy in Image-to-LaTeX Conversion—A Critical and Creative Exploration

Article

Nov 2023

This paper comprehensively assesses the application of active learning strategies to enhance natural language processing-based optical character recognition (OCR) models for image-to-LaTeX conversion. It addresses the existing limitations of OCR models and proposes innovative practices to strengthen their accuracy. Key components of this study incl...

A semi-self-supervised learning model to recognize handwritten characters in ancient documents in Indian scripts

Article

Jan 2024

An optical character recognition (OCR) system segments the character from the given document before recognizing it. The recognition of such character images requires the class labels to be associated with each character sample in the training set, and this requires the placing of all the samples of each segmented character in various distinct folde...

Study and Develop a Convolutional Neural Network for MNIST Handwritten Digit Classification

Chapter

Jul 2022

This analysis focuses on the development of handwritten digit recognition using the MNIST dataset. The identification of handwritten digits has recently become a challenging research topic in machine learning, because handwritten digits vary in length, width, orientation, and position. It may be utilized in several ways, such as reading the amount and signature on bank checks, the location on postal and tax papers, and so on. This research used a CNN for recognition. The work followed four steps: pre-processing, feature extraction, CNN training, and classification and recognition. In addition to its higher accuracy, the CNN outperforms other methods at detecting essential characteristics without the need for human intervention. On top of that, it incorporates unique levels of convolution and pooling operations. With the CNN, an accuracy of 97.78% was obtained.

An Efficient CNN Model for Automated Digital Handwritten Digit Classification

Article

Apr 2021

Background: Handwriting recognition has become an appreciable research area because of its important practical applications, but the variety of writing patterns makes automatic classification a challenging task. Classifying handwritten digits with higher accuracy is needed to address the limitations of past research, which mostly used deep learning approaches. Objective: The two most noteworthy limitations are low accuracy and slow computational speed. The current study aims to model a Convolutional Neural Network (CNN) that is simple yet more accurate in classifying English handwritten digits for different datasets. The novelty of this paper is to explore an efficient CNN architecture that can classify digits of different datasets accurately. Methods: The author proposed five different CNN architectures for training and validation tasks with two datasets. Dataset-1 consists of 12,000 MNIST samples and Dataset-2 consists of 29,400 digit samples from Kaggle. The proposed CNN models extract the features first and then perform the classification tasks. For performance optimization, the models used the stochastic gradient descent with momentum optimizer. Results: Among the five models, one was found to be the best performer, with 99.53% and 98.93% validation accuracy for Dataset-1 and Dataset-2 respectively. Compared to the Adam and RMSProp optimizers, stochastic gradient descent with momentum yielded the highest accuracy. Conclusion: The proposed best CNN model has the simplest architecture. It provides higher accuracy for different datasets and takes less computational time. The validation accuracy of the proposed model is also higher than those in past works.
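
For readers unfamiliar with the optimizer named in the abstract above, the following is a minimal sketch of one common formulation of the stochastic gradient descent with momentum update. The function names, the learning rate, the momentum value, and the toy quadratic objective are illustrative assumptions; the abstract does not give the paper's hyperparameters or implementation.

```python
def sgd_momentum_step(params, grads, velocity, learning_rate=0.1, momentum=0.9):
    """Apply one momentum update; the hyperparameter defaults are assumed values."""
    # The velocity accumulates a decaying sum of past (negative) gradients.
    new_velocity = [momentum * v - learning_rate * g for v, g in zip(velocity, grads)]
    # Parameters move along the accumulated velocity rather than the raw gradient.
    new_params = [p + v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


def minimize_quadratic(start=5.0, steps=300):
    """Toy usage: minimize f(x) = x**2, whose gradient is 2 * x."""
    params, velocity = [start], [0.0]
    for _ in range(steps):
        grads = [2.0 * p for p in params]
        params, velocity = sgd_momentum_step(params, grads, velocity)
    return params[0]
```

In a real training loop the gradients would come from a mini-batch of images rather than from a closed-form derivative.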

Recognition Impact on Rescaled Handwritten Digit Images Using Support Vector Machine Classification

Research

Feb 2021

Handwritten digit recognition has been approached with different techniques implemented over the available datasets. Although existing systems have reached high recognition accuracy, more effort regarding speed and memory allocation is required. In this research, we examine the impact of image resolution reduction on recognition accuracy for handwritten digits. A set of features is extracted, including histograms of pixels in the horizontal, vertical, diagonal, and inverse diagonal orientations. The feature vector is constructed by joining these features. Then, a support vector machine is applied for classification. Rescaled handwritten digit images were evaluated with respect to recognition accuracy, speed, and memory. The MNIST database of handwritten digits is used for the implementation. Results showed that reducing the size of the feature vector by rescaling images to a quarter of the original size caused only about 1% degradation in accuracy.
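
The directional histogram features described in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's code: the image is assumed to be a binary list of lists, the function names are hypothetical, and the rescaling is approximated by taking the maximum over 2x2 blocks, which the abstract does not specify. The support vector machine step is omitted.

```python
def directional_histograms(image):
    """Concatenate row, column, diagonal and anti-diagonal pixel counts."""
    rows, cols = len(image), len(image[0])
    horizontal = [sum(row) for row in image]
    vertical = [sum(image[r][c] for r in range(rows)) for c in range(cols)]
    diagonal = [0] * (rows + cols - 1)
    anti_diagonal = [0] * (rows + cols - 1)
    for r in range(rows):
        for c in range(cols):
            # Pixels with equal r - c share a diagonal; equal r + c share an anti-diagonal.
            diagonal[r - c + cols - 1] += image[r][c]
            anti_diagonal[r + c] += image[r][c]
    return horizontal + vertical + diagonal + anti_diagonal


def downscale_by_two(image):
    """Halve each dimension (a quarter of the area) using the maximum of each 2x2 block."""
    return [
        [max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
         for c in range(0, len(image[0]) - 1, 2)]
        for r in range(0, len(image) - 1, 2)
    ]
```

With these definitions a 28x28 image yields 28 + 28 + 55 + 55 = 166 features, while its 14x14 rescaled version yields 14 + 14 + 27 + 27 = 82, which illustrates how rescaling shrinks the feature vector.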

A Comparative Study of Different Deep Learning Model for Recognition of Handwriting Digits

Article

Jan 2021

With the expansion of Artificial Neural Networks (ANN), Deep Learning (DL) has brought an interesting turn in the various fields of Artificial Intelligence (AI) by making it smarter and more efficient than what we had even 10-2 years back. DL has been in use in various fields due to its versatility. The Convolutional Neural Network (CNN) is at the major point of advancement that brings together the ANN and innovative DL techniques. In this research paper, we have contrived a multi-layer, fully connected neural network (NN) with 10 and 12 hidden layers for handwritten digit (HD) recognition. The testing is performed on the publicly available MNIST handwritten database. We selected 60,000 images from the MNIST database for training, and 10,000 images for testing. Our multi-layer ANN (10), ANN (12) and CNN are able to achieve an overall accuracy of 99.10%, 99.34% and 99.70% respectively while determining digits using the MNIST handwriting dataset.

Support Vector Machine Optimized by Fireworks Algorithm for Handwritten Digit Recognition

Chapter

Jan 2020

Handwritten digit recognition is an important subarea in the object recognition research area. Support vector machines represent a very successful recent binary classifier. Basic support vector machines have to be improved in order to deal with real-world problems. The introduction of a soft margin for outliers and misclassified samples, as well as a kernel function for non-linearly separable data, leads to the hard optimization problem of selecting parameters for these two modifications. Grid search, which is often used, is rather inefficient. In this paper we propose the use of one of the latest swarm intelligence algorithms, the fireworks algorithm, for tuning the support vector machine parameters. We tested our approach on the standard MNIST database of handwritten images, and with a selected set of simple features we obtained better results compared to other approaches from the literature.

A Comparison of quantized convolutional and LSTM recurrent neural network models using MNIST

Conference Paper

Nov 2019

In this paper, a comparative software analysis of two neural network models is presented, namely the Convolutional Neural Network (CNN) and the Long Short Term Memory (LSTM) neural network. The evaluation is performed using the well-known deep learning database MNIST to check the accuracy, model size, speed and complexity of the two models for future digital realization on reconfigurable hardware. In addition, we optimize the size of the two models by quantizing the weight width to 8 bits instead of 32 bits. The results show an extensive reduction in the size of each model (by 10X) with a slight drop in accuracy. The results also show that the CNN is more accurate and much faster than the LSTM, making it the best model to be implemented on reconfigurable hardware.

Keywords: Convolutional neural network; Long Short Term Memory Recurrent Neural Networks; MNIST

I. INTRODUCTION

Deep Neural Networks have proven to be increasingly useful and more accurate than ever. The task of a classifier is to obtain knowledge from a known example set with labels, to be able to correctly label data outside of the set [1]. Convolutional Neural Networks, which are a type of deep neural network, have shown outstanding performance, especially in computer vision. Long Short Term Memory neural networks (LSTM) have also been on the rise, as they have come to show excellent results over older recurrent networks. Applications of deep neural networks, or deep learning, have proven to be endless, ranging from image recognition [2] and speech applications to medical diagnosis using DNA and other medical data [3]. Use cases of CNNs include image recognition [2], and many more. Some use cases of LSTM include document modeling [4] and speech recognition [5][6]. Our work here is a comparison between two neural network models in terms of the tradeoff between complexity and accuracy. It takes into account the accuracy, speed, and size of the two inherently different models, namely the convolutional neural network and the Long Short Term Memory recurrent neural network. We intend to show the tradeoff of accuracy against model size and computational cost, or complexity. The aim is to find an efficient model among the LSTM and CNN to be realized in the future on reconfigurable hardware such as FPGAs.
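
The idea of storing weights in 8 bits instead of 32 bits, mentioned in the abstract above, can be illustrated with a small sketch. It assumes symmetric linear quantization with a single scale per weight list, which is one common scheme; the text does not say which scheme the authors used, and the function names are hypothetical.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale (assumed scheme)."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero list, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]
```

Each recovered weight differs from the original by at most about half the scale, which is one way to see why accuracy may drop only slightly when the weight range is modest.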

High Quality Wavelets Features Extraction for Handwritten Arabic Numerals Recognition

Article

Apr 2019

Arabic handwritten digit recognition is the science of recognition and classification of handwritten Arabic digits. It has been a subject of research for many years with rich literature available on the subject. Handwritten digits written by different people are not of the same size, thickness, style, position or orientation. Hence, many different challenges have to be overcome to resolve the problem of handwritten digit recognition. The variation in the digits is due to the writing styles of different people, which can differ significantly. Automatic handwritten digit recognition has wide applications such as automatic processing of bank cheques, postal addresses, and tax forms. A typical handwritten digit recognition application consists of three main stages, namely feature extraction, feature selection, and classification. One of the most important problems is feature extraction. In this paper, a novel feature extraction approach for off-line handwritten digit recognition is presented. Wavelet-based analysis of image data is carried out for feature extraction, and then classification is performed using various classifiers. To further reduce the size of the training dataset, high entropy subbands are selected. To increase the recognition rate, individual subbands providing high classification accuracies are selected from the over-complete tree. The extracted features are also normalized to standardize the range of independent variables before providing them to the classifier. Classification is carried out using k-NN and SVMs. The results show that the quality of the extracted features is high, as almost equally high classification accuracies are obtained for both classifiers, i.e. k-NN and SVMs.

Projectron - A Shallow and Interpretable Network for Classifying Medical Images

Article

Apr 2019

This paper introduces the "Projectron" as a new neural network architecture that uses Radon projections to both classify and represent medical images. The motivation is to build shallow networks which are more interpretable in the medical imaging domain. Radon transform is an established technique that can reconstruct images from parallel projections. The Projectron first applies global Radon transform to each image using equidistant angles and then feeds these transformations for encoding to a single layer of neurons followed by a layer of suitable kernels to facilitate a linear separation of projections. Finally, the Projectron provides the output of the encoding as an input to two more layers for final classification. We validate the Projectron on five publicly available datasets, a general dataset (namely MNIST) and four medical datasets (namely Emphysema, IDC, IRMA, and Pneumonia). The results are encouraging as we compared the Projectron's performance against MLPs with raw images and Radon projections as inputs, respectively. Experiments clearly demonstrate the potential of the proposed Projectron for representing/classifying medical images.

Projectron -- A Shallow and Interpretable Network for Classifying Medical Images

Preprint

Mar 2019

This paper introduces the `Projectron' as a new neural network architecture that uses Radon projections to both classify and represent medical images. The motivation is to build shallow networks which are more interpretable in the medical imaging domain. Radon transform is an established technique that can reconstruct images from parallel projections. The Projectron first applies global Radon transform to each image using equidistant angles and then feeds these transformations for encoding to a single layer of neurons followed by a layer of suitable kernels to facilitate a linear separation of projections. Finally, the Projectron provides the output of the encoding as an input to two more layers for final classification. We validate the Projectron on five publicly available datasets, a general dataset (namely MNIST) and four medical datasets (namely Emphysema, IDC, IRMA, and Pneumonia). The results are encouraging as we compared the Projectron's performance against MLPs with raw images and Radon projections as inputs, respectively. Experiments clearly demonstrate the potential of the proposed Projectron for representing/classifying medical images.

A Novel Feature Extraction Method Based on Histogram and Mathematical Morphology for Isolated Handwritten Greek Characters Recognition

Article

Sep 2018

Recognizing isolated handwritten characters written in multiple styles is a challenging research problem. In this paper, we propose a novel feature extraction method for character recognition based on mathematical morphology and histogram techniques in the vertical, horizontal, diagonal, and anti-diagonal directions, since feature extraction is an important step in many image processing tasks. In this context, we present two comparisons for isolated handwritten Greek character recognition. The first compares the hybrid methods used for feature extraction, which combine mathematical morphology with the histogram method. The second determines which of three distances used in classification is the most powerful: the Euclidean, Manhattan, and Minkowski distances. For this purpose, we pre-process each character image with different techniques. Furthermore, our experimental results provide extensive comparisons, which show that our method is superior in recognizing different characters and demonstrate the performance of the novel feature extraction method on the one hand and of the Euclidean distance in classification on the other.
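
The three distances compared in the abstract above are closely related, since Manhattan and Euclidean are the Minkowski distance with orders 1 and 2. The sketch below shows them together with a simple nearest-neighbour label lookup. The function names and the nearest-neighbour rule are illustrative assumptions; the abstract does not describe the exact classifier.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length feature vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def manhattan(a, b):
    """Minkowski distance with p = 1."""
    return minkowski(a, b, 1)


def euclidean(a, b):
    """Minkowski distance with p = 2."""
    return minkowski(a, b, 2)


def nearest_label(query, samples, labels, distance=euclidean):
    """Return the label of the stored sample closest to the query (assumed 1-NN rule)."""
    best = min(range(len(samples)), key=lambda i: distance(query, samples[i]))
    return labels[best]
```

Passing a different `distance` function to `nearest_label` is enough to compare how the choice of metric changes the predicted labels on the same feature vectors.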
