
Knowledge Extraction in Digit Recognition Using MNIST Dataset: Evolution in Handwriting Analysis

International Journal of Knowledge Management

January 2021
17(4):52-75

DOI:10.4018/IJKM.2021100103

License
CC BY 3.0

Authors:

Rohit Rastogi

ABES Engineering College

Himanshu Upadhyay

Dr. Himanshu face hospital

Akshit Rajan Rastogi

ABES Engineering College

Divya Sharma

Lovely Professional University


In handwriting recognition, traditional systems have relied heavily on handcrafted features and a massive amount of prior data and knowledge. Deep learning techniques have been the focus of research in handwritten digit recognition and have achieved breakthrough performance in the last few years for knowledge extraction and management. Knowledge management (KM) and the knowledge pyramid help the project through their relationship with big data and IoT. The layers were selected randomly, and the performance was found to differ across all the cases. Data layers of the knowledge pyramid are formed by the sensors and input devices, whereas knowledge layers are the result of knowledge extraction applied on data layers. The knowledge pyramid and KM make it easier to use IoT and big data. In this manuscript, knowledge management principles are used to capture handwritten gestures numerically and have them recognized correctly by the software. The application of AI and DNN has increased the acceptability significantly. The accuracy is better than that of other available software on the market.





Volume 17 • Issue 4

This article, originally published under IGI Global’s copyright on October 1, 2021 will proceed with publication as an Open Access article

starting on February 15, 2024 in the gold Open Access journal, International Journal of Knowledge Management (converted to gold Open

Access January 1, 2022), and will be distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/

licenses/by/4.0/) which permits unrestricted use, distribution, and production in any medium, provided the author of the original work and

original publication source are properly credited.

*Corresponding Author







Rohit Rastogi, ABES Engineering College, Ghaziabad, India

https://orcid.org/0000-0002-6402-7638

Himanshu Upadhyay, ABES Engineering College, Ghaziabad, India

Akshit Rajan Rastogi, ABES Engineering College, Ghaziabad, India

Divya Sharma, ABES Engineering College, Ghaziabad, India

Prankur Bishnoi, ABES Engineering College, Ghaziabad, India

Ankit Kumar, ABES Engineering College, Ghaziabad, India

Abhinav Tyagi, ABES Engineering College, Ghaziabad, India






Convolutional Neural Network (CNN), Handwritten Digits Recognition, Hidden Layers, Machine Learning,

MNIST Dataset



Handwriting recognition has attracted researchers' attention since 1998, and nearly every class of algorithm has been applied to it at some point. The error rate obtained by a linear classifier was 12% in 1998, and it gradually decreased to 0.23% in 2012 with convolutional nets. Nowadays, with the drastic increase in the amount of available information, researchers and AI specialists are working to develop and validate unsupervised learning strategies such as auto-encoders and deep learning models.




In this research work, a handwriting recognition framework is implemented with the well-known MNIST dataset. It is a large database that is used for training and evaluating various image processing systems. Machine learning is one of the fields where this data is regularly used to train and test various systems. By using KM to convert the data obtained into knowledge, it can help in solving many other problems.

A digit recognition system (DRS) belongs to a subfield of image processing and is used to scan images or documents with the help of the MNIST dataset. The main goal of this research work is to implement a project that scans image digits using basic image correlation, also known as matrix matching, and to discuss its accuracy. The six cases produced different outputs, as shown in subsequent sections; the cases differ from each other and show different accuracy and performance. Still, the rapid growth of handwritten numerical data generated recently and the rising availability of massive processing power call for improvements in recognition accuracy and deserve further investigation (Hamad, K. et al., 2016).

Convolutional neural networks (CNNs) are found to be very effective in recognizing the structure of handwritten digits and characters, and they also find distinctive features automatically. This makes CNN the most suitable available solution for handwritten digit recognition problems. We have introduced a suitable combination of learning parameters in designing a CNN that helps us achieve a new absolute record in classifying MNIST handwritten digits. The knowledge obtained can help in further training machines, in making predictions, and in reducing the computation required to understand the whole image.



Digit recognition systems are used on a very large scale in today's world. Because handwritten digits are not always of the correct size, shape or style, many problems can occur when a machine has to find the correct digit in the real world (Dorosh, N. et al., 2020).

An ensemble architecture helps to increase the network's recognition accuracy, but it also introduces computational cost and high testing complexity. The objective here is therefore to reach comparable accuracy with a pure CNN architecture and without an ensemble. By aiming for accuracy even better than that of an ensemble architecture, the pure CNN architecture brings the additional benefit of reduced operational complexity and cost.



This research work is motivated by the aim of creating a model using a CNN (Convolutional Neural Network) that is able to receive and identify digits in the form of images. A further goal of the program is to learn about CNNs and apply them to handwriting recognition systems (Dorosh, N. et al., 2020). The aim of this research paper is to find the accuracy of a CNN in classifying handwritten digits using multiple hidden layers.

Our motive in this research work is to explore the various design possibilities, such as kernel size, receptive field, stride size, number of layers, padding and dilation, for CNN-based handwritten digit recognition.
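To make the interplay of these design parameters concrete, the following minimal sketch computes the spatial output size of a convolution along one dimension. The function name `conv_output_size` and the sample settings are illustrative assumptions, not values taken from the experiments in this paper; the formula is the standard one for a convolution without special boundary handling.

```python
def conv_output_size(n, kernel, stride=1, padding=0, dilation=1):
    """Output size along one dimension for an input of size n (hypothetical helper)."""
    # A dilated kernel covers dilation * (kernel - 1) + 1 input positions.
    effective_kernel = dilation * (kernel - 1) + 1
    return (n + 2 * padding - effective_kernel) // stride + 1


# Illustrative settings on a 28-pixel-wide MNIST image.
print(conv_output_size(28, kernel=3))              # 26
print(conv_output_size(28, kernel=3, padding=1))   # 28
print(conv_output_size(28, kernel=3, stride=2))    # 13
print(conv_output_size(28, kernel=3, dilation=2))  # 24
```

A larger stride or dilation shrinks the feature map faster, while padding preserves its size; this is the trade-off explored when varying these parameters.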



Handwritten digit recognition is one of the practically important problems in pattern recognition applications. Digits appear in many types of data, such as vehicle number plates, written documents and meter readings. In the modern world, digit recognition plays an important role in many user authentication applications. It can be used in vehicle number plate recognition, which uses the number plate to identify the vehicle. Such number plate recognition systems are used in the areas around top government offices like the Parliament and the Supreme Court.

Digit recognition can also be used in postal mail sorting, handwriting recognition, form data entry, historical document preservation in archaeology departments, and old document automation in libraries and banks. All these areas deal with large databases and therefore demand high recognition accuracy, low computational complexity and consistent performance from the recognition system. In future, this study has major implications for checking handwritten answers in examinations and for human handwriting notifications. It can also be used to perform predictive analysis of terrorist messages after partially screening their content (Ahlawat, S. et al, 2020).



Gesture recognition is a means of human-machine interaction via body actions. The interaction between human and computer usually takes place through traditional input devices like the mouse and keyboard. Hand gestures can act as a useful medium for human-computer interaction and can make the interaction easier. Gestures vary from person to person in orientation as well as shape, so non-linearity exists here.

Recent research has shown that a Convolutional Neural Network (CNN) can be used for image representation and classification, as a CNN can learn complex and non-linear relationships among images. Gesture recognition based on machine learning has also developed rapidly in human-computer interaction since graphics processing units (GPUs) and artificial intelligence (AI) image processing were introduced. Machine learning algorithms such as support vector machines (SVM) and neural networks are widely used in gesture recognition (Islam, M. et al, 2019).



Digit recognition systems can also prove helpful for visually impaired persons. The input to such a system is pressure sensor data, obtained from a sensor pad that has 256 capacitive pressure sensors. The system correctly recognizes the 2-D shape of a digit placed on the sensor pad. In the first step of the recognition process, the Euler number is used to achieve a preliminary classification, while in the second step the fourth-order central moment is applied to correctly recognize the input. Deep learning using Convolutional Neural Networks is also becoming a popular technique for challenging computer vision applications, but its power consumption and bandwidth requirements limit its application in embedded and mobile systems with tight energy budgets. Bio-inspired Angle Sensitive Pixels (ASP), which are custom CMOS diffractive image sensors, are relevant in this context (Paul, A. et al, 2012).



The latest knowledge pyramid tries to set the knowledge hierarchy within the context of the real world. This knowledge pyramid is inverted compared with the old knowledge pyramid: there is more information than data, more knowledge than information, and more wisdom than knowledge.

The data layers of this pyramid are the results of sensors and input devices, whereas the knowledge layers are the result of knowledge extraction applied on data layers.

As per the revised knowledge pyramid, reality (i.e., a real-world object, person or situation) is at the centre. Suitable data is collected through sensors or social media, and analysis is performed on it. After refining and preprocessing operations, information is obtained; through mining exercises, knowledge is obtained; and through visualizations and graphical analysis, AI and ML applications, and predictive models, wisdom is achieved. This wisdom is the most refined product and can be used for learning, from the individual to the organizational level, for different purposes.



Knowledge management is the technique or procedure used to capture, distribute and use knowledge in an efficient manner. It is an approach used in various disciplines to achieve organizational goals. Knowledge management encompasses psychology and epistemology, along with cognitive science. A better understanding of knowledge management can increase throughput and innovation, and can amplify what is known from both an individual and an organizational perspective.

In recent times, the ability to refine a significantly large volume of data, information and knowledge to gain computational and competitive advantage has been gaining strength, as has the significance of raw data and text analytics to this effort.

The basic components of a knowledge management strategy can be made useful for the management of pyramid activities with seven steps. The knowledge pyramid and KM make it easier to use IoT and big data, opening new opportunities for future work too (Jennex, M.E., 2017).



CNN plays a major role in image processing, where it operates on small localized areas of an image. CNN is also used in nanotechnologies for tasks such as classification (Dorosh, N. et al., 2020). There are many other image processing methods, but CNN is preferred here because it can be built in Keras with one or multiple layers that work together for fault detection and classification, which is why high accuracy is achieved (Gulli, Antonio et al., 2020).

As shown in Figure 1, the CNN method uses seven layers: one input layer, followed by five hidden layers, and one output layer that is used to recognize handwritten digits.

Because this is a well-studied problem, multiple machine learning techniques such as regression, KNN, SVMs and neural networks have been investigated to recognize digits with high accuracy (as per Fig. 2) (Umapada Pal et al., 2012).

As noted in existing research, KNN and SVM classifiers may cause scalability difficulties, and may be comparable to the default of the second classifier.

Figure 1. A Seven Layered Convolution Neural Network for Digit Recognition






To build a strong neural network for this project, we should make use of Knowledge Management (KM). KM involves several disciplines, such as psychology and cognitive science. The objective of KM is to gain insight into the data that we are using and to enable people and firms to cooperate, share, create, use and reuse knowledge. A very large amount of data is present around us and all over the internet, but acquiring knowledge from that data requires an understanding of KM. Neural networks can be trained better with the help of this knowledge (Jennex, M.E., 2017).



OmniPage Ultimate

This software combines artificial intelligence and neural networks. It provides the best results. It recognizes more than 120 languages, which helps make the converted text as accurate as possible. It has the latest OCR technology, which picks up handwritten text very well. It makes it easy to edit, share and search PDF documents. A free trial is also available from the official website. It reduces the time spent converting files, lowers operational costs, and simplifies managing and sharing documents (Hamad, K. et al, 2016).

TopOCR

It is one of the best handwriting recognition software packages. It is an optical character recognition application. It uses a source image captured by a scanner or digital camera and offers a dual-pane format that displays the original image on the left and the conversion on the right. When the text is loaded in the right-side panel, one can read it and also correct any mistakes. After completion, the text can be saved in an array of different formats. It also has a text-to-speech engine. It is available on Windows only. It supports 11 languages and also has a PDF export feature. Its free trial version is also available (Sahu, N. et al, 2017).

Simple OCR

It is a freeware tool that recognizes approximately 120,000 words and allows more words to be added to its dictionary. It has more than 97% accuracy. It even identifies formatted text, and it can also be set to ignore formatting. It is a speedy tool. It is available on desktop only. It is a popular freeware OCR software with hundreds of thousands of users worldwide. It is 100% free and not limited in any way. It offers a despeckle (noisy document) feature for cases where the handwriting being converted is messy. It can be set to decipher whole documents, portions, or multiple documents in batches (Hamad, K. et al, 2016).

Software For Handwriting Recognition & Transformation For Dark Data

Since the governments of many countries around the globe are facing pressure to be more active and transparent in their functioning, all processes and events have to be brought before the public to make the system transparent. In particular, secret operations, foreign policies, and heavy defense and healthcare purchases are areas where the challenge is that hidden transactions are recorded in cursive handwriting, signatures, and constrained and unconstrained handprints, all of which have to be understood.

Figure 2. Error Rate of Different Classification Algorithms

Such software is able to detect and reduce high levels of image degradation and noise in order to improve accuracy and optimize performance. Handwritten data and fields available in free formats are a challenge, and they contain structure within big data that plays a crucial role in big organizations.

(Ref. https://www.parascript.com/blog/best-icr-software-for-handwriting/)

Besides this, professionals are also trying to design software that can read and understand complex and cryptically coded human handwriting, so that terrorist messages can be understood and dealt with in advance.



During the project we used various tools and techniques to get the best results. We used Python as our base tool for implementing text recognition through a CNN algorithm with 5 layers, along with the MNIST dataset of numerals.

The CNN algorithm is one of the algorithms used most prominently in the field of computer vision. It is widely used for object detection and recognition in images, with great efficiency compared to other neural network algorithms. This can be easily observed in Figure 2 of this document.



MNIST stands for "Modified National Institute of Standards and Technology". It is a large dataset widely used in the field of computer vision and convolutional neural networks (CNN). The MNIST dataset is used for training and testing systems that take numerals as data. It holds grayscale images of handwritten digits.

The dataset contains 60,000 images that are used for training and cross validation. An additional 10,000 images are present for testing purposes. All the images in the dataset are of size 28*28, which makes the total dimension of the input vector 28*28 = 784 (Sharma, I., et al., 2020).

As shown in Figure 3, all the digits are of size 28*28, which gives a uniform and stable format for recognizing the handwritten digits.
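The relationship between the 28*28 image and the 784-element input vector can be illustrated with a minimal sketch. The synthetic image below (a blank canvas with a single vertical stroke) and the helper name `flatten` are assumptions made for illustration only; they are not part of the MNIST data or of the authors' code.

```python
ROWS, COLS = 28, 28


def flatten(image):
    """Turn a 28x28 image (a list of rows) into a 784-element vector, row by row."""
    if len(image) != ROWS or any(len(row) != COLS for row in image):
        raise ValueError("expected a 28x28 image")
    return [pixel for row in image for pixel in row]


# Synthetic stand-in for a digit: a vertical stroke in column 14, rows 4..23.
image = [[0] * COLS for _ in range(ROWS)]
for r in range(4, 24):
    image[r][14] = 255

vector = flatten(image)
print(len(vector))            # 784
print(vector[4 * COLS + 14])  # 255, the pixel at row 4, column 14
```

The pixel at row r and column c ends up at index r * 28 + c of the vector, so no information is lost by flattening; only the 2-D layout becomes implicit.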



As per the current structure of the knowledge pyramid, which is now inverted, the knowledge obtained from the dataset will initially be raw and will be refined from semi-structured and fully structured data. To convert it into wisdom, complex mathematical operations will be performed so that the knowledge obtained is useful for individuals and organizations and can be applied in designing runtime software to check fraud and crime. Knowledge management will play a crucial role here.





Step 1:

In this step, the handwritten document is scanned and the digits are extracted from the document by the system. Then, the digits are arranged in a sequence of characters as a string. This input can serve as a prominent base for gathering various vital information derived from text extraction. The devices used in this process can be a digital camera or a scanner.




Step 2:

In this step, the digits of the extracted image are resized to 28*28 dimensions. This step is important because the dimensions of a handwritten document's data are specific to the writing sense and stroke style of the writer, so this step brings the input data to a fixed standard size. The images in the MNIST dataset also have dimensions of 28*28, which makes the comparison more convenient. This modified data can further be used as base data for other comparisons. The data can pile up and contribute to a very large dataset, i.e. big data, which can yield vital information and useful outcomes on processing.
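The resizing described in Step 2 can be sketched as follows. The paper does not state which interpolation method is used, so nearest-neighbour sampling is assumed here purely for illustration, and the function name `resize_nearest` is hypothetical.

```python
def resize_nearest(image, out_rows=28, out_cols=28):
    """Resize a 2-D image (list of rows) by nearest-neighbour sampling (assumed method)."""
    in_rows, in_cols = len(image), len(image[0])
    return [
        [
            # Map each output pixel back to the source pixel it falls on.
            image[r * in_rows // out_rows][c * in_cols // out_cols]
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


# A 56x56 test image whose pixel values encode their own position.
big = [[r * 56 + c for c in range(56)] for r in range(56)]
small = resize_nearest(big)

print(len(small), len(small[0]))   # 28 28
print(small[3][5] == big[6][10])   # True: every second pixel is kept
```

In practice an image library would usually handle this step with smoother interpolation; the sketch only shows why every input, whatever its original size, reaches the network as a 28*28 grid.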

Step 3:

In this step, multiple CNN layers are applied to the dataset, which makes the digits clearer. Along with this, image processing techniques are employed to classify the input. This step brings the handwritten document closer to a digital document. After processing, the handwritten document becomes a more convenient source from which one can gain insight into the real data it contains, and it can provide knowledge that can be used for decision making and further purposes.

Step 4:

In this step, handwritten digits are tested against the MNIST dataset. In this process, each numeral is checked against every numeral present in the dataset, giving an accuracy percentage. This phase can be used to obtain and create wisdom for a particular individual.

Figure 3. Samples of MNIST Dataset






All the diagrams necessary to explain the functionality of our research work are given below (Fig. 5 to Fig. 8).

The activity diagram depicts how the procedure is carried out by the digit recognition system. After starting the process, we upload the digit image. If the format of the image is not supported, the system shows an error message and takes us back to the initial stage. If the format is supported, the system passes the image through the image acquisition stage and then through the image preprocessing stage, where the image is converted to grayscale, then converted to binary format, and finally normalized. If the preprocessed image is not clear, the system again shows an error message and takes us back to the initial stage. If the preprocessed image is clear, it passes through the remaining stages: image segmentation, feature extraction, and classification or recognition. After completing all the steps, the system generates the result at the end.
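The grayscale and binary conversion stages of the preprocessing can be sketched as below. The luminance weights are the common ITU-R BT.601 coefficients and the threshold of 128 is an arbitrary assumption; neither is specified in the paper. The output follows the convention stated in the text, with white (paper) pixels mapped to 0 and black (ink) pixels mapped to 1, so the binary image is already normalized to the 0..1 range.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to 0..255 gray levels (BT.601 weights, assumed)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def binarize(gray_image, threshold=128):
    """Map dark (ink) pixels to 1 and light (paper) pixels to 0; threshold is assumed."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray_image]


# Tiny 2x2 stand-in for a scanned image: white, black, dark red, near-white.
rgb = [
    [(255, 255, 255), (0, 0, 0)],
    [(200, 30, 30), (250, 250, 250)],
]
gray = to_grayscale(rgb)
binary = binarize(gray)

print(binary)  # [[0, 1], [1, 0]]
```

A fixed global threshold is the simplest choice; scans with uneven lighting would typically need an adaptive threshold instead.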

The network takes 784 neurons as input, which means the input layer consists of 28*28 pixels. On the grayscale, a white pixel has the value 0 and a black pixel has the value 1. The CNN model has 5 hidden layers (as per Equations 1, 2 and 3) (as per Fig. 9 and 10).

For the performance of cost function the equation is expressed as -

The first hidden layer is convolution layer 1, which is responsible for feature extraction from the input data. In addition, it contains many feature maps with learnable kernels and rectified linear units (ReLU).

The kernel size determines the locality of the filters. ReLU is used as the activation function at the end of each convolution layer and of the fully connected layer to enhance model performance. The next hidden layer is pooling layer 1. It reduces the output information from the convolution layer and thereby decreases the number of parameters and the computational complexity of the model. The common types of pooling are max pooling, average pooling, and L2 pooling.

Figure 4. Simple Steps for Digits Recognition with MNIST




Here, max pooling is used to reduce the size of each feature map. Convolution layer 2 and pooling layer 2 have the same function as convolution layer 1 and pooling layer 1.
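As a minimal sketch of how max pooling reduces the size of a feature map, the following example applies a 2x2 window with stride 2. The window size, the stride and the sample values are assumptions chosen for illustration; the paper does not list the pooling parameters it used.

```python
def max_pool(feature_map, size=2, stride=2):
    """Keep the largest value in each size x size window of a 2-D feature map."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r + dr][c + dc]
                for dr in range(size)
                for dc in range(size)
            )
            for c in range(0, cols - size + 1, stride)
        ]
        for r in range(0, rows - size + 1, stride)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
pooled = max_pool(feature_map)
print(pooled)  # [[6, 5], [7, 9]]
```

The 4x4 map shrinks to 2x2, so the following layer processes a quarter of the values while the strongest response in each region is preserved.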

To reduce overfitting, the dropout regularization method is applied to fully connected layer 1. It randomly switches off specific neurons during training in order to improve network performance by making the network more robust. The output layer of the network has 10 neurons and determines the digits from 0 to 9.
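The dropout mechanism can be sketched as follows. This is a hedged illustration using the "inverted dropout" variant, in which surviving activations are scaled up during training so that nothing has to change at test time; the dropout rate of 0.5 and the function name `dropout` are assumptions, since the paper does not report the rate it used.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Randomly zero a fraction `rate` of activations during training (inverted dropout)."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    # Survivors are divided by `keep` so the expected activation stays unchanged.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
layer_output = [1.0] * 1000

dropped = dropout(layer_output, rate=0.5, rng=rng)
unchanged = dropout(layer_output, rate=0.5, rng=rng, training=False)

print(sum(1 for v in dropped if v == 0.0))  # roughly half of the 1000 units
print(unchanged == layer_output)            # True: dropout is disabled at test time
```

Because a different random subset of neurons is silenced in every training step, no single neuron can be relied on exclusively, which is what makes the trained network more robust.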

This is done by an algorithm known as gradient descent (as per Fig. 10).

The figure shows the graphical representation of cost vs. weight. The gradient descent algorithm can be unstable when training on very large data (as per Fig. 12).
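The cost-versus-weight picture can be made concrete with a one-dimensional sketch of gradient descent. The quadratic cost, the learning rate of 0.1 and the starting weight are illustrative assumptions; the actual network minimizes its cost over many weights at once, but the update rule has the same form.

```python
def cost(w):
    """Toy cost with its minimum at w = 3 (assumed for illustration)."""
    return (w - 3.0) ** 2


def gradient(w):
    """Derivative of the toy cost with respect to the weight."""
    return 2.0 * (w - 3.0)


def gradient_descent(w0, learning_rate=0.1, steps=50):
    """Repeatedly move the weight against the gradient and record the cost."""
    w = w0
    history = [cost(w)]
    for _ in range(steps):
        w -= learning_rate * gradient(w)
        history.append(cost(w))
    return w, history


w_final, history = gradient_descent(w0=0.0)
print(round(w_final, 3))  # 3.0
print(history[0])         # 9.0, the cost at the starting weight
```

Each step moves the weight downhill on the cost curve, so the recorded cost shrinks from 9.0 towards 0 as the weight approaches the minimum.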



The Convolutional Neural Network (CNN) is one of the main algorithms used to perform image recognition, object detection and object recognition on a given image. It is widely used for these tasks because it has very good accuracy compared to other neural networks.

Figure 5. Activity Diagram for Digit Recognition Systems




Figure 6. Flow Chart for Digit/Character Recognition System.




Figure 7. Sequence Diagram for Digit Recognition System.




Types of Layers

Let us understand how the layers work by taking the example of an image of dimension 32 x 32 x 3.

1. Input Layer: This layer stores the raw input of an image with dimension 32 x 32 x 3.

2. Convolution Layer: This layer performs the computations that produce the output volume. The output volume is obtained by computing the dot product between each filter and each image patch.

Figure 8. Use Case Diagram for Digit Recognition System.




3. Activation Function Layer: This layer applies an activation function to every element of the output of the second layer. Some of the most frequently used activation functions are Sigmoid, Tanh, ReLU and Leaky ReLU. The volume does not change in this layer.

4. Pool Layer: The main function of this layer is to decrease the size of the volume. Decreasing the volume makes the computation faster and reduces memory use. The pool layer also helps prevent overfitting. Two common types of pooling layers are max pooling and average pooling.

5. Fully-Connected Layer: This layer is a regular neural network layer. It takes input from the previous layer, applies an activation function, and outputs the class of the input image.
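The convolution and activation layers in the list above can be sketched on a single-channel example. The 4x4 image, the 2x2 edge-detecting kernel and the function names are assumptions made for illustration. As in most deep learning frameworks, the kernel is applied without flipping, so the operation is strictly a cross-correlation.

```python
def conv2d(image, kernel):
    """Slide a square kernel over the image and take the dot product with each patch."""
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


def relu(feature_map):
    """Replace negative responses by zero; the shape of the volume is unchanged."""
    return [[max(0, value) for value in row] for row in feature_map]


# A vertical bar two pixels wide, and a kernel that responds to left-to-right increases.
image = [[0, 1, 1, 0] for _ in range(4)]
kernel = [[-1, 1], [-1, 1]]

feature_map = conv2d(image, kernel)
activated = relu(feature_map)

print(feature_map)  # [[2, 0, -2], [2, 0, -2], [2, 0, -2]]
print(activated)    # [[2, 0, 0], [2, 0, 0], [2, 0, 0]]
```

The kernel responds positively at the left edge of the bar and negatively at the right edge; ReLU keeps only the positive response, and the 4x4 input becomes a 3x3 feature map because no padding is used.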



Here, a CNN is applied to the MNIST dataset for handwritten digit recognition in order to observe the accuracies. The accuracy is validated in Python through the training and validation accuracy (as per Table 1).

Figure 9. The Mathematical Operations Occurring at the First and Second Hidden Layer

Figure 10. The Mathematical Operations Occurring at the First and Second Hidden Layer




This table shows the maximum and minimum training accuracies obtained in six different cases, and it also provides the validation accuracies of the CNN.

In addition, the table covers all six cases, which provide the actual results for the handwritten digits with the help of the MNIST dataset (as per Fig. 12 to Fig. 17).

After applying MNIST recognition, the resulting output stage contains 2 layers and 10 neurons and recognizes the digits from 0 to 9.

• In this case, pooling 1 and convolution 2 are used, which are connected to each other during the recognition or testing of digits.

Figure 11. The Gradient Descent Method for Specific Neurons

Figure 12. The Mathematical Operations Occurring at the Intermediate Hidden Layer




• In this case, two convolution layers are used, followed by pooling layers, flatten layers and fully connected layers. A dropout is used in this case, and the accuracy of digit recognition is validated.

• The minimum training accuracy is found to be approx. 98.11% at epoch 12 and the maximum training accuracy is found to be approx. 97.18% at epoch 14.

Figure 13. Observed accuracy for case one, in which the first hidden layer is convolution layer one, which uses 32 filters of size 3*3 pixels for recognition.

Figure 14. Accuracy of training and validation for the layers as shown.




Figure 18. In this case, the total number of hidden layers which make the combination using

pooling and convolutions layers also show the different output which affect the accuracy

of the digit recognition.

• These six cases show different outputs with different layers, because every layer affects the output during the recognition of digits.

• From the above cases, the maximum accuracy was found to be nearly 99.13% in case 5 and the minimum accuracy was found to be approx. 96.55% in case 8 during digit recognition.

Figure 15. Observed Accuracy for case 3

Figure 16. Observed Accuracy for case 4




The authors planned to build a multi-layer feed-forward neural network using Keras and an interface for expressing machine learning algorithms in Python.

This network contains perceptrons inside layers that take in the input and pass the information on to the next part. The team planned to build a multi-layer CNN with convolution layers, max-pooling layers, and a final output layer with softmax activation.
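The role of the softmax output layer can be sketched as follows: it turns the ten raw scores of the output neurons into probabilities that sum to 1, and the predicted digit is the index of the largest one. The raw scores below are made-up values for illustration, not outputs of the authors' model.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs of the 10 output neurons (one per digit 0..9).
scores = [0.1, 0.2, 4.0, 0.3, 0.0, 0.5, 0.1, 0.2, 0.4, 0.1]
probabilities = softmax(scores)

predicted_digit = max(range(10), key=lambda d: probabilities[d])
print(predicted_digit)                         # 2
print(abs(sum(probabilities) - 1.0) < 1e-9)    # True
```

The probability attached to the winning digit is a natural confidence value for a single prediction, which may be how a per-image percentage next to a predicted digit is obtained, although the paper does not state this explicitly.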



With the help of the CNN, multiple layers, the MNIST dataset and other required methods, handwritten digits are recognized, giving the output shown below (as per Fig. 18, 19 and 20).

Our model treats the input image as new and unseen by the system and then predicts the integer that the image represents.

The load_image() function returns the loaded image, ready for classification. After loading the input image, our system forces the image into grayscale format and then forces its size to 28×28 pixels.

Next, we call the predict_classes() function to predict the digit that the image represents.

Output-1

As shown in the above image, the digit 2 is written on the input image, and the model predicted this integer with an accuracy of 99%.

Figure 17. Observed Accuracy for case 5




Figure 18. Model output: the digit 2 is predicted correctly with an accuracy of 99%.

Figure 19. Model output: the digit 6 is predicted correctly with an accuracy of 77%.




Output-2

As shown in the above image, the digit 6 is written on the input image, and the model predicted this integer with an accuracy of 77%.

Output-3

As shown in the above image, the digit 5 is written on the input image, and the model predicted this integer with an accuracy of 70%.



The system's accuracy depends on its training, on how good the handwriting of the sample is, and on how easy it is to differentiate each digit from the others. Some digits are quite similar to each other, such as 1 and 7, 6 and 8, 3 and 8, and 9 and 0; this confuses the system, can be a high hurdle, and may even reduce its accuracy.

The accuracy among all observations in the experiments was found to be 97.07% when the number tested was 6. In addition, the miss ratio in all our tests was approximately 0.049449, and the least testing miss ratio was around 0.21313 when the number tested was 1. A low miss ratio lets the CNN perform better in terms of image resolution and noise handling. In future work, we will observe how the overall classification accuracy varies with the batch size and the number of hidden layers. We can also improve the accuracy of our model by tuning the pixel scaling, the learning rate and the model depth.
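One simple way to check which of these look-alike digits a model actually confuses is to count its misclassifications by pair. The sketch below is a minimal illustration; the function name `confused_pairs` is hypothetical, and the labels are made up for the example rather than taken from the paper's experiments.

```python
from collections import Counter


def confused_pairs(true_labels, predicted_labels):
    """Count misclassifications as unordered (digit, digit) pairs."""
    pairs = Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        if truth != guess:
            pairs[tuple(sorted((truth, guess)))] += 1
    return pairs


# Made-up labels for illustration only; not results from the paper.
y_true = [1, 7, 3, 8, 6, 9, 0, 1]
y_pred = [7, 7, 8, 8, 8, 0, 0, 1]

pair_counts = confused_pairs(y_true, y_pred)
for pair, count in pair_counts.most_common():
    print(pair, count)
```

Pairs with the highest counts indicate where additional training samples or tuning are most likely to help.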



The six cases produced different outputs, as shown above; each case differs from the others in accuracy and performance. The layers were selected randomly, which is why the performance differs from case to case.

Figure 20. Model output: the digit 5 is predicted correctly with an accuracy of 70%.




The maximum and minimum accuracy were observed in different cases, using different algorithms and methods on the MNIST dataset. Converting data to knowledge requires much more computational power than working on data alone, which increases the system requirements of the project. KM differs from working on data only; it is much more complex and requires higher skill. Although the system achieves almost perfect accuracy, it cannot reach the 100% mark. Its computational cost and testing complexity may be low compared to an ensemble architecture, but they are still not trivial. Some digits are quite similar to each other, such as 1 and 7, 6 and 8, 3 and 8, and 9 and 0; this confuses the system, can be a high hurdle, and may even reduce its accuracy. KM helps in understanding data and making predictions, but it is still not 100% accurate or entirely safe for public uses such as traffic fines. KM requires large amounts of data to train the machine and improve accuracy by reducing errors, so although the machine starts with low accuracy, it approaches 100% accuracy as the amount of data received increases. People have different handwriting, and some of it is hard to read or distinguish even for humans; such cases are a challenge for the system. The system's accuracy depends on its training, on how good the handwriting of the sample is, and on how easy it is to differentiate each digit from the others. The system has no capability to identify its own errors and notify users or correct them.





Multiple methods can help in recognizing digits, for example DNN (Deep Neural Network), DBN (Deep Belief Network), and CNN (Convolutional Neural Network). In this research paper, we combined a CNN with the MNIST dataset, which provided higher accuracy and a clear image with high performance. To enhance the performance of the digit recognition system, we tested various versions of this network in order to avoid costly feature extraction and complex digit image processing. The data and knowledge can also be used for future research. IoT and big data are major consumers of the knowledge attained by our project. This knowledge can also help in training neural networks and AI systems. With this research work, we also left behind the traditional recognition system, which was a rather complex ensemble approach. Still, poor handwriting, cursive letters and intentionally bad handwriting remain a great challenge in this domain. Extracting sufficient knowledge to convert into wisdom is difficult for current workers in this area.



This system is capable of thoroughly investigating the different parameters and objectives of the CNN architecture, which helps provide the best recognition accuracy for the combined dataset. The different cases were tested with different algorithms, and the maximum and minimum accuracy were noted. The maximum accuracy observed was 99.13% and the minimum accuracy observed was 92.42%. This digit recognition system provided higher accuracy, and it will help expedite large-scale exhibition systems.



Research can be performed on various architectures of the Convolutional Neural Network (CNN). Domain-specific recognition can also be investigated. With the help of different algorithms, we can optimize training parameters such as the number of layers, the kernel sizes of the filters, the learning rate, etc. The algorithm proposed in this research paper can be modified further to improve the minimum or maximum recognition accuracy.
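A systematic way to explore such training parameters is to enumerate every combination in a small search grid. The sketch below is only an illustration of the idea: the parameter names and candidate values in `search_space` are hypothetical and do not come from the paper, and the training step itself is omitted.

```python
from itertools import product

# Hypothetical candidate values; the paper does not specify a search grid.
search_space = {
    "conv_layers": [1, 2, 3],
    "kernel_size": [3, 5],
    "learning_rate": [0.01, 0.001],
}


def grid(space):
    """Yield one configuration dict per combination of candidate values."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


configs = list(grid(search_space))
print(len(configs))
print(configs[0])
```

Each configuration would then be passed to a training routine, and the validation accuracy of each run compared to pick the best setting.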




The algorithm we applied to the recognition of handwritten digits can also be applied to the recognition of handwritten characters. The proposed algorithm should also be applied to large databases in order to determine the efficiency of the system developed.

The knowledge pyramid and KM make it easier to use IoT and big data, opening new opportunities for future work. Advancements in related technologies open many ways to modify and improve the workings of this project. The data and knowledge extracted by this project are extremely valuable and flexible for future researchers. With IoT, machines can be trained to require no human interaction at all, as in automatic traffic fining through traffic cameras. The project has yet to achieve perfect accuracy, which can be a great topic of research (Jennex, 2017).

The proposed algorithm can be modified further for the recognition of broken characters or digits. This kind of recognition system is important because it can help in various fields such as forensics, reading postal addresses, forms filled in by candidates, and bank check amounts. With the help of machine learning, the system will also be able to produce precise and accurate outcomes.



In this research paper we have implemented a CNN algorithm with 5 layers to recognize digits in handwritten documents. For this we used the MNIST dataset, which consists of 60,000 images for training and an additional 10,000 images for testing.

While researching, we found that this CNN approach performs much better than the “Random Forest classifier”, “K nearest neighbors” and “support vector machine”, with an error rate of at most 1.28. It also gave better results because of the use of multiple layers and Keras, which helped reduce classification error.

This system is very helpful for solving real-time problems such as incorrectly written cheques in banking documents, forensics, etc. By employing the same strategy with letters of the alphabet, we may also use it as a full-fledged similarity index checker for handwritten documents, along with their digital conversion.

This technique is also helpful for visually impaired people, as that digital document can be

further translated to an audio file.

The knowledge attained through this project not only helps in automating other projects, it also helps with big data and IoT. AI and neural networks will especially benefit from it. The project's compatibility with knowledge-based technologies is high because it is a data-based project, and it provides good data for big data operations.

KM differs from working on data only; it is much more complex and requires higher skill. But as we move towards more advanced times and research better technologies, it will have a real impact on society.

While working on this, we found that our technique still has some weak areas. For example, a fluent stroke of the digit ‘3’ may sometimes appear similar to the digit ‘8’, and the same holds for ‘6’ and ‘0’, and ‘7’ and ‘1’. The technique could be a good replacement for the traditional OCR technique. Although we obtained exceptionally good results, we still have an error rate of approximately 1.28.






Ahlawat, S., Choudhary, A., Nayyar, A., Singh, S., & Yoon, B. (2020). Improved Handwritten Digit Recognition Using Convolutional Neural Networks (CNN). Sensors (Basel), 20(3344), 1–18. doi:10.3390/s20123344 PMID:32545702

Dorosh, N., & Fenenko, T. (2020, May 19). Recognition of MNIST Handwritten Digits And Character Set Research. National Metallurgical Academy of Ukraine. Retrieved from https://www.citethisforme.com/cite/sources/journalautociteeval

Gulli, A., & Pal, S. (2020). Deep Learning With Keras (2nd ed.). Digit Recognition System.

Hamad, K., & Kaya, M. (2016). A Detailed Analysis of Optical Character Recognition Technology. International Journal of Applied Mathematics, Electronics and Computers, 4(Special Issue), 244–249.

Islam, M., Hossain, M., Islam, R., & Andersson, K. (2019). Static Hand Gesture Recognition Using Convolutional Neural Network with Data Augmentation. Joint 8th International Conference on Informatics, Electronics & Vision (ICIEV) and 3rd International Conference on Imaging, Vision & Pattern Recognition (icIVPR). doi:10.1109/ICIEV.2019.8858563

Jennex, M. E. (2017). Big Data, the Internet of Things and the Revised Knowledge Pyramid. The Data Base for Advances in Information Systems, 48(4), 69–79. doi:10.1145/3158421.3158427

Pal, U., Jayadevan, R., & Sharma, N. (2012, March). Handwriting Recognition In Indian Regional Scripts: A Survey Of Offline Techniques. ACM Transactions on Asian Language Information Processing, 11(1), 1–35. doi:10.1145/2090176.2090177

Paul, A., & Bhattacharya, N. (2012). Digit recognition from pressure sensor data using Euler number and central moments. International Conference on Communications, Devices and Intelligent Systems (CODIS), 93–96. doi:10.1109/CODIS.2012.6422144

Sahu, N., & Sonkusare, M. (2017). A Study On Optical Character Recognition Techniques. The International Journal of Computational Science, Information Technology and Control Engineering, 4(1), 1–14.

Sharma, I. (2020, June 14). Handwritten Digits Recognition Using Google Tensorflow With Python. Data aspirant. Retrieved from https://dataaspirant.com/handwritten-digits-recognition-tensorflow-python/






As already discussed, each image has a dimension of 28×28, which gives 784 entries, each holding the intensity of one pixel of the image.

In the dataset, the first column, called “label”, is the digit that was drawn by the user. The rest of the columns contain the pixel values of the associated image.
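Reading one row of this tabular layout back into a label and a 28×28 image can be sketched as follows. The example builds a tiny in-memory CSV instead of opening a real file; the column names `pixel0` to `pixel783`, the sample row, and the helper `parse_row` are assumptions made for illustration, since the paper only names the “label” column.

```python
import csv
import io

SIDE = 28  # MNIST images are 28x28, i.e. 784 pixel columns per row


def parse_row(row):
    """Split one CSV row into (label, image), where image is a 28x28 list of ints."""
    label = int(row[0])
    pixels = [int(value) for value in row[1:]]
    if len(pixels) != SIDE * SIDE:
        raise ValueError(f"expected {SIDE * SIDE} pixel values, got {len(pixels)}")
    image = [pixels[r * SIDE:(r + 1) * SIDE] for r in range(SIDE)]
    return label, image


# In-memory stand-in for the dataset file: a header line plus one made-up row.
header = ["label"] + [f"pixel{i}" for i in range(SIDE * SIDE)]
sample = ["5"] + ["0"] * (SIDE * SIDE - 1) + ["255"]
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerow(sample)
buffer.seek(0)

reader = csv.reader(buffer)
next(reader)  # skip the header line
label, image = parse_row(next(reader))
print(label, len(image), len(image[0]), image[27][27])
```

With a real file, the `io.StringIO` buffer would be replaced by an opened CSV file, and `parse_row` applied to every remaining row.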

A sample of the tabular dataset is given in Figure 22.

1. Test Dataset:

As already discussed, each image has a dimension of 28×28, which gives 784 entries corresponding to image and pixel intensity. In this partial screenshot, some of the entries are visible; vertical entries depict pixel intensities and horizontal entries depict different images (see Figure 21).

2. Training Dataset:

A sample of this dataset is given in Figure 22.

As already discussed, each image has a dimension of 28×28, which gives 784 entries corresponding to image and pixel intensity. In this partial screenshot, some of the entries are visible; vertical entries depict pixel intensities and horizontal entries depict different images (see Figure 22).

Figure 21. MNIST dataset for testing




Rohit Rastogi received his B.E. degree in Computer Science and Engineering from C.C.S. Univ. Meerut in 2003, and the M.E. degree in Computer Science from NITTTR-Chandigarh (National Institute of Technical Teachers Training and Research, affiliated to MHRD, Govt. of India), Punjab Univ. Chandigarh in 2010. He is currently pursuing his Ph.D. in computer science at Dayalbagh Educational Institute, Agra, under the renowned professor of Electrical Engineering Dr. D.K. Chaturvedi, in the area of spiritual consciousness. Dr. Santosh Satya of IIT-Delhi and Dr. Navneet Arora of IIT-Roorkee have consented to co-supervise him. He is also presently working with Dr. Piyush Trivedi of DSVV Hardwar, India, in the Center of Scientific Spirituality. He is an Associate Professor in the CSE Dept. at ABES Engineering College, Ghaziabad (U.P., India), affiliated to Dr. A.P.J. Abdul Kalam Technical Univ. Lucknow (earlier Uttar Pradesh Tech. University). He is also preparing some interesting algorithms on Swarm Intelligence approaches like PSO, ACO and BCO. Rohit Rastogi is actively involved with Vichaar Krnati Abhiyaan and strongly believes that transformation starts within self.

Himanshu Upadhyay is a B.Tech. student in Computer Science Engineering at Dr. Abdul Kalam Technical University. He is currently working on “Smart Attendance Monitoring and Marking System” and “IOT based Agricultural Monitoring System”. His areas of interest include back-end development, Java, big data analysis, Python, DBMS, etc. His hobbies include listening to music, swimming, watching vlogs, cricket, etc.

Akshit Rajan Rastogi is an engineering student at AKTU University; he is presently in the 3rd year of the B.Tech CSE branch at ABES Engineering College, Ghaziabad. He is currently working on various research papers as an analyst and content developer. He loves doing research and finding new things related to the Computer Science & Information Technology field.

Divya Sharma is an engineering student at AKTU University, Lucknow. Presently she is in 3rd year B.Tech CSE

Branch in ABES Engineering College. She is currently working on various research papers as a content writer.

She loves to explore new things in the technical field. She has a keen interest in Machine learning. She wants to

serve the society in future with all her technical resources.

Prankur Bishnoi has achieved B.Tech. in CSE from AKTU University, Lucknow, India. Presently he is working on

various research papers as an analyst and content developer. He loves doing research and finding new things

related to the Computer Science & Information Technology field.

Ankit Kumar is an engineering student in AKTU University, presently he is in B.Tech graduate from ABES Engineering

College, Ghaziabad. He is currently working on various research papers as an analyst and content developer.

He loves doing research and finding new things related to the Computer Science & Information Technology field.

Abhinav Tyagi is a Computer Science Engineering student at Dr. Abdul Kalam Technical University, currently in the final year of his degree. He is working on “Smart Attendance Monitoring and Marking System” and “IOT based Agricultural Monitoring System”. His areas of interest include front-end development, Java, Python, machine learning, JavaScript (ES6), DBMS, etc. His hobbies include listening to music, swimming, watching vlogs, etc. He likes to explore and work on new ideas.

Figure 22. MNIST dataset for training
