Machine Learning and
Deep Learning in Real-
Time Applications
Mehul Mahrishi
Swami Keshvanand Institute of Technology, India
Kamal Kant Hiran
Aalborg University, Denmark
Gaurav Meena
Central University of Rajasthan, India
Paawan Sharma
Pandit Deendayal Petroleum University, India
A volume in the Advances in Computer and
Electrical Engineering (ACEE) Book Series
Published in the United States of America by
IGI Global
Engineering Science Reference (an imprint of IGI Global)
701 E. Chocolate Avenue
Hershey PA, USA 17033
Tel: 717-533-8845
Fax: 717-533-8661
E-mail: cust@igi-global.com
Web site: http://www.igi-global.com
Copyright © 2020 by IGI Global. All rights reserved. No part of this publication may be reproduced, stored or distributed in
any form or by any means, electronic or mechanical, including photocopying, without written permission from the publisher.
Product or company names used in this set are for identification purposes only. Inclusion of the names of the products or
companies does not indicate a claim of ownership by IGI Global of the trademark or registered trademark.
Library of Congress Cataloging-in-Publication Data
British Cataloguing in Publication Data
A Cataloguing in Publication record for this book is available from the British Library.
All work contributed to this book is new, previously-unpublished material. The views expressed in this book are those of the
authors, but not necessarily of the publisher.
For electronic access to this publication, please contact: eresources@igi-global.com.
Names: Mahrishi, Mehul, 1986- editor. | Hiran, Kamal Kant, 1982- editor. |
Meena, Gaurav, 1987- editor. | Sharma, Paawan, 1983- editor.
Title: Machine learning and deep learning in real-time applications / Mehul
Mahrishi, Kamal Kant Hiran, Gaurav Meena, and Paawan Sharma, editors.
Description: Hershey, PA : Engineering Science Reference, an imprint of IGI
Global, [2020] | Includes bibliographical references and index. |
Summary: “This book examines recent advancements in deep learning
libraries, frameworks and algorithms. It also explores the
multidisciplinary applications of machine learning and deep learning in
real world”-- Provided by publisher.
Identifiers: LCCN 2019048558 (print) | LCCN 2019048559 (ebook) | ISBN
9781799830955 (hardcover) | ISBN 9781799830962 (paperback) | ISBN
9781799830979 (ebook)
Subjects: LCSH: Machine learning. | Real-time data processing.
Classification: LCC Q325.5 .M3216 2020 (print) | LCC Q325.5 (ebook) | DDC
006.3/1--dc23
LC record available at https://lccn.loc.gov/2019048558
LC ebook record available at https://lccn.loc.gov/2019048559
This book is published in the IGI Global book series Advances in Computer and Electrical Engineering (ACEE) (ISSN:
2327-039X; eISSN: 2327-0403)
Copyright © 2020, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.

Chapter 9

Deep Learning in Engineering Education: Performance Prediction Using Cuckoo-Based Hybrid Classification

Deepali R. Vora
Vidyalankar Institute of Technology, India

Kamatchi R. Iyer
Amity University, India

DOI: 10.4018/978-1-7998-3095-5.ch009

ABSTRACT

The goodness measure of any institute lies in minimising dropouts and targeting good placements, so predicting students' performance is an interesting and important task for educational information systems. Machine learning and deep learning are emerging areas that truly entice more research practice. This research focuses on applying deep learning methods to educational data for classification and prediction. The educational data of students from the engineering domain, with cognitive and non-cognitive parameters, is considered. A hybrid model with a support vector machine (SVM) and a deep belief network (DBN) is devised: the SVM predicts class labels from preprocessed data, and these class labels, together with the actual class labels, act as input to the DBN to perform the final classification. The hybrid model is further optimised using cuckoo search with Lévy flight. The results clearly show that the proposed model, SVM-LCDBN, gives better performance than both the simple hybrid model and the hybrid model with traditional cuckoo search.

INTRODUCTION

Nowadays, Educational Data Mining (EDM) is a novel trend in the Knowledge Discovery in Databases (KDD) and Data Mining (DM) field, concerned with mining valuable patterns and practical knowledge from educational systems. One important goal of the educational system among many is tracking the performance of the student. Many techniques and algorithms are used to track the progress of students. However, evaluating the educational performance of students is challenging, as their academic performance pivots on varied constraints. This domain has gained importance with the increase in data volume and the development of new algorithms.
Data generated from various educational sources is explored using different methods and techniques in EDM; the multidisciplinary research that develops such methods and techniques is the focus of the field. Analysis of educational data can provide information about students' behaviours, based on which education policies can be enhanced further (Sukhija, Jindal, & Aggarwal, 2015, October). EDM encompasses the techniques, tools, and research intended for automatically extracting meaning from the large data repositories of educational systems.
According to Davies (Davis, 1998), “Education has become a commodity in which people seek to invest for their own personal gain, to ensure equality of opportunity and as a route to a better life.” Because of this, higher education providers compete mainly for students, funding, research, and recognition within the wider society.
It is important to study the data of students in professional courses, as producing better professionals is key to the growth of any nation. The higher education system faces two main challenges: finding placements and students dropping out. Analysis of educational data can help answer both challenges satisfactorily: predicting performance leads to better placements and minimises dropouts.
Predictive modelling is a statistical technique for predicting future behaviour. Predictive analytics is used widely in product management and recommendation; it is a powerful tool for understanding the data at hand and drawing useful insights from it. Figure 1 represents predictive analytics in education.
Machine learning (ML) is one of the most popular methods for predictive analytics. With the plethora of algorithms available, it is always interesting to discover which algorithm or technique is most suitable for the data under consideration. EDM is an area of research where predictive modelling is especially useful.
ML has become very popular among researchers because of the astonishing results its algorithms give for diverse data and applications. But when data grows enormously, simple ML models are neither efficient nor beneficial. Meanwhile, advances in hardware and software have made it possible to build more complex, hybrid architectural models for various DM and Big Data tasks. Big data is already challenging traditional ML models on efficiency and accuracy; various hybrid models have been proposed and tested in many domains to tackle these challenges, and have proved useful. Thus, applying a hybrid model in the education domain should also be useful.
ML is evolving to tackle new-age data, and one such advance is Deep Learning (DL). Nonlinear data analysis can be done effectively using deep learning, and the characteristics of the data can be analysed effectively using the layers of a deep learning model. DL is being applied in many domains, predominantly image processing and natural language processing (Deng & Yu, 2014). Thus, it is interesting to apply deep learning in the field of education.
This chapter addresses three main objectives:
1. Identifying areas, such as EDM, where deep learning is applicable and useful
2. Applying a hybrid classification method using deep learning on educational data
3. Improving the hybrid model by applying the cuckoo search with Lévy flight optimisation technique
BACKGROUND
Deep Learning
Hinton and colleagues proposed the concept of Deep Learning in 2006. Deep Learning (DL) learns through a nonlinear network structure and scales to very large data sets. A deep network normally has more than four hidden layers, together with one input and one output layer. Such a network can transform raw features, such as image pixels, into superior features, thereby making classification and prediction better (Bengio, 2009) (Najafabadi, et al., 2015).
DL differs from ML in many ways. In terms of accuracy, DL performs much better than normal ML: when data increases, DL learns quickly from the ever-increasing data, thereby increasing accuracy. In contrast, ML algorithms are restricted by the representation of data, which hampers the response time and accuracy of systems using them. Consider email spam filtering: to identify whether an email is spam, an ML algorithm is given various representations of good and bad emails, using which incoming emails are categorised. Without any representations, the ML algorithm cannot decide anything.
Here, DL comes to the rescue. DL easily identifies important features and learns from them: its algorithms can extract features from raw data and create their own representations for learning. DL is a family of ML algorithms that attempt to model high-level abstractions in data through architectures composed of many non-linear transformations.
Figure 1. Predictive Modelling
Deep architectures can be modelled using any combination of network layers, but there is nevertheless a set of traditional architectures, such as Stacked Autoencoders, Deep Boltzmann Machines, Deep Convolutional Networks, and Deep Belief Networks. Figure 3 shows this set of predefined DL models.
In general, deep learning models can be classified into discriminative, generative, and hybrid models (Alwaisi & Baykan, 2017).
Figure 2. A Deep Architecture
Figure 3. Deep Learning Models
Discriminative models are used to model the dependency of an unobserved (target) variable Y on observed variables X, whereas generative models learn the joint probability distribution. The generative model learns the full relationship between input X (features) and label Y, giving maximum flexibility at testing time. Discriminative models learn from X only, predicting Y through the conditional probability; by using fewer modelling assumptions, these models can use existing data more efficiently. CNNs, deep neural networks, and recurrent neural networks are discriminative models, while DBNs, restricted Boltzmann machines, and regularised autoencoders are generative models. Hybrid deep models are a combination of discriminative and generative models.
These DL models are used in various application areas to gain better accuracy or output. Table 1 summarizes the work done in several such areas. In addition, there are many applications in other domains where DL algorithms or deep networks are used very effectively.
From the study of various articles, it is evident that DL is applied widely in many areas, and improvements in hardware have made its application feasible. In many articles, DL algorithms are compared with traditional machine learning algorithms and are observed to be more accurate. Still, there are domains where DL may prove beneficial but remains little explored, one of them being the educational system.
Also, the review of articles suggests that applying a deep learning algorithm together with other generalised algorithms may give better results in classification and prediction tasks. Through the survey, it is observed that hybrid models are more popular than models based on a plain DL algorithm (Vora & Iyer, A Survey of Inferences from Deep Learning Algorithms, 2017). In many applications, standard dimensionality reduction algorithms are used to reduce the features, and DL algorithms are then applied to improve accuracy.
Educational Data Mining
EDM is a popular research area, and an ample amount of research articles are available for study. These articles document the experimentation and algorithms used in EDM for various tasks. For performance prediction in particular, various new techniques and ML algorithms have been tried. Many factors, or features, have a significant effect on predicting students' performance; these are classified as cognitive and non-cognitive factors, and non-cognitive factors play an important role in various EDM goals.
Wattana & Nachirat (Punlumjeak & Rachburee, 2015, October) used techniques such as K-Nearest Neighbours, Naïve Bayes, and neural networks to classify students' data. The features considered were very few, and the majority of attributes related to students' marks.
Norlida, Usamah & Pauziah (Buniyamin, Mat, & Arshad, 2015, November) used neuro-fuzzy algorithms to predict the performance of engineering students; only six linguistic parameters were used for prediction.
Camilo, Elizabeth & Fabio (Guarín, Guzmán, & González, 2015) used a decision tree and a Bayesian classifier to predict students' performance. Students' admission test scores and academic information were used, along with a few socio-economic parameters; the major stress was on the admission parameters.
Phung, Chau & Phung (Phung, Chau, & Phung, 2015, November) used a rule-extraction algorithm for classification in EDM. The algorithm can handle discrete and continuous data, but has a major challenge in creating compact rules: the numerous rules formed made the system difficult to use with more parameters.

Table 1. Deep learning application areas

Malware Detection — DBN (Davidt & Netanyahu, 2015, July)
• The dropout method was used while training the network.
• Various layers were used to detect malware signatures.
• The network was trained using a GPU to detect 30 signatures.

Intrusion Detection — DBN (Gao, Gao, Gao, & Wang, 2014, November)
• DBN proved more accurate than SVM and artificial neural networks (ANN).
• DBNs with 4 different configurations were used; the performance of a shallow DBN is the same as SVM and ANN.
• DBNs with 2 or more hidden layers gave better output. The DBN is used for multiclass classification.

Spam Filtering — DBN (Tzortzis & Likas, 2007, October)
• DBN used for spam filtering.
• Its performance was compared with SVM and found more accurate.

Image Processing — Deep Convolutional DBN (Nguyen, Fookes, & Sridharan, 2015, September)
• Deep convolutional DBN used for classification of images.
• Accuracy improved and training time for the deep network reduced.

Image Processing — DBN & DAE (Vincent, Larochelle, Lajoie, Bengio, & Manzagol, 2010)
• DBN and DAE (denoising autoencoder) used for analysing images.
• Experimental results show that the DAE was helpful for learning higher-level representations.
• DBN and DAE gave better accuracy for image classification when combined with SVM.

Classification — Deep SVM (Kim, Minho, & Shen, 2015, July)
• Experimented with a new model created by combining an autoencoder, deep SVM, and GMM.
• The input was fed to the SVM and then to the GMM, forming one layer.
• Deep layers were thus constructed for feature extraction, and a Naïve Bayes algorithm was then used for classification.

Image Processing — SVR with ReLU (Kuwata & Shibasaki, 2015, July)
• Used SVR (support vector regression) with rectified linear units for estimating crop yields from remotely sensed data.
• Described Illinois crop-yield estimation using deep learning and machine learning algorithms.
• Experimentation was done using the Caffe tool.
• An SVM with a Gaussian radial basis function was used for the same experimentation, showing that traditional SVM overfits the regression model, making accuracy low.

Regression Analysis — Deep SVM (M. A. Wiering, Millea, Meijster, & Schomaker, 2016)
• Used a deep SVM for regression analysis; the deep model was constructed by stacking two layers of SVM.
• Initial layers were used for extracting the important features and the final layer was used for classification.

Finance — Deep SVM with Fuzzy (Deng, Zhiquan Ren, Kong, Bao, & Dai, 2017)
• Used a fuzzy deep neural network for the classification of financial trading data.
• The deep network was given a high-level representation of data, generated by the fuzzy model and the neural network model.

Education — SAE (Guo, Zhang, Guang, Shi, & Yang, 2015, July)
• Used sparse autoencoders for classification and prediction.
• The network was trained using a backpropagation algorithm.
• The experimentation was done on data collected from 9th-grade high-school children, on both GPU and CPU.
• The observed accuracy of the DL algorithm was higher than SVM and Naïve Bayes.

Music Classification — Deep feed-forward network with ReLU (Rajanna, Aryafar, Shokoufandeh, & Ptucha, 2015)
• A rectified linear unit (ReLU) activation was used in a deep neural network with 2 hidden layers.
• The accuracy of the classifier improved significantly.
Wen and Patrick (Shiau & Chau, 2016) and Sadaf & Eydgahi (Ashtari & Eydgahi, 2017) used statistical modelling for EDM. Statistical methods, however, cannot accommodate changes in population and sample size, and lead-time bias was difficult to handle.
Fernando et al. (Koch, Assunção, Cardonha, & Netto, 2016) used the partial least squares method and showed it to be cost-effective, though the method was sensitive to the choice of parameters, and the parameters used were few.
Janice et al. (Gobert, Kim, Pedro, Kennedy, & Betts, 2015) and Anjana, Smija & Kizhekkethottam (Pradeep, Das, & J, 2015) used decision trees in EDM. Limited features were used for prediction, the tree structure was prone to sampling error, and accuracy was affected by imbalanced data.
Evandro B. Costa et al. (Costa, Fonseca, Santana, Araújo, & Rego, 2017) used Naïve Bayes, decision trees, SVM, and neural networks to predict students' performance. The data was collected from distance-learning and on-campus students; weekly performance data over four weeks was collected and analysed for the effectiveness of the algorithms.
Wanli et al. (Xing, Guo, Petakovic, & Goggins, 2015) used genetic programming for predicting students' performance. The genetic algorithm produced an optimised prediction rate, though qualitative aspects received less consideration. They monitored closed classroom learning and identified the factors affecting performance, chiefly students' participation in various activities.
Xin Chen et al. (Chen, Vorvoreanu, & Madhavan, 2013) studied social data to identify factors affecting students' behaviour or performance, such as study-life balance, lack of sleep, lack of social engagement, and lack of diversity.
Michail N. Giannakos et al. (Giannakos, et al., 2017, April) identified various cognitive factors, such as academic performance and attendance, and their effect on students' performance.
Hijazi & Naqvi (Hijazi & Naqvi, 2006) and Shoukat (Shoukat, 2013) studied the impact of various cognitive and non-cognitive factors on students' performance.
Mushtaq & Khan (Mushtaq & Khan, 2012) showed that communication, learning facilities, proper guidance, and family stress have a direct impact on students' performance. Similarly, Omar & Dennis (2015) used many factors for their study and identified which played a vital role in students' performance.
Suryawan & Putra (Suryawan & Putra, 2016) conducted a detailed survey to identify the factors affecting students' GPA, with regression tests and correlation analysis on various factors. It showed that the entrance exam and class attendance were important factors, and that lecturer quality also affects GPA.
In an interesting article in 2016, Pooja Mondal (Mondal) identified intellectual, learning, physical, mental, social, and economic factors as affecting students' behaviour and performance.
Most of the research centres on applying data mining and machine learning techniques to the classification of students' performance. Classification and prediction tasks widely use methods based on learning from examples, such as decision trees, artificial neural networks, and support vector machines. Although hybrid algorithms have gained popularity for solving complex problems, they are not cited as commonly as the other methods in students' performance classification and prediction (Vora & Kamatchi, EDM Survey of Performance Factors and Algorithms Applied, 2018).
Optimization
Optimization is the selection of the best element using some criterion from some set of available al-
ternatives. The goal of optimization is to provide near perfect, effective and all possible alternatives.
Maximizing or minimizing some function related to application or features of the application is the
process of optimization. For example, minimising the cost function or minimising the mean square
error is the goal of optimization in typical ML algorithms. Machine learning often uses optimization
algorithms to enhance their learning performance. Training optimization improves the generalization
ability of any model.
Figure 4 shows the taxonomy of optimization techniques.
It is difficult to train a deep network effectively, and classification accuracy improves as training improves; hence, the training of deep learning algorithms can be improved using optimization techniques. Among the many optimization techniques, recent work favours metaheuristic algorithms for training optimization, since they can be applied in any generalised problem domain. Among the many nature-inspired metaheuristics, such as ant and bee algorithms, particle swarm optimization, genetic algorithms, firefly algorithms, and harmony search, the Cuckoo Search (CS) algorithm (Yang & Deb, 2014) was preferred. The primary advantages of CS are as follows:
• CS has been applied in a wide variety of problems, such as face recognition, engineering optimization, and the medical domain, and has proved beneficial.
Figure 4. Taxonomy of Optimization
• Cuckoo search has better global convergence properties than other popular variants such as genetic algorithms, particle swarm optimization, and ant colony optimization.
• For its random walk, cuckoo search uses Lévy flights, which allow wider exploration of the search space and promote early, reliable global convergence.
• Hybrid models constructed with CS have proved beneficial.
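To make the mechanism concrete, the points above can be sketched in a minimal, illustrative implementation of cuckoo search minimising a toy sphere function. Lévy-distributed steps are drawn via Mantegna's algorithm; the bounds, step scale `alpha`, and abandonment fraction `pa` are illustrative assumptions, not the chapter's actual configuration.

```python
import math
import numpy as np

def levy_step(dim, beta=1.5, rng=None):
    """Draw a Levy-distributed step vector via Mantegna's algorithm."""
    if rng is None:
        rng = np.random.default_rng(0)
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, dim)
    v = rng.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1 / beta)

def cuckoo_search(f, dim=2, n_nests=15, iters=300, pa=0.25, alpha=0.01, seed=0):
    """Minimise f over [-5, 5]^dim with cuckoo search and Levy flights."""
    rng = np.random.default_rng(seed)
    nests = rng.uniform(-5, 5, (n_nests, dim))
    fit = np.array([f(n) for n in nests])
    for _ in range(iters):
        # A cuckoo lays an egg: Levy flight from a randomly chosen nest.
        i = rng.integers(n_nests)
        new = np.clip(nests[i] + alpha * levy_step(dim, rng=rng), -5, 5)
        j = rng.integers(n_nests)            # compare against a random nest
        if f(new) < fit[j]:
            nests[j], fit[j] = new, f(new)
        # A fraction pa of the worst nests is abandoned and rebuilt at random.
        n_drop = int(pa * n_nests)
        worst = np.argsort(fit)[-n_drop:]
        nests[worst] = rng.uniform(-5, 5, (n_drop, dim))
        fit[worst] = [f(n) for n in nests[worst]]
    best = int(np.argmin(fit))
    return nests[best], float(fit[best])

best_x, best_f = cuckoo_search(lambda x: float(np.sum(x ** 2)))
```

The heavy tail of the Lévy distribution occasionally produces very long jumps, which is what gives cuckoo search wider exploration than a Gaussian random walk.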
PERFORMANCE PREDICTION USING HYBRID MODEL
Experimental Setup
The experimental setup is divided into three parts: (i) data collection and preparation, (ii) design and implementation of the model, and (iii) evaluation of the model for the identified problem. Figure 5 shows these steps clearly.
For any experimentation, input data plays a vital role, so one of the important steps of experimental design is the collection of relevant data. Collecting data alone is not sufficient; making it ready as per the requirements of the model is equally important, which makes data preprocessing a quintessential step in the experimental setup. Once data is prepared or preprocessed as required, the existing models and the newly suggested models are implemented, and the comparative results provide new insights into the usefulness and accuracy of the models.
Figure 5. Experimental Setup
Algorithms
Dimensionality reduction using Principal Component Analysis (PCA)
In the proposed prediction model, PCA (Maćkiewicz & Ratajczak, 1993) is used to reduce the vast data. There are many reasons to reduce the dimensionality of input data: the complexity of a model often depends on the number of dimensions in the data as well as the sample size; when dimensions are reduced, the cost of extracting the unneeded dimensions is removed; models are often more robust, and give accurate results, on smaller datasets; and data with few dimensions can be visualised properly to spot outliers.
Consider a p-dimensional random variable U with dispersion matrix Σ, let λ₁ ≥ … ≥ λ_p be its eigenvalues, and let P₁, …, P_p be the corresponding eigenvectors of Σ. Then one can write:

$$\Sigma = \lambda_1 P_1 P_1' + \dots + \lambda_p P_p P_p' \qquad (1)$$

$$P_i' P_i = 1, \quad 1 \le i \le p \qquad (2)$$

$$P_i' P_j = 0, \quad 1 \le i, j \le p,\ i \ne j \qquad (3)$$

The transformed random variables can be represented as:

$$Y_i = P_i' U, \quad i = 1, \dots, p \qquad (4)$$

Here Y is the new random variable vector and P the orthogonal matrix whose rows are the P_i', so Y is obtained from U by the orthogonal transformation Y = PU. The random variable Y_i is called the ith principal component of U.
Only the basic steps of PCA are followed here. These basic steps of PCA are given in Algorithm 1.
Algorithm 1: Steps of PCA
Step 1: Standardize the input data
Step 2: Evaluate the covariance of the data
Step 3: Deduce Eigenvectors and Eigenvalues
Step 4: Re-orient data with respect to Principal Components
Step 5: Plot re-oriented data
Step 6: Bi-plot
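As a hedged illustration, the first four steps of Algorithm 1 (standardise, covariance, eigendecomposition, re-orientation) can be sketched with NumPy; the synthetic 12-feature matrix and the choice of 6 components mirror the example in the text but are otherwise arbitrary.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its leading principal components (Algorithm 1, Steps 1-4)."""
    # Step 1: standardise the input data (zero mean, unit variance per feature)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Step 2: evaluate the covariance of the data
    C = np.cov(Z, rowvar=False)
    # Step 3: deduce eigenvectors and eigenvalues of the dispersion matrix
    vals, vecs = np.linalg.eigh(C)               # returned in ascending order
    order = np.argsort(vals)[::-1]               # sort so lambda_1 >= ... >= lambda_p
    P = vecs[:, order[:n_components]]
    # Step 4: re-orient the data with respect to the principal components, Y = ZP
    return Z @ P, vals[order]

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 12))                   # e.g. 12 features, as in the text
Y, eigvals = pca(X, n_components=6)              # reduced to 6 components
```

Because the projection uses eigenvectors of the covariance matrix, the resulting components are mutually uncorrelated, which is what makes the reduced features well suited as classifier input.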
Support Vector Machine (SVM)
PCA acts as the dimensionality reduction technique: the features are given as input to PCA, and the reduced, extracted features are carried forward for further computation. If there are 12 features, for example, PCA may reduce them to 6, and so on.
The reduced dimensions and class labels are given to the SVM (Yuan, et al., 2017) for prediction of the class. Because the SVM works on reduced but important data, its predictions are more accurate.
The data considered here consists of 35 features and one class label. Providing this many features directly to the DBN makes it computationally intensive, and the results may not be accurate. So an intermediate SVM, with a linear kernel, is used to generate near-accurate class labels, which are fed to the DBN.
The SVM's tuning alone is not accurate enough for the resultant prediction (the class into which the performance falls). Hence, the class labels resulting from the SVM are treated as features for the DBN classifier, which classifies the students' overall performance.
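This PCA-then-SVM stage can be sketched with scikit-learn. The real 35-feature student dataset is not reproduced here, so a synthetic stand-in is generated; the feature count, class count, and 12-component reduction are illustrative assumptions rather than the chapter's actual settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.svm import SVC

# Synthetic stand-in for the 35-feature student dataset described in the text.
X, y = make_classification(n_samples=300, n_features=35, n_informative=10,
                           n_classes=3, n_clusters_per_class=1, random_state=0)

X_red = PCA(n_components=12).fit_transform(X)    # dimensionality reduction
svm = SVC(kernel="linear").fit(X_red, y)         # linear-kernel SVM, as in the text
svm_labels = svm.predict(X_red)

# The SVM's predicted labels, together with the actual labels, form the
# feature input handed on to the DBN classifier.
dbn_input = np.column_stack([svm_labels, y])
```

The design choice here is that the SVM acts as a noisy label generator rather than the final classifier; the DBN then learns to correct its systematic mistakes.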
Deep Belief Network (DBN)
Generally, a DBN includes multiple layers. Every layer has visible neurons, which establish the input layer, and hidden neurons, which form the output layer. Visible and hidden neurons are fully interconnected, but there are no connections among the hidden neurons and none among the visible neurons. The connections between visible and hidden neurons are symmetric and exclusive, and this neuron model defines a definite output for each input.
Since the output of the stochastic neurons in a Boltzmann network is probabilistic, Eq. (5) denotes the sigmoid-shaped probability of activation, where q is the neuron's total input and t_P indicates the pseudo-temperature:

$$P(q) = \frac{1}{1 + e^{-q/t_P}} \qquad (5)$$

Eq. (6) specifies the resulting binary output O:

$$O = \begin{cases} 1 & \text{with probability } P(q) \\ 0 & \text{with probability } 1 - P(q) \end{cases} \qquad (6)$$

The deterministic model of the stochastic approach is obtained in the limit of vanishing pseudo-temperature, Eq. (7):

$$\lim_{t_P \to 0^{+}} P(q) = \begin{cases} 1 & q > 0 \\ 0 & q < 0 \end{cases} \qquad (7)$$
The diagrammatic representation of the DBN model is in Figure 6, in which feature extraction takes place through a set of RBM layers and classification takes place via an MLP.
The arithmetic model expresses the energy of the Boltzmann machine for a configuration of binary neuron states b, defined in Eq. (8), where W_{a,l} indicates the weights among neurons and θ_a indicates the biases:

$$EN(b) = -\sum_{a<l} W_{a,l}\, b_a b_l - \sum_{a} \theta_a b_a \qquad (8)$$
The energy of a joint configuration of visible and hidden neurons (x, y) is defined in Eq. (9), where x_a is the binary state of visible unit a, y_l the binary state of hidden unit l, k_a the bias of visible unit a, and B_l the bias of hidden unit l:

$$EN(x, y) = -\sum_{a,l} W_{a,l}\, x_a y_l - \sum_{a} k_a x_a - \sum_{l} B_l y_l \qquad (9)$$

The corresponding energy gradients give the total input arriving at each unit, Eq. (10) and Eq. (11):

$$-\frac{\partial EN(x, y)}{\partial x_a} = \sum_{l} W_{a,l}\, y_l + k_a \qquad (10)$$

$$-\frac{\partial EN(x, y)}{\partial y_l} = \sum_{a} W_{a,l}\, x_a + B_l \qquad (11)$$
The probability distribution of the input data is encoded into the weights (parameters), which constitutes the RBM's learning pattern. RBM training seeks the weight assignment under which the model reproduces the distribution of the training set N, i.e. the weights that maximise the probability assigned to the training vectors, Eq. (12):

$$W^{*} = \arg\max_{W} \prod_{x \in N} c(x) \qquad (12)$$

where c(x) denotes the marginal probability the RBM assigns to a visible vector x.
Figure 6. Architecture of DBN in the proposed model
For a pair of visible and hidden vectors (x, y), the probability assigned by the RBM is given in Eq. (13), where PR_F specifies the partition function as in Eq. (14):

$$c(x, y) = \frac{e^{-EN(x, y)}}{PR_F} \qquad (13)$$

$$PR_F = \sum_{x, y} e^{-EN(x, y)} \qquad (14)$$
The DBN is trained using the Contrastive Divergence (CD) algorithm (Goodfellow, Bengio, & Courville, 2016). The steps of CD training are as follows:

Step 1: Choose the training samples x and clamp them onto the visible neurons.
Step 2: Evaluate the hidden-neuron probabilities c_y from the product of the weight matrix W and the visible vector.
Step 3: Sample the hidden states y from the probabilities c_y.
Step 4: Evaluate the outer product of the vectors x and c_y; this is the positive gradient ⟨x c_y⟩.
Step 5: Reconstruct the visible states x′ from the hidden states y; then evaluate the hidden states y′ from the reconstruction x′.
Step 6: Evaluate the outer product of x′ and y′; this is the negative gradient ⟨x′ y′⟩.
Step 7: Compute the weight update as defined in Eq. (15), where η indicates the learning rate:

$$\Delta W = \eta \left( \langle x\, c_y \rangle - \langle x'\, y' \rangle \right) \qquad (15)$$

Step 8: Update the weights with the new values.
The following steps define the progression of DBN training, combining RBM pre-training with normal MLP training:
Step 1: Initialize the DBN model with randomly selected weights, biases and other associated parameters.
Step 2: The first RBM is initialized with the input data, which serves as the potentials of its visible neurons, and performs unsupervised learning.
Step 3: The input to the subsequent layer is obtained by sampling the potentials processed in the hidden neurons of the preceding layer; this layer again follows unsupervised learning.
Step 4: The above steps are repeated for the corresponding number of layers, so the RBM pre-training stage proceeds until it reaches the MLP layer.
Step 5: The MLP phase performs supervised learning and continues until the target error rate is attained.
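The pre-training progression above can be sketched in NumPy: each RBM is trained with one epoch of CD-1, and its hidden potentials become the input of the next layer before the supervised MLP phase would begin. The layer sizes, learning rate and random data here are illustrative assumptions, not the chapter's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
sigm = lambda z: 1.0 / (1.0 + np.exp(-z))

def cd1_epoch(X, W, u, c, lr=0.1):
    """One epoch of CD-1 over the rows of X for a single RBM
    (W: weights, u: visible offsets, c: hidden offsets)."""
    for x1 in X:
        q1 = sigm(c + W.T @ x1)                  # hidden probabilities given x1
        s1 = (rng.random(q1.shape) < q1) * 1.0   # sample hidden states
        x2 = sigm(u + W @ s1)                    # reconstruction of the visible units
        q2 = sigm(c + W.T @ x2)                  # hidden probabilities given x2
        W += lr * (np.outer(x1, q1) - np.outer(x2, q2))  # positive - negative gradient
        u += lr * (x1 - x2)
        c += lr * (q1 - q2)
    return W, u, c

# Greedy layer-wise pre-training (Steps 2-4): train each RBM on the hidden
# potentials of the previous one.
X = (rng.random((20, 6)) < 0.5) * 1.0    # 20 samples, 6 binary inputs
sizes = [6, 4, 3]                        # illustrative layer sizes
for n_vis, n_hid in zip(sizes, sizes[1:]):
    W = 0.01 * rng.standard_normal((n_vis, n_hid))
    u, c = np.zeros(n_vis), np.zeros(n_hid)
    W, u, c = cd1_epoch(X, W, u, c)
    X = sigm(c + X @ W)                  # potentials fed to the next layer
```

After the loop, X holds the top-layer potentials that would be passed to the supervised MLP stage (Step 5).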
Finally, the classifier predicts the students’ performance with increased accuracy rate. The predictions
are evaluated on the basis of various evaluation measures identified.
Dataset
To predict the performance of students, students of private engineering colleges were chosen as the population, specifically the engineering colleges affiliated to Mumbai University. There are more than 50 engineering colleges under Mumbai University, spread across the Mumbai, Navi Mumbai and Thane regions. At the first stage, it was decided to collect samples from Mumbai; at the second stage, to concentrate on the geographical centre of the city. The samples were therefore collected from engineering colleges centrally located in Mumbai City.
The data collected is primary data, gathered through a questionnaire; questionnaires are the most popular collection method for large enquiries.
Various parameters were identified which have a direct or indirect effect on the performance of students. Careful selection of questions was important, keeping in mind that the questionnaire should not burden respondents. Cognitive and non-cognitive parameters were identified, and their effect on performance prediction was understood by studying various articles. The parameters identified are shown in Figure 7 and Figure 8.
Cognitive factors are characteristics of a person that affect performance and learning directly; these factors are measurable. Non-cognitive factors are parameters that are not directly linked to performance and learning but may still affect them, and they are not directly measurable. Studies have shown that non-cognitive parameters have an equivalent effect on performance and learning. Keeping in mind the scenario of engineering students and colleges, a few non-cognitive factors which may have an indirect effect on the performance of students were selected.
Based on the parameters identified, the class label is derived from the CGPA (Cumulative Grade Point Average) score of the 5th semester. The parameter ‘class’ indicates the CGPA score of the student in Semester 5, calculated on a scale of 1 to 10. This parameter indicates the performance of the student in the coming semester; the implemented system predicts the performance of the student as a CGPA score range.
Figure 7. Cognitive Parameters
There are 6% and 8% of the total samples in classes 1 and 4 respectively, and 36% and 50% of the total samples in classes 2 and 3 respectively.
Evaluation Measures
To evaluate the effectiveness of the machine learning algorithms, basic measures like Accuracy, Precision, Recall and F1-Measure (Han & Kamber, 2012) were adopted. Squared-error-based cost functions are inconsistent for solving classification problems. Also, these measures are widely used in information retrieval, machine learning and other domains that involve classification (Olson & Delen, 2008). A confusion matrix is the basis for determining these measures.
Figure 8. Non-cognitive Parameters
Table 2. Output class label

Class (CGPA Score)     Class Label
Less than 5            1
Between 5 and 7        2
Between 7 and 9        3
More than 9            4
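Table 2's mapping can be stated directly in code. The table does not say which class receives a CGPA falling exactly on a boundary (5, 7 or 9), so the half-open intervals below are an assumption:

```python
def cgpa_class(cgpa):
    """Map a Semester-5 CGPA (scale 1-10) to the output class label of Table 2.
    Boundary handling (exact 5, 7, 9) is assumed, not specified in the table."""
    if cgpa < 5:
        return 1
    elif cgpa < 7:
        return 2
    elif cgpa < 9:
        return 3
    else:
        return 4
```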
Confusion Matrix: The confusion matrix can be represented as follows:
Where –
True Positive (TP) = Number of positive instances correctly classified as positive.
False Positive (FP) = Number of negative instances incorrectly classified as positive.
True Negative (TN) = Number of negative instances correctly classified as negative.
False Negative (FN) = Number of positive instances incorrectly classified as negative.
Accuracy: Accuracy indicates the closeness of a predicted or classified value to its real value. The
state of being correct is called Accuracy. It can be calculated as:
Accuracy= (TP+TN)/(TP+TN+FP+FN)
Precision: Precision can be defined as the number of relevant items selected out of the total num-
ber of items selected. It represents the probability that an item is relevant. It can be calculated as:
Precision = TP/(FP+TP)
Precision is the measure of exactness.
Recall: The Recall can be defined as the ratio of relevant items selected to relevant items avail-
able. The recall represents a probability that a relevant item is selected. It can be calculated as:
Recall = TP/(FN+TP)
The recall is the measure of completeness.
F1-Measure: F1-Measure is the harmonic mean between Precision and Recall as described
below:
F1-Measure= 2 * (Precision * Recall) / (Precision +Recall)
It creates a balance between precision and recall. Accuracy may be affected by class imbalance, whereas the F1-measure is far less sensitive to it; hence the F1-measure is used alongside accuracy for evaluating the classification algorithms.
                         Predicted/Classified
                         Negative                 Positive
Actual     Negative      True Negative (TN)       False Positive (FP)
           Positive      False Negative (FN)      True Positive (TP)
Sensitivity: Sensitivity is used to find out the proportion of positive samples that are correctly
identified also called a true positive rate. It is calculated as:
Sensitivity=TP/P
Where,
P = Total Number of Positive Samples
N = Total number of Negative Samples
Specificity: Specificity is used to find out the proportion of negative samples that are correctly
identified and also called a true negative rate. It is calculated as:
Specificity=TN/N
False Positive Rate (FPR): FPR is used to find out the proportion of negative samples that are
misclassified as positive samples. It is calculated as:
FPR=FP/N
False Negative Rate (FNR): FNR is used to find out the proportion of positive samples which are
misclassified as negative samples. It is calculated as:
FNR=FN/P
Negative Predictive Value (NPV): NPV is the proportion of samples predicted negative that are truly negative. It is calculated as:
NPV=TN/(TN+FN)
False Discovery Rate (FDR): FDR is also called an error rate. It is used to find out a proportion
of false positive among all the samples that are classified as positive. It is calculated as:
FDR=FP/(FP+TP)
Matthews’s correlation coefficient (MCC): It is calculated as:
MCC=((TP*TN)-(FP*FN)) / SQRT((TP+FP)(TP+FN)(TN+FP)(TN+FN))
MCC is a balanced measure based on a confusion matrix. This measure is used even if the classes
are of different sizes. It is a correlation coefficient between the actual classes and predicted classes. The
value of MCC lies between -1 to 1. The value near to +1 indicates the prediction is perfect. The value 0
indicates random prediction. The value -1 indicates a total disagreement between the actual and predicted
values. MCC score above zero indicates balanced classification. MCC is a good measure when the data
have varying classes, unbalanced dataset and random data (Jurman, Riccadonna, & Furlanello, 2012).
With F1-score the MCC guides in a better way to determine the suitable algorithm for classification.
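All of the measures above derive from the four confusion-matrix counts, so they can be computed together. A small sketch with illustrative counts (for the multi-class setting of this chapter they would be computed per class and averaged):

```python
import math

def evaluate(tp, fp, tn, fn):
    """Compute the evaluation measures described above from confusion-matrix counts."""
    p, n = tp + fn, tn + fp              # total positive / negative samples
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy":    (tp + tn) / (p + n),
        "precision":   precision,
        "recall":      recall,           # equals sensitivity = TP/P
        "f1":          2 * precision * recall / (precision + recall),
        "specificity": tn / n,
        "fpr":         fp / n,
        "fnr":         fn / p,
        "npv":         tn / (tn + fn),
        "fdr":         fp / (fp + tp),
        "mcc":         (tp * tn - fp * fn)
                       / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }

m = evaluate(tp=40, fp=10, tn=35, fn=15)   # illustrative counts, not the chapter's data
```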
Results
The results for various evaluation measures for the various training percentages are indicated in Figure
9, 10 and 11
The specificity of the hybrid model remains almost constant at 0.85 for all training percentages. The sensitivity score is 0.80. The precision is also almost constant at 0.78, and increases for 60% training. The FPR changes slightly from 0.18 to 0.20, while the FNR stays quite low, from 0.03 to 0.06, for all training percentages. The NPV score is good, with an average value of 0.80. The FDR score is constant at 0.21 across training percentages.
The accuracy graph shows a variation from 69% to 75% for different training percentages. The ac-
curacy is good with 50% training. The accuracy has improved and is better than pure SVM and DBN
for the considered data. The F1-score graph shows no variation in F1-score and it has a score of 0.81.
The MCC score also shows a variation with the values ranging from 0.13 to 0.18. The value of MCC
score is far better than the MCC scores of SVM and DBN. The good and positive MCC score suggests
that the proposed model is better suited for the data under consideration. There is a considerable improve-
ment in MCC score indicating the suitability of the model for the educational data.
Figure 9. Results for Hybrid Model: Sensitivity, Specificity, Accuracy and F1_Score
Figure 10. Results for Hybrid Model: FDR, FNR, NPV and FPR
Figure 11. Results for Hybrid Model: Precision and MCC
It is important to understand if the hybrid model is better than other models. It is necessary to look
at evaluation measures to find the performance and suitability of the hybrid classification method for
the collected educational data.
Table 3 shows the overall performance of the hybrid classification model over other models.
From this, it is observed that the hybrid prediction model is superior to the other methods with respect to all measures. In particular, the specificity of the proposed SVM with Deep Learning model is better than that of DBN and SVM.
The accuracy of the hybrid model is 5.06% and 6.69% higher than that of DBN and SVM respectively. The hybrid model also attained greater precision than the other methods. Similarly, the FPR of the hybrid model is 2.26% and 76.62% lower than that of DBN and SVM respectively.
The F1-Score of the hybrid method is 49.27% and 59.64% better than DBN and SVM. From this analysis, it is evident that the hybrid prediction model is highly efficient compared to the other conventional methods.
The Graph in Figure 12 shows the overall performance of the proposed model. The hybrid model
has better accuracy, F1 score and MCC indicating that the proposed hybrid model created using SVM
and DBN is able to classify the educational data in a better way.
Discussion
Table 4(a), (b), (c) and (d) shows the performance score of the hybrid model for evaluation parameters
Accuracy, F1 Measure, FPR and MCC for different classes. The training percentage is 60%.
Accuracy for the various classes is improved drastically for the hybrid model, mainly for the classes with fewer data samples. The F1 and MCC scores of the hybrid model are also better. The low FPR indicates that the hybrid model's class predictions improve even though the data is imbalanced.
Table 3. Overall performance of the hybrid classification model over other methods

Measure        SVM    DBN    SVM with DBN
Specificity    0.48   0.84   0.87
Sensitivity    0.18   0.78   0.82
Accuracy       0.65   0.70   0.76
Precision      0.36   0.38   0.79
FPR            0.83   0.23   0.19
FNR            0.53   0.17   0.04
NPV            0.18   0.78   0.82
FDR            0.65   0.63   0.22
F1-Score       0.41   0.54   0.82
MCC            0.04   0.09   0.27
The accuracy has increased to 75%. Still, there is scope for improvement: the model can be further optimised to gain better accuracy, in particular by improving its training. The scores of the evaluation measures for the various training percentages indicate that the model can be improved with improved training experience.
Figure 12. Performance of hybrid model over other algorithms
Table 4. Scores of Accuracy, FPR, F1-Score and MCC for different classes for the hybrid model

(a) Accuracy                          (b) FPR
Class  SVM   DBN   SVM-DBN            Class  SVM   DBN   SVM-DBN
1      0.47  0.77  0.77               1      0.75  0.09  0.01
2      0.36  0.76  0.80               2      0.84  0.09  0.01
3      0.37  0.79  0.79               3      0.88  0.09  0.03
4      0.15  0.37  0.43               4      0.09  0.01  0.01

(c) F1-Score                          (d) MCC
Class  SVM   DBN   SVM-DBN            Class  SVM   DBN   SVM-DBN
1      0.50  0.58  0.50               1      0.05  0.17  0.19
2      0.49  0.50  0.50               2      0.05  0.18  0.18
3      0.50  0.53  0.50               3      0.06  0.17  0.21
4      0.03  0.50  0.00               4      0.13  0.17  0.29
SOLUTIONS AND RECOMMENDATIONS
The hybrid DL model gave better performance than the advanced ML models. In such cases, it is always interesting to investigate combinatory models and examine their classification behaviour. The performance of combinatory models may be improved by optimizing the learning of the model.
After evaluating the performance of the hybrid model including Deep Learning, an optimized hybrid model is implemented using the Cuckoo Search optimization method. The optimization is achieved in a better way when Cuckoo Search with Levy Flight is used.
Algorithms
The Optimized model - LCDBN
The input educational data is first given to PCA for dimensionality reduction. The important features extracted are fed to the SVM for a generalized prediction. These predictions, which are not very accurate on their own, are then fed to the DBN for final classification and prediction. Here the training of the DBN model is changed: the RBM units in the DBN model are trained with Cuckoo Search with Levy Flight. Since the training of the DBN is improved using CS with Levy Flight, the model is named LCDBN.
Algorithm 2 depicts the training of the proposed LCDBN model. Here the RBM is trained using the Levy flights of Cuckoo Search: instead of the simple random walk, Levy flights are used to further improve the Cuckoo Search algorithm. In the traditional CS algorithm, the random walk was performed using Gaussian processes.
A Levy flight can be thought of as a random walk where the step size follows a heavy-tailed Levy probability distribution. The weight of the DBN model is updated as shown in Eq. (16): the Levy-search term of the CSA, α · W_old(t), is added to the RBM weight-matrix update. In Eq. (16), W_old refers to Y_i^t of the CSA, Er denotes the error and α indicates the scaling factor.

W = W_{old} + Er\left(s_1 x_1' - Q(s_2 = 1 \mid X_2)\, x_2'\right) + \alpha \cdot W_{old}(t) (16)
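The Levy-flight move can be sketched generically. The step generator below uses Mantegna's algorithm, the standard construction for Levy steps in Cuckoo Search, and the cuckoo loop minimises a toy quadratic; the parameters (alpha, pa, nest count) and the objective are illustrative assumptions, not the chapter's DBN-training setup:

```python
import math
import numpy as np

rng = np.random.default_rng(42)

def levy_step(dim, lam=1.5):
    """Levy-distributed step via Mantegna's algorithm; lam is the Levy
    exponent (the CS parameter lambda tuned in the chapter)."""
    sigma_u = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
               / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = rng.normal(0.0, sigma_u, dim)
    v = rng.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1 / lam)

def cuckoo_search(f, dim=3, n_nests=10, alpha=0.2, pa=0.25, iters=200):
    """Minimise f: Levy-flight moves replace the Gaussian random walk, and a
    fraction pa of the worst nests is abandoned each iteration."""
    nests = rng.uniform(-2.0, 2.0, (n_nests, dim))
    fitness = np.apply_along_axis(f, 1, nests)
    for _ in range(iters):
        i = rng.integers(n_nests)
        cand = nests[i] + alpha * levy_step(dim)   # new solution: W_old + alpha * Levy
        if f(cand) < fitness[i]:
            nests[i], fitness[i] = cand, f(cand)
        worst = np.argsort(fitness)[-int(pa * n_nests):]
        nests[worst] = rng.uniform(-2.0, 2.0, (len(worst), dim))
        fitness[worst] = np.apply_along_axis(f, 1, nests[worst])
    best = nests[np.argmin(fitness)]
    return best, float(fitness.min())

best, val = cuckoo_search(lambda w: float(np.sum((w - 1.0) ** 2)))
```

In the SVM-LCDBN model this Levy move perturbs the RBM weight matrices during training, as in Eq. (16), rather than a toy objective.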
System Model
Figure 13 shows the optimized hybrid model.
The input parameters are split and fed to PCA blocks to reduce the parameters. The output from each PCA is given to an individual SVM for prediction of the class label. The class labels predicted by the SVMs act as the input to the DBN. Using this input and the actual class labels, the DBN predicts the classes for the data.
The DBN is constructed with 2 layers of RBM; each RBM comprises a visible layer and a hidden layer. The RBM layers are constructed with 3 neurons each, and the activation function used is the sigmoid function. The number of input neurons is 3. After the RBM layers, an MLP layer is added for prediction of the class. The MLP layer has 3 neurons and logistic regression is used as the activation function. The output layer has one neuron to predict the class label. The learning of the DBN is optimized using Cuckoo Search with Levy Flight.
The model follows a parallel framework. If the number of features increases in future, more PCA and SVM components can be introduced. The vertical fragmentation suggests the model can easily be adapted to the MapReduce framework (Maitrey & Jha, 2015) for Big Data processing. Horizontal fragmentation is also possible, which further suggests suitability for Big Data applications; here, horizontal fragmentation may issue multiple calls to a single PCA block.
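The parallel framework can be sketched with scikit-learn. Since scikit-learn provides no DBN, a small MLPClassifier stands in for the DBN stage here; the random data, the two-block feature split and the component sizes are illustrative assumptions, not the chapter's configuration:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 12))     # 12 input parameters (illustrative)
y = rng.integers(1, 5, 200)            # class labels 1..4 as in Table 2

# Vertical fragmentation: one PCA + SVM per feature block
svm_labels = []
for Xb in (X[:, :6], X[:, 6:]):
    Zb = PCA(n_components=3).fit_transform(Xb)    # dimensionality reduction
    svm = SVC().fit(Zb, y)
    svm_labels.append(svm.predict(Zb))            # generalized class-label predictions

# The SVM class labels become the features of the final (DBN-stage) classifier
F = np.column_stack(svm_labels)
final = MLPClassifier(hidden_layer_sizes=(3, 3), max_iter=500).fit(F, y)
pred = final.predict(F)
```

The design point is the same as Figure 13: each feature block is compressed and classified independently, and only the resulting class labels are fused by the final deep stage.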
Algorithm 2. Modified RBM learning model

RBMupdate(X1, ε, W, u, c)
This is the RBM update procedure for binomial units; it can be adapted to other kinds of units.
X1 denotes a sample from the training distribution for the RBM
ε denotes the learning rate for stochastic gradient descent in contrastive divergence
W denotes the RBM weight matrix
u denotes the RBM offset vector for input units
c denotes the RBM offset vector for hidden units
Q(s2 = 1|X2) denotes the vector with components Q(s2i = 1|X2)

for the entire hidden units i do
    Evaluate Q(s1i = 1|X1) (for binomial units, sigm(ci + Σj Wij x1j))
    Sample s1i ∈ {0,1} from Q(s1i|X1)
end for
for the entire input units j do
    Evaluate P(x2j = 1|s1) (for binomial units, sigm(uj + Σi Wij s1i))
    Sample x2j ∈ {0,1} from P(x2j = 1|s1)
end for
for the entire hidden units i do
    Evaluate Q(s2i = 1|X2) (for binomial units, sigm(ci + Σj Wij x2j))
end for
Update the weight using Eq. (4)
u ← u + ε(x1 − x2)
c ← c + ε(s1 − Q(s2 = 1|X2))
Results
The overall performance analysis regarding the students' performance prediction using the proposed SVM-LCDBN model is given in Figure 14. From the analysis, better accuracy, specificity, sensitivity, precision, FPR, FNR, FDR, NPV, F1-score and MCC were determined for the adopted scheme, revealing its superiority when compared with other schemes.
From the simulation, the accuracy of the presented approach is 49.32% better than NN, 17.63% better
than SVM, 11.32% better than DBN, 3.5% better than SVM-DBN and 2.33% better than SVM-CDBN
models.
Effect of λ on results
Figure 15 shows the performance of the proposed SVM-LCDBN model for various values of the CS parameter λ. The graph is plotted for a training percentage of 70. One can clearly see that the performance of the proposed model is best for λ=1: the model was tested for various values, and it gives the best scores on the various evaluation parameters for the value 1.
It is also seen that the scores of the various evaluation parameters are good when the training percentage is 70.
Effect of α on Prediction
The scaling parameter is represented by α. The performance of the classifier depends on this parameter, which is why various values of α are tested in the performance evaluation.
Figure 13. System Model
Table 5 describes the effect of the scaling factor on the various performance measures for predicting the students' performance, for a training percentage of 70. Accordingly, the value of α is varied over α=0.2, α=0.4, α=0.6 and α=0.8, and the measures are evaluated.
Figure 14. Performance of Optimized Model over other Algorithms
Figure 15. Effect of λ on Performance of Optimized Model
Discussion
In the DBN, the LC (Levy-flight Cuckoo search) model was adopted for weight computation. Consequently, the prediction model SVM-LCDBN was proposed, which makes a deep connection with the hybrid classifier to obtain a more precise output. The proposed SVM-LCDBN model was then compared with traditional schemes, and the following results were attained.
From the simulation, the accuracy of the presented approach was 17.63% better than SVM, 11.32%
better than DBN, 3.5% better than SVM-DBN and 2.33% better than SVM-CDBN models. Thus better
enhancements were obtained by the proposed SVM-LCDBN model for predicting the student’s performance.
Table 6 shows the scores of the evaluation measures Accuracy, F1-Score, MCC and FPR for the optimised model, calculated for λ=1 and α=0.2. The accuracy percentage for a class (for example, CGPA < 5) indicates the percentage of the total samples in that class that the model classifies correctly.
Table 5. Effect of α on prediction (SVM-LCDBN)

Parameter      α=0.2   α=0.4   α=0.6   α=0.8
Specificity    0.88    0.86    0.80    0.80
Sensitivity    0.86    0.86    0.83    0.88
Accuracy       0.79    0.75    0.75    0.72
Precision      0.80    0.73    0.86    0.77
FPR            0.19    0.15    0.10    0.13
FNR            0.03    0.15    0.20    0.30
NPV            0.82    0.86    0.83    0.88
FDR            0.15    0.28    0.15    0.24
F1-Score       0.83    0.82    0.85    0.85
MCC            0.30    0.24    0.23    0.19
Table 6. Evaluation measures for the model SVM-LCDBN

Class                          Accuracy   F1-Score   MCC    FPR
CGPA Score < 5                 76         0.79       0.10   0.11
CGPA Score between 5 and 7     77         0.81       0.23   0.11
CGPA Score between 7 and 9     80         0.85       0.24   0.12
CGPA Score more than 9         78         0.74       0.13   0.11
Overall Score                  79         0.83       0.28   0.13
The results show that the scores are improved for all classes. The accuracy and MCC scores indicate that the optimised model is better; the model can be used to predict the class label effectively. The motivation behind the research work is to predict the performance of students at an early stage. The implemented model is able to predict students' performance by identifying the appropriate class label with good accuracy. The improved MCC score further suggests that the model is suitable for the educational data.
FUTURE RESEARCH DIRECTIONS
The chapter represents one of the ways to analyse the Educational Data using an optimised hybrid model.
There are many other ways to work in the area of EDM and DL together. Some improvement in the DL
model is also beneficial to improve the accuracy of prediction in Educational Domain.
The model considered here is a hybrid model for classification using SVM and DBN, applied to performance prediction. There are many other tasks in EDM, such as course recommendation, where such hybrid models may be effective. SVM (a discriminative model) and DBN (a generative model) are each applicable in many domains.
The performance of the DBN can be further improved if training is improved. There are many opti-
mization techniques which can be combined to improve the training of the DBN. It is interesting to find
out how other optimization techniques will be beneficial to improve the accuracy of the model.
CONCLUSION
A Deep Learning model for the Performance Prediction of Students in Educational Information System
is implemented. The work started with the motivation to implement a hybrid model with better accuracy
in Educational Domain and extended to optimised hybrid model.
ML and DL are the fields of Artificial Intelligence where algorithms learn by themselves. These
algorithms can be applied in many emerging areas where they may be effective. These algorithms are
found useful in many areas with an increase in data size. The main aim of using DL model is to increase
accuracy.
Before devising the hybrid model, ML algorithms are applied to the data collected from the educa-
tional domain. The ML algorithms like SVM and pure DL algorithm - DBN are applied on the collected
data. Additional evaluation measures are used to test the algorithms. Balanced evaluation measure like
MCC particular for the ML domain is used with traditional F1-score. The results show that pure DL
and advanced ML algorithms give similar accuracy. Hence a new hybrid model for performance prediction of students, aimed at better accuracy, is implemented.
Optimization is the technique of improving the accuracy of the model by tuning the learning weights. There are many popular metaheuristic techniques, but Cuckoo Search optimization is chosen because of its many advantages.
A new students’ performance prediction hybrid model is proposed and is improved by using the
Cuckoo Search with Levy flight optimization technique. The proposed model uses a new hybridized
classifier to predict the performance, which hybridizes the SVM and DBN classifiers. The data is trained by the SVM; as its tuning alone yields inaccurate predictions, the resultant class labels from the SVM are considered as the features to the DBN, which performs the classification. The performance of the
proposed prediction model is further improved by optimizing the training of the DBN. From the results,
it was evident that the results of the optimized hybrid prediction model are better than traditional ML,
advanced ML and pure DL algorithms.
REFERENCES
Alpaydin, E. (2004). Introduction to Machine Learning. Cambridge: MIT Press.
Alwaisi, S., & Baykan, O. K. (2017). Training Of Artificial Neural Network Using Metaheuristic Algo-
rithm. International Journal of Intelligent Systems and Applications in Engineering.
Ashtari, S., & Eydgahi, A. (2017). Student perceptions of cloud applications effectiveness in higher
education. Journal of Computational Science, 23, 173–180. doi:10.1016/j.jocs.2016.12.007
Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends in Machine Learning,
2(1), 1–127. doi:10.1561/2200000006
Buniyamin, N., Mat, U. b., & Arshad, P. M. (2015, November). Educational Data Mining for Predic-
tion and Classification of Engineering Students Achievement. In IEEE 7th International Conference on
Engineering Education (ICEED), (pp. 49-53). Kanazawa: IEEE.
Chen, X., Vorvoreanu, M., & Madhavan, K. (2013). Mining Social Media Data for Understanding Stu-
dents’ Learning Experiences. IEEE Transactions on Learning Technologies.
Chui, K. T., Fung, D. C., Lytras, M. D., & Lam, T. M. (2017). Predicting at-risk university students in a
virtual learning environment via a machine learning algorithm. Computers in Human Behavior.
Costa, E., Fonseca, B., Santana, M. A., Araújo, F., & Rego, J. (2017). Evaluating the effectiveness of
educational data mining techniques for early prediction of students’ academic failure in introductory
programming courses. Computers in Human Behavior, 73, 247–256. doi:10.1016/j.chb.2017.01.047
David, O. E., & Netanyahu, N. S. (2015, July). DeepSign: Deep Learning for Automatic Malware Detec-
tion. International Joint Conference on Neural Networks (IJCNN), 1-8. 10.1109/IJCNN.2015.7280815
Davis, D. (1998). The virtual university: A learning university. Journal of Workplace Learning, 10(4),
175–213. doi:10.1108/13665629810213935
Deng, L., & Yu, D. (2014). Deep Learning for Signal and Information Processing. Redmond, WA:
Microsoft Research.
Deng, Y., Ren, Z., Kong, Y., Bao, F., & Dai, Q. (2017). A Hierarchical Fused Fuzzy Deep Neural
Network for data classification. IEEE Transactions on Fuzzy Systems, 25(4), 1006–1012. doi:10.1109/
TFUZZ.2016.2574915
Gao, N., Gao, L., Gao, Q., & Wang, H. (2014, November). An Intrusion Detection Model Based on
Deep Belief Networks. Second International Conference on Advanced Cloud and Big Data, 247-252.
Giannakos, M. N., Aalberg, T., Divitini, M., Jaccheri, L., Mikalef, P., Pappas, I. O., & Sindre, G. (2017,
April). Identifying Dropout Factors in Information Technology Education: A Case Study. IEEE Global
Engineering Education Conference (EDUCON), 1187-1194. 10.1109/EDUCON.2017.7942999
Gobert, J. D., Kim, Y. J., Pedro, M. A., Kennedy, M., & Betts, C. G. (2015). Using educational data
mining to assess students’ skills at designing and conducting experiments within a complex systems
microworld. Thinking Skills and Creativity, 18, 81–90. doi:10.1016/j.tsc.2015.04.008
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
Guarín, C., Guzmán, E., & González, F. (2015). A Model to Predict Low Academic Performance at a
Specific Enrollment Using Data Mining. IEEE Journal of Latin-American Learning Technologies, 10(3).
Guo, B., Zhang, R., Guang, X., Shi, C., & Yang, L. (2015, July). Predicting Students performance in
educational data mining. International Symposium on Educational Technology (ISET), 125-128. 10.1109/
ISET.2015.33
Han, J., & Kamber, M. (2012). Data Mining: Concepts and Techniques (3rd ed.). Morgan Kaufmann.
Hijazi, S. T., & Naqvi, S. R. (2006). Factors Affecting Students’ Performance. Bangladesh e-Journal
of Sociology, 3.
Jurman, G., Riccadonna, S., & Furlanello, C. (2012). A comparison of MCC and CEN error measures
in multi-class prediction. PLoS One, 7(8), e41882. doi:10.1371/journal.pone.0041882 PMID:22905111
Kim, S. M. L., & Shen, J. (2015, July). A novel deep learning by combining discriminative model
with generative model. International Joint Conference on Neural Networks (IJCNN), 1-6. 10.1109/
IJCNN.2015.7280589
Koch, F., Assunção, M. D., Cardonha, C., & Netto, M. A. (2016). Optimising resource costs of cloud com-
puting for education. Future Generation Computer Systems, 55, 473–479. doi:10.1016/j.future.2015.03.013
Kuwata, K., & Shibasaki, R. (2015, July). Estimating crop yields with deep learning and remotely
sensed data. International Geoscience and Remote Sensing Symposium (IGARSS), 858-861. 10.1109/
IGARSS.2015.7325900
Maćkiewicz, A., & Ratajczak, W. (1993). Principal Components Analysis (PCA). Computers & Geosci-
ences, 19(3), 303–342. doi:10.1016/0098-3004(93)90090-R
Maitrey, S., & Jha, C. K. (2015). MapReduce: Simplified Data Analysis of Big Data. Procedia Computer
Science, 57, 563–571. doi:10.1016/j.procs.2015.07.392
Mondal, P. (n.d.). 7 Important Factors that May Affect the Learning Process. Retrieved from http://
www.yourarticlelibrary.com/learning/7-important-factors-that-may-affect-the-learning-process/6064/
Mushtaq, I., & Khan, S. N. (2012). Factors Affecting Students’ Academic Performance. Global Journal
of Management and Business Research, 12(9).
Najafabadi, M. M., Villansutre, F., Khoshgoftaar, T. M., Seliya, N., Wald, R., & Muharemagic, E. (2015).
Deep learning applications and challenges in big data analytics. Springer Open Journal of Big Data.
doi:10.1186/s40537-014-0007-7
Nguyen, K., Fookes, C., & Sridharan, S. (2015, September). Improving Deep Convolutional Neural
Networks with Unsupervised Feature Learning. IEEE International Conference on Image Processing
(ICIP), 2270-2271. 10.1109/ICIP.2015.7351206
Olson, D. L., & Delen, D. (2008). Advanced data mining techniques (1st ed.). Springer Publishing
Company.
Phung, L. T., Chau, V. T., & Phung, N. H. (2015, November). Extracting Rule RF in Educational Data
Classification from a Random Forest to Interpretable Refined Rules. International Conference on Ad-
vanced Computing and Applications (ACOMP), 20-27. 10.1109/ACOMP.2015.13
Pradeep, A., & Das, S., & J, J. (2015). Students Dropout Factor Prediction Using EDM Techniques.
International Conference on Soft-Computing and Network Security. 10.1109/ICSNS.2015.7292372
Punlumjeak, W., & Rachburee, N. (2015, October). A Comparative Study of Feature Selection Tech-
niques for Classify Student Performance. 7th International Conference on Information Technology and
Electrical Engineering (ICITEE), 425-429. 10.1109/ICITEED.2015.7408984
Rajanna, A. R., Aryafar, K., Shokoufandeh, A., & Ptucha, R. (2015). Deep Neural Networks: A Case
Study for Music Genre Classification. 14th International Conference on Machine Learning and Applications (ICMLA), 655-660. 10.1109/ICMLA.2015.160
Shiau, W.-L., & Chau, P. Y. (2016). Understanding behavioral intention to use a cloud computing
classroom: A multiple model comparison approach. Information & Management, 53(3), 355–365.
doi:10.1016/j.im.2015.10.004
Shoukat, A. (2013). Factors Contributing to the Students’ Academic Performance: A Case Study of
Islamia University Sub-Campus. American Journal of Educational Research, 283–289.
Sukhija, K., Jindal, D. M., & Aggarwal, D. N. (2015, October). The Recent State of Educational Data
Mining: A Survey and Future Visions. IEEE 3rd International Conference on MOOCs, Innovation and
Technology in Education, 354-359. doi: 10.1109/MITE.2015.7375344
Suryawan, A., & Putra, E. (2016). Analysis of Determining Factors for Successful Student’s GPA
Achievement. 11th International Conference on Knowledge, Information and Creativity Support Systems
(KICSS), 1-7.
Tzortzis, G., & Likas, A. (2007, October). Deep Belief Networks for Spam Filtering. IEEE International
Conference on Tools with Artificial Intelligence (ICTAI 2007), 306-309.
Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., & Manzagol, P.-A. (2010). Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion. Journal of Machine Learning Research, 11, 3371–3408.
Vora, D., & Iyer, K. (2017). A Survey of Inferences from Deep Learning Algorithms. Journal of Engineering and Applied Sciences, 12(SI), 9467-9472.
Deep Learning in Engineering Education
Vora, D., & Kamatchi, R. (2018). EDM Survey of Performance Factors and Algorithms Applied.
International Journal of Engineering & Technology, 7(2.6), 93-97.
Wiering, M., Schutten, M., Millea, A., Meijster, A., & Schomaker, L. (2013, October). Deep Learning
using Linear Support Vector Machines. International Conference on Machine Learning: Challenges in
Representation Learning Workshop.
Wiering, M. A., Schutten, M., Millea, A., Meijster, A., & Schomaker, L. (2016). Deep Support Vector Machines for Regression Problems. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.718.987&rep=rep1&type=pdf
Xing, W., Guo, R., Petakovic, E., & Goggins, S. (2015). Participation-based student final performance
prediction model through interpretable Genetic Programming: Integrating learning analytics, educational
data mining and theory. Computers in Human Behavior, 47, 168–181. doi:10.1016/j.chb.2014.09.034
Yang, X.-S., & Deb, S. (2014). Cuckoo Search: Recent Advances and Applications. Neural Computing & Applications, 24(1), 169–174. doi:10.1007/s00521-013-1367-1
Yuan, Y., Zhang, M., Luo, P., Ghassemlooy, Z., Lang, L., Wang, D., ... Han, D. (2017). SVM-based
detection in visible light communications. Optik (Stuttgart), 151, 55–64. doi:10.1016/j.ijleo.2017.08.089
ADDITIONAL READING
Asif, R., Merceron, A., Ali, S. A., & Haider, N. (2017, October). Analyzing undergraduate students’ performance using educational data mining. Computers & Education, 113, 177–194. doi:10.1016/j.compedu.2017.05.007
Baker, R., & Siemens, G. (2013). Educational Data Mining and Learning Analytics. Cambridge Handbook of the Learning Sciences.
Echegaray-Calderon, O. A., & Barrios-Aranibar, D. (2015, October). Optimal selection of factors using Genetic Algorithms and Neural Networks for the prediction of students’ academic performance. IEEE Latin America Congress on Computational Intelligence (LA-CCI), 1-6.
Fu, J., Chang, J., Huang, Y., & Chao, H. (2012). A Support Vector Regression-Based Prediction of Students’ School Performance. International Symposium on Computer, Consumer and Control. 10.1109/IS3C.2012.31
Guo, X., Huang, H., & Zhang, J. (2014). Comparison of Different Variants of Restricted Boltzmann Machines. 2nd International Conference on Information Technology and Electronic Commerce (ICITEC). 10.1109/ICITEC.2014.7105610
KEY TERMS AND DEFINITIONS
Cognitive Factors: Characteristics of the student that have a direct effect on the student’s learning and performance.
Educational Data Mining: Tools and techniques to extract meaningful patterns from educational data.
Non-Cognitive Factors: Characteristics of the student that do not directly affect learning and performance but may influence them indirectly.
Optimization: The technique of improving a model’s accuracy by tuning its learning weights.
Predictive Analytics: Exploration of data to predict future outcomes using methods such as statistics and machine learning.
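The Optimization entry above can be made concrete with a minimal sketch: tuning a single learning weight to reduce prediction error on toy data. Note that this sketch uses plain gradient descent for simplicity, not the cuckoo search with Levy flight employed in the chapter; the data and variable names are purely illustrative.

```python
# Minimal sketch of optimization as weight tuning (illustrative only):
# gradient descent fits one weight w of a linear model y = w * x
# to toy data generated by the true relation y = 2x.

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # hypothetical (x, y) pairs

w = 0.0    # initial learning weight
lr = 0.05  # learning rate

for _ in range(200):
    # gradient of the mean squared error with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad  # weight update: step against the gradient

print(round(w, 3))  # → 2.0, the weight that minimises the error
```

Metaheuristics such as cuckoo search pursue the same goal (finding weights that minimise error) but explore the weight space with randomised candidate solutions instead of following a gradient, which lets them escape local minima.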