Predicting Student Academic Performance Using Machine Learning

September 2021

DOI:10.1007/978-3-030-87013-3_36

In book: Computational Science and Its Applications – ICCSA 2021 (pp.481-491)

Authors:

Opeyemi Ojajuni

Southern University and A&M College

Foluso Ayeni

University of Nebraska at Omaha

Femi Ekanoye

ICT University

Predicting Student Academic Performance

Using Machine Learning

Opeyemi Ojajuni1, Foluso Ayeni2(B), Olagunju Akodu3, Femi Ekanoye4,

Samson Adewole4, Timothy Ayo4, Sanjay Misra5, and Victor Mbarika6

1Department of Science and Mathematics Education, Southern University and A&M College,

Baton Rouge, USA

Opeyemi_ojajuni_00@subr.edu

2Department of Information Systems and Quantitative Analysis, University of Nebraska,

Omaha, USA

fayeni@unomaha.edu

3Department of Electrical and Electronics Engineering, Southern University and A&M

College, Baton Rouge, USA

olagunju_akodu_00@subr.edu

4Global Technology Management and Policy Research Group, Southern University and A&M

College, Baton Rouge, USA

femi_ekanoye@subr.edu, oluwadamilaresam@gmail.com,

timothyayo99@gmail.com

5Department of Information and Communication Engineering, Covenant University,

Ota, Nigeria

sanjay.misra@covenantuniversity.edu.ng

6Department of Management Information Systems, East Carolina University, Greenville, USA

mbarikav20@ecu.edu

Abstract. The introduction of the Internet of Things (IoT), Artificial Intelligence (AI), Machine Learning (ML), Deep Learning (DL), and Big Data has paved the way for research focused on improving the student learning experience and addressing challenges faced by the education system. Machine Learning technology analyzes data to recognize patterns and uses them to make predictions. This paper introduces an ML model that classifies and predicts student academic success using supervised ML algorithms such as Random Forest, Support Vector Machines, Gradient Boosting, Decision Tree, Logistic Regression, Regression, Extreme Gradient Boosting (XGBoost), and Deep Learning. The paper aims to predict students’ academic success from historical data and to identify the key factors that affect it. The proposed approach predicts student academic performance efficiently and accurately by comparing several ML models with a Deep Learning model. Results show that Extreme Gradient Boosting (XGBoost) can predict student academic performance with an accuracy of 97.12%. The results also reveal significant social and demographic features that affect student academic success. This study concludes that applying Machine Learning technology in the classroom will help educators identify gaps in student learning and enable early detection of underperforming students, thus empowering educators with informed decision-making.

O. Gervasi et al. (Eds.): ICCSA 2021, LNCS 12957, pp. 481–491, 2021.

https://doi.org/10.1007/978-3-030-87013-3_36

Keywords: Machine learning · Deep learning · Student academic performance · Educational data mining · Data analytics · Convolutional Neural Networks (CNN)

1 Introduction

Educational data mining (EDM) applies data mining, machine learning, and deep learning to data generated in academic settings to improve student learning experiences [1, 2, 3]. The interaction of students with learning platforms and materials creates large amounts of data [4, 5]. Analyzing this data provides insight into the student learning process and student achievement. Further analysis can identify academic, demographic, and social factors affecting student academic success. Student academic success is measured by assessing student performance across academic subjects. Teachers measure student academic performance in different ways, including students’ final grades, Grade Point Average (GPA), and standardized tests. According to reports from the United States Department of Education and the National Assessment of Educational Progress (NAEP), the education system suffers from several challenges, such as student academic underachievement, increased university dropout rates, graduation delays, and inadequate student workforce readiness. Over the years, student academic success has continued to decline, and the decline is more prevalent among minority students [6, 7, 8]. Education technology advancements such as Artificial Intelligence (AI), Virtual Reality (VR), 3D printing, smart multimedia devices, the Internet of Things (IoT), and Machine Learning are beginning to improve the student learning process and its management [9].

Machine Learning analyzes data to recognize patterns and uses those patterns to make predictions. Applying ML in the classroom will enable educators to identify critical factors affecting students’ success. Furthermore, ML will allow educators to identify underperforming students, thus empowering educators with informed decision-making. Several tools, such as R, Python’s Scikit-learn, and TensorFlow, are currently used in ML. A wide range of ML algorithms is also available for predicting student academic performance, including Random Forest, Support Vector Machines (SVM), AdaBoost, Decision Tree, Naive Bayes, and K-Nearest Neighbors.

In this research work, we use historical education data on student academic performance, collected from the UC Irvine Machine Learning Repository, to identify the key factors that affect student academic achievement. Furthermore, the research intends to predict future student academic success by recognizing patterns in the historical dataset and using those patterns to make predictions. The research questions addressed in this work are listed below:

1. What are the factors that have a significant effect on students’ academic success?

2. How can these factors predict student academic performance using machine learning?

The rest of the paper is organized under the following subheadings: related research work, material and methods, implementation and result, and conclusion and future work.

2 Related Research Work

Learning management systems have empowered educational institutions with interactive learning tools such as game-based applications, simulation applications, virtual reality, and e-learning systems. These platforms have allowed researchers to collect and analyze student data [2, 5]. The authors of [9] applied Decision Tree, Neural Network, and Support Vector Machine (SVM) classification algorithms to predict academic performance from student internet usage behaviors. Their results showed that student internet usage behaviors predict academic performance with an accuracy of 71%–76%; however, the authors considered accuracy as the only performance metric. In [10], the authors proposed a system that uses ML algorithms to predict students’ academic performance by classifying it as bad or good. The model was trained on data gathered from a university source and implemented using K-nearest neighbor and Decision Tree classifiers. The Decision Tree classifier achieved 94.44% accuracy, but again accuracy was the only performance metric considered.

Similarly, the authors of [2] proposed a classification ML model using SVM and Logistic Regression classifiers to predict students’ academic performance. The model extracted features from a preprocessed dataset obtained from an online educational platform to classify student academic performance as bad, average, or good. The SVM produced an accuracy of 79%, which was higher than that of Logistic Regression. The authors evaluated the system’s performance using accuracy, recall, precision, and F1-score derived from the confusion matrix. The authors of [1] used Naïve Bayes, Random Forest, and ensemble learner classification models to predict student academic performance using a dataset comprising 887 instances with 19 attributes of first-year students. The Random Forest classifier outperformed the other models with an accuracy of 93%. Recall, precision, and F1-score derived from the confusion matrix were used to evaluate model performance.

Research on ML in education is still in its preliminary stages, and many challenges, such as prediction accuracy, overfitting, underfitting, and model deployment, still need attention. Our proposed approach therefore aims at efficient and accurate prediction of student academic performance by comparing several ML models with deep learning models. Generally, deep learning models tend to achieve better accuracy because they extract features from the dataset incrementally. ML algorithms are applied to the dataset to analyze and identify the features that significantly impact student academic performance. Finally, leveraging these features, several ML models are trained to classify and predict the student academic performance category, and the models are compared based on accuracy score and cross-validation score.

3 Material and Methods

3.1 Tools

The experiments were conducted on a computer running the macOS Big Sur operating system, with a 2.3 GHz dual-core Intel Core i5 processor and 8 GB of memory. The Python programming language was used along with the Scikit-learn and TensorFlow ML libraries to implement the algorithms, build the ML models, and obtain statistical results [11, 12].

3.2 Dataset

The dataset used in this study was obtained from the UC Irvine Machine Learning Repository [13]. It contains academic performance records for 1044 students from two high schools. The data attributes include demographic, social, and academic-related features. Table 1 summarizes the dataset attributes.

Table 1. Dataset [13] attributes

| Feature category | Name of the attribute | Description | Attribute type |
|---|---|---|---|
| Demographical features | School | Student’s school | Categorical |
| | Sex | Student’s sex | Categorical |
| | Age | Student’s age | Numeric |
| | Address | Student’s home address type | Categorical |
| | Famsize | Family size | Categorical |
| | Pstatus | Parent’s cohabitation status | Categorical |
| | Medu | Mother’s education | Numeric |
| | Fedu | Father’s education | Numeric |
| | Mjob | Mother’s job | Categorical |
| | Fjob | Father’s job | Categorical |
| | Reason | Reason to choose this school | Categorical |
| | Guardian | Student’s guardian | Categorical |
| Social features | Internet | Internet access at home | Categorical |
| | Romantic | With a romantic relationship | Categorical |
| | Famrel | Quality of family relationships | Numeric |
| | Freetime | Free time after school | Numeric |
| | Goout | Going out with friends | Numeric |
| | Dalc | Workday alcohol consumption | Numeric |
| | Walc | Weekend alcohol consumption | Numeric |
| | Health | Current health status | Numeric |
| Academic related features | Absences | Number of school absences | Numeric |
| | Traveltime | Home to school travel time | Numeric |
| | Studytime | Weekly study time | Numeric |
| | Failures | Number of past class failures | Numeric |
| | Schoolsup | Extra educational support | Categorical |
| | Famsup | Family educational support | Categorical |
| | Paid | Extra paid classes within the course subject | Categorical |
| | Activities | Extra-curricular activities | Categorical |
| | Nursery | Attended nursery school | Categorical |
| | Higher | Wants to take higher education | Categorical |
| | Final grade | Final grade | Numeric |

3.3 Data Preprocessing and Feature Engineering

Data preprocessing was performed on the dataset to check for null values, duplicates, and invalid values. The dataset was found to be clean and ready for encoding. The final grade was converted into five multiclass categories (excellent, good, satisfactory, poor, and failure) under the following conditions:

• Excellent – final grade score is between 45–60

• Good – final grade score is between 36–44

• Satisfactory – final grade score is between 24–35

• Poor – final grade score is between 20–23

• Failure – final grade score is between 0–19
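
The banding above can be sketched as a small Python function. This is only an illustration and not the authors' code: the function name `grade_category` is hypothetical, and the failure band is assumed to cover scores below 20 so that the five bands do not overlap.

```python
def grade_category(total_score: int) -> str:
    """Map a final grade score (0-60) to one of five performance classes."""
    if not 0 <= total_score <= 60:
        raise ValueError(f"score out of range: {total_score}")
    # Check the bands from highest to lowest so each score matches exactly one.
    if total_score >= 45:
        return "excellent"
    if total_score >= 36:
        return "good"
    if total_score >= 24:
        return "satisfactory"
    if total_score >= 20:
        return "poor"
    return "failure"  # assumed band: 0-19


# Boundary scores land in the expected bands.
labels = [grade_category(s) for s in (60, 44, 24, 23, 19)]
print(labels)
```

Checking the thresholds from the top down means every score falls into exactly one class, which is what a multiclass target needs.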

ML models require all input and output data to be numeric. Any data that is not numeric must be encoded as numeric values before it is fit to an ML model. Several attributes in our dataset are non-numeric and categorical, as seen in Table 1. This study employs one-hot encoding in Python’s Scikit-learn to encode and normalize the non-numeric and categorical attributes [11]. Feature engineering techniques then help extract the important features from the dataset.
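
To make the encoding step concrete, the sketch below shows what one-hot encoding does to categorical attributes. It is written in plain Python to expose the transformation itself; the study reports using Scikit-learn, whose `OneHotEncoder` class in `sklearn.preprocessing` performs the same kind of mapping. The helper name `one_hot_encode` and the three sample rows are hypothetical and are not taken from the dataset.

```python
def one_hot_encode(rows, categorical_columns):
    """Replace each categorical column with one 0/1 column per observed category."""
    # Collect the sorted set of categories seen in each categorical column.
    categories = {
        col: sorted({row[col] for row in rows}) for col in categorical_columns
    }
    encoded = []
    for row in rows:
        # Numeric columns are copied through unchanged.
        new_row = {k: v for k, v in row.items() if k not in categorical_columns}
        for col, values in categories.items():
            for value in values:
                new_row[f"{col}_{value}"] = 1 if row[col] == value else 0
        encoded.append(new_row)
    return encoded


# Made-up rows for illustration only.
students = [
    {"sex": "F", "Mjob": "teacher", "age": 17},
    {"sex": "M", "Mjob": "health", "age": 16},
    {"sex": "F", "Mjob": "other", "age": 18},
]
encoded = one_hot_encode(students, ["sex", "Mjob"])
print(encoded[0])
```

Each categorical attribute becomes as many binary columns as it has distinct values, so the model never treats category labels as ordered numbers.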

3.4 Machine Learning Classiﬁcation Model

ML problems are grouped into supervised and unsupervised learning. Unsupervised ML works with unlabeled data, while supervised ML works with a labeled dataset in which the input variables are mapped to the output variables. Supervised ML problems are further grouped into regression and classification problems [14]. Regression problems involve predicting a continuous numeric value, for example, a student’s final grade score. Classification refers to predicting a category from input data points. The output can be binary (“fail” or “pass”) or multiclass (“excellent”, “good”, “satisfactory”, “poor”, and “failure”). Because classification is supervised, the input data is labeled and mapped to the output data, and the ML model is trained to predict the output from the input. Implementing an ML classifier requires importing the necessary ML packages and then loading the dataset [14]. Data preprocessing and cleaning are performed on the dataset to check for null values, duplicates, and invalid values and to encode non-numeric and categorical attributes.

After successful data preprocessing, feature engineering explores the dataset to understand the correlation between variables and to identify the features that significantly impact the output variable. This enabled us to improve the model’s accuracy by removing attributes that significantly impact the output variable (the final student grade) but are not essential features for predicting student academic performance. The refined dataset is then split into training and testing sets. The training set trains the model, and the testing set measures the model’s performance based on accuracy and cross-validation. Figure 1 shows the ML model flowchart for this study. This study built and trained the following ML classification algorithms: Random Forest, Support Vector Machine classifier, Stochastic Gradient Descent, Decision Tree, Adaptive Boosting, Logistic Regression, Extreme Gradient Boosting (XGBoost), and Deep Learning. Deep learning is a technique that uses neural network concepts to build and train ML models. A deep learning model consists of an input layer (which receives the input data), hidden layers (which incrementally extract important features), and an output layer [15]. A deep learning model consisting of a Convolutional Neural Network (CNN) with four hidden layers was considered suitable for our research objectives.
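
The correlation step described above could look roughly like the following sketch, which ranks numeric features by the strength of their linear relationship with the final grade. The paper does not say which correlation measure was used, so the Pearson coefficient here is an assumption; the helper names `pearson` and `rank_features` are hypothetical, and the sample values are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def rank_features(columns, target):
    """Return (feature, correlation) pairs sorted by absolute correlation, strongest first."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up values for illustration only; not taken from the dataset.
columns = {
    "absences": [0, 2, 4, 6, 8],
    "studytime": [4, 3, 3, 2, 1],
    "health": [3, 3, 3, 3, 3],
}
final_grade = [55, 50, 40, 30, 20]
ranking = rank_features(columns, final_grade)
for name, score in ranking:
    print(f"{name}: {score:+.3f}")
```

Sorting by absolute value keeps strongly negative relationships, such as more absences going with lower grades in this toy data, near the top of the ranking.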

3.5 Machine Learning Model Performance Evaluation

The testing dataset is used to measure the performance of an ML model. Accuracy, cross-validation, precision, recall, F1-score, confusion matrix, log loss, Receiver Operating Characteristic (ROC), and Area Under the Curve (AUC) are some of the performance metrics used to evaluate ML classification models [16]. This research employs accuracy and cross-validation as performance metrics to evaluate the ML classification models. The CNN model’s performance was evaluated using a confusion matrix, from which the model’s accuracy, precision, and sensitivity were calculated. Accuracy is the number of correct predictions divided by the total number of predictions [7]. Cross-validation assesses how effectively the model will perform on new data. The confusion matrix is an error matrix that visualizes ML model performance by tabulating actual classes against predicted classes. Precision is the ratio of correctly predicted instances of a class to all instances predicted as that class. Sensitivity is the proportion of actual instances of a class that the model predicts correctly [7].
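
The definitions above can be illustrated with a short sketch that derives the three metrics from a multiclass confusion matrix. The paper does not state how per-class precision and sensitivity were combined, so the macro average (unweighted mean over classes) used here is an assumption; the function name `classification_metrics` and the example matrix are hypothetical.

```python
def classification_metrics(matrix):
    """Compute accuracy plus macro-averaged precision and sensitivity.

    matrix[i][j] counts samples whose true class is i and predicted class is j.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n_classes))
    precisions, sensitivities = [], []
    for k in range(n_classes):
        true_pos = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(n_classes))  # column sum
        actual_k = sum(matrix[k])  # row sum
        precisions.append(true_pos / predicted_k if predicted_k else 0.0)
        sensitivities.append(true_pos / actual_k if actual_k else 0.0)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n_classes,
        "sensitivity": sum(sensitivities) / n_classes,
    }


# Made-up 3-class confusion matrix for illustration only.
example = [
    [8, 2, 0],
    [1, 6, 3],
    [0, 2, 8],
]
metrics = classification_metrics(example)
print({name: round(value, 4) for name, value in metrics.items()})
```

Accuracy can be high while macro-averaged precision and sensitivity stay low when a model favors the largest classes, which is one reason to report all three.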

Fig. 1. ML model ﬂowchart

4 Implementation and Result

The “plot_importance” function of the XGBoost library helps plot the important features that affect students’ final grades. The order of importance of the features and their scores in predicting student academic performance can be seen in Fig. 2. The number of school absences has the highest importance score, which suggests that students who miss school are more likely to have poor academic performance. Current health status, going out with friends, free time after school, and quality of family relationships are the major social features that affect student academic performance. Mother’s job, father’s job, parents’ cohabitation status, student’s home address type, and reason for choosing the school are the least important features affecting student academic performance.

Fig. 2. Important features and their scores

To obtain an accurate evaluation of our model, the dataset containing 1044 students is split into training and test sets in a 70% to 30% ratio using the ‘train_test_split’ function in Scikit-learn. After building and training each ML model, the cross-validation function ‘cross_val_score’ was used to compute the model’s average accuracy on the test dataset. The cross-validation function divides the test dataset into smaller subsets; the model is fit and its accuracy score is computed five times, with a different subset held out each time [17]. Applying the various classification models to the dataset yielded a different accuracy and cross-validation score for each model, as shown in Table 2. The Deep Learning model gave an accuracy of 72.74%, a precision of 30.31%, and a sensitivity of 31.38%. Figure 3 shows the confusion matrix used in calculating these performance metrics. The Extreme Gradient Boosting (XGBoost) model outperforms the other models in predicting student academic performance, with 97.12% accuracy and 35.67% cross-validation. Since the XGBoost model gave the best accuracy, it is the most suitable ML model considering the nature of our dataset and research objectives.
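
The mechanics of the 70/30 split and the five-fold cross-validation described above can be sketched in plain Python. This is a simplified stand-in for Scikit-learn's `train_test_split` and `cross_val_score`, not a reproduction of them or of the authors' code: the helper names are hypothetical, the rounding of the test-set size may differ from Scikit-learn's, and `fit_and_score` is an assumed callback that trains a model on one index list and returns its accuracy on the other.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.3, seed=0):
    """Shuffle sample indices and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=5):
    """Yield (fit, held_out) index lists; each fold is held out exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, held_out


def cross_val_accuracy(indices, fit_and_score, k=5):
    """Average the accuracy returned by fit_and_score over k folds."""
    scores = [fit_and_score(fit, held_out) for fit, held_out in k_fold_indices(indices, k)]
    return sum(scores) / len(scores)


# Split the 1044 samples 70/30, then build five folds over the test portion.
train_idx, test_idx = train_test_split_indices(1044)
print(len(train_idx), len(test_idx))
for fit, held_out in k_fold_indices(test_idx, k=5):
    print(len(fit), len(held_out))
```

Because each sample is held out exactly once, the averaged score reflects performance on data the model did not see during fitting, which is why it can differ sharply from a single accuracy figure.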

Table 2. Comparison of Machine Learning models

| ML classifier | Accuracy (%) | Cross validation (%) |
|---|---|---|
| Decision Tree Model | 47.95 | 30.89 |
| Random Forest Model | 92.60 | 35.66 |
| Support Vector Classifier Model | 42.88 | 34.39 |
| Logistic Regression Model | 40.96 | 36.62 |
| Ada Boost Model | 35.75 | 32.48 |
| Stochastic Gradient Descent | 33.69 | 33.121 |
| XGBoost Model | 97.12 | 35.67 |
| Deep Learning (CNN) | 72.22 | Precision = 30.31; Sensitivity = 31.38 |

Fig. 3. Deep Learning confusion matrix

5 Conclusion and Future Work

This study explored how Machine Learning can empower educators with informed decision-making. Predicting student academic performance or success is essential in tackling the student academic performance crisis. This study used several ML classification models to predict student academic performance. Results showed accuracy ranging from 33% to 98% and cross-validation scores ranging from 30% to 37%. The XGBoost model is the most suitable ML model, achieving 97.12% accuracy and 35.67% cross-validation. Furthermore, results showed that the number of school absences, current health status, going out with friends, free time after school, and quality of family relationships are significant features that affect student academic performance. This study concludes that this research work can help educators identify gaps in student learning and enable early detection of underachieving students, thus empowering educators with informed decision-making and ultimately improving student academic success and the learning process.

References

1. Jayaprakash, S., Krishnan, S., Jaiganesh, V.: Predicting students academic performance using an improved random forest classifier. In: 2020 International Conference on Emerging Smart Computing and Informatics (ESCI), Pune, India, pp. 238–243, March 2020. https://doi.org/10.1109/ESCI48226.2020.9167547

2. Bhutto, E.S., Siddiqui, I.F., Arain, Q.A., Anwar, M.: Predicting students’ academic performance through supervised machine learning. In: 2020 International Conference on Information Science and Communication Technology (ICISCT), Karachi, Pakistan, pp. 1–6, February 2020. https://doi.org/10.1109/ICISCT49550.2020.9080033

3. Jacob, J., Jha, K., Kotak, P., Puthran, S.: Educational data mining techniques and their applications. In: 2015 International Conference on Green Computing and Internet of Things (ICGCIoT), pp. 1344–1348, October 2015. https://doi.org/10.1109/ICGCIoT.2015.7380675

4. Al Mayahi, K., Al-Bahri, M.: Machine learning based predicting student academic success. In: 2020 12th International Congress on Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT), Brno, Czech Republic, pp. 264–268, October 2020. https://doi.org/10.1109/ICUMT51630.2020.9222435

5. Olaperi, Y., Fernandez-Sanz, L., Medina, J., Misra, S.: Framework for academic advice through mobile applications (2016)

6. Statement from Secretary DeVos on 2019 NAEP Results. U.S. Department of Education. https://www.ed.gov/news/press-releases/statement-secretary-devos-2019-naep-results. Accessed 24 Feb 2021

7. Rimadana, M.R., Kusumawardani, S.S., Santosa, P.I., Erwianda, M.S.F.: Predicting student academic performance using machine learning and time management skill data. In: 2019 International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), Yogyakarta, Indonesia, pp. 511–515, December 2019. https://doi.org/10.1109/ISRITI48646.2019.9034585

8. bin Mohd Nasir, M.A.H., bin Asmuni, M.H., Salleh, N., Misra, S.: A review of student attendance system using near-field communication (NFC) technology. In: Gervasi, O., et al. (eds.) ICCSA 2015. LNCS, vol. 9158, pp. 738–749. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21410-8_56

9. Xu, X., Wang, J., Peng, H., Wu, R.: Prediction of academic performance associated with internet usage behaviors using machine learning algorithms. Comput. Hum. Behav. 98, 166–173 (2019). https://doi.org/10.1016/j.chb.2019.04.015

10. Hasan, H.M.R., Rabby, A.S.A., Islam, M.T., Hossain, S.A.: Machine learning algorithm for student’s performance prediction. In: 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Kanpur, India, pp. 1–7, July 2019. https://doi.org/10.1109/ICCCNT45670.2019.8944629

11. Scikit-learn: machine learning in Python—scikit-learn 0.24.2 documentation. https://scikit-learn.org/stable/. Accessed 04 May 2021

12. TensorFlow. https://www.tensorflow.org/. Accessed 04 May 2021

13. UCI Machine Learning Repository. https://archive.ics.uci.edu/ml/index.php. Accessed 28 Feb 2021

14. Nurafifah, M.S., Abdul-Rahman, S., Mutalib, S., Hamid, N.H.A., Malik, A.M.A.: Review on predicting students’ graduation time using machine learning algorithms. Int. J. Mod. Educ. Comput. Sci. 11(7), 1 (2019). https://doi.org/10.5815/ijmecs.2019.07.01

15. Lye, C.-T., Ng, L.-N., Hassan, M.D., Goh, W.-W., Law, C.-Y., Ismail, N.: Predicting pre-university student’s mathematics achievement. Procedia. Soc. Behav. Sci. 8, 299–306 (2010). https://doi.org/10.1016/j.sbspro.2010.12.041

16. Vijayalakshmi, V., Venkatachalapathy, K.: Comparison of predicting student’s performance using machine learning algorithms. Int. J. Intell. Syst. Appl. 11(12), 34 (2019). https://doi.org/10.5815/ijisa.2019.12.04

17. 3.1. Cross-validation: evaluating estimator performance—scikit-learn 0.24.2 documentation. https://scikit-learn.org/stable/modules/cross_validation.html. Accessed 04 May 2021

Pakistan Journal of Life and Social Sciences Machine Learning in Multicultural Education

Article

Full-text available

Jan 2024

This study explores machine learning's role in multicultural education, identifying benefits and challenges, and offering practical recommendations for effective implementation. It aims to cultivate inclusivity, empathy, and mutual understanding in a diverse society. The research uses a library research methodology to analyze existing literature from academic journals, books, and reports, providing case studies and practical examples to demonstrate the application of machine learning in multicultural education. Machine learning can improve multicultural education by personalizing learning experiences and adapting content to diverse backgrounds, but challenges like algorithmic bias, inadequate educator training, and resource limitations need to be addressed. This research explores the integration of machine learning (ML) in multicultural education, offering practical insights and recommendations for promoting inclusivity and equity in diverse educational settings

Using ML to Predict User Satisfaction with ICT Technology for Educational Institution Administration

Article

Full-text available

Apr 2024

Effective and efficient use of information and communication technology (ICT) systems in the administration of educational organisations is crucial to optimise their performance. Earlier research on the identification and analysis of ICT users’ satisfaction with administration tasks in education is limited and inconclusive, as they focus on using ICT for nonadministrative tasks. To address this gap, this study employs Artificial Intelligence (AI) and machine learning (ML) in conjunction with a survey technique to predict the satisfaction of ICT users. In doing so, it provides an insight into the key factors that impact users’ satisfaction with the ICT administrative systems. The results reveal that AI and ML models predict ICT user satisfaction with an accuracy of 94%, and identify the specific ICT features, such as usability, privacy, security, and Information Technology (IT) support as key determinants of satisfaction. The ability to predict user satisfaction is important as it allows organisations to make data-driven decisions on improving their ICT systems to better meet the needs and expectations of users, maximising labour effort while minimising resources, and identifying potential issues earlier. The findings of this study have important implications for the use of ML in improving the administration of educational institutions and providing valuable insights for decision-makers and developers.

PREDICTING ACADEMIC PERFORMANCE USING MACHINE LEARNING (HSE CASE STUDY)

Thesis

Full-text available

Jun 2024

Predicting Academic Performance among Students Using Machine Learning (HSE Case Study)

Thesis

Full-text available

Jun 2024

In higher education, accurately predicting a student's academic success is still a major difficulty. Conventional approaches, which frequently depend on statistics and results from standardized tests, might not adequately account for the variety of factors that affect students' achievement. This study investigates how elements other than traditional measurements can be taken into account by machine learning algorithms to improve the prediction of academic performance for students. We constructed a variety of machine learning models (both Regressors and Classifiers), including Decision Tree, Support Vector Machine (SVM), Gradient Boosting, Random Forest, K-Nearest Neighbors (KNN), and Naïve Bayes, using a diversified dataset from the Higher School of Economics. According to our research, these sophisticated methods can reveal minute patterns and connections, which could increase prediction accuracy and provide us a better understanding of the variables affecting academic success. The SVM model outperformed other models in classification tests, as evidenced by its greatest AUC-ROC score, which indicates its greater ability to differentiate between various student performance levels. The Random Forest regressor was found to be the best-performing algorithm for regression tasks, highlighting important factors including self-discipline, university affiliation, class attendance, and enjoyment of projects. The goal of this research is to provide efficient support systems for student learning and success by highlighting the useful advantages of using machine learning into educational environments. Educational institutions can gain a deeper understanding of and improve the elements that contribute to student accomplishment by utilizing these sophisticated analytical techniques.

XGBoost To Enhance Learner Performance Prediction

Article

Full-text available

Jun 2024

The huge amount of data generated by an Intelligent Tutoring System becomes useful when analyzed in an appropriate way to provide significant insights about learners, especially his or her performance. Performance data retrieved from historical interactions is the main engine for learner performance prediction, where the likelihood of the learner answering correctly future questions is calculated. Modeling learner performance can provide significant insights into individual students to promote successful learning and maximize educational achievement. This study aims to enhance the learner performance prediction of some logistic regression-based models, namely Item Response Theory, Performance Factor Analysis, and DAS3H using XGBoost, including an empirical comparison of eight real-world datasets, containing performance log data collected from different online intelligent tutoring systems, involving the first time a new dataset from Moodle Morocco. The results have demonstrated that the XGBoost has enhanced PFA predictive performance on seven datasets with an AUC of up 0.88 and improved the DAS3H AUC on the ASSISTment17 dataset while conserving almost the same predictive results for Item Response Theory on some datasets.

Identification of Social Anxiety in High School: A Machine Learning Approaches to Real-Time Analysis of Student Characteristics

Article

Full-text available

Jan 2024

Students in high school commonly struggle with social anxiety, which has a negative effect on both their academic performance and emotional health. The various forms of social anxiety exhibited by students at Little Scholars Matriculation Hr. Sec. School in Thanjavur, Tamil Nadu, India, are the subject of this study. The study uses a strong analytical framework to investigate social phobia experiences by utilizing techniques such as machine learning, clustering, data exploration, and correlation analyses. Visual plots based on answers to a 17-item Social Phobia Inventory (SPIN) questionnaire reveal a measurable increase in distress with the severity of social phobia. Correlation analyses clarify the relationships between survey items, revealing the complex dynamics of high school social interactions. Using clustering techniques, different subgroups of students are found within the student population according to shared or unique traits related to social anxiety. By utilizing machine learning, the latent features linked to every survey question offer a more thorough comprehension of the factors affecting the reported levels of distress. In addition to defining social anxiety, the study draws attention to particular social phobia characteristics at Little Scholars School. To address the identified fears, the research suggests an innovative strategy that involves creating customized scenarios using Virtual Reality (VR) and Augmented Reality (AR) technologies. This method emphasizes the cooperation of experts in psychology, education, and technology across disciplinary boundaries, providing a focused and immersive approach to reducing social anxiety. The study concludes by recommending future paths for widespread adoption and ongoing investigation of technological advancements in mental health support systems, in addition to highlighting the possible advantages of VR and AR therapy for high school students.

A comprehensive analysis of the role of artificial intelligence in aligning tertiary institutions academic programs to the emerging digital enterprise

Article

Full-text available

May 2024
Educ Inform Tech

The study explores the use of Artificial Intelligence (AI) frameworks in transforming academic programs into adaptive, industry-relevant programs. The paper explores the development, validation, and effectiveness of artificial intelligence (AI) frameworks in aligning academic programs with the digital enterprise while highlighting the importance of these frameworks in enhancing graduates' digital skills and employability. Through a comprehensive analysis of existing literature, the paper highlights the significance of AI frameworks in bridging the gap between academia and industry requirements. It also presents case studies and empirical evidence to demonstrate the effectiveness of these frameworks in enhancing graduates' digital competencies. The findings emphasize the need for educational institutions to adopt AI frameworks to equip graduates with the necessary skills for success in the digital age. The research highlights the need for educational institutions to adapt to the rapidly changing digital landscape. The study shows significant improvements in graduates' digital literacy, problem-solving abilities, and adaptability to technological advancements. The real-world implications of these AI-driven educational interventions highlight the transformative potential of integrating AI technologies in education.

Application of Learning Analytics in Higher Education: Datasets, Methods and Tools

Article

Full-text available

Jun 2024

Yulia Dyulicheva

The accumulation of big educational data on university platforms and social media creates a need for tools that extract regularities from educational data. These regularities can be used to understand the behavioral patterns of students and teachers, improve teaching methods and the quality of the educational process, and form sound strategies and policies for university development. This article provides an analysis and systematization of datasets in available repositories, taking into account the learning analytics problems solved on their basis. In particular, the article notes the predominance of datasets aimed at solving analytical problems at the level of understanding student behavior. Datasets aimed at solving analytical problems at the level of understanding the needs of teachers and administrative and managerial staff of universities are practically absent. Meanwhile, the full potential of learning analytics tools can only be revealed by introducing an integrated approach to the analysis of educational data, taking into account the needs of all participants and organizers of the educational process. This review article discusses learning analytics methods related to the study of social interaction patterns between students and teachers, and learning analytics tools ranging from simple dashboards to complex frameworks that explore various levels of learning analytics. The problems and limitations that prevent learning analytics from realizing its potential in universities are considered. It is noted that universities are generally interested in introducing learning analytics tools that can improve the quality of the educational process by developing strategies for targeted support for individual groups of students. However, teachers treat such initiatives with caution due to a lack of skills in data analysis and in correctly interpreting analysis results.
The novelty of this analytical review is associated with the consideration of learning analytics at different levels of its implementation in the context of approaches to openness, processing and analysis of educational data. This article will be of interest to developers of learning analytics tools, scientific and pedagogical workers, and administrative and managerial staff of universities from the point of view of forming an idea of the integrity of the university analytics process, taking into account various levels of analytics implementation aimed at understanding the needs and requirements of all participants in the educational process.

Prediction-based techniques on Academic Performance of Graduating Students through Machine Learning

Conference Paper

Apr 2024

Recent advancements in information technologies have made it common practice to analyze educational data from various sources. Analyzing these data could improve the education system by identifying the factors influencing students' academic performance and assessing their performance status. Thus, predicting students' academic performance, achieving a high accuracy level, and extracting insights from massive volumes of educational data remain significant concerns among researchers. This study predicts the academic performance of students using decision tree (DT), logistic regression (LR), random forest (RF), support vector machine (SVM), and naïve Bayes (NB) algorithms, utilizing data from Agusan del Sur State College of Agriculture and Technology (ASSCAT). The results show that the best model of the five algorithms was logistic regression, which obtained an accuracy of 0.91, followed by the support vector machine, which yielded an accuracy of 0.90. The logistic regression model can help university administrators, faculty, and students predict which students may underperform, allowing for timely intervention.

Data-driven Fuel Flow Prediction Model for Aircraft Engines

Conference Paper

Nov 2023

Machine Learning Based Predicting Student Academic Success

Article

Full-text available

Oct 2020

Al-Bahri Mahmood

Today, institutions and companies are accelerating the use of AI technologies in their businesses to achieve a clear vision and quality results. The education sector is one of the sectors where AI can be used because of big data. In this work we created a machine learning-based model to predict a student's educational performance. The developed model relied on the student's previous data and performance in the last stage of school. The model showed a very high accuracy rate and can therefore be adopted.

Comparison of Predicting Student's Performance using Machine Learning Algorithms

Article

Full-text available

Dec 2019
IJISA

Machine Learning Algorithm for Student's Performance Prediction

Conference Paper

Full-text available

Dec 2019

Student performance prediction is very important for understanding a student's rate of progress. It is said that 'prevention is better than cure'. In this research, we try to determine a student's current status and predict his/her future results. Based on the outcome, teachers can give the student proper advice to avoid a poor result and can also groom the student, for example by identifying the dependencies for final examinations and which courses he/she should take in the upcoming semester (the role of an adviser/teacher). Every year many students lag behind because of a lack of proper advice and monitoring. A teacher cannot monitor every single student at once. If a system can tell a teacher which student needs which kind of help, it will be very helpful for both teachers and students. The aim is to help the student avoid his/her predicted poor result using Artificial Intelligence: by predicting the final examination mark, the system can let a student know what his/her result will be and notify him/her what to do to avoid a bad result. This research would be helpful for students and teachers, with a highest accuracy of 94.88%.

Framework for Academic Advice through Mobile Applications

Conference Paper

Full-text available

Aug 2016

The increasing number of high (secondary) school leavers choosing academic majors to study at university without proper guidance has often left students with unfavorable consequences, including low grades, extra year(s), the need to switch programs, and ultimately having to withdraw from the university. To address this issue, this research aims to build an expert system that recommends university or academic majors to high school students in developing countries where there is a dearth of human career counselors. The goal is to reduce the adverse effects of wrong choices made by students. A mobile rule-based expert system supported by an ontology was developed for easy accessibility by the students.

Educational Data Mining techniques and their applications

Conference Paper

Full-text available