Mining in Educational Data: Review and Future Directions

Said A. Salloum 1,2 (✉), Muhammad Alshurideh 3,4, Ashraf Elnagar 1,5, and Khaled Shaalan 2
1 Research Institute of Sciences and Engineering, University of Sharjah, Sharjah, UAE
ssalloum@sharjah.ac.ae
2 Faculty of Engineering and IT, The British University in Dubai, Dubai, UAE
3 Faculty of Business, University of Jordan, Amman, Jordan
4 Management Department, University of Sharjah, Sharjah, UAE
5 Department of Computer Science, University of Sharjah, Sharjah, UAE
Abstract. One of the developing fields of the present times is educational data
mining, which pertains to developing methods that help in examining various kinds
of data obtained from the educational field. A vital part is played by data mining
in the education field, particularly when behavior is being assessed in an online
learning setting. This is because data mining is capable of analyzing and
identifying the hidden information regarding the data itself, which is very
difficult and takes up a lot of time if performed manually. This review has the
objective of examining the way data mining was handled by researchers in the
past and the most recent trends on data mining in educational research, as well
as evaluating the likelihood of employing machine learning in the field of
education. The various limitations inherent in the current research are examined
and recommendations are made for future research.
Keywords: Educational data mining · Online learning · Machine learning
1 Introduction
The most advanced universities today frequently use data mining methods to examine the
data they collect and to extract information and knowledge that facilitate
decision-making [1–6]. Several review studies have been carried out to offer a significant
understanding of the present research trends in Educational Data Mining (EDM) [7–11].
However, further studies are still needed to examine this issue from different angles.
Previous research was found to disregard the examination of EDM in terms of electronic
learning (E-learning) studies [12–14] from a variety of perspectives [15–18]. The purpose
of educational data mining is to develop techniques that analyze the different kinds of
data obtained from the field of education [19], and the field continues to evolve together
with those techniques [20,21]. The purpose of this review paper is to determine ways of
applying data mining techniques to the higher education system by presenting the most
widely used methods and the most relevant studies carried out in this field to date. The
remainder of the paper is organized as follows: Sect. 2 summarizes the relevant literature
and explains the data mining techniques used in educational systems.
© Springer Nature Switzerland AG 2020
A.-E. Hassanien et al. (Eds.): AICV 2020, AISC 1153, pp. 92–102, 2020.
https://doi.org/10.1007/978-3-030-44289-7_9
2 Literature Review
Data mining is the most robust methodology used for extracting valuable information
from the data warehouse [22]. It predicts hidden information through extraction so that
decision-making can be improved [23]. In education, the use of data mining has been
extended to the performance of students and staff and to managerial decisions [19].
Another term for data mining is knowledge discovery from data [24,25,26]. Data mining is a
multidimensional field comprising different aspects, for example, learning, statistics,
information technology, artificial intelligence, and the retrieval and visualization of
data [27]. The education system has become more balanced due to improved mining
applications [28]. The idea of educational data mining (EDM) has developed quickly across
the various kinds of educational institutes [29]. Academic analytics, on the other hand, is
specifically linked to institutional effectiveness and problems related to student
performance [12,13,30–37]. The areas that directly influence those studying at the
institution are part of EDM [38]. In this section, we review the last ten studies that
applied data mining techniques in educational environments, ranging from 2016 to 2019, as
shown in Table 1.
Table 1. List of studies focused on exploring educational data mining.
Authors | Research problem | Technique | Approach
[39] | Prediction of students' performance | Classification | Decision tree
[40] | Predicting academic performance of students | Classification and clustering | SVM, Naïve Bayes, Decision tree and Neural Network classifiers
[41] | Predicting students' academic procrastination | Classification and clustering | k-means clustering, ZeroR, OneR, ID3, J48, random forest, decision stump, JRip, PART, NBTree, and Prism
[42] | How data mining can help the admission working process, and how data mining can predict the students' jobs | Association Rule Mining and Classification | ID3 Decision Tree
[43] | Predicting students' performance | Clustering, classification, regression | Naive Bayes, Decision Tree, and Artificial Neural Network
[44] | Students' disposition analysis | Clustering | k-means
[45] | Predicting students' graduation | Classification | Multi-Layer Perceptron (MLP)
(continued)
2.1 Data Mining Techniques in Educational Systems
Educational data mining is one branch of data mining. Its key focus is on developing
models that extract hidden knowledge from students' data, which can then be used to
enhance the academic performance of students. In the process of educational data mining,
raw data from various educational systems is converted into valuable information that can
be employed by teachers, students and their parents, educational researchers and the
developers of educational software systems. Educational data mining may also be considered
a new model within the prevailing education system, one that is able to generate positive
interaction with different parts of the system and thereby eventually attain the objective
of enhancing teaching [51].
Educational Data Mining (EDM) is defined as the application of traditional data mining
techniques to the analysis of educational data, with the objective of obtaining
solutions to problems in the field of education [10]. EDM applications include the
formulation of e-learning systems [10,52], clustering educational data [53] and making
student performance predictions [54]. The techniques that are presently popular in
educational data mining fall into the following categories: sequential pattern,
clustering, prediction, classification, machine learning models and association rule
analysis.
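To make the clustering category concrete, the following is a minimal sketch of k-means (the approach listed for [41] and [44] in Table 1) in plain Python. The student records, the two features and the choice of k = 2 are hypothetical and serve only as an illustration; they are not taken from any of the reviewed studies.

```python
import random

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
    )

def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct observations
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                # New centroid is the per-feature mean of the cluster members
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:  # assignments are stable, stop early
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

# Hypothetical records: (logins per week, average quiz score)
students = [(2, 45), (3, 50), (1, 40), (12, 85), (14, 90), (11, 80)]
centroids, labels = kmeans(students, k=2)
print(labels)
```

On this toy data the two groups are far apart, so the first three and the last three students end up in separate clusters. With real data the features would normally be rescaled first, since the distance is otherwise dominated by the feature with the largest range.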
Table 1. (continued)
Authors | Research problem | Technique | Approach
[46] | Student placement prediction | Clustering, classification, regression | J48, Naïve Bayes, Random Forest, and Random Tree, Multiple Linear Regression, binomial logistic regression, Recursive Partitioning and Regression Tree (rpart), conditional inference tree (ctree) and Neural Network (nnet) algorithms
[47] | Students' performance prediction - predicting final grades of students | Multiple regression analysis | Recurrent Neural Network (RNN)
[48] | Students' performance prediction | Classification | NaiveBayes, Bayesian Network, ID3, J48 and Neural Network
[49] | Predict dropout at the universities | Multiple regression analysis | Neural Network - Multilayer perceptron algorithms and radial basis function
[50] | Predicting students' academic performance | Classification | Decision tree classifiers and neural networks
2.2 The Traditional Data Mining Techniques Applied in Educational Settings
Clustering is the most widely known data mining technique [39,55–62], followed by
classification [42,55,63–65], sequential pattern [18,58,66], prediction [40,67] and,
finally, association rule analysis [42,55]. From 1995 to 2005, the majority of studies on
educational data mining used the association rule analysis technique [11] because it
demands less expertise than other techniques [68]. From 2005 onward, however, the trend
shifted and researchers more often used clustering and classification methods for
analysis [9]. Association rule analysis frequently produces a large number of outputs, the
majority of which are not interesting and cannot be comprehended easily by those who are
not experienced in data mining [69]. To select the correct algorithms, researchers should
first design the data and make it consistent with the required output [9]. When a study is
small in scale, they can use the clustering approach, because it does not need the data
splitting that the classification approach requires [9]. In addition, researchers can
always compare various algorithms on the same database that was used in [55]. This helps
determine whether identical results would be attained when a different approach is used.
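The remark that association rule analysis yields many outputs can be illustrated with a minimal sketch that computes support and confidence for every rule of the form A -> B over pairs of items. The activity records, item names and thresholds below are hypothetical; the sketch is not the procedure of any cited study.

```python
from itertools import permutations

def pairwise_rules(transactions, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) for rules A -> B."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in permutations(items, 2):
        count_a = sum(1 for t in transactions if a in t)
        count_ab = sum(1 for t in transactions if a in t and b in t)
        if count_a == 0:
            continue
        support = count_ab / n            # share of records containing both items
        confidence = count_ab / count_a   # share of A-records that also contain B
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules

# Hypothetical activity records, one set of observed items per student
records = [
    {"watched_videos", "did_quiz", "passed"},
    {"watched_videos", "did_quiz", "passed"},
    {"watched_videos", "passed"},
    {"did_quiz"},
    {"watched_videos", "did_quiz", "passed"},
]

loose = pairwise_rules(records, min_support=0.4, min_confidence=0.7)
strict = pairwise_rules(records, min_support=0.4, min_confidence=0.9)
print(len(loose), len(strict))
```

Even with three items and five records, the looser thresholds accept every possible pairwise rule, while the stricter confidence threshold keeps only the two rules linking video watching and passing. Choosing such thresholds is part of the expertise the paragraph refers to.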
2.3 Machine Learning Applied to Learning Analytics and Educational Data Mining
Machine Learning (ML) can be considered a component of artificial intelligence (AI). It is
essentially the process through which a machine or model is given access to data and is
able to learn on its own. In 1959, Arthur Samuel proposed that we should not have to teach
computers; instead, we should allow them to learn by themselves. To explain his theory, he
came up with the term "Machine Learning", which is now the standard term for computers'
ability to learn on their own [70]. Machine learning refers to programming computers to
improve a performance criterion using example data or experience [71]. Implementing a
machine learning algorithm means implementing a model that outputs appropriate
information for the input data it is given [72]. Another extensively used method in the
field of data mining is the Artificial Neural Network (ANN) [73,74]. This method has been
used in clustering [59,75], regression [47,76], classification [47,76–80], time series
forecasting [47,79] and visualization. Although ANNs have been used quite extensively in
data mining, a section of the data mining community criticizes the fact that the models
formulated using this paradigm cannot be interpreted [81]. These models do offer accurate
predictions from the data; however, it is not easy to obtain human-interpretable rules that
encapsulate their predictions [74]. At present, researchers and scientists are intrigued by
the use of machine learning in the field of education. A few of the areas of interest
include the following:
2.3.1 Predict Student Performance
Predicting student performance is an important use of machine learning. The machine
learning model is used for "learning" about every student, which enables it to identify
their shortcomings and determine ways through which they can improve, for example
by attending more lectures or reviewing further literature [47,54,78,80,82–84]. Using
these models, many kinds of knowledge can be discovered, such as clusters, association
rules and classifications [85]. The discovered knowledge can be used for predicting the
enrolment of students in a particular course [86], the alienation of the traditional
classroom teaching model [87], detecting unfair means used in online examinations,
detecting abnormal values in the result sheets of students, predicting students'
performance and so on [85].
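As an illustration of classification-based performance prediction, the sketch below fits a decision stump (a one-level decision tree, one of the approaches listed for [41] in Table 1) on a training split and evaluates it on held-out students. The features, labels and split are hypothetical and chosen only to keep the example small.

```python
def fit_stump(rows, labels):
    """Pick the (feature, threshold) whose rule 'pass if value >= threshold'
    makes the fewest mistakes on the training data."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            errors = sum(
                (row[feature] >= threshold) != bool(label)
                for row, label in zip(rows, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, feature, threshold)
    return best[1], best[2]

def predict(stump, row):
    feature, threshold = stump
    return int(row[feature] >= threshold)

# Hypothetical records: (attendance rate, assignments submitted); label 1 = passed
train_rows = [(0.95, 9), (0.90, 8), (0.85, 4), (0.60, 7), (0.55, 3), (0.40, 2)]
train_labels = [1, 1, 1, 0, 0, 0]
test_rows = [(0.88, 5), (0.50, 8), (0.92, 9), (0.45, 1)]
test_labels = [1, 0, 1, 0]

stump = fit_stump(train_rows, train_labels)
correct = sum(predict(stump, r) == y for r, y in zip(test_rows, test_labels))
accuracy = correct / len(test_rows)
print(stump, accuracy)
```

The learned rule is directly readable (here, a threshold on attendance), which is the kind of human-interpretable output that neural network models do not readily provide.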
2.3.2 Use Unbiased Methods for Testing and Grading Students
Machine learning makes it possible to generate computerized adaptive assessments
[88,89]. Machine learning-based assessment gives teachers and students consistent
feedback on the way the student learns, the support that they require and the progress
they are making toward their learning objectives [72,76,90].
2.3.3 Enhance Retention
Machine learning, for example in the form of learning analytics, also helps enhance
retention rates. Once students "at risk" are identified, schools and universities can
approach them and assist them in achieving success [91]. Effective prediction of student
retention requires models that include all personal, social, psychological and other
environmental variables. Predicting retention with high accuracy is beneficial for
identifying students with low academic achievement early, so that teachers can give the
identified students more assistance and their performance improves in future [86].
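One simple way to flag "at risk" students is a logistic regression that combines several variables into a single risk score. The following is a minimal sketch trained by stochastic gradient descent; the two features, the labels, the learning rate and the 0.5 cut-off are assumptions made for illustration, not values from the cited studies.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            error = p - y  # gradient of the log loss with respect to the logit
            bias -= lr * error
            weights = [w - lr * error * v for w, v in zip(weights, x)]
    return weights, bias

def dropout_risk(model, row):
    """Estimated probability that a student with these features leaves."""
    weights, bias = model
    return sigmoid(bias + sum(w * v for w, v in zip(weights, row)))

# Hypothetical records: (GPA scaled to 0-1, attendance rate); label 1 = left the programme
rows = [(0.2, 0.3), (0.3, 0.4), (0.25, 0.5), (0.8, 0.9), (0.7, 0.85), (0.9, 0.7)]
left = [1, 1, 1, 0, 0, 0]
model = train_logistic(rows, left)

new_students = {"A": (0.1, 0.2), "B": (0.95, 0.95)}
at_risk = [name for name, row in new_students.items() if dropout_risk(model, row) > 0.5]
print(at_risk)
```

In practice the feature set would be much broader, in line with the personal, social, psychological and environmental variables mentioned above, and the cut-off would be tuned to how many students an institution can realistically support.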
2.3.4 Provide Support to Teachers and Institution Staff
Algorithms based on machine learning make it easier to classify the written assessment
tests of students [92]. Intelligent analysis of assessment data helps achieve a better
understanding of student performance, the quality of the test and individual questions
[93].
3 Conclusion and Future Work
This paper evaluates the state of the art in EDM and surveys the most pertinent studies
in this field to date. After gathering and reviewing the published literature in the area
of EDM, we chose the most significant studies of every author. Every study was then
categorized, not only on the basis of the kind of data and DM techniques employed, but
also on the basis of the kind of educational activity it tackles. EDM is considered an
emerging research domain that is relevant to various well-established research domains,
such as classical data mining techniques, association, classification, regression,
clustering and prediction.
In this study, new methods are examined and the latest studies in the field of Learning
Analytics and Educational Data Mining that have used Deep Learning techniques are
identified. EDM is expected to alter the overall field of education in the near future. As
future scope, the areas that require improvement include obtaining bigger datasets,
including adaptable datasets, hybridizing the techniques used, improving the credibility
of EDM and performing comparisons between the various methods. The findings of studies
performed along these lines are expected to enhance EDM research, and other DM fields
will benefit from the latest algorithms created. Such work will also increase the
confidence of EDM authorities to incorporate the recommendations put forward into
real-world systems. The solutions put forward will influence these systems in such a way
that students, teachers and the administration are able to achieve the ideal outcomes
from them.
References
1. Saa, A.A., Al-Emran, M., Shaalan, K.: Factors affecting students' performance in higher
education: a systematic review of predictive data mining techniques. Technol. Knowl. Learn.
24, 567–598 (2019)
2. Salloum, S.A., AlHamad, A.Q., Al-Emran, M., Shaalan, K.: A survey of Arabic text mining,
vol. 740 (2018)
3. Mhamdi, C., Al-Emran, M., Salloum, S.A.: Text mining and analytics: a case study from
news channels posts on Facebook, vol. 740 (2018)
4. Hassanien, A.E., Darwish, A., El-Askary, H.: Machine Learning and Data Mining in
Aerospace Technology. Springer, Cham (2020)
5. Hassanien, A.E.: Machine Learning Paradigms: Theory and Application. Springer, Cham
(2019)
6. Ismail, F.H., Hassanien, A.E.: Extracting valuable associations among textural features of
medical images. In: 2018 13th International Conference on Computer Engineering and
Systems (ICCES), pp. 605–608 (2018)
7. Ahuja, R., Jha, A., Maurya, R., Srivastava, R.: Analysis of educational data mining. In:
Harmony Search and Nature Inspired Optimization Algorithms, pp. 897–907. Springer
(2019)
8. Sarra, A., Fontanella, L., Di Zio, S.: Identifying students at risk of academic failure within
the educational data mining framework. Soc. Indic. Res. 146(1–2), 41–60 (2019)
9. Mohamad, S.K., Tasir, Z.: Educational data mining: a review. Procedia-Soc. Behav. Sci. 97,
320–324 (2013)
10. Baker, R.S.J.D., Yacef, K.: The state of educational data mining in 2009: a review and future
visions. JEDM: J. Educ. Data Min. 1(1), 3–17 (2009)
11. Romero, C., Ventura, S.: Educational data mining: a survey from 1995 to 2005. Expert Syst.
Appl. 33(1), 135–146 (2007)
12. Salloum, S.A., Alhamad, A.Q.M., Al-Emran, M., Monem, A.A., Shaalan, K.: Exploring
students' acceptance of E-learning through the development of a comprehensive technology
acceptance model. IEEE Access 7, 128445–128462 (2019)
13. Alshurideh, M., Salloum, S.A., Al Kurdi, B., Al-Emran, M.: Factors affecting the social
networks acceptance: an empirical study using PLS-SEM approach. In: 8th International
Conference on Software and Computer Applications (2019)
14. Alshurideh, M.T., Salloum, S.A., Al Kurdi, B., Monem, A.A., Shaalan, K.: Understanding
the quality determinants that influence the intention to use the mobile learning platforms: a
practical study. Int. J. Interact. Mob. Technol. 13(11), 157–183 (2019)
15. Mitrofanova, Y.S., Sherstobitova, A.A., Filippova, O.A.: Modeling smart learning processes
based on educational data mining tools. In: Smart Education and e-Learning 2019,
pp. 561–571. Springer (2019)
16. Menaka, M.S., Kesavaraj, G.: A study on e-learning system to analyse student performance
using data mining (2019)
17. Cerezo, R., Bogarín, A., Esteban, M., Romero, C.: Process mining for self-regulated learning
assessment in e-learning. J. Comput. High. Educ. 32, 74–88 (2020)
18. Keskin, S., Şahin, M., Yurdugül, H.: Online learners' navigational patterns based on data
mining in terms of learning achievement. In: Learning Technologies for Transforming
Large-Scale Teaching, Learning, and Assessment, pp. 105–121. Springer (2019)
19. Fernandes, E., Holanda, M., Victorino, M., Borges, V., Carvalho, R., Van Erven, G.:
Educational data mining: predictive analysis of academic performance of public school
students in the capital of Brazil. J. Bus. Res. (2018)
20. Salloum, S.A., Al-Emran, M., Monem, A.A., Shaalan, K.: Using text mining techniques for
extracting information from research articles. In: Studies in Computational Intelligence, vol.
740. Springer (2018)
21. Salloum, S.A., Al-Emran, M., Abdallah, S., Shaalan, K.: Analyzing the Arab gulf
newspapers using text mining techniques. In: International Conference on Advanced
Intelligent Systems and Informatics, pp. 396–405 (2017)
22. Salloum, S.A., Al-Emran, M., Shaalan, K.: Mining social media text: extracting knowledge
from Facebook. Int. J. Comput. Digit. Syst. 6(2), 73–81 (2017)
23. Salloum, S.A., Mhamdi, C., Al-Emran, M., Shaalan, K.: Analysis and classification of
Arabic newspapers' Facebook pages using text mining techniques. Int. J. Inf. Technol. Lang.
Stud. 1(2), 8–17 (2017)
24. Cummins, M.R.: Nonhypothesis-driven research: data mining and knowledge discovery. In:
Clinical Research Informatics, pp. 341–356. Springer (2019)
25. Salloum, S.A., Al-Emran, M., Monem, A.A., Shaalan, K.: A survey of text mining in social
media: Facebook and Twitter perspectives. Adv. Sci. Technol. Eng. Syst. J. 2(1), 127–133
(2017)
26. Alomari, K.M., AlHamad, A.Q., Salloum, S.: Prediction of the digital game rating systems
based on the ESRB (2019)
27. Arunachalam, A.S., Velmurugan, T.: Analyzing student performance using evolutionary
artificial neural network algorithm. Int. J. Eng. Technol. 7(2.26), 67–73 (2018)
28. Romero, C., Ventura, S., García, E.: Data mining in course management systems: Moodle
case study and tutorial. Comput. Educ. 51(1), 368–384 (2008)
29. Sachin, R.B., Vijay, M.S.: A survey and future vision of data mining in educational field. In:
2012 Second International Conference on Advanced Computing & Communication
Technologies, pp. 96–100 (2012)
30. Salloum, S.A., Shaalan, K.: Factors affecting students' acceptance of e-learning system in
higher education using UTAUT and structural equation modeling approaches. In:
International Conference on Advanced Intelligent Systems and Informatics, pp. 469–480
(2018)
31. Salloum, S.A., Al-Emran, M., Habes, M., Alghizzawi, M., Ghani, M.A., Shaalan, K.:
Understanding the impact of social media practices on e-learning systems acceptance. In:
International Conference on Advanced Intelligent Systems and Informatics, pp. 360–369
(2019)
32. Salloum, S.A., Mhamdi, C., Al Kurdi, B., Shaalan, K.: Factors affecting the adoption and
meaningful use of social media: a structural equation modeling approach. Int. J. Inf. Technol.
Lang. Stud. 2(3), 96–109 (2018)
33. Salloum, S.A., Maqableh, W., Mhamdi, C., Al Kurdi, B., Shaalan, K.: Studying the social
media adoption by university students in the United Arab Emirates. Int. J. Inf. Technol.
Lang. Stud. 2(3), 83–95 (2018)
34. Salloum, S.A.S., Shaalan, K.: Investigating students' acceptance of e-learning system in
higher educational environments in the UAE: applying the extended technology acceptance
model (TAM). The British University in Dubai (2018)
35. Habes, M., Alghizzawi, M., Khalaf, R., Salloum, S.A., Ghani, M.A.: The relationship
between social media and academic performance: Facebook perspective. Int. J. Inf. Technol.
Lang. Stud. 2(1), 12–18 (2018)
36. Salloum, S.A., Al-Emran, M., Shaalan, K., Tarhini, A.: Factors affecting the E-learning
acceptance: a case study from UAE. Educ. Inf. Technol. 24, 509–530 (2019)
37. Al-Emran, M., Salloum, S.A.: Students' attitudes towards the use of mobile technologies in
e-evaluation. Int. J. Interact. Mob. Technol. 11(5), 195–202 (2017)
38. Kabakchieva, D.: Predicting student performance by using data mining methods for
classification. Cybern. Inf. Technol. 13(1), 61–72 (2013)
39. Durairaj, M., Vijitha, C.: Educational data mining for prediction of student performance
using clustering algorithms. Int. J. Comput. Sci. Inf. Technol. 5(4), 5987–5991 (2014)
40. Francis, B.K., Babu, S.S.: Predicting academic performance of students using a hybrid data
mining approach. J. Med. Syst. 43(6), 162 (2019)
41. Akram, A., et al.: Predicting students' academic procrastination in blended learning course
using homework submission data. IEEE Access 7, 102487–102498 (2019)
42. Rojanavasu, P.: Educational data analytics using association rule mining and classification.
In: 2019 Joint International Conference on Digital Arts, Media and Technology with ECTI
Northern Section Conference on Electrical, Electronics, Computer and Telecommunications
Engineering (ECTI DAMT-NCON), pp. 142–145 (2019)
43. Sana, B., Siddiqui, I.F., Arain, Q.A.: Analyzing students' academic performance through
educational data mining. 3c Tecnol. glosas innovación Apl. a la pyme 8(29), 402–421 (2019)
44. Bharara, S., Sabitha, S., Bansal, A.: Application of learning analytics using clustering data
mining for students' disposition analysis. Educ. Inf. Technol. 23(2), 957–984 (2018)
45. Nurhayati, O.D., Bachri, O.S., Supriyanto, A., Hasbullah, M.: Graduation prediction system
using artificial neural network. Int. J. Mech. Eng. Technol. 9(7), 1051–1057 (2018)
46. Rao, K.S., Swapna, N., Kumar, P.P.: Educational data mining for student placement
prediction using machine learning algorithms. Int. J. Eng. Technol. Sci. 7(1.2), 43–46 (2018)
47. Okubo, F., Yamashita, T., Shimada, A., Ogata, H.: A neural network approach for students'
performance prediction. In: LAK 2017, pp. 598–599 (2017)
48. Almarabeh, H.: Analysis of students' performance by using different data mining classifiers.
Int. J. Mod. Educ. Comput. Sci. 9(8), 9 (2017)
49. Alban, M., Mauricio, D.: Neural networks to predict dropout at the universities. Int. J. Mach.
Learn. Comput. 9(2), 149–153 (2019)
50. Feng, J.: Predicting students' academic performance with decision tree and neural network
(2019)
51. Jie, W., Hai-yan, L., Biao, C., Yuan, Z.: Application of educational data mining on analysis
of students' online learning behavior. In: 2017 2nd International Conference on Image,
Vision and Computing (ICIVC), pp. 1011–1015 (2017)
52. Lara, J.A., Lizcano, D., Martínez, M.A., Pazos, J., Riera, T.: A system for knowledge
discovery in e-learning environments within the European Higher Education Area -
Application to student data from Open University of Madrid, UDIMA. Comput. Educ.
72, 23–36 (2014)
53. Chakraborty, B., Chakma, K., Mukherjee, A.: A density-based clustering algorithm and
experiments on student dataset with noises using Rough set theory. In: 2016 IEEE
International Conference on Engineering and Technology (ICETECH), pp. 431–436 (2016)
54. Chauhan, N., Shah, K., Karn, D., Dalal, J.: Prediction of students performance using
machine learning (2019). SSRN 3370802
55. Pechenizkiy, M., Calders, T., Vasilyeva, E., De Bra, P.: Mining the student assessment data:
lessons drawn from a small scale case study. In: Educational Data Mining 2008 (2008)
56. Shih, Y.-C., Huang, P.-R., Hsu, Y.-C., Chen, S.Y.: A complete understanding of
disorientation problems in Web-based learning. Turkish Online J. Educ. Technol. 11(3),
1–13 (2012)
57. Talavera, L., Gaudioso, E.: Mining student data to characterize similar behavior groups in
unstructured collaboration spaces. In: Workshop on Artificial Intelligence in CSCL. 16th
European Conference on Artificial Intelligence, pp. 17–23 (2004)
58. Perera, D., Kay, J., Koprinska, I., Yacef, K., Zaïane, O.R.: Clustering and sequential pattern
mining of online collaborative learning data. IEEE Trans. Knowl. Data Eng. 21(6), 759–772
(2008)
59. Dutt, A., Aghabozrgi, S., Ismail, M.A.B., Mahroeian, H.: Clustering algorithms applied in
educational data mining. Int. J. Inf. Electron. Eng. 5(2), 112 (2015)
60. Bogarín, A., Romero, C., Cerezo, R., Sánchez-Santillán, M.: Clustering for improving
educational process mining. In: Proceedings of the Fourth International Conference on
Learning Analytics and Knowledge, pp. 11–15 (2014)
61. Fernandes, E., Holanda, M., Victorino, M., Borges, V., Carvalho, R., Van Erven, G.:
Educational data mining: predictive analysis of academic performance of public school
students in the capital of Brazil. J. Bus. Res. 94, 335–343 (2019)
62. Palomo-Duarte, M., Berns, A., Yañez Escolano, A., Dodero, J.-M.: Clustering analysis of
game-based learning: worth it for all students? J. Gaming Virtual Worlds 11(1), 45–66
(2019)
63. Ahmed, A.B.E.D., Elaraby, I.S.: Data mining: a prediction for students performance using
classification method. World J. Comput. Appl. Technol. 2(2), 43–47 (2014)
64. Anjewierden, A., Kolloffel, B., Hulshof, C.: Towards educational data mining: using data
mining methods for automated chat analysis to understand and support inquiry learning
processes (2007)
65. Adebayo, A.O., Chaubey, M.S.: Data mining classification techniques on the analysis of
students performance. GSJ 7(4), 45–52 (2019)
66. Kay, J., Maisonneuve, N., Yacef, K., Zaïane, O.: Mining patterns of events in students'
teamwork data. In: Proceedings of the Workshop on Educational Data Mining at the 8th
International Conference on Intelligent Tutoring Systems (ITS 2006), pp. 45–52 (2006)
67. Tiwari, A.K., Ramakrishna, G., Sharma, L.K., Kashyap, S.K.: Academic performance
prediction algorithm based on fuzzy data mining. Int. J. Artif. Intell. 8(1), 26–32 (2019)
68. Merceron, A., Yacef, K.: Revisiting interestingness of strong symmetric association rules in
educational data. In: Proceedings of the International Workshop on Applying Data Mining in
e-Learning, Crete, Greece, pp. 3–12 (2007)
69. García, E., Romero, C., Ventura, S., Calders, T.: Drawbacks and solutions of applying
association rule mining in learning management systems. In: Proceedings of the International
Workshop on Applying Data Mining in e-Learning (ADML 2007), Crete, Greece, pp. 13–22
(2007)
70. Samuel, A.L.: Some studies in machine learning using the game of checkers. II - recent
progress. IBM J. Res. Dev. 11(6), 601–617 (1967)
71. Alpaydin, E.: Introduction to Machine Learning. MIT Press, Cambridge (2009)
72. Kučak, D., Juričić, V., Đambić, G.: Machine learning in education - a survey of current
research trends. In: Annals of DAAAM and Proceedings, vol. 29 (2018)
73. Stahl, F., Jordanov, I.: An overview of the use of neural networks for data mining tasks.
Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2(3), 193–208 (2012)
74. Coelho, O.B., Silveira, I.: Deep learning applied to learning analytics and educational data
mining: a systematic literature review. In: Brazilian Symposium on Computers in Education
(Simpósio Brasileiro de Informática na Educação-SBIE), vol. 28, no. 1, p. 143 (2017)
75. Vellido, A., Castro, F., Nebot, A.: Clustering educational data. In: Handbook of Educational
Data Mining, pp. 75–92 (2010)
76. Li, J., Wong, Y., Kankanhalli, M.S.: Multi-stream deep learning framework for automated
presentation assessment. In: 2016 IEEE International Symposium on Multimedia (ISM),
pp. 222–225 (2016)
77. Gross, E., Wshah, S., Simmons, I., Skinner, G.: A handwriting recognition system for the
classroom. In: Proceedings of the Fifth International Conference on Learning Analytics and
Knowledge, pp. 218–222 (2015)
78. Guo, B., Zhang, R., Xu, G., Shi, C., Yang, L.: Predicting students performance in
educational data mining. In: 2015 International Symposium on Educational Technology
(ISET), pp. 125–128 (2015)
79. Tang, S., Peterson, J.C., Pardos, Z.A.: Deep neural networks and how they apply to
sequential education data. In: Proceedings of the Third (2016) ACM Conference on Learning
@ Scale, pp. 321–324 (2016)
80. Wang, L., Sy, A., Liu, L., Piech, C.: Deep knowledge tracing on programming exercises. In:
Proceedings of the Fourth (2017) ACM Conference on Learning @ Scale, pp. 201–204
(2017)
81. Craven, M.W., Shavlik, J.W.: Using neural networks for data mining. Futur. Gener. Comput.
Syst. 13(2–3), 211–229 (1997)
82. Anozie, N., Junker, B.W.: Predicting end-of-year accountability assessment scores from
monthly student records in an online tutoring system (2006)
83. Khan, I., Al Sadiri, A., Ahmad, A.R., Jabeur, N.: Tracking student performance in
introductory programming by means of machine learning. In: 2019 4th MEC International
Conference on Big Data and Smart City (ICBDSC), pp. 1–6 (2019)
84. Livieris, I.E., Drakopoulou, K., Tampakas, V.T., Mikropoulos, T.A., Pintelas, P.: Predicting
secondary school students' performance utilizing a semi-supervised learning approach.
J. Educ. Comput. Res. 57(2), 448–470 (2019)
85. Yadav, S.K., Pal, S.: Data mining: a prediction for performance improvement of engineering
students using classification, arXiv Prepr. arXiv:1203.3832 (2012)
86. Yadav, S.K., Bharadwaj, B., Pal, S.: Mining education data to predict students retention: a
comparative study, arXiv Prepr. arXiv:1203.2987 (2012)
87. Akinola, O.S., Akinkunmi, B.O., Alo, T.S.: A data mining model for predicting computer
programming proficiency of computer science undergraduate students (2012)
88. Luckin, R., Holmes, W., Griffiths, M., Forcier, L.B.: Intelligence unleashed: an argument for
AI in education (2016)
89. Meseguer-Brocal, G., Cohen-Hadria, A., Peeters, G.: DALI: a large dataset of synchronized
audio, lyrics and notes, automatically created using teacher-student machine learning
paradigm, arXiv Prepr. arXiv:1906.10606 (2019)
90. El-Alfy, E.-S.M., Abdel-Aal, R.E.: Construction and analysis of educational tests using
abductive machine learning. Comput. Educ. 51(1), 1–16 (2008)
Mining in Educational Data: Review and Future Directions 101
91. Đambić, G., Krajcar, M., Bele, D.: Machine learning model for early detection of higher
education students that need additional attention in introductory programming courses. Int.
J. Digit. Technol. Econ. 1(1), 111 (2016)
92. Celar, S., Stojkic, Z., Seremet, Z., Marusic, Z., Zelenika, D.: Classication of test documents
based on handwritten student IDs characteristics. Procedia Eng. 100, 782790 (2015)
93. Pechenizkiy, M., Trcka, N., Vasilyeva, E., Van der Aalst, W., De Bra, P.: Process mining
online assessment data. In: International Working Group on Educational Data Mining (2009)
102 S. A. Salloum et al.
... There are also several review studies that indirectly focus on limited aspects of ML for educational data in a given timeline. Alonso-Fernández et al. (2019) have investigated game learning analytics using a literature review; Bachhal et al. (2021) have discussed the most important studies conducted until 2021 in educational data mining in general; Yunita et al. (2021) have reviewed the relevant literature on big data in education; Khan and Ghosh (2021) have examined the educational data mining publications from the perspective of student performance analysis and prediction in classroom learning; Salloum et al. (2020) have analysed the literature to find out how data mining was handled by researchers in the past and the most recent trends on data mining in educational research between 2016 and 2019; Albreiki et al. (2021) have reviewed the literature on students' performance prediction using ML techniques, where they focused on identifying student dropouts and students at risk in literature between 2009 and 2021; Du et al. (2020) have examined 33 publications between 2007 and the first quarter of 2019 to analyse educational data mining research trends, where they analysed research topics, methods and sample; Khalaf et al. (2021) have analysed the literature on using only supervised ML in the period of 2010-2020; Peña-Ayala (2014) has reviewed the literature on educational data mining between 2010 and the first quarter of 2013. ...
... In terms of the fragmentation, the majority of review studies in this area adopt a temporal scope, focusing on specific timeframes. For instance, Alonso-Fernández et al. (2019) explored game learning analytics, Bachhal et al. (2021) covered studies up to 2021, and Salloum et al. (2020) analyzed trends between 2016 and 2019. These fragmented timelines create a gap in understanding the evolution and continuity of ML applications in educational data over an extended period. ...
Article
Integrating machine learning (ML) methods in educational research has the potential to greatly impact upon research, teaching, learning and assessment by enabling personalised learning, adaptive assessment and providing insights into student performance, progress and learning patterns. To reveal more about this notion, we investigated ML approaches used for educational data analysis in the last decade and provided recommendations for further research. Using a systematic literature review (SLR), we examined 77 publications from two large and high-impact databases for educational research using bibliometric mapping and evaluative review analysis. Our results suggest that the top five most frequently used keywords were similar in both databases. The majority of the publications (88%) utilised supervised ML approaches for predicting students’ performances and finding learning patterns. These methods include decision trees, support vector machines, random forests, and logistic regression. Semi-supervised learning methods were less frequently used, but also demonstrated promising results in predicting students’ performance. Finally, we discuss the implications of these results for statisticians, researchers, and policymakers in education.
... In order to forecast student conduct, authors use deep learning, which undoubtedly involves several levels of representation [8]. The hidden information inside the data itself can be analyzed and found through data mining, a task that is far more time- and labor-intensive when done manually [9]. Educational institutions' main goal is to give students a high-quality education that will improve their academic performance. ...
... Over the years, various ML approaches have been used in this discipline, but it is only recently that deep learning has come to the attention of the educational community. The hidden information inside the data itself can be analyzed and found through data mining, which is much more time- and labor-intensive when done manually [9]. Data mining is having a tremendous impact on education, which is an important factor in social upliftment [14]. ...
Article
In educational institutions, it is now more important than ever to deliver high-quality academic instruction, and educational data mining is essential for resolving problems that arise from challenging unstructured data in this field. Using machine learning (ML) approaches, the performance of students and traits related to academia, a crucial indicator of higher education, is examined. In the proposed study, the educational dataset is subjected to feature ranking algorithms, including MRMR, ReliefF, Chi-Square, ANOVA, and Kruskal–Wallis, followed by important feature selection using Shapley. The dataset has 16 attributes of integer and categorical types, and after the feature ranking approaches are applied, the features with the most important information are chosen and ML techniques are applied to them. The work is completed in two phases. In the first phase, all features are taken into account for ML training. The second phase of ML training takes into account only the selected features derived using the ranking approaches. ML models with only the selected attributes are compared to models with all features in order to determine which is more precise. In this comparison, the ML models with selected attributes outperformed the models with all attributes. Overall, the ensemble methods, i.e., bagged trees and AdaBoost, outperformed the other ML techniques presented in the proposed study, such as decision trees, neural networks, naive Bayes, K-nearest neighbor, and support vector machines. Bagged trees achieved an accuracy of 81.0 percent, while AdaBoost achieved an accuracy of 74.2 percent.
... Classification, a fundamental task within data mining, entails predicting the category to which a given dataset belongs [8]. While data mining-driven classification [9] has permeated various domains, encompassing manufacturing [10], agriculture [11], economics [12], education [13], and healthcare [14], the adaptation of these techniques to journal quartile classification remains an underexplored frontier. The landscape of classification models presents a rich tapestry, including K-Nearest Neighbors (KNN) [15], Support Vector Machines (SVM) [16], Naïve Bayes [17], Multi-Layer Perceptron (MLP) [18], and Random Forest [19]. ...
Article
Journals play a pivotal role in disseminating scientific knowledge, housing a multitude of valuable research articles. In this digital age, the evaluation of journals and their quality is essential. The SCImago Journal Rank (SJR) stands as one of the prominent platforms for ranking journals, categorizing them into five index classes: Q1, Q2, Q3, Q4, and NQ. Determining these index classes often relies on classification methodologies. This research, drawing inspiration from the Cross-Industry Standard Process for Data Mining (CRISP-DM), seeks to employ the Random Forest method to classify journals, thus contributing to the refinement of journal ranking processes. Random Forest stands out as a robust choice due to its remarkable ability to mitigate overfitting, a common challenge in machine learning classification tasks. In the context of approximating SJR index classes, Random Forest, when utilizing the Gini index, exhibits promise, albeit with an initial accuracy rate of 62.12%. The Gini index, an impurity measure, enables Random Forest to make informed decisions while classifying journals into their respective SJR index classes. However, it is worth noting that this accuracy rate represents a starting point, and further refinement and feature engineering may enhance the model's performance. This research underscores the significance of machine learning techniques in the domain of journal classification and journal-ranking systems. By harnessing the power of Random Forest, this study aims to facilitate more accurate and efficient categorization of journals, thereby aiding researchers, academics, and institutions in identifying and accessing high-quality scientific literature.
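The Gini index mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the quartile labels below are hypothetical, and the code only shows the standard impurity formula (one minus the sum of squared class proportions) that tree-based methods such as Random Forest commonly use to score candidate splits.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_gini(left, right):
    """Size-weighted Gini impurity of a candidate binary split (non-empty overall)."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)

# Hypothetical quartile labels, for illustration only.
parent = ["Q1", "Q1", "Q2", "Q2"]
print(gini_impurity(parent))                   # impurity before splitting
print(split_gini(["Q1", "Q1"], ["Q2", "Q2"]))  # impurity after a perfect split
```

A split is preferred when its weighted impurity is lower than that of the parent node; a perfectly separating split drives the value to zero.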
... The generic state-of-the-art overview of traditional and recently proposed clustering methods and their application domains can be found in Ezugwu et al. (2022). Educational Data Mining (EDM) is an emerging discipline that exploits statistical, machine learning, and data mining algorithms over the different types of educational data (Romero & Ventura, 2010; Salloum et al., 2020). Dutt et al. (2017) provide a comprehensive overview of EDM techniques. ...
Article
In cyber security education, hands-on training is a common type of exercise to help raise awareness and competence, and improve students’ cybersecurity skills. To be able to measure the impact of the design of the particular courses, the designers need methods that can reveal hidden patterns in trainee behavior. However, the support of the designers in performing such analytic and evaluation tasks is ad-hoc and insufficient. With unsupervised machine learning methods, we designed a tool for clustering the trainee actions that can exhibit their strategies or help pinpoint flaws in the training design. By using a k-means++ algorithm, we explore clusters of trainees that unveil their specific behavior within the training sessions. The final visualization tool consists of views with scatter plots and radar charts. The former provides a two-dimensional correlation of selected trainee actions and displays their clusters. In contrast, the radar chart displays distinct clusters of trainees based on their more specific strategies or approaches when solving tasks. Through iterative training redesign, the tool can help designers identify improper training parameters and improve the quality of the courses accordingly. To evaluate the tool, we performed a qualitative evaluation of its outcomes with cybersecurity experts. The results confirm the usability of the selected methods in discovering significant trainee behavior. Our insights and recommendations can be beneficial for the design of tools for educators, even beyond cyber security.
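The abstract names k-means++ without describing it, so a short sketch of its seeding step may help. This is an illustration under stated assumptions, not the tool described above: the function names and the trainee feature tuples are hypothetical. The first center is drawn uniformly, and each later center is drawn with probability proportional to its squared distance from the nearest center already chosen.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeanspp_seeds(points, k, rng):
    """Pick k initial centers using k-means++ seeding.

    Assumes `points` contains at least k distinct points.
    """
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Weight each point by its squared distance to the closest chosen center.
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

# Hypothetical per-trainee features: (commands typed, hints taken).
trainees = [(5, 0), (6, 1), (40, 7), (42, 8), (20, 3)]
seeds = kmeanspp_seeds(trainees, k=2, rng=random.Random(0))
print(seeds)
```

Because already-chosen points have zero weight, the seeds tend to be spread out, which usually gives ordinary k-means a better starting position than purely random initialization.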
... This was especially true in the case of performance prediction and identification of students at risk. In another review of ten specific studies, Salloum et al. (2020) assert that machine learning techniques are best suited for their predictive capability regarding student success. ...
Article
The study aims to compare the performance of various machine learning models for student persistence prediction. The research starts with a historical review of student retention studies and the evolution of predictive models in the field. It highlights the importance of predicting student persistence for educational institutions and individuals. It then describes a dataset from ResearchGate, consisting of anonymized undergraduate student data collected between 2008 and 2018, with 37 features and 4,424 records. Ten machine learning algorithms are considered, with two popular machine learning algorithms, Logistic Regression, and Random Forest classification, being compared in more detail for their performance in predicting student persistence. Evaluation metrics such as prediction accuracy, precision, recall, and F1-score are used. Results show that the Random Forest model outperforms Logistic Regression in predicting student outcomes, particularly when using the synthetic minority oversampling technique (SMOTE) to address the class imbalance. Overall, this study contributes to student retention research and provides insights for developing targeted support measures to enhance student success in higher education.
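The evaluation metrics listed in this abstract can be computed as in the sketch below. The function name and the toy labels are hypothetical and are not taken from the study's dataset; the example only illustrates why accuracy alone can mislead when one class is rare, which is the situation that oversampling techniques such as SMOTE are meant to address.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical imbalanced labels: 1 = student dropped out, 0 = student persisted.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_persist = [0] * 10  # a model that never predicts dropout
print(binary_metrics(y_true, always_persist))
```

The always-negative model still scores high accuracy on these labels while its recall for the dropout class is zero, which is why precision, recall and F1 are reported alongside accuracy.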
Chapter
Data analytics provide an important contribution to improving educational processes, as well as managerial and organisational processes in higher education. They are also significant in transforming behavioural, performance, and interaction data on digital learning platforms into pivotal information about the learning process. This chapter presents analysis results from instructional and behavioural formative assessment data for an online common course stored on a Turkish state university’s learning management system using cluster analysis. This chapter focuses on evaluating activity and assessment data to discover student groups as an important input of assessment analytics. Developing supervised models requires a priori labels, such as individual student success or fail scores, and a multidimensional dataset identifying relevant performance information gathered during the learning process. In a more realistic setting, the labels are not fully known until the semester end and supervised learning models may then create learning bias when trained only using assessment data. Therefore, cluster analysis presents a promising approach through observing performance similarities between students not yet evaluated for a course or subjected to end-of-term assessment. Although feedback mechanisms were not studied in this chapter, this approach can further help achieve necessary steps to increase student success before a course ends. This study grouped students based on learning performance data using cluster analysis, with models developed and evaluated according to various performance indicators. The main concepts related to the function of the methods in assessment analytics are discussed, and descriptive results obtained through the model are shared and evaluated for a specific case.
Chapter
This research proposes a model to empirically examine the impact of big data security on digital operations with the mediating role of supply chain risk in the UAE transportation and shipment industry. This is the first study linking perspectives on big data security and digital operations with a mediating effect of supply chain risk. This research does not claim to be comprehensive; instead, it analyses empirical research and literature to further the conversation by using a conceptual framework for examining the UAE transportation and shipment industry. An online survey was administered to 226 employees of transportation and shipment companies in Dubai, UAE. Data analysis was performed by regression and ANOVA using SPSS. Supply chain risk was found to be a positively significant mediating link, and there was also a direct relationship between big data security and digital operations. To support long- and short-term strategic decision-making, it is necessary to examine risk management procedures as supply chain and transportation networks develop in a dynamic environment. Organizational risk exposure must be rigorously analyzed using objective, transparent criteria. The costs and advantages of different risk mitigation strategies must be considered with digital criteria.
Chapter
Recently, smart cities are developing more slowly, gathering plenty of data and communication skills to improve service worth. Despite the smart city concept offering many beneficial services, security management is still a significant problem because of shared threats and activities. The security aspects of smart cities should be constantly assessed to remove the unnecessary events and to improve the quality of the facilities. This study shows how robots are used in the smart city to manage privacy-related problems and actively learn how to forecast the quality of facilities. Today, smart city development depends heavily on advancing technologies like the Internet of Things (IoT), Artificial Intelligence (AI), Blockchain, and Geospatial Technology. Machine learning, a branch of artificial intelligence, excels in security management systems. The proposed model may overcome the security challenges and presents how to keep and obtain the necessary robot-based security solutions by providing and maintaining security services.
Article
The DALI dataset is a large dataset of time-aligned symbolic vocal melody notations (notes) and lyrics at four levels of granularity. DALI contains 5358 songs in its first version and 7756 in the second one. In this article, we present the dataset, explain the tools developed to work with the data and detail the approach used to build it. Our method is motivated by active learning and the teacher-student paradigm. We establish a loop whereby dataset creation and model learning interact, benefiting each other. We progressively improve our model using the collected data. At the same time, we correct and enhance the collected data every time we update the model. This process creates an improved DALI dataset after each iteration. Finally, we outline the errors still present in the dataset and propose solutions to global issues. We believe that DALI can encourage other researchers to explore the interaction between model learning and dataset creation, rather than regarding them as independent tasks.
Article
There is a widespread use of Internet technology in the present times, because of which universities are making investments in Mobile learning to augment their position in the face of extensive competition and also to enhance their students' learning experience and efficiency. Nonetheless, Mobile Learning Platforms are only going to be successful when students show acceptance and adoption of this technology. Our literature review indicates that very few studies have been carried out to show how university students accept and employ Mobile Learning Platforms. In addition, it is asserted that behavioral models of technology acceptance are not equally applied in different cultures. The purpose of this study is to develop an extension of the Technology Acceptance Model (TAM) by including four more constructs: namely, content quality, service quality, information quality and quality of the system. This is proposed to make it more relevant for developing countries, like the United Arab Emirates (UAE). An online survey was carried out to obtain the data. A total of 221 students from the UAE took part in this survey. Structural equation modeling was used to determine and test the measurement and structural model. Data analysis was carried out, which showed that ten out of a total of 12 hypotheses are supported. This shows that there is support for the applicability of the extended TAM in the UAE. These outcomes suggest that Mobile Learning Platforms should be considered by the policymakers and education developers as be-
Conference Paper
There have been several longitudinal studies concerning the learners’ acceptance of e-learning systems using the higher educational institutes (HEIs) platforms. Nonetheless, little is known regarding the determinants affecting e-learning acceptance through social media applications in HEIs. In keeping with this, the present study attempts to understand the influence of social media practices (i.e., knowledge sharing, social media features, and motivation and uses) on students’ acceptance of e-learning systems by extending the technology acceptance model (TAM) with these determinants. A total of 410 graduate and undergraduate students enrolled at the British University in Dubai, UAE took part in the study through questionnaire surveys. Partial least squares-structural equation modeling (PLS-SEM) is employed to analyze the extended model. The empirical data analysis revealed that social media practices including knowledge sharing, social media features, and motivation and uses have significant positive impacts on both perceived usefulness (PU) and perceived ease of use (PEOU). It is also important to report that the acceptance of e-learning systems is significantly influenced by both PU and PEOU. In summary, social media practices play an effective positive role in influencing the acceptance of e-learning systems by students.
Article
Extending the Technology Acceptance Model (TAM) for studying e-learning acceptance is not a new research topic, and it has been tackled by many scholars. However, the development of a comprehensive TAM that can examine e-learning acceptance under any circumstances is regarded as an essential research direction. To identify the most widely used external factors of the TAM concerning e-learning acceptance, a literature review comprising 120 significant published studies from the last twelve years was conducted. The review analysis indicated that computer self-efficacy, subjective/social norm, perceived enjoyment, system quality, information quality, content quality, accessibility, and computer playfulness were the most common external factors of TAM. Accordingly, the TAM has been extended by the aforementioned factors to examine the students' acceptance of e-learning in five different universities in the United Arab Emirates (UAE). A total of 435 students participated in the study. The results indicated that system quality, computer self-efficacy, and computer playfulness have a significant impact on perceived ease of use of the e-learning system. Furthermore, information quality, perceived enjoyment, and accessibility were found to have a positive influence on perceived ease of use and perceived usefulness of the e-learning system.
Article
Academic procrastination has been reported to affect students’ performance in computer-supported learning environments. Studies have shown that students who demonstrate higher procrastination tendencies achieve less than students with lower procrastination tendencies. It is important for a teacher to be aware of the students’ behaviors, especially their procrastination trends. EDM techniques can be used to analyze data collected through computer-supported learning environments and to predict students’ behaviors. In this paper, we present an algorithm called Students’ Academic Performance Enhancement through homework late/non-submission detection (SAPE) for predicting students’ academic performance. This algorithm is designed to predict students with learning difficulties through their homework submission behaviors. First, students are labeled as procrastinators or non-procrastinators using the k-means clustering algorithm. Then, different classification methods are used to classify students using homework submission feature vectors. We use ten classification methods, i.e., ZeroR, OneR, ID3, J48, random forest, decision stump, JRip, PART, NBTree and Prism. A detailed analysis is presented regarding the performance of different classification methods for different numbers of classes. The analysis reveals that, in general, the prediction accuracy of all methods decreases as the number of classes increases. However, different methods perform best or worst for different numbers of classes.
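The two-stage idea in this abstract (cluster first to obtain labels, then train a classifier on submission vectors) can be sketched as follows. This is not the published SAPE algorithm: the data are hypothetical, the clustering is a simplified two-cluster k-means on a single late/missing ratio, and only OneR, one of the ten listed classifiers, is shown in a bare-bones form.

```python
from collections import Counter, defaultdict

def two_means_1d(values, max_iter=50):
    """Plain two-cluster k-means on scalars; returns (low_center, high_center)."""
    lo, hi = min(values), max(values)  # deterministic starting centers
    for _ in range(max_iter):
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high) if high else hi
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return lo, hi

def one_r(rows, labels):
    """OneR: keep the single feature whose value-to-majority-label rule errs least."""
    best = None
    for j in range(len(rows[0])):
        by_value = defaultdict(Counter)
        for row, label in zip(rows, labels):
            by_value[row[j]][label] += 1
        rule = {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}
        errors = sum(1 for row, label in zip(rows, labels) if rule[row[j]] != label)
        if best is None or errors < best[0]:
            best = (errors, j, rule)
    return best[1], best[2]

# Hypothetical data: one row per student, one flag per homework (1 = late or missing).
submissions = [
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
]

# Stage 1: cluster students on their late/missing ratio to obtain labels.
ratios = [sum(row) / len(row) for row in submissions]
lo, hi = two_means_1d(ratios)
labels = [1 if abs(r - hi) < abs(r - lo) else 0 for r in ratios]  # 1 = procrastinator

# Stage 2: fit a classifier on the submission vectors using those labels.
feature, rule = one_r(submissions, labels)
print(labels, feature, rule)
```

On this toy data the rule found by OneR depends only on the first homework, which hints at how such a classifier could flag at-risk students early in a term; whether that holds on real data is an empirical question.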
Article
This paper presents an algorithm for predicting the academic performance of students by fuzzy data mining. The fuzzy-trace concept is applied to predict the academic performance of the students, and the algorithm proposed in this paper rests on this idea. The fuzzy academic set is generated from the student’s academic data. This is analyzed by the fuzzy-matrix set. The prediction
Book
The book focuses on machine learning. Divided into three parts, the first part discusses the feature selection problem. The second part then describes the application of machine learning in the classification problem, while the third part presents an overview of real-world applications of swarm-based optimization algorithms. The concept of machine learning (ML) is not new in the field of computing. However, due to the ever-changing nature of requirements in today’s world it has emerged in the form of completely new avatars. Now everyone is talking about ML-based solution strategies for a given problem set. The book includes research articles and expository papers on the theory and algorithms of machine learning and bio-inspiring optimization, as well as papers on numerical experiments and real-world applications.
Article
Data mining involves searching large volumes of data or records to discover patterns and using these patterns to predict future events. In most educational sectors, such as high schools, polytechnics and universities, classification is a vital analytical mechanism for prediction at various levels of accuracy. Classification is one of the methods in data mining for categorizing a particular group of items into targeted groups. The main goal of classification is to predict the nature of an item or data based on the available classes of items. Construction of the classification model is always defined by the available training data set. In this paper we discuss only classification algorithms, although there are different types of algorithms available in data mining for the prediction of the future strategy for a business. The decision tree classification technique utilized in this work focused mainly on data on students' performance obtained in a high school during a quiz, using the KNIME tool.
Chapter
Aeronautical systems are no longer traditional masterpieces of autonomous mechanical engineering. Today, they are characterized by many intelligent technologies that include sensors, wireless standards and data analysis tools. Known as Aerospace Cyber-physical Systems (CPS), these CPSes are undergoing a massive transformation to increase the safety, efficiency and reliability of their operations. The physical system has created the Internet of Things IoT by integrating sensors, controllers and actuators. Nevertheless, the cyberspace of these aerospace CPSes offers many opportunities for malicious actors who threaten the security and privacy of vehicles/aircraft and their applications. Unprotected or poorly protected systems can easily be exploited for malicious purposes. Indeed, aerospace CPSes are always under threat from an increasing number of cyber-attacks through sensory or wireless channels, hardware, software or actuators. Recently, due to the significant advances and impressive results of machine learning techniques in the fields of image recognition, natural language processing and speech recognition for various long-standing artificial intelligence tasks, there has been a great interest in applying them to intrusion detection in the field of cybersecurity. In this chapter, we present different machine learning techniques for IoT intrusion detection in aerospace cyber-physical systems. The application of machine learning for cybersecurity in IoT requires the availability of substantial data on IoT attacks, but the lack of data on IoT attacks is a significant problem. In our study, the Cooja IoT simulator was used to generate high fidelity attack data in IoT 6LoWPAN networks. The efficient network architecture for all machine models is chosen based on comparing the performance of various network topologies and network scenarios. 
The experimental results show that the machine learning models for intrusion detection achieve more than 99% in terms of accuracy, efficiency and detection rate. They also require low energy consumption overhead and memory, which proves that the proposed models can be used in constrained environments such as IoT sensors.