Introduction to The Special Section on
Educational Data Mining
Toon Calders
Department of Computer Science
Eindhoven University of Technology
P.O. Box 513
5600 MB Eindhoven
t.calders@tue.nl
Mykola Pechenizkiy
Department of Computer Science
Eindhoven University of Technology
P.O. Box 513
5600 MB Eindhoven
m.pechenizkiy@tue.nl
ABSTRACT
Educational Data Mining (EDM) is an emerging multidisciplinary research area, in which methods and techniques for exploring data originating from various educational information systems have been developed. EDM is both a learning science and a rich application area for data mining, due to the growing availability of educational data. EDM contributes to the study of how students learn, and the settings in which they learn. It enables data-driven decision making for improving current educational practice and learning material. We present a brief overview of EDM and introduce four selected EDM papers representing a crosscut of different application areas for data mining in education.
1. INTRODUCTION
Recently, the growing dissemination of interactive learning environments, learning management systems (LMS), intelligent tutoring systems (ITS), and educational hypermedia systems, as well as the wider use of ICT in education in general, has allowed the collection of huge amounts of data. The increase in instrumented educational software, together with state databases of student test scores, has created large repositories of data reflecting how students learn. Some examples of popular systems include: general-purpose LMSs such as Sakai¹ and Moodle²; specialized ITSs like the Cognitive Tutors³ or SQL Tutor⁴; professional education and training systems such as simulators; systems for learning elementary skills, for instance reading and performing arithmetic operations, such as Neure and Ekapeli⁵; and eHealth and patient education systems such as Philips Motiva⁶. Educational Data Mining aims at discovering useful information from the large amounts of electronic data collected by these educational systems. EDM as an emerging multidisciplinary research area brings together researchers and practitioners from computer science, education, psychology, psychometrics, and statistics.
¹ http://sakaiproject.org
² http://moodle.org/
³ http://pact.cs.cmu.edu/
⁴ http://www.cosc.canterbury.ac.nz/tanja.mitrovic/sql-tutor.html
⁵ http://www.lukimat.fi/
⁶ http://www.healthcare.philips.com/main/products/telehealth/products/motiva.wpd
EDM as a separate research field started to mature a few years ago. The Educational Data Mining International Conference series was launched; the 4th edition of the conference was held this year in Eindhoven, the Netherlands [6]. In 2010 the KDD Cup at the ACM SIGKDD Conference was devoted to the Educational Data Mining Challenge⁷: the task was to predict student performance on mathematical problems based on data from logs of student interaction with an ITS. The web portal of the International Educational Data Mining Society⁸ provides pointers to the main resources and scientific events in this field. In the EDM area there are not as many benchmarks as in data mining or information retrieval. Student enrollment data and LMS data are rarely anonymized and made publicly available. The best-known repository for data on the interactions between students and ITS educational software is the Pittsburgh Science of Learning Center (PSLC) DataShop [4]. In addition to data, the repository includes a suite of tools to process, explore and visualize the data through a web-based interface.
Historically, the majority of EDM researchers have a background in ITS, AI in education (AIED), user modeling, technology enhanced learning (TEL), or adaptive educational hypermedia. Relatively few scientists come with a data mining background. The goal of this special section is therefore twofold: providing an overview of the field and attracting interest from the Data Mining community. We feel that EDM can and should attract further attention from the KDD community. With this introduction to the special section we attempt to answer the question: “What is interesting in EDM for the Data Mining and KDD community?” We discuss the landscape of EDM applications and tasks in Section 2, pointing to different kinds of data available for mining, and introduce four papers selected to represent the current state of the art in the field in Section 3. Section 4 concludes the introduction.
2. TYPICAL EDM TASKS
Figure 1 presents the basic setting of EDM, with several groups of stakeholders (learners, teachers, study advisers, directors of education, educational researchers) who can benefit from EDM in different ways. For instance, students can receive advice and recommendations about available courses, learning activities, resources, or tasks that are the most suitable w.r.t. their current knowledge and learning objectives; teachers can see how effective their learning material is, how well the students are doing on particular tasks, and how informative test assignments are; a study adviser can identify risk groups among the students; directors of education can see how the students actually study and what the bottlenecks are in the current curriculum. In each case it is expected that the mined knowledge can give better insight into, facilitate, and enhance the educational processes and learning as a whole. The educational data mining survey by Romero and Ventura [8] provides an elaborate overview of how different EDM stakeholders can benefit from mining various educational data sources, and several success stories can be found in the first Handbook on EDM [9].
⁷ https://pslcdatashop.web.cmu.edu/KDDCup/
⁸ http://www.educationaldatamining.org/
SIGKDD Explorations
Volume 13, Issue 2
Page 3
[Figure 1, diagram. Boxes: Learners (pupils, students, professionals, patients); Educators (teachers, study advisers, directors of education, education researchers); Educational Information Systems (ITS, AEH, TEL, LMS), in which learners enroll in courses, use learning resources, pass tests, and collaborate with other students, and which collect and use Educational Data (learning objects, usage and interaction event logs, grades, learner profiles); EDM Tasks (student profiling, knowledge modeling, dropout prediction); Discovered Knowledge (descriptive process models, learning patterns, outliers, performance predictions, advice and recommendations).]
Figure 1: Educational data mining in a nutshell.
Current mainstream EDM research is primarily focused on mining ITS and LMS logs. However, EDM in a wider perspective is aimed at helping to address problems related to different phases of the learning process, whether it is formal (e.g. tests) or informal (e.g. educational games), intentional (e.g. tutoring) or unexpected (e.g. using social media). Examples of particular problems include:
- How to (re)organize the classes, assessment, or placement of materials based on usage and performance data.
- How to identify those who would benefit from provided feedback, study advice or other help.
- How to decide which kind of help, feedback or advice would be most effective.
- How to help learners in finding and searching for useful material, individually or in collaboration with peers.
Available Data Sources. Different kinds of information systems support educational processes at different levels. For instance, administrative databases store enrollment information, i.e., who follows which program and takes which courses and (re-)exams, as well as student demographics and pre-university data, such as school grades. LMSs store more fine-grained data including resource usage logs (e.g. handouts, video recordings), assessment data, collaborations in wikis or versioning systems, and participation in forums. ITSs and educational games often have learners’ performance data over a large collection of learning tasks. Consequently, learning-related data may have varying characteristics. In traditional education, faculty or university level data is longitudinal (including exam data over 5-year study programmes) but corresponds to only a few hundred or thousand students. In e-learning, the use of widely accepted ITSs like SQL Tutor or some of the Carnegie Learning⁹ tutoring tools, used in schools at the national level in the United States, has resulted in huge datasets containing long sequences of learners’ actions and their correctness. It is typical to assume that the knowledge of learners increases and their skills improve over time, and that this needs to be modeled and traced. In general, the data can be seen and modeled at different levels of aggregation.
EDM Problem Formulations. Many basic EDM tasks can be mapped to traditional data mining problem formulations:
- Classification: categorizing and profiling students, determining their learning styles and preferences [1].
- Predictive modeling: inducing models that can predict whether (and when) a student will pass a course or not [3], or will eventually graduate or drop out [2].
- Clustering: grouping similar students (based on behavior, performance, etc.) or grouping similar courses, assignments, etc. together, exploring collaborative learning patterns [7].
- Biclustering: finding which questions (tasks, courses, etc.) are difficult/easy for which students.
- Frequent pattern mining: finding (elective) courses often taken together, or popular paths in study programs or actions in an LMS [10].
- Emerging pattern mining: finding patterns that capture significant differences in behavior between students who graduated and those who did not, or that explain the changes in behavior of student generations over different years.
- Collaborative filtering and recommendations: recommending suitable learning objects based on the analysis of the performance of other learners, recommending remedial classes to students [5].
- Visual analytics: facilitating reasoning about the educational processes or learning results via interactive data/model visualization, e.g. visualizing collaborations of students.
- Process mining: understanding the study curriculum and how students follow it, whether or not they obey particular constraints, and understanding bottlenecks in particular study programs.
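As a small illustration of one of the formulations listed above, frequent pattern mining of courses taken together can be sketched as a pair-counting pass over enrollment records. The sketch below is only indicative: the function name `frequent_course_pairs`, the course codes, and the toy enrollment data are hypothetical, and counting pairs is the simplest special case of frequent itemset mining rather than a full algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations


def frequent_course_pairs(enrollments, min_support):
    """Return course pairs taken together by at least min_support students.

    enrollments: iterable of sets, one set of course codes per student.
    """
    pair_counts = Counter()
    for courses in enrollments:
        # Sort so that each pair is counted under one canonical ordering.
        for pair in combinations(sorted(courses), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Hypothetical toy data: the elective courses of four students.
enrollments = [
    {"DB", "DM", "ML"},
    {"DB", "DM"},
    {"DM", "ML"},
    {"DB", "DM", "HCI"},
]
pairs = frequent_course_pairs(enrollments, min_support=2)
print(pairs)
```

On real enrollment data the same idea extends to larger itemsets, at which point pruning by support becomes necessary to keep the search tractable.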
Some state-of-the-art data mining techniques have already been shown to be useful in particular educational domains. However, many other EDM-related areas still remain unexplored.
⁹ http://www.carnegielearning.com/
3. CONTRIBUTED ARTICLES
To illustrate the current state of the EDM field, we have selected four contributions that together provide an overview of the main research directions in EDM. The goal of this special section on EDM is by no means to be exhaustive, but rather to provide a crosscut of the field. There have been numerous other valuable contributions in the field, many of which can be found in the proceedings of the past and upcoming EDM¹⁰, ITS¹¹, and AIED¹² conferences and in the JEDM¹³ and UMUAI¹⁴ journals, among others.
In this special section we included the following four papers:
- Data Mining for Improving Textbooks by Rakesh Agrawal, Sreenivas Gollapudi, Anitha Kannan, and Krishnaram Kenthapadi. This paper discusses various ways of assessing the quality of existing textbooks, as well as of suggesting additional material, such as illustrations or Wikipedia pages. The quality assessment of the textbook sections is not only based upon a textual analysis of, e.g., average word and sentence lengths, but also includes an elaborate analysis of the concepts in the text and their relations. Based upon the concept graph, the dispersion of the book section is measured. In the process of analyzing the texts and suggesting additions, many external information sources are used and combined, including synsets from WordNet and Wikipedia pages with their revision history. This paper is a good example of how a creative combination of existing techniques with the wealth of available online material allows for new applications in the educational field that were previously impossible.
- Social Network Analysis and Mining to Support the Assessment of On-line Student Participation by Reihaneh Rabbany, Mansoureh Takaffoli and Osmar R. Zaïane. Besides the study material itself, the way students use and discuss it can also be analyzed. Many electronic learning environments such as Moodle, Blackboard and others offer tools for students to collaborate. A popular example of such a collaborative tool is a forum in which students can post questions and remarks, and react to each other’s contributions. Nevertheless, as Rabbany et al. argue, it is often quite difficult to analyze in what way students are using these tools, how they are collaborating, and what topics they are discussing. Therefore, Rabbany et al. present their Meerkat-ED toolbox for social network analysis in the context of the assessment of student collaborations and course participation. The visualizations cover detected communities among the students, keywords representing discussion topics and their relations, and the relative centrality of students in the discussions. A case study for one course is presented.
¹⁰ http://www.educationaldatamining.org/EDM2012/
¹¹ http://its2012.teicrete.gr/
¹² http://www.aied2011.canterbury.ac.nz/
¹³ http://www.educationaldatamining.org/JEDM/
¹⁴ http://www.umuai.org/
- Mapping Question Items to Skills with Non-negative Matrix Factorization by Michel C. Desmarais. Another important source of information in the educational process is the test scores of students. Desmarais shows how the scores of different students on a set of questions can be used to determine the skills required for a particular question, and how strong the different students are in these skills. Desmarais applies matrix factorization techniques for this purpose. The student-question score matrix is decomposed into two matrices: a students-skills matrix and a skills-questions matrix. Given the constraints of the domain, non-negative matrix factorization is used; i.e., it is assumed that the skill mastery level of the students is non-negative and that being more skilled will never have a negative impact on the student’s ability to answer a question correctly. Desmarais studies the capabilities and limitations of this technique and illustrates them on two real datasets and on simulated data. The performance of the technique is measured by how well it clusters the questions according to a pre-defined categorization.
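The decomposition described above can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the procedure used by Desmarais: it uses the standard multiplicative update rules for non-negative matrix factorization under squared error, the helper names (`matmul`, `transpose`, `squared_error`, `nmf`) are hypothetical, and the 4-by-4 score matrix is made-up toy data in which the first two questions share one skill and the last two share another.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def squared_error(r, s, q):
    """Squared Frobenius distance between r and the product s * q."""
    approx = matmul(s, q)
    return sum((x - y) ** 2 for rr, ar in zip(r, approx) for x, y in zip(rr, ar))


def nmf(r, n_skills, n_iter=200, eps=1e-9, seed=0):
    """Factor r (students x questions) into s (students x skills)
    and q (skills x questions), keeping all entries non-negative."""
    rng = random.Random(seed)
    n_students, n_questions = len(r), len(r[0])
    s = [[rng.uniform(0.1, 1.0) for _ in range(n_skills)] for _ in range(n_students)]
    q = [[rng.uniform(0.1, 1.0) for _ in range(n_questions)] for _ in range(n_skills)]
    for _ in range(n_iter):
        # Multiplicative update of q: q <- q * (s^T r) / (s^T s q).
        st = transpose(s)
        num, den = matmul(st, r), matmul(matmul(st, s), q)
        q = [[q[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_questions)]
             for k in range(n_skills)]
        # Multiplicative update of s: s <- s * (r q^T) / (s q q^T).
        qt = transpose(q)
        num, den = matmul(r, qt), matmul(s, matmul(q, qt))
        s = [[s[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_skills)]
             for i in range(n_students)]
    return s, q


# Hypothetical toy scores: rows are students, columns are questions (1 = correct).
scores = [
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
]
students_skills, skills_questions = nmf(scores, n_skills=2)
print(squared_error(scores, students_skills, skills_questions))
```

Because every update multiplies a non-negative entry by a non-negative ratio, the factors stay non-negative throughout, which is what makes the rows of the skills-questions matrix interpretable as skill requirements.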
- The Sum is Greater than the Parts: Ensembling Models of Student Knowledge in Educational Software by Zachary A. Pardos, Sujith M. Gowda, Ryan S.J.D. Baker, and Neil T. Heffernan. Another example of analyzing test results is given by Pardos et al. In contrast to Desmarais, however, whose focus was mainly on detecting the required skills for different questions, Pardos et al. concentrate on the knowledge level of the students, and this knowledge is assumed to be non-static. The assumption is that students who solve problems develop their knowledge, and better knowledge will allow them to improve their performance on further questions. Knowledge about a topic, however, can be observed only indirectly through the scores of the student on questions for this topic. The knowledge of a student on a topic is therefore identified with the probability that the student will answer the next question on that topic correctly. In this way, the performance of the knowledge models can easily be assessed in controlled settings. Several models for assessing the evolving knowledge level of the students are presented, and it is shown how their predictions can be combined in ensemble methods to further boost their performance.
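The idea of treating knowledge as the probability of a correct next answer, and of combining several such estimates, can be sketched as follows. This is an illustration under assumptions only: the two predictors (`running_average` and `recency_weighted`) are simple hypothetical stand-ins and not the knowledge models studied by Pardos et al., the ensemble is a plain uniform average, and the answer sequence is made-up toy data.

```python
from math import sqrt


def running_average(history, prior=0.5):
    """Smoothed proportion of correct answers, with one pseudo-observation
    equal to `prior` so that an empty history yields the prior."""
    return (sum(history) + prior) / (len(history) + 1)


def recency_weighted(history, decay=0.5, prior=0.5):
    """Exponentially weighted estimate that emphasizes recent answers."""
    estimate = prior
    for outcome in history:
        estimate = decay * estimate + (1 - decay) * outcome
    return estimate


def ensemble(history, models):
    """Uniform average of the individual model predictions."""
    return sum(m(history) for m in models) / len(models)


def rmse(predict, sequence):
    """Predict each answer from the answers before it and score with RMSE."""
    errors = [(predict(sequence[:t]) - sequence[t]) ** 2 for t in range(len(sequence))]
    return sqrt(sum(errors) / len(errors))


# Hypothetical answers of one student on one topic (1 = correct).
answers = [0, 0, 1, 0, 1, 1, 1, 1]
models = [running_average, recency_weighted]
scores = {m.__name__: rmse(m, answers) for m in models}
scores["ensemble"] = rmse(lambda h: ensemble(h, models), answers)
print(scores)
```

A uniform average is guaranteed a mean squared error no worse than the average of the individual models' mean squared errors, but not necessarily better than the best single model; learned combination weights are one way to go beyond that.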
4. CONCLUDING REMARKS
EDM has taken off. The years to come will show how this field evolves, and how it will be perceived by the KDD community: will it be yet another application domain of data mining, or does it have the capacity to grow into a new subfield with its own challenges for data mining and multidisciplinary research, as happened with bioinformatics?
In this special section we present the current state of the art in the area through four invited representative papers, covering: the evaluation and improvement of study material; assessing the knowledge of students based upon how they score on a set of questions; analyzing the required skills for different questions, based upon how students answer them; and visualizing collaborations of students in order to detect groups of topics and clusters of students.
We hope you will enjoy reading the papers on EDM included in this special section and find inspiration for formulating new data mining problems or for trying out your own favorite data mining algorithm on the available EDM datasets.
5. ACKNOWLEDGEMENTS
We would like to thank all the authors who contributed to
this special section.
6. REFERENCES
[1] H. J. Cha, Y. S. Kim, S. H. Park, T. B. Yoon, Y. M. Jung, and J.-H. Lee. Learning styles diagnosis based on user interface behaviors for the customization of learning interfaces in an intelligent tutoring system. In Proceedings of the 8th International Conference on Intelligent Tutoring Systems, ITS 2006, volume 4053 of Lecture Notes in Computer Science, pages 513–524. Springer, 2006.
[2] G. Dekker, M. Pechenizkiy, and J. Vleeshouwers. Predicting students drop out: A case study. In Proceedings of the 2nd International Conference on Educational Data Mining, EDM’09, pages 41–50, 2009.
[3] W. Hämäläinen and M. Vinni. Comparison of machine learning methods for intelligent tutoring systems. In Proceedings of the 8th International Conference on Intelligent Tutoring Systems, ITS 2006, volume 4053 of Lecture Notes in Computer Science, pages 525–534. Springer, 2006.
[4] K. Koedinger, R. Baker, K. Cunningham, A. Skogsholm, B. Leber, and J. Stamper. A data repository for the EDM community: The PSLC DataShop. In Handbook of Educational Data Mining. Boca Raton, FL: CRC Press, Taylor & Francis, 2010.
[5] Y. Ma, B. Liu, C. K. Wong, P. S. Yu, and S. M. Lee. Targeting the right students using data mining. In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD’00, pages 457–464, New York, USA, 2000. ACM.
[6] M. Pechenizkiy, T. Calders, C. Conati, S. Ventura, C. Romero, and J. Stamper, editors. Proceedings of the 4th International Conference on Educational Data Mining, Eindhoven, the Netherlands, July 6-8, 2011, 2011.
[7] D. Perera, J. Kay, I. Koprinska, K. Yacef, and O. R. Zaïane. Clustering and sequential pattern mining of online collaborative learning data. IEEE Transactions on Knowledge and Data Engineering, 21(6):759–772, 2009.
[8] C. Romero and S. Ventura. Educational data mining: A survey from 1995 to 2005. Expert Systems with Applications, 33:135–146, July 2007.
[9] C. Romero, S. Ventura, M. Pechenizkiy, and R. Baker. Handbook of Educational Data Mining. Boca Raton, FL: CRC Press, Taylor & Francis, 2010.
[10] O. R. Zaïane. Web usage mining for a better web-based learning environment. In Proceedings of the Conference on Advanced Technology for Education, Banff, Alberta, pages 60–64, 2001.
SIGKDD Explorations
Volume 13, Issue 2
Page 6
... Multidisciplinary research communities in education, including educational data mining (EDM) (20,21), learning analytics (LA), computer-supported collaborative learning (CSCL) (22), and AI for education (AIED) (23,24), have been active, with overlapping research interests and philosophical differences described in ref. 25. Surveys of research trends and challenges in these communities (23,26,27) have indicated rapid growth in the use of technology in education and consequent research interest. ...
... This growth necessitates a deep dive into the technological priorities in applied ML research and development, which is central to the current work. A shared interest in common problem formulations and technologies notwithstanding, Calders et al. (21) suggest that there are differences in background and motivation between education research communities and general ML and data science communities like KDD * (28). ...
... A. Related Work in Education Technology. The educational data mining (EDM) (20,21), learning analytics (LA), computersupported collaborative learning (CSCL) (22), and AI for education (AIED) (23,24) communities have been active for the last one to three decades. Retrospective surveys of the research trends and ongoing challenges of these fields have analyzed common and growing ML paradigms being applied in education, like neural networks for learning characteristic prediction and teacher evaluation (23,58), NLP for language education (59), and AI-assisted personalization (23,60,61). ...
Article
Full-text available
Machine learning (ML) techniques are increasingly prevalent in education, from their use in predicting student dropout to assisting in university admissions and facilitating the rise of massive open online courses (MOOCs). Given the rapid growth of these novel uses, there is a pressing need to investigate how ML techniques support long-standing education principles and goals. In this work, we shed light on this complex landscape drawing on qualitative insights from interviews with education experts. These interviews comprise in-depth evaluations of ML for education (ML4Ed) papers published in preeminent applied ML conferences over the past decade. Our central research goal is to critically examine how the stated or implied education and societal objectives of these papers are aligned with the ML problems they tackle. That is, to what extent does the technical problem formulation, objectives, approach, and interpretation of results align with the education problem at hand? We find that a cross-disciplinary gap exists and is particularly salient in two parts of the ML life cycle: the formulation of an ML problem from education goals and the translation of predictions to interventions. We use these insights to propose an extended ML life cycle, which may also apply to the use of ML in other domains. Our work joins a growing number of meta-analytical studies across education and ML research as well as critical analyses of the societal impact of ML. Specifically, it fills a gap between the prevailing technical understanding of machine learning and the perspective of education researchers working with students and in policy.
... Multidisciplinary research communities in education, including educational data mining (EDM) [14,48], learning analytics (LA), Computer-Supported Collaborative Learning (CSCL) [53], and AI for education (AIED) [19,67] have been active, with overlapping research interests and philosophical differences described in [70]. Surveys of research trends and challenges in these communities [19,32,37] have indicated rapid growth in the use of technology in education and consequent research interest. ...
... This growth necessitates a deep dive into the technological priorities in applied ML research and development, which is central to the current work. A shared interest in common problem formulations and technologies notwithstanding, Calders et al. [14] suggest that there are salient disciplinary differences between education research communities and traditional ML and data science communities such as KDD 1 [66]. ...
... The educational data mining (EDM) [14,48], learning analytics (LA), Computer-Supported Collaborative Learning (CSCL) [53], and AI for education (AIED) [19,67] communities have been active for the last one to three decades. Retrospective surveys of the research trends and ongoing challenges of these fields have analyzed common and growing ML paradigms being applied in education, like neural networks for learning characteristic prediction and teacher evaluation [19,83], NLP for language education [18], and AI-assisted personalization [16,19,95]. ...
Preprint
Full-text available
Machine learning (ML) techniques are increasingly prevalent in education, from their use in predicting student dropout, to assisting in university admissions, and facilitating the rise of MOOCs. Given the rapid growth of these novel uses, there is a pressing need to investigate how ML techniques support long-standing education principles and goals. In this work, we shed light on this complex landscape drawing on qualitative insights from interviews with education experts. These interviews comprise in-depth evaluations of ML for education (ML4Ed) papers published in preeminent applied ML conferences over the past decade. Our central research goal is to critically examine how the stated or implied education and societal objectives of these papers are aligned with the ML problems they tackle. That is, to what extent does the technical problem formulation, objectives, approach, and interpretation of results align with the education problem at hand. We find that a cross-disciplinary gap exists and is particularly salient in two parts of the ML life cycle: the formulation of an ML problem from education goals and the translation of predictions to interventions. We use these insights to propose an extended ML life cycle, which may also apply to the use of ML in other domains. Our work joins a growing number of meta-analytical studies across education and ML research, as well as critical analyses of the societal impact of ML. Specifically, it fills a gap between the prevailing technical understanding of machine learning and the perspective of education researchers working with students and in policy.
... EDM contributes to the study's settings and how students learn. It enables datadriven decision making for improving the current educational practice and learning material [9]. ...
Article
Full-text available
The involvement of teaching videos increases the learners' psychological stimulation during the learning process. In fact, this type of resource can facilitate comprehension by approaching the content at the learners own pace and coming back to it as many times as necessary. In this particular case, the possibility of keeping a trace of video playing will be considered for the teacher an important asset in monitoring the learning progress of the learners. The objective of our work is to analyze the learner traces when he/she is playing a video. We propose usable restitutions to improve the online learning. Among these restitutions, we suggest to the teacher groups of learners who will be able to work together since they have the same video playing strategies. Throughout this analysis, the results are from a specific case study on which we applied unsupervised classification methods to identify groups of learners with similar profiles.
... Educational Data Mining (EDM) is a closely related field to LA [15]. EDM "is an emerging multidisciplinary research area, in which methods and techniques for exploring data originating from various educational information systems have been developed" (p. 3) [16]. EDM has a stronger emphasis on the technical element of data mining and analysis, but shares the overarching goal with LA of generating insights to support learning and teaching improvement. ...
Article
Full-text available
Student performance predictive analysis has played a vital role in education in recent years. It allows for the understanding students’ learning behaviours, the identification of at-risk students, and the development of insights into teaching and learning improvement. Recently, many researchers have used data collected from Learning Management Systems to predict student performance. This study investigates the potential of clickstream data for this purpose. A total of 5341 sample students and their click behaviour data from the OULAD (Open University Learning Analytics Dataset) are used. The raw clickstream data are transformed, integrating the time and activity dimensions of students’ click actions. Two feature sets are extracted, indicating the number of clicks on 12 learning sites based on weekly and monthly time intervals. For both feature sets, the experiments are performed to compare deep learning algorithms (including LSTM and 1D-CNN) with traditional machine learning approaches. It is found that the LSTM algorithm outperformed other approaches on a range of evaluation metrics, with up to 90.25% accuracy. Four out of twelve learning sites (content, subpage, homepage, quiz) are identified as critical in influencing student performance in the course. The insights from these critical learning sites can inform the design of future courses and teaching interventions to support at-risk students.
... Although predicting academic performance is one of the most popular subject areas in EDM (Akçapınar et al., 2019;Fernandes et al., 2019), researchers also utilize EDM for purposes such as the discovery or improvement of models regarding the knowledge structure of the domain, drop-out prediction, student profiling, and achieving a deeper understanding of educational phenomena (Baker, 2011;Calders & Pechenizkiy, 2012;Romero & Ventura, 2010). In order to achieve these and similar purposes, there exists a wide variety of methods that have proven popular within EDM, and that have been subjected to different classifications by different researchers (Baker, 2011;Baker & Yacef, 2009;Bakhshinategh et al., 2018;Romero & Ventura, 2007Zaiane, 2002). ...
Article
Today, most educational institutions have become more interested in big data. Because the importance of extracting useful information from educational data to support decision-making on educational issues has increased day by day. In this context, through educational data mining, this research study aims to reveal the association rules among compulsory courses in the Computer Education and Instructional Technology curriculum within the faculty of education of a state university in Turkey. In this context, the research was conducted with data obtained from 258 preservice teachers who had completed all of their compulsory courses (n = 42) for the Computer Education and Instructional Technology curriculum, having graduated from the Computer Education and Instructional Technology program between 2012 and 2020. According to the experimental results, the academic performance of preservice teachers in some courses could be used as a predictor of their academic performance in other courses. Other findings from the study are discussed in detail, and suggestions put forth for future research.
... Data Mining also discovery in Database (KDD).Mine educational field is referred as Mining (EDM).The mined knowl better insight, facilitate and enhanc process and the learning a who be mine in more r can also learn more ctice by studying a lso called knowledge ined knowledge from s Educational Data wledge can give a ance the educational hole [10]. Figure 2 illustrate the various assoc ...
Article
Full-text available
Relationship between ICT and e-learning shows big benefits for education, especially in higher education. We cannot ignore technology and technical skills in higher education field, even when we are overtaken technical education .Now a days we can take great advantages of Information and communication Technology (ICT) in every sector of education like Teaching and learning process, curriculum development, student progress etc. Using Data mining we can identify problem in growth of e-learning in India. Using this paper we are trying to highlight problems in e-learning to grow in some areas and how it can be solve using ICT.
Conference Paper
Early identification of elementary students' literacy levels is vital for many reasons. With the availability of extensive collections of data and advanced machine learning (ML) algorithms, students' literacy levels can be predicted to ensure they are meeting the benchmark level. This manuscript presents the findings of such an investigation to predict end-of-the-year literacy levels of students at the elementary level. Using a dataset of student-related information and academic scores (DIBELS), five machine learning models are constructed to classify whether students meet standards. Based on the experimental findings, the hyper-parameterized Random Forest model performs strongly, with a recall of 81% in identifying students at risk of not meeting the benchmark level.
Article
In virtual learning environments, students' interactions with learning content generate large amounts of data. Processed with educational data mining algorithms, these data can guide the design of virtual learning environments. However, there is a need to examine how the instructional materials that students interact with in a virtual learning environment affect their achievement. The scientific processes to be followed in educational data mining applications that address such needs also have to be set out. This study aimed to use the knowledge discovery in databases method to reveal the effect on achievement of students' interactions with different learning materials in a virtual learning environment. The Open University Learning Analytics Dataset was used as the big data source. With this dataset, the process was carried out according to the knowledge discovery in databases method, and answers to the research questions were sought with the CART algorithm, one of the decision tree algorithms. According to the findings, students' interactions with instructional materials were found to be a determinant of achievement. Accordingly, the findings are expected to guide instructional designers in their decisions on establishing standards in distance education environments and on selecting the instructional materials to be preferred for effective instructional design.
Article
Predicting students' overall performance becomes more difficult because of the massive quantity of records in academic databases. Currently in India, the lack of an existing system to examine and monitor student development and overall performance is not being addressed. Hence, this paper presents an in-depth literature review on predicting student performance using data mining techniques, with the aim of improving students' achievements. The main goal of this paper is to give an overview of the data mining techniques that have been used to predict students' performance. Educational data mining techniques could improve students' achievement and success more effectively and efficiently, bringing benefits to students, educators and educational institutions.
Article
Group work is widespread in education. The growing use of online tools supporting group work generates huge amounts of data. We aim to exploit this data to support mirroring: presenting useful high-level views of information about the group, together with desired patterns characterizing the behaviour of strong groups. The goal is to enable the groups and their facilitators to see relevant aspects of the group's operation and to provide feedback on whether these are more likely to be associated with positive or negative outcomes and where the problems are. We explore how useful mirror information can be extracted via a theory-driven approach and a range of clustering and sequential pattern mining techniques. The context is a senior software development project where students use the collaboration tool TRAC. We extract patterns distinguishing the better from the weaker groups and gain insight into the success factors. The results point to the importance of leadership and group interaction, and give promising indications of whether they are occurring. Patterns indicating good individual practices were also identified. We found that some key measures can be mined from early data. The results are promising for advising groups at the start and for early identification of effective and poor practices, in time for remediation.
Article
Currently there is an increasing interest in data mining and educational systems, making educational data mining a new, growing research community. This paper surveys the application of data mining to traditional educational systems, particular web-based courses, well-known learning content management systems, and adaptive and intelligent web-based educational systems. Each of these systems has different data sources and objectives for knowledge discovery. After preprocessing the available data in each case, data mining techniques can be applied: statistics and visualization; clustering, classification and outlier detection; association rule mining and pattern mining; and text mining. The success of the plentiful work needs much more specialized work in order for educational data mining to become a mature area.
Conference Paper
The education domain offers a fertile ground for many interesting and challenging data mining applications. These applications can help both educators and students, and improve the quality of education. In this paper, we present a real-life application for the Gifted Education Programme (GEP) of the Ministry of Education (MOE) in Singapore. The application involves many data mining tasks. This paper focuses only on one task, namely, selecting students for remedial classes. Traditionally, a cut-off mark for each subject is used to select the weak students. That is, those students whose scores in a subject fall below the cut-off mark for the subject are advised to take further classes in the subject. In this paper, we show that this traditional method requires too many students to take part in the remedial classes. This not only increases the teaching load of the teachers, but also gives unnecessary burdens to students, which is particularly undesirable in our case because the GEP students are generally taking more subjects than non-GEP students, and the GEP students are encouraged to have more time to explore advanced topics. With the help of data mining, we are able to select the targeted students much more precisely.
Conference Paper
The monitoring and support of university freshmen is considered very important at many educational institutions. In this paper we describe the results of an educational data mining case study aimed at predicting the dropout of Electrical Engineering (EE) students after the first semester of their studies, or even before they enter the study program, as well as identifying success factors specific to the EE program. Our experimental results show that rather simple and intuitive classifiers (decision trees) give a useful result, with accuracies between 75 and 80%. In addition, we demonstrate the usefulness of cost-sensitive learning and thorough analysis of misclassifications, and show a few ways of further improving prediction without having to collect additional data about the students.
Conference Paper
Each learner has different preferences and needs. Therefore, it is crucial to provide learners of different styles with different learning environments that they prefer and that are more efficient for them. This paper reports a study of an intelligent learning environment where the learner's preferences are diagnosed, and then user interfaces are customized in an adaptive manner to accommodate the preferences. A learning system with a specific interface has been devised based on the learning-style model by Felder & Silverman, so that different learner preferences are revealed through user interactions with the system. Using this interface, learning styles are diagnosed from learner behavior patterns on the interface using Decision Tree and Hidden Markov Model approaches.
Conference Paper
To implement real intelligence or adaptivity, the models for intelligent tutoring systems should be learnt from data. However, educational data sets are so small that machine learning methods cannot be applied directly. In this paper, we tackle this problem and give general outlines for creating accurate classifiers for educational data. We describe our experiment, in which we were able to predict course success with more than 80% accuracy in the middle of the course, given only a hundred rows of data.
Article
Web-based technology is often the technology of choice for distance education given the ease of use of the tools to browse the resources on the Web, the relative affordability of accessing the ubiquitous Web, and the simplicity of deploying and maintaining resources on the World Wide Web. Many sophisticated web-based learning environments have been developed and are in use around the world. The same technology is being used for electronic commerce and has become extremely popular. However, while clever tools have been developed to understand online customers' behaviours in order to increase sales and profit, very little has been done to automatically discover access patterns to understand learners' behaviour in web-based distance learning. Educators using on-line learning environments and tools have very little support to evaluate learners' activities and discriminate between different learners' on-line behaviours. In this paper, we discuss some data mining and machine learning techniques that could be used to enhance web-based learning environments, for the educator to better evaluate the learning process as well as for the learners to help them in their learning endeavour. KEY WORDS: Data Mining, e-learning, Web Usage Mining, Learning Activity Evaluation, Adaptive Web Sites.
Book
Handbook of Educational Data Mining (EDM) provides a thorough overview of the current state of knowledge in this area. The first part of the book includes nine surveys and tutorials on the principal data mining techniques that have been applied in education. The second part presents a set of 25 case studies that give a rich overview of the problems that EDM has addressed. Researchers at the forefront of the field discuss essential topics and the latest advances. With contributions by well-known researchers from a variety of fields, the book reflects the multidisciplinary nature of the EDM community. It brings the educational and data mining communities together, helping education experts understand what types of questions EDM can address and helping data miners understand what types of questions are important to educational design and educational decision making. Encouraging readers to integrate EDM into their research and practice, this timely handbook offers a broad, accessible treatment of essential EDM techniques and applications. It provides an excellent first step for newcomers to the EDM community and for active researchers to keep abreast of recent developments in the field.