What the Literature and Instructors Say about the
Analysis of Student Interaction Logs on
Virtual Learning Environments
André Luiz de Brandão Damasceno, Dalai dos Santos Ribeiro, Simone Diniz Junqueira Barbosa
Pontifical Catholic University of Rio de Janeiro
Rio de Janeiro, Brazil
{adamasceno,dribeiro,simone}@inf.puc-rio.br
Abstract—Online education has broadened the avenues of
research on student behavior and performance. In this paper,
we shed light on how student interaction logs on virtual learning
environments have been analyzed. We conducted two studies:
interviews with instructors and a systematic mapping of 136
papers on Education Data Mining and Learning Analytics,
published between 2010 and Feb/2019. We then triangulated the
instructors’ answers with the research results
found in the mapping. The main contributions of this paper
are: (i) an overview of Brazilian instructors’ experience with
Virtual Learning Environments and requirements to improve
student analysis; (ii) an overview and classification of the papers
in this topic; (iii) a ranking of importance of the instructors’
main statements and paper results; and (iv) guidelines based
on the requirements mentioned by instructors for designing
tools that support their analyses. Finally, we discuss gaps in
existing work that can ground and guide future research, such
as tools to evaluate student interaction logs with video lectures
and to analyze instructor interaction logs on Virtual Learning
Environments.
Index Terms—LA, EDM, VLE, logs, mapping, interviews,
eLearning, student, teacher, instructor, behavior, interaction
I. INTRODUCTION
Virtual Learning Environments (VLE) are not exclusive
to distance education. Some VLEs are used together with
face-to-face learning (aka blended learning). The usage of
VLE in all classroom modes (i.e., face-to-face, blended and
online education) has leveraged much research on Informatics
in Education. As students’ interactions with VLEs can be
captured in logs, by analyzing these logs we can evaluate their
learning achievement in a course, identify behavior patterns,
and even predict their performance [1].
Two main research groups have related goals regarding
the exploration of education data: Educational Data Mining
(EDM) and Learning Analytics (LA). Taking into account the
last three years, we have found in the literature some review
papers that present the state-of-the-art in EDM and LA [2]–[5].
Nevertheless, each of these papers focuses on either EDM
or LA, but not both, and highlights a single research topic,
such as an intervention during learning, support to instructor,
data clustering, or data visualization.
The purpose of this paper is to identify which kinds of
information about students the instructors regard as meaningful
(e.g., performance, behavior, engagement); how these kinds of
information are gathered; and how they drive requirements for
improving the analyses. Understanding the learning process
can help instructors design better courses and improve the
learning effectiveness.
To achieve our goal, we interviewed instructors
who work in Brazil and conducted a systematic mapping
on EDM and LA. The aim of this mapping was to uncover
papers that discuss the use of logs to analyze and predict both
student behavior and performance, and to propose a paper
classification for this domain. We then triangulated the answers
obtained in the interviews with the instructors and the paper
results found in the systematic mapping. The main outcome
of this triangulation is a broad assessment of the area, which
can ground and guide future research.
The remainder of this paper is structured as follows. Sec-
tion II describes the interviews with instructors and Section III
describes the systematic mapping. Section IV triangulates and
discusses the answers obtained from the instructors and the
results found in the literature. Lastly, Section V presents some
final considerations and recommendations for future work.
II. INTERVIEWS WITH INSTRUCTORS
Between November 2017 and April 2018, we invited 37
instructors from 11 education institutions located in different
Brazilian regions to participate in an individual interview. This
sample was selected aiming to answer the following questions:
• Which resources have instructors been using in VLEs
to analyze student behavior and performance?
• What do the instructors need to improve those
analyses?
The interviews were conducted with 18 university instruc-
tors (13 men and 5 women) from institutions located in six
Brazilian states (Rio de Janeiro, Maranhão, Minas Gerais,
Piauí, Goiás, and Pernambuco). The interviews were one
hour long, semi-structured¹, and conducted remotely or in person
(depending on the availability and location of the participant).
All instructors work in institutions that make use of VLEs. For
instance, Alura² and UNASUS-UFMA³ are institutions that
¹Available at: http://dx.doi.org/10.6084/m9.figshare.8285597
²https://www.alura.com.br
³http://www.unasus.ufma.br/site/
TABLE I
GENERAL RESULTS OF THE INTERVIEWS. EACH ITEM IS FOLLOWED BY THE NUMBER OF INSTRUCTORS (OUT OF 18) WHO MENTIONED IT.

Used resources in VLEs: communication tools (e.g., forum, chat) (18); postings (e.g., educational content, news) (10); assigned activities (e.g., quiz, exercises) (11); content repository (4); references from the Web (4); videoconference (4); wiki (3); blog (2); badges and rankings (1); dashboard for monitoring the student access (1); other (7).

Analysis of students’ interaction with the content: access logs (11); logs of interactions with the materials (9); do not analyze student interactions (6); forum postings and participation (5); evaluating assessments and exams (5); other (e.g., classroom assistant feedback) (3); student feedback (2); observing which students are online (1).

Meaningful information about students: background (16); interaction (14); performance (12); learning (12); course feedback (10); expectations, intentions, and motivations (8); knowledge usefulness (7); profile (6); perception (4); age (3); difficulties (3); participation (2); fulfilled goals and expectations (2); drop out (2); other (e.g., connection problems, availability, retention) (3).

Behavior patterns to be identified: access (8); interest (6); forum usage (5); interaction with content, instructor, and each other (4); participation (4); performance (2); responsibility (2); other (e.g., students’ preferred resources, pace, reaction, mood) (4).

Evaluation of students’ motivation: communication (e.g., forum, chat, talking with students) (13); assignment completion (7); access (5); reaction (3); participation (3); does not know (3); performance (2); feedback (e.g., qualitative questionnaire) (2); badges (2); interest (2); other (e.g., interaction on wikis, face recognition) (2).

Factors that affect the student to watch a video lecture: video length (15); way in which instructors pose and express themselves in videos (8); format in which the content is presented (8); closeness to an exam date (7); video production and editing (7); interactivity and the use of images to present the content (5); video content (3); being able to watch a video on a mobile device (1).

Students’ interaction and performance: students’ access to materials (14); students’ participation in forum (7); assignment completion (5); video access (3); this information is not available to the instructor (2).
offer large-scale online courses. The former has around 500
courses available in several topics related to Computing. The
latter has on average 20,000 students per semester and offers
distance learning graduate courses, as well as training on
demand, mostly in the Healthcare area.
Table I compiles the interview results, relating each
instructor⁴ to the topics he/she discussed. Most instructors (except
I06, I07, I08, and I09) teach in STEM⁵ courses. In particular,
I03 teaches engineering courses, and I02, I04, I05, I10, I11,
I12, I13, I15 and I18 teach computing courses. They all had
experience with Distance Learning (DL): nine had taught
exclusively distance learning courses, one taught a blended
learning course, and eight taught both types of courses. In to-
tal, the instructors mentioned having worked with 10 different
VLEs, and Moodle was the most often cited one (by 15 of the
18 instructors).
I01 reported that he had been using VLEs in face-to-
face teaching and he noticed that the students’ performance
improved, and I06 remarked that the students participate more
in online courses.
All instructors stated they make use of some communication
tool with the students. Most of them use the VLE to assign
activities (e.g., quiz, exercises) and to post educational content
and news. I04 uses gamification techniques in the VLE and
noticed an improvement in the class. According to him, stu-
dents like to earn badges, participate in rankings, and compete
with one another. In line with this statement, I10 said that
⁴In this paper, instructors are identified in the format I99.
⁵Acronym for Science, Technology, Engineering, and Math.
competitive students tend to perform better.
We asked how they analyze the students’ interaction with
the content. Although only I17 reported using a tool
(i.e., a dashboard) for monitoring student access in the VLE,
11 interviewees reported that they examine the access logs,
9 analyze the logs of students’ interactions with the materials
available in the VLE, and 6 do not analyze student interactions.
In particular, I05 said that he had used a dashboard to oversee
students and another tool designed for students to highlight
text, from which he gathered information to support his
pedagogical decisions.
Not all students behave the same way. I04 said older and
younger students have distinct interaction patterns. Further-
more, some instructors (I03, I04, I05, I09, and I11) reported
that there are students who only turn in their assignments,
without any other interaction. According to I09, this “virtual
silence” is challenging, whereas I03 and I05 reported that in
these cases the only feasible analysis is through the assign-
ments and exam results. Regarding students who achieve a
good grade despite having very little interaction with the VLE,
some instructors provided the following interpretations: (i) not
every student likes to interact with instructors (I03, I04, and
I11) and (ii) there are students who do not interact because
they already know the content.
We asked instructors which kinds of information they find
meaningful. Besides interaction data analysis, 16 instructors
in this study believe that identifying the student background
is important. They mentioned, for instance, the importance of
knowing how much students know about the course material
and their individual background knowledge. They state that
such information helps the overall planning of the course
and to match the difficulty level of the content. They also
mentioned their interest in student performance, learning,
expectations, and course feedback.
When asked about how they evaluate student learning and
motivation, 15 instructors answered that they use exams and
assignments to assess student learning, and 13 instructors
estimate students’ motivation by observing their communication
in the VLE, such as the use of the forum and chats, and
the students’ contributions and questions. 7 said they also
evaluated motivation by observing which students completed
assignments and which ones were accessing the VLE. I07
believes motivation to be one of the main factors influencing
student learning. I06 stated that the instructor has to make
dynamic lectures and incentivize students in order to engage
and motivate them. For instance, I04 said that he motivates
the students using gamification techniques. Nevertheless, I03
and I09 said that it is hard to evaluate student motivation in
DL. In face-to-face teaching, this evaluation is made easier
by analyzing facial expressions and voice intonation. In turn,
I02, I03 and I05 claim that, in DL, the signals emitted by
students are their interaction with the materials and forum. In
line with them, I06 reported that student motivation cannot
be quantified. In contrast, I16 said face-to-face teaching is
more difficult because the student is a black-box, whereas a
VLE provides more student information (e.g., participation and
contributions; performance; engagement).
We asked the instructors what student behavior patterns
they would like to identify. They reported access patterns,
interest (inferred, for instance, from students’ questions [I18]
and from watching video segments more than once [I09]);
forum usage; interaction with the content, with the instructor,
and with one another; and participation, among others. As only
one instructor claimed to use a tool to analyze the student
access logs, most instructors said that they do not have any
analytic tool or information besides the student access logs,
which makes it impractical to analyze the students’ interaction
with the VLE in classes with a large number of students.
A total of 12 instructors pointed out the need for tools to
analyze students access and interaction logs with the material
and other VLE resources, such as the forum, video player,
quiz, and e-books.⁶ The instructors believe that systems that
satisfy those needs would allow them to identify student interaction
patterns and predict student performance and drop out.
The analysis suggestions made by the instructors were: (i) to
analyze emoticons to identify students’ mood, (ii) to identify
students background through students’ behavior and reaction,
(iii) to adapt content for students using decision trees, (iv) to
correlate student access to materials and drop out, and (v) to
correlate student navigation on video and their performance.
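Suggestions (iv) and (v) boil down to correlating two per-student measures derived from the logs. A minimal sketch of such an analysis, using a plain-Python Pearson correlation over hypothetical (not instructor-provided) data:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-student measures extracted from VLE logs:
# number of accesses to course materials and final grade (0-10).
accesses = [3, 10, 25, 8, 40, 2, 18, 30]
grades = [4.0, 6.5, 8.0, 5.5, 9.0, 3.0, 7.0, 8.5]

r = pearson(accesses, grades)  # close to +1: access tracks performance here
```

The same routine would apply to the drop-out suggestion by replacing grades with a completion indicator per student.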
In addition, I12 and I16 mentioned that these pieces of
information should be presented in a dashboard. According to
I16, in face-to-face teaching, instructors are very often over-
whelmed with preparing their classes, assessing coursework
⁶http://dx.doi.org/10.6084/m9.figshare.8285618
and evaluating students. The usage of VLEs has mitigated
this problem and, by using a dashboard that provides student
behavior information, they could spend more of their time
working on teaching methods and materials. The instructors
also said that a dashboard could help them in several ways:
(i) to produce educational material that takes into consideration
the students’ background and performance, (ii) to know what
materials to upload in the VLE, (iii) to understand the reason
for students to drop out, (iv) to compare the performance of
students in his/her class, and (v) to make pedagogical decisions
to enhance the students’ performance and reduce drop out.
In line with this, I18 stated that, without such a tool, it is
unfeasible to evaluate students in an online course, whose
number of enrollments can be huge.
Out of the 18 instructors, only I11 had never created
online educational content, because he only tutored students
and had a content manager to create the course materials.
Conversely, 14 instructors reported having experience in the
authoring of a video lecture. Some of them emphasized the
value of using video in education. They reported that video
is the content format students prefer, as it allows
them to watch (part of) a class more than once, and that
the video lecture is often the gateway to knowledge. I11 and
I12 also remarked that video lectures improved both student
understanding and performance. Moreover, I01 and I11 noticed
that students are less engaged in videos using just slides or
captured from a live classroom lecture. By contrast, I01, I16,
and I18 stated that the video format in Khan Academy and
Talking Head engaged students the most.
When asked about student interaction with videos, most
instructors (17) said they would like to know the number of
views of particular parts of the videos, and which segments
were skipped or re-watched by the students. According to
most of them, this information could provide insights to the
instructor about what the students found relevant and about
which segments the students had more doubts. In addition,
some instructors stated their needs explicitly: (i) to know
whether the students are watching and understanding the
video, (ii) to add markers to the video player showing students
where each content topic begins and ends, thereby enabling
them to find the content more efficiently, and (iii) to identify
(parts of) videos the students liked.
The instructors cited the following factors that influence a
student to watch a video: (i) the closeness of an exam date,
(ii) the quality of video production and editing, (iii) the format
in which the content is presented, and (iv) the way in which
the instructors pose and express themselves in videos. For
instance, I04 and I12 noticed that students are less engaged in
videos where instructors speak slowly. The video length was
the factor most often mentioned (by 15 instructors). According
to them, students do not usually access long videos and, when
they do, they do not watch the whole video. However, there
was no consensus on the ideal length of a video lecture.
Responses ranged from 5 to 30 minutes.
Another point noticed by the instructors is that students
are less engaged with videos that have more theoretical
content. I01 and I06 said students are more engaged in math
video lectures presenting exercise solutions. I06 also remarked
that the students provided feedback through questionnaires,
reporting that they do not like video lectures presenting only
the instructor speaking, without demonstrations and images
related to the content. Furthermore, the instructors highlighted
that (i) what matters most is the content presented, (ii) a highly
produced video lecture requires a justification, (iii) watching
a video lecture does not ensure that the student has learned the
content, and (iv) no single video format works well for all
students.
Regarding the relationship between the students’ usage of
the VLE and their performance in the course, 14 instructors
reported that students’ access to materials is related to their
grades, 7 said that the students who participate more (e.g.,
asking questions, using the forum, chatting) perform better,
and 5 use assignment completion as a cue that the student
will perform well. I01 said that showing the solution after
the students have answered questions negatively affects their
performance. In addition, I03, I04, and I05 stated that each
student has his/her own study style and, according to I07, the
instructors should provide the content using more than one
format (e.g., text, video). I05 said students who already know
the content prefer text material instead of other media format,
and I02 reported that lecture notes are one of the contents that
the students like the most.
We also asked instructors whether they had identified any
relationship between student interaction and drop out rate in
VLEs. Half of them reported not having identified any rela-
tionship, and 7 perceived a correlation between student access
and course completion. Another 10 answers were provided
for this question, but none was mentioned by more than 2 in-
structors. These answers relate student drop out to: (i) affinity
with the content, (ii) accumulation of homework assignments,
(iii) problems with network bandwidth limitations, (iv) student
time available, (v) students with problems of interaction with
another student or group of students (aka peer interaction),
(vi) problems in the usage of VLE, (vii) students interested in
part of the content, and (viii) access in only the first weeks
(i.e., drop outs do not usually occur later in the course). I07,
I10, I11, and I13 reported problems with network bandwidth
limitations, which may complicate such analysis. For instance,
I10 and I11 said that, in general, the students tend to download
the course materials instead of accessing them online.
III. SYSTEMATIC MAPPING OF EDM AND LA
The systematic mapping conducted in this paper was based
on the method described by Kitchenham and Charters [6],
following a well-defined protocol to highlight the main prob-
lems, objectives, methods, case studies, and results presented
in the gathered papers. The first step was to define the research
questions to guide the mapping:
• Which results relate VLE log analysis to student
behavior and performance?
• Which tools (e.g., dashboards) do instructors use
to analyze logs of student interactions with VLEs?
Next, we defined a search string stemming from the com-
bination of keywords related to the research questions: (ed-
ucation OR course OR MOOC OR e-learning OR teaching
OR virtual learning environments OR virtual training envi-
ronments OR learning management systems OR LMS) AND
(engagement OR behavior OR behaviour) AND (analysis
OR analyses OR analytics OR analytic OR visualisation OR
visualization OR data mining OR learning analytics).
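This search string is a conjunction of three OR-groups. As a rough sketch of how such a filter behaves, the same AND-of-ORs logic can be expressed over a paper's title and abstract (simple lowercase substring matching, not the digital libraries' actual query engines):

```python
# Each inner list is an OR-group; a paper must match every group (AND).
GROUPS = [
    ["education", "course", "mooc", "e-learning", "teaching",
     "virtual learning environments", "virtual training environments",
     "learning management systems", "lms"],
    ["engagement", "behavior", "behaviour"],
    ["analysis", "analyses", "analytics", "analytic", "visualisation",
     "visualization", "data mining", "learning analytics"],
]

def matches(text: str) -> bool:
    """True if the text satisfies the AND-of-ORs search string."""
    t = text.lower()
    return all(any(kw in t for kw in group) for group in GROUPS)

# Hypothetical titles:
print(matches("Learning analytics of student behavior in MOOC forums"))  # True
print(matches("A survey of compiler optimizations"))                      # False
```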
We used the advanced search systems of three digital
libraries: ACM⁷, Elsevier⁸, and IEEE⁹. We set filters in all
libraries to return only papers published after 2009, in PDF
format, and written in English or Portuguese. As Figure 1
shows, this procedure followed four steps: (1) the search on
the digital libraries returned 2,174 papers, (2) we removed
duplicate papers, leaving 1,835 papers, (3) we analyzed titles
and abstracts of each paper using the inclusion and exclusion
criteria, resulting in 320 papers, and (4) we read the 320
papers in full, also applying the inclusion and exclusion
criteria. Finally, 136 papers were selected for this mapping.
We highlight that the whole procedure was performed by only
one person.
Fig. 1. Study selection process: Step 1, search on digital libraries (N = 2,174); Step 2, removal of duplicate papers (N = 1,835); Step 3, analysis of titles and abstracts (N = 320); Step 4, full-text reading (N = 136).
The inclusion and exclusion criteria were outlined to filter
irrelevant papers that, despite including the defined keywords,
do not present results to answer the research questions. The
inclusion criteria were: (i) papers that present results, method-
ologies or case studies related to data analysis (e.g., logs) to
measure the student performance, motivation, participation, or
drop out in VLEs, (ii) papers that present ways to detect
students’ behavior patterns in VLEs, (iii) papers that show
results, methodologies, or case studies to view logs in VLEs,
and (iv) papers that evaluate students’ interaction problems
in VLEs. The exclusion criteria were: (i) call for papers
or keynotes, (ii) papers focused on face-to-face teaching or
course recommendation, (iii) papers aiming to improve the
accessibility for people with special needs, (iv) papers that
present results related to emoticons analysis, (v) papers that
analyze data only from questionnaires, and (vi) papers without
results.
Table II shows the paper distribution by year and digital
library. It is worth noting that, although only one paper
appears in 2019, this is expected because the search was
⁷http://dl.acm.org/advsearch.cfm
⁸http://sciencedirect.com/science/search
⁹http://ieeexplore.ieee.org/search/advsearch.jsp
performed on 25 February 2019. The sources with the most
papers (i.e., at least 5 papers) were: Conference on Learning
Analytics & Knowledge (LAK), Conference on Learning @
Scale (L@S), Computers in Human Behavior, Frontiers in
Education Conference (FIE), and Conference on Technological
Ecosystems for Enhancing Multiculturality (TEEM).
TABLE II
PAPER DISTRIBUTION BY YEAR AND DIGITAL LIBRARY.

Library    2010  2011  2012  2013  2014  2015  2016  2017  2018  2019*  Total
ACM           -     2     3     6    14    13    16    21    23      1     99
Elsevier      1     -     -     3     2     6     2     3     2      0     19
IEEE          1     -     -     -     3     2     2     8     2      0     18
Total         2     2     3     9    19    21    20    32    27      1    136
We classified the papers according to problems, objectives,
methods, case studies, and results.¹⁰ We also present the
type of course analyzed by each paper: 38 papers analyzed
theoretical courses, 29 practical, and 32 both. However, 37
papers do not describe the type of course analyzed. Most
papers focus on student behavior or performance, aiming at
identifying student behavior or performance patterns. Most of
these papers (101 out of 136) focused on STEM courses as
case studies (e.g., Computing and Engineering courses), and
16 papers had miscellaneous areas as case studies, without
specifying them.
We also classified papers according to the methods they
adopted. We found 63 different methods. Most papers used
methods based on clustering (e.g., K-means) and prediction
(e.g., Logistic Regression, Decision Tree). We found 18 papers
that applied methods using metrics defined by their own
authors, relating the students’ interaction with communication
tools (e.g., forum, chat) to their grades, and 20 papers
relating students’ interaction or performance with data from
a qualitative questionnaire.
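As an illustration of the prediction side of these methods, the sketch below fits a logistic regression by plain gradient descent on hypothetical log-derived features (weekly VLE logins and forum posts) to predict a pass/fail label; the mapped papers use richer feature sets and established libraries:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Logistic regression via per-sample gradient descent.

    Returns weights [bias, w1, w2, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = sigmoid(z) - yi
            w[0] -= lr * err
            for j, xj in enumerate(xi):
                w[j + 1] -= lr * err * xj
    return w

def predict(w, xi):
    """Probability that the student passes, given features xi."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))

# Hypothetical features per student: (weekly VLE logins, forum posts);
# label 1 = passed the course. Real studies derive many more features.
X = [(1, 0), (2, 1), (8, 5), (9, 4), (3, 0), (10, 6), (2, 0), (7, 3)]
y = [0, 0, 1, 1, 0, 1, 0, 1]
w = fit_logistic(X, y)
```

A decision tree, the other frequently used predictor, would partition the same feature space with threshold tests instead of a weighted sum.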
We categorized 137 different results into 14 topics. We noticed
that 53 papers show that it is possible to cluster students based
on their access and interaction patterns.
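The clustering result above can be sketched with a tiny K-means over hypothetical two-feature student profiles (accesses per week, forum posts); again, the papers use richer features and library implementations:

```python
def kmeans(points, k, iters=50):
    """Plain K-means; initializes centroids from the first k points."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for i, p in enumerate(points) if assign[i] == c]
            if members:
                centroids[c] = [sum(d) / len(members) for d in zip(*members)]
    return assign, centroids

# Hypothetical (accesses per week, forum posts) per student: a low-activity
# group and a high-activity group.
profiles = [(2, 0), (3, 1), (1, 0), (20, 8), (25, 10), (22, 9)]
labels, _ = kmeans(profiles, k=2)
```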
Regarding the students’ performance and interaction with
forums, we found: (i) models to predict student performance;
(ii) correlations between student access, assessment completion,
participation, and performance; and (iii) evidence that
(a) materials available in distance courses are ignored even by
students with good performance, and (b) student groups that use
forums more tend to have good performance. Papers addressing
tools yielded the following main result categories (with at
least 5 papers each): (i) analytics in learning systems used to
provide both auditing of and interventions in student learning;
(ii) tools to aid instructors in analyzing student behavior;
(iii) proposals of tools that use logs from eLearning systems
for instructors to monitor student behavior, motivation, or
performance; and (iv) tools to aid students in analyzing their
own performance.
¹⁰Available at: http://dx.doi.org/10.6084/m9.figshare.8285657
Most papers related to the Course completion topic found
that: (i) MOOCs typically have lower completion rates; (ii) as-
signment completion can be used as a predictor of student
course completion; and (iii) students that initiate threads in
forums tend to complete the course. The term engagement is
used in 23 papers, most of which measure engagement as:
(i) assignments posted on the VLE; (ii) materials accessed
on the VLE; and (iii) total hits, readings, and postings on
communication tools (e.g., forum). Some papers also related
engagement to how long students spend (i) watching each video
and (ii) taking notes. We found 23 result categories related to
the Video topic. However, the only category including more
than 5 papers is the one stating that students often do not
watch the entire video. The others include just 1 or 2 papers.
In regard to the Drop out topic, all papers present proposals
to predict student drop out, but only 4 papers show models to
predict drop out through data analysis of student interaction
logs. Another 6 result categories are assigned to this topic, but
none of them has more than 2 papers.
Papers assigned to the Students with self-regulated learning
topic state that students with negative self-regulation have poor
academic performance. In addition, 3 papers claim that having
self-regulation does not necessarily imply a good performance.
The Personality topic includes 2 papers showing that the
students’ personality can be identified through the interaction
logs. We also identified the Attending topic, which includes papers
presenting results on detecting students’ attendance in DL
and correlations between attendance and demographic data
(e.g., country of origin and education level). The 2 papers
assigned to the Click activity topic claim that increases and
decreases in the click count are related to the probability of passing
and failing a course, respectively. The Students’ intention
topic includes 2 papers relating the students’ intention in the
course with their behavior and performance. In those papers,
the students’ intention was assessed through questionnaires.
Lastly, the Other topic presents 5 categories, but only one of
them has more than 5 papers: it states that access to online
environment resources increases in periods close to exams or
assignment deadlines.
IV. TRIANGULATION OF RESULTS
In this section, we present the main relations between the
results of the interviews and of the systematic mapping. We
relate the instructors’ statements with the results of the selected
papers. The results mentioned in both studies (i.e., by the
interviewed instructors and in the literature) are listed in Table
III. Due to page constraints, we present up to three references
for each topic discussed hereinafter.
I04 noted that older and younger students have distinct
interactions with VLEs, in line with the results presented in
[7], which shows that older students tend to have good
performance and perform backjump actions on video players more
frequently than younger students. Papers [7]–[9] show that
older students participate more in forums. I03, I04, I05, I09,
and I11 stated that there are students who do not interact either
with the instructor or in forums; they just access the materials
TABLE III
TRIANGULATION RESULTS (N INSTRUCTORS / N PAPERS / TOTAL).

Performance & Course Completion:
- Assignment completion is a cue that the student will achieve a good performance (6 / 29 / 35)

Performance & Forum:
- Students who use the forum more tend to achieve good grades (7 / 20 / 27)
- There are students who do not interact either with the instructor or in forums, yet access the materials and achieve a good grade (5 / 10 / 15)

Video:
- Students engage less with long video lectures (15 / 5 / 20)
- The format in which a video lecture is presented influences the student to watch it (8 / 2 / 10)
- Video editing and production affect the student who watches it (7 / 1 / 8)

Engagement & Forum:
- Students’ interest can be identified by their interactions on the forum (4 / 18 / 22)

Others:
- Access increases close to exam dates or assignment deadlines (7 / 10 / 17)
- Each student has his/her own study style (3 / 9 / 12)
- Older and younger students have distinct interaction patterns (1 / 3 / 4)

Performance:
- There is a positive correlation between student access to the materials and performance (14 / 2 / 16)
- There is a positive relation between video access and student performance (3 / 3 / 6)
- Gamification improves student performance (1 / 2 / 3)

Course Completion:
- There is a positive correlation between student access and course completion (7 / 3 / 10)

Engagement & Video:
- Students are less engaged in videos with more theoretical content (4 / 2 / 6)
- Students are less engaged in videos where instructors speak slowly (2 / 1 / 3)
- Students are more engaged in videos with the Khan Academy format (2 / 1 / 3)
- Students are more engaged in videos with the Talking Head format (2 / 1 / 3)
- Students are less engaged in videos captured from a live classroom lecture (1 / 1 / 2)

Engagement:
- The instructor has to encourage students to keep them motivated (1 / 2 / 3)
- Motivation is one of the main supporters of student learning (1 / 1 / 2)
and achieve a good grade. In line with this statement, [10]–[12]
apply clustering techniques to VLE data and identify a similar
student profile (aka lurkers), among others. To motivate students,
I04 makes use of gamification techniques (e.g., badges
and rankings) and has noticed improvements in the class. In
line with this result, [13], [14] identified improvements in
student engagement when comparing distance courses that
use badges with those that do not.
I04, I05, I09, I10, I11, I16, and I17 noticed that students
using the forum more tend to achieve good grades. In addition,
I02, I03, I05, and I09 identify the students’ interest by their
interactions on the forum. In line with these answers, we found
the following results: (i) student groups that use forums more
tend to have good performance [13], [15], [16]; (ii) student
groups that post more replies in forums tend to complete the
course [17] and have a good performance [10], [15], [16];
(iii) student groups that initiate threads in forums tend to com-
plete the course [17] and have a good performance [10], [15],
[16]; (iv) there is a positive correlation between the number
of questions students asked the instructor and their final grade
[18]; (v) student groups that have more posts are more likely
to complete the course [17], [19], [20]; (vi) engagement with
the online environment can be measured by total hits, readings,
and postings [9], [21], [22]; (vii) student groups that complete
more assignments tend to use more forums [8]; (viii) forum
usage can be used as a predictor of students completing the
course [12], [17], [19]; and (ix) comments can be used as a
predictor of student performance [20], [23]. An increase in access
to VLE resources and materials (e.g., videos) close to exam dates or
assignment deadlines was noted by I01, I02, I03, I04, I06, I12, and
I13, as well as in [21], [24], [25].
Complementing that, [22], [24] show an increase in forum
posts close to deadlines.
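Findings (i)–(ix) above treat forum-activity counts as predictive features. A minimal, self-contained sketch of this idea, with made-up data and feature names (not the method or data of any specific cited paper), trains a small logistic regression to predict course completion:

```python
# Sketch (hypothetical data): forum-activity counts as features to predict
# course completion. Minimal logistic regression via SGD, stdlib only.
import math

def train_logreg(X, y, lr=0.1, epochs=1000):
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))   # predicted completion probability
            g = p - yi                        # gradient of the log-loss w.r.t. z
            b -= lr * g
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
    return w, b

def predict(w, b, x):
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

# hypothetical features per student: (forum posts, threads started); y: 1 = completed
X = [[0, 0], [1, 0], [2, 1], [8, 2], [12, 3], [15, 5]]
y = [0, 0, 0, 1, 1, 1]
w, b = train_logreg(X, y)
```

After training on this toy data, students with high forum activity receive a high completion probability and inactive students a low one.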
According to I01, I02, I10, I11, I14, I15, as well as
[17], [26], [27], assignment completion is a cue that the
student will achieve good performance, and it can also be
used as a predictor of course completion. For instance, [28]
presents positive correlations between productive activity, assignment
completion, and pass rates. I06, I10, I11, I12, I13, I15, I17,
[20], [25], [27] also noticed that there is a positive correlation
between student access and course completion. In line with
this, [13], [26], [27] show that successful students are more
frequently engaged with online assignments and participate
regularly. Additionally, almost all instructors (except I06, I08,
I10, and I18) stated that there is a positive correlation between
student access to the materials and performance. In line with
them, [27], [29] found that viewing the course materials and
the students’ previous performance contribute to predicting
grades.
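The correlational findings reported here can be checked directly on aggregated logs. An illustrative sketch with made-up numbers (not data from any cited study), computing Pearson's r between material accesses and final grades:

```python
# Sketch (made-up numbers): testing for a positive correlation between
# material-access counts and final grades using Pearson's r, stdlib only.
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

accesses = [3, 5, 8, 12, 15, 20]          # material views per student (hypothetical)
grades   = [4.0, 5.5, 6.0, 7.5, 8.0, 9.5]  # final grades (hypothetical)
r = pearson_r(accesses, grades)            # close to +1 for this toy data
```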
I07 claims that motivation is one of the main supporters of
student learning. In line with this, [30] stated that online learning
requires even more learner motivation and self-direction
than traditional classroom-based instruction. According to I06,
the instructor has to encourage students in order to keep them
motivated, and [31] showed that an active participation of the
teaching staff in the forum is associated with a higher post
volume. This issue was also explored in [32], finding that
instructor participation (e.g., posts, activity) leads to student
engagement (e.g., module, wiki, blog, form, forum). Moreover,
I03, I04, and I05, as well as [20], [25], [33], report that
each student has his/her own study style. For instance, [33]
identified different clusters of students based on differences
in their regulation strategies. They also report that the use of
the same learning resources to the same extent may have a
different impact on different groups of students.
In regard to video, most interviewees (except I02, I08, and
I13) highlighted video length as one of the main factors
influencing whether a student watches a video lecture. In line
with this, [34], [35] stated that short videos promote more
engagement. Furthermore, [35] found a correlation between
audience retention and video length. Another meaningful result
was identified in [19], [27], [30], which found a positive
correlation between the amount of time students watch a video
and learning results. Besides, I04 and I12, as well as [34], note
that students generally engage less with videos where instructors
speak slowly.
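Audience-retention figures such as those discussed in [34], [35] can be derived from raw watch logs. An illustrative sketch, assuming a hypothetical (video_id, video_length_s, seconds_watched) log schema:

```python
# Sketch (hypothetical log schema): per-video audience retention, i.e., the
# mean fraction of the video each viewer watched, to relate retention to
# video length. Stdlib only.
from collections import defaultdict

logs = [
    ("intro_3min",    180,  175), ("intro_3min",    180, 160), ("intro_3min",   180, 180),
    ("lecture_40min", 2400, 600), ("lecture_40min", 2400, 900), ("lecture_40min", 2400, 400),
]

totals = defaultdict(lambda: [0.0, 0])       # video_id -> [sum of fractions, views]
for video_id, length, watched in logs:
    acc = totals[video_id]
    acc[0] += min(watched / length, 1.0)     # cap at 100% (ignores re-watches)
    acc[1] += 1

retention = {vid: s / n for vid, (s, n) in totals.items()}
```

On this toy data the short video retains most of its audience while the long lecture does not, mirroring the short-video findings of [34], [35].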
I04, I06, and I16, as well as [19], [30], stated that there
is a positive relation between video access and student per-
formance. However, there are clues that student retention on
videos is related to the video authoring. According to I01, I05,
I06, I10, I12, I17, and I18, video editing and production affect
the student watching it. I01 believes that videos using slides do
not achieve a good result. I01, I06, I09, and I14 have noticed
that there is little student engagement with videos with more
theoretical content, without solving exercises. In addition, [34]
found that students re-watch tutorials more frequently than
lectures. Moreover, [35] stated that theoretical videos perform
worse at holding students’ attention than videos with code
demonstrations, and that videos with coding walkthroughs tend
to have higher engagement than active coding sections.
Lastly, I01, I06, I07, I08, I09, I11, I17, and I18 reported
that the video lecture format affects the student watching it.
I01 and [34] report that students generally engaged less with
video captured from a live classroom lecture, and engaged more
when the video lecture was pre-produced. [34] also revealed that
students engaged more with videos filmed informally, with the
instructor sitting at his/her office desk, than with videos filmed
in a multi-million-dollar TV production studio.
In line with this result, I03 believes that a lavish production of
a video lecture requires some justification. Additionally, I01,
I16, and I18, supported by [34], said that Khan Academy and
Talking Head video formats perform better in terms of student
engagement. By contrast, [35] claimed that video lectures
using slides with audio in the background perform worst.
V. FINAL CONSIDERATIONS
This paper reported an analysis of the answers to our
research questions obtained through interviews with course
instructors who use VLEs. Some instructors said they would
like to compare performance and dropout with student interaction
in the VLE. Normally, instructors obtain information by observing
what students say and do on VLEs. Some challenges hinder this
analytical process, because most instructors are neither statistics
experts nor trained to extract key information from VLEs.
Therefore, tool support is called for. The instructors suggested
tools to analyze student
interaction logs in the forums, to capture how students react to
course content, and to detect patterns of student navigation in
the VLE materials and resources. They also want to identify
the relationship between student access and the dropout rate.
Some instructors emphasized the importance of visualizing
these data through dashboards presenting, for example, a
weekly summary of the class and of each student (e.g.,
who accessed the VLE, who participated in the forum or chat,
who submitted the assignment). For instance, I10 and I12 highlight
the difficulty of doing such an analysis in the platform they
use, because, even using the available filters, the logs are
presented in a confusing way.
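The weekly summary the instructors asked for can be computed from raw event logs. An illustrative sketch over a hypothetical (student, week, event) log schema, with made-up student names:

```python
# Sketch (hypothetical schema): per-student weekly summary flags for a
# dashboard: did the student access the VLE, post in the forum, and submit
# the assignment that week? Stdlib only.
from collections import defaultdict

events = [
    ("ana",   1, "login"), ("ana",   1, "forum_post"), ("ana",   1, "submit"),
    ("bruno", 1, "login"),
    ("ana",   2, "login"), ("bruno", 2, "login"),      ("bruno", 2, "submit"),
]

summary = defaultdict(lambda: {"accessed": False, "forum": False, "submitted": False})
for student, week, event in events:
    row = summary[(student, week)]
    row["accessed"] = row["accessed"] or event == "login"
    row["forum"] = row["forum"] or event == "forum_post"
    row["submitted"] = row["submitted"] or event == "submit"
```

Each (student, week) pair then maps to the three boolean flags a dashboard cell would display.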
We also presented the results of a systematic mapping
of EDM and LA. A total of 136 papers were selected and
categorized according to problems, objectives, methods, case
studies, and results; 137 results related to student log analysis
and tools were categorized. The mapping shares limitations
with similar mappings in the literature: some important work
may not have been included, such as theses and dissertations,
books, and even some papers, which may not have been found
in the digital libraries using the search and selection protocol.
In order to overcome this problem, we adopted a snowballing
approach: we analyzed all papers cited by or that cite one of
the 136 papers. Applying the same inclusion and exclusion
criteria to the titles and abstracts, we gathered 245 additional
papers, whose full text reading we leave for future work.
Although the number of instructors is small and their
answers are anecdotal, the interviews showed close corre-
spondence between their statements and the paper results. It
was not our goal to achieve statistical significance, but even
with the small number of interviewees we uncovered existing
gaps in the literature. We found in the mapping 30 tools
to support instructors in analyzing logs. Because Moodle is
the VLE most cited by the instructors, we examined the
documentation¹¹ of version 3.6, finding 37 additional tools
that provide information based on student logs. In Table IV
we summarize the instructors’ needs and the characteristics
of existing tools (with at least three mentions each, either in
the interviews or in the literature). We note that most of those
needs are satisfied by one tool or another. However, none of the
existing tools fulfills all requirements raised by the instructors.
From what we have learned, we show in Table V some design
guidelines for student log dashboards. These guidelines are
sorted by the number of papers (3 at a minimum) that refer to
them. Obviously, the requirements and guidelines lists are not
exhaustive, and further research is called for. We conducted
another study to identify how much the instructors take into
account the requirements and guidelines uncovered in this
research, as well as their visualization preferences [36].
Some instructors’ statements (mentioned by at least 2 in-
structors) were not addressed in any of the papers found: (i) the
way teachers pose and express themselves in the video lectures
affects the student watching it; (ii) videos need to show images
related to the content to engage the student watching it;
(iii) video content affects the student watching it; (iv) videos
using slides are not effective; (v) video lectures improve both
student understanding and performance; (vi) student drop out
occurs only in the first few weeks; (vii) students without
affinity with the content tend to drop out of the course; and
(viii) the course needs to match the students’ learning styles.
These statements can ground and guide future research.
Although the instructors had defined requirements to support
their decision making, we have not found in the literature any
evidence relating the use of analytic tools that support such
requirements to improvements in student
performance.

¹¹ https://docs.moodle.org

TABLE IV
REQUIREMENTS GATHERED WITH INSTRUCTORS AND IMPLEMENTED BY TOOLS DESCRIBED IN FOUND PAPERS AND MOODLE DOCUMENTATION.

Requirement description                                              N instructors  N tools  Total
Statistics of interactions on video (e.g., access, re-watch, seek)        17           1      18
Identify student interest patterns on the course                           6           9      15
Identify student access patterns (e.g., login, materials)                  9           5      14
Identify student performance patterns                                      5           8      13
Predict student performance                                                2          10      12
Identify self-regulated students                                           2           8      10
Provide a course progress bar to students                                  1           9      10
Identify student usage patterns on the forum                               5           3       9
Identify student drop out patterns                                         2           7       9
Identify student interaction patterns (e.g., materials)                    4           3       7
Identify student participation patterns on the course                      4           3       7
Know whether the student has understood the video                          3           4       7
Identify student navigation patterns on the VLE                            2           2       4
Capture students’ reactions to materials                                   3           1       4
Know which videos (or video segments) the students have liked              2           2       4
Know if another material was accessed with the video                       1           3       4
Know if the student is watching the video                                  2           1       3
Know in which video segments the students have difficulty                  1           2       3
Resources for students to evaluate the materials                           1           2       3
Provide achievements to engage the students                                1           2       3

TABLE V
GUIDELINES FOR STUDENT LOG DASHBOARDS.

Guidelines description                                                          N
Identify behavior patterns by students’ access and interaction                 53
Identify successful students by access and assessments                         23
Identify performance by student groups that use more forums                    14
Predict the students’ performance from their interaction                       12
Identify student engagement by materials accessed                              10
Identify student engagement by assessments done                                10
Predict course completion by students that complete assessments                10
Identify increase of resource access as deadlines approach                      9
Predict drop out from students’ interaction                                     6
Identify student engagement by forum interactions                               6
Identify course completion by students that have more posts                     5
Identify performance by student groups that initiate more threads in forums     4
Predict course completion from students that use the forum                      4
Identify performance by student groups that post more replies in forums         3
Identify personality by students’ interaction logs                              3
Identify student engagement by how long students watch videos                   3
Identify self-regulated students                                                3

We also noticed a gap in regard to analyzing instructor
behavior in VLEs. All papers we found analyze
only student behavior. As future work, we intend to analyze
VLE logs from online courses offered in Brazil and compare
them with the instructors’ statements and literature results
about student behavior and performance.
In addition, we plan to develop a dashboard using visual
analytics techniques, taking into account the requirements and
guidelines described in this paper. To evaluate the dashboard,
we will assess whether students’ performance changes when
instructors can see information about their behavior and
performance, and act accordingly.
ACKNOWLEDGMENTS
We thank CAPES, Coordenação de Aperfeiçoamento
de Pessoal de Nível Superior, and CNPq (processes
#309828/2015-5 and #311316/2018-2) for the partial financial
support to this work.
REFERENCES
[1] C. Romero and S. Ventura, “Educational Data Mining: A Review of the
State of the Art,” IEEE Transactions on Systems, Man, and Cybernetics,
Part C (Applications and Reviews), vol. 40, no. 6, pp. 601–618, Nov
2010. [Online]. Available: http://ieeexplore.ieee.org/document/5524021/
[2] A. Dutt, M. A. Ismail, and T. Herawan, “A Systematic Review on
Educational Data Mining,” IEEE Access, vol. 5, pp. 15991–16005,
2017. [Online]. Available: http://ieeexplore.ieee.org/document/7820050/
[3] K. S. Na and Z. Tasir, “A Systematic Review of Learning Analytics
Intervention Contributing to Student Success in Online Learning,” in
2017 International Conference on Learning and Teaching in Computing
and Engineering (LaTICE). IEEE, apr 2017, pp. 62–68. [Online].
Available: http://ieeexplore.ieee.org/document/8064433/
[4] S. Sergis and D. G. Sampson, “Teaching and Learning Analytics to
Support Teacher Inquiry: A Systematic Literature Review,” in Learning
Analytics: Fundaments, Applications, and Trends: A View of the Current
State of the Art to Enhance e-Learning, Peña-Ayala, Alejandro, Ed.
Springer International Publishing, 2017, pp. 25–63. [Online]. Available:
http://link.springer.com/10.1007/978-3-319-52977-6_2
[5] C. Vieira, P. Parsons, and V. Byrd, “Visual learning analytics
of educational data: A systematic literature review and research
agenda,” Computers & Education, vol. 122, pp. 119–135, jul
2018. [Online]. Available: http://linkinghub.elsevier.com/retrieve/pii/
S0360131518300770
[6] B. Kitchenham and S. Charters, “Guidelines for performing systematic
literature reviews in software engineering,” 2007.
[7] P. J. Guo and K. Reinecke, “Demographic differences in how
students navigate through moocs,” in Proceedings of the First
ACM Conference on Learning @ Scale Conference, ser. L@S ’14.
New York, NY, USA: ACM, 2014, pp. 21–30. [Online]. Available:
http://doi.acm.org/10.1145/2556325.2566247
[8] R. F. Kizilcec, C. Piech, and E. Schneider, “Deconstructing
disengagement: Analyzing learner subpopulations in massive open
online courses,” in Proceedings of the Third International Conference
on Learning Analytics and Knowledge, ser. LAK ’13. New
York, NY, USA: ACM, 2013, pp. 170–179. [Online]. Available:
http://doi.acm.org/10.1145/2460296.2460330
[9] S. Ransdell, “Meaningful posts and online learning in Blackboard
across four cohorts of adult learners,” Computers in Human Behavior,
vol. 29, no. 6, pp. 2730–2732, nov 2013. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S0747563213002598
[10] Á. Hernández-García, I. González-González, A. I. Jiménez-Zarco, and
J. Chaparro-Peláez, “Applying social learning analytics to message
boards in online distance learning: A case study,” Computers in
Human Behavior, vol. 47, pp. 68–80, jun 2015. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S0747563214005615
[11] X. Wang, M. Wen, and C. P. Rosé, “Towards triggering higher-order
thinking behaviors in moocs,” in Proceedings of the Sixth International
thinking behaviors in moocs,” in Proceedings of the Sixth International
Conference on Learning Analytics & Knowledge, ser. LAK ’16.
New York, NY, USA: ACM, 2016, pp. 398–407. [Online]. Available:
http://doi.acm.org/10.1145/2883851.2883964
[12] A. S. Sunar, S. White, N. A. Abdullah, and H. C. Davis, “How Learners’
Interactions Sustain Engagement: A MOOC Case Study,” IEEE
Transactions on Learning Technologies, vol. 10, no. 4, pp. 475–487, oct
2017. [Online]. Available: http://ieeexplore.ieee.org/document/7762189/
[13] A. Anderson, D. Huttenlocher, J. Kleinberg, and J. Leskovec,
“Engaging with massive online courses,” in Proceedings of the 23rd
International Conference on World Wide Web, ser. WWW ’14. New
York, NY, USA: ACM, 2014, pp. 687–698. [Online]. Available:
http://doi.acm.org/10.1145/2566486.2568042
[14] D. Dicheva, K. Irwin, and C. Dichev, “OneUp: Engaging Students
in a Gamified Data Structures Course,” in Proceedings of the
50th ACM Technical Symposium on Computer Science Education
- SIGCSE ’19, ser. SIGCSE ’19. New York, New York,
USA: ACM Press, 2019, pp. 386–392. [Online]. Available:
http://dl.acm.org/citation.cfm?doid=3287324.3287480
[15] Y. Feng, D. Chen, Z. Zhao, H. Chen, and P. Xi, “The impact of students
and tas’ participation on students’ academic performance in mooc,”
in Proceedings of the 2015 IEEE/ACM International Conference on
Advances in Social Networks Analysis and Mining 2015, ser. ASONAM
’15. New York, NY, USA: ACM, 2015, pp. 1149–1154. [Online].
Available: http://doi.acm.org/10.1145/2808797.2809428
[16] A. S. Carter, C. D. Hundhausen, and O. Adesope, “Blending measures
of programming and social behavior into predictive models of student
achievement in early computing courses,” ACM Trans. Comput. Educ.,
vol. 17, no. 3, pp. 12:1–12:20, Aug. 2017. [Online]. Available:
http://doi.acm.org/10.1145/3120259
[17] J. M. L. Andres, R. S. Baker, D. Gašević, G. Siemens, S. A. Crossley,
and S. Joksimović, “Studying MOOC completion at scale using the
MOOC replication framework,” in Proceedings of the 8th International
Conference on Learning Analytics and Knowledge - LAK ’18, ser.
LAK ’18. New York, New York, USA: ACM Press, 2018, pp. 71–78.
[Online]. Available: http://doi.acm.org/10.1145/3170358.3170369
[18] W. He, “Examining students’ online interaction in a live video streaming
environment using data mining and text mining,” Computers in Human
Behavior, vol. 29, no. 1, pp. 90–102, jan 2013. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S0747563212002233
[19] J. Qiu, J. Tang, T. X. Liu, J. Gong, C. Zhang, Q. Zhang, and Y. Xue,
“Modeling and predicting learning behavior in moocs,” in Proceedings
of the Ninth ACM International Conference on Web Search and Data
Mining, ser. WSDM ’16. New York, NY, USA: ACM, 2016, pp. 93–
102. [Online]. Available: http://doi.acm.org/10.1145/2835776.2835842
[20] N. Bosch, R. W. Crues, G. M. Henricks, M. Perry, L. Angrave,
N. Shaik, S. Bhat, and C. J. Anderson, “Modeling Key Differences
in Underrepresented Students’ Interactions with an Online STEM
Course,” in Proceedings of the Technology, Mind, and Society
- TechMindSociety ’18, ser. TechMindSociety ’18. New York,
New York, USA: ACM Press, 2018, pp. 1–6. [Online]. Available:
http://doi.acm.org/10.1145/3183654.3183681
[21] T. Haig, K. Falkner, and N. Falkner, “Visualisation of learning
management system usage for detecting student behaviour patterns,”
in Proceedings of the Fifteenth Australasian Computing Education
Conference - Volume 136, ser. ACE ’13. Darlinghurst, Australia,
Australia: Australian Computer Society, Inc., 2013, pp. 107–115.
[Online]. Available: http://dl.acm.org/citation.cfm?id=2667199.2667211
[22] M. Wells, A. Wollenschlaeger, D. Lefevre, G. D. Magoulas, and
A. Poulovassilis, “Analysing engagement in an online management
programme and implications for course design,” in Proceedings of the
Sixth International Conference on Learning Analytics & Knowledge,
ser. LAK ’16. New York, NY, USA: ACM, 2016, pp. 236–240.
[Online]. Available: http://doi.acm.org/10.1145/2883851.2883894
[23] S. Sorour, K. Goda, and T. Mine, “Correlation of topic model and
student grades using comment data mining,” in Proceedings of the
46th ACM Technical Symposium on Computer Science Education,
ser. SIGCSE ’15. New York, NY, USA: ACM, 2015, pp. 441–446.
[Online]. Available: http://doi.acm.org/10.1145/2676723.2677259
[24] D. Nandi, M. Hamilton, and J. Haland, “How active are students
in online discussion forums?” in Proceedings of the Thirteenth
Australasian Computing Education Conference - Volume 114, ser. ACE
’11. Darlinghurst, Australia, Australia: Australian Computer Society,
Inc., 2011, pp. 125–134. [Online]. Available: http://dl.acm.org/citation.
cfm?id=2459936.2459952
[25] A. Cicchinelli, E. Veas, A. Pardo, V. Pammer-Schindler, A. Fessl,
C. Barreiros, and S. Lindstädt, “Finding traces of self-regulated learning
in activity streams,” in Proceedings of the 8th International Conference
on Learning Analytics and Knowledge - LAK ’18, ser. LAK ’18.
New York, New York, USA: ACM Press, 2018, pp. 191–200. [Online].
Available: http://doi.acm.org/10.1145/3170358.3170381
[26] P. J. Samson, “Can student engagement be measured? And,
if so, does it matter?” in 2015 IEEE Frontiers in Education
Conference (FIE). IEEE, oct 2015, pp. 1–4. [Online]. Available:
http://ieeexplore.ieee.org/document/7344077/
[27] R. Al-Shabandar, A. J. Hussain, P. Liatsis, and R. Keight,
“Analyzing Learners Behavior in MOOCs: An Examination of
Performance and Motivation Using a Data-Driven Approach,” IEEE
Access, vol. 6, pp. 73669–73685, 2018. [Online]. Available: https:
//ieeexplore.ieee.org/document/8502208/
[28] B. Rienties, L. Toetenel, and A. Bryan, “Scaling Up Learning
Design: Impact of Learning Design Activities on LMS Behavior and
Performance,” in Proceedings of the Fifth International Conference
on Learning Analytics And Knowledge, ser. LAK ’15. New
York, NY, USA: ACM, 2015, pp. 315–319. [Online]. Available:
http://doi.acm.org/10.1145/2723576.2723600
[29] A. Elbadrawy, R. S. Studham, and G. Karypis, “Collaborative
multi-regression models for predicting students’ performance in course
activities,” in Proceedings of the Fifth International Conference
on Learning Analytics And Knowledge, ser. LAK ’15. New
York, NY, USA: ACM, 2015, pp. 103–107. [Online]. Available:
http://doi.acm.org/10.1145/2723576.2723590
[30] H. Wang, X. Hao, W. Jiao, and X. Jia, “Causal Association Analysis
Algorithm for MOOC Learning Behavior and Learning Effect,” in 2016
IEEE 14th Intl Conf on Dependable, Autonomic and Secure Computing,
14th Intl Conf on Pervasive Intelligence and Computing, 2nd Intl
Conf on Big Data Intelligence and Computing and Cyber Science and
Technology Congress(DASC/PiCom/DataCom/CyberSciTech). IEEE,
aug 2016, pp. 202–206. [Online]. Available: http://ieeexplore.ieee.org/
document/7588847/
[31] C. G. Brinton, M. Chiang, S. Jain, H. Lam, Z. Liu, and F. M. F.
Wong, “Learning about Social Learning in MOOCs: From Statistical
Analysis to Generative Model,” IEEE Transactions on Learning
Technologies, vol. 7, no. 4, pp. 346–359, oct 2014. [Online]. Available:
http://ieeexplore.ieee.org/document/6851916/
[32] S. B. Dias, S. J. Hadjileontiadou, L. J. Hadjileontiadis, and J. A.
Diniz, “Fuzzy cognitive mapping of LMS users’ Quality of Interaction
within higher education blended-learning environment,” Expert Systems
with Applications, vol. 42, no. 21, pp. 7399–7423, nov 2015.
[Online]. Available: https://www.sciencedirect.com/science/article/pii/
S095741741500384X
[33] N. Bos and S. Brand-Gruwel, “Student differences in regulation
strategies and their use of learning resources: Implications for
educational design,” in Proceedings of the Sixth International
Conference on Learning Analytics & Knowledge, ser. LAK ’16.
New York, NY, USA: ACM, 2016, pp. 344–353. [Online]. Available:
http://doi.acm.org/10.1145/2883851.2883890
[34] P. J. Guo, J. Kim, and R. Rubin, “How video production affects student
engagement: An empirical study of mooc videos,” in Proceedings of
the First ACM Conference on Learning @ Scale Conference, ser.
L@S ’14. New York, NY, USA: ACM, 2014, pp. 41–50. [Online].
Available: http://doi.acm.org/10.1145/2556325.2566239
[35] A. McGowan, P. Hanna, and N. Anderson, “Teaching programming:
Understanding lecture capture youtube analytics,” in Proceedings
of the 2016 ACM Conference on Innovation and Technology
in Computer Science Education, ser. ITiCSE ’16. New York,
NY, USA: ACM, 2016, pp. 35–40. [Online]. Available: http:
//doi.acm.org/10.1145/2899415.2899421
[36] A. L. B. Damasceno, D. S. Ribeiro, and S. D. J. Barbosa,
“Visualizing student interactions to support instructors in Virtual
Learning Environments,” in Proceedings of the International
Conference on Human-Computer Interaction. Springer International
Publishing, 2019. [Online]. Available:
http://link.springer.com/10.1007/978-3-030-23560-4_33