
Exploring Student Approaches to Learning through Sequence Analysis of Reading Logs

March 2020


DOI:10.1145/3375462.3375492

Conference: The 10th International Conference on Learning Analytics & Knowledge (LAK20)
At: Frankfurt, Germany

Authors:

Gökhan Akçapınar

Hacettepe University

Mei-Rong Alice Chen

Soochow University, Taiwan

Rwitajit Majumdar

Kumamoto University

Brendan Flanagan

Kyoto University




Exploring Student Approaches to Learning through Sequence Analysis of Reading Logs

Gökhan Akçapınar

Hacettepe University

Ankara, Turkey

gokhana@hacettepe.edu.tr

Brendan Flanagan

Kyoto University

Kyoto, Japan

flanagan.brendanjohn.4n@kyoto-u.ac.jp

Mei-Rong Alice Chen

Kyoto University

Kyoto, Japan

chen.meirong.6s@kyoto-u.ac.jp

Hiroaki Ogata

Kyoto University

Kyoto, Japan

ogata.hiroaki.3e@kyoto-u.ac.jp

Rwitajit Majumdar

Kyoto University

Kyoto, Japan

majumdar.rwitajit.4a@kyoto-u.ac.jp

ABSTRACT

In this paper, we aim to explore students’ study approaches (e.g., deep, strategic, surface) from the logs collected by an electronic textbook (eBook) system. Data on reading activities both in and out of class were collected from 89 students in a Freshman English course. Students were given a task to study reading materials through the eBook system, highlight the text related to the main or supporting ideas, and answer questions prepared to measure their level of comprehension. Students’ in-class and out-of-class reading times and their usage of the marker feature were used as proxies to understand their study approaches. We used theory-driven and data-driven approaches together to model the study approaches of students. Our results showed that three groups of students with different study approaches could be identified. Relationships between students’ reading behaviors and their academic performance were also investigated by using association rule mining analysis. The results are discussed in terms of monitoring, feedback, predicting learning outcomes, and identifying problems with the content design.

CCS CONCEPTS

• Information systems~Data mining • Computing methodologies~Machine learning • Applied computing~Interactive learning environments • Applied computing~E-learning

KEYWORDS

Study approaches, sequence analysis, reading logs, clustering,

association rule mining, learning analytics

ACM Reference format:

Akçapınar, G., Chen, M. R. A., Majumdar, R., Flanagan, B. and Ogata, H.

2020. Exploring Student Approaches to Learning through Sequence

Analysis of Reading Logs. In Proceedings of the 10th International

Conference on Learning Analytics & Knowledge (LAK’20). ACM, New

York, NY, USA, 6 pages. https://doi.org/10.1145/3375462.3375492

1 Introduction

Students use different study approaches to accomplish a specific learning task [1-3]. Understanding these approaches is important for designing further interventions, particularly for low-performing students [4]. It is also a challenging task for researchers for several reasons. First of all, study approaches are a dynamic phenomenon and may vary depending on many variables (e.g., subject, task difficulty, etc.). Therefore, in many cases, it might not be practical to capture students’ study approaches by using self-report methods. On the other hand, previous studies showed that students’ learning traces (observable behaviors) in online learning environments can be used as a proxy to understand latent constructs such as students’ cognitive and metacognitive strategies [4], learning strategies [5], and study patterns [6]. Although written materials are the core of education, there is still limited research that analyzes reading logs to understand students’ learning processes.

Thanks to digital textbook systems, it is now possible to collect detailed data regarding students’ reading processes, which is not possible with traditional textbooks. Previous studies that analyzed students’ digital textbook interaction data indicate that the course outcome is directly related to textbook reading [7, 8]. Junco and Clem [8] found that students who were in the top 10th percentile in the number of highlights had significantly higher course grades than those in the lower 90th percentile. They also found that students who spent more time reading textbooks earned higher grades in the course than those who spent less time. Huang et al. [9] proposed a Knowledge Tracing model that measures a student’s level of knowledge of the underlying concept by looking at the amount of time s/he has spent on the related pages (e.g., read/skimmed).

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

LAK '20, March 23–27, 2020, Frankfurt, Germany

ACM.

ACM ISBN 978-1-4503-7712-6/20/03$15.00

https://doi.org/10.1145/3375462.3375492

In this study, we aim to explore the study approaches of students while they are studying the content in or out of class. To this end, we use theory-driven and data-driven approaches together. Based on the existing Student Approaches to Learning (SAL) theory literature, we formed our research questions and selected features that can be used as a proxy to understand students’ study approaches. The data-driven approach, in turn, helped us to identify students who are using similar learning approaches. We also analyzed the relationships between study approaches and learning outcomes.

2 Background

2.1 Student Approaches to Learning Theory

The origins of the Student Approaches to Learning (SAL) theory date back to the 1970s. In one of the early attempts, Marton and Säljö [2] asked students to read passages within a given time limit. The students were then asked to answer a series of questions measuring their level of understanding, as well as open-ended questions about how they had approached the reading tasks. When the researchers compared the students’ level of understanding with the approach they used, they found that the students were using either surface or deep approaches while performing the reading task.

Later studies confirmed these findings and also found that there

was a third approach in addition to deep and surface. This approach

was named achieving by Biggs [1] and strategic by Entwistle and

Ramsden [10]. Characteristics of each approach are given below [5,

11].

Deep approach: A deep approach to learning is characterized

by students' desires to understand, learn with meaning, and

recognize underlying principles and connections among related

principles.

Surface approach: A surface approach to learning often

involves students' memorizing information and doing only what is

necessary to succeed on an upcoming assessment. Students with a

surface approach prefer teaching that directs learning towards

assessment requirements even if this leads to a lack of both

understanding and purpose.

Strategic approach: A strategic approach to learning is

accompanied by students' close attention to details such as expected

test format, the structure of the content as laid out in the text, and

close adherence to an instructor's guidelines for studying. Students

who show a strategic approach can discern and use the aspects of a

learning environment that will support their way of studying.

Previous studies also investigated the relationships between study approaches and learning outcomes. The surface approach has been linked to poor learning outcomes, while the deep approach has been linked to better learning outcomes.

Questionnaires are mainly used to measure students' approaches

to learning. However, students may use different strategies at

different times. Moreover, students may not want to self-report

their approaches to learning accurately, especially if they are

surface learners [12]. Therefore, in this study, we aimed at

identifying students’ study approaches from the reading logs

collected by the eBook reader.

2.2 Measuring Latent Variables from Learning Traces

Analyzing latent variables from students’ learning traces has recently gained attention in the learning analytics and educational data mining communities. Cicchinelli et al. [4] tried to identify students’ self-regulation strategies (e.g., metacognitive and cognitive strategies) from their interactions with the learning management system in a blended course setting. They also compared the results with self-report data and found that observable features (e.g., content access, question-solving, etc.) could better explain self-regulated learning behavior and its effects on academic performance than self-report data. In another study, researchers investigated temporal characteristics of learning strategies and their association with feedback, using three years of logs collected from online pre-class activities of a flipped classroom [13]. After analyzing the data by using clustering, sequence mining, and process mining approaches, the researchers found a positive association between personalized feedback and effective strategies. Boroujeni and Dillenbourg [6] analyzed the video viewing and assignment submission behaviors of 7527 students in a MOOC environment to find out the temporal study patterns of the students during assessment periods.

While most of the studies were conducted with data from

MOOC and learning management systems and often with video-

based learning materials, we focus on the reading-based learning

scenario for our study.

3 Research Questions

In this paper, we hypothesize that features extracted from digital

textbook reader logs can be used to identify students’ approaches

to learning (e.g., surface, deep and strategic). Specifically, the

following research questions were addressed:

● RQ 1: Is it possible to identify surface, deep and strategic learners from the reading logs?

● RQ 2: What is the relationship between study approaches and learning outcomes?

● RQ 3: What are the characteristic association rules between surface, deep, strategic learners’ reading behaviors and their academic performance?

4 Method

4.1 Instructional Context & Data Collection

We analyzed more than 25,000 rows of click-stream data (see Table 1 for details) collected from 89 students registered in a Freshman English course at a university. The course was offered to first-year undergraduate students. Students used the eBook system to access course materials that were uploaded by the instructor. Data collection took place over two weeks. In the first class, students were introduced to the reading material and were instructed by the teacher on how to use the functions of the eBook system. Students were asked to read the content through the eBook system, highlight the main ideas, and answer the questions that were developed to assess their level of comprehension. In the second week, the rest of the content was completed. All interactions (e.g., next, previous, jump, highlight, adding a memo, bookmark, etc.) with the eBook system were recorded in a database.

Table 1: Number of logs in each event

Event Type         Number of Logs
Open               622
[label missing]    7826
[label missing]    2992
Jump               648
Marker             4539
Memo               2888
Bookmark           102
Quiz attempt       2603
Other              3043
Total              25657

In this study, data were collected from an eBook system that is currently being used in different universities in Asia. More than 10,000 university students are using this eBook system as their main source of learning inside and outside the classroom. The eBook system is an integrated component of a learning analytics framework. This framework makes it possible to collect all kinds of interaction data related to students’ eBook reading while ensuring their privacy. The eBook tool has a feature similar to red or yellow markers for highlighting parts of the text. Students can add memos to remember important points or bookmark pages to access them quickly while they are reviewing the content.

4.2 Reading Pattern Extraction

At the beginning of data analysis, features from the click-stream

data were extracted. Extracted features were used as a proxy to

understand students’ study approaches. A brief description of

features is given below.

● Time IN: Total time spent on the content during class.

● Time OUT: Total time spent on the content outside of class.

● Marker: Number of yellow and red markers added by the student.

The content consists of 25 pages. Students’ level of comprehension was assessed based on 11 questions located inside the eBook system. With the help of an automated script, students’ in-class and out-of-class reading times and marker counts for each page were extracted. After extracting the features, all numerical data were discretized into three levels. If a student did not have any activity on a specific page, it was labeled as no activity (na). The rest of the data were then split into low and high by using the median as a cut-off. This process was repeated for each feature (i.e., Time IN, Time OUT, and Marker) and each page of the content. At the end of the feature extraction, 75 columns of data were obtained for each student (an 89 x 75 matrix).
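The discretization step described above can be sketched in a few lines. This is an illustration in Python, not the authors' script (the paper reports using R). It assumes that a raw value of 0 means no activity, that the median is taken over active students only, and that values equal to the median are labeled low; the paper does not state these details. The names `discretize_feature`, `build_state_matrix`, and the feature keys are hypothetical.

```python
from statistics import median


def discretize_feature(values):
    """Label one feature on one page as na/low/high, one label per student.

    Assumptions (not stated in the paper): 0 means no activity, the median
    is computed over active students only, and ties with the median are low.
    """
    active = [v for v in values if v > 0]
    if not active:
        return ["na"] * len(values)
    cutoff = median(active)
    labels = []
    for v in values:
        if v <= 0:
            labels.append("na")
        elif v <= cutoff:
            labels.append("low")
        else:
            labels.append("high")
    return labels


def build_state_matrix(raw, features=("time_in", "time_out", "marker")):
    """Build one row of labels per student, ordered feature by feature.

    raw[feature] is a list of pages; each page holds one value per student.
    With 3 features and 25 pages this yields 75 columns per student.
    """
    n_students = len(raw[features[0]][0])
    rows = [[] for _ in range(n_students)]
    for feature in features:
        for page_values in raw[feature]:
            for row, label in zip(rows, discretize_feature(page_values)):
                row.append(label)
    return rows


# Toy input: 3 students, 2 pages (hypothetical values, not the study data).
toy = {
    "time_in": [[0, 10, 30], [5, 0, 0]],
    "time_out": [[0, 0, 0], [0, 0, 20]],
    "marker": [[1, 2, 3], [0, 0, 0]],
}
matrix = build_state_matrix(toy)
```

Each row of `matrix` is one student's sequence of states, which is the form of input that the sequence analysis in the next section works on.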

4.3 Data Analysis

After transforming students’ click-stream data into page-level categorical data, agglomerative hierarchical clustering based on Ward’s algorithm [14] was used to group students with similar reading patterns. Optimal matching distance (OM distance) was used as the dissimilarity measure. The number of clusters was decided based on the SAL theory. The dendrogram of the hierarchical cluster analysis was also checked to validate the theoretical decision. A similar approach has previously been applied successfully for detecting students’ learning strategies [4, 6, 13].
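To make the clustering step concrete, the following is a minimal Python sketch of optimal matching distance followed by Ward-style agglomerative clustering. It is not the authors' implementation, which used R. The substitution cost of 2 and insertion/deletion cost of 1 are illustrative assumptions, since the paper does not report its cost settings. Ward's criterion is strictly defined for Euclidean distances; applying it to OM dissimilarities through the Lance-Williams update, as done here, is a common heuristic. The function names are hypothetical.

```python
def om_distance(a, b, sub_cost=2.0, indel_cost=1.0):
    """Optimal matching (edit) distance between two state sequences."""
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * indel_cost]
        for j in range(1, len(b) + 1):
            substitute = prev[j - 1] + (0.0 if a[i - 1] == b[j - 1] else sub_cost)
            curr.append(min(substitute, prev[j] + indel_cost, curr[j - 1] + indel_cost))
        prev = curr
    return prev[-1]


def ward_clusters(sequences, k):
    """Merge clusters until k remain, using the Lance-Williams form of Ward."""
    n = len(sequences)
    members = {i: [i] for i in range(n)}
    # Squared dissimilarities between all pairs of active clusters.
    d2 = {(i, j): om_distance(sequences[i], sequences[j]) ** 2
          for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(members) > k:
        (i, j), dij = min(d2.items(), key=lambda item: item[1])
        ni, nj = len(members[i]), len(members[j])
        merged = members.pop(i) + members.pop(j)
        updated = {}
        for m in members:
            nm = len(members[m])
            dim = d2.pop((min(i, m), max(i, m)))
            djm = d2.pop((min(j, m), max(j, m)))
            updated[(m, next_id)] = (
                (ni + nm) * dim + (nj + nm) * djm - nm * dij
            ) / (ni + nj + nm)
        del d2[(i, j)]
        d2.update(updated)
        members[next_id] = merged
        next_id += 1
    return sorted(sorted(cluster) for cluster in members.values())


# Toy sequences (hypothetical): two mostly inactive, three mostly active.
toy_sequences = [
    ["low", "na", "na", "na"],
    ["low", "low", "na", "na"],
    ["high", "high", "low", "na"],
    ["high", "high", "high", "na"],
    ["high", "high", "high", "high"],
]
groups = ward_clusters(toy_sequences, 2)
```

Here `groups` holds the student indices in each cluster. In practice the full merge history would also be kept so that a dendrogram can be inspected before fixing the number of clusters.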

For labeling the obtained clusters, two graphs were checked. First, we compared the visualization of page-level data for each cluster. Then, we analyzed the distribution of aggregated raw data in each cluster along with the quiz results. To extract representative learning patterns of each cluster, association rule mining analysis was employed. Data analysis was conducted with R [15] and the following packages: sequence analysis was conducted with TraMineR [16], and association rules were extracted by using the arules [17] package.

5 Results

5.1 Cluster Analysis

Based on the SAL theory, we aimed to identify three clusters in the data related to the surface, strategic, and deep learning approaches. Therefore, after confirming with the dendrogram of the hierarchical cluster analysis (see Fig. 1), we clustered the data into 3 groups.

Figure 1: Dendrogram of the hierarchical cluster analysis

Fig. 2 shows the distribution of students’ reading behaviors in each cluster. Here, each row represents the data of a single student. The x-axis shows the students’ reading behaviors related to the different features (i.e., Time IN, Time OUT, and Marker). The Time IN part shows students’ reading times during class for each page of the content. The middle part shows students’ out-of-class reading times. The last block shows students’ marker activities.

It can be seen from Fig. 2 that students in Cluster 1 (n = 38) are mostly inactive in terms of out-of-class activity and marker usage. Regarding the time spent in class, most of them have low activity in the first 5-10 pages of the content but no activity on the other pages. Students in Cluster 2 (n = 26) are highly active in class and in terms of marker usage. Although some of them have low activity, most of them have no activity out of class. Students in Cluster 3 (n = 25) have patterns similar to those of the students in Cluster 2; however, almost all of the students in this cluster also have high activity across the content out of class.

Figure 2: Visualization of students’ reading behaviors in each

cluster

Before labeling the clusters, we also checked the distribution of quiz scores in each cluster along with the total time spent in class, the total time spent out of class, and the total number of markers added. The distribution observed in Fig. 3 is in accordance with the page-level data. In terms of quiz scores, students in Cluster 1 have the lowest scores and students in Cluster 3 have the highest scores, whereas students in Cluster 2 have both low and high scores. Finally, we labeled Cluster 1 as a surface approach, Cluster 2 as a strategic approach, and Cluster 3 as a deep approach.

Figure 3: Box plots of aggregated data related to reading

behaviors and quiz scores by cluster

5.2 Association Rules

To see the representative patterns of each cluster and their relation to academic performance, we conducted association rule mining analysis. To make the obtained rules simple and easy to understand, we used aggregated data instead of page-level data. We calculated total values for the Time IN, Time OUT, and Marker features. Discretized quiz scores (quiz_low, quiz_high) were also included in the data to see the relationship between students’ reading behaviors and their academic performance. Rules were generated for each cluster with a minimum support of 0.1 (10%) and a minimum confidence of 0.8 (80%). Rules that were not related to academic performance were filtered out. In this case, 12 rules were generated for Cluster 1, 10 rules for Cluster 2, and 12 rules for Cluster 3. The frequency of items in each cluster can be seen in Fig. 4. The top 5 rules for each cluster, selected based on their support values, are discussed below.
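For readers unfamiliar with the two thresholds, the sketch below shows how support and confidence can be computed for rules whose right-hand side is a quiz outcome. It is a brute-force enumeration in Python for illustration only; the authors used the arules package in R, and this code does not reproduce its Apriori implementation. Support is taken as the share of students matching both sides of the rule, and confidence as that count divided by the number matching the left-hand side. The function name `mine_rules`, the `max_len` limit, and the toy transactions are hypothetical.

```python
from itertools import combinations


def mine_rules(transactions, targets, min_support=0.1, min_confidence=0.8, max_len=3):
    """List rules (antecedent => target) that pass both thresholds."""
    n = len(transactions)
    sets = [frozenset(t) for t in transactions]
    items = sorted(set().union(*sets) - set(targets))
    rules = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            lhs = frozenset(antecedent)
            lhs_count = sum(1 for s in sets if lhs <= s)
            if lhs_count == 0:
                continue
            for target in targets:
                both = sum(1 for s in sets if lhs <= s and target in s)
                support = both / n
                confidence = both / lhs_count
                if support >= min_support and confidence >= min_confidence:
                    rules.append((antecedent, target, support, confidence))
    # Highest support first, then highest confidence.
    return sorted(rules, key=lambda r: (-r[2], -r[3]))


# Toy data: one set of aggregated, discretized items per student.
toy_students = [
    {"timein_low", "timeout_na", "marker_na", "quiz_low"},
    {"timein_low", "timeout_na", "marker_low", "quiz_low"},
    {"timein_low", "timeout_na", "marker_low", "quiz_low"},
    {"timein_high", "timeout_na", "marker_high", "quiz_high"},
    {"timein_low", "timeout_low", "marker_low", "quiz_high"},
]
toy_rules = mine_rules(toy_students, targets=("quiz_low", "quiz_high"))
```

Restricting the right-hand side to the quiz items corresponds to keeping only the rules related to academic performance. Running the function once per cluster would give per-cluster rule sets like those reported in the tables of this section.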

Cluster 1 - Surface Approach: The most frequent items in Cluster 1 are quiz_low, timein_low, and timeout_na (see Fig. 4). The first rule in Table 2 means that if a student has no activity during out-of-class time, then s/he will get a low quiz score, with a support value of 76% and a confidence value of 94%. The support value shows that this rule covers 76% of the students in Cluster 1, and the confidence value means that the probability of getting a low quiz score given no out-of-class activity is 0.94. Similar patterns can be observed for the other rules given in Table 2. Most of the students in Cluster 1 have no activity or low activity in terms of reading times and marker usage. Most of them also have low quiz scores.

Figure 4: Frequent items in each cluster

Table 2: Top 5 Rules with the highest Support for Cluster 1

Pattern                                     SUP    CON
[timeout_na] => [quiz_low]                  76%    94%
[timein_low] => [quiz_low]                  76%    91%
[timein_low, timeout_na] => [quiz_low]      74%    97%
[marker_low, timein_low] => [quiz_low]      50%    90%
[marker_low] => [quiz_low]                  50%    86%

Cluster 2 - Strategic Approach: The most frequent items in Cluster 2 are timein_high, quiz_low, and timeout_na (see Fig. 4). Unlike the other clusters, this cluster has a similar number of low and high performers. From the rules given in Table 3, it can be noted that marker usage is key to separating low and high performers in this cluster. If a student spends more time in class but his/her marker usage is low, then s/he will get a low quiz score (Rule 1). On the other hand, high marker usage can be related to high quiz scores (e.g., Rule 4, Rule 5). Therefore, the coverage (support) of the rules in this cluster is lower than in the others. High confidence values also indicate that different rules can be used to identify low and high performers in this cluster.

Table 3: Top 5 Rules with the highest Support for Cluster 2

Pattern                                                  SUP    CON
[marker_low, timein_high] => [quiz_low]                  27%    78%
[timeout_low] => [quiz_low]                              27%    70%
[timein_high, timeout_low] => [quiz_low]                 23%    75%
[marker_high, timein_high, timeout_na] => [quiz_high]    19%    83%
[marker_high, timeout_na] => [quiz_high]                 19%    71%

Cluster 3 - Deep Approach: The most frequent items in Cluster 3 are marker_high, quiz_high, and timein_high (see Fig. 4). The first rule in Table 4 means that if a student has high marker usage, then s/he will get a high quiz score, with a support value of 64% and a confidence value of 76%. All the rules with the highest support values are related to high quiz performance. High out-of-class time is also related to high quiz scores for this group of students (e.g., Rule 4).

Table 4: Top 5 Rules with the highest Support for Cluster 3

Pattern                                       SUP    CON
[marker_high] => [quiz_high]                  64%    76%
[marker_high, timein_high] => [quiz_high]     56%    82%
[timein_high] => [quiz_high]                  56%    78%
[timeout_high] => [quiz_high]                 52%    72%
[marker_high, timeout_high] => [quiz_high]    48%    75%

6 Conclusions

In this study, we tried to determine students’ approaches to learning from the reading behaviors they exhibited while performing a given reading task. For this purpose, the theoretical basis of students’ learning approaches from the SAL literature was considered. Features that can be used as a proxy to understand the approaches were extracted from the reading log data. These features are students’ in-class and out-of-class reading times and the number of markers they used. The results showed that the students could be divided into three clusters, identified as the surface, strategic, and deep study approaches. Further, the relationship between the reading behaviors and quiz performance of each cluster was examined by association rule mining analysis.

The results highlighted that the majority of the students who followed a surface approach did not use markers, had low content completion rates, and did not use the tool outside the class. Their quiz performance was also low. Students using a deep approach showed high activity both in class and out of class. They used markers actively while reading the content, and their quiz performance was also high. Students using a strategic approach actively used the tool in class but did not use it outside the class. In terms of quiz performance, this cluster contained both low- and high-performing students. These findings are in accordance with those of previous SAL studies [12, 18, 19]. While surface learners tend to complete the task with minimum effort, deep learners tend to spend time on the content outside the class and learn the information deeply by using the marker function. The fact that strategic learners actively used the tool in class and did not use it outside class can be interpreted as a wish to succeed with minimum effort.

This study has some limitations. First, the sample size is

relatively small, which limits the generalizability of the obtained

results. Second, although the results of the clustering analysis and

association rule mining analysis support our initial hypothesis, we

cannot make a strong claim that these clusters are definitely

representing the three learning approaches. Further validation with

additional data is required to make a stronger claim.

The obtained association rules can be used to predict students’ learning approaches and, accordingly, their academic performance. The data showed that students who use a surface strategy are mostly active only in the first few pages of the content, with no activity on the following pages. Interventions to ensure the continuity of these students’ reading could involve redesigning the content. For instance, quiz questions at the beginning of the content might have led to more marker activity in that part by strategic and deep learners. The effect of such reflective questions across the content on students’ marker behaviors can be examined in further studies.

ACKNOWLEDGMENTS

This work was partly supported by JSPS Grant-in-Aid for Scientific

Research (S) 16H06304, NEDO Special Innovation Program on AI

and Big Data 18102059-0, Hacettepe University Scientific

Research Projects Coordination Center Grant Number SBI-2017-

16268 and JSPS KAKENHI Research Activity Start-up Grant

Number 18H05746.

REFERENCES

[1] J. B. Biggs, "The Role of Metalearning in Study Processes," British

Journal of Educational Psychology, vol. 55, no. 3, pp. 185-212,

1985.

[2] F. Marton and R. Säljö, "On Qualitative Differences in Learning: I—

Outcome and Process," British Journal of Educational Psychology,

vol. 46, no. 1, pp. 4-11, 1976.

[3] F. Marton and R. Säljö, "On Qualitative Differences in Learning:

II—Outcome as a Function of the Learner's Conception of the Task,"

British Journal of Educational Psychology, vol. 46, no. 2, pp. 115-

127, 1976.

[4] A. Cicchinelli et al., "Finding traces of self-regulated learning in

activity streams," presented at the Proceedings of the 8th

International Conference on Learning Analytics and Knowledge,

Sydney, New South Wales, Australia, 2018.

[5] J. Jovanović, D. Gašević, S. Dawson, A. Pardo, and N. Mirriahi,

"Learning analytics to unveil learning strategies in a flipped

classroom," The Internet and Higher Education, vol. 33, pp. 74-85,

2017.

[6] M. S. Boroujeni and P. Dillenbourg, "Discovery and temporal

analysis of latent study patterns in MOOC interaction sequences,"

presented at the Proceedings of the 8th International Conference on

Learning Analytics and Knowledge, Sydney, New South Wales,

Australia, 2018.

[7] G. Akçapınar, M. N. Hasnine, R. Majumdar, B. Flanagan, and H.

Ogata, "Developing an Early-Warning System for Spotting At-Risk

Students by using eBook Interaction Logs," Smart Learning

Environments, vol. 6, no. 4, pp. 1-15, 2019.

[8] R. Junco and C. Clem, "Predicting course outcomes with digital

textbook usage data," The Internet and Higher Education, vol. 27,

pp. 54-63, 2015.

[9] Y. Huang, M. Yudelson, S. Han, D. He, and P. Brusilovsky, "A

Framework for Dynamic Knowledge Modeling in Textbook-Based

Learning," presented at the Proceedings of the 2016 Conference on

User Modeling Adaptation and Personalization, Halifax, Nova

Scotia, Canada, 2016.

[10] N. Entwistle and P. Ramsden, Understanding Student Learning.

New York: Nichols Publishing Company, 1982.

[11] D. Tomanek and L. Montplaisir, "Students' studying and approaches

to learning in introductory biology," Cell Biology Education,

vol. 3, no. 4, pp. 253-262, Winter 2004.

[12] G. Akçapınar, "Predicting Students' Approaches to Learning Based

on Moodle Logs," 8th International Conference on Education and

New Learning Technologies, pp. 2347-2352, 2016.

[13] W. Matcha, D. Gašević, N. A. A. Uzir, J. Jovanović, and A. Pardo,

"Analytics of Learning Strategies: Associations with Academic

Performance and Feedback," presented at the Proceedings of the 9th

International Conference on Learning Analytics & Knowledge,

Tempe, AZ, USA, 2019.

[14] A. Gabadinho, G. Ritschard, M. Studer, and N. S. Müller, "Mining

sequence data in R with the TraMineR package: A user’s guide,"

2009.

[15] R Core Team, "R: A language and environment for statistical

computing," ed: R Foundation for Statistical Computing, 2017.

[16] A. Gabadinho, G. Ritschard, N. S. Müller, and M. Studer,

"Analyzing and visualizing state sequences in R with TraMineR,"

Journal of Statistical Software, vol. 40, no. 4, pp. 1-37, 2011.

[17] M. Hahsler, B. Grün, and K. Hornik, "arules - A Computational

Environment for Mining Association Rules and Frequent Item Sets,"

Journal of Statistical Software, vol. 14, no. 15, pp. 1-25, 2005.

[18] R. A. Ellis, F. Han, and A. Pardo, "Improving Learning Analytics–

Combining Observational and Self-Report Data on Student

Learning," Journal of Educational Technology & Society, vol. 20,

no. 3, pp. 158-169, 2017.

[19] D. Gasevic, J. Jovanovic, A. Pardo, and S. Dawson, "Detecting

learning strategies with analytics: Links with self-reported measures

and academic performance," Journal of Learning Analytics, vol. 4,

no. 2, pp. 113–128, 2017.

A reflective e-learning approach for reading, thinking, and behavioral engagement

Article

Jan 2024
LANG LEARN TECHNOL

One of the main goals of the English as a Foreign Language (EFL) course is to facilitate the development of learners' reading comprehension and reflective skills in English, which can be developed with appropriate instruction. However, in EFL courses, many students are inactive in reflecting on their reading and are disengaged from learning. To fill this gap, a reflective reading-based e-learning approach was proposed to explore the impact of the suggested approach on reading comprehension, reflective thinking, and behavioral engagement. The study aimed to improve the comprehension of the student's reading using the proposed reflective e-learning approach. The study employed a quasi-experimental design in which the experimental group used reflective reading-based e-learning (n = 51) and the control group used conventional e-learning (n = 50) for a total of 13 weeks of participation. The experiment was designed to examine reading comprehension, reflective thinking, and behavioral engagement (e.g., reading time, Marker list, Quiz score, Memo list). The results revealed that the reflective reading-based e-learning approach could improve the comprehension and reflective thinking of the learners and promote behavioral engagement. These findings can be valuable for educators designing strategies to improve students' reading comprehension skills and stimulate behavioral engagement in e-learning systems.

Data-Driven Analysis of Student Engagement in Time-Limited Computer Laboratories

Article

Full-text available

Oct 2023

Computer laboratories are learning environments where students learn programming languages by practicing under teaching assistants’ supervision. This paper presents the outcomes of a real case study carried out in our university in the context of a database course, where learning SQL is one of the main topics. The aim of the study is to analyze the level of engagement of the laboratory participants by tracing and correlating the accesses of the students to each laboratory exercise, the successful/failed attempts to solve the exercises, the students’ requests for help, and the interventions of teaching assistants. The acquired data are analyzed by means of a sequence pattern mining approach, which automatically discovers recurrent temporal patterns. The mined patterns are mapped to behavioral, cognitive engagement, and affective key indicators, thus allowing students to be profiled according to their level of engagement in all the identified dimensions. To efficiently extract the desired indicators, the mining algorithm enforces ad hoc constraints on the pattern categories of interest. The student profiles and the correlations among different engagement dimensions extracted from the experimental data have been shown to be helpful for the planning of future learning experiences.

International Journal of Mobile Learning and Organisation: A reading engagement-promoting strategy to facilitate EFL students' mobile learning achievement, behaviour and engagement

Article

Jul 2022

Personalized Navigation Recommendation for E-book Page Jump

Conference Paper

Mar 2024

As the utilization of digital learning materials continues to rise in higher education, the accumulated operational log data provide a unique opportunity to analyze student reading behaviors. Previous works on reading behaviors for e-books have identified jump-back as a frequent student behavior, which refers to students returning to previous pages to reflect on them during reading. However, the lack of navigation in e-book systems makes it challenging to find the right page on the first attempt. Students usually need to try several times to find the correct page, which indicates a strong demand for personalized navigation recommendations. This work aims to help students alleviate this problem by recommending the right page for a jump-back. Specifically, we propose a model for personalized navigation recommendations based on neural networks. A two-phase experiment is conducted to evaluate the proposed model, and the experimental results on real-world datasets validate the feasibility and effectiveness of the proposed method.

LECTOR: An attention-based model to quantify e-book lecture slides and topics relationships

Poster

Jul 2023

The use of digital lecture slides in e-book platforms allows the analysis of students' reading behavior. Previous works have made important contributions to this task, but they have focused on students' interactions without considering the content they read. The present work complements these works by designing a model able to quantify the e-book LECture slides and TOpic Relationships (LECTOR). Our results show that LECTOR performs better in extracting important information from lecture slides and suggest that readers' topic preferences extracted by our model are important factors that can explain students' academic performance.

AI and Big Data in Education: Learning Patterns Identification and Intervention Leads to Performance Enhancement

Article

Dec 2023

Improving learning outcomes is always one of the key objectives of learning analytics (LA) and educational data mining (EDM). In recent years, many Massive Open Online Courses (MOOCs) have been deployed, making it easier to collect learners’ data for further analysis. Naturally, leveraging AI to process this kind of big data has become one of the main research streams to support education. In this paper, we collected data and defined student learning patterns by leveraging online courses on Python programming, and we then verified whether learning performance was influenced by different learning patterns and interventions. We designed the intervention process, explored the impact on final learning outcomes, and analyzed Self-Regulated Learning (SRL) abilities. From the experimental results, we share the learning outcomes and the differences in SRL, with detailed explanations for the different groups.

Video Analytics in Digital Learning Environments: Exploring Student Behaviour Across Different Learning Contexts

Article

Aug 2023

The use of videos in teaching has gained impetus in recent years, especially after the increased attention towards remote learning. Understanding students’ video-related behaviour through learning (and video) analytics can offer instructors significant potential to intervene and enhance course designs. Previous studies explored students’ video engagement to reveal learning patterns and identify at-risk students. However, the focus has been mostly placed on single contexts, and therefore, limited insights have been offered about the differences and commonalities between different learning settings. To that end, the current paper explored student video engagement in three disparate contexts. Following a case study research approach, we uncovered the commonalities and differences of video engagement in the context of a SPOC, a MOOC, and an undergraduate university course. The findings offer a deeper and more comprehensive understanding of students’ video-related engagement and shed light on several key aspects related to video analytics that should be considered during the design of video-based learning (e.g., learning objectives in relation to video type or context). Additionally, the three cases indicated the important role of the content type, the length, and the aim of the video in students’ engagement. Further implications of the work are also discussed in the paper.

A Quality Data Set for Data Challenge: Featuring 160 Students' Learning Behaviors and Learning Strategies in a Programming Course

Conference Paper

Jan 2022

Emerging science requires data collection to support the research and development of advanced methodologies. In the educational field, conceptual frameworks such as Learning Analytics (LA) or Intelligent Tutoring System (ITS) also require data. Prior studies demonstrated the efficiency of academic data, for example, in predicting at-risk students and unveiling learning strategies. However, a publicly available data set was lacking for benchmarking these experiments. To contribute to educational science and technology research and development, we conducted a programming course series two years ago and collected 160 students' learning data. The data set includes two well-designed learning systems and measurements of two well-defined learning strategies: Self-regulated Learning (SRL) and Strategy Inventory for Language Learning (SILL). We then summarized this data set as a Learning Behavior and Learning Strategies data set (LBLS-160) in this study; here, 160 indicates a total of 160 students. Compared to the prior studies, the LBLS data set is focused on students' book reading behaviors, code programming behaviors, and measurement results on students' learning strategies. Additionally, to demonstrate the usability and availability of the LBLS data set, we conducted a simple at-risk student prediction task, which is in line with the challenge of cross-course testing accuracy. Furthermore, to facilitate the development of educational science, this study summarized three data challenges for the LBLS data set.

Seeking to Reduce Physical Distancing in Teacher-Student Interactions. Online Education: Teaching in a Time of Change. Architecture, Media, Politics & Society (AMPS) Proceedings Series.

Conference Paper

Apr 2021

Extending unified theory of acceptance and use of technology to understand the acceptance of digital textbook for elementary School in Indonesia

Article

Feb 2023

The rapid development of technology has led to the change of textbooks from printed to digital forms accessible by students irrespective of their location, thereby improving their overall academic performance. This change is appropriate to the sustainable learning program, where digital textbooks support online learning and students can access material from anywhere and at any time. This research aims to analyze the factors affecting the intention of elementary school teachers to use digital textbooks. Quantitative data were collected and measured from 493 elementary school teachers in Riau, Indonesia, and analyzed using structural equation modeling (SEM). The results showed that performance expectancy (PE), effort expectancy (EE), social influence (SI), perceived learning opportunities (PLO), self-efficacy (SE), and facilitating conditions (FC) positively affected teachers’ intention to use digital textbooks. SI was found to be the factor with the greatest effect on behavioral intention (BI). However, attitude, affective need (AN), ICT usage habits, gender, age, and education level did not affect teachers’ intention to use digital textbooks. This research provides important information for the government, decision-makers, and schools on using digital textbooks at the elementary level in the future.

Mining sequence data in R with the TraMineR package: A user's guide

Book

Mar 2011

This is a User's Guide of the R package TraMineR version 1.8.

Developing an early-warning system for spotting at-risk students by using eBook interaction logs

Article

May 2019

Early prediction systems have already been applied successfully in various educational contexts. In this study, we investigated developing an early prediction system in the context of eBook-based teaching-learning and used students’ eBook reading data to develop an early warning system for students at risk of academic failure (students whose academic performance is low). To determine the best performing model and the optimum time for possible interventions, we created prediction models by using 13 prediction algorithms with the data from different weeks of the course. We also tested the effects of data transformation on prediction models. 10-fold cross-validation was used for all prediction models. Accuracy and Kappa metrics were used to compare the performance of the models. Our results revealed that in a sixteen-week-long course all models reached their highest performance with the data from the 15th week. On the other hand, starting from the 3rd week, the models classified low and high performing students with an accuracy of over 79%. In terms of algorithms, Random Forest (RF) outperformed other algorithms when raw data were used; however, with the transformed data, the J48 algorithm performed better. When categorical data were used, Naive Bayes (NB) outperformed other algorithms. Results also indicated that models with transformed data performed worse than the models created using categorical data. However, models with categorical data showed similar performance to models with raw data. The implications of the results presented in this research were also discussed with respect to the field of Learning Analytics.

Analytics of Learning Strategies: Associations with Academic Performance and Feedback

Conference Paper

Mar 2019

Learning analytics has the potential to detect and explain characteristics of learning strategies through analysis of trace data and communicate the findings via feedback. However, the role of learning analytics-based feedback in selection and regulation of learning strategies is still insufficiently explored and understood. This research aims to examine the sequential and temporal characteristics of learning strategies and investigate their association with feedback. Three years of trace data were collected from online pre-class activities of a flipped classroom, where different types of feedback were employed in each year. Clustering, sequence mining, and process mining were used to detect and interpret learning tactics and strategies. Inferential statistics were used to examine the association of feedback with the learning performance and the detected learning strategies. The results suggest a positive association between the personalised feedback and the effective strategies.

Finding traces of self-regulated learning in activity streams

Conference Paper

Mar 2018

This paper aims to identify self-regulation strategies from students' interactions with the learning management system (LMS). We used learning analytics techniques to identify metacognitive and cognitive strategies in the data. We define three research questions that guide our studies, analyzing i) self-assessments of motivation and self-regulation strategies using standard methods to draw a baseline, ii) interactions with the LMS to find traces of self-regulation in observable indicators, and iii) self-regulation behaviours over the course duration. The results show that the observable indicators can better explain self-regulatory behaviour and its influence on performance than preliminary subjective assessments.

Discovery and temporal analysis of latent study patterns in MOOC interaction sequences

Conference Paper

Mar 2018

Capturing students' behavioral patterns through analysis of sequential interaction logs is an important task in educational data mining and could enable more effective and personalized support during the learning process. This study aims at the discovery and temporal analysis of learners' study patterns in MOOC assessment periods. We propose two different methods to achieve this goal. First, following a hypothesis-driven approach, we identify learners' study patterns based on their interaction with lectures and assignments. Through clustering of study pattern sequences, we capture different longitudinal activity profiles among learners and describe their properties. Second, we propose a temporal clustering pipeline for unsupervised discovery of latent patterns in learners' interaction data. We model and cluster activity sequences at each time step and perform cluster matching to enable tracking learning behaviours over time. Our proposed pipeline is general and applicable in different learning environments such as MOOC and ITS. Moreover, it allows for modeling and temporal analysis of interaction data at different levels of action granularity and time resolution. We demonstrate the application of this method for detecting latent study patterns in a MOOC course.

Improving learning analytics - Combining observational and self-report data on student learning

Article

Jan 2017
EDUC TECHNOL SOC

The field of education technology is embracing the use of learning analytics to improve student experiences of learning. Along with exponential growth in this area is an increasing concern about the interpretability of the analytics from the student experience and what they can tell us about learning. This study offers a way to address some of the concerns of collecting and interpreting learning analytics to improve student learning by combining observational and self-report data. The results present two models for predicting student academic performance, which suggest that a combination of both observational and self-report data explains a significantly higher variation in student outcomes. The results offer a way into discussing the quality of interpretations of learning analytics and their usefulness for helping to improve the student experience of learning, and also suggest a pathway for future research in this area.

R: A Language and Environment for Statistical Computing

Book