Using Sensing Technologies to Explain Children’s
Self-Representation in Motion-Based Educational Games
Serena Lee-Cultura
Norwegian University of
Science and Technology
Trondheim, Norway
serena.leecultura@ntnu.no
Kshitij Sharma
Norwegian University of
Science and Technology
Trondheim, Norway
kshitij.sharma@ntnu.no
Sofia Papavlasopoulou
Norwegian University of
Science and Technology
Trondheim, Norway
spapv@ntnu.no
Symeon Retalis
University of Piraeus
Piraeus, Greece
retal@unipi.gr
Michail Giannakos
Norwegian University of
Science and Technology
Trondheim, Norway
michailg@ntnu.no
ABSTRACT
Motion-Based Touchless Games (MBTG) are being inves-
tigated as a promising interaction paradigm in children’s
learning experiences. Within these games, children's digital persona (i.e., avatar) enables them to efficiently communicate their motion-based interactivity. However, the role of
children’s Avatar Self-Representation (ASR) in educational
MBTG is rather under-explored. We present an in-situ within
subjects study where 46 children, aged 8–12, played three
MBTG with different ASRs. Each avatar had varying visual
similarity and movement congruity (synchronisation of move-
ment in digital and physical spaces) to the child. We automati-
cally and continuously monitored children’s experiences using
sensing technology (eye-trackers, facial video, wristband data,
and Kinect skeleton data). This allowed us to understand how
children experience the different ASRs, by providing insights
into their affective and behavioural processes. The results
showed that ASRs have an effect on children’s stress, arousal,
fatigue, movement, visual inspection (focus) and cognitive
load. By exploring the relationship between children’s degree
of self-representation and their affective and behavioural states,
our findings help shape the design of future educational MBTG
for children, and emphasise the need for additional studies to investigate how ASRs impact children's behavioural, interaction, cognitive and learning processes.
Author Keywords
Multimodal data; Avatar; Educational Technologies;
Embodied Interaction; Embodied Learning; Motion-Based
Games.
This article has been accepted for publication in:
IDC’20, Interaction Design and Children June 21–24, 2020, London, UK
Content may change prior to final publication. Citation information:
Lee-Cultura, S., Sharma, K., Papavlasopoulou, S., Retalis, S., & Giannakos, M. (2020).
Using sensing technologies to explain children's self-representation in motion-based
educational games. In Proceedings of the ACM Interaction Design and Children
Conference (IDC'20), ACM, New York, NY, USA, pp. 541-555.
DOI: 10.1145/3392063.3394419
CCS Concepts
• Human-centered computing → Empirical studies in HCI; User interface design; • Applied computing → Interactive learning environments;
INTRODUCTION & MOTIVATION
Engaging children in the learning process through motion
and body based gaming is a complex yet promising endeavour
[87]. In recent years, the use of Motion-Based Touchless Games (MBTG) has become increasingly popular across a multitude of educational domains, such as literacy [36, 87], mathematics [48], programming [81], social skills [25] and motor skills development [9]. Children communicate with MBTG via their avatars, which can represent the child at different levels of fidelity.
Avatars are graphical personifications of the child within a virtual gaming environment [7]. Recently, this topic
has gained the attention of researchers in the Child-Computer
Interaction (CCI) [7, 9, 59, 68] and learning sciences [58]
communities. Banakou et al. [4] embodied adult users in a full-body virtual child avatar and suggested that higher-level cognitive processes (i.e., the implications of the body representation's apparent age) could influence users' perceptual interpretations of an object's size in the physical world. Avatars can
take on several different visualisations, ranging from an arrow or hand icon, to a relatively basic two-dimensional fantasy
character (sometimes chosen by the user) [84], to a humanoid
avatar as found in virtual worlds (e.g., Second Life). The ex-
tent to which the child is represented by the avatar is referred
to as Avatar Self-Representation (ASR). In educational MBTG
all three levels of representation are possible. However, the
relationship between the avatar's level of representation and the child's affect and behaviour in educational MBTG remains an
open debate in the literature.
Previous works have employed diverse practices and data
sources to assess children’s interaction with MBTG, such as
interviews, observations and interaction analysis. Interaction
between children and MBTG offers an opportunity to collect
rich and Multi-Modal Data (MMD) through wearable and
ubiquitous sensing. Prior research [26, 62, 71] illustrates the
value and usefulness of wearable and ubiquitous data sources,
including eye-tracking, facial-feature, and skin-conductance
data. Combining such data can provide continuous informa-
tion regarding participants’ cognitive-affective states through
their arousal levels, including states “outside of awareness”
(i.e., not directly observable) [78]. Furthermore, there is strong
evidence that wearable and ubiquitous data sources (e.g., [26,
62, 71]) are capable of offering important information explain-
ing the different aspects of children’s learning experiences and
interaction.
This study aims to fill the gap in the literature by empirically
addressing the impact of ASR on children’s affective and
behavioural processes, through the use of sensing technology,
during educational MBTG play, in order to enable researchers
to create more appropriate motion-based games to support
learning experiences. Specifically, we address the following
research question:
RQ: How does the degree of ASR relate to children's affect and behaviour in educational MBTG?
To tackle the aforementioned research question, we conducted
an in-situ experiment with 46 children playing three educa-
tional MBTG, with varying degrees of ASR. We collected
data from multiple sources (eye-tracking glasses, wrist band,
Microsoft Kinect and webcam). By using MMD to inves-
tigate children’s educational MBTG play problem-solving
experiences, we show that the contrasting degrees of self-
representation (i.e., hand avatar with minimal movement map-
ping, fantasy avatar with partial body mapping and realistic-
self avatar with full body mapping) have different implications
on children’s affective and behavioural experiences. In partic-
ular, we make the following contributions:
• We present insights from an in-situ experiment with 8–12 year old children playing three educational MBTG while monitored by sensing technologies.
• We identify the effect of ASR on children's affective and behavioural processes during MBTG gameplay.
• We discuss how our findings can be used to design MBTG for children.
• We show that ubiquitous and wearable sensing has the capacity to monitor children's affective and behavioural processes, and discuss its potential in CCI research.
RELATED WORK
Our work borrows from and builds upon previous research in
the domains of educational MBTG, avatars in virtual environ-
ments and the use of wearable and ubiquitous sensing in CCI,
specifically concerning affect and behaviour. In this section,
we provide an overview of relevant studies that ground and
guide our research.
Educational MBTG
In the context of educational research, many MBTG have been
proposed and implemented [46, 87, 1]. These games invite
the learner inside a virtual world where they interact with
educational content for the purposes of developing cognitive,
motor, or social skills. Their application has seen much trac-
tion, with research permeating maths [41, 48, 73], science [47,
81], language development [36, 87], vegetation succession
[1] and special educational needs [9, 6, 7]. Notable studies
suggest that in the context of maths, MBTG might have a
positive impact on student learning; particularly concerning
enhanced problem understanding [73], reduced maths anxiety
[40] and increased academic performance [46, 81]. MBTG have also shown promise in the development of language skills.
For example, the Word Out! game [87] used motion sensing
to aid children in learning to recognise the characteristic fea-
tures of the alphabet. Results showed that the game motivated
children, while fostering creative and collaborative strategies
throughout their playful educational experiences. Collectively,
these contributions demonstrate that researchers and teachers
are beginning to consider MBTG as a viable solution by which
to augment the current instructional approach [39].
The Role of the Avatar in Learning Spaces
As virtual environments continue to penetrate educational
spaces [14, 22], the role of the avatar as a pedagogical agent
has become an increasingly popular topic [8, 68, 85, 76].
Avatars have been deployed across a wide range of roles;
namely as virtual instructor/tutors [77], coaches [59], peers
and co-learners [55], and learner representatives [7, 8, 56].
Several studies advocate the positive influence of avatars in
learning [55, 77]; suggesting their capacities to act as moti-
vational agents [8, 85], foster creative ideation [56], promote
development of social skills in special education [7, 59], as
well as establish and sustain a strong sense of community be-
tween members of virtual classrooms in distance learning [66].
In addition, research also claims that learners enjoy interact-
ing with avatars while undertaking educational content [9, 68,
77]. This particularly applies to children, who tend to have a
fondness for animated cartoon characters [77].
More relevant to our study is the role that the avatar plays
while representing the learner in educational spaces. Learners
form strong connections with their avatars [10]; which can
take many forms, ranging from basic arrow icon or geometric
shape [44], stick figure [7] or silhouette [68], fantasy character
[85], or even a self-similar (i.e., shared visual likeness between
avatar and learner) humanoid [56]. Avatar identification is the
cognitive phenomenon whereby an individual perceives the
avatar’s experience as their own [10, 17]. Trepte and Reinecke
[80] showed that avatar-player similarity may be positively re-
lated to avatar identification. Moreover, Birk et al. [10] demon-
strated that identification can lead to increased performance
motivation, translating to extended gameplay duration, which
could yield significant ramifications in educational games.
Building on this, learners can be influenced by manipulat-
ing the different characteristics of their avatars (i.e., alter-
ing their ASR). Wallace et al. [85] investigated the impact
of avatar’s appearance (self-similar, ideal-self, or super hero
avatar) on student performance motivation in an online un-
dergraduate class. Results showed that use of the ideal-self
and super hero avatars lead to greater performance motivation
(higher levels of engagement) in avatar-based activities than
in non-avatar based activities (e.g., exam study guide posting).
These results are supported by Baylor [8], who argues that
appearance is the most important aspect in facilitating learner
motivation. Specifically, these avatar-selves should share de-
mographic characteristics with the learner, and embody what
the learner “aspires to be” (i.e., ideal-self). The latter, exploits
the thoroughly researched Proteus effect [88], which states
that “people infer their expected behaviours and attitudes from
observing their avatar’s appearance”.
Adding movement to the discussion, Steed et al. [76] argue
that pairing the use of avatar and gesture might represent an
appropriate medium for reducing cognitive load when learning
spatial rotation in VR. In their explorations of avatar-based
touchless gestural interfaces, Rubegni et al. [68] identified
the avatar as a catalyst in facilitating engagement, and in
turn improving information recall, amongst young children.
Concerning MBTG, use of a movement-driven avatar can offer
new frames of reference, enabling the learner to reflect on how
their actions elevate or hinder their educational success. This
notion is supported by Bartoli et al. [7], who suggest that using an avatar helps direct children's focus on the outcome of their movements, rather than on the motions themselves. Furthermore, Bhattacharya et al. [9] hypothesise that
the advantages of embodied interaction qualify movement-
driven avatars as an ideal tool for promoting engagement in
children with special needs. By observing the movement
congruity between a self-similar avatar and learner, the learner
is provided immediate feedback on their movement [9], and
potentially on their understanding of the content to be learned.
These studies emphasise the importance of the learner–avatar relationship as a resource to be strategically exploited throughout the design, development and assessment of educational
technologies. However, experiments typically centre on adult
populations [56, 76, 85] or children with special educational
needs [6, 7, 9]. Fewer studies examine how avatars impact
typically developing children [68, 58]. Therefore, despite the
fact that avatar representation has been found to be decisive on
children’s experience (e.g., gameplay, motivation), the exact
effect of the various types of avatars in children’s experience
is rather under-explored.
Affect and Behaviour / MMD in CCI
Large-scale MMD collection of children’s affective and be-
havioural data is a relatively new practice [37]. In recent years,
the CCI community has engaged in discussions regarding the
promised benefits and ethical impositions of utilising MMD
to tailor children’s experiences, and further CCI research [38].
Previous works advocate the use of MMD to analyse the complex interactions exchanged between children and the systems they employ [11, 52]. This endorsement is collectively driven by the different data streams' capacities to inform on the unique qualities of children's behaviour [11, 52] and contribute to
a holistic understanding of their experiences. For example,
eye-tracking can indicate the areas where children direct their
attention [42] or help quantify the cognitive effort children
invest when problem solving [15, 64]. Electrodermal Activity
(EDA) and temperature could be used to quantify engagement
and stress, respectively [50, 35]. Facial videos are capable
of informing on the emotions displayed by children corre-
sponding to different events during an interaction [30, 3]. Re-
searchers have also employed MMD to capture the cognitive,
meta-cognitive, affective and motivational states of learners
over time, to adequately scaffold the learning process [5, 70].
Accordingly, there has been much interest in using MMD to
understand and/or explain children’s behaviour as they interact
with technology for educational [2, 16, 74] and entertain-
ment [18, 19, 60] purposes. For example, audio and video
recordings assisted researchers in recognising children’s affec-
tive state as they reasoned through sorting and pattern recogni-
tion problems [89]. They have also been used in conjunction
with system logs, to develop personalised numeric learning for
preschooler’s interactions with socially assistive robot tutors
[16]. Analysis of log files and facial video have also con-
tributed to the creation of constructive user-friendly interaction
to facilitate fun and learning in acquisition of programming
skills [2]. More recently, Sridhar et al. [74] demonstrated
the use of Heart Rate Variability (HRV) and Galvanic Skin
Response (GSR) to differentiate between children’s cognitive-
affective states as they executed tasks of variable mental effort.
In addition to detecting potential indicators of cognitive load
in learning, they demonstrated the feasibility and importance
of physiological MMD in data triangulation.
Research has also leveraged MMD practices to explore the
many angles of children’s entertainment system experiences
by combining traditional, physiological, and motion sensing
data capture. Children’s EDA, HRV and affective states have
been assessed in pursuit of understanding of the dynamics of
goal-oriented open-ended gameplay, proxemics, and encour-
age group collaboration [18]. A similar study conducted by
Crowell et al. [19], also used video, HRV and EDA to measure
children’s anxiety levels during embodied mixed-reality play.
Collectively, these studies endorse the validity of MMD to further researchers' understanding of children's experiences within education and entertainment. However, researchers highlight an irrefutable need for additional studies to fully examine the trade-offs between the advantages (e.g., explainability of gaze, brain, face, skin, etc.) and limitations (e.g., ecology, cost) of the various modalities before researchers can leverage the complete range of MMD affordances [27].
THE MOTION BASED TOUCHLESS GAMES
In this section, we present a detailed account of the three edu-
cational MBTG employed in our study: Suffiz, Marvy Learns,
and Sea Formuli, and discuss their comparability. Researchers
selected these games because they offered different levels
of ASR, while maintaining the same basic interaction mecha-
nisms (see section Comparability of MBTG). We also describe
the degrees of ASR (see section Avatar Self Representation
(ASR)) that were considered in our analysis.
Suffiz: A Literacy Suffix Game Show
Suffiz focused on developing the children’s literacy ability
through practice of English grammar. Children were pre-
sented a sentence in the form of a fill-in-the-blank multiple
choice question, offering 3 potential answers (Figure 1a). The
children needed to read the sentence, determine the correct
answer, then select and move the answer onto the blank space
located at the bottom of the screen. Children were represented by a hand cursor avatar that was mapped to minimal movements of the player's dominant hand. Questions regarded the use of irregular plural nouns, verb tense, correlative conjunctions, and regular and intensive pronouns. Once a question had been answered, the selected word turned green if correct and red otherwise. Then, a new question appeared for further practice. Each gameplay session comprised 5 multiple choice questions. Figure 1 illustrates an exemplar flow of gameplay in which a child must select the correct suffix for the word funny, given the sentence "this cartoon is ___ than the one we saw yesterday". The player is provided three potential answers: funny, the funniest, and funnier.

Figure 1: ©Serena Lee-Cultura. A child gesturing through a Suffiz game problem using low ASR. (a) The child is presented with a multiple choice English question and a white hand avatar. (b) The child performs a grab gesture to select a word; the avatar shows visual feedback by closing into a fist. (c) The child bends their body in order to move the selected word to the blank space.
Marvy Learns: A Literacy or Geometry Matching Game
Marvy Learns was used to develop literacy or geometry skills
as determined by the age of the child (ages 8–10 focused on literacy; ages 11 and 12 on geometry). In the context of literacy
proficiency, children needed to identify real-life connections
between items and their attributes. For example, 6 image cards
may be: cloud, water, ice cube, table, soda pop, and fire; with
box labels Liquids, Solid, and Gas. The child must read the
box labels, decide which items correspond to each box, then
move them accordingly (e.g., match fire and cloud to Gas,
water and soda pop to Liquid, and ice cube to Solid). For
sharpening geometry skills, children were asked to either visu-
alise and match shapes that resulted from connecting multiple
points on a grid to their shape names, or match flattened shape-
nets to their 3D representations (Figure 2). The children's body movement was partially mimicked by that of a large blue yeti, Marvy, so the arrangement of item cards occurred as the children moved their bodies in physical space. In this way,
the displayed items with the defined words on the boxes. In
addition, Marvy Learns fosters logical and inductive thinking
through practice of arranging and classifying objects.
Sea Formuli: An Arithmetic Operations Game
Sea Formuli centred on developing algebraic thinking through
practice of arithmetic problems involving whole numbers, frac-
tions and decimals. To solve the presented problems, children
needed to calculate the missing number (operand) or operator
in an equation relating 3 terms, represented as baskets sitting
on the ocean floor. Three floating jellyfish, each labelled with
an operand or an operator, represented potential answers to be
selected. Children performed the mental calculation, selected
the jellyfish containing the correct answer, and then moved the
jellyfish to the empty basket. Each gameplay session consisted
of 5 multiple choice maths questions. Figure 3 illustrates Sea Formuli's interaction flow with the addition question 4.02 + _ = 8.12 and potential answers 4.1, 6.36, and 6.07.
Figure 2: ©Serena Lee-Cultura. A child gesturing through a Marvy Learns game problem using moderate ASR. (a) The child is presented with a collection of geometric shapes to sort into labelled boxes using a fantasy character avatar, Marvy. (b) The child performs a gesture and Marvy's body moves in sync with the child's to select a geometric shape. (c) The child bends their body to place the selected geometric shape into one of the labelled boxes.

Figure 3: ©Serena Lee-Cultura. A child gesturing through a Sea Formuli game problem using high ASR. (a) The child is presented with a multiple choice maths problem to solve using a photorealistic avatar with full movement congruity. (b) The child performs a gesture and the avatar mimics the child to select the jellyfish. (c) The child bends their body to guide the selected jellyfish to the empty basket to complete the equation.

Comparability of MBTG
Though the focus of each game differed (Suffiz concentrated on English skills, Sea Formuli centred on arithmetic, and Marvy Learns targeted geometry or English depending on the grade setting), each question presented was structured
as either a multiple-choice or sorting problem. The sorting
problem can also be viewed as an extension of the multiple-
choice style problem, as for each item to be sorted, the child was required to make a choice from the labelled boxes. In
Suffiz and Sea Formuli, each question had three possible an-
swers. In Marvy Learns there were always three labelled
boxes. Furthermore, the interaction mechanisms that each child used to engage with the game content were identical across the aforementioned games. To answer a question, the child
gestured (using either a delay or grab gesture) to select an item from a collection of items, and then moved the selected item to
a target destination (i.e., blank space, labelled box, or empty
basket). Target destinations were located at the bottom of the
screen in all three games. Thus, the underlying movement that the children were required to perform was to select the desired item and bend in order to place the item into the destination at
the bottom of the screen.
Avatar Self Representation (ASR)
The three games followed similar game mechanics but took
a different approach to children’s ASR. We classify ASR as
low, moderate, and high, according to the visual similarity
(i.e., appearance congruity), and the precision and breadth of movement (i.e., movement congruity) mapped between avatar
and child. A detailed description of the ASR classifications
is provided in Figure 4. A description of each game’s avatar,
together with ranking, follows.
Low ASR: Hand avatar with minimal movement mapping
In Suffiz, the player was represented by a single white hand
avatar, which was mapped to and controlled by the child’s
dominant hand (as indicated by the child prior to commencing
gameplay). Though the hand avatar was dynamic, it did not track the complete range of the child's own hand movement. Rather, the hand avatar mimicked lateral and vertical movement (i.e., it was capable of moving around on screen), and the full range of motion between an open palm (i.e., delay gesture) and a closed fist position (i.e., grabbing gesture). Visual feedback was provided to the player as they performed these motions.
Moderate ASR: Fantasy avatar with partial body mapping
Marvy Learns represented the player as a large blue yeti that
was controlled by the child’s body movement. Similar to the
hand avatar, Marvy was capable of lateral (i.e., side stepping)
and vertical (i.e., slight bending forward and backwards) move-
ment. The fantasy avatar could bend at all major body joints
(i.e., neck, shoulders, elbows, hips and knees). However, al-
though the child’s full body was tracked, the avatar did not
mimic the child’s complete range of motion. For example,
Marvy was unable to turn around, jump or sit on the floor.
High ASR: Realistic-self avatar with full body mapping
Sea Formuli utilised a photo-realistic avatar of the child. This
self-similar avatar mimicked the complete range of the child’s
movements and facial expressions throughout gameplay. This
was accomplished by projecting a video stream of the player
directly into the game setting.
Figure 4: ©Serena Lee-Cultura. Descriptions and examples of the varying degrees of ASR.
Figure 5: ©Serena Lee-Cultura. Experimental setup of a child playing Sea Formuli. Labels indicate the MMD capture devices.
METHODS
Context
During the experimental design phase, researchers worked closely with the Kinems team, as well as local teachers, to create game content that ensured the three ASRs provided an equivalent level of difficulty across the different subjects. Our experiment took place in a science museum and a local public elementary school in Trondheim,
seum and a local public elementary school in Trondheim,
Norway. Children volunteered to participate upon receiving a
thorough explanation of the study by researchers and school
teachers. In both cases, the experiment was conducted by
the researchers/authors, in a room dedicated strictly to the
experiment. The room was organised to accommodate two
experimental setups running in parallel.
Participants
Our sample was composed of 46 typically developing children (28 F, 18 M) with an average age of 10.3 years (SD = 1.32, min = 8, max = 12 years). 30 children participated at the science centre, and 16 at the elementary school. Children participated in 9 gameplay sessions lasting between 25 and 35 minutes in total. Each child received a gift card for their
participation. All procedures were approved by the national
human research ethics organisation and all children and their
guardians provided verbal/written informed assent/consent,
respectively, prior to participation.
Procedure
We conducted a within subjects experiment to investigate the
affective and behavioural processes experienced by children
as they engaged with three educational MBTG, each offering a
different degree of ASR (i.e., low, moderate, and high). After
obtaining legal guardian consent and children’s assent, the
children were given a pair of Tobii eye-tracking glasses, and
an Empatica E4 wristband to wear. For each of the three ASRs,
children played three consecutive sessions: a practice round,
in which researchers assisted the children in understanding the
associated game’s objective and rules, and two non-practice
sessions. We ensured a balanced ASR (i.e., gameplay) order.
None of the children had prior exposure to MBTG. Figure 5
shows a child in action, wearing the data collection devices, together with the experimental setup.
Data Collection
During the study, we collected sensor data from four different
sources: Eye-tracking, facial video, wristband (with sensors
for HRV, blood-pressure, temperature and EDA levels), and
Kinect skeleton data.
Eye-tracking:
We used Tobii eye-tracking glasses, at a 50Hz sampling rate with one-point calibration, to capture children's eye data. The Tobii glasses controller software recorded video documenting the child's field of view via an objective camera built into the nose-bridge of the glasses. Video resolution was 1920x1080 at 25 Frames Per Second (FPS).
Facial Video:
To capture facial expressions, we installed a front-facing Logitech web camera at the top of the gameplay screen. The camera recorded HD video at 10 FPS and was zoomed in to 200% to target the child's face. Prior to running the experiment, we assessed video from an equipment setup trial and determined that this setup produced high quality data of participants' facial expressions.
Empatica E4 wristbands:
We captured children’s wrist-data
according to 4 different variables: HRV (1Hz), EDA (64Hz),
skin temperature (4Hz), and Blood Volume Pulse (4Hz).
Kinect Skeleton:
The skeleton data from the Kinect sensor
was recorded at a sampling rate of 1Hz, and consisted of the
3D position for 20 joints: head, shoulder-centre, spine and
hip-centre, as well as hand, wrist, elbow, shoulder, feet, ankle,
knee, hip (both left and right for the last 8).
Data Pre-processing
Eye-tracking:
Fixations and saccades were identified using Tobii's default algorithm (for details, please see [61]). A filter was applied to remove raw gaze points that were classified as blinks. Pupil dilation is highly susceptible to personal and contextual biases, for example, screen brightness, time of day, pre-existing physical health conditions, and the child's gender, age, and amount of sleep. Thus, we used the first 30 seconds of eye-tracking data to normalise pupil dilation, effectively removing subjective and contextual biases. Further normalisation was obtained using the darkest (i.e., set to maximum) and brightest (i.e., set to minimum) screenshots obtained from the child's complete interaction, to account for screen brightness.
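To make this two-step normalisation concrete, the following minimal sketch shows one plausible implementation in Python. It assumes `pupil` is a 1-D NumPy array of pupil diameters sampled at 50Hz, and `dark`/`bright` are the pupil diameters measured on the darkest and brightest screenshots of the child's interaction; the function name and exact rescaling are our assumptions, not the authors' pipeline.

```python
import numpy as np

def normalise_pupil(pupil: np.ndarray, dark: float, bright: float,
                    hz: int = 50, baseline_s: int = 30) -> np.ndarray:
    # Step 1: use the first 30 seconds as a personal baseline, removing
    # subjective and contextual biases (health, age, sleep, and so on).
    baseline = pupil[: hz * baseline_s].mean()
    pupil = pupil - baseline
    dark, bright = dark - baseline, bright - baseline
    # Step 2: rescale against the luminance extremes so the brightest
    # screenshot maps to 0 (minimum) and the darkest to 1 (maximum).
    return (pupil - bright) / (dark - bright)
```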
Facial Video:
We used the OpenFace recognition algorithm [3], which assesses each frame and assigns a face ID. We also removed each non-participant face (e.g., researchers' faces sometimes appeared, since we used a wide-angle camera).
Wrist band:
We applied a simple smoothing function to the time series of the four data streams obtained by the Empatica E4 device (to remove unwanted spikes). We sectioned our signal into windows, each describing a time segment containing 100 successive data points. Our function examined consecutive windows in the time series and calculated a running average accordingly. Successive windows contained a 50-sample overlap. Similar to the eye-tracking data, the physiological data obtained (namely HR, BVP and skin temperature) are highly susceptible to personal and contextual biases, such as time of day, pre-existing physical health conditions, and the child's gender, age, and amount of sleep. These features were normalised using the first 30 seconds of the data streams to remove the subjective and contextual bias from the data.
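As an illustration, the smoothing step could be implemented as the minimal sketch below, where `signal` is assumed to be one of the four E4 streams as a 1-D NumPy array. The window length (100 samples) and the 50-sample overlap follow the description above; the function name is ours.

```python
import numpy as np

def windowed_running_average(signal: np.ndarray,
                             size: int = 100, overlap: int = 50) -> np.ndarray:
    step = size - overlap                # 50-sample hop between windows
    starts = range(0, max(len(signal) - size + 1, 1), step)
    # One mean per (overlapping) window removes unwanted spikes.
    return np.array([signal[s:s + size].mean() for s in starts])
```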
Table 1: Dependent variables used in this paper.

Dep. Var. | Data source | Definition and literature references
Movement (meters) | Kinect | The total distance travelled by each joint in the skeleton data, averaged over the whole body.
Fatigue (meters/second³) | Kinect | Fatigue is proportional to energy spent. For moving objects, it can be shown that the trajectory with the lowest jerk (rate of change of acceleration) is the least energy consuming. Hence, greater jerk leads to greater fatigue.
Arousal (microsiemens) | Wristband | Arousal is computed from the increasing slope of the EDA. The more positive the slope of the EDA in a given time window, the higher the arousal [49, 34].
Stress (celsius) | Wristband | Stress is computed from the temperature's decreasing slope. The more negative the slope of the temperature in a given time window, the higher the stress [35, 32].
Hand movement (meters) | Wristband | The total hand movement (from the accelerometer) in a given time window, computed as: distance = 0.5 × acceleration × time².
Cognitive load | Eye-tracker | Calculated using the four measures proposed by [15] (i.e., mean pupil diameter, pupil diameter standard deviation, saccade speed, and number of fixations longer than 500ms) for every participant (as has also been used in the CCI literature, see [26]).
Focus | Eye-tracker | The ratio of time spent during local information processing (long fixations, short saccades) to global information processing (short fixations, long saccades) in unit time. The thresholds are 200ms for fixation duration and 11 degrees for saccades [63, 83].
Anticipation (degrees/second) | Eye-tracker | Skewness of the saccade velocity, that is, "how fast the eyes were moving" [72]. Anticipation is related to the individual's familiarity with the interface (stimulus), since familiarity with the interface results in faster eye movement between its different parts [26].
On-task ratio | Eye-tracker | The ratio between on-screen and off-screen gaze during gameplay.
Emotions | Webcam | Extracted from the face images based on facial Action Units (AUs) [30, 82] and the OpenFace framework [3] (a common approach in CCI [71]). The following combinations were used: Happy (AU6, AU10), Sad (AU1, AU4, AU15), Surprise (AU1, AU2, AU5, AU26), Anger (AU4, AU5, AU7, AU23).
Kinect Skeleton: No pre-processing was required.
Variables
Our analysis investigates the effect of ASR (the independent variable) on children's stress, arousal, amount of hand movement, facial expression, fatigue, total body movement, cognitive load, on-task:off-task ratio, and global:local information processing ratio (the dependent variables). The degree of ASR is
conceptualised from the three games (see: section The Motion
Based Touchless Games, subsection Avatar Self Representa-
tion). The dependent variables, together with their literature
references are presented in Table 1.
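To make the Table 1 definitions concrete, the hypothetical helpers below sketch how several of the measures could be computed from the raw streams. The shapes and names are our assumptions rather than the authors' code: `positions` is a (T, 20, 3) array of Kinect joint positions sampled at 1Hz; `eda_window` and `temp_window` are smoothed wristband windows; and `fix_ms`/`sacc_deg` are equal-length arrays pairing each fixation's duration with the amplitude of the saccade that follows it.

```python
import numpy as np

def movement_and_fatigue(positions: np.ndarray, dt: float = 1.0):
    """Movement: total distance travelled per joint, averaged over the
    whole body. Fatigue: mean jerk magnitude (rate of change of
    acceleration); greater jerk implies greater fatigue."""
    step = np.diff(positions, axis=0)              # per-sample displacement
    movement = np.linalg.norm(step, axis=2).sum(axis=0).mean()
    velocity = step / dt
    acceleration = np.diff(velocity, axis=0) / dt
    jerk = np.diff(acceleration, axis=0) / dt      # third derivative of position
    fatigue = np.linalg.norm(jerk, axis=2).mean()
    return movement, fatigue

def slope(window: np.ndarray) -> float:
    """Least-squares slope of a signal window against time."""
    return np.polyfit(np.arange(len(window)), window, 1)[0]

def arousal(eda_window: np.ndarray) -> float:
    return max(slope(eda_window), 0.0)             # increasing EDA slope

def stress(temp_window: np.ndarray) -> float:
    return max(-slope(temp_window), 0.0)           # decreasing temperature slope

def focus(fix_ms: np.ndarray, sacc_deg: np.ndarray) -> float:
    """Ratio of time in local processing (long fixations, short saccades)
    to time in global processing (short fixations, long saccades)."""
    local = fix_ms[(fix_ms > 200) & (sacc_deg < 11)].sum()
    global_ = fix_ms[(fix_ms <= 200) & (sacc_deg >= 11)].sum()
    return local / max(global_, 1)
```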
The Facial Action Coding System (FACS) is a taxonomy of human facial movements. Movements of individual facial muscles are encoded by FACS from slight, instantaneous changes in facial appearance. Using FACS, it is possible to code nearly all anatomically possible facial expressions, deconstructing them into the specific Action Units (AUs) that produced the expression. It is common standard to objectively describe emotions from facial expressions using such techniques [82], with several uses involving children [71].
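As a sketch, the AU combinations listed in Table 1 can be turned into per-frame emotion scores roughly as follows, assuming a pandas DataFrame of per-frame OpenFace intensity outputs (OpenFace names these columns AU01_r, AU04_r, and so on); scoring an emotion as the mean intensity of its constituent AUs is our simplification, not the authors' exact procedure.

```python
import pandas as pd

# AU combinations from Table 1, mapped to OpenFace intensity columns.
EMOTION_AUS = {
    "happy":    ["AU06_r", "AU10_r"],
    "sad":      ["AU01_r", "AU04_r", "AU15_r"],
    "surprise": ["AU01_r", "AU02_r", "AU05_r", "AU26_r"],
    "anger":    ["AU04_r", "AU05_r", "AU07_r", "AU23_r"],
}

def emotion_scores(frames: pd.DataFrame) -> pd.DataFrame:
    """Per-frame emotion score = mean intensity of its constituent AUs."""
    return pd.DataFrame({emotion: frames[aus].mean(axis=1)
                         for emotion, aus in EMOTION_AUS.items()})
```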
Data Analysis
We conducted an analysis of variance (ANOVA) using the degree of ASR as the independent variable and the MMD variables (Table 1) as the dependent variables. Prior to the main analysis, we conducted normality (Shapiro-Wilk [67]) and homoscedasticity (Breusch-Pagan [13]) tests to ensure that the data satisfied the necessary preconditions for ANOVA. When normality was not satisfied, we normalised the data by subtracting the mean and dividing by the standard deviation (i.e., for "cognitive load"). When homoscedasticity was not satisfied, we applied ANOVA with the Welch correction [86] (i.e., for "focus"). Once the ANOVA yielded a significant relationship between the independent and dependent variables, we used pairwise one-way ANOVA comparisons to determine the differences between the three conditions of the independent variable. We also used Pearson correlation to examine potential bias from age, a paired t-test to examine potential bias from gender, and ANOVA to examine the order effect of the ASR.
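A minimal sketch of this pipeline for a single dependent variable is given below. It assumes a long-format pandas DataFrame `df` with an "asr" column (low/moderate/high) and one column per dependent variable; the helper name is ours, and we use pingouin's welch_anova for the Welch correction.

```python
import pandas as pd
import pingouin as pg
import statsmodels.formula.api as smf
from scipy.stats import shapiro, f_oneway
from statsmodels.stats.diagnostic import het_breuschpagan

def test_dependent_variable(df: pd.DataFrame, dv: str):
    groups = [g[dv].to_numpy() for _, g in df.groupby("asr")]

    # Precondition 1: normality per group (Shapiro-Wilk). If violated,
    # normalise as in the paper: subtract the mean, divide by the SD.
    if any(shapiro(g).pvalue < .05 for g in groups):
        df = df.assign(**{dv: (df[dv] - df[dv].mean()) / df[dv].std()})
        groups = [g[dv].to_numpy() for _, g in df.groupby("asr")]

    # Precondition 2: homoscedasticity (Breusch-Pagan on group residuals).
    model = smf.ols(f"{dv} ~ C(asr)", data=df).fit()
    _, bp_pvalue, _, _ = het_breuschpagan(model.resid, model.model.exog)

    if bp_pvalue < .05:
        # Heteroscedastic data: one-way ANOVA with the Welch correction.
        return pg.welch_anova(data=df, dv=dv, between="asr")
    return f_oneway(*groups)   # standard one-way ANOVA
```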
RESULTS
Initially, we checked for ASR order bias on the dependent variables. This did not yield significant results for any of the orders utilised. Nor did we observe any biases emerging from the children's age, gender, or the experimental context (museum or school setting) on any of the dependent variables.
Table 2: Pairwise comparisons of the different degrees of ASR.

Dep. var. | high vs mod (F, p) | high vs low (F, p) | mod vs low (F, p)
Movement | 19.64, .001 | 27.35, .001 | 43.06, .0001
Fatigue | 19.95, .0001 | 26.23, .0001 | 52.99, .0001
Arousal | 15.34, .0001 | 47.9, .0001 | 21.97, .0001
Stress | 24.11, .0001 | 64.49, .0001 | 4.81, .03
Cognitive load | 4.52, .03 | 63.8, .0001 | 45.99, .0001
Focus | 2.63, .10 | 4.03, .04 | 1.59, .21
The rest of this section presents the main effect of the degree
of ASR on the dependent variables. The post-hoc pairwise
tests are reported in Table 2.
The results from the ANOVA show a significant difference in the amount of stress associated with the different degrees of ASR (F[2,43] = 72.12, p = 0.00001). The highest stress levels were observed when children used low ASR, followed by moderate ASR, and then high ASR. Further, arousal is also related to the degree of self-representation (F[2,43] = 66.26, p = 0.00001), with high ASR proving to be the most arousing, followed by moderate ASR and then low ASR.
Moreover, children's cognitive load is significantly related to the degree of ASR (F[2,43] = 83.31, p = 0.00001). High ASR is the most cognitively demanding, followed by moderate ASR and, lastly, low ASR. Focus, as measured using eye-tracking data, also relates to the degree of ASR (F[2,43] = 3.29, p = 0.04). High ASR yields the most local processing, while the moderate and low ASRs account for more global processing. There is no significant difference between the focus for the moderate and low ASRs.
Finally, we observe a significant difference in the total amount of movement corresponding to the degree of ASR (F[2,43] = 72.12, p = 0.00001). Moderate ASR incurs the greatest total amount of movement, followed by high ASR. The least amount of movement transpires with low ASR. There is also a significant difference in children's fatigue for the different degrees of ASR (F[2,43] = 71.81, p = 0.00001). Using low ASR results in the lowest fatigue, followed by high and moderate ASR, respectively.
However, we did not encounter a significant difference in anticipation levels (F[2,43] = 0.49, p = 0.61), on-task ratio (from the eye-tracker, F[2,43] = 1.12, p = 0.31), emotions (from facial video; Happy: F[2,43] = 1.89, p = 0.17; Sad: F[2,43] = 0.13, p = 0.71; Surprise: F[2,43] = 0.14, p = 0.70; Anger: F[2,43] = 1.34, p = 0.24) or total hand movement (from the wristband accelerometer, F[2,43] = 0.12, p = 0.72) across the differing degrees of ASR.

Figure 6: ©Serena Lee-Cultura. Children's MMD measurements of the dependent variables per condition (low, moderate and high ASR). The blue bars in all the figures show the 95% confidence interval. Statistically significant differences are marked with * for p <= .05, ** for p <= .001 and *** for p <= .0001.
DISCUSSION
This study investigates the effects of the degree of Avatar Self-
Representation (ASR) on children’s affective and behavioural
processes while playing educational MBTG. From the data
analysis, we observed that the degree of ASR has a direct
effect on children’s arousal, stress, focus, cognitive load, total
body movement, and fatigue. This section presents plausible
interpretations of the results, followed by implications for
research and design.
Interpretation of Results
We observe that arousal (measured by EDA) and stress (measured by skin temperature) are significantly different across the degrees of ASR (Figures 6a and 6d). Arousal significantly increases as we move from low ASR to moderate ASR and to high ASR (Figure 6a). Conversely, stress significantly decreases as we move from low ASR to moderate ASR and to high ASR (Figure 6d). Regarding arousal, prior research provides evidence that arousal as measured using EDA corresponds to higher levels of engagement [50, 21, 12]. In our case, arousal increases with the increasing degree of ASR; combined with these previous findings [50, 21, 12], this indicates that children were most engaged using high ASR and least engaged with low ASR. A primary reason for this could be the "playfulness" of the avatar and the intrigue that each ASR carries within itself: the higher the ASR, the higher the intrigue, and thus the higher the engagement (rising EDA). The "playfulness" encoded into the ASR could also explain why children feel least stressed [75, 29] when they are interacting with the high ASR and most stressed when interacting with the low ASR.
We observed a significant difference in children's focus for the different degrees of ASR (Figure 6b); however, the only pairwise difference existed between low and high ASR. Recall that focus depends on the length of time allocated to fixations and saccades (see section Variables) over a fixed time interval. The low ASR incurred the least amount of focus, followed by the moderate ASR and then the high ASR. We argue that the statistical difference is directly linked to the amount of visual detail and movement congruity given by each avatar. The low ASR was a white hand avatar that was small, offered nominal aesthetic detail, and provided minimal movement mapping (the avatar's range of motion covered flat open hand to closed fist). Children might have instantly felt familiar with the characteristics of this style of avatar, as it is a minor extension of the conventional cursor. Consequently, this avatar does not necessitate a great deal of visual attention to understand, and so children spent less time looking at it. This opened up more time to explore the rest of the on-screen content, which in turn increased global information processing and reduced children's focus. Moreover, when using the high ASR, children were presented with a photo-realistic avatar of themselves. Children demonstrated excitement at seeing themselves projected within the game, and were visually engaged with their avatar. We suggest that children enjoyed looking at themselves on-screen and accordingly spent more time fixated on the fine-grained details of their avatar's face, which led to heightened local processing (focus).
There was also a significant relation between the different degrees of ASR and cognitive load, with cognitive load increasing in conjunction with the degree of ASR (Figure 6e). Although playfulness is related to children having fun [45], playing with the avatar is a (secondary) task in its own right and carries additional cognitive load beyond the main task at hand (i.e., solving the game question). Thus, a greater degree of ASR leads to greater playfulness (cognitive load) while solving problems. Essentially, children are performing a pseudo dual task [20], in which the subject concurrently performs a primary and a secondary task: playing (i.e., interacting with the avatar) and solving the problem. However, our findings suggest that such behaviour could lead to cognitive overload, which might result in children making mistakes while solving the problem [33, 51].
There was a significant relation between the different degrees of ASR and children's fatigue (calculated as the change in acceleration over time, Figure 6c) and total movement (Figure 6f). Children moved in a confined space (based on the limitations of Kinect sensing); this incited several directional changes, caused changes in acceleration, and consequently increased fatigue. Children performed the most movement when using moderate ASR, followed by high ASR, and lastly low ASR (Figure 6f). We attribute this ordering to 1) the process of the child learning to understand (how to control) the avatar, and 2) child-avatar playfulness. It may be tempting to attribute the difference in total movement to the game mechanics of problem presentation (i.e., Marvy Learns offers sorting problems, whereas Suffiz and Sea Formuli deliver multiple choice). However, in Marvy Learns we regard each item to be sorted as a separate multiple choice question. Thus, each Marvy Learns gameplay session is composed of 6 multiple choice questions to be solved, which is similar to the 5 multiple choice questions offered per game session in Suffiz and Sea Formuli.
In order to participate in the MBTG, children must understand the avatar's control mechanisms. As mentioned previously, children may already feel familiar with the low ASR due to its likeness to the computer mouse, and therefore need not invest much in understanding it. However, children most likely needed to become accustomed to the movement congruity between themselves and both the high and moderate ASRs. This is supported by several children revealing that this was their first time playing MBTG, and expressing that they "have never seen [them]self in a game before". Moreover, the moderate ASR placed the greatest demand on children to explore the avatar's control mechanisms, as the movements of the moderate ASR were only partially synchronised with the movements of the child.

On the other hand, some children did not invest much into understanding the avatar's range of motion and instead played within the realm of the selection mode (grab or delay). With
these children, researchers often witnessed a Eureka! mo-
ment precisely when they realised that they could control the
avatar’s movement beyond the assigned gesture. This surprise
discovery caused children to cease answering questions and engage in playfulness with their avatar, producing additional
movement. Furthermore, these Eureka! moments frequently
occurred when using the fantasy avatar, as the realistic avatar
offered little surprise because it looked and moved in complete
synchronicity with the child.
Moreover, we postulate that the increased avatar anthropomorphism offered by the different ASRs induces increased playfulness (i.e., controlling the avatar for purposes beyond the intended activity, such as making it dance). This playfulness often took the shape of children instructing the avatar to perform exuberant movements: movements with a much greater range of motion than those normally required to select items on-screen and answer the questions. We observed that children played with the moderate ASR at the greatest frequency, and propose that children find it more fun to play with than the high ASR. In addition, the Eureka! moments that accompanied use of the moderate ASR also incited an increased amount of playfulness. In summary, we infer that children move the least when using low ASR, based on prior experience with similar cursors and a lack of affordances leading to minimal play; and they move the most using moderate ASR because it requires the most exploration in order to become familiar with its movement patterns and stimulates avatar playfulness.
Implications for research
Contemporary CCI research is mainly centred on subjective
measures, such as motivation [81, 87] and enjoyment [81] to
evaluate children’s experience with educational games. Re-
searchers rarely leverage the full capacities of MMD to assess
children’s experiences. However, recent studies suggest that
MMD are capable of providing actionable insights of chil-
dren’s experiences [71, 26, 62]. The present study is one of
the first to use a wide range of MMD (mobile eye-tracking
data, wrist-data, facial videos and Kinect skeleton data) to
examine children’s experience. When children were initially
asked to wear the equipment, they wanted to know more about
the purpose of the study and the functionality of the equipment.
Therefore, the researchers had to spend time explaining the
purpose and functionalities of the sensing devices. Most of the
children (and parents) had never seen anything similar before
and were excited to wear the glasses. Therefore, when planning to utilise sensing devices in research, we highlight that it is not enough to describe the details in the consent form. Rather, it is extremely important to engage in discussion with the children and parents to explain the rationale and added value of such data collections.
Utilising MMD in CCI research also has some practical challenges. Children's constant movement during the activity, and the fact that these devices (e.g., wristbands, glasses and other wearables) are mainly designed for adults (e.g., size, weight, tolerance), need to be considered during the research design.
Difficulties aside, we found that contemporary sensing devices
(Empatica E4 and Tobii glasses) have the capacity to be used
with children. Therefore, despite the additional investment
in discussions concerning equipment, device calibration, and
gameplay issues that might arise (e.g., stopping if the child
feels tired and/or removing the equipment if need be), con-
temporary wearables and ubiquitous sensing devices are a
valuable solution that enable the unveiling of rich interactions
in the context of CCI.
Implications for design
We offer a collection of analysis driven design implications.
Reduce Fatigue with Low ASR: We observed that children’s
fatigue was lowest when using low ASR. Thus, we recommend using cursor-style avatars in cases where educational MBTG are intended to be played for a long duration. For example,
children that might need lots of practice on a given topic, like
fractions [53, 43, 65].
Maximise Total Movement with Moderate ASR: To promote
increased total body movement, it is important to utilise a
moderate degree of ASR. As seen with Marvy Learns, use of
moderate ASR encourages children to engage in additional
overall movement through exploration to develop increased
understanding of the avatar’s movement patterns and through
playful mimicry. Though the former might be attributed to a novelty effect, in many cases the latter playfulness was observed throughout all three rounds of children's Marvy Learns gameplay. Thus, we do not suspect novelty to be the root cause.
However, it should be highlighted that greater total movement comes at the trade-off of increased fatigue, as discussed in the Interpretation of Results section. This is particularly relevant to MBTGs designed to address children's physical/occupational therapy needs.
Increase Engagement by Increasing ASR: Children’s arousal
(as measured by EDA [50, 21, 12]) increased in tandem with
their avatar’s anthropomorphism. Arousal may be a critical
precursor to learning and flow [28, 31]. Thus, the benefits
coupled with using high ASR could be utilised, particularly
for educational games that concern topics that are typically
viewed by young children as difficult or less engaging.
Reduce Stress by Increasing ASR: Contrary to the case of arousal, stress levels decrease as the degree of ASR increases. Since stress can have the capacity to negate/deter children's learning and flow [54, 69], in order to endorse continuous engagement we recommend pairing high ASR with already difficult topics to manage children's stress levels. However, based on the work of Bhattacharya et al. [9], this recommendation might not extend to certain demographics. Thus, it is advisable to be mindful of any special needs or concerns belonging to children's demographic.
Trade-off Between Engagement, Stress and Cognitive Load:
A key consideration that emerged was the trade-off between
children’s arousal (engagement) and stress, and their cognitive
load. Using high ASR demonstrated the capacity to create de-
sired gameplay/learning conditions (i.e., high engagement and
low stress [57, 79, 54, 69]) at the expense of high cognitive load. Though cognitive load is not negative per se, high cognitive load situations can potentially lead to cognitive overload, which can have detrimental effects on task completion [24, 23].
Furthermore, children each manage cognitive load differently,
thus we caution designers to consider incorporating choice of
ASR to accommodate these nuances.
Limitations
The findings of this paper support our initial proposition that
the inherent benefits of sensing technologies (e.g., automatic,
pervasive, temporal measures) have the capacity to support
CCI research. However, our findings are subject to certain
limitations. The participants of our study were 8–12 year old children, representing an appropriate sample for our study, because we wanted children who could effectively read and
provide cognitively-processed responses. However, younger
or older populations might produce slightly different results
and their needs in ASR might differ. To test our RQ, we
conducted an in-situ study; such studies produce data of high ecological validity, but are vulnerable to potential disruptions and noise. In our case, however, such disruptions were very rare, as we had an isolated space in both the science museum and the school; in turn, the data quality was very high.
Measuring children's experience via MMD involves inference, and inference involving complex psycho-physiological constructs carries a degree of error. We captured four different
data sources. We selected state-of-the-art sensing devices
(e.g., Tobii glasses, Empatica E4), and data streams that have
been used to infer various learning and user experience-related
constructs in previous works (e.g., [71, 26, 62]). Thus, al-
though different methodological decisions might have had
a slight impact on the results, our approach followed valid
and time-tested devices and variables. Consideration of additional data streams (e.g., audio) and data collections (e.g., interviews) may have offered additional insights; however, this does not weaken the results of this study. Finally, in our ap-
proach we used various measurements to portray children’s
experience (i.e., the dependent variables). These measure-
ments are widely used, and their selection was grounded in the
literature; however, it is arguable that different variables could
have been used. Thus, although we followed an ecological,
but also accurate, research design, we understand that other
methodological decisions may have played an important role
in the results. However, our methodology includes a robust
set of data streams that are common to contemporary HCI and
learning sciences research. Lastly, we recognise that our work constitutes a single short-term study with findings determined through children's first-time gameplay impressions. Future work, including longitudinal studies, is needed to determine whether the findings hold true over time, and to investigate the effect of ASR on children's learning experiences from new contexts and perspectives.
CONCLUSION
In this paper, we ask how the degree of ASR relates to children's affect and behaviour in educational MBTG. We motivate our research question with relevant literature and present an experiment that investigates the relationship between three degrees of ASR and children's affective and behavioural states, in the context of educational MBTG. We conclude that ASR has an impact on children's arousal, stress, focus, cognitive load, total body movement and fatigue. Although our work does not focus on learning gains per se (i.e., we did not perform pre- or post-assessment), it highlights the value of ASR in educational MBTG and uncovers the profound capacities of using MMD in research addressing children's embodied learning experiences. We provide guidelines to direct educational MBTG designers to create more informed learning experiences for children. Lastly, our findings emphasise the need for additional studies to investigate how ASR impacts children's behavioural, interaction, cognitive and learning processes.
SELECTION AND PARTICIPATION OF CHILDREN
All the study’s participants were students from public schools
in Trondheim, Norway. The study took place at a science mu-
seum (Vitensenteret) and a primary school, in rooms strictly designated for the experimental setup. Data related to the study
were collected after approval from the national Data Protection
Official for Research (Norsk Senter for Forskningsdata), fol-
lowing all the regulations and recommendations for research
with children. A researcher contacted the teacher and legal
guardian of each child to obtain written consent permitting the
data collection. Children were informed about the data collec-
tion process and their participation in the study was completely
voluntary. In addition, children were able to withdraw their
consent for the data collection at any time without affecting
their participation in the activity.
ACKNOWLEDGEMENTS
We would like to thank Kinems Inc. for providing free user licenses for its educational gaming platform, as well as access to the skeletal data.
REFERENCES
[1] Takayuki Adachi, Masafumi Goseki, Keita Muratsu,
Hiroshi Mizoguchi, Miki Namatame, Masanori
Sugimoto, Fusako Kusunoki, Etsuji Yamaguchi,
Shigenori Inagaki, and Yoshiaki Takeda. 2013. Human
SUGOROKU: full-body interaction system for students
to learn vegetation succession. In Proceedings of the
12th International Conference on Interaction Design
and Children. ACM, 364–367.
[2] Efthimios Alepis. 2011. AFOL: Towards a new
intelligent interactive programming language for
children. In Intelligent Interactive Multimedia Systems
and Services. Springer, 199–208.
[3] Brandon Amos, Bartosz Ludwiczuk, Mahadev
Satyanarayanan, and others. 2016. Openface: A
general-purpose face recognition library with mobile
applications. CMU School of Computer Science 6
(2016).
[4] Domna Banakou, Raphaela Groten, and Mel Slater.
2013. Illusory ownership of a virtual child body causes
overestimation of object sizes and implicit attitude
changes. Proceedings of the National Academy of
Sciences 110, 31 (2013), 12846–12851.
[5] Maria Bannert, Inge Molenaar, Roger Azevedo, Sanna
Järvelä, and Dragan Gašević. 2017. Relevance of
learning analytics to measure and support students’
learning in adaptive educational technologies. In
Proceedings of the Seventh International Learning
Analytics & Knowledge Conference. ACM, 568–569.
[6] Laura Bartoli, Clara Corradi, Franca Garzotto, and
Matteo Valoriani. 2013. Exploring motion-based
touchless games for autistic children’s learning. In
Proceedings of the 12th international conference on
interaction design and children. ACM, 102–111.
[7] Laura Bartoli, Franca Garzotto, Mirko Gelsomini, Luigi
Oliveto, and Matteo Valoriani. 2014. Designing and
evaluating touchless playful interaction for ASD
children. In Proceedings of the 2014 conference on
Interaction design and children. ACM, 17–26.
[8] Amy L Baylor. 2011. The design of motivational agents
and avatars. Educational Technology Research and
Development 59, 2 (2011), 291–300.
[9] Arpita Bhattacharya, Mirko Gelsomini, Patricia
Pérez-Fuster, Gregory D Abowd, and Agata Rozga.
2015. Designing motion-based activities to engage
students with autism in classroom settings. In
Proceedings of the 14th International Conference on
Interaction Design and Children. ACM, 69–78.
[10] Max V Birk, Cheralyn Atkins, Jason T Bowey, and
Regan L Mandryk. 2016. Fostering intrinsic motivation
through avatar identification in digital games. In
Proceedings of the 2016 CHI Conference on Human
Factors in Computing Systems. ACM, 2982–2995.
[11] Matthew P Black, Daniel Bone, Marian E Williams,
Phillip Gorrindo, Pat Levitt, and Shrikanth Narayanan.
2011. The USC CARE corpus: Child-psychologist
interactions of children with autism spectrum disorders.
In Twelfth Annual Conference of the International
Speech Communication Association.
[12] Boyan Bontchev and Dessislava Vassileva. 2016.
Assessing engagement in an emotionally-adaptive
applied game. In Proceedings of the Fourth
International Conference on Technological Ecosystems
for Enhancing Multiculturality. ACM, 747–754.
[13] Trevor S Breusch and Adrian R Pagan. 1979. A simple
test for heteroscedasticity and random coefficient
variation. Econometrica: Journal of the Econometric
Society (1979), 1287–1294.
[14] Stephen Bronack, Richard Riedl, and John Tashner.
2006. Learning in the zone: A social constructivist
framework for distance education in a 3-dimensional
virtual world. Interactive Learning Environments 14, 3
(2006), 219–232.
[15] Ricardo Buettner. 2013. Cognitive workload of humans
using artificial intelligence systems: towards objective
measurement applying eye-tracking technology. In
Annual Conference on Artificial Intelligence. Springer,
37–48.
[16] Caitlyn Clabaugh, Fei Sha, Gisele Ragusa, and Maja
Mataric. 2015. Towards a personalized model of number
concepts learning in preschool children. In Proceedings
of the ICRA Workshop on Machine Learning for Social
Robotics, Seattle, WA, USA. 26–30.
[17] Jonathan Cohen. 2001. Defining identification: A
theoretical look at the identification of audiences with
media characters. Mass Communication & Society 4, 3
(2001), 245–264.
[18] Ciera Crowell. 2018. Analysis of Interaction Design and
Evaluation Methods in Full-Body Interaction for Special
Needs. In 23rd International Conference on Intelligent
User Interfaces. ACM, 673–674.
[19] Ciera Crowell, Batuhan Sayis, Juan Pedro Benitez, and
Narcis Pares. 2019. Mixed Reality, Full-Body
Interactive Experience to Encourage Social Initiation for
Autism: Comparison with a Control Nondigital
Intervention. Cyberpsychology, Behavior, and Social
Networking (2019).
[20] Wim De Neys and Walter Schaeken. 2007. When people
are more logical under cognitive load: Dual task impact
on scalar implicature. Experimental Psychology 54, 2
(2007), 128–133.
[21] Elena Di Lascio, Shkurta Gashi, and Silvia Santini.
2018. Unobtrusive Assessment of Students’ Emotional
Engagement During Lectures Using Electrodermal
Activity Sensors. Proceedings of the ACM on Interactive,
Mobile, Wearable and Ubiquitous Technologies 2, 3
(2018), 103.
[22] Michele D Dickey. 2005. Three-dimensional virtual
worlds and distance learning: two case studies of Active
Worlds as a medium for distance education. British
Journal of Educational Technology 36, 3 (2005),
439–451.
[23] Michel Fayol, Pierre Largy, and Patrick Lemaire. 1994.
Cognitive overload and orthographic errors: When
cognitive overload enhances subject–verb agreement
errors. A study in French written language. The
Quarterly Journal of Experimental Psychology 47, 2
(1994), 437–464.
[24] Julia R Fox, Byungho Park, and Annie Lang. 2007.
When available resources become negative resources:
The effects of cognitive overload on memory sensitivity
and criterion bias. Communication Research 34, 3
(2007), 277–296.
[25] Franca Garzotto, Mirko Gelsomini, Luigi Oliveto, and
Matteo Valoriani. 2014. Motion-based touchless
interaction for ASD children: a case study. In
Proceedings of the 2014 International Working
Conference on Advanced Visual Interfaces. ACM,
117–120.
[26] M. N. Giannakos, S. Papavlasopoulou, and K. Sharma.
2020. Monitoring Children’s Learning Through
Wearable Eye-Tracking: The Case of a Making-Based
Coding Activity. IEEE Pervasive Computing (2020),
1–12. DOI:
http://dx.doi.org/10.1109/MPRV.2019.2941929
[27] Michail N Giannakos, Kshitij Sharma, Sofia
Papavlasopoulou, Ilias O Pappas, and Vassilis Kostakos.
2020. Fitbit for learning: Towards capturing the learning
experience using wearable sensing. International
Journal of Human-Computer Studies 136 (2020),
102384.
[28] MP Jacob Habgood and Shaaron E Ainsworth. 2011.
Motivating children to learn effectively: Exploring the
value of intrinsic integration in educational games. The
Journal of the Learning Sciences 20, 2 (2011), 169–206.
[29] Gary Hackbarth, Varun Grover, and Y Yi Mun. 2003.
Computer playfulness and anxiety: positive and negative
mediators of the system experience effect on perceived
ease of use. Information & Management 40, 3 (2003),
221–232.
[30] Paul Ekman, Wallace V Friesen, and Joseph C Hager. 2002. Facial Action Coding System: The Manual on CD ROM. (2002).
[31] Juho Hamari, David J Shernoff, Elizabeth Rowe,
Brianno Coller, Jodi Asbell-Clarke, and Teon Edwards.
2016. Challenging games help students learn: An
empirical study on engagement, flow and immersion in
game-based learning. Computers in Human Behavior 54
(2016), 170–179.
[32] Noriaki Harada. 2002. Cold-stress tests involving finger
skin temperature measurement for evaluation of vascular
disorders in hand-arm vibration syndrome: review of the
literature. International Archives of Occupational and Environmental Health 75, 1-2 (2002), 14–19.
[33] Joanne L Harbluk, Y Ian Noy, Patricia L Trbovich, and
Moshe Eizenman. 2007. An on-road assessment of
cognitive distraction: Impacts on drivers’ visual
behavior and braking performance. Accident Analysis &
Prevention 39, 2 (2007), 372–379.
[34] Uri Hasson, Orit Furman, Dav Clark, Yadin Dudai, and
Lila Davachi. 2008. Enhanced intersubject correlations
during movie viewing correlate with successful episodic
encoding. Neuron 57, 3 (2008), 452–462.
[35] Katherine A Herborn, James L Graves, Paul Jerem,
Neil P Evans, Ruedi Nager, Dominic J McCafferty, and
Dorothy EF McKeegan. 2015. Skin temperature reveals
the intensity of acute stress. Physiology & Behavior 152
(2015), 225–230.
[36] Bruce D Homer, Charles K Kinzer, Jan L Plass,
Susan M Letourneau, Dan Hoffman, Meagan Bromley,
Elizabeth O Hayward, Selen Turkay, and Yolanta
Kornak. 2014. Moved to learn: The effects of
interactivity in a Kinect-based literacy game for
beginning readers. Computers & Education 74 (2014),
37–49.
[37] Juan Pablo Hourcade, Alissa Antle, Lisa Anthony, Jerry
Fails, O Iversen, Elisa Rubegni, Mikael Skov, Petr
Slovak, Greg Walsh, Anja Zeising, and others. 2018a.
Child-computer interaction, ubiquitous technologies,
and big data. interactions 25, 6 (2018), 78–81.
[38] Juan Pablo Hourcade, Anja Zeising, Ole Sejer Iversen,
Mikael B Skov, Alissa N Antle, Lisa Anthony,
Jerry Alan Fails, and Greg Walsh. 2018b.
Child-Computer Interaction SIG: Ubiquity and Big
Data–A Changing Technology Landscape for Children.
In Extended Abstracts of the 2018 CHI Conference on
Human Factors in Computing Systems. ACM, SIG07.
[39] Hui-mei Justina Hsu. 2011. The potential of Kinect in
education. International Journal of Information and
Education Technology 1, 5 (2011), 365.
[40] Katherine Isbister, Mike Karlesky, Jonathan Frye, and
Rahul Rao. 2012. Scoop!: a movement-based math
game designed to reduce math anxiety. In CHI’12
extended abstracts on human factors in computing
systems. ACM, 1075–1078.
[41] Keri Johnson, Jebediah Pavleas, and Jack Chang. 2013.
Kinecting to mathematics through embodied
interactions. Computer 46, 10 (2013), 101–104.
[42] Marcel A Just, Patricia A Carpenter, and Jacqueline D
Woolley. 1982. Paradigms and processes in reading
comprehension. Journal of Experimental Psychology:
General 111, 2 (1982), 228.
[43] Constance Kamii and Faye B Clark. 1995. Equivalent
fractions: Their difficulty and educational implications.
The Journal of Mathematical Behavior 14, 4 (1995),
365–378.
[44] Dominic Kao and D Fox Harrell. 2018. The Effects of
Badges and Avatar Identification on Play and Making in
Educational Games. In Proceedings of the 2018 CHI
Conference on Human Factors in Computing Systems.
ACM, 600.
[45] Kristina Knaving and Staffan Björk. 2013. Designing for
fun and play: exploring possibilities in design for
gamification. In Proceedings of the first International
conference on gameful design, research, and
applications. ACM, 131–134.
[46] Maria Kourakli, Ioannis Altanis, Symeon Retalis,
Michail Boloudakis, Dimitrios Zbainos, and Katerina
Antonopoulou. 2017. Towards the improvement of the
cognitive, motoric and academic skills of students with
special educational needs using Kinect learning games.
International Journal of Child-Computer Interaction 11
(2017), 28–39.
[47] Chronis Kynigos, Zacharoula Smyrnaiou, and Maria
Roussou. 2010. Exploring rules and underlying concepts
while engaged with collaborative full-body games. In
Proceedings of the 9th International Conference on
Interaction Design and Children. ACM, 222–225.
[48] Elwin Lee, Xiyuan Liu, and Xun Zhang. 2012. Xdigit:
An arithmetic Kinect game to enhance math learning experiences. (2012). Retrieved February 14, 2013.
[49] Dominik Leiner, Andreas Fahr, and Hannah Früh. 2012.
EDA positive change: A simple algorithm for
electrodermal activity to measure general audience
arousal during media exposure. Communication
Methods and Measures 6, 4 (2012), 237–250.
[50] Todd P Levine, Elisabeth Conradt, Matthew S Goodwin,
Stephen J Sheinkopf, and Barry Lester. 2014.
Psychophysiological arousal to social stress in autism
spectrum disorders. Comprehensive Guide to Autism
(2014), 1177–1193.
[51] Han-Chin Liu, Meng-Lung Lai, and Hsueh-Hua Chuang.
2011. Using eye-tracking technology to investigate the
redundant effect of multimedia web pages on viewers’
cognitive processes. Computers in Human Behavior 27, 6
(2011), 2410–2417.
[52] Christian S Loh. 2012. Information trails: In-process
assessment of game-based learning. In Assessment in
game-based learning. Springer, 123–144.
[53] Hugues Lortie-Forgues, Jing Tian, and Robert S Siegler.
2015. Why is learning fraction and decimal arithmetic
so difficult? Developmental Review 38 (2015), 201–221.
[54] Victoria Luine, Miriam Villegas, Carlos Martinez, and
Bruce S McEwen. 1994. Repeated stress causes
reversible impairments of spatial memory performance.
Brain Research 639, 1 (1994), 167–170.
[55] Heidy Maldonado and Clifford Nass. 2007. Emotive
characters can make learning more productive and
enjoyable: it takes two to learn to tango. Educational
Technology (2007), 33–38.
[56] Manon Marinussen and Alwin de Rooij. 2019. Being
yourself to be creative: How using self-similar avatars
can support the generation of original ideas in virtual
environments. In ACM Creativity and Cognition 2019.
[57] Gerald Matthews, Sian E Campbell, Shona Falconer,
Lucy A Joyner, Jane Huggins, Kirby Gilliland, Rebecca
Grier, and Joel S Warm. 2002. Fundamental dimensions
of subjective state in performance settings: Task
engagement, distress, and worry. Emotion 2, 4 (2002),
315.
[58] Camille McCue. 2008. Tween Avatars: What do online
personas convey about their makers?. In Society for
Information Technology & Teacher Education
International Conference. Association for the
Advancement of Computing in Education (AACE),
3067–3072.
[59] Behnaz Nojavanasghari, Charles E Hughes, and
Louis-Philippe Morency. 2017. Exceptionally social:
Design of an avatar-mediated interactive system for
promoting social skills in children with autism. In
Proceedings of the 2017 CHI Conference Extended
Abstracts on Human Factors in Computing Systems.
ACM, 1932–1939.
[60] Ayumi Ohnishi, Kaoru Saito, Tsutomu Terada, and
Masahiko Tsukamoto. 2017. Toward Interest Estimation
from Head Motion Using Wearable Sensors: A Case
Study in Story Time for Children. In International
Conference on Human-Computer Interaction. Springer,
353–363.
[61] A Olsen. 2012. The Tobii I-VT fixation filter: Algorithm description [white paper]. Retrieved from Tobii Technology: http://www.tobiipro.com/siteassets/tobiipro/learn-and-support/analyze/how-do-we-classify-eyemovements/tobii-pro-i-vtfixation-filter.pdf (2012).
[62] Sofia Papavlasopoulou, Kshitij Sharma, Michail
Giannakos, and Letizia Jaccheri. 2017. Using
eye-tracking to unveil differences between kids and
teens in coding activities. In Proceedings of the 2017
Conference on Interaction Design and Children. ACM,
171–181.
[63] Ilias Pappas, Kshitij Sharma, Patrick Mikalef, and
Michail Giannakos. 2018. Visual Aesthetics of
E-Commerce Websites: An Eye-Tracking Approach.
(2018).
[64] Luis P Prieto, Kshitij Sharma, Yun Wen, and Pierre
Dillenbourg. 2015. The burden of facilitating
collaboration: towards estimation of teacher
orchestration load using eye-tracking measures.
International Society of the Learning Sciences, Inc. [ISLS].
[65] Lauren B Resnick, Pearla Nesher, Francois Leonard,
Maria Magone, Susan Omanson, and Irit Peled. 1989.
Conceptual bases of arithmetic errors: The case of
decimal fractions. Journal for Research in Mathematics Education (1989), 8–27.
[66] Alfred P Rovai. 2002. Building sense of community at a
distance. The International Review of Research in Open
and Distance Learning 3, 1 (2002).
[67] J Patrick Royston. 1982. An extension of Shapiro and
Wilk’s W test for normality to large samples. Journal of
the Royal Statistical Society: Series C (Applied
Statistics) 31, 2 (1982), 115–124.
[68] Elisa Rubegni, Vito Gentile, Alessio Malizia, Salvatore
Sorce, and Niko Kargas. 2019. Child-display interaction:
exploring avatar-based touchless gestural interfaces. In
Proceedings of the 8th ACM International Symposium
on Pervasive Displays. ACM, 23.
[69] Joseph Sharit and Sara J Czaja. 1994. Ageing,
computer-based task performance, and stress: issues and
challenges. Ergonomics 37, 4 (1994), 559–577.
[70] Kshitij Sharma, Zacharoula Papamitsiou, and Michail
Giannakos. 2019a. Building pipelines for educational
data using AI and multimodal analytics: A “grey-box”
approach. British Journal of Educational Technology
(2019).
[71] Kshitij Sharma, Sofia Papavlasopoulou, and Michail
Giannakos. 2019b. Joint Emotional State of Children
and Perceived Collaborative Experience in Coding
Activities. In Proceedings of the 18th ACM International
Conference on Interaction Design and Children. ACM,
133–145.
[72] AC Smit and JAM Van Gisbergen. 1989. A short-latency
transition in saccade dynamics during square-wave
tracking and its significance for the differentiation of
visually-guided and predictive saccades. Experimental
Brain Research 76, 1 (1989), 64–74.
[73] Carmen Petrick Smith, Barbara King, and Jennifer
Hoyte. 2014. Learning angles through movement:
Critical actions for developing understanding in an
embodied activity. The Journal of Mathematical
Behavior 36 (2014), 95–108.
[74] Priyashri K Sridhar, Samantha WT Chan, and Suranga
Nanayakkara. 2018. Going beyond performance scores:
understanding cognitive-affective states in
kindergarteners. In Proceedings of the 17th ACM
Conference on Interaction Design and Children. ACM,
253–265.
[75] Marianne B Staempfli. 2007. Adolescent playfulness,
stress perception, coping and well being. Journal of
Leisure Research 39, 3 (2007), 393–412.
[76] Anthony Steed, Ye Pan, Fiona Zisch, and William
Steptoe. 2016. The impact of a self-avatar on cognitive
load in immersive virtual reality. In 2016 IEEE Virtual
Reality (VR). IEEE, 67–76.
[77] Yin-Leng Theng and Paye Aung. 2012. Investigating
effects of avatars on primary school children’s affective
responses to learning. Journal on Multimodal User Interfaces 5, 1-2 (2012), 45–52.
[78] Katherine R Thorson, Tessa V West, and Wendy Berry
Mendes. 2018. Measuring physiological influence in
dyads: A guide to designing, implementing, and
analyzing dyadic physiological studies. Psychological
methods 23, 4 (2018), 595.
[79] Mattie Tops and Maarten AS Boksem. 2010. Absorbed
in the task: Personality measures predict engagement
during task performance as tracked by error negativity
and asymmetrical frontal activity. Cognitive, Affective, &
Behavioral Neuroscience 10, 4 (2010), 441–453.
[80] Sabine Trepte and Leonard Reinecke. 2010. Avatar
creation and video game enjoyment. Journal of Media
Psychology (2010).
[81] Chih-Hsiao Tsai, Yin-Hao Kuo, Kuo-Chung Chu, and
Jung-Chuan Yen. 2015. Development and evaluation of
game-based learning system using the Microsoft Kinect
sensor. International Journal of Distributed Sensor
Networks 11, 7 (2015), 498560.
[82] Tzu-Wei Tsai, Hsiao Yu Lo, and Kai-Shao Chen. 2012.
An affective computing approach to develop the
game-based adaptive learning material for the
elementary students. In Proceedings of the 2012 Joint
International Conference on Human-Centered Computer
Environments. ACM, 8–13.
[83] Pieter JA Unema, Sebastian Pannasch, Markus Joos, and
Boris M Velichkovsky. 2005. Time course of
information processing during scene perception: The
relationship between saccade amplitude and fixation
duration. Visual Cognition 12, 3 (2005), 473–494.
[84] Asimina Vasalou, Adam Joinson, Tanja Bänziger, Peter
Goldie, and Jeremy Pitt. 2008. Avatars in social media:
Balancing accuracy, playfulness and embodied
messages. International Journal of Human-Computer
Studies 66, 11 (2008), 801–811.
[85] Paul Wallace and James Maryott. 2009. The impact of
avatar self-representation on collaboration in virtual
worlds. Innovate: Journal of Online Education 5, 5
(2009).
[86] Bernard Lewis Welch. 1951. On the comparison of
several mean values: an alternative approach.
Biometrika 38, 3-4 (1951), 330–336.
[87] Kelly Yap, Clement Zheng, Angela Tay, Ching-Chiuan
Yen, and Ellen Yi-Luen Do. 2015. Word out!: learning
the alphabet through full body interactions. In
Proceedings of the 6th Augmented Human International
Conference. ACM, 101–108.
[88] Nick Yee, Jeremy N Bailenson, and Nicolas Ducheneaut.
2009. The Proteus effect: Implications of transformed
digital self-representation on online and offline behavior.
Communication Research 36, 2 (2009), 285–312.
[89] Serdar Yildirim and Shrikanth Narayanan. 2009.
Recognizing child’s emotional state in problem-solving
child-machine interactions. In Proceedings of the 2nd
Workshop on Child, Computer and Interaction. ACM,
14.