
Vincent, C., Soroli, E., Engemann, H., Hendriks, H., & Hickmann, M. (2018). Tobii or not Tobii? Assessing the validity of eye-tracking data: Challenges and solutions. Journal of Eye Movement Research, 11(5): 7. DOI: 10.16910/jemr.11.5.7

Tobii or not Tobii?
Assessing the validity of eye-tracking data: Challenges and solutions
Coralie Vincent¹, Efstathia Soroli², Helen Engemann³, Henriëtte Hendriks⁴, & Maya Hickmann¹
¹ CNRS & University of Paris 8, France
² CNRS & University of Lille, France
³ University of Mannheim, Germany
⁴ University of Cambridge, UK
Eye-tracking (ET) methods have become increasingly popular in psycholinguistic research because they make it possible to record visual processing in real time, allowing for the study of the relation between cognition and language, two systems often considered independent (Pinker, 1994).
In order to evaluate the impact of specific language properties on online visual processing, we coupled a production task with an ET paradigm. A total of 473 native speakers of two typologically different languages (234 English and 239 French) in three age groups (142 seven-year-old children, 155 ten-year-old children, and 176 adults) were tested in a production task involving 36 dynamic motion scenes (videos) that first had to be visually explored and then verbally described (e.g., “a man running up a hill”).
With respect to the ET data, which are the main focus of the present paper, a thorough validity check (preprocessing and quality assessment) was necessary in order to properly compare the gaze patterns of the groups. Indeed, validity is an issue that is almost never addressed in psycholinguistic research, even though an increasing number of researchers report it as one of the main sources of methodological bias (Holmqvist et al., 2011). Apart from the fact that a recording may include segments that are irrelevant for the analysis (e.g., eye blinks, off-screen fixations), it has been found that low-quality data may misleadingly point to group differences in gaze behaviour, for instance between adults and children (Wass et al., 2014). More specifically, low precision due to incorrect gaze detection may “flatten out” the gaze distribution across different areas of interest (AOIs) or across groups, while low robustness (i.e., missing or fragmented data) can make visit durations seem shorter than they actually are, and thus bias the interpretation of results.
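The two quality notions at play here, precision and robustness, can be quantified directly on a stream of raw gaze samples. The sketch below is purely illustrative and not the authors' computation; it uses one common operationalization (RMS sample-to-sample distance for precision, proportion of valid samples for robustness):

```python
import numpy as np

def sample_precision_rms(x, y):
    """RMS of sample-to-sample distances (e.g., in pixels):
    lower values mean higher precision (less jitter)."""
    dx, dy = np.diff(x), np.diff(y)
    return float(np.sqrt(np.mean(dx**2 + dy**2)))

def robustness(valid_flags):
    """Proportion of samples for which the tracker reported a valid gaze
    estimate; low values indicate missing or fragmented data."""
    return float(np.asarray(valid_flags, dtype=bool).mean())

# A short, slightly jittery but mostly complete recording:
x = np.array([100.0, 101.0, 99.5, 100.5])
y = np.array([200.0, 199.0, 201.0, 200.0])
print(round(sample_precision_rms(x, y), 2))   # → 1.85
print(robustness([True, True, False, True]))  # → 0.75
```

On this view, low precision spreads samples across neighbouring AOIs, while low robustness shortens apparent visit durations, which is exactly the bias pattern described above.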
The present paper compares results obtained with a turnkey solution (namely, Tobii Studio) to results obtained with in-house algorithms that: (a) carefully discard irrelevant parts of the recording; (b) exclude gaze-initiation latencies; and (c) detect and compensate for spatial inaccuracies of the ET data. The findings show that turnkey solutions may only be suitable for some designs (i.e., more appropriate for static/picture material). However, design-adapted validity checks (preprocessing of the recordings and quality assessment), as well as target-related compensation of inaccuracies, as proposed in this paper, are crucial and should be common practice for researchers who wish to compare gaze patterns or to evaluate group differences objectively. Challenges related to typical ET measures, such as gaze proportions to different dynamic AOIs and visit durations, are also discussed, as these measures seem sensitive to validity-related factors.
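As an illustration of steps (a)–(c), a minimal preprocessing pass over raw gaze samples might look like the following. All names, thresholds, and the screen resolution are assumptions made for this sketch, not the authors' implementation; in particular, the constant offset used in step (c) would in practice be estimated beforehand, e.g., from fixations on targets of known position:

```python
import numpy as np

SCREEN_W, SCREEN_H = 1920, 1080  # assumed display resolution

def preprocess(samples, latency_ms=200, offset=(0.0, 0.0)):
    """Illustrative validity pass over an (n, 4) array of
    (t_ms, x, y, valid) gaze samples:
    (a) drop invalid and off-screen samples (blinks, lost tracking);
    (b) discard samples within the gaze-initiation latency at trial onset;
    (c) subtract a previously estimated constant spatial offset."""
    t, x, y, valid = samples.T
    keep = (valid > 0) & (x >= 0) & (x < SCREEN_W) & (y >= 0) & (y < SCREEN_H)  # (a)
    keep &= t >= t.min() + latency_ms                                           # (b)
    x_corr, y_corr = x - offset[0], y - offset[1]                               # (c)
    return np.column_stack([t, x_corr, y_corr])[keep]

# Five raw samples: one within the 200 ms latency window, one off-screen,
# one flagged invalid (blink), and two usable samples.
raw = np.array([
    [  0,  500, 300, 1],   # dropped by (b)
    [250,  960, 540, 1],   # kept
    [300,  -10, 540, 1],   # dropped by (a): off-screen
    [350,  800, 400, 0],   # dropped by (a): invalid
    [400, 1000, 600, 1],   # kept
], dtype=float)
cleaned = preprocess(raw, latency_ms=200, offset=(10.0, -5.0))
print(cleaned.shape[0])  # → 2
```

Only after such a pass can gaze proportions to AOIs and visit durations be compared across groups without the biases discussed above.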
References
Holmqvist, K., Nyström, M., Andersson, R., Dewhurst, R., Jarodzka, H., & Weijer, J. van de. (2011). Eye tracking: A comprehensive guide to methods and measures (1st ed.). New York, NY: Oxford University Press.
Pinker, S. (1994). The language instinct: How the mind creates language. New York, NY: William Morrow & Company.
Wass, S. V., Forssman, L., & Leppänen, J. (2014). Robustness and precision: How data quality may influence key dependent variables in infant eye-tracker analyses. Infancy, 19(5), 427–460.