Figure 1 - uploaded by Lucas Paletta
3D gaze recovery: eye tracking glasses localize human fixations in video (above). The user’s view frustum and gaze pointer are reconstructed in the 3D environment model.

Source publication
Conference Paper
Understanding and estimating human attention in different interactive scenarios is an important part of human-computer interaction. With the advent of wearable eye-tracking glasses and Google Glass, monitoring of human visual attention will soon become ubiquitous. The presented work describes the precise estimation of human gaze fixations with re...

Context in source publication

Context 1
... work presents a novel methodology that enables precise estimation of the position and orientation of the human view frustum and gaze and, from this, precise analysis of human attention in the context of the semantics of the local environment (objects, signs, scenes, etc.). Figure 1 visualizes how accurately human gaze is mapped into the 3D model for further analysis. ...
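For illustration, the sketch below shows one way such a 2D-to-3D mapping can work: a fixation pixel is back-projected through an assumed calibrated pinhole scene camera whose pose in the model is known, and the resulting gaze ray is intersected with a point-cloud model of the environment. All names, the nearest-point intersection strategy, and the 5 cm threshold are assumptions for this sketch, not the authors' implementation.

```python
import numpy as np

def gaze_hit_in_model(cam_pos, cam_rot, gaze_px, K, points, max_dist=0.05):
    """Map a 2D fixation (pixels) onto the 3D environment model.

    cam_pos (3,) and cam_rot (3x3) give the scene-camera pose in model
    coordinates (rotation maps camera axes to world axes); gaze_px is the
    (u, v) fixation from the eye tracker; K is the 3x3 camera intrinsic
    matrix; points is an (N, 3) point cloud of the environment.
    Returns the model point closest to the gaze ray, or None if no point
    lies within max_dist (metres) of the ray.
    """
    u, v = gaze_px
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # back-project pixel
    ray = cam_rot @ ray_cam                             # rotate into model frame
    ray /= np.linalg.norm(ray)
    rel = points - cam_pos
    t = rel @ ray                                       # distance along the ray
    dist = np.linalg.norm(rel - np.outer(t, ray), axis=1)
    dist[t <= 0] = np.inf                               # ignore points behind camera
    i = int(np.argmin(dist))
    return points[i] if dist[i] < max_dist else None
```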

Similar publications

Article
Altered attentional processing of pain-associated stimuli, which might take the form of either avoidance or enhanced vigilance, is thought to be implicated in the development and maintenance of chronic pain. In contrast to reaction-time tasks like the dot probe, eye tracking allows for tracking the time course of visual attention and thus differentia...
Conference Paper
Image memorability is the capacity of an image to be recalled after a period of time. Recently, the memorability of an image database was measured and some factors responsible for this memorability were highlighted. In this paper, we investigate the role of visual attention in image memorability along two axes. The first one is experim...
Article
Mobile eye-tracking research has provided evidence both on teachers' visual attention in relation to their intentions and on teachers’ student-centred gaze patterns. However, the importance of a teacher’s eye movements when giving instructions is unexplored. In this study, we used mobile eye-tracking to investigate six teachers’ gaze patterns when t...
Article
Background Eye tracking technology is receiving increased attention in the field of virtual reality. Specifically, future gaze prediction is crucial in pre-computation for many applications such as gaze-contingent rendering, advertisement placement, and content-based design. To explore future gaze prediction, it is necessary to analyze the temporal...
Conference Paper
An increasing number of domains, including aeronautics, are adopting touchscreens. However, several drawbacks limit their operational use; in particular, eyes-free interaction is almost impossible, making it difficult to perform other tasks simultaneously. We introduce GazeForm, an adaptive touch interface with shape-changing capacity that offers an...

Citations

... Blascheck et al. [8] as well as Sundstedt and Garro [56] provide literature reviews on visualizations for 3D gaze data. The majority of the presented works, however, concentrated only on the visualization of gaze recorded in a physical environment ([47], [43], [45]) or in virtual environments, either desktop-based ([55]) or using head-mounted virtual reality (VR) displays ([16], [50], [48]). The visualization techniques presented in these works often consisted of heatmaps showing the distribution of visual attention, or of scan paths displaying the fixations and saccades of the gaze data, i.e., the derived scanpaths. ...
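As a minimal illustration of the first of these techniques, an attention heatmap can be built by accumulating fixation durations on a pixel grid and applying a Gaussian blur. The sketch below uses numpy and scipy; all names and parameter values are assumptions, not taken from any of the cited works.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_heatmap(fixations, width, height, sigma=25.0):
    """Accumulate fixation durations on a pixel grid and blur the result.

    fixations: iterable of (x, y, duration) in image coordinates.
    sigma roughly models the spread of foveal vision in pixels.
    Returns a (height, width) float array normalised to [0, 1].
    """
    grid = np.zeros((height, width))
    for x, y, dur in fixations:
        xi, yi = int(round(x)), int(round(y))
        if 0 <= xi < width and 0 <= yi < height:
            grid[yi, xi] += dur
    heat = gaussian_filter(grid, sigma=sigma)
    return heat / heat.max() if heat.max() > 0 else heat
```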
Conference Paper
The use of augmented reality technology to support humans with situated visualization in complex tasks such as navigation or assembly has gained increasing importance in research and industrial applications. One important line of research regards supporting and understanding collaborative tasks. Analyzing collaboration patterns is usually done by conducting observations and interviews. To expand these methods, we argue that eye tracking can be used to extract further insights and quantify behavior. To this end, we contribute a study that uses eye tracking to investigate participant strategies for solving collaborative sorting and assembly tasks. We compare participants’ visual attention during situated instructions in AR and traditional paper-based instructions as a baseline. By investigating the performance and gaze behavior of the participants, different strategies for solving the provided tasks are revealed. Our results show that with situated visualization, participants focus more on task-relevant areas and require less discussion between collaboration partners to solve the task at hand.
... They show how the problem can be solved from the software side, by reconstructing a bronze statue. Other options are approaches based on creating a 3D model with the help of a scanner or an RGB-D camera like the Microsoft Kinect, as suggested, e.g., by Paletta et al. (2013). ...
Article
Mobile eye tracking helps to investigate real-world settings, in which participants can move freely. This enhances the studies’ ecological validity but poses challenges for the analysis. Often, the 3D stimulus is reduced to a 2D image (reference view) and the fixations are manually mapped to this 2D image. This leads to a loss of information about the three-dimensionality of the stimulus. Using several reference images from different perspectives poses new problems, in particular concerning the mapping of fixations in the transition areas between two reference views. A newly developed approach (MAP3D) is presented that enables the generation of a 3D model and the automatic mapping of fixations onto this virtual 3D model of the stimulus. This avoids problems with the reduction to a 2D reference image and with transitions between images. The x, y and z coordinates of the fixations are available as a point cloud and as .csv output. First exploratory application and evaluation tests are promising: MAP3D offers innovative ways of post-hoc mapping fixation data on 3D stimuli with open-source software and thus provides cost-efficient new avenues for research.
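The .csv output mentioned above could be produced along the lines of the following minimal sketch; the exact column layout is an assumption, not MAP3D's documented format.

```python
import csv

def export_fixations_csv(fixations_3d, path):
    """Write mapped fixations as one x, y, z row per fixation."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["x", "y", "z"])    # assumed header layout
        writer.writerows(fixations_3d)      # iterable of (x, y, z) tuples
```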
... Maurus et al. (2014) analyzed a more complex 3D scene as stimulus, not only single 3D objects, and proposed a more accurate projection of the fixation data onto the 3D geometry that also takes into consideration the occlusions that each 3D element of the scene produces from the point of view of the observer. In Paletta et al. (2013), fixation hits on the 3D geometry are projected onto a 3D model with a colormap whose increasing values run from white through yellow to red. Attentional maps have been proposed not only in terms of surfaces but also volumes. ...
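Such a white-yellow-red colormap can be expressed as a piecewise-linear blend between colour stops. The sketch below is an assumed variant; the exact stops and scaling used by Paletta et al. (2013) are not specified here.

```python
import numpy as np

# Assumed colour stops: low hit counts render white, mid-range yellow,
# high counts red (the stops actually used in the cited work may differ).
_STOPS = np.array([[1.0, 1.0, 1.0],   # white
                   [1.0, 1.0, 0.0],   # yellow
                   [1.0, 0.0, 0.0]])  # red

def attention_color(hits, max_hits):
    """Map a per-vertex fixation hit count to an RGB colour in [0, 1]."""
    t = np.clip(hits / max(max_hits, 1), 0.0, 1.0) * (len(_STOPS) - 1)
    i = min(int(t), len(_STOPS) - 2)   # index of the lower colour stop
    frac = t - i                       # blend factor between the two stops
    return (1.0 - frac) * _STOPS[i] + frac * _STOPS[i + 1]
```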
Article
This systematic literature review presents an update on developments in 3D visualization techniques and analysis tools for eye movement data in 3D environments. With the introduction of affordable and non-intrusive eye-tracking solutions to the mass market, access to users' gaze is now increasingly possible. As a result, the adoption of eye-tracking in virtual environments using head-mounted displays is expected to increase, since the trend is to incorporate gaze tracking as part of new technical solutions. The systematic literature review presented in this paper was conducted using the Scopus database (covering the period from 2017 to 17 May 2022), which, after analysis, resulted in the inclusion of 15 recent publications relevant to eye-tracking visualization techniques for 3D virtual scenes. First, this paper briefly describes the foundations of eye-tracking and traditional 2D visualization techniques. As background, we also list earlier 3D eye-tracking visualization techniques identified in a previous review. Next, the systematic literature review presents the method used to acquire the included papers and a description of these in terms of eye-tracking technology, observed stimuli, application context, and type of 3D gaze visualization techniques. We then discuss the overall findings, including opportunities, challenges, and trends, and present ideas for future directions. Overall, the results show that eye-tracking in immersive virtual environments is on the rise and that more research and development are needed to create novel and improved technical solutions for 3D gaze analysis.
... In this work we present a method which aims to reduce the necessary resources involved in the process of annotating VOIs in interactive three-dimensional scenes. While other approaches require additional motion tracking systems [Paletta et al. 2013; Pfeiffer et al. 2016], preexisting data sets [Sattar et al. 2017; Steil et al. 2018] or manually annotated data [Kurzhals et al. 2015; Toyama et al. 2012] to address the annotation problem, all we need is a computer-aided design (CAD) model or another virtual representation of the scene. This makes our approach particularly attractive for all disciplines where eye-tracking studies are conducted either on virtual objects, in virtual environments or on real objects with digital twins, which is the case for most use cases in virtual-reality (VR) and augmented-reality (AR) settings. ...
... Here, many markers would be necessary to ensure their visibility from every possible viewing angle, while also representing a disturbing factor, as they expand the visually perceptible content (e.g., in product design evaluation tasks). When it comes to interactive three-dimensional scenes, geometry-based approaches [Paletta et al. 2013; Pfeiffer and Renner 2014; Pfeiffer et al. 2016] define the state of the art. ...
Preprint
Virtual-reality (VR) and augmented-reality (AR) technology is increasingly combined with eye-tracking. This combination broadens both fields and opens up new areas of application, in which visual perception and related cognitive processes can be studied in interactive but still well controlled settings. However, performing a semantic gaze analysis of eye-tracking data from interactive three-dimensional scenes is a resource-intense task, which so far has been an obstacle to economic use. In this paper we present a novel approach which minimizes time and information necessary to annotate volumes of interest (VOIs) by using techniques from object recognition. To do so, we train convolutional neural networks (CNNs) on synthetic data sets derived from virtual models using image augmentation techniques. We evaluate our method in real and virtual environments, showing that the method can compete with state-of-the-art approaches, while not relying on additional markers or preexisting databases but instead offering cross-platform use.
... Hence, in terms of ecological validity, VR directly allows investigating cognitive and emotional processes under realistic conditions and comparing them to laboratory settings. Although there has been research on 3D attentional processing (e.g., Paletta et al. 2013) as well as comparisons between 2D and 3D experiences (e.g., Rooney and Hennessy 2013), to our knowledge this is the first experiment to make a comparison between real life and the laboratory using the exact same experimental design. This means that we showed all participants the same video, thereby excluding any variance in the playing sequence, which would inevitably occur in reality. ...
Article
Virtual reality (VR) might increase the ecological validity of psychological studies, as it allows submerging into real-life experiences under controlled laboratory conditions. We intended to provide empirical evidence for this claim using the example of the famous invisible gorilla paradigm (Simons and Chabris in Perception, 28(9), 1059-1074, 1999). To this end, we confronted one group of participants with a conventional 2D video of two teams passing basketballs. To the second group of participants, we presented the same stimulus material as a 3D 360° VR video, and to a third group as a 2D 360° VR video. Replicating the original findings, in the video condition only ~30% of the participants noticed the gorilla. However, in both VR conditions, the detection rate increased to ~70%. The illusion of spatial proximity in VR enhances the salience of the gorilla, thereby increasing the noticing rate. VR mimics the perceptual characteristics of the real world and provides a useful tool for psychological studies.
... Some authors have presented 3D attention visualization and gaze tracking methods [25], [26]. In these approaches, camera- or RGB-D-camera-based SLAM methods, such as those presented in [27], [28], were used to build 3D maps. ...
Conference Paper
In this study, we attempt to establish numerical safety criteria for negotiating blind corners in personal mobility vehicles (PMVs). Safety should be the most important consideration in designing autonomous PMVs. However, determining a suitable trade-off between safety and speed is a weighty concern, because speed is significantly compromised when performing overly safe navigation. We analyze the driving behavior of a robotic PMV operated by a human driver. The robotic PMV can measure the driver's gaze and allows us to recognize both the pose of the PMV and the driver's visual attention on a 3D map. As a result, the areas occluded for the driver can be estimated. Then, potential colliding hazard obstacles (PCHOs) are simulated based on the occlusion. PCHOs are occluded obstacles that the driver encounters so suddenly that a collision cannot be avoided. Our experiments involved one skilled and three non-skilled participants. Experimental results demonstrate that similar PCHOs are observed even when the driving styles of the participants differ. Additionally, the existence of a boundary that distinguishes expected from unexpected obstacles is indicated by investigating the parameters of the PCHOs. Finally, we conclude that this boundary could be utilized as a numerical criterion for ensuring safety while negotiating blind corners.
... The extrinsics were then smoothed with a Kalman filter using the robot_localization ROS package [Moore and Stouch 2014]. One extension to this work is to consider alternate methods of extrinsic calibration; using monocular SLAM for gaze data has shown some success [Paletta et al. 2013b; Wang et al. 2018]. However, since in this situation the tag grid is already present, more advanced techniques were not necessary. ...
Conference Paper
Human-robot collaboration systems benefit from recognizing people's intentions. This capability is especially useful for collaborative manipulation applications, in which users operate robot arms to manipulate objects. For collaborative manipulation, systems can determine users' intentions by tracking eye gaze and identifying gaze fixations on particular objects in the scene (i.e., semantic gaze labeling). Translating 2D fixation locations (from eye trackers) into 3D fixation locations (in the real world) is a technical challenge. One approach is to assign each fixation to the object closest to it. However, calibration drift, head motion, and the extra dimension required for real-world interactions make this position matching approach inaccurate. In this work, we introduce velocity features that compare the relative motion between subsequent gaze fixations and a finite set of known points and assign fixation position to one of those known points. We validate our approach on synthetic data to demonstrate that classifying using velocity features is more robust than a position matching approach. In addition, we show that a classifier using velocity features improves semantic labeling on a real-world dataset of human-robot assistive manipulation interactions.
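The intuition behind velocity features is that a constant calibration offset drops out when gaze and candidate positions are compared frame to frame rather than by absolute position. The sketch below illustrates only this matching idea, not the paper's trained classifier; all names are hypothetical.

```python
import numpy as np

def velocity_match(gaze_pts, candidate_tracks):
    """Assign a gaze trace to the known point whose motion it follows.

    gaze_pts: (T, 2) gaze positions over time in the scene-camera image.
    candidate_tracks: dict mapping point name -> (T, 2) projected positions
    of that known point in the same frames. A constant calibration offset
    cancels out in the frame-to-frame differences, which is the intuition
    behind velocity features.
    """
    gaze_vel = np.diff(gaze_pts, axis=0)            # (T-1, 2) displacements
    scores = {}
    for name, track in candidate_tracks.items():
        cand_vel = np.diff(track, axis=0)           # candidate displacements
        scores[name] = np.linalg.norm(gaze_vel - cand_vel, axis=1).mean()
    return min(scores, key=scores.get)              # best co-moving candidate
```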
... SMI glasses were also involved in the work of Paletta et al. (2013), who used them in combination with Microsoft Kinect. A 3D model of the environment was acquired with the Microsoft Kinect, and gaze positions captured by the SMI glasses were mapped onto the 3D model. ...
Thesis
The main aim of this dissertation was to discover the differences between user aspects of interactive 3D maps and static 3D maps and to examine the way in which users with different levels of experience work with interactive 3D maps. First, a literature review was conducted. Based on this review, the requirements for an application enabling user testing of interactive 3D maps were defined. The testing application was then implemented using an iterative approach (according to a spiral model). The developed application 3DmoveR (3D Movement and Interaction Recorder) is based on a combination of the user logging method, a digital questionnaire and practical spatial tasks. From this application, the 3DtouchR (3D Touch Interaction Recorder) and 3DgazeR (3D Gaze Recorder) variants were derived. Nine partial experiments were carried out. Four of these experiments were exploratory and verified the functionality of the applications and demonstrated the ability to analyse and visualise the recorded data. Two experiments compared static and interactive 3D maps, while one compared real-3D and pseudo-3D visualisations. In two experiments the performances of different user groups (experts on geography versus laypersons, digital natives versus digital immigrants) were compared when working with interactive 3D maps. Experience with the design and realisation of user testing of interactive 3D visualisations, gained during these experiments, was summarised, and a list of recommendations for interactive 3D visualisation user testing and analysis of these data was formulated. A decision tree for selecting appropriate methods of user interaction and virtual movement analysis was created. The main findings regarding 3D maps are as follows:
• Interactive 3D maps are suitable when a correct decision is needed and there is no time pressure with regard to the decision-making speed.
• Interactive 3D maps are suitable for more complex tasks.
• Interactive 3D maps are more suitable for experts on geospatial data.
• Real-3D visualisation increases the accuracy of user responses when working with static 3D maps, but this difference is less significant when working with interactive 3D maps.
In general, the benefits of interactive 3D maps are influenced by the purpose of the map, the map use conditions, the type and complexity of the map tasks and the map users. These outcomes are relevant, for example, when deploying interactive 3D maps in the fields of crisis management or geography education. There is also a clear recommendation for future user studies: if the experimental results are to be generalised to interactive 3D maps and virtual reality, interactive 3D maps should be used as stimuli in these user studies.
... Figure 3: Since two estimated vectors in 3D space coming from the right (n_R) and left (n_L) eyes will most likely not intersect, the midpoint of the shortest segment between the gaze rays (in red) is a common measure of gaze estimation for geometry-based models. [Paletta et al. 2013] have both used a head-mounted setup with an RGB-D egocentric camera, but they limited themselves to performing only a 2D calibration step for later analysis of gaze data overlaid on depth images, thus incurring the ambiguities associated with the lack of calibration to the scene volume. ...
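The midpoint construction described in this figure caption is standard geometry: with one ray per eye, the closest points on the two rays follow from a small linear system, and their mean serves as the point-of-regard estimate. A minimal sketch with hypothetical names:

```python
import numpy as np

def binocular_por(p_l, d_l, p_r, d_r, eps=1e-9):
    """Midpoint of the shortest segment between two (skew) gaze rays.

    p_l, p_r: left/right eye centres; d_l, d_r: gaze direction vectors.
    Returns the 3D point-of-regard estimate, or None for (near-)parallel rays.
    """
    w = p_l - p_r
    a, b, c = d_l @ d_l, d_l @ d_r, d_r @ d_r
    d, e = d_l @ w, d_r @ w
    denom = a * c - b * b
    if abs(denom) < eps:              # rays (almost) parallel
        return None
    t_l = (b * e - c * d) / denom     # parameter of closest point, left ray
    t_r = (a * e - b * d) / denom     # parameter of closest point, right ray
    return 0.5 * ((p_l + t_l * d_l) + (p_r + t_r * d_r))
```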
Conference Paper
Most applications involving gaze-based interaction are supported by estimation techniques that find a mapping between gaze data and corresponding targets on a 2D surface. However, in Virtual and Augmented Reality (AR) environments, interaction occurs mostly in a volumetric space, which poses a challenge to such techniques. Accurate point-of-regard (PoR) estimation, in particular, is of great importance to AR applications, since most known setups are prone to parallax error and target ambiguity. In this work, we expose the limitations of widely used techniques for PoR estimation in 3D and propose a new calibration procedure using an uncalibrated head-mounted binocular eye tracker coupled with an RGB-D camera to track 3D gaze within the scene volume. We conducted a study to evaluate our setup with real-world data using a geometric and an appearance-based method. Our results show that accurate estimation in this setting still is a challenge, though some gaze-based interaction techniques in 3D should be possible.
... In order to translate these gaze estimates to world-centric coordinates, the position and orientation of the eye tracker must be estimated dynamically as the user moves. This problem has only begun to attract investigation [Paletta et al. 2013]. ...
... The algorithm also allows for a wider range of head movements than previously reported by [Wang et al. 2017], who requested that the subjects keep their heads still, and by [Hennessey and Lawrence 2009], who reported results for head movements only over the range of 3.2 × 9.2 × 14 cm (horizontal × vertical × depth). Finally, by using SLAM to dynamically estimate head pose and map the environment based on the past trajectory and image data, we expect that our system will be more robust to changes in the environment than approaches which use SLAM to obtain a precomputed static model of the environment and localize the camera using image matching with the eye tracker's scene camera image [Paletta et al. 2013]. Fig. 1 shows the system architecture. ...
Conference Paper
Past work in eye tracking has focused on estimating gaze targets in two dimensions (2D), e.g. on a computer screen or scene camera image. Three-dimensional (3D) gaze estimates would be extremely useful when humans are mobile and interacting with the real 3D environment. We describe a system for estimating the 3D locations of gaze using a mobile eye tracker. The system integrates estimates of the user's gaze vector from a mobile eye tracker, estimates of the eye tracker pose from a visual-inertial simultaneous localization and mapping (SLAM) algorithm, and a 3D point cloud map of the environment from an RGB-D sensor. Experimental results indicate that our system produces accurate estimates of 3D gaze over a much larger range than remote eye trackers. Our system will enable applications such as the analysis of 3D human attention and more anticipative human-robot interfaces.
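The core coordinate transform such a system relies on can be sketched as follows: the SLAM pose estimate lifts the tracker-frame gaze vector into world coordinates, yielding a gaze ray that can then be intersected with the point cloud map. Names are hypothetical, and details such as time synchronization and eye-to-tracker calibration are omitted.

```python
import numpy as np

def world_gaze_ray(R_wt, t_wt, gaze_dir_tracker):
    """Lift a tracker-frame gaze vector into world coordinates.

    R_wt (3x3) and t_wt (3,) are the eye-tracker pose from the SLAM/VIO
    estimate (tracker-to-world rotation and translation); gaze_dir_tracker
    is the gaze direction in the tracker frame. Returns the origin and unit
    direction of the world-frame gaze ray, ready to be intersected with the
    RGB-D point cloud map.
    """
    direction = R_wt @ gaze_dir_tracker
    return t_wt, direction / np.linalg.norm(direction)
```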