Figure 2 - uploaded by Jeffrey B. Mulligan
Schematic diagram showing the geometry used in the simulation.


Source publication
Conference Paper
Full-text available
We describe a system designed to monitor the gaze of a user working naturally at a computer workstation. The system consists of three cameras situated between the keyboard and the monitor. Free head movements are allowed within a three-dimensional volume approximately 40 centimeters in diameter. Two fixed, wide-field "face" cameras equipped with ac...

Context in source publication

Context 1
... simulations differ from the real situation in several (hopefully insignificant) ways: first, the camera geometry is only a rough approximation to the actual setup; second, in our model steering system, the centers of rotation for pan and tilt are coincident (in the actual scanner, the two mirrors are separated by about a centimeter); and, finally, the face cameras are modeled as pinhole cameras, with no lens distortion. The geometry used in the simulation is illustrated in figure 2. For our model system, the origin of the coordinate system is at the center of rotation of the scanning system, sending the laser beam along the z axis for pan and tilt angles of 0. The projection screen is modeled as a plane normal to the z axis, which is placed at a series of linearly spaced depths ranging from 30 to 60 centimeters. ...
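A minimal sketch of this simulated geometry in Python. The z axis follows the excerpt above, but the rotation order (tilt about x, then pan about y) and the function name are assumptions of this sketch, not details from the paper:

```python
import numpy as np

def beam_screen_intersection(pan_deg, tilt_deg, screen_depth_cm):
    """Intersect the steered laser beam with a screen plane normal to z.

    The origin is the (coincident) center of rotation of the pan/tilt
    scanner; at pan = tilt = 0 the beam travels along +z.
    """
    pan, tilt = np.radians(pan_deg), np.radians(tilt_deg)
    # Unit beam direction: tilt about the x axis, then pan about the y axis
    # (this rotation order is an assumption, not stated in the excerpt).
    d = np.array([np.cos(tilt) * np.sin(pan),
                  np.sin(tilt),
                  np.cos(tilt) * np.cos(pan)])
    t = screen_depth_cm / d[2]      # ray parameter where z equals the depth
    return d * t                    # 3-D point on the screen plane (cm)

# Screen planes at linearly spaced depths from 30 to 60 centimeters:
for depth in np.linspace(30, 60, 4):
    print(depth, beam_screen_intersection(5.0, -3.0, depth))
```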

Similar publications

Article
Full-text available
For flight training, head-worn displays represent low-cost, wide field of regard, deployable systems when compared to traditional simulation facilities. However, current head-worn systems provide limited effective fields of view. Wide field of view alternatives promise to increase transfer of training effectiveness through enhanced situation awaren...

Citations

... Multiple methods have been implemented in an attempt to solve the gaze tracking problem under different contexts, and they primarily consist of video-based systems that track the pupil of the subject. Two-dimensional regression methods that identify pupil contours and corneal reflections (PCCR) have been used to estimate gaze using simple single-camera setups [28][29][30][31]; however, these methods assume little to no translational motion of the eye. Cross-ratio methods use projective transformation matrices to map the camera image space to the illumination source space [32,33], but they are vulnerable to axial motion from the subject [34]. ...
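The 2-D regression idea mentioned here can be sketched in a few lines: at calibration time, a low-order polynomial is fitted from pupil-glint vectors to known screen targets. This is a generic PCCR sketch, not the method of any one cited paper; the random arrays stand in for real calibration data:

```python
import numpy as np

def poly_features(v):
    """Second-order polynomial terms of a pupil-glint vector (x, y)."""
    x, y = v
    return np.array([1.0, x, y, x * y, x * x, y * y])

# Calibration: pupil-glint vectors and the known on-screen targets (pixels).
rng = np.random.default_rng(0)
vectors = rng.random((9, 2))            # stand-in for measured features
targets = rng.random((9, 2)) * 1000     # stand-in for calibration targets

A = np.array([poly_features(v) for v in vectors])
coeffs, *_ = np.linalg.lstsq(A, targets, rcond=None)  # 6x2 mapping

def estimate_gaze(v):
    """Map a pupil-glint vector to estimated screen coordinates."""
    return poly_features(v) @ coeffs
```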
Article
Full-text available
Optical coherence tomography (OCT) has revolutionized diagnostics in ophthalmology. However, OCT requires a trained operator and patient cooperation to carefully align a scanner with the subject’s eye and orient it in such a way that it images a desired region of interest at the retina. With the goal of automating this process of orienting and aligning the scanner, we developed a robot-mounted OCT scanner that automatically aligned with the pupil while matching its optical axis with the target region of interest at the retina. The system used two 3D cameras for face tracking and three high-resolution 2D cameras for pupil and gaze tracking. The tracking software identified 5 degrees of freedom for robot alignment and ray aiming through the ocular pupil: 3 degrees of translation (x, y, z) and 2 degrees of orientation (yaw, pitch). We evaluated the accuracy, precision, and range of our tracking system and demonstrated imaging performance on free-standing human subjects. Our results demonstrate that the system stabilized images and that the addition of gaze tracking and aiming allowed for region-of-interest specific alignment at any gaze orientation within a 28° range.
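As a side note on the two orientation degrees of freedom used for alignment, yaw and pitch can be read off a unit gaze vector directly. A minimal sketch, with axis and sign conventions chosen arbitrarily (the abstract does not specify them):

```python
import numpy as np

def gaze_yaw_pitch(gaze_vec):
    """Yaw (about y) and pitch (about x) of a 3-D gaze direction, in degrees."""
    x, y, z = gaze_vec / np.linalg.norm(gaze_vec)
    yaw = np.degrees(np.arctan2(x, z))    # left/right rotation
    pitch = np.degrees(np.arcsin(y))      # up/down rotation
    return yaw, pitch

print(gaze_yaw_pitch(np.array([0.1, -0.05, 1.0])))
```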
... Gaze focalization is obtained from the intersection of the gaze vector with the object [16][17][18][19][20]. The latter ones [21][22][23][24][25] map the image features to the gaze coordinates using neural networks, as in [26], or polynomials, as in [27]. Most of these works use IR lights to obtain a stable gaze estimate, but prolonged exposure to IR illumination causes dryness and fatigue in the eyes [28]. ...
Article
Full-text available
Monitoring driver attention using gaze estimation is a typical approach used on road scenes. This indicator is of great importance for safe driving, especially on Level 3 and Level 4 automation systems, where the take-over request control strategy could be based on the driver's gaze estimation. Nowadays, gaze estimation techniques used in the state of the art are intrusive and costly, and these two aspects limit their usage in real vehicles. To test this kind of application, there are some databases focused on critical situations in simulation, but they do not show real accidents because of the complexity and the danger of recording them. Within this context, this paper presents a low-cost and non-intrusive camera-based gaze mapping system integrating the open-source state-of-the-art OpenFace 2.0 Toolkit to visualize driver focalization on a database composed of recorded real traffic scenes through a heat map, using NARMAX (Nonlinear AutoRegressive Moving Average model with eXogenous inputs) to establish the correspondence between the OpenFace 2.0 parameters and the screen region the user is looking at. This proposal is an improvement of our previous work, which was based on a linear approximation using a projection matrix. The proposal has been validated using the recent and challenging public database DADA2000, which has 2000 video sequences with annotated driving scenarios based on real accidents. We compare our proposal with our previous one and with an expensive desktop-mounted eye tracker, obtaining on-par results. We proved that this method can be used to record driver attention databases.
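A NARMAX fit of this kind can be illustrated with a simplified NARX variant (no moving-average noise terms), regressing one screen coordinate on current and lagged input features plus bilinear interaction terms as the nonlinearity. The feature layout is hypothetical, and the random arrays stand in for real OpenFace 2.0 outputs and gaze labels:

```python
import numpy as np

def narx_row(u_hist, y_prev):
    """One regression row: a bias, current and lagged inputs, the lagged
    output, and bilinear input-output terms as the nonlinearity."""
    base = np.concatenate(([1.0], u_hist, y_prev))
    cross = np.outer(u_hist, y_prev).ravel()
    return np.concatenate((base, cross))

# u: per-frame features (e.g., OpenFace gaze angles and head pose);
# y: one on-screen gaze coordinate (the other axis is fitted the same way).
rng = np.random.default_rng(0)
u, y = rng.random((200, 4)), rng.random(200)

A = np.array([narx_row(u[k - 1:k + 1].ravel(), y[k - 1:k])
              for k in range(1, len(y))])
theta, *_ = np.linalg.lstsq(A, y[1:], rcond=None)   # least-squares fit
```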
... As the main advantage, the usage of eyes as input [15] allows those users who, due to disease or physiological status, cannot use standard interfaces such as a joystick or a keyboard, to interact with other people or the environment quite efficiently. The main disadvantage affecting this technology is its sensitivity to various factors such as lighting conditions, iris color, and head movement [16,17]. In this context, [18][19][20] analyzed these elements and provided a more detailed characterization of the measurement process together with a method to assess and compensate [21,22] for uncertainty in eye tracking. ...
Article
Full-text available
Any severe motor disability is a condition that limits the ability to interact with the environment, even the domestic one, caused by the loss of control over one's mobility. This work presents RoboEYE, a power wheelchair designed to allow users to move easily and autonomously within their homes. To achieve this goal, an innovative, cost-effective and user-friendly control system was designed, in which a non-invasive eye tracker, a monitor, and a 3D camera represent some of the core elements. RoboEYE integrates functionalities from the mobile robotics field into a standard power wheelchair, with the main advantage of providing the user with two driving options and comfortable navigation. The most intuitive and direct modality provides continuous control of the frontal and angular wheelchair velocities by gazing at different areas of the monitor. The second, semi-autonomous modality allows navigation toward a selected point in the environment by simply pointing at and activating the desired destination while the system autonomously plans and follows the trajectory that brings the wheelchair to that point. The purpose of this work was to develop the control structure and driving interface designs of the aforementioned driving modalities, taking into account uncertainties in gaze detection and other sources of uncertainty related to the components, to ensure user safety. Furthermore, the driving modalities, in particular the semi-autonomous one, were modeled and qualified through numerical simulations and experimental verification with volunteers who are regular users of standard electric wheelchairs, to verify the efficiency, reliability and safety of the proposed system for domestic use. RoboEYE proved suitable for environments with narrow passages wider than 1 m, which is comparable to a standard domestic door, and has large commercialization potential.
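The direct driving modality described above amounts to a mapping from the gaze point on the monitor to a velocity command. A toy sketch of one plausible mapping (the actual RoboEYE control law is not given in the abstract, and all gains here are invented):

```python
def gaze_to_velocity(gx, gy, width, height, v_max=0.5, w_max=0.8):
    """Map a gaze point (pixels) on the monitor to wheelchair commands.

    Here the vertical gaze position controls forward speed and the
    horizontal offset from the screen center controls angular velocity.
    """
    v = v_max * max(0.0, 1.0 - gy / height)     # gaze near the top = fastest
    w = w_max * (2.0 * gx / width - 1.0)        # signed left/right steering
    return v, w                                  # (m/s, rad/s)

print(gaze_to_velocity(960, 300, 1920, 1080))
```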
... At present, the methods of mapping equations are mainly divided into two categories: appearance-based methods [1][2][3][4][5][6] and feature-based gaze estimation methods [7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25]. The interpolation-based mapping method does not need to consider the geometric relationship of the scene plane and camera calibration [1]. ...
Article
Full-text available
In recent years, the gaze estimation system, as a new type of human-computer interaction technology, has received extensive attention. The gaze estimation model is one of the main research contents of the system, and the quality of the model directly affects the accuracy of the entire gaze estimation system. To achieve higher accuracy even with simple devices, this paper proposes an improved mapping equation model based on homography transformation. In the experiments, the model mainly uses the "Zhang Zhengyou calibration method" to obtain the internal and external parameters of the camera and correct the camera's distortion, and uses the LM (Levenberg-Marquardt) algorithm to solve the unknown parameters contained in the mapping equation. After all the parameters of the equation are determined, the gaze point is calculated. Different comparative experiments were designed to verify the experimental accuracy and fitting effect of this mapping equation. The results show that the method can achieve high experimental accuracy, with the basic accuracy kept within 0.6°. The overall trend shows that the mapping method based on homography transformation has higher experimental accuracy, a better fitting effect and stronger stability.
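A homography-based mapping of this general kind is easy to sketch with OpenCV, which also implements Zhang's calibration method (cv2.calibrateCamera) for the distortion-correction step. This is a generic illustration, not the paper's exact model; the random arrays stand in for calibration pairs:

```python
import numpy as np
import cv2

# Calibration pairs: eye-feature positions in the (undistorted) image
# and the known on-screen target positions they correspond to.
rng = np.random.default_rng(1)
img_pts = (rng.random((9, 1, 2)) * 640).astype(np.float32)
scr_pts = (rng.random((9, 1, 2)) * 1920).astype(np.float32)

H, _ = cv2.findHomography(img_pts, scr_pts)    # DLT solution, LM refinement

def gaze_point(feature_xy):
    """Map an image-plane feature to screen coordinates via H."""
    q = H @ np.array([feature_xy[0], feature_xy[1], 1.0])
    return q[:2] / q[2]                         # homogeneous normalization
```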
... These algorithms commonly rely on one or more criteria (e.g., thresholds) to establish if a given feature is present or not; by means of the threshold values encoded as parameters, the user can ''tune'' the tracking process. Different features can be chosen depending on the images' spectrum: for example, intensity gradients allow algorithms to detect the pupil contour in infrared spectrum images [13] or the limbus in visible spectrum images [14], [15]. ...
Article
Full-text available
Pupil detection plays a key role in video-based eye and gaze tracking algorithms. Various algorithms have been proposed through the years in order to improve performance or robustness in real-world scenarios. However, the development of an algorithm which excels in both execution time and pupil detection precision is still an open challenge. This paper presents a novel, feature-based eye-tracking algorithm for pupil detection. Morphological operators are used to remove corneal reflections and to reduce noise in the pupil area prior to the pupil detection step; this solution significantly reduces the computational overhead without lowering the tracking precision. Moreover, a shape validation step is performed after the elliptical fitting and, if the elliptical shape is not detected properly, a set of additional steps is performed to improve the pupil estimation. The proposed solution, Pupil Detection after Isolation and Fitting (PDIF), has been compared with other state-of-the-art tracking algorithms that use morphological operations, such as ElSe (Ellipse Selection) and ExCuSe (Exclusive Curve Selector), to evaluate both speed and robustness; the proposed algorithm has been tested over numerous datasets offering different pupil detection challenges. The results show that PDIF provides comparable tracking precision at a significantly lower computational cost than ElSe and ExCuSe.
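The morphological-preprocessing-plus-ellipse-fitting pattern can be sketched with OpenCV. This is a generic illustration of the pattern, not the authors' PDIF pipeline; the kernel size and threshold are made-up values:

```python
import cv2

def detect_pupil(gray):
    """Rough pupil detection on an 8-bit grayscale eye image."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    # Grayscale opening suppresses small bright corneal reflections
    # inside the dark pupil before thresholding.
    opened = cv2.morphologyEx(gray, cv2.MORPH_OPEN, kernel)
    _, mask = cv2.threshold(opened, 50, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    if len(largest) < 5:            # cv2.fitEllipse needs >= 5 points
        return None
    return cv2.fitEllipse(largest)  # ((cx, cy), (w, h), angle)
```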
... There are various configurations of lights and cameras, such as single camera, single light [88,[92][93][94]; single camera, multiple lights [85,[95][96][97][98]; and multiple cameras, multiple lights [30,[99][100][101][102][103]. A complementary practice performed in all gaze tracking schemes is known as calibration. ...
... The regression-based methods (e.g., [27,69,100,[111][112][113][114][115]), on the other hand, map the image features to the gaze coordinates. They either have a nonparametric form, such as in neural networks [113,116] or a specific parametric form, such as polynomials [112,117]. ...
Article
Full-text available
Tracking drivers’ eyes and gazes is a topic of great interest in the research of advanced driving assistance systems (ADAS). It is especially a matter of serious discussion among the road safety researchers’ community, as visual distraction is considered among the major causes of road accidents. In this paper, techniques for eye and gaze tracking are first comprehensively reviewed while discussing their major categories. The advantages and limitations of each category are explained with respect to their requirements and practical uses. In another section of the paper, the applications of eyes and gaze tracking systems in ADAS are discussed. The process of acquisition of driver’s eyes and gaze data and the algorithms used to process this data are explained. It is explained how the data related to a driver’s eyes and gaze can be used in ADAS to reduce the losses associated with road accidents occurring due to visual distraction of the driver. A discussion on the required features of current and future eye and gaze trackers is also presented.
... The appearance-based approach uses the appearance of various detection characteristics such as the original color distribution [41], [42], [43], [44], [45] or the distribution after filtering [46], [47], [48], [49], [50]. The gaze can be estimated using either a model-based approach [51], [52], [53], [54] or an appearance-based approach [55], [56], [57], [58]. Model-based approaches simulate the physical structure of the human eye and typically consider physiological behaviors such as eyelid movement. ...
Preprint
A head-mounted display (HMD) is a portable and interactive display device. With the development of 5G technology, it may become a general-purpose computing platform in the future. Human-computer interaction (HCI) technology for HMDs has also been of significant interest in recent years. In addition to tracking gestures and speech, tracking human eyes as a means of interaction is highly effective. In this paper, we propose two UnityEyes-based convolutional neural network models, UEGazeNet and UEGazeNet*, which can be used for input images with low resolution and high resolution, respectively. These models can perform rapid interactions by classifying gaze trajectories (GTs), and a GTgestures dataset containing data for 10,200 "eye-painting gestures" collected from 15 individuals is established with our gaze-tracking method. We evaluated the performance both indoors and outdoors, and UEGazeNet obtains results 52% and 67% better than those of state-of-the-art networks. The generalizability of our GTgestures dataset is evaluated using a variety of gaze-tracking models, and an average recognition rate of 96.71% is obtained with our method.
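Classifying a gaze trajectory with a CNN typically means rendering the trajectory into a fixed-size tensor and training a small classifier on it. A toy sketch of that shape of model (the real UEGazeNet architecture is not described in the abstract; all layer sizes here are invented):

```python
import torch
import torch.nn as nn

class GazeTrajectoryNet(nn.Module):
    """Toy CNN classifying a gaze trajectory rendered as a 64x64 image."""
    def __init__(self, n_gestures=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, n_gestures)

    def forward(self, x):               # x: (batch, 1, 64, 64)
        return self.classifier(self.features(x).flatten(1))

logits = GazeTrajectoryNet()(torch.randn(4, 1, 64, 64))  # (4, 10) scores
```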
... However, in practice, it is necessary to obtain the 2-D gaze point. To avoid additional calculation of the point of regard (PoR), the interpolation-based methods [3,8,27,48] directly find the underlying mapping from the eye image features to the gaze coordinates. The relation between pupil and glints (corneal reflections) is the most popular and widely used feature for gaze estimation under active light models [5], as shown in Fig. 2. In addition, the pupil centre-eye corners vector is also regarded as a good feature with acceptable accuracy [34]. ...
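Both features mentioned here reduce to simple difference vectors computed from detected landmarks; a minimal sketch (landmark detection itself is assumed done upstream):

```python
import numpy as np

def pupil_glint_vector(pupil_center, glint_center):
    """PCCR feature: pupil center minus corneal-reflection center."""
    return np.asarray(pupil_center, float) - np.asarray(glint_center, float)

def pupil_corner_vector(pupil_center, eye_corner):
    """Glint-free alternative: pupil center minus an eye-corner landmark."""
    return np.asarray(pupil_center, float) - np.asarray(eye_corner, float)

print(pupil_glint_vector((310, 240), (302, 247)))   # -> [ 8. -7.]
```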
Article
Full-text available
As modern assistive technology advances, eye-based text entry systems have been developed to help a subset of physically challenged people improve their communication ability. However, the speed of text entry in early eye-typing systems tends to be relatively slow due to dwell time. Recently, dwell-free methods have been proposed which outperform the dwell-based systems in terms of speed and resilience, but an extra eye-tracking device is still indispensable. In this article, we propose a prototype eye-typing system using an off-the-shelf webcam without an extra eye tracker, in which an appearance-based method is proposed to estimate people's gaze coordinates on the screen based on the frontal face images captured by the webcam. We also investigate some critical issues of the appearance-based method, which helps to improve the estimation accuracy and reduce computational complexity in practice. The performance evaluation shows that eye typing with a webcam using the proposed method is comparable to using an eye tracker under a small degree of head movement.
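An appearance-based estimator in its simplest form regresses screen coordinates directly from the pixels of a normalized eye crop. A minimal ridge-regression sketch (the paper's actual method is not specified here; the crop size, regularization and random stand-in data are all illustrative):

```python
import numpy as np

def eye_features(eye_img):
    """Flatten a grayscale eye crop into a normalized appearance vector."""
    v = eye_img.astype(np.float64).ravel()
    return (v - v.mean()) / (v.std() + 1e-8)  # crude illumination normalization

# Calibration set: small eye crops and the screen targets gazed at.
rng = np.random.default_rng(2)
X = np.array([eye_features(rng.random((15, 30))) for _ in range(20)])
Y = rng.random((20, 2)) * np.array([1920, 1080])

lam = 1e-2                                    # ridge regularization strength
W = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)

def estimate_gaze(eye_img):
    return eye_features(eye_img) @ W          # (x, y) screen coordinates
```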
... To maintain high accuracy with remote, feature-based systems while allowing head movement, the image acquisition system must be able to follow the eyes (Mulligan & Gabayan, 2010). Some of the first approaches to achieve this goal made use of multiple cameras and/or multiple illuminators (Beymer & Flickner, 2003;Brolly & Mulligan, 2004;Matsumoto & Zelinsky, 1999;Ohno & Mukawa, 2004;Ohno, Mukawa, & Kawato, 2003;Pérez et al., 2003;Shih, Wu, & Liu, 2000;Yoo & Chung, 2005). Mostly these systems have one camera with a wide field of view to track head movements and another with a small field of view to track the eyes. ...
Article
Full-text available
Following a patent owned by Tobii, the framerate of a CMOS camera can be increased by reducing the size of the recording window so that it fits the eyes with minimum room to spare. The position of the recording window can be dynamically adjusted within the camera sensor area to follow the eyes as the participant moves the head. Since only a portion of the camera sensor data is communicated to the computer and processed, much higher framerates can be achieved with the same CPU and camera. Eye trackers can be expected to present data at a high speed, with good accuracy and precision, small latency and with minimal loss of data while allowing participants to behave as normally as possible. In this study, the effect of headbox adjustments in real-time is investigated with respect to the above-mentioned parameters. It was found that, for the specific camera model and tracking algorithm, one or two headbox adjustments per second, as would normally be the case during recording of human participants, could be tolerated in favour of a higher framerate. The effect of adjustment of the recording window can be reduced by using a larger recording window at the cost of the framerate.
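The dynamic recording window ("headbox") logic amounts to re-centering a sensor ROI on the tracked eyes, and doing so as rarely as possible because each reconfiguration costs frames. A plausible sketch (pure bookkeeping; the camera API calls that would apply the ROI are omitted):

```python
def adjust_recording_window(eye_xy, roi, sensor, margin=20):
    """Re-center the sensor ROI on the eyes only when they near its edge.

    roi and the return value are (x, y, width, height) in sensor pixels;
    sensor is (width, height). Adjusting sparingly, as the study found,
    trades a little tracking slack for a higher sustained framerate.
    """
    x, y = eye_xy
    rx, ry, rw, rh = roi
    sw, sh = sensor
    if (rx + margin < x < rx + rw - margin and
            ry + margin < y < ry + rh - margin):
        return roi                                # eyes still well inside
    nx = min(max(int(x - rw / 2), 0), sw - rw)    # clamp to sensor bounds
    ny = min(max(int(y - rh / 2), 0), sh - rh)
    return (nx, ny, rw, rh)
```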
... Several studies analyzed the influence of the order and the number of terms of the polynomial equation on the performance of eye tracking systems. [22][23][24][25][26] Although extensive research has been done to determine the best mapping function, it is not clear whether the conclusions can be generalized to other VOG systems due to the distinct hardware and methodology used in the different studies. ...
... [10] Other interpolation-based methods using one glint obtained an accuracy around 0.8 deg [26,38]. Cerrolaza et al. [24] obtained a considerably better accuracy with two IR LEDs, a second-order interpolation equation, the interglint distance to normalize the pupil-glint vectors, and with the patients' head stabilized using a chin rest (0.2 deg horizontally and 0.3 deg vertically). ...
Article
Full-text available
A set of methods for accurate eye tracking, in terms of both image processing and gaze estimation, is proposed. The eye tracker used in this study relies on the dark-pupil method with up to 12 corneal reflections and offers unprecedentedly high-resolution imaging of the pupil and the cornea. The potential benefits of a higher number of glints and their optimum arrangement are analyzed considering distinct light-source configurations with 12, 8, 6, 4, and 2 corneal reflections. Moreover, a normalization factor of the pupil-glint vector is proposed for each configuration. Accuracy tends to increase with the number of glints, especially vertically (0.47 deg for the 12-glint configuration versus 0.65 deg for the 2-glint configuration). Besides the number of corneal reflections, their arrangement seems to have a stronger effect; a configuration that minimizes the interference of the eyelids with the corneal reflections is desired. Finally, the normalization of the pupil-glint vectors improves the vertical eye-tracking accuracy by up to 43.2%. In addition, the normalization also limits the need for a higher number of light sources to achieve better spatial accuracy.
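One common form of such a normalization divides the pupil-glint vector by the interglint distance, which factors out changes in apparent eye size as the head moves in depth. A minimal two-glint sketch (the paper proposes a factor per configuration; taking the glint midpoint as the reference here is an assumption):

```python
import numpy as np

def normalized_pg_vector(pupil, glint_a, glint_b):
    """Pupil-glint vector divided by the interglint distance.

    The division makes the feature roughly invariant to the apparent
    scaling of the eye as the head moves toward or away from the camera.
    """
    pupil, glint_a, glint_b = (np.asarray(p, float)
                               for p in (pupil, glint_a, glint_b))
    mid = (glint_a + glint_b) / 2.0      # reference point (an assumption)
    return (pupil - mid) / np.linalg.norm(glint_a - glint_b)

print(normalized_pg_vector((310, 240), (300, 250), (320, 250)))  # [ 0. -0.5]
```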