Keep It Simple: Depth-based Dynamic Adjustment of Rendering
for Head-mounted Displays Decreases Visual Comfort
JOCHEN JACOBS, XI WANG, and MARC ALEXA, TU Berlin
Head-mounted displays cause discomfort. This is commonly attributed to conflicting depth cues, most prominently between vergence, which is consistent with object depth, and accommodation, which is adjusted to the near eye displays.
It is possible to adjust the camera parameters, specifically interocular distance and vergence angles, for rendering the virtual environment to minimize this conflict. This requires dynamic adjustment of the parameters based on object depth. In an experiment based on a visual search task, we evaluate how dynamic adjustment affects visual comfort compared to fixed camera parameters. We collect objective as well as subjective data. Results show that dynamic adjustment decreases common objective measures of visual comfort such as pupil diameter and blink rate by a statistically significant margin. The subjective evaluation of categories such as fatigue or eye irritation shows a similar trend but was inconclusive. This suggests that rendering with fixed camera parameters is the better choice for head-mounted displays, at least in scenarios similar to the ones used here.
CCS Concepts: • Computing methodologies → Rendering;
Additional Key Words and Phrases: Head-mounted displays, vergence, fatigue
ACM Reference format:
Jochen Jacobs, Xi Wang, and Marc Alexa. 2019. Keep It Simple: Depth-based Dynamic Adjustment of Rendering for Head-
mounted Displays Decreases Visual Comfort. ACM Trans. Appl. Percept. 16, 3, Article 16 (September 2019), 16 pages.
https://doi.org/10.1145/3353902
1 INTRODUCTION
Immersive virtual reality (VR) has become commonplace with the advent of affordable head-mounted stereo displays (e.g., HTC Vive, Oculus Rift, PlayStation VR, Google Daydream View, etc.). These devices are equipped with two displays, one for each eye, presenting two different images rendered based on the view of each of the eyes. This provides realistic binocular depth cues; in particular, it requires convergence or divergence of the eyes toward the object of interest. However, accommodation is adjusted to the focal plane of the device, which is usually set to a fixed distance of a few meters from the viewer. The inconsistency between vergence and accommodation, the vergence-accommodation (VA) conflict, is assumed to be the major source of discomfort experienced by many individuals under prolonged use of head-mounted stereo displays.
While technical solutions are possible to adjust the accommodation (Johnson et al. 2016; Konrad et al. 2016; Padmanaban et al. 2017), they are technically involved and likely unavailable for the consumer market. A variety
Authors’ address: J. Jacobs, X. Wang, and M. Alexa, TU Berlin, Marchstrasse 23, Berlin 10587, Germany; emails: jochen.jacobs@campus.tu-
berlin.de, {xi.wang, marc.alexa}@tu-berlin.de.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
© 2019 Association for Computing Machinery.
1544-3558/2019/09-ART16 $15.00
https://doi.org/10.1145/3353902
of software approaches have been shown to be ineffective (Koulieris et al. 2017). With the advent of eye tracking built into the display, solutions based on dynamic adjustment of rendering parameters depending on the current vergence situation become tractable. Specifically, we adopt the idea of modifying the interocular distance and vergence of the virtual cameras so that the vergence induced by a rendered object matches the accommodation induced by the display (see Section 3 for the details of our approach). This consistency of vergence and accommodation comes at the expense of dynamic modification of perceived absolute depth. Importantly, however, relative depth perception remains intact.
In an experiment based on a visual search task, we evaluate the visual comfort relative to a baseline method that uses fixed interocular distance and parallel view directions for rendering. We are specifically interested in how matching of vergence and accommodation affects objective measures of discomfort such as pupil diameter and blink rate and subjective assessment based on self-report. As part of this evaluation we also determine if participants notice the dynamic adjustment at all, and if so, how this behavior is judged.
Results show that dynamic adjustment is commonly not noticed by participants and has no significant impact on task performance. Interestingly, it nonetheless decreases visual comfort based on objective measurements by a significant margin. This result is consistent with subjective assessment. We conclude that dynamic adjustment of camera parameters to reduce the VA conflict, even if not consciously noticed by participants, introduces an amount of visual fatigue equivalent to that of the constant inconsistency between vergence and accommodation.
2 RELATED WORK
2.1 Vergence-accommodation Conflict
Vergence describes the type of eye movements in which both eyes move in opposite directions (Holmqvist et al. 2011). For example, when we change gaze from distant to close objects, both eyes rotate inward. At the same time, the ciliary muscle changes the shape of the lens such that sharp images are obtained on the retina (Atchison et al. 2000). During this accommodation the power of the eye lenses is adjusted. Vergence and accommodation are coupled, and we can measure the level of accommodation using the vergence angle; however, the accommodation level is not equivalent to the change in lens power. Accommodation is much slower than the movements of the eyes (Lockhart and Shi 2010a), taking about 200–500 ms, whereas saccades take around 30–40 ms (Holmqvist et al. 2011).
In conventional stereoscopic displays, two different images are presented to the eyes and the disparity between the two images corresponds to the depth of scene elements. In other words, objects at different distances correspond to different vergence angles, facilitating depth perception. However, the accommodative images are presented on a screen with a fixed distance to the eyes. The resulting unnatural decoupling of vergence and accommodation leads to conflicting cues. Studies have shown that this conflict contributes substantially to the visual discomfort in stereo displays (Lockhart and Shi 2010b; Schor et al. 1999; Shiwa et al. 1996).
2.2 Available Solutions to the Vergence-Accommodation Conflict
Several methods have been proposed to reduce the conflict between vergence and accommodation in stereo viewing conditions, including both algorithmic solutions (e.g., depth-of-field (DoF) rendering (Duchowski et al. 2014)) and hardware designs (e.g., light-field displays (Maimone et al. 2013)).
Many available algorithmic solutions reduce the VA conflict by modifying the stereo images (Peli et al. 2001). The essential idea is to adapt the vergence angle to the accommodation level on the screen. In virtual environments, camera parameters, including interocular distance and vergence angle, are adjusted dynamically based on object depth. These methods are commonly called convergence adjustment (Fisker et al. 2013; Sherstyuk et al. 2012). Other methods aim to minimize the conflict by remapping the disparity function of depth such that most scene content is viewed in a comfort zone (Lang et al. 2010), where image-based saliency maps are used to guide the stereoscopic image warping. By tracking the eye movements, the remapping
function can be dynamically adjusted based on where the viewer is currently looking, which has been demonstrated to improve depth perception (Kellnhofer et al. 2016). Camera adjustment was proposed in the context of interactive stereoscopic applications (Oskam et al. 2011), where large motion of viewers or objects often results in visual artifacts. Linear interpolation of camera parameters was proposed to control the disparities of visual content. The idea was further applied in Koulieris et al. (2016), which trained a decision forest to predict gaze positions in stereo images of video games. Instead of using image-based saliency maps as in Lang et al. (2010), local disparity adjustment based on predicted gaze positions was then used to improve the perceptual experience. Results demonstrated that gaze-based disparity manipulation generalized well, especially for scenes with a large depth range. We refer our readers to the work by Terzić and Hansard (2016) for a comprehensive review of available solutions. In this work, we focus on the experimental examination of a vergence-based camera adjustment algorithm and its effects on the vergence-accommodation conflict. Other object-based local disparity adjustments, which remap the range of depth variations in the scene (Lang et al. 2010) or dynamically change the disparities of certain objects (Kellnhofer et al. 2016), are considered as different approaches, as their adjustment algorithms depend on the scene content.
A recent study (Koulieris et al. 2017) proposed a device design to measure the accommodation in head-mounted displays and evaluated the effectiveness of several algorithmic methods and hardware designs in handling the VA conflict. The results showed that only the focus-adjustable-lens design (Johnson et al. 2016; Konrad et al. 2016; Padmanaban et al. 2017), in which accommodation is changed effectively, can significantly improve visual comfort.
2.3 Visual Comfort Measurements
Visual discomfort has been a major drawback that limits the usage of stereoscopic displays. The question of how the VA conflict affects visual comfort and fatigue has led to a rich body of literature (Kooi and Toet 2004; Shibata et al. 2011). Visual comfort is mostly assessed by questionnaires for subjective evaluation (Chen et al. 2011; Shibata et al. 2011; Tam et al. 2011). Typically, participants are asked to report their fatigue, eye strain, body strain, and headache. Eye tracking data have been used as an indicator of mental fatigue in many studies (Yamada and Kobayashi 2017, 2018; Zhao and Shen 2010). Especially with the improved accessibility of video-based eye tracking techniques, more studies report the characteristics of eye movements as an objective measure of visual comfort (Iatsun et al. 2013; Kim et al. 2011; Morad et al. 2000a). Interestingly, this is not the case in most studies of visual comfort in head-mounted displays.
3 DEPTH-BASED DYNAMIC CAMERA ADJUSTMENT
In this work, we aim to evaluate the effect of depth-based dynamic camera adjustment methods on the visual comfort of head-mounted displays (HMDs). The central idea is to avoid the vergence-accommodation conflict by adjusting the camera parameters for rendering so that the vergence matches the accommodation on the physical display. As vergence depends on object depth, the approach requires estimating the depth users are attending to and then slowly adjusting the parameters.
We start with a brief overview of rendering for HMDs. As accurate gaze depth estimation is a prerequisite for any dynamic adjustment, we propose a probability model incorporating the uncertainty present in eye movement data as well as the ambiguity with respect to scene geometry. Finally, we introduce two types of camera adjustments and a simple protocol for combining them based on the gaze depth.
3.1 Background in Head-Mounted Display Rendering
The major optical components in most HMDs are micro-displays and magnifying lenses; see Figure 1. Through magnification a virtual image is presented at a distance of approximately 2 m in front of the eyes. Note that the center point of each eye corresponds to different positions in the virtual image. To generate the presentation of the scene, two cameras are placed at the eye positions and two images are rendered, one for each eye. Disparity of
Fig. 1. Illustration of a simplified head-mounted display. Through lens magnification, a virtual image is presented at about 2 m in front. When the eyes attend to objects at different depths, the corresponding vergence angle decreases from close to far.
the same object between two images facilitates depth perception. As shown in Figure 1, closer objects correspond
to larger vergence angles compared to objects at larger distances.
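For a concrete sense of the magnitudes involved, the vergence angle for an interocular distance b and a fixation depth t follows from simple geometry; the numbers below are our own worked computation, using the 6.9 cm baseline distance of Section 4:

$$V(t) = 2\arctan\!\left(\frac{b}{2t}\right): \quad V(0.46\,\text{m}) \approx 8.6^\circ, \quad V(2\,\text{m}) \approx 2.0^\circ, \quad V(50\,\text{m}) \approx 0.08^\circ \quad \text{for } b = 6.9\,\text{cm}.$$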
3.2 Accurate Depth Estimation
Our proposed method of real-time dynamic camera adjustment poses a demanding requirement on the accuracy of gaze estimation. The problem of accurately estimating gaze in HMDs is a topic of several current projects (e.g., the OpenEDS Challenge (Garbin et al. 2019)). In practice especially, even a slight shift of the headset changes the underlying gaze mapping functions and results in large errors in the estimated gaze directions.
Vergence-based three-dimensional (3D) gaze estimation is an ill-conditioned problem and is associated with a large inherent error (Wang et al. 2017). In fact, two lines of sight do not even necessarily intersect in space. To better incorporate the uncertainty, we propose a probability model to estimate the gaze positions in 3D. We associate a normal distribution with each estimated eye ray direction (i.e., a Gaussian distribution with $p(\theta)=\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{\theta-\mu}{\sigma}\right)^2}$) and consider the regions of all possible intersection points, each with its corresponding joint probability. We also explicitly model the dependency between the two eye ray directions. For example, a divergence of the two viewing directions is unlikely to happen, and its corresponding probability should accordingly be low. In the same way, we assume that errors in the estimated eye ray directions (e.g., caused by camera noise) are dependent, and two eye rays are more likely to have the same directional errors. For each estimated viewing direction represented by θ, we introduce a deviation angle δ to compute all possible intersection points under small perturbation. Given two deviated eye ray directions, we calculate the difference between the two deviation angles, δ_l − δ_r, and estimate how likely this angle difference is to be observed according to the distribution p_d. In summary, the probability associated with each intersection point is

$$p(\theta_l, \theta_r) = p_{\theta_l}\!\left(\delta_l \mid \mu, \sigma^2\right) \, p_{\theta_r}\!\left(\delta_r \mid \mu, \sigma^2\right) \, p_d\!\left(\delta_l - \delta_r \mid \mu, \sigma_d^2\right), \qquad (1)$$
where θ_l and θ_r are the angles of the left and right eye rays. We assume that the deviated eye ray follows a normal distribution around the given ray direction, with mean μ = 0 and standard deviation σ. The angle δ_l (or δ_r) represents a deviation from the measured left (or right) eye ray. σ_d is the standard deviation of the probability distribution of the difference between the deviation angles. Using a test scene that contains 29 selected target points sampled from 3D space, we experimentally determined that σ_d = 0.15σ produces the best result. The joint probability distribution of the intersection point in 2D is visualized in Figure 2(a) and the probability distribution of the angle difference p_d in Figure 2(b). Multiplying the probability distributions in (a) and (b) results in a joint distribution as shown in Figure 2(c), which largely limits the possible intersection
Fig. 2. Probability distributions of the estimated position of a fixation target in 2D. Red lines correspond to the given eye ray directions and yellow lines mark the area within two standard deviations (2σ). (a) The joint distribution p_θl · p_θr, with each factor modeled as a normal distribution, and (b) the distribution of p_d, following a normal distribution with standard deviation 0.15σ. The product p_θl · p_θr · p_d of the combined distribution is shown in (c); however, if we only consider the absolute differences between angles (without directional information), then errors occur in the estimated probability distribution as shown in (d), where p′_d corresponds to the distribution of differences between unsigned angles.
regions given two eye rays. Apart from the ratio between σ and σ_d, the exact values do not change the position of the maximum response as shown in Figure 2(c). As is common practice in eye tracking, directional (signed) angles are considered; ignoring the sign of the rotation angles would lead to artifacts in the joint distribution, as visualized in Figure 2(d). As two lines do not necessarily intersect in space, this probability model provides a reasonable distribution, especially when estimating gaze positions in 3D under uncertainty. The last factor we consider is the scene geometry: We assume all intersection points lie on the geometry of the scene (i.e., no gaze points are located in the air). Only points visible to both eyes are considered as potential fixation targets, as intended targets are visible to both eyes in our designed scenario and the estimation of gaze depth relies on the vergence angle formed by the two eye rays. A standard ray-scene intersection test was used to compute the visibility.
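To make the model concrete, the following is a minimal 2D sketch of Equation (1); the function names, eye geometry, and scoring loop are illustrative, not our production implementation:

```python
# Minimal 2D sketch of the probability model in Eq. (1); names and
# geometry are illustrative assumptions, not the actual implementation.
import numpy as np

def gaussian(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def fixation_probability(candidate, eye_l, eye_r, theta_l, theta_r,
                         sigma, sigma_d=None):
    """Probability (Eq. 1) that `candidate` (2D point) is the fixation target.

    theta_l/theta_r: measured ray angles (radians) for left/right eye.
    delta_l/delta_r: deviations needed so the rays pass through `candidate`.
    """
    if sigma_d is None:
        sigma_d = 0.15 * sigma          # ratio determined experimentally
    # Angles of the rays that would actually hit the candidate point.
    ang_l = np.arctan2(candidate[1] - eye_l[1], candidate[0] - eye_l[0])
    ang_r = np.arctan2(candidate[1] - eye_r[1], candidate[0] - eye_r[0])
    delta_l, delta_r = ang_l - theta_l, ang_r - theta_r
    return (gaussian(delta_l, 0.0, sigma) *
            gaussian(delta_r, 0.0, sigma) *
            gaussian(delta_l - delta_r, 0.0, sigma_d))  # coupled errors

# Usage: score candidate points on the visible scene geometry and keep
# the most probable one, e.g.:
# best = max(scene_points, key=lambda p: fixation_probability(p, ...))
```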
3.3 Interocular Distance and Vergence Angle
We consider two different transformations of the cameras, namely translation and rotation, which correspond to modifications of the interocular distance and of the vergence angle formed by the two cameras and the position of the fixation target. Note that each of the parameters can be used to increase or decrease the effective vergence angle for the user of the HMD. As gaze estimation follows the changes of viewing directions, it is important that scene objects are visible in both of the two images rendered for the two eyes.
Fig. 3. Illustration of camera adjustments when the fixation target is located in front of the screen. In such a case, we can either decrease the camera distance as shown on the left or rotate both cameras inward as shown on the right. Camera distance is denoted by d and target distance by t. The vergence angle is represented by V. Reducing the camera distance from d_0 to d_1 reduces the vergence angle for a gaze point in front of the screen at t_1 (see Equation (2)). By rotating both cameras inward by α degrees, we effectively reduce the vergence angle from V_1 to V_o when viewing the rendered images parallel (see Equation (3)).
The idea of adjusting camera parameters to improve visual comfort was proposed in Peli et al. (2001), but only parallel camera setups were considered. In practice, this can be implemented as horizontal shifts of the two images, which result in vergence eye movements. Such adjustment of the two rendered images often generates artifacts that lead to reduced visual comfort. As shown in previous studies, inward camera rotation (also called camera toeing-in) introduces vertical disparities (Stelmach et al. 2003; Woods et al. 1993), especially for objects that appear close to the corners of the image. For a given camera toeing-in configuration, closer objects result in larger distortion than objects at larger distances. Our idea is to parameterize the solution space in terms of camera parameters by considering changes of both interocular distance and vergence angle. This gives us the freedom to choose between camera translation and rotation, and we propose a simple protocol to combine both transformations such that distortions in the rendered images are minimized (more details in Section 3.4).
Distance between Two Cameras. We only consider the one-dimensional translation along the axis between the two cameras. Increased camera distance results in larger disparity in the rendered images, consequently increasing the corresponding effective vergence angle. As an example, shown on the left in Figure 3, when fixation targets come closer to the eyes, we move the virtual cameras toward each other. The distance between the two cameras is proportional to the depth of the fixation target:

$$\frac{d_1}{t_1} = \frac{d_0}{t_0}. \qquad (2)$$
Fig. 4. Extreme parameter settings for the virtual cameras. Gray triangles indicate the view frustums for each camera (small white triangle). Only the intersection area of the gray triangles is useful in practice, covering only a small area away from the cameras for large interocular distance (left), and close to the cameras for extreme inward rotation (right). The areas marked with horizontal lines are completely invisible. Note that these illustrations are exaggerations—real settings are kept within a physiologically plausible range.
Here d denotes the distance between the two virtual cameras and t the distance of the fixated object to the eyes. We limit the distance of the virtual cameras to a plausible physiological range, as unrealistic adjustments lead to visual artifacts: a small camera distance leads to a loss of stereoscopic depth cues, with two nearly identical images; a large camera distance leads to double vision, and, at extreme distances, close objects become invisible as the required vergence angle becomes too large (see Figure 4, left). Note that only symmetric viewing frustums are considered here. Parallel projection with an asymmetric frustum may effectively reduce the invisible areas in front, as shown on the left in Figure 4, and it has been used in previous work to correct keystone distortions (Zelle and Figura 2004).
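As a minimal sketch, Equation (2) with clamping amounts to the following (the bounds are illustrative assumptions; in our setup the distance is capped at the average interocular distance):

```python
def adjusted_camera_distance(d0, t0, t1, d_min=0.02, d_max=0.069):
    """Equation (2): keep d/t constant, so d1 = d0 * t1 / t0.
    Clamp to a physiologically plausible range (bounds illustrative)."""
    d1 = d0 * t1 / t0
    return min(max(d1, d_min), d_max)
```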
Vergence Angle between Two Cameras. It is common to set the optical axes orthogonal to the plane formed by the axis between the two cameras and the up vector. This means the optical axes are parallel. In our scenario, we want to rotate them. The change in the angle of the optical axes of the virtual cameras changes the effective vergence angle of the eyes of the user (in the opposite direction).
Let α be the rotation angle of each camera when the current vergence angle V_1 is modified to V_o, corresponding to the vergence angle when viewing objects at screen distance:

$$\alpha = \frac{|V_o - V_1|}{2}. \qquad (3)$$
Similar to the adjustment of camera distances, large inward rotation leads to double vision and large outward rotation results in discontinuities in the perceived scene. The right illustration in Figure 3 shows an example. Note that when viewing the scene, we still assume the two cameras are parallel to each other, but the images are rendered from adjusted view points. If the cameras converge (i.e., rotate inward), then objects that are farther away than the convergence point would require divergent eye movements, which makes them essentially invisible, as shown on the right in Figure 4. Therefore, the cameras can only converge as close as the farthest object. Large camera divergence also leads to large convergence of the eyes, but the largest camera divergence angle required when looking at infinity is only 2°, which is the vergence angle corresponding to screen distance.
3.4 From Vergence Angle to Camera Adjustment
Using the method described in Section 3.2, we estimate the location of the fixation target as well as the corresponding vergence angle, given the two viewing directions reported by the eye tracker. Recall that the idea is to
Fig. 5. The disparity function of depth. The black line shows the default disparity mapping. The orange line shows the mapping when the distance between the two cameras is small, while the blue line depicts the mapping when the camera distance is large. The green line corresponds to diverging cameras, while the red line corresponds to converging cameras.
dynamically adjust the camera parameters such that the vergence angle is kept as close as possible to V_o, which is the vergence angle when viewing objects at screen distance. Specifically, we consider the camera parameters of interocular distance and vergence angle.
If fixated objects are in front of the screen, then we can either reduce the interocular distance by moving the cameras closer to each other or rotate them inward to change the vergence angle. Similarly, when attention changes to points that are far away, with a large depth value, we can either increase the interocular distance or rotate the cameras outward. Based on the assumption that unexpected changes in the scene may distract users, we aim to minimize such changes while dynamically adjusting the cameras. We experimented with different parameter settings as well as with the speed of adjustments.
In practice, each parameter can be adjusted only within a limited range. In principle, adjustment of the interocular distance corresponds to a scaling of the total depth range, as shown by the orange and blue curves in Figure 5; adjustment of the vergence angle corresponds to shifts in the disparity depth map, as depicted by the green and red curves in Figure 5. As discussed in the previous section, a large camera distance leads to large disparity for closer objects (the blue curve), while converging cameras cause problems for objects that are far away (the red curve). As inward camera translation (i.e., close camera distance) and outward camera rotation (i.e., divergent cameras) up to 2° do not lead to visual artifacts, we opted for a simple protocol: Only the interocular distance is changed when viewing objects in front of the screen, and camera rotation is applied when viewing objects behind the screen. Therefore, the camera distance was adjusted to be no larger than the average interocular distance, and the vergence angle was kept no larger than the angle when focusing on the screen, which corresponds to 2° of visual angle. In this way, we avoid the large noticeable artifacts caused by inward camera rotations, i.e., the vertical disparities that appear close to the image corners of toeing-in cameras. Note that even though no object-based disparity adjustment is included and the relative order of objects remains the same, the perceived absolute distance between objects can be affected by the camera adjustments, as illustrated in Figure 5. The adjustment speed was set constant such that the fixated point changed at a rate of 0.2 D/s, where 1 D (dioptre) = 1 m⁻¹. Instead of using arcmin/s, which varies depending on the depth, we measure the change rate in D/s, which can easily be used for both types of adjustment.
4 EXPERIMENT
To evaluate the effect of the proposed dynamic adjustment of camera parameters on visual comfort, we designed an experiment in which participants perform a visual search task. Fixed camera parameters of interocular distance
Fig. 6. Experimental scene. The left figure shows one pair of disks, with five symbols on each. Two disks are presented simultaneously to both eyes and participants are supposed to find the symbol that appears on both disks (i.e., the cross in this example). The right figure shows the stereo images of the three tested depth distances. In the experiment, only one pair at one distance is visible at a time.
equal to 6.9 cm and parallel viewing directions serve as a baseline for comparison. We collected both objective eye movement data and subjective evaluations of visual comfort. Additionally, we also collected subjective assessments of the two conditions.
4.1 Participants
Eighteen participants (all students from the university) joined our experiment (5 female, mean age = 24, SD = 3.7). They all had normal or corrected-to-normal visual acuity. Three of them wore contact lenses. Glasses were not allowed due to concerns about eye tracking accuracy in the HMD. Fourteen participants reported previous experience in VR, and the average rating of their experience between 1 (very bad) and 5 (very good) was 3.8. They were kept naive as to the purpose of the experiment, and their time was compensated at common hourly rates. Written consent was given before the experiment. We also tested the stereoscopic vision of participants using the FLY stereo acuity test (Vision Assessment Corporation). The average stereo acuity was 56 seconds of arc. There was no apparent correlation between the stereo acuity and the objective or subjective measures.
4.2 Apparatus
We used an HTC Vive Pro headset together with its motion tracking system. The touch pads of two HTC Vive Pro controllers were used for the selections in the experiment. Two add-on eye cameras (with a frame rate of 120 Hz) from Pupil Labs were inserted in the headset to track the eye movements. Unity (Version 2018.3) was used to set up and render the virtual scene, and together with SteamVR it controlled the display and interaction. We used the Unity pupil plugin provided by Pupil Labs to interface with the eye tracker.
4.3 Task
We used a visual search task to engage participants in the experiments and used the task completion time as an indicator of fatigue, assuming less fatigue corresponds to shorter time. Additionally, task completion time is linked to visual performance. For instance, it indicates whether depth perception is influenced by the dynamic adjustments of the cameras. In the experiments, participants played a Dobble game (also called Spot It!), which is a simple pattern recognition game. As shown on the left in Figure 6, the task was to find a pattern shown on two disks.
In each trial, two disks were presented in front of the participant, each showing 5 Unicode characters from the Miscellaneous Symbols block. All 10 symbols were presented simultaneously, and the left image in Figure 6 shows an
example. Participants were asked to find the symbol that appeared on both disks, and there was exactly one symbol in common in each trial. The two touch pads on the controllers were used as the interface, and participants needed to move the cursors to the selected symbols on both disks simultaneously. The experiment continued to the next trial when the selection was completed. To minimize the bias caused by varying gaze positions in the previous trial, we reset the cursors to the center at the beginning of each trial, and participants were asked to look at them. The disks were colored red for a short time if wrong symbols were selected. Participants were asked to find the matching symbols as quickly as possible. We computed a performance score based on trial correctness and completion time for each participant, and this score was presented to participants as motivation.
4.4 Protocol
We wanted to compare the effects of dynamic adjustment and fixed parameter rendering on visual comfort and considered these two methods as two different conditions in the experiment. In condition one, camera parameters were dynamically adjusted based on the depth of the fixation target; in condition two, fixed camera parameters were used for all participants, and images were rendered by two parallel front-facing cameras with the interocular distance set to 6.9 cm. We suspected that the inter-subject variance might be high and decided to collect data for both conditions from each participant. Half of the participants started with the session with camera adjustment and half of them had the session with fixed parameters first.
To test the effects when viewing objects at different depths, we considered three distances for presentation, as shown on the right in Figure 6. The backgrounds of the scene were kept the same for all participants in both conditions. The accommodation on the screen corresponds to a viewing distance of 2 m, and we considered a closer distance at 0.46 m and a farther distance at 50 m. To ensure comparability of the search task, we varied the disk size (as well as the symbol size) such that they spanned the same visual angle of 8.3°.
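For reference, the disk size follows from the visual angle as size = 2t · tan(8.3°/2); a small sketch with our own computed values:

```python
import math

def disk_diameter(depth, visual_angle_deg=8.3):
    """Physical disk size spanning a fixed visual angle at a given depth."""
    return 2.0 * depth * math.tan(math.radians(visual_angle_deg) / 2.0)

# disk_diameter(0.46) ~ 0.067 m, disk_diameter(2.0) ~ 0.29 m,
# disk_diameter(50.0) ~ 7.3 m
```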
Three variations in depth lead to six directional jumps in total. We counterbalanced the sequences of jumps over all participants and considered one round through all six jumps as one block. Five trials were presented at each distance level to obtain stable results. Each distance appeared more than once in one block, and in total each block consisted of 35 trials. To motivate participants, we showed them their performance score for each block, as well as the rank of their score in the collected dataset. By doing so, we could also align the time so that each block started at the same time for each participant and the total time wearing the headset was the same for all participants.
One session of six blocks took about 20 minutes in total. Each block started with a calibration of the eye tracker. The calibration of the next block was used as validation of the previous block. We planned to re-calibrate the eye tracker when the validation accuracy was above 2°, but this was never the case. Participants had a trial session at the beginning to familiarize themselves with the procedure. At the end of each session, they were asked to fill out a questionnaire to evaluate their fatigue (the detailed questions can be found in the supplementary material). Their eye movement data during the sessions were collected, including eye ray direction, pupil diameter, and occurrence of blinks.
4.5 Measures of Visual Comfort
Objective Measures. Task performance was evaluated by the accuracy and duration of task completion, namely whether the matching pair of symbols was correctly selected and how long it took to find the matching pair. To measure fatigue, we considered common eye movement statistics (Cardona et al. 2011; Kim et al. 2011; Luedtke et al. 1998; Morad et al. 2000b), including blink rate and pupil diameter, as well as pupil diameter variation. With increased fatigue, blink rate as well as pupil diameter variability are expected to increase, while average pupil diameter is expected to decrease.
Subjective Measures. At the end of each session, participants answered a questionnaire to evaluate their experience, including questions about their eyes, vision, focus, headache, and general feeling. After the completion of
Fig. 7. Histograms of the error distribution of estimated fixation targets using the proposed probability-based method (left) and the ray intersection method (right). We distinguish among three different depths; the proposed method improved the estimation accuracy in general.
both sessions, participants also reported their preference between the two sessions. We followed the standard visual comfort evaluation protocol (Hoffman et al. 2008; Shibata et al. 2011), and participants rated their experience on a 5-point Likert scale (see the supplementary material for details), where 1 indicates a positive experience and 5 corresponds to a negative experience.
5 RESULTS
We assess the effects of the described depth-based dynamic camera adjustment method on visual comfort compared to fixed camera parameters, following the visual search experiment. Here we report the results of both subjective and objective evaluations.
5.1 Depth Estimation Accuracy
First, we evaluated our proposed probability-based method for depth estimation and compared it to the common strategy of computing the intersection point of two eye rays by finding the point that has the smallest distance to both rays. The estimation error was computed as the difference between the reciprocals of the estimated depth and the intended depth, assuming participants were looking at the disks while performing the task:

$$e = \left|\frac{1}{d_f} - \frac{1}{d_o}\right|, \qquad (4)$$

where d_f is the depth of the estimated fixation target and d_o is the disk depth. To limit the influence of outliers, we excluded trials where the mean error was larger than 5 D (10 trials in the whole dataset). Figure 7 shows the results. Compared to the ray intersection method, our proposed algorithm achieved better results for the close and far targets (error differences between the two methods are 0.12 D and 0.34 D; t(2152) = 11.8, p < 0.01 and t(2183) = 40.5, p < 0.01, respectively, t-test) and worse results for the middle target (error difference of 0.15 D; t(2195) = 50.1, p < 0.01, t-test). Compared to the ray intersection method, the proposed probability model seems to benefit from incorporating the noise and ambiguity into the computation and provides an advantage for gaze estimation in 3D.
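A sketch of this error computation and outlier exclusion (function names are illustrative):

```python
import numpy as np

def depth_error_dioptres(d_f, d_o):
    """Equation (4): error as the difference of reciprocal depths, in D."""
    return np.abs(1.0 / np.asarray(d_f) - 1.0 / np.asarray(d_o))

def keep_valid_trials(per_trial_errors, threshold=5.0):
    """Exclude trials whose mean error exceeds 5 D, as in the analysis."""
    return [e for e in per_trial_errors if np.mean(e) <= threshold]
```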
Fig. 8. Histograms of changes in blink rate (left) and pupil diameter (right). We distinguish between two groups depending on the session order (see the legends). “w” corresponds to the sessions when dynamic adjustment was enabled, and “wo” corresponds to the sessions with no adjustment.
5.2 Objective Evaluation of Visual Comfort
Two participants’ data were excluded from the following analysis due to large errors in the estimation of fixation targets (based on the average error). Nearly all trials were completed correctly (accuracy = 98.5%; mismatching symbols were selected in only 96 of 6,681 trials). The average reaction time of one trial (i.e., the trial completion time) when dynamic adjustment was enabled (mean duration = 3.33 s, SD = 2.09 s) does not differ significantly from trials with fixed camera parameters (mean duration = 3.37 s, SD = 2.21 s). Similar reaction times in both conditions indicate that the introduced camera adjustment does not significantly influence depth perception, at least not from the perspective of task completion time. Based on participants’ self-evaluation, the camera adjustment was commonly not noticed, and no noticeable difference between the two sessions was reported.
We observed a large variation among the participants, as indicated by the standard deviations of the reaction time. Therefore, we only performed within-participant comparison of the eye movement data. We computed the trialwise differences of eye movement statistics for each dataset and differentiated between the two groups of different session orders (one group started with the adjustment session and the other group started with the session using fixed camera parameters).
For each trial, we computed the average blink rate and pupil diameter, and the differences between corresponding trials of the two sessions. More precisely, the trialwise difference equals the increase of the second session over the first one: a positive number corresponds to an increase in the second session, and a negative number to a decrease. Figure 8 shows the histograms of changes in blink rate and pupil diameter. On average, dynamic adjustment led to a higher blink rate and a smaller pupil diameter. Both are signs of fatigue. The differences in the resulting distributions are both significant (t(2907) = 12.21, p < 0.01 and t(2907) = 4.08, p < 0.01, respectively, t-test). We suspect that dynamic changes in the scene may contribute to the increased blink rate; this, however, needs to be confirmed in future studies. The increase in pupil diameter variance does not differ significantly regardless of the session order (t(2907) = 0.83, p = 0.40, t-test). The literature in which pupil diameter variance was used as a measure of fatigue was mainly focused on sleep. The level of fatigue after a long time of being awake (Morad et al. 2000a) or shortly before falling asleep (Schumann et al. 2017) leads to a significantly different profile of pupil diameter variance. In comparison, pupil diameter variance may not be a valid measure of visual comfort.
Previous studies showed that humans can tolerate a certain inconsistency between vergence and accommodation, and there exists a comfort zone for stereoscopic viewing given a fixed accommodation level (Lang et al. 2010; Mendiburu 2012). However, the exact size of the comfort zone depends on many factors, such as the viewing distance, the illumination, and the scene content (Shibata et al. 2011). For instance, when viewing objects presented on a screen, the comfort zone in front is considerably larger than the area behind the screen. Depending on the viewing distance t, we consider the region between 33% t in front and 25% t behind as the comfort zone and compute the percentage of time during which vergence and accommodation are consistent (i.e., within the comfort zone). On average, vergence and accommodation are consistent for 43% of the time when dynamic adjustments were enabled (67% ± 17% for close targets; 20% ± 12% for middle targets; 42% ± 26% for far targets), whereas this holds for only 33% of the time when camera parameters were fixed (100% for middle targets and 0% for near and far targets). Note that the amount of time required for the camera adjustment should be considered in such an evaluation, as we made a tradeoff between the adjustment speed and noticeable visual changes. In an ideal eye tracking scenario, the in-comfort percentage of time with dynamic adjustment is only 82% (21.7 s adjusting time for each block on average; 83% for close targets, 85% for middle targets, and 76% for far targets). While the average consistency is low for the middle area, for some participants it was much better (e.g., the mean in-comfort percentage of the top four is 38%). Also in these cases there is no correlation between objective measures of comfort and consistency, but an inconsistency was revealed by the objective measures. Compared to the fixed camera parameter configuration, dynamic adjustment leads to significant increases in both blink rate (more fatigue; t(830) = 10.54, p < 0.01, t-test) and pupil diameter (less fatigue; t(830) = 3.16, p = 0.0018, t-test).
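A sketch of this comfort-zone criterion, under our reading that the bounds are linear fractions of the viewing distance t:

```python
def in_comfort_zone(gaze_depth, t=2.0, front=0.33, behind=0.25):
    """Comfort-zone test of Section 5.2 for viewing distance t: vergence
    is 'consistent' if the gaze depth lies between 33% of t in front of
    the screen and 25% of t behind it (our interpretation)."""
    return (1.0 - front) * t <= gaze_depth <= (1.0 + behind) * t

# Fraction of consistent samples:
# share = sum(map(in_comfort_zone, gaze_depths)) / len(gaze_depths)
```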
5.3 Subjective Evaluation of Visual Comfort
Similarly, for each participant’s subjective evaluation, we report the difference between the first and second evaluations. No headache was reported regardless of the session order, but the reported eye tiredness indicates a clear preference for the session with fixed camera parameters (see Figure 9(a)). The same trend was observed in the responses to the direct comparisons of the two sessions (see Figure 9(b)–(d)). No matter whether it was about fatigue, eye irritation, or depth changes, the majority of participants preferred the session with fixed camera parameters over the session with dynamically adjusted cameras.
5.4 Discussion
Our result is in line with the previous finding (Koulieris et al. 2017) that improvements in visual comfort achieved by algorithmic solutions are limited and that it is difficult to effectively reduce the vergence-accommodation conflict unless physical changes of accommodation are involved.
In the parameter space of two adjustable variables, we only considered a subspace by following a simple protocol. It remains unclear how other combinations of adjustments, for example, allowing camera translation and rotation at the same time, would affect visual comfort. How to find an effective protocol is also an interesting question.
One of our design goals was to minimize the changes while adjusting the cameras such that observers do not notice any differences or artifacts. This has to be balanced with the adjustment speed. In the experiment, when the gaze point changed from the close distance to the middle depth, it took 8.6 s until the cameras reached a stable configuration; when changing from the middle to the far-away distance, it took 2.4 s. These relatively slow adjustments resulted in no noticeable changes according to participants’ reports; on the other hand, they may induce fatigue. Future studies are required to investigate this further.
Even though our experiment requires eye movement changes among three different depths, visual comfort is mainly evaluated in a static scene once the eyes have adjusted to the targets. It is possible that the dynamic adjustment method could reduce the vergence-accommodation conflict more effectively in a dynamic scene, where fixation targets continuously move in depth, for instance.
Fig. 9. Subjective responses. (a) Increase in eye tiredness. On the 5-point rating scale, 1 corresponds to fresh and 5 corresponds to irritated. Therefore, positive values correspond to the amount of increased eye tiredness in the second session, and negative values indicate the amount of decrease in eye tiredness. Plots in (b)–(d) show the responses when participants were asked to evaluate (b) which session was more fatiguing, (c) which session was more irritating to the eyes, and (d) in which session it was easier to change depths. “w” corresponds to the sessions when dynamic adjustment was enabled and “wo” corresponds to the sessions with no adjustment.
Essentially, we evaluated visual comfort in sessions of 20 minutes. Very likely fatigue gets stronger over time. It is not clear how dynamic adjustments would affect results over longer periods of time. Additionally, we want to point out that the accuracy of eye tracking plays an important role in such experiments. Even though the proposed probability-based gaze estimation method achieves better results for targets that are close or far away, the accuracy for target objects at the screen distance drops, which leads to a large inconsistency between vergence and accommodation when viewing such objects.
6 CONCLUSION
We have implemented and evaluated a new way to resolve a prolonged vergence-accommodation conflict in HMDs. The approach could be considered as having the potential to reduce the visual discomfort associated with HMDs; however, our experiments indicate that this is not the case. At least in the tested scenario, both
subjective and objective evaluations suggest keeping camera parameters fixed. This is a useful result, because the dynamic adjustment requires additional equipment and processing. However, many factors could influence the results, and further studies are required to understand how well these findings generalize.
The camera adjustment naturally provides a two-parameter space for the adjustment. In addition, the speed of adjustment could be varied within the limits of the processing time of the eye tracking equipment. We have opted for the adjustment that we believe interferes as little as possible with human perception. It may be fruitful in the future to experiment with other protocols for adjustment.
Another question arising from the experiment is what exactly causes the discomfort. The vergence-accommodation conflict may contribute to discomfort, but perhaps other causes have so far been underestimated.
ACKNOWLEDGMENTS
We thank all participants who joined our experiment. We also thank Minjung Kim, Andreas Ley, Ronny Hänsch,
and Amelie Froessl, who joined our pilot study and gave us valuable feedback.
REFERENCES
David A. Atchison, George Smith, and George Smith. 2000. Optics of the human eye. Butterworth-Heinemann Oxford.
Genís Cardona, Carles Garcá, Carme Serés, Meritxell Vilaseca, and Joan Gispets. 2011. Blink rate, blink amplitude, and tear lm integrity
during dynamic visual display terminal tasks. Curr. Eye Res. 36, 3 (2011), 190–197. DOI:https://doi.org/10.3109/02713683.2010.544442
PMID: 21275516.
Wei Chen, Jérôme Fournier, Marcus Barkowsky, and Patrick Le Callet. 2011. New stereoscopic video shooting rule based on stereoscopic
distortion parameters and comfortable viewing zone. In IS&T/SPIE Electronic Imaging, Andrew J. Woods, Nicolas S. Holliman, and Neil
A. Dodgson (Eds.), Vol. 7863. 78631O. DOI:https://doi.org/10.1117/12.872332
Andrew T. Duchowski, Donald H. House, Jordan Gestring, Rui I. Wang, Krzysztof Krejtz, Izabela Krejtz, Radosław Mantiuk, and Bartosz
Bazyluk. 2014. Reducing visual discomfort of 3D stereoscopic displays with gaze-contingent depth-of-eld. In Proceedings of the ACM
Symposium on Applied Perception (SAP’14). ACM, New York, NY, 39–46. DOI:https://doi.org/10.1145/2628257.2628259
Martin Fisker, Kristoer Gram, Kasper Kronborg Thomsen, Dimitra Vasilarou, and Martin Kraus. 2013. Automatic convergence adjustment
for stereoscopy using eye tracking. Eurographics.
Stephan J. Garbin, Yiru Shen, Immo Schuetz, Robert Cavin, Gregory Hughes, and Sachin S. Talathi. 2019. OpenEDS: Open eye dataset. CoRR
abs/1905.03702 (2019). arXiv:1905.03702 http://arxiv.org/abs/1905.03702
David M. Homan, Ahna R. Girshick, Kurt Akeley, and Martin S. Banks. 2008. Vergence-accommodation conicts hinder visual performance
and cause visual fatigue. J. Vis. 8, 3 (03 2008), 33–33. DOI:https://doi.org/10.1167/8.3.33
Kenneth Holmqvist, Marcus Nyström, Richard Andersson, Richard Dewhurst, Halszka Jarodzka, and Joost Van de Weijer. 2011. Eye Tracking:
A Comprehensive Guide to Methods and Measures. Oxford University Press, Oxford.
Iana Iatsun, Mohamed-Chaker Larabi, and Christine Fernandez-Maloigne. 2013. Investigation of visual fatigue/discomfort generated by S3D
video using eye-tracking data. In Stereoscopic Displays and Applications XXIV, Vol. 8648. International Society for Optics and Photonics,
864803.
Paul V. Johnson, Jared A. Q. Parnell, Joohwan Kim, Christopher D. Saunter, Gordon D. Love, and Martin S. Banks. 2016. Dynamic lens and
monovision 3D displays to improve viewer comfort. Opt. Expr. 24, 11 (May 2016), 11808–11827. DOI:https://doi.org/10.1364/OE.24.011808
Petr Kellnhofer, Piotr Didyk, Karol Myszkowski, Mohamed M. Hefeeda, Hans-Peter Seidel, and Wojciech Matusik. 2016. GazeStereo3D:
Seamless disparity manipulations. ACM Trans. Graph. 35, 4 (10 2016), 68:1–68:13. DOI:https://doi.org/10.1145/2897824.2925866
Donghyun Kim, Sunghwan Choi, Sangil Park, and Kwanghoon Sohn. 2011. Stereoscopic visual fatigue measurement based on fusional
response curve and eye-blinks. In Proceedings of the 2011 17th International Conference on Digital Signal Processing (DSP’11).16.DOI:
https://doi.org/10.1109/ICDSP.2011.6004999
Robert Konrad, Emily A. Cooper, and Gordon Wetzstein. 2016. Novel optical congurations for virtual reality: Evaluating user preference
and performance with focus-tunable and monovision near-eye displays. In Proceedings of the 2016 CHI Conference on Human Factors in
Computing Systems (CHI’16). ACM, New York, NY, 1211–1220. DOI:https://doi.org/10.1145/2858036.2858140
Frank L. Kooi and Alexander Toet. 2004. Visual comfort of binocular and 3D displays. Displays 25, 2-3 (2004), 99–108.
George-Alex Koulieris, Bee Bui, Martin S. Banks, and George Drettakis. 2017. Accommodation and comfort in head-mounted displays. ACM
Trans. Graph. 36, 4, Article 87 (Jul. 2017), 11 pages. DOI:https://doi.org/10.1145/3072959.3073622
George Alex Koulieris, George Drettakis, Douglas Cunningham, and Katerina Mania. 2016. Gaze prediction using machine learning for
dynamic stereo manipulation in games. In Proceedings of the 2016 IEEE Conference on Virtual Reality (VR’16). 113–120. DOI:https://doi.
org/10.1109/VR.2016.7504694
ACM Transactions on Applied Perception, Vol. 16, No. 3, Article 16. Publication date: September 2019.
16:16 J. Jacobs et al.
Manuel Lang, Alexander Hornung, Oliver Wang, Steven Poulakos, Aljoscha Smolic, and Markus Gross. 2010. Nonlinear disparity mapping
for stereoscopic 3D. ACM Trans. Graph. 29, 4, Article 75 (Jul. 2010), 10 pages. DOI:https://doi.org/10.1145/1778765.1778812
Thurmon E. Lockhart and Wen Shi. 2010a. Eects of age on dynamic accommodation. Ergonomics 53, 7 (2010), 892–903. DOI:https://doi.org/
10.1080/00140139.2010.489968 PMID: 20582770.
Thurmon E. Lockhart and Wen Shi. 2010b. Eects of age on dynamic accommodation. Ergonomics 53, 7 (10 2010), 892–903. DOI:https://
doi.org/10.1080/00140139.2010.489968
Holger Luedtke, Barbara Wilhelm, Martin Adler, Frank Schaeel, and Helmut Wilhelm. 1998. Mathematical procedures in data recording
and processing of pupillary fatigue waves. Vis. Res. 38, 19 (1998), 2889–2896. DOI:https://doi.org/10.1016/S0042-6989(98)00081- 9
Andrew Maimone, Gordon Wetzstein, Matthew Hirsch, Douglas Lanman, Ramesh Raskar, and Henry Fuchs. 2013. Focus 3D: Compressive
accommodation display. ACM Trans. Graph. 32, 5, Article 153 (Oct. 2013), 13 pages. DOI:https://doi.org/10.1145/2503144
Bernard Mendiburu. 2012. 3D Movie Making: Stereoscopic Digital Cinema from Script to Screen. Routledge.
Yair Morad, Hadas Lemberg, Nehemiah Yofe, and Yaron Dagan. 2000a. Pupillography as an objective indicator of fatigue. Curr. Eye Res. 21,
1 (2000), 535–542. DOI:https://doi.org/10.1076/0271-3683(200007)2111- ZFT535 PMID: 11035533.
Yair Morad, Hadas Lemberg, Nehemiah Yofe, and Yaron Dagan. 2000b. Pupillography as an objective indicator of fatigue. Curr. Eye Res. 21,
1 (2000), 535–542.
Thomas Oskam, Alexander Hornung, Huw Bowles, Kenny Mitchell, and Markus Gross. 2011. OSCAM—optimized stereoscopic camera con-
trol for interactive 3D. ACM Trans. Graph. 30, 6, Article 189 (Dec. 2011), 8 pages. DOI:https://doi.org/10.1145/2070781.2024223
Nitish Padmanaban, Robert Konrad, Tal Stramer, Emily A. Cooper, and Gordon Wetzstein. 2017. Optimizing virtual reality for all users
through gaze-contingent and adaptive focus displays. Proc. Natl. Acad. Sci. U.S.A. 114, 9 (2017), 2183–2188. DOI:https://doi.org/10.1073/
pnas.1617251114
Eli Peli, Reed Hedges, Jinshan Tang, Dan Landmann, T Reed Hedges, Jinshan Tang, and Dan Landmann. 2001. A binocular stereoscopic
display system with coupled convergence and accommodation demands. SID Symp. Dig. Techn. Pap. 32, 1 (10 2001), 1296–1299. DOI:
https://doi.org/10.1889/1.1831799
Clifton M. Schor, Lori A. Lott, David Pope, and Andrew D Graham. 1999. Saccades reduce latency and increase velocity of ocular accommo-
dation. Vis. Res. 39, 22 (10 1999), 3769–3795. DOI:https://doi.org/10.1016/S0042-6989(99)00094- 2
Andy Schumann, Juliane Ebel, and Karl-Jürgen Bär. 2017. Forecasting transient sleep episodes by pupil size variability. Curr. Direct. Biomed.
Eng. 3, 2 (2017), 583–586. DOI:https://doi.org/10.1515/cdbme-2017- 0121
Andrei Sherstyuk, Arindam Dey, Christian Sandor, and Andrei State. 2012. Dynamic eye convergence for head-mounted displays improves
user performance in virtual environments. In Proceedings of the Symposium on Interactive 3D Graphics and Games (I3D’12). ACM, 23–30.
DOI:https://doi.org/10.1145/2159616.2159620
Takashi Shibata, Joohwan Kim, David M. Homan, and Martin S. Banks. 2011. The zone of comfort: Predicting visual discomfort with stereo
displays. J. Vis. 11, 8 (07 2011), 11–11. DOI:https://doi.org/10.1167/11.8.11
Shinichi Shiwa, Katsuyuki Omura, and Fumio Kishino. 1996. Proposal for a 3-D display with accommodative compensation: 3DDAC. J. Soc. Inf. Displ. 4, 4 (Oct. 1996), 255–261. DOI:https://doi.org/10.1889/1.1987395
Lew B. Stelmach, Wa James Tam, Filippo Speranza, Ronald Renaud, and Taali Martin. 2003. Improving the visual comfort of stereoscopic images. In Stereoscopic Displays and Virtual Reality Systems X, Vol. 5006. International Society for Optics and Photonics, 269–283. DOI:https://doi.org/10.1117/12.474093
Wa James Tam, Filippo Speranza, Sumio Yano, Koichi Shimono, and Hiroshi Ono. 2011. Stereoscopic 3D-TV: Visual comfort. IEEE Trans. Broadcast. 57, 2 (Jun. 2011), 335–346. DOI:https://doi.org/10.1109/TBC.2011.2125070
Kasim Terzić and Miles Hansard. 2016. Methods for reducing visual discomfort in stereoscopic 3D: A review. Sign. Process.: Image Commun.
47 (2016), 402–416. DOI:https://doi.org/10.1016/j.image.2016.08.002
Xi Wang, David Lindlbauer, Christian Lessig, and Marc Alexa. 2017. Accuracy of monocular gaze tracking on 3D geometry. In Eye Tracking
and Visualization, Michael Burch, Lewis Chuang, Brian Fisher, Albrecht Schmidt, and Daniel Weiskopf (Eds.). Springer International
Publishing, Cham, 169–184.
Andrew J. Woods, Tom Docherty, and Rolf Koch. 1993. Image distortions in stereoscopic video systems. In Stereoscopic Displays and Applications IV, Vol. 1915. International Society for Optics and Photonics, 36–49. DOI:https://doi.org/10.1117/12.157041
Yasunori Yamada and Masatomo Kobayashi. 2017. Fatigue detection model for older adults using eye-tracking data gathered while watching video: Evaluation against diverse fatiguing tasks. In Proceedings of the 2017 IEEE International Conference on Healthcare Informatics (ICHI'17). 275–284. DOI:https://doi.org/10.1109/ICHI.2017.74
Yasunori Yamada and Masatomo Kobayashi. 2018. Detecting mental fatigue from eye-tracking data gathered while watching video: Evaluation in younger and older adults. Artif. Intell. Med. 91 (2018), 39–48. DOI:https://doi.org/10.1016/j.artmed.2018.06.005
John M. Zelle and Charles Figura. 2004. Simple, low-cost stereographics: VR for everyone. SIGCSE Bull. 36, 1 (Mar. 2004), 348–352. DOI:
https://doi.org/10.1145/1028174.971421
Sanyuan Zhao and Tingzhi Shen. 2010. Driver fatigue detection based on eye status. In Proceedings of the 2010 International Conference on Multimedia Technology. 1–4. DOI:https://doi.org/10.1109/ICMULT.2010.5630864
Received July 2019; accepted July 2019