Bivariate Effective Width Method to Improve the Normalization Capability for Subjective Speed-accuracy Biases in Rectangular-target Pointing

Shota Yamanaka
Yahoo Japan Corporation
Chiyoda-ku, Tokyo, Japan
Hiroki Usuba
Meiji University
Nakano-ku, Tokyo, Japan
Homei Miyashita
Meiji University
Nakano-ku, Tokyo, Japan
ABSTRACT
The effective width method of Fitts’ law can normalize speed-accuracy biases in 1D target pointing tasks. However, in graphical user interfaces, more meaningful target shapes are rectangular. To empirically determine the best way to normalize the subjective biases, we ran remote and crowdsourced user experiments with three speed-accuracy instructions. We propose to normalize the speed-accuracy biases by applying the effective sizes to existing Fitts’ law formulations including width $W$ and height $H$. We call this target-size adjustment the bivariate effective width method. We found that, overall, Accot and Zhai’s weighted Euclidean model using the effective width and height independently showed the best fit to the data in which the three instruction conditions were mixed (i.e., the time data measured in all instructions were analyzed with a single regression expression). Our approach enables researchers to fairly compare two or more conditions (e.g., devices, input techniques, user groups) with the normalized throughputs.
CCS CONCEPTS
Human-centered computing → HCI theory, concepts and models; Empirical studies in HCI.
KEYWORDS
Fitts’ law, pointing, graphical user interface, human motor performance, crowdsourcing
ACM Reference Format:
Shota Yamanaka, Hiroki Usuba, and Homei Miyashita. 2022. Bivariate Effective Width Method to Improve the Normalization Capability for Subjective Speed-accuracy Biases in Rectangular-target Pointing. In CHI Conference on Human Factors in Computing Systems (CHI ’22), April 29-May 5, 2022, New Orleans, LA, USA. ACM, New York, NY, USA, 13 pages. https://doi.org/10.1145/3491102.3517466
1 INTRODUCTION
Evaluations of novel interaction techniques or devices compared with baselines are regularly conducted in the HCI field. Fitts’ law [18] gives researchers a formalized methodology in which participants point to targets. Because participants may be unintentionally biased towards either speed or accuracy [67], any such bias has to be normalized in order to compare different input techniques, devices, and user groups (e.g., children vs. older adults [49]). Fortunately, Fitts’ law has a single metric for user performance that normalizes the speed-accuracy tradeoffs, called throughput TP. The core idea of normalization is to use the effective target width $W_e$ that reflects the endpoint distribution exhibited by the participants, instead of the nominal target width $W$ displayed on the screen [13].
This is the authors’ preprint version.
CHI ’22, April 29-May 5, 2022, New Orleans, LA, USA
© 2022 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-9157-3/22/04...$15.00
https://doi.org/10.1145/3491102.3517466
However, most previous studies on the effective width method have focused on situations where the target is defined by $W$ alone, such as 1D ribbon-shaped targets [34, 67] or 2D circular targets [5, 61] (Figure 1a–b). In contrast, in realistic graphical user interfaces (GUIs), the shape of more meaningful targets is defined by $W$ and height $H$ (Figure 1c–d). The importance of testing user performance with rectangular targets is well known [1, 3, 29, 66, 68], but the characteristics of the effective height $H_e$ have rarely been discussed [28, 46, 53]. To our knowledge, how well $H_e$ normalizes the speed-accuracy biases in rectangular-target pointing has never been studied. This is important because input devices have different precisions in directions collinear and perpendicular to the cursor movement [40], and comparing device performance with rectangular targets increases external validity (i.e., tasks with higher realism) [1, 29].
In this study, we investigated the potential of integrating $H_e$ in the effective width method for normalizing speed-accuracy tradeoffs. We apply $W_e$ and $H_e$ to existing Fitts’ law formulations including $W$ and $H$. This target-size adjustment is called the bivariate effective width method. While the normalization capability has been shown for 1D targets [67] and circular targets [4], the potential of $H_e$ for rectangular targets has remained unexplored (Figure 1). In this study, we limited our experimental tasks to horizontal movements (Figure 1c) (see footnote 1).
We ran two experiments: a remotely instructed one with university students and a crowdsourced one. In both, we provided three subjectively biased speed-accuracy instructions. The remote study was an alternative to a conventional lab-based one. The purpose of the crowdsourced study was to replicate the remote study with much more diverse participants, thereby increasing the validity of the model evaluation. Because the two experimental styles serve different purposes, we do not compare their results directly. Our findings can be summarized as follows.
• When we analyzed each subjective bias condition, Accot and Zhai’s weighted Euclidean model [1] using nominal $W$ and $H$ showed the best fit in both experiments. Thus, if researchers would like to predict movement times more accurately with one instruction (e.g., balance the speed and accuracy) for one user group using a single input device, we recommend using the nominal $W$ and $H$.
• When we analyzed the data of the three instructions in a mixed manner (i.e., the time data are analyzed without separating the three bias conditions), Accot and Zhai’s model with $W_e$ and $H_e$ showed the best fit in most cases. Hence, if researchers want to compare different input devices, interaction techniques, or user groups, integrating $W_e$ and $H_e$ can adequately normalize speed-accuracy tradeoffs.
• When we used $W_e$ and $H_e$, the range of TP values for the three instructions was remarkably small. This finding also supports our conclusion on the normalization capability of the bivariate effective width method.
Footnote 1: The effectiveness of including $H_e$ for normalizing the speed-accuracy biases is not yet known, even in a specific (horizontal-movement) task condition. Thus, testing the effect of approaching angle $\theta$ (Figure 1d) is a logical next step after we confirm the effectiveness of integrating $H_e$.
Figure 1: Previous studies on pointing models for different task conditions.
                              | (a) Ribbon-shaped targets            | (b) Circular targets                       | (c) Rectangular targets              | (d) Rectangular targets
Movement direction            | Single-axis                          | Multi-directional                          | Single-axis                          | Defined by $\theta$
Nominal-size model            | Fitts 1954 [18]; MacKenzie 1992 [34] | MacKenzie 1992 [34]; Soukoreff+ 2004 [54]  | Crossman 1956 [13]; Accot+ 2003 [1]  | Yang+ 2010 [66]; Zhang+ 2012 [68]
Effective-size model          | Crossman 1956 [13]                   | Wobbrock+ 2011 [61]                        | Murata 1999 [46]                     | None
Normalization capability test | Zhai+ 2004 [67]                      | Batmaz+ 2021 [4]                           | This work                            | None
2 RELATED WORK
2.1 Fitts’ Law and Effective Width Method
Fitts’ law predicts the movement time MT to point to a target, which is linearly related to the index of difficulty ID:
$$\mathit{MT} = a + b \cdot \mathit{ID}, \tag{1}$$
where $a$ and $b$ are empirical constants. Given that the target distance is $D$ and its width is $W$, as shown in Figure 2a, MacKenzie’s formulation of ID [34] is widely used in the HCI field:
$$\mathit{ID} = \log_2(D/W + 1). \tag{2}$$
Since any ID formulation using nominal target parameters ignores the actual accuracy of participants, the higher-performance group changes depending on whether we give weight to speed or accuracy [54]. Because having several user groups with exactly the same error rate ER is a rare occurrence, a post-hoc adjustment of accuracy is needed. To enable such comparisons, the effective width method replaces the nominal $W$ with the effective width $W_e$ that takes the distribution of click positions (i.e., endpoints) into account [13], as
$$W_e = \sqrt{2\pi e}\,\sigma = 4.133\sigma, \tag{3}$$
where $\sigma$ is the standard deviation of the endpoints ($SD_x$ in Figure 2b). Using this method, $W_e$ is adjusted so that 4% of the clicks fall outside of the target. Then, we obtain the effective index of difficulty $\mathit{ID}_e$ by replacing $W$ in Equation 2 with $W_e$: i.e., $\mathit{ID}_e = \log_2(D/W_e + 1)$. $SD_x$ can also be used with circular targets [35, 61].
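The computation in Equation 3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors’ analysis code: the function names and the click coordinates are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
import math
import statistics

# sqrt(2 * pi * e) is the multiplier in Equation 3 (about 4.133)
EFFECTIVE_WIDTH_FACTOR = math.sqrt(2 * math.pi * math.e)


def effective_width(endpoints_x):
    """Return W_e = sqrt(2*pi*e) * SD_x for endpoints along the task axis."""
    # Assumption: sample standard deviation (n - 1 denominator).
    sd_x = statistics.stdev(endpoints_x)
    return EFFECTIVE_WIDTH_FACTOR * sd_x


def effective_id(distance, endpoints_x):
    """Return ID_e = log2(D / W_e + 1), i.e., Equation 2 with W replaced by W_e."""
    return math.log2(distance / effective_width(endpoints_x) + 1)


# Hypothetical click x-coordinates in pixels, relative to the target center
clicks_x = [-6.0, -2.5, 0.0, 1.5, 3.0, 4.0, -1.0, 2.0]
w_e = effective_width(clicks_x)
id_e = effective_id(380, clicks_x)
print(f"W_e = {w_e:.2f} px, ID_e = {id_e:.3f} bits")
```

A wider spread of clicks yields a larger $W_e$ and therefore a smaller $\mathit{ID}_e$, which is how the method credits careful participants and discounts hasty ones.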
While the effective width method assumes that endpoints are normally distributed over a target [13, 59], this assumption has some theoretical issues [22]. For example, ER is set to 4% arbitrarily, which has no information-theoretic justification. Still, most of the aspects of the effective width method are positive, particularly the fact that $\mathit{ID}_e$ enables device or user performances to be compared across different experimental conditions (see Section 3.3 in [21], [67]).
By using $\mathit{ID}_e$, researchers can obtain a unified measure of user performance, TP in bits/s, that integrates speed (in MT) and accuracy (in $SD_x$). A widely used definition of TP is
$$\mathit{TP} = \frac{1}{N_{cond}} \sum_{i=1}^{N_{cond}} \frac{\mathit{ID}_{e_i}}{\mathit{MT}_i}, \tag{4}$$
where $N_{cond}$ is the number of task conditions and $i$ indicates the $i$-th condition among $N_{cond}$ [54, 61]. Readers are directed to [48, 61] for detailed discussions on the differences between various TP definitions (e.g., whether or not the intercept of Fitts’ law regression is integrated). In this paper, we first aggregate the participants’ MT data for each task condition and then apply Equation 4, which is one of the possible ways to compute TP [48].
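Equation 4 is a mean of per-condition ratios, which the following sketch makes explicit. The function name and the input values are hypothetical; MT is assumed to be given in seconds so that the result is in bits/s.

```python
def throughput(id_e_values, mt_seconds):
    """Mean of ID_e / MT over task conditions (Equation 4), in bits/s."""
    if not id_e_values or len(id_e_values) != len(mt_seconds):
        raise ValueError("need one ID_e and one MT value per task condition")
    ratios = [id_e / mt for id_e, mt in zip(id_e_values, mt_seconds)]
    return sum(ratios) / len(ratios)


# Hypothetical per-condition values: ID_e in bits, aggregated MT in seconds
id_e_per_condition = [2.0, 3.0, 4.0]
mt_per_condition = [0.5, 0.75, 1.0]
tp = throughput(id_e_per_condition, mt_per_condition)
print(f"TP = {tp:.2f} bits/s")
```

Note that this averages the ratios rather than dividing the mean $\mathit{ID}_e$ by the mean MT; the two generally differ, which is one of the definitional choices discussed in [48, 61].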
2.2 Modified Versions of Fitts’ Law for Rectangular-target Pointing
We consider only left-right movements and define $W$ and $H$ as the target sizes on the x- and y-axes (Figure 1c), as in previous studies on rectangular-target pointing [1, 8, 13, 27, 31]. Crossman proposed the first model to predict MT for such targets by using another regression constant $c$ [13]:
$$\mathit{MT} = a + b \cdot \log_2\left(\frac{D}{W} + 1\right) + c \cdot \log_2\left(\frac{D}{H} + 1\right), \tag{5}$$
where $\mathit{ID} = [\log_2(D/W + 1) + c/b \cdot \log_2(D/H + 1)]$. Crossman’s original formulation did not include the “+1” factors. For a fair comparison with other models, we will use this plus-one form, as the previous studies did [1, 29, 66]. This decision does not affect our conclusions, because these constants have only trivial effects on model fitness (see the theoretical discussions in [22, 50]).
[Figure 2: (a) Parameters of a rectangular-target pointing (distance, width, and height) and (b) computation of $SD_x$ and $SD_y$. The “x” marks indicate click positions.]
Kvålseth proposed a slightly different model in which the difficulty for the target height was considered in addition to Fitts’ law [31]:
$$\mathit{MT} = a + b \cdot \log_2\left(\frac{D}{W} + 1\right) + c \cdot \log_2\left(\frac{1}{H}\right), \tag{6}$$
where $\mathit{ID} = [\log_2(D/W + 1) + c/b \cdot \log_2(1/H)]$. Note that $\frac{D}{W} + 1$ is originally defined as $\frac{D + W}{W}$ [34], and thus, the “+1” factor is included. In comparison, we found no justification to apply “+1” to $(1/H)$ in Equation 6. MacKenzie and Buxton [36] and Hoffmann and Sheikh [27] proposed a model using the smaller value of $W$ and $H$:
$$\mathit{ID}_{min} = \log_2\left(\frac{D}{\min(W, H)} + 1\right). \tag{7}$$
This model indicates that the time is solely affected by the more difficult dimension. Lastly, a well-known successful formulation for rectangular-target pointing is Accot and Zhai’s weighted Euclidean model using a free parameter $c$:
$$\mathit{ID} = \log_2\left(\sqrt{\left(\frac{D}{W}\right)^2 + c \cdot \left(\frac{D}{H}\right)^2} + 1\right). \tag{8}$$
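For reference, the candidate formulations in Equations 2 and 5 to 8 can be written as small Python functions. This is an illustrative sketch with hypothetical function names; for Crossman’s and Kvålseth’s models the functions return the two regression predictors (multiplied by $b$ and $c$ in the regression), and the value of $c$ passed to the Accot and Zhai function is an arbitrary example, since $c$ is estimated from data.

```python
import math


def id_mackenzie(d, w):
    """Equation 2: log2(D / W + 1)."""
    return math.log2(d / w + 1)


def crossman_terms(d, w, h):
    """Equation 5 predictors: (log2(D/W + 1), log2(D/H + 1))."""
    return math.log2(d / w + 1), math.log2(d / h + 1)


def kvalseth_terms(d, w, h):
    """Equation 6 predictors: (log2(D/W + 1), log2(1/H))."""
    return math.log2(d / w + 1), math.log2(1 / h)


def id_min(d, w, h):
    """Equation 7: only the smaller target dimension matters."""
    return math.log2(d / min(w, h) + 1)


def id_accot_zhai(d, w, h, c):
    """Equation 8: weighted Euclidean combination of D/W and D/H."""
    return math.log2(math.sqrt((d / w) ** 2 + c * (d / h) ** 2) + 1)


# One task condition from the experiment's design space: D = 640, W = 40, H = 20 pixels
d, w, h = 640, 40, 20
print(id_mackenzie(d, w))            # ignores H entirely
print(id_min(d, w, h))               # driven by H because H < W
print(id_accot_zhai(d, w, h, 0.25))  # c = 0.25 is a hypothetical value
```

With $c = 0$ the weighted Euclidean model reduces to Equation 2, so $c$ can be read as the weight given to the height constraint relative to the width constraint.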
2.3 Effective Width and Height for Rectangular-target Pointing
Our idea is to apply $W_e$ and $H_e$ to 2D forms of Fitts’ law (Equations 5–8). $H_e$ is defined in the same way as $W_e$, i.e., $H_e = 4.133 \cdot SD_y$, where $SD_y$ is the $\sigma$ of endpoints perpendicular to the task axis (Figure 2b). This requires that the endpoints on the y-axis are normally distributed over the target and that the endpoints on the x and y axes are uncorrelated, which has been empirically found to be the case [6, 27, 58].
Using $H_e$ was proposed by Murata [46]. He utilized square targets ($W = H$) and the approach angle towards the upper-right of $\theta = 45^\circ$ but measured $SD_x$ and $SD_y$ on the screen. He defined the target size as $\min(W_e, H_e)$ by using the $\mathit{ID}_{min}$ model. Another approach to replacing $W$ is to use the bivariate standard deviation $SD_{xy}$ as $\sigma$ in Equation 3 [16, 61]. We will compare these approaches with our method, which independently applies $W_e$ and $H_e$.
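The three ways of deriving effective target sizes compared here can be sketched as follows. This is a hedged illustration rather than the authors’ code: the function name and click data are hypothetical, sample standard deviations are assumed, and $SD_{xy}$ is assumed to be the root of the summed squared 2D distances from the centroid divided by n − 1, which is our reading of the bivariate deviation used in [61].

```python
import math
import statistics

FACTOR = 4.133  # multiplier from Equation 3


def bivariate_effective_sizes(points):
    """Return effective sizes computed from 2D endpoints (x along the task axis)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    n = len(points)
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)

    sd_x = statistics.stdev(xs)  # assumption: sample SD
    sd_y = statistics.stdev(ys)
    # Assumed definition of the bivariate SD: spread of 2D distances to the centroid
    sd_xy = math.sqrt(
        sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in points) / (n - 1)
    )

    w_e = FACTOR * sd_x
    h_e = FACTOR * sd_y
    return {
        "W_e": w_e,                 # independent effective width (this paper)
        "H_e": h_e,                 # independent effective height (this paper)
        "min_size": min(w_e, h_e),  # Murata's target size for the ID_min model
        "W_e_xy": FACTOR * sd_xy,   # effective width based on SD_xy
    }


# Hypothetical click positions (x, y) in pixels, relative to the target center
clicks = [(-5.0, 1.0), (-2.0, -1.5), (0.0, 0.5), (1.0, 2.0), (3.0, -1.0), (4.0, 0.0)]
sizes = bivariate_effective_sizes(clicks)
print(sizes)
```

Under these assumptions $SD_{xy}^2 = SD_x^2 + SD_y^2$, so the $SD_{xy}$-based size merges both axes into one value, whereas the independent approach keeps $W_e$ and $H_e$ separate for use in Equations 5 to 8.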
Jagacinski and Monk made an assumption that endpoints follow a bivariate normal distribution [28]. However, they used only circular targets and assumed that $SD_x$ was always equal to $SD_y$. Sheikh and Hoffmann confirmed that MT can be modeled by (1) Fitts’ law with $W$ and (2) Fitts’ law that replaces $W$ with $H_e$ [53]. They tested the fitness for these models separately and did not use $W_e$; thus, the fitness of a model integrating $W_e$ and $H_e$ is unknown.
2.4 Normalization Effect of the Effective Width Method
Zhai et al. gave participants three instructions, namely, Bias = Accurate, Neutral, and Fast, for emphasizing accuracy, balancing speed and accuracy, and emphasizing speed, respectively [67]. When they analyzed the three biases’ data in a mixed manner, Equation 2 using $W$ showed $R^2 = 0.696$ and using $W_e$ showed $R^2 = 0.825$, which demonstrates the normalization capability of the effective width method.
If the normalization works, the TPs for different speed-accuracy instructions should be close to each other. MacKenzie and Isokoski used three biases (Accurate, Neutral, and Fast), and the TPs were 5.70, 5.73, and 5.67 bits/s, respectively (< 1% differences) [38]. This result shows that using $W_e$ normalizes the speed-accuracy biases, which enables us to compare the accuracy-normalized performances of user groups having different biases.
However, if we analyze the model fitness for a single speed-accuracy instruction condition, the $R^2$ value obtained using $W_e$ will be smaller than that obtained using $W$. This result has been reported in numerous studies [5, 19, 33, 39, 67]. We will also check this possible disadvantage in our data analyses.
There are two main approaches for using the effective width method. First, a single instruction (typically “Neutral”) is given to a group of participants. It is inevitable that the participants will have different personal biases, e.g., some will operate a mouse rapidly while others slowly. Using $\mathit{ID}_e$ helps to normalize this personal bias by adjusting the error rates to 4%; this is the reason using the effective width method is recommended when measuring the performance of several devices or user groups (i.e., not predicting MTs under new conditions) [54]. Second, several instructions are given and each participant changes their speed-accuracy balance in an experiment. The effective width method normalizes these intentional biases, yielding invariant TPs between different instruction conditions and yielding a high model fitness when analyzing the data in a mixed manner [67]. The second approach is investigated in this paper.
3 EXPERIMENT 1: REMOTELY INSTRUCTED TASK WITH UNIVERSITY STUDENTS
Experiment 1 was designed to be run as a lab-based experiment. Due to COVID-19, we distributed the experimental system to student members of our laboratory and they performed the task in their homes using their own PCs and mice. We developed the experimental system using the Hot Soup Processor programming language.
With lab-based experiments, researchers typically use the same apparatus to evaluate models. If they use different mice, displays, cursor-speed settings, etc., it is difficult to discuss whether a poor model fit comes from (e.g.) inadequate parameters in the model or differences in screen resolutions. It is important to note here that target pointing performance is affected by various settings of the apparatus and PC, such as mouse latency [41], screen resolution [12], and cursor speed and acceleration function [11]. Hence, if it was not possible to draw a clear conclusion about whether using $H_e$ could normalize the speed-accuracy biases because, for example, the statistical differences between several models were not significant, it would be better to re-run the study in a more controlled lab-based environment.
However, it has been demonstrated that distributed (crowdsourced) user experiments lead to the conclusion that Fitts’ law holds [17, 30, 52]. For rectangular-target pointing, the best-fit model is not likely to change even when comparing the model fitness repeatedly by random-sampling; i.e., Accot and Zhai’s model has always been the best [64]. Therefore, it has been well-demonstrated that we can obtain a consistent conclusion on the best-fit model for lab-based and crowdsourced experiments. While we skip conducting a traditional lab-based study, it is worth running remote and crowdsourced experiments to examine our hypothesis. If researchers uncover a negative result when running a lab-based study in the future (e.g., introducing $H_e$ cannot normalize the speed-accuracy biases), it will give a different type of contribution; e.g., how to control the experimental apparatus could lead to different conclusions.
3.1 Task, Design, and Procedure
The task was to click the red target back and forth. The study was a 3 × 2 × 4 × 5 within-subjects design with the following independent variables and levels: three subjective biases (Bias = Accurate, Neutral, and Fast), two $D$s (380 and 640 pixels), four $W$s (30, 40, 60, and 90 pixels), and five $H$s (20, 30, 40, 60, and 150 pixels). The three Bias conditions were the same as those of previous studies [19, 67]. Several previous studies have also tested other biases (e.g., extremely accurate/fast [25, 67]), but in order to avoid an overly high number of task-condition combinations, we decided to examine one accuracy- and one speed-emphasized condition along with a baseline, which was sufficient for our purpose.
One session consisted of 15 cyclic clicks with a fixed $D \times W \times H$ condition. One block consisted of 40 (= 2 $D$ × 4 $W$ × 5 $H$) sessions for a fixed Bias condition. The first target was on the left side. When the participant clicked on the target, the colors of the red (target) and white (non-target) rectangles switched. If the participant missed the target, it flashed yellow, and the participant had to aim at it again until they clicked it successfully. We did not give auditory feedback for success or failure. After completing 15 successful clicks, the results of the session (MT and the number of errors) and a message to take a break were displayed. The first three clicks in each session were omitted and we used the remaining 12 clicks (six for each side) in the subsequent analyses. The order of the 40 $D \times W \times H$ conditions was randomized for each block. In total, we recorded 3 Bias × 2 $D$ × 4 $W$ × 5 $H$ × 12 clicks × 18 participants = 25,920 data points.
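The size of the design can be double-checked by enumerating the conditions. The sketch below uses only the levels stated above; the variable names are ours.

```python
from itertools import product

biases = ["Accurate", "Neutral", "Fast"]
distances = [380, 640]           # D in pixels
widths = [30, 40, 60, 90]        # W in pixels
heights = [20, 30, 40, 60, 150]  # H in pixels

# One block = every D x W x H combination under a fixed Bias
sessions_per_block = list(product(distances, widths, heights))

analyzed_clicks_per_session = 15 - 3  # the first three clicks are omitted
participants = 18

total_data_points = (
    len(biases) * len(sessions_per_block) * analyzed_clicks_per_session * participants
)
print(len(sessions_per_block), total_data_points)
```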
3.2 Pre-experiment Instructions and Practice
We asked the participants to watch a 2.5-min video in which one of the authors demonstrated the task. At this stage, we told them that there would be three Bias conditions and asked them to perform the tasks differently in terms of speed and accuracy. In addition, to control the cursor configuration, we asked them to set the cursor-speed slider in the Control Panel to default (middle) and turn on the cursor acceleration function (“Enhance pointer precision”), which is the default of the Windows OS. Using a specific configuration on the cursor speed is commonly done in lab-based experiments. However, differently from lab-based experiments, our participants had different mice and displays. Thus, our decision might negatively affect some participants’ performance, as the combinations of apparatus settings are known to affect target-pointing behavior, and this is a limitation of this study. To solve this issue, a more sophisticated method is needed, e.g., hardware-independent pointing transfer functions [26].
We asked the participants to run an executable file that provided a practice task with only one session for each Bias condition. In this practice, the parameters of $(D, W, H) = (450, 50, 25)$ pixels were fixed to values not used in the data-collection task, as the purpose of the practice was to allow the participants to get used to the three speed/accuracy balances with the set cursor speed. To do so, we set the first Bias condition to Neutral so that the participants could understand the balance between speed and accuracy and then shift it towards faster or slower operation. The order of the subsequent two conditions (Fast and Accurate) was randomized. Then, in the data collection trials, the order of the three Bias conditions was counter-balanced among the 18 participants.
3.3 Participants
We recruited 18 students from our university. All participants used optical mice. Each participant received JPY 5000 (approximately USD 48). The main pointing task typically took 30 to 40 min to complete. The participants’ demographics were as follows. Age: ranging from 21 to 24 years, $M = 22.2$ and $SD = 0.916$. Gender: 10 were male and 8 were female. PC usage history: ranged from 2 to 18 years, $M = 8.67$ and $SD = 4.22$. All were right-handed and used Windows 10.
4 RESULTS OF EXPERIMENT 1
We removed outlier data for trials in which the movement distance for the first click position was shorter than $D/2$ [17, 38]. We did not use another frequently used criterion that removes trials in which the first click position is more than $2W$ away from the target center [17, 38], because the endpoints for the Fast instruction were expected to be more widely spread than those in previous studies. In addition, we did not exclude trials or participants on the basis of MT, as extremely rapid or slow movements were possible depending on Bias. As a result, we removed 35 outlier trials (0.135%). The dependent variables were MT for the first click, ER, $SD_x$, and $SD_y$.
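The outlier criterion can be illustrated with a short filter. This is a sketch under assumptions: the trial tuples and the function name are hypothetical, and the movement distance is taken as the horizontal distance between the movement start and the first click, which the text does not spell out.

```python
def is_short_movement(start_x, first_click_x, distance):
    """True when the travelled distance is shorter than D / 2 (outlier criterion)."""
    # Assumption: movement distance = horizontal distance from start to first click.
    return abs(first_click_x - start_x) < distance / 2


# Hypothetical trials: (start_x, first_click_x, nominal D), all in pixels
trials = [
    (100.0, 478.0, 380),  # travelled 378 px: kept
    (100.0, 250.0, 380),  # travelled 150 px, below 190 px: removed
    (700.0, 65.0, 640),   # leftward movement of 635 px: kept
]
kept = [trial for trial in trials if not is_short_movement(*trial)]

# Share of removed trials reported in the text: 35 of 25,920
removed_percent = round(35 / 25920 * 100, 3)
print(len(kept), removed_percent)
```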
4.1 Normality Test
We tested normality by using the Shapiro-Wilk test ($\alpha = 0.05$) before we ran an RM-ANOVA. Although ANOVA is robust against violations of the normality assumption [15, 43], it is better to log-transform the data for detecting statistical significance more appropriately. Regarding MT, we found that 81 conditions out of 120 (3 Bias × 2 $D$ × 4 $W$ × 5 $H$) passed the normality test, or 67.5%. We then log-transformed the data and obtained 110 conditions (91.7%) that passed the test. After that, we ran the RM-ANOVA with Bonferroni’s $p$-value adjustment method for pairwise comparisons.
For the ER data, only seven conditions out of 120 passed the normality test (5.9%). Many of the values were 0%, and thus we could not log-transform them. Therefore, we used non-parametric ANOVAs with an aligned rank transform [60] and Tukey’s $p$-value adjustment method for pairwise comparisons.
For the $SD_x$ data, we found that 90 conditions passed the normality test (75.0%), and 105 conditions (87.5%) of log-transformed data passed the test. For $SD_y$, 103 conditions passed the test (85.8%), and then the log-transformed data from 115 conditions (95.8%) passed the test. We ran RM-ANOVAs for log-transformed $SD_x$ and $SD_y$ data. Note that the normality test examined whether the 18 participants’ data were normally distributed, and the results were independent of whether the click positions were normally distributed.
4.2 Movement Time
Throughout this paper, for the $F$ statistic, the degrees of freedom for the main effects of Bias, $D$, $W$, and $H$, as well as their interactions, were corrected using the Greenhouse-Geisser method when Mauchly’s sphericity assumption was violated ($\alpha = 0.05$). Because our focus is on model fitness, we limit our report here to the main effects of the independent variables for simplicity (and more detailed results are included in the supplementary materials).
We found significant main effects of Bias ($F_{2,34} = 87.53$, $p < 0.001$, $\eta_p^2 = 0.84$), $D$ ($F_{1,17} = 586.2$, $p < 0.001$, $\eta_p^2 = 0.97$), $W$ ($F_{2.138,36.35} = 870.6$, $p < 0.001$, $\eta_p^2 = 0.98$), and $H$ ($F_{4,68} = 64.01$, $p < 0.001$, $\eta_p^2 = 0.79$) on MT. Significant interactions ($p < 0.05$) were found for Bias × $D$, Bias × $W$, $D$ × $W$, $W$ × $H$, and Bias × $D$ × $W$. As expected, MT decreased when the instructions emphasized speed more, when $D$ decreased, and when $W$ and $H$ increased (Figure 3). In particular, these results show that the participants appropriately followed the Bias instructions (Figure 3a).
4.3 Error Rate
We found significant main effects of Bias ($F_{2,34} = 87.05$, $p < 0.001$, $\eta_p^2 = 0.84$) and $W$ ($F_{3,51} = 13.87$, $p < 0.001$, $\eta_p^2 = 0.45$) on ER, but no significant effect of $D$ ($F_{1,17} = 1.054$, $p = 0.32$, $\eta_p^2 = 0.058$) or $H$ ($F_{4,68} = 0.349$, $p = 0.84$, $\eta_p^2 = 0.020$). Significant interactions ($p < 0.05$) were found for Bias × $W$, $D$ × $W$, $D$ × $W$ × $H$, and Bias × $D$ × $W$ × $H$.
ER increased when the instruction emphasized speed more (Figure 4a) and when $W$ decreased (Figure 4c). In comparison, $D$ and $H$ did not significantly affect ER (Figure 4b and d). The same lack of effect of $D$ on ER has been found in previous studies [1, 65]. In contrast to our results, Accot and Zhai reported that $H$ had a significant effect on ER. If we had tested a much smaller $H$, such as 8 pixels [1] or 1 mm [27], this result might have been different.
4.4 Endpoint Variability in $SD_x$ and $SD_y$
For $SD_x$, we found significant main effects of Bias ($F_{2,34} = 71.79$, $p < 0.001$, $\eta_p^2 = 0.81$), $W$ ($F_{3,51} = 891.9$, $p < 0.001$, $\eta_p^2 = 0.98$), and $H$ ($F_{4,68} = 3.792$, $p < 0.01$, $\eta_p^2 = 0.18$), but no significant effect of $D$ ($F_{1,17} = 2.799$, $p = 0.11$, $\eta_p^2 = 0.14$). Significant interactions ($p < 0.05$) were found for $W$ × $H$ and $D$ × $W$ × $H$. For $SD_y$, we found significant main effects of Bias ($F_{2,34} = 26.85$, $p < 0.001$, $\eta_p^2 = 0.61$), $D$ ($F_{1,17} = 66.99$, $p < 0.001$, $\eta_p^2 = 0.80$), $W$ ($F_{3,51} = 3.063$, $p < 0.05$, $\eta_p^2 = 0.15$), and $H$ ($F_{1.718,29.20} = 447.0$, $p < 0.001$, $\eta_p^2 = 0.96$). Significant interactions ($p < 0.05$) were found for $D$ × $H$ and $W$ × $H$.
Figure 5 plots the endpoint distributions. We can confirm here that more clicks missed the target when the instructions emphasized speed more. Also, regarding $H_e$, more clicks were located close to the target center on the y-axis when Bias = Accurate compared with Fast. For example, when $H = 20$ pixels, the $SD_y$ for the Accurate condition was 3.414 pixels, while those for the Neutral and Fast conditions were 3.925 and 4.386 pixels, respectively; the $SD_y$ increased by 28% at most. This supports our hypothesis that, in addition to $SD_x$, the $SD_y$ data change according to the Bias.
Figure 5 also shows that the spread of hits on the y-axis is likely to increase as $H$ increases. To validate this, we checked the regression between given target sizes and endpoint variability. Figure 6 shows that $SD_x$ and $SD_y$ increased with $W$ and $H$, respectively, with $R^2 > 0.85$. In addition, when the instructions emphasized speed more, the $SD_x$ values increased, showing larger intercepts and steeper slopes. However, this relationship did not hold for $SD_y$; e.g., the slope for the Neutral condition was higher than that for Fast. One possible explanation for this is that, when the instruction was Fast, the participants tended to click roughly around the target even when $H$ was small, and thus $SD_y$ became larger compared with Neutral. This led the $SD_y$ values at low $H$ values to be higher for Fast, thus tilting the regression line clockwise and flattening its slope. Therefore, the slopes of the regression lines are not always higher for the faster Bias instructions. Another tendency was that the fitness for ($W$, $SD_x$) was greater than that for ($H$, $SD_y$). This was possibly because we chose an extreme value of $H = 150$ pixels.
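The explanation for the flatter Fast slope can be illustrated with an ordinary least-squares fit. The sketch below is a toy example: the $SD_y$ values are invented for illustration and are not the experimental data; only the $H$ levels come from the study design.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


heights = [20, 30, 40, 60, 150]  # H levels used in the experiment (pixels)

# Hypothetical SD_y values (pixels): Fast is raised mostly at small H
sd_y_neutral = [4.0, 5.0, 6.0, 7.5, 13.0]
sd_y_fast = [5.5, 6.2, 6.9, 8.1, 13.2]

neutral_slope, neutral_intercept, neutral_r2 = fit_line(heights, sd_y_neutral)
fast_slope, fast_intercept, fast_r2 = fit_line(heights, sd_y_fast)
print(neutral_slope, neutral_intercept)
print(fast_slope, fast_intercept)
```

In this toy data the Fast condition has the larger intercept but the smaller slope, mirroring the pattern described for $SD_y$: extra variability at small $H$ lifts the left end of the line more than the right end.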
4.5 Model Fitness
[Figure 3: Main effects on MT in Experiment 1. Throughout this paper, the error bars show 95% CIs, and the horizontal bars show significant differences ($p < 0.05$ at least). Mean MT [ms] by panel: (a) Bias: Accurate 766, Neutral 671, Fast 627; (b) $D$ [pixels]: 380 → 639, 640 → 737; (c) $W$ [pixels]: 30 → 775, 40 → 723, 60 → 654, 90 → 601; (d) $H$ [pixels]: 20 → 728, 30 → 689, 40 → 680, 60 → 672, 150 → 671.]
[Figure 4: Main effects on ER in Experiment 1. Mean ER [%] by panel: (a) Bias: Accurate 1.69, Neutral 4.82, Fast 11.18; (b) $D$ [pixels]: 380 → 5.75, 640 → 6.04; (c) $W$ [pixels]: 30 → 7.39, 40 → 6.43, 60 → 5.10, 90 → 4.66; (d) $H$ [pixels]: 20 → 6.09, 30 → 5.79, 40 → 5.57, 60 → 6.10, 150 → 5.93.]
[Figure 5: Click point distributions for $W = 40$ pixels condition by the 18 participants. The target height increases from left to right ($H$ = 20, 30, 40, 60, and 150 pixels). Three Bias conditions are shown at the top (Accurate), middle (Neutral), and bottom (Fast) rows. We aligned the task axis, i.e., click points for the leftward movements are flipped to the right to merge the data when computing $SD_x$ and $SD_{xy}$ [54, 61].]
We discuss the model fitness in a comparative manner. We use an adjusted $R^2$, and in addition, to discuss the model fitness statistically, we calculate AIC [2]. As a rule of thumb, (a) a lower AIC value indicates a better model and a model with the minimum AIC ($\mathit{AIC}_{minimum}$) is the best; (b) a model with AIC ≤ ($\mathit{AIC}_{minimum}$ + 2) is comparable with better models; and (c) a model with AIC > ($\mathit{AIC}_{minimum}$ + 10) can be safely rejected [9]. For simplicity, we consider an AIC difference greater than 10 to be significant. Table 1 shows the fitness results for the five model candidates (see footnote 2).
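The rule of thumb can be applied mechanically once the AIC values are known. The sketch below uses the AIC values reported in Table 1 for the Mixed analysis with effective target sizes; the function name and the label "inconclusive" for differences between 2 and 10 are our own choices, since the text only treats differences greater than 10 as significant.

```python
def classify_by_aic(aic_by_model, comparable_margin=2.0, reject_margin=10.0):
    """Label each model by its AIC difference from the minimum-AIC model."""
    best_aic = min(aic_by_model.values())
    labels = {}
    for name, aic in aic_by_model.items():
        delta = aic - best_aic
        if delta == 0:
            labels[name] = "best"
        elif delta <= comparable_margin:
            labels[name] = "comparable"
        elif delta > reject_margin:
            labels[name] = "rejected"
        else:
            labels[name] = "inconclusive"  # our label for 2 < delta <= 10
    return labels


# AIC values from Table 1 (Mixed data, effective target sizes)
mixed_effective_aic = {
    "MacKenzie (SDx)": 1214,
    "MacKenzie (SDxy)": 1292,
    "Crossman": 1195,
    "Kvalseth": 1187,
    "IDmin": 1399,
    "Accot & Zhai": 1179,
}
labels = classify_by_aic(mixed_effective_aic)
for model, label in labels.items():
    print(f"{model}: {label}")
```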
When we used the nominal values, Accot and Zhai’s model always had the highest adjusted $R^2$ and lowest AIC values. Thus, Accot and Zhai’s model was the best for all Bias conditions and for Mixed data. When we used the effective target sizes, Accot and Zhai’s model again showed the highest adjusted $R^2$ and lowest AIC values, except for the Fast condition, where MacKenzie’s formulation using $SD_x$ gave the best fit. However, as the AIC differences from Crossman’s, Kvålseth’s, and Accot and Zhai’s models were less than 10, we could not actually determine that MacKenzie’s formulation was the best. Accot and Zhai’s model is thus a safe choice for a user experiment with a single instruction. Another insignificant difference was found for the Mixed condition: the difference in AIC between Kvålseth’s (1187) and Accot and Zhai’s models (1179) was less than 10.
Footnote 2: The supplementary material shows more comprehensive results including the free parameter values, non-adjusted $R^2$ values, and regression graphs.
Figure 6: Regression expressions for (a–c) $SD_x$ vs. $W$ and (d–f) $SD_y$ vs. $H$ in Experiment 1. All axes are in pixels. (a) Accurate: $SD_x = 0.1685W + 1.614$ ($R^2 = 0.962$); (b) Neutral: $SD_x = 0.1912W + 2.5673$ ($R^2 = 0.9348$); (c) Fast: $SD_x = 0.2484W + 2.9625$ ($R^2 = 0.881$); (d) Accurate: $SD_y = 0.0539H + 2.7879$ ($R^2 = 0.9004$); (e) Neutral: $SD_y = 0.0643H + 3.1665$ ($R^2 = 0.916$); (f) Fast: $SD_y = 0.0604H + 3.828$ ($R^2 = 0.8502$).
Table 1: Model fitness in Experiment 1. For the three Bias conditions, we regressed 40 data points ($2_D \times 4_W \times 5_H$), while for the “Mixed” data analysis, we used 120 data points in total. Only for the effective MacKenzie model, there is a choice as to whether to use $SD_x$ or $SD_{xy}$. An asterisk (*) marks the best-fit result (blue cells in the original) for each Bias condition within each {Nominal, Effective} target-size analysis.
Size       Ref.              Eq.  Accurate         Neutral          Fast             Mixed
                                  adj. R² AIC      adj. R² AIC      adj. R² AIC      adj. R² AIC
Nominal    MacKenzie         2    0.8785  393.3    0.8851  386.0    0.9097  374.1    0.6151  1344
Nominal    Crossman          5    0.9257  374.9    0.9284  368.4    0.9492  352.4    0.6434  1337
Nominal    Kvålseth          6    0.9189  378.4    0.9195  373.1    0.9413  358.2    0.6381  1338
Nominal    IDmin             7    0.5556  445.2    0.5707  438.8    0.5330  439.8    0.3857  1400
Nominal    Accot & Zhai      8    0.9656  344.1*   0.9748  326.7*   0.9821  310.8*   0.6700  1327*
Effective  MacKenzie (SDx)   2    0.8876  390.2    0.9086  376.9    0.9353  360.8*   0.8697  1214
Effective  MacKenzie (SDxy)  2    0.7273  425.6    0.7431  418.2    0.8256  400.4    0.7507  1292
Effective  Crossman          5    0.9235  376.1    0.9277  368.8    0.9211  370.0    0.8906  1195
Effective  Kvålseth          6    0.9209  377.4    0.9262  369.6    0.9217  369.7    0.8974  1187
Effective  IDmin             7    0.3518  460.3    0.3721  454.0    0.3327  454.1    0.3943  1399
Effective  Accot & Zhai      8    0.9477  360.9*   0.9473  356.2*   0.9242  368.4    0.9039  1179*
Comparing the nominal and effective target sizes, when we analyzed the single-instruction data, using the nominal values was always significantly better for all three Bias conditions. This is consistent with previous studies on the effective width method [62, 67]. Therefore, if researchers would like to predict MTs with a single instruction, we recommend using Accot and Zhai’s model with the nominal target sizes. In contrast, for the mixed-instruction data, the effective target sizes gave significantly better model fits, except for the $ID_{min}$ model. Therefore, if researchers would like to compare several input devices or user groups, we recommend using the effective target sizes. To check whether the effective height also matters, we applied the effective target size only for the width (i.e., using $W_e$ and $H$) to Accot and Zhai’s model for the Mixed data and obtained an adjusted $R^2 = 0.8912$ and AIC = 1194, which is more than 10 higher than the AIC of 1179 obtained with both $W_e$ and $H_e$ (Table 1). Thus, we found that using both $W_e$ and $H_e$ significantly contributed to the model fitness.
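The AIC comparisons in this section follow a simple rule: two models are treated as distinguishable only when their AIC values differ by 10 or more. The following is a minimal sketch of that rule, not the authors' analysis code; the function name `aic_verdict` and its `threshold` argument are illustrative assumptions, and the AIC values are those reported above.

```python
def aic_verdict(aic_a: float, aic_b: float, threshold: float = 10.0) -> str:
    """Compare two AIC values with the 'difference of at least 10' rule.

    Returns 'a' or 'b' for the preferred (lower-AIC) model, or
    'indistinguishable' when the absolute difference is below the threshold.
    Hypothetical helper written for illustration only.
    """
    diff = aic_a - aic_b
    if abs(diff) < threshold:
        return "indistinguishable"
    return "b" if diff > 0 else "a"  # lower AIC is better


# Mixed data, effective sizes: Kvalseth (1187) vs. Accot & Zhai (1179), difference 8.
print(aic_verdict(1187, 1179))
# Accot & Zhai with (W_e, H) at 1194 vs. with (W_e, H_e) at 1179, difference 15.
print(aic_verdict(1194, 1179))
```

Under this rule, the first comparison is inconclusive, whereas the second favors the model that uses both effective sizes.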
Now, we can visually grasp how the bivariate effective width method improves the fitness of Accot and Zhai’s model (see Figure 7). For the nominal data, the plot points in (a) are clearly shifted on the y-axis depending on the given instructions. Therefore, when we analyzed the Mixed data, the regression line passed between the Accurate and Fast conditions’ plot points. In contrast, the plot points in (b) are less biased by the instructional difference and lie closer to the regression line. This is because the effective width method changes ID in accordance with the actual endpoints. For example, for the nominal data, ID of Accot and Zhai’s model ranged from 2.40 to 4.63 bits, while the range when using the effective target sizes was 2.21 to 4.77 bits. This feature is important for normalizing the speed-accuracy biases and is consistent with the results of previous studies [5, 67].
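To illustrate why pooling instruction conditions on nominal ID lowers the fit, the sketch below generates noise-free points on the three per-condition regression lines reported for the nominal sizes (Figure 7a) and fits a single line to the pooled points. The ID values are hypothetical (chosen inside the reported 2.40 to 4.63 bit range), the helper names are our own, and the synthetic points are not the experimental data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def r_squared(xs, ys, a, b):
    """Coefficient of determination of the line y = a + b * x."""
    mean_y = sum(ys) / len(ys)
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    tss = sum((y - mean_y) ** 2 for y in ys)
    return 1 - rss / tss


# (slope, intercept) of the per-condition lines for nominal sizes (Figure 7a).
NOMINAL_LINES = {
    "Accurate": (154.98, 207.07),
    "Neutral": (146.24, 144.21),
    "Fast": (142.57, 113.27),
}
IDS = [2.5, 3.0, 3.5, 4.0, 4.5]  # hypothetical ID values [bits]

# Pool the synthetic points of all three conditions.
xs = [x for _ in NOMINAL_LINES for x in IDS]
ys = [slope * x + intercept
      for slope, intercept in NOMINAL_LINES.values() for x in IDS]

a, b = fit_line(xs, ys)
print(round(b, 2), round(a, 2), round(r_squared(xs, ys, a, b), 3))
```

Because the three synthetic series share the same ID values, the pooled slope and intercept are the averages of the per-condition values (147.93 and 154.85), which coincide with the Mixed line reported for the nominal sizes, while the pooled $R^2$ falls below 1 solely because of the vertical offsets between conditions.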
4.6 Throughput
Figure 8a shows the throughputs. In addition, we computed the range of these TP values. We defined the TP difference among the three Bias conditions as $100\% \times (TP_{max} - TP_{min})/TP_{max}$. For example, for MacKenzie’s formulation using the nominal $W$, the TP difference is $100\% \times (5.459 - 4.468)/5.459 = 18.15\%$. If a certain model “perfectly” normalizes the speed-accuracy biases, the TP difference is 0%, and the TP difference does not reach 100% because $TP_{min}$ is non-zero; i.e., 0% ≤ TP difference < 100%. In addition to the model fitness, this TP difference is another intuitive metric to discuss the TP normalization capability of models. However, note that there is no clear threshold to determine the capability.
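The TP difference defined above can be computed in a few lines. This is a minimal sketch; the function name `tp_difference` is an illustrative choice, and the two TP values are the maximum and minimum quoted in the worked example for MacKenzie’s formulation with the nominal width.

```python
def tp_difference(tps):
    """TP difference in percent: 100 * (TP_max - TP_min) / TP_max."""
    tp_max, tp_min = max(tps), min(tps)
    return 100.0 * (tp_max - tp_min) / tp_max


# TP_min and TP_max [bits/s] from the worked example in the text.
print(round(tp_difference([4.468, 5.459]), 2))  # 18.15
```

A value of 0 would mean that the three Bias conditions yield identical throughputs.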
With the effective width method, the TPs for different speed-accuracy biases are close to each other [38], so a small TP difference is preferred. By comparing the nominal and effective target sizes, we found that the effective values achieved this goal (Figure 8b). MacKenzie’s formulation using $SD_x$ gave the smallest difference, while Accot and Zhai’s model gave the second smallest. For the $ID_{min}$ model, because the model fitness for the Mixed data was the lowest (see Table 1), its TP difference remained remarkably large even with the effective width method (Figure 8b). In summary, the effective width method appropriately lowered the performance differences between the three Bias conditions, which demonstrates its normalization capability against speed-accuracy biases.
Figure 7: Model fitness using Accot and Zhai’s model with (a) nominal and (b) effective target sizes. Both panels plot MT [ms] against ID [bits]. Regression lines in (a): Accurate, $y = 154.98x + 207.07$ ($R^2 = 0.9670$); Neutral, $y = 146.24x + 144.21$ ($R^2 = 0.9757$); Fast, $y = 142.57x + 113.27$ ($R^2 = 0.9823$); Mixed, $y = 147.93x + 154.85$ ($R^2 = 0.6747$). Regression lines in (b): Accurate, $y = 171.98x + 108.50$ ($R^2 = 0.9477$); Neutral, $y = 166.16x + 77.209$ ($R^2 = 0.9495$); Fast, $y = 159.04x + 101.60$ ($R^2 = 0.9172$); Mixed, $y = 180.68x + 43.534$ ($R^2 = 0.9053$).
Figure 8: Throughputs in Experiment 1. (a) TP value and (b) TP difference. The plotted values are listed below (TP in bits/s).
Size       Model             Accurate  Neutral   Fast      Mixed     TP difference [%]
Nominal    MacKenzie         4.47      5.10      5.46      5.01      18.15
Nominal    Crossman          5.21      5.95      6.37      5.84      18.17
Nominal    Kvålseth          3.52      4.02      4.30      3.95      18.07
Nominal    IDmin             5.13      5.86      6.28      5.76      18.28
Nominal    Accot & Zhai      4.68      5.35      5.73      5.25      18.19
Effective  MacKenzie (SDx)   4.85      5.17      5.10      5.04      6.27
Effective  MacKenzie (SDxy)  4.14      4.44      4.42      4.33      6.77
Effective  Crossman          5.74      6.13      6.05      5.97      6.34
Effective  Kvålseth          3.74      3.83      3.58      3.72      6.53
Effective  IDmin             6.13      6.69      6.97      6.60      12.08
Effective  Accot & Zhai      4.98      5.32      5.26      5.19      6.32
5 EXPERIMENT 2: CROWDSOURCING USER STUDY
To further validate our bivariate effective width method, we replicated our experiment with another participant group. Because the method is for capturing the central tendency of user performances rather than that of a single person, recruiting plenty of participants will be helpful for observing the full capability of the method. Thus, our next experiment was run via crowdsourcing. We offered the task on Yahoo! Crowdsourcing (https://crowdsourcing.yahoo.co.jp). Almost all the task designs and procedures were the same as in the remote study. The points of difference are described below.
5.1 Participants and Recruitment
We recruited workers who used Windows (Vista or a later version). No other qualifications or special skills were required. We used the “White List” option in the crowdsourcing platform to screen out newly created accounts and thus avoid multiple entries by the same person. This option enabled us to offer the task only to workers who were considered reliable on the basis of their previous task history.
To reduce noise introduced by multiple pointing devices in the crowdsourcing data, we asked the workers to use a mouse if they had one, as a mouse is the most commonly available device other than a touchpad for non-laptop-PC users. Nevertheless, to avoid a possible false report in which all workers might answer that they used a mouse, we explicitly explained that any device was acceptable, and then removed the non-mouse users from the analysis. To increase the ecological validity, the workers were not instructed to change the cursor speed or acceleration-function setting. This decision also saved the time that workers would otherwise need to re-learn a new speed configuration.
After the workers finished all sessions and completed the questionnaire, they uploaded the log data file to a server to receive payment. Each worker received JPY 100 (USD 0.96). It typically took 10 min to complete the task, so the effective hourly payment was approximately JPY 600 (USD 5.8).
In total, 207 mouse users completed the task. Their demographics were as follows. Age: ranging from 20 to 72 years, $M = 43.5$, and $SD = 9.21$. Gender: 166 were male, 39 were female, and 2 preferred not to answer. Handedness: 14 were left-handed, and 193 were right-handed. Windows version: 1 used Vista, 21 used Win7, 5 used Win8, 5 used Win8.1, and 175 used Win10. PC usage history: ranging from 1 to 40 years, $M = 21.9$, and $SD = 7.00$.
5.2 Task and Procedure
There were several points of difference from Experiment 1. To shorten the entire task time, (1) there were no practice sessions, and (2) $D$ was fixed to 640 pixels because testing the other independent variables (Bias, $W$, and $H$) had higher priority. Previous studies on rectangular-target pointing also used a single $D$ value [8, 13, 27, 31]. The numbers of $W$ and $H$ values were reduced (from four and five in Experiment 1, respectively): $W = 30$, 50, and 90 pixels, and $H = 20$, 40, 70, and 150 pixels. Each session consisted of 19 clicks rather than 15 to increase the reliability of the endpoint distributions ($SD$). The first five clicks in each session were omitted, and thus, 14 clicks were used for data analyses. Text instructions were given instead of video instructions.
There were three Bias conditions, the same as in the remote study. A block consisted of 12 sessions ($3_W \times 4_H$) with a fixed Bias condition, so each worker completed 36 sessions in total. The order of the 12 $W \times H$ conditions was randomized for each block. In total, we recorded $3_{Bias} \times 1_D \times 3_W \times 4_H \times 14_{repetitions} \times 207_{workers} = 104{,}328$ clicks. As in the remote study, the instructions page informed the participants that there would be three Bias conditions and asked them to perform differently in terms of speed and accuracy.
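The design described above can be enumerated to double-check the session and click counts. This is only an illustrative sketch; the variable names are our own, and the condition values are those listed in this section.

```python
from itertools import product

biases = ["Accurate", "Neutral", "Fast"]
distances = [640]              # D was fixed [pixels]
widths = [30, 50, 90]          # W [pixels]
heights = [20, 40, 70, 150]    # H [pixels]

clicks_per_session = 19
discarded_clicks = 5           # first five clicks of each session
analyzed_clicks = clicks_per_session - discarded_clicks
workers = 207

# One session per Bias x D x W x H combination.
sessions = list(product(biases, distances, widths, heights))
sessions_per_block = len(sessions) // len(biases)
total_clicks = len(sessions) * analyzed_clicks * workers

print(sessions_per_block, len(sessions), total_clicks)
```

The enumeration yields 12 sessions per block, 36 sessions per worker, and 104,328 analyzed clicks, in agreement with the text.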
6 RESULTS OF EXPERIMENT 2
6.1 Screening Outlier Data and Normality Test
We removed outlier data if the distance of the click position was shorter than $D/2$. There were 149 outliers (0.143%). As a check, we tried to detect workers who had exhibited extremely short or long MTs. The inter-quartile range method [14, 17], a robust and frequently used method, was utilized for this. It flagged two workers who showed mean MTs of 1358 and 1532 ms across the 36 sessions. These workers seemed to lean towards accuracy more than the other workers, but this did not violate our task instructions and thus their data were not removed.
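A minimal sketch of an inter-quartile range check on per-worker mean MTs is shown below. The multiplier of 1.5 and the quartile definition are assumptions (the text does not state them), the function name is hypothetical, and all sample values except 1358 and 1532 ms are invented for illustration.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k * IQR, Q3 + k * IQR].

    k = 1.5 is the common textbook choice, assumed here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical per-worker mean MTs [ms]; only 1358 and 1532 come from the text.
mean_mts = [720, 750, 765, 780, 790, 800, 810, 825, 840, 860, 1358, 1532]
print(iqr_outliers(mean_mts))
```

As in the study, flagged workers need not be removed; the check only identifies candidates for closer inspection.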
Even after we log-transformed the data of MT, $SD_x$, and $SD_y$, we found that only 0, 4, and 14 of the 36 conditions passed the normality test (0, 11.1, and 38.9%), respectively. Still, we consistently ran RM-ANOVAs, as ANOVA can be used robustly [15, 43]. For ER, we used non-parametric ANOVAs with an aligned rank transform.
6.2 Movement Time
We found significant main effects of Bias ($F_{1.664,342.8} = 246.6$, $p < 0.001$, $\eta_p^2 = 0.55$), $W$ ($F_{1.611,331.8} = 2870$, $p < 0.001$, $\eta_p^2 = 0.93$), and $H$ ($F_{2.659,547.8} = 383.4$, $p < 0.001$, $\eta_p^2 = 0.65$) on MT. Significant interactions ($p < 0.05$) were found for Bias $\times$ $H$, $W \times H$, and Bias $\times$ $W \times H$. The MT decreased when the instructions emphasized speed more and when $W$ and $H$ increased (Figure 9). These results demonstrate that the participants appropriately followed the Bias instructions.
6.3 Error Rate
We found significant main effects of Bias ($F_{2,412} = 262.9$, $p < 0.001$, $\eta_p^2 = 0.56$), $W$ ($F_{2,412} = 1711$, $p < 0.001$, $\eta_p^2 = 0.89$), and $H$ ($F_{3,618} = 420.3$, $p < 0.001$, $\eta_p^2 = 0.67$) on ER. Significant interactions ($p < 0.05$) were found for Bias $\times$ $W$, Bias $\times$ $H$, $W \times H$, and Bias $\times$ $W \times H$. The significant main effect of $H$ (Figure 10c) is interesting because it was not found in the remote study. The larger sample size in this experiment likely helped to detect the significance.
6.4 Endpoint Variability in $SD_x$ and $SD_y$
For $SD_x$, we found significant main effects of Bias ($F_{1.538,316.9} = 252.1$, $p < 0.001$, $\eta_p^2 = 0.55$), $W$ ($F_{1.566,322.5} = 3711$, $p < 0.001$, $\eta_p^2 = 0.95$), and $H$ ($F_{2.899,597.161} = 30.46$, $p < 0.001$, $\eta_p^2 = 0.13$). Significant interactions ($p < 0.05$) were found for Bias $\times$ $W$, Bias $\times$ $H$, and $W \times H$. For $SD_y$, we found significant main effects of Bias ($F_{2,412} = 57.20$, $p < 0.001$, $\eta_p^2 = 0.22$) and $H$ ($F_{1.749,360.3} = 2303$, $p < 0.001$, $\eta_p^2 = 0.92$), but not for $W$ ($F_{1.865,384.2} = 1.463$, $p = 0.23$, $\eta_p^2 = 0.007$). Significant interactions ($p < 0.05$) were found for Bias $\times$ $H$ and $W \times H$. The regression expressions are as follows:
Accurate: $SD_x = 1.939 + 0.1542W$ ($R^2 = 0.9915$), $SD_y = 2.310 + 0.07095H$ ($R^2 = 0.9768$)    (9)
Neutral: $SD_x = 2.373 + 0.1702W$ ($R^2 = 0.9819$), $SD_y = 2.626 + 0.07068H$ ($R^2 = 0.9717$)    (10)
Fast: $SD_x = 3.677 + 0.1936W$ ($R^2 = 0.9612$), $SD_y = 3.138 + 0.07043H$ ($R^2 = 0.9607$)    (11)
The $R^2$ values were greater than those in Experiment 1 ($R^2$ ranged from 0.85 to 0.96), probably because there were fewer regression points in Experiment 2. The intercepts and slopes for $SD_x$ monotonically increased when the instruction emphasized speed more. For $SD_y$, this was true only for the intercepts.
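The regression expressions (9) to (11) can be turned into predicted effective sizes and effective IDs, which shows how the same nominal target maps to different difficulty values depending on the instruction. The sketch below rests on several assumptions that are not stated in this section: the effective size is taken as $\sqrt{2\pi e} \cdot SD$ (about $4.133 \cdot SD$, the usual effective width convention), the ID follows the weighted Euclidean form of Accot and Zhai’s model, and the weight `ETA` is a hypothetical value rather than the fitted one.

```python
import math

# Assumption: effective size = sqrt(2*pi*e) * SD (about 4.133 * SD).
EFFECTIVE_FACTOR = math.sqrt(2 * math.pi * math.e)

# (intercept, slope) pairs copied from Equations 9-11.
SD_X_FIT = {"Accurate": (1.939, 0.1542), "Neutral": (2.373, 0.1702), "Fast": (3.677, 0.1936)}
SD_Y_FIT = {"Accurate": (2.310, 0.07095), "Neutral": (2.626, 0.07068), "Fast": (3.138, 0.07043)}


def predict_sd(fit, size):
    """Predict the endpoint SD [pixels] from a target size [pixels]."""
    intercept, slope = fit
    return intercept + slope * size


def id_accot_zhai(d, w, h, eta):
    """Weighted Euclidean index of difficulty [bits]; eta weights the height term."""
    return math.log2(math.sqrt((d / w) ** 2 + eta * (d / h) ** 2) + 1)


def effective_id(bias, d, w, h, eta):
    """ID computed from the effective sizes predicted for one Bias condition."""
    w_e = EFFECTIVE_FACTOR * predict_sd(SD_X_FIT[bias], w)
    h_e = EFFECTIVE_FACTOR * predict_sd(SD_Y_FIT[bias], h)
    return id_accot_zhai(d, w_e, h_e, eta)


D, W, H = 640, 50, 40   # one W x H condition of Experiment 2 [pixels]
ETA = 0.2               # hypothetical weight, not the fitted value

print("nominal ID:", round(id_accot_zhai(D, W, H, ETA), 2))
for bias in ("Accurate", "Neutral", "Fast"):
    print(bias, "effective ID:", round(effective_id(bias, D, W, H, ETA), 2))
```

Under these assumptions, the Accurate condition yields the smallest predicted effective sizes and therefore the highest effective ID for the same nominal target, and the Fast condition yields the lowest.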
6.5 Model Fitness
Table 2 shows the results of the five models we examined. Regardless of whether the nominal or effective target sizes were used, Accot and Zhai’s model always had the highest adjusted $R^2$ and lowest AIC values for both the single- and mixed-instruction data. Recall that, in the remote study, Accot and Zhai’s model was not always the best (see Table 1). If we apply the effective target size only for the width (i.e., using $W_e$ and $H$) to Accot and Zhai’s model for the Mixed data, we obtain an adjusted $R^2 = 0.9216$ and AIC = 346.7 (i.e., no significant difference from using $W_e$ and $H_e$, where AIC was 339.5). This shows that using both $W_e$ and $H_e$ helped to improve the model fitness, but not as clearly as we observed in the remote study, in which the AIC difference was significant.
Another positive aspect in this crowdsourced experiment was that there were no significant AIC differences between the nominal and effective width methods for the three Bias conditions when Accot and Zhai’s model was used. Previous studies considered the effective width to be inferior to the nominal width for analyzing single-instruction data [62, 67]. We also found that the adjusted $R^2$ values using the nominal width were always higher than those using $W_e$. Still, the differences are less than 0.02 points, with no significant AIC differences. Thus, even if we analyze a single Bias condition, the prediction accuracy of MTs with the effective sizes is not significantly lower than that with the nominal sizes.
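The two claims above (adjusted $R^2$ gaps below 0.02 and AIC gaps below 10 for Accot and Zhai’s model) can be checked directly against the values in Table 2. The sketch below is illustrative only; the dictionary layout and names are our own.

```python
# (adjusted R^2, AIC) for Accot and Zhai's model, copied from Table 2.
TABLE2_ACCOT_ZHAI = {
    "Accurate": {"nominal": (0.9898, 93.17), "effective": (0.9780, 102.4)},
    "Neutral": {"nominal": (0.9819, 97.64), "effective": (0.9626, 106.3)},
    "Fast": {"nominal": (0.9795, 97.04), "effective": (0.9678, 102.4)},
}

gaps = {}
for bias, row in TABLE2_ACCOT_ZHAI.items():
    r2_nominal, aic_nominal = row["nominal"]
    r2_effective, aic_effective = row["effective"]
    # Positive gaps mean the nominal sizes fit better.
    gaps[bias] = (r2_nominal - r2_effective, aic_effective - aic_nominal)
    print(bias, round(gaps[bias][0], 4), round(gaps[bias][1], 2))
```

All three adjusted $R^2$ gaps stay below 0.02 and all three AIC gaps stay below 10, consistent with the statement that the effective sizes do not significantly degrade single-instruction fits in this experiment.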
6.6 Throughput
Figure 11a shows the TP, and Figure 11b shows the ranges of these TP values: $100\% \times (TP_{max} - TP_{min})/TP_{max}$. By comparing the nominal and effective target sizes, we can see that the effective values normalized the TP differences more. Kvålseth’s model shows the strongest normalization capability, followed by MacKenzie’s ($SD_x$), Crossman’s, and Accot and Zhai’s models. As in the remote study, the crowdsourced study empirically showed that the bivariate effective width method lowered the TP differences between the three Bias conditions if we chose the appropriate model formulations. This again demonstrates the normalization capability against speed-accuracy biases.
Figure 9: Main effects on MT in Experiment 2. Mean MT [ms]: Bias (Accurate, Neutral, Fast) = 869, 798, 744; $W$ (30, 50, 90 pixels) = 899, 799, 712; $H$ (20, 40, 70, 150 pixels) = 856, 797, 781, 781.
Figure 10: Main effects on ER in Experiment 2. Mean ER [%]: Bias (Accurate, Neutral, Fast) = 2.12, 3.56, 8.18; $W$ (30, 50, 90 pixels) = 6.21, 4.32, 3.34; $H$ (20, 40, 70, 150 pixels; panel c) = 5.27, 4.34, 4.36, 4.51.
Table 2: Model fitness in Experiment 2. For the three Bias conditions, we regressed the 12 data points ($1_D \times 3_W \times 4_H$), while for the “Mixed” data analysis, we used those 36 data points in total. Only for the effective MacKenzie model, there is a choice of using $SD_x$ or $SD_{xy}$. An asterisk (*) marks the best-fit result (blue cells in the original) for each Bias condition within each {Nominal, Effective} target-size analysis.
Size       Ref.              Eq.  Accurate         Neutral          Fast             Mixed
                                  adj. R² AIC      adj. R² AIC      adj. R² AIC      adj. R² AIC
Nominal    MacKenzie         2    0.7742  127.7    0.8070  123.4    0.8318  119.7    0.5848  404.2
Nominal    Crossman          5    0.9123  119.0    0.9116  116.6    0.9420  109.5    0.6673  398.7
Nominal    Kvålseth          6    0.9084  119.5    0.9081  117.1    0.9403  109.9    0.6651  399.0
Nominal    IDmin             7    0.6153  134.1    0.5479  133.6    0.5509  131.5    0.4287  415.7
Nominal    Accot & Zhai      8    0.9898  93.17*   0.9819  97.64*   0.9795  97.04*   0.7115  393.6*
Effective  MacKenzie (SDx)   2    0.8163  125.2    0.8752  118.2    0.9064  112.6    0.8463  368.4
Effective  MacKenzie (SDxy)  2    0.3198  140.9    0.4171  136.7    0.5473  131.6    0.4640  413.4
Effective  Crossman          5    0.9157  118.5    0.9241  114.8    0.9372  110.5    0.9002  355.4
Effective  Kvålseth          6    0.9147  118.6    0.9235  114.9    0.9369  110.5    0.8997  355.6
Effective  IDmin             7    0.3130  141.1    0.2219  140.1    0.1752  138.8    0.3064  422.6
Effective  Accot & Zhai      8    0.9780  102.4*   0.9626  106.3*   0.9678  102.4*   0.9359  339.5*
Figure 11: Throughputs in Experiment 2. (a) TP value and (b) TP difference. The plotted values are listed below (TP in bits/s).
Size       Model             Accurate  Neutral   Fast      Mixed     TP difference [%]
Nominal    MacKenzie         4.31      4.70      5.03      4.68      14.33
Nominal    Crossman          5.25      5.72      6.13      5.70      14.34
Nominal    Kvålseth          2.99      3.26      3.49      3.25      14.35
Nominal    IDmin             4.85      5.29      5.67      5.27      14.39
Nominal    Accot & Zhai      4.53      4.94      5.29      4.92      14.36
Effective  MacKenzie (SDx)   4.74      4.96      4.99      4.90      4.95
Effective  MacKenzie (SDxy)  3.87      4.14      4.24      4.09      8.66
Effective  Crossman          5.64      5.92      5.95      5.84      5.23
Effective  Kvålseth          3.70      3.81      3.69      3.73      3.20
Effective  IDmin             5.68      6.05      6.30      6.01      9.89
Effective  Accot & Zhai      4.85      5.10      5.13      5.03      5.32
7 GENERAL DISCUSSION
7.1 Capability of Normalizing Speed-accuracy Tradeoffs and Choice of Models
Overall, the results of the remote and crowdsourced experiments indicate that using $W_e$ and $H_e$ appropriately normalized the subjective speed-accuracy biases. This capability was validated by the fact that (1) the regression expression for the mixed-instruction data showed better fits in terms of adjusted $R^2$ and AIC compared with using nominal values and (2) the throughput differences between the three Bias conditions were smaller when using $W_e$ and $H_e$. On the basis of the model fitness results, we recommend using Accot and Zhai’s model (Equation 8). This model was not always optimal for normalizing the TP values (see Figures 8 and 11), but the reliability of the TP data is established on the basis of the fitness of Fitts’ law. As Accot and Zhai’s model showed the best fit when analyzing the mixed-instruction data, it makes sense that this model appropriately normalized the biases.
Previously, using $W_e$ was recommended for comparing different devices or user groups, but if researchers who use rectangular targets were to apply $W_e$ to the baseline (MacKenzie) formulation, they would not observe the high prediction accuracy possible with more appropriate formulations. We provided the first evidence that applying $W_e$ and $H_e$ independently to proper models (Crossman, Kvålseth, or Accot and Zhai) achieved significant improvements in model fitness for the mixed-instruction data. Without such models, researchers have had to use infinite-height or circular targets, which are simplified artificial shapes. This means that they had no appropriate metric to compare different users or devices with realistic rectangular targets, which has clearly been a limitation in the HCI field.
7.2 Implications
Our bivariate effective width method has several possible applications in addition to point-and-click tasks with mice, and it will enable the comparison of devices and techniques with more realistic GUI targets that appear in actual situations. Because Fitts’ law holds for drag-and-drop operations [20, 37], we can compare the performances when participants select texts on a web browser or select multiple cells in a spreadsheet by dragging. In these cases, the font size and kerning would affect the text-selection performance (they act as target sizes), and the cell sizes of the spreadsheet would affect the selection time. It is known that endpoint variability perpendicular to the movement direction differs depending on input devices [40], and therefore, it is necessary to normalize the speed-accuracy biases for a fair comparison between baseline and novel techniques for drag-and-drop operations, e.g., [45, 57]. Still, the purpose of the bivariate effective width method is to allow researchers to conduct such comparison studies; the method itself does not directly judge if a given GUI design is good or not, such as whether the cell size is sufficient for rapid and accurate selection.
Another potential implication is for eye-gaze movements, which also follow Fitts’ law [51] and the effective width method [56] (note that there is a debate on the applicability of Fitts’ law to gaze-based pointing [23, 44, 55]). Murata et al. compared different input methods, such as gaze vs. mouse, and reported the time and accuracy separately (gaze was fast but inaccurate compared with mouse) [47]. Now, researchers can use a unified metric, TP, and can determine a better input technique after normalizing the accuracy. In the future, it would be worth examining the applicability of the bivariate effective width method to touch, gaze, and drag-and-drop operations.
7.3 Limitations and Future Work
Our conclusions are limited by experimental design considerations such as the ranges of $D$, $W$, and $H$ used in the two experiments. Moreover, while we followed the conventional methodology of Fitts’ law in that we used only necessary targets, in realistic situations there are additional buttons or icons that users do not want to select (called distractors [7, 63]), which would have an effect on the users’ pointing performance.
An untested target parameter was the approach angle $\theta$ of the cursor towards the target, which is known to affect Fitts’ law performance and model fitness [66, 68]. We tested only the simplest approach angles of $\theta = 0^\circ$ and $180^\circ$, where $\theta = 0^\circ$ is defined as rightward, whereas prior literature has examined some modifications to models, e.g., Zhang et al.’s model [68]. Ko et al. demonstrated a way to simplify $\theta$: when $\theta$ ranges from $0^\circ$ to $45^\circ$, the x-length of the target is defined as $W$, while the y-length is defined as $W$ when $45^\circ < \theta \le 90^\circ$ [29]. More recently, Ma et al. proposed using the projected target sizes [32]. Our future work will include experiments on such models.
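The angle-based simplification attributed to Ko et al. above can be sketched as a small helper. The text only states which side length is treated as $W$; assigning the other side length to $H$ and the function name `sizes_for_angle` are assumptions made for illustration.

```python
def sizes_for_angle(x_len, y_len, theta_deg):
    """Return (W, H) for an approach angle between 0 and 90 degrees.

    For 0 <= theta <= 45 the x-length is W; for 45 < theta <= 90 the
    y-length is W. Using the remaining side as H is an assumption.
    """
    if not 0 <= theta_deg <= 90:
        raise ValueError("theta_deg must be within [0, 90]")
    if theta_deg <= 45:
        return x_len, y_len
    return y_len, x_len


print(sizes_for_angle(30, 20, 0))    # horizontal approach
print(sizes_for_angle(30, 20, 90))   # vertical approach
```

With this rule, a horizontal approach keeps the x-length as the width, and a vertical approach swaps the roles of the two side lengths.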
While we used the data from all sessions and blocks, we checked if there were progress effects (learning, fatigue, etc.) on the MT results to validate our main claim. In Experiment 1, an RM-ANOVA showed no significant main effects of Block or Session, and no interaction effect among them (all $p > 0.7$). In contrast, we found the main effect of Block in Experiment 2 ($p < 0.05$). Pairwise tests with Bonferroni’s $p$-value adjustment showed that the MT was significantly longer ($p < 0.05$) for the first block than for the third one: 815 vs. 789 ms (95% CIs were 22 and 19 ms), respectively. The crowd workers seemed to get used to the task and exhibited shorter times in the final block, but the 95% CI error bars overlap each other and thus we consider it unfruitful to discuss this small learning effect. We suspect that the large sample size (207 workers) helped with finding this significant main effect of Block. Still, we could not remove any block’s data, because the three blocks correspond to the three Bias conditions, which is a limitation of this study.
There are several issues relating to remote and crowdsourced user studies, such as inconsistent display sizes and mouse models. Thus, factors affecting the performance of Fitts’ law tasks, such as the mouse-to-cursor latencies [10, 11], were not controlled. In addition, it is known that crowd workers tend to give the minimum effort in order to finish a task in a short time (called “satisficing” [24, 42]). Thus, we were concerned about the possibility that some workers may not have, for example, operated their mice carefully even when the instruction was Accurate. Because the core interest of the present study is the subjective biases, it is important that the participants followed the instructions.
To discover possible issues of lack of compliance with instructions, we tried to analyze how differently the participants exhibited MT and ER depending on each Bias condition. However, even if, for example, a participant had exhibited a mean ER of 5% for the Neutral condition and 6% for Accurate, we thought that we should not regard this as a violation of the instructions. This was because error clicks would occur by chance and ER could be affected by the order of the three Bias conditions. The focus of this study was not individual data; rather, we confirmed that the Bias had significant main effects on MT and ER with large effect sizes in both experiments. This demonstrates that, overall, the participants followed the instructions and changed their behavior accordingly. Our future work, of course, will include checking if the findings of this study also hold in lab-based controlled experiments in which the participants are more motivated to follow subjective instructions so that we can strengthen our conclusion that the bivariate effective width method normalizes speed-accuracy biases.
8 CONCLUSION
In this work, we explored the utility of the effective width method when it was applied to the target height in Fitts’ law tasks. The results of remotely conducted and crowdsourced experiments showed that Accot and Zhai’s weighted Euclidean model [1] using $W_e$ and $H_e$ almost always exhibited the best fit for the data mixing the three Bias conditions. Integrating $W_e$ and $H_e$ with bivariate Fitts’ law models normalizes the speed-accuracy biases and thus enables researchers to compare different task conditions. We also confirmed that using the nominal sizes showed the (sometimes significantly) better fitness when analyzing the data from a single-instruction condition, which is consistent with previous studies [39, 67]. Our recommendations are summarized as follows.
• Use Accot and Zhai’s model with $W$ and $H$ when researchers would like to predict MTs under new task conditions with a single input device and single instruction (“point to targets as rapidly and accurately as possible”, i.e., Neutral).
• Use Accot and Zhai’s model with $W_e$ and $H_e$ when researchers would like to compare two or more devices (e.g., mouse vs. touchpad vs. joystick), interaction techniques (e.g., a proposed method vs. baseline point-and-click), or user groups (children vs. young adults vs. older adults) with a single instruction (typically Neutral).
Rectangular objects are perhaps the most frequently arranged targets on desktops and mobile screens. Considering this, while user experiments with infinite-height or circular targets are frequently used, they would be too artificial to measure realistic user performances. It has been claimed that rectangular targets are needed for a better understanding of user behaviors in pointing tasks [1, 29]. By providing an appropriate metric for rectangular-target pointing that enables data obtained under different conditions to be compared, our work makes a useful methodological contribution to the HCI field.
ACKNOWLEDGMENTS
We thank the reviewers of CHI 2022 and International Journal of
Human-Computer Studies for their valuable feedback.
REFERENCES
[1]
Johnny Accot and Shumin Zhai. 2003. Refining Fitts’ Law Models for Bivariate
Pointing. In Proceedings of the SIGCHI Conference on Human Factors in Computing
Systems (Ft. Lauderdale, Florida, USA) (CHI ’03). ACM, New York, NY, USA,
193–200. https://doi.org/10.1145/642611.642646
[2]
Hirotugu Akaike. 1974. A new look at the statistical model identification. IEEE Trans. Automat. Control 19, 6 (1974), 716–723. https://doi.org/10.1109/TAC.1974.1100705
[3]
Caroline Appert, Olivier Chapuis, and Michel Beaudouin-Lafon. 2008. Evaluation
of Pointing Performance on Screen Edges. In Proceedings of the Working Confer-
ence on Advanced Visual Interfaces (Napoli, Italy) (AVI ’08). ACM, New York, NY,
USA, 119–126. https://doi.org/10.1145/1385569.1385590
[4]
Anil Ufuk Batmaz and Wolfgang Stuerzlinger. 2021. The Effect of Pitch in Auditory
Error Feedback for Fitts’ Tasks in Virtual Reality Training Systems. In 2021 IEEE
Virtual Reality and 3D User Interfaces (VR). IEEE, Washington, DC, USA, 85–94.
https://doi.org/10.1109/VR50410.2021.00029
[5]
Xiaojun Bi, Yang Li, and Shumin Zhai. 2013. FFitts Law: Modeling Finger Touch
with Fitts’ Law. In Proceedings of the SIGCHI Conference on Human Factors in
Computing Systems (Paris, France) (CHI ’13). ACM, New York, NY, USA, 1363–
1372. https://doi.org/10.1145/2470654.2466180
[6]
Xiaojun Bi and Shumin Zhai. 2013. Bayesian Touch: A Statistical Criterion
of Target Selection with Finger Touch. In Proceedings of the 26th Annual ACM
Symposium on User Interface Software and Technology (St. Andrews, Scotland,
United Kingdom) (UIST ’13). Association for Computing Machinery, New York,
NY, USA, 51–60. https://doi.org/10.1145/2501988.2502058
[7]
Renaud Blanch and Michael Ortega. 2011. Benchmarking Pointing Techniques
with Distractors: Adding a Density Factor to Fitts’ Pointing Paradigm. In Pro-
ceedings of the SIGCHI Conference on Human Factors in Computing Systems
(Vancouver, BC, Canada) (CHI ’11). ACM, New York, NY, USA, 1629–1638.
https://doi.org/10.1145/1978942.1979180
[8]
Michael Bohan, Mitchell Longstaff, Arend Van Gemmert, Miya Rand, and George Stelmach. 2003. Effects of target height and width on 2D pointing movement duration and kinematics. Motor Control 7, 3 (2003), 278–289. https://doi.org/10.1123/mcj.7.3.278
[9]
Kenneth P Burnham and David R Anderson. 2003. Model selection and multimodel
inference: a practical information-theoretic approach. Springer Science & Business
Media, Heidelberg, Germany.
[10]
Géry Casiez, Stéphane Conversy, Matthieu Falce, Stéphane Huot, and Nicolas
Roussel. 2015. Looking Through the Eye of the Mouse: A Simple Method for
Measuring End-to-end Latency Using an Optical Mouse. In Proceedings of the 28th
Annual ACM Symposium on User Interface Software & Technology (Charlotte,
NC, USA) (UIST ’15). ACM, New York, NY, USA, 629–636. https://doi.org/10.1145/2807442.2807454
[11]
Géry Casiez and Nicolas Roussel. 2011. No More Bricolage!: Methods and Tools to
Characterize, Replicate and Compare Pointing Transfer Functions. In Proceedings
of the 24th Annual ACM Symposium on User Interface Software and Technology
(Santa Barbara, California, USA) (UIST ’11). ACM, New York, NY, USA, 603–614.
https://doi.org/10.1145/2047196.2047276
[12]
Olivier Chapuis and Pierre Dragicevic. 2011. Effects of Motor Scale, Visual Scale, and Quantization on Small Target Acquisition Difficulty. ACM Trans. Comput.-Hum. Interact. 18, 3, Article 13 (Aug. 2011), 32 pages. https://doi.org/10.1145/1993060.1993063
[13]
Edward R.F.W. Crossman. 1956. The measurement of perceptual load in manual
operations. Ph.D. Dissertation. University of Birmingham.
[14]
Jay L. Devore. 2011. Probability and Statistics for Engineering and the Sciences
(8th ed.). Brooks/Cole, Stamford, CT, USA. ISBN-13: 978-0-538-73352-6.
[15]
Peter Dixon. 2008. Models of accuracy in repeated-measures designs. Journal of
Memory and Language 59, 4 (2008), 447–456.
[16]
Sarah A. Douglas, Arthur E. Kirkpatrick, and I. Scott MacKenzie. 1999. Testing
Pointing Device Performance and User Assessment with the ISO 9241, Part 9
Standard. In Proceedings of the SIGCHI Conference on Human Factors in Computing
Systems (Pittsburgh, Pennsylvania, USA) (CHI ’99). Association for Computing
Machinery, New York, NY, USA, 215–222. https://doi.org/10.1145/302979.303042
[17]
Leah Findlater, Joan Zhang, Jon E. Froehlich, and Karyn Moffatt. 2017. Differences in Crowdsourced vs. Lab-based Mobile and Desktop Input Performance Data. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI ’17). ACM, New York, NY, USA, 6813–6824. https://doi.org/10.1145/3025453.3025820
[18]
Paul M. Fitts. 1954. The information capacity of the human motor system in
controlling the amplitude of movement. Journal of Experimental Psychology 47, 6
(1954), 381–391. https://doi.org/10.1037/h0055392
[19]
P. M. Fitts and B. K. Radford. 1966. Information capacity of discrete motor
responses under different cognitive sets. Journal of Experimental Psychology 71, 4
(1966), 475–482.
[20]
Douglas J. Gillan, Kritina Holden, Susan Adam, Marianne Rudisill, and Laura
Magee. 1990. How Does Fitts’ Law Fit Pointing and Dragging?. In Proceed-
ings of the SIGCHI Conference on Human Factors in Computing Systems (Seat-
tle, Washington, USA) (CHI ’90). ACM, New York, NY, USA, 227–234. https:
//doi.org/10.1145/97243.97278
[21]
Julien Gori. 2018. Modeling the speed-accuracy tradeoff using the tools of information
theory. Ph.D. Dissertation. Université Paris-Saclay. https://pastel.archives-ouvertes.fr/tel-02005752
[22]
Julien Gori, Olivier Rioul, and Yves Guiard. 2018. Speed-Accuracy Tradeoff:
A Formal Information-Theoretic Transmission Scheme (FITTS). ACM Trans.
Comput.-Hum. Interact. 25, 5, Article 27 (Sept. 2018), 33 pages. https://doi.org/
10.1145/3231595
[23]
Julien Gori, Olivier Rioul, Yves Guiard, and Michel Beaudouin-Lafon. 2018. The
Perils of Confounding Factors: How Fitts’ Law Experiments Can Lead to False
Conclusions. In Proceedings of the 2018 CHI Conference on Human Factors in
Computing Systems (Montreal QC, Canada) (CHI ’18). ACM, New York, NY, USA,
Article 196, 10 pages. https://doi.org/10.1145/3173574.3173770
[24]
Sandy J. J. Gould, Anna L. Cox, and Duncan P. Brumby. 2016. Diminished Control
in Crowdsourcing: An Investigation of Crowdworker Multitasking Behavior.
ACM Trans. Comput.-Hum. Interact. 23, 3, Article 19 (June 2016), 29 pages. https:
//doi.org/10.1145/2928269
[25]
Yves Guiard, Halla B. Olafsdottir, and Simon T. Perrault. 2011. Fitts’ Law as an
Explicit Time/Error Trade-Off. In Proceedings of the SIGCHI Conference on Human
Factors in Computing Systems (Vancouver, BC, Canada) (CHI ’11). Association for
Computing Machinery, New York, NY, USA, 1619–1628. https://doi.org/10.1145/
1978942.1979179
[26]
Raiza Hanada, Damien Masson, Géry Casiez, Mathieu Nancel, and Sylvain
Malacria. 2021. Relevance and Applicability of Hardware-Independent Point-
ing Transfer Functions. In Proceedings of the 34th Annual ACM Symposium on
User Interface Software and Technology (Virtual Event, USA) (UIST ’21). As-
sociation for Computing Machinery, New York, NY, USA, 524–537. https:
//doi.org/10.1145/3472749.3474767
[27]
Errol R. Homann and Ilyas H. Sheikh. 1994. Eect of varying target height in
a Fitts’ movement task. Ergonomics 37, 6 (1994), 1071–1088. https://doi.org/10.
1080/00140139408963719
[28]
Richard J. Jagacinski and Donald L. Monk. 1985. Fitts’ Law in Two Dimensions
with Hand and Head Movements. Journal of Motor Behavior 17, 1
(1985), 77–95. https://doi.org/10.1080/00222895.1985.10735338
[29]
Yu-Jung Ko, Hang Zhao, Yoonsang Kim, IV Ramakrishnan, Shumin Zhai, and
Xiaojun Bi. 2020. Modeling Two Dimensional Touch Pointing. In Proceedings
of the 33rd Annual ACM Symposium on User Interface Software and Technology
(Virtual Event, USA) (UIST ’20). Association for Computing Machinery, New York,
NY, USA, 858–868. https://doi.org/10.1145/3379337.3415871
[30]
Steven Komarov, Katharina Reinecke, and Krzysztof Z. Gajos. 2013. Crowdsourc-
ing Performance Evaluations of User Interfaces. In Proceedings of the SIGCHI
Bivariate Eective Width Method to Improve the Normalization Capability CHI ’22, April 29-May 5, 2022, New Orleans, LA, USA
Conference on Human Factors in Computing Systems (Paris, France) (CHI ’13).
ACM, New York, NY, USA, 207–216. https://doi.org/10.1145/2470654.2470684
[31]
Tarald O. Kvålseth. 1977. A Generalized Model of Temporal Motor Control
Subject to Movement Constraints. Ergonomics 20, 1 (1977), 41–50. https://doi.
org/10.1080/00140137708931599
[32]
Yan Ma, Shumin Zhai, IV Ramakrishnan, and Xiaojun Bi. 2021. Modeling Touch
Point Distribution with Rotational Dual Gaussian Model. In Proceedings of the
34th Annual ACM Symposium on User Interface Software and Technology (Virtual
Event, USA) (UIST ’21). Association for Computing Machinery, New York, NY,
USA, 858–868. https://doi.org/10.1145/3472749.3474816
[33]
I. Scott MacKenzie. 1991. Fitts’ law as a performance model in human-computer
interaction. Ph.D. Dissertation. University of Toronto.
[34]
I. Scott MacKenzie. 1992. Fitts’ law as a research and design tool in human-
computer interaction. Human-Computer Interaction 7, 1 (1992), 91–139. https:
//doi.org/10.1207/s15327051hci0701_3
[35]
I. Scott MacKenzie. 2018. Fitts’ Law. John Wiley & Sons, Ltd, Hoboken,
NJ, USA, Chapter 17, 347–370. https://doi.org/10.1002/9781118976005.ch17
[36]
I. Scott MacKenzie and William Buxton. 1992. Extending Fitts’ Law to Two-
dimensional Tasks. In Proceedings of the SIGCHI Conference on Human Factors in
Computing Systems (Monterey, California, USA) (CHI ’92). ACM, New York, NY,
USA, 219–226. https://doi.org/10.1145/142750.142794
[37]
I. Scott MacKenzie and William Buxton. 1994. Prediction of pointing and dragging
times in graphical user interfaces. Interacting with Computers 6, 2 (06 1994), 213–
227. https://doi.org/10.1016/0953-5438(94)90025-6
[38]
I. Scott MacKenzie and Poika Isokoski. 2008. Fitts’ Throughput and the Speed-
Accuracy Tradeoff. In Proceedings of the SIGCHI Conference on Human Factors
in Computing Systems (Florence, Italy) (CHI ’08). ACM, New York, NY, USA,
1633–1636. https://doi.org/10.1145/1357054.1357308
[39]
I. Scott MacKenzie and Shaidah Jusoh. 2001. An Evaluation of Two Input Devices
for Remote Pointing. In Proceedings of the 8th IFIP International Conference on
Engineering for Human-Computer Interaction (EHCI ’01). Springer-Verlag, Berlin,
Heidelberg, 235–250.
[40]
I. Scott MacKenzie, Tatu Kauppinen, and Miika Silfverberg. 2001. Accuracy
Measures for Evaluating Computer Pointing Devices. In Proceedings of the SIGCHI
Conference on Human Factors in Computing Systems (Seattle, Washington, USA)
(CHI ’01). ACM, New York, NY, USA, 9–16. https://doi.org/10.1145/365024.365028
[41]
I. Scott MacKenzie and Colin Ware. 1993. Lag As a Determinant of Human
Performance in Interactive Systems. In Proceedings of the INTERACT ’93 and
CHI ’93 Conference on Human Factors in Computing Systems (Amsterdam, The
Netherlands) (CHI ’93). ACM, New York, NY, USA, 488–493. https://doi.org/10.
1145/169059.169431
[42]
Michael R. Maniaci and Ronald D. Rogge. 2014. Caring about carelessness: Participant
inattention and its effects on research. Journal of Research in Personality
48 (2014), 61–83. https://doi.org/10.1016/j.jrp.2013.09.008
[43]
M. José Blanca Mena, Rafael Alarcón, Jaume Arnau Gras, Roser Bono Cabré,
and Rebecca Bendayan. 2017. Non-normal data: Is ANOVA still a valid option?
Psicothema 29, 4 (2017), 552–557.
[44]
Darius Miniotas, Oleg Špakov, and I. Scott MacKenzie. 2004. Eye Gaze Interaction
with Expanding Targets. In CHI ’04 Extended Abstracts on Human Factors in
Computing Systems (Vienna, Austria) (CHI EA ’04). Association for Computing
Machinery, New York, NY, USA, 1255–1258. https://doi.org/10.1145/985921.
986037
[45]
Motoki Miura and Kenji Saisho. 2014. A Text Selection Technique Using Word
Snapping. Procedia Computer Science 35 (2014), 1644–1651. https://doi.org/10.
1016/j.procs.2014.08.257 Knowledge-Based and Intelligent Information & Engi-
neering Systems 18th Annual Conference, KES-2014 Gdynia, Poland, September
2014 Proceedings.
[46]
Atsuo Murata. 1999. Extending Effective Target Width in Fitts’ Law to a Two-
Dimensional Pointing Task. International Journal of Human-Computer Interaction
11, 2 (1999), 137–152. https://doi.org/10.1207/S153275901102_4
[47]
Atsuo Murata, Toshihisa Doi, Kazushi Kageyama, and Waldemar Karwowski.
2021. Development of an Eye-Gaze Input System With High Speed and Accuracy
through Target Prediction Based on Homing Eye Movements. IEEE Access 9
(2021), 22688–22697. https://doi.org/10.1109/ACCESS.2021.3055514
[48]
Halla B. Olafsdottir, Yves Guiard, Olivier Rioul, and Simon T. Perrault. 2012. A
New Test of Throughput Invariance in Fitts’ Law: Role of the Intercept and of
Jensen’s Inequality. In Proceedings of the 26th Annual BCS Interaction Specialist
Group Conference on People and Computers (Birmingham, United Kingdom) (BCS-
HCI ’12). BCS Learning & Development Ltd., Swindon, GBR, 119–126.
[49]
Xiangshi Ren and Xiaolei Zhou. 2011. An investigation of the usability of the sty-
lus pen for various age groups on personal digital assistants. Behaviour & Informa-
tion Technology 30, 6 (2011), 709–726. https://doi.org/10.1080/01449290903205437
[50]
Olivier Rioul and Yves Guiard. 2012. Power vs. logarithmic model of Fitts’ law:
A mathematical analysis. Mathematical Social Sciences 2012 (12 2012), 85–96.
https://doi.org/10.4000/msh.12317
[51]
Immo Schuetz, T. Scott Murdison, Kevin J. MacKenzie, and Marina Zannoli.
2019. An Explanation of Fitts’ Law-like Performance in Gaze-Based Selec-
tion Tasks Using a Psychophysics Approach. In Proceedings of the SIGCHI
Conference on Human Factors in Computing Systems (Glasgow, Scotland, UK)
(CHI ’19). Association for Computing Machinery, New York, NY, USA, 1–13.
https://doi.org/10.1145/3290605.3300765
[52]
Michail Schwab, Sicheng Hao, Olga Vitek, James Tompkin, Jeff Huang, and
Michelle A. Borkin. 2019. Evaluating Pan and Zoom Timelines and Sliders. In
Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems
(Glasgow, Scotland, UK) (CHI ’19). Association for Computing Machinery, New
York, NY, USA, 1–12. https://doi.org/10.1145/3290605.3300786
[53]
Ilyas H. Sheikh and Errol R. Hoffmann. 1994. Effect of target shape on movement
time in a Fitts task. Ergonomics 37, 9 (1994), 1533–1547. https://doi.org/10.1080/
00140139408964932
[54]
R. William Soukoreff and I. Scott MacKenzie. 2004. Towards a standard for
pointing device evaluation, perspectives on 27 years of Fitts’ law research in
HCI. International Journal of Human-Computer Studies 61, 6 (2004), 751–789.
https://doi.org/10.1016/j.ijhcs.2004.09.001
[55]
Veikko Surakka, Marko Illi, and Poika Isokoski. 2004. Gazing and Frowning as a
New Human–Computer Interaction Technique. ACM Trans. Appl. Percept. 1, 1
(July 2004), 40–56. https://doi.org/10.1145/1008722.1008726
[56]
Roel Vertegaal. 2008. A Fitts Law Comparison of Eye Tracking and Man-
ual Input in the Selection of Visual Targets. In Proceedings of the 10th Inter-
national Conference on Multimodal Interfaces (Chania, Crete, Greece) (ICMI
’08). Association for Computing Machinery, New York, NY, USA, 241–248.
https://doi.org/10.1145/1452392.1452443
[57]
Daniel Vogel and Patrick Baudisch. 2007. Shift: A Technique for Operating Pen-
Based Interfaces Using Touch. In Proceedings of the SIGCHI Conference on Human
Factors in Computing Systems (San Jose, California, USA) (CHI ’07). Association
for Computing Machinery, New York, NY, USA, 657–666. https://doi.org/10.
1145/1240624.1240727
[58]
Feng Wang and Xiangshi Ren. 2009. Empirical Evaluation for Finger Input
Properties in Multi-touch Interaction. In Proceedings of the SIGCHI Conference on
Human Factors in Computing Systems (Boston, MA, USA) (CHI ’09). ACM, New
York, NY, USA, 1063–1072. https://doi.org/10.1145/1518701.1518864
[59]
Alan Travis Welford. 1968. Fundamentals of skill. London: Methuen, North
Yorkshire, UK.
[60]
Jacob O. Wobbrock, Leah Findlater, Darren Gergle, and James J. Higgins. 2011.
The Aligned Rank Transform for Nonparametric Factorial Analyses Using Only
Anova Procedures. In Proceedings of the SIGCHI Conference on Human Factors
in Computing Systems (Vancouver, BC, Canada) (CHI ’11). ACM, New York, NY,
USA, 143–146. https://doi.org/10.1145/1978942.1978963
[61]
Jacob O. Wobbrock, Kristen Shinohara, and Alex Jansen. 2011. The Effects of Task
Dimensionality, Endpoint Deviation, Throughput Calculation, and Experiment
Design on Pointing Measures and Models. In Proceedings of the SIGCHI Conference
on Human Factors in Computing Systems (Vancouver, BC, Canada) (CHI ’11). ACM,
New York, NY, USA, 1639–1648. https://doi.org/10.1145/1978942.1979181
[62]
Charles E. Wright and Francis Lee. 2013. Issues Related to HCI Application of
Fitts’s Law. Human-Computer Interaction 28, 6 (2013), 548–578. https://doi.org/
10.1080/07370024.2013.803873
[63]
Shota Yamanaka. 2018. Eect of Gaps with Penal Distractors Imposing Time
Penalty in Touch-pointing Tasks. In Proceedings of the 20th International Confer-
ence on Human-Computer Interaction with Mobile Devices and Services (Barcelona,
Spain) (MobileHCI ’18). ACM, New York, NY, USA, 8 pages. https://doi.org/10.
1145/3229434.3229435
[64]
Shota Yamanaka. 2021. Comparing Performance Models for Bivariate Pointing
through a Crowdsourced Experiment. In Human-Computer Interaction – INTER-
ACT 2021. Springer International Publishing, Gewerbestr, Switzerland, 76–92.
https://doi.org/10.1007/978-3-030-85616-8_6
[65]
Shota Yamanaka and Hiroki Usuba. 2020. Rethinking the Dual Gaussian Dis-
tribution Model for Predicting Touch Accuracy in On-Screen-Start Pointing
Tasks. Proc. ACM Hum.-Comput. Interact. 4, ISS, Article 205 (Nov. 2020), 20 pages.
https://doi.org/10.1145/3427333
[66]
Huahai Yang and Xianggang Xu. 2010. Bias Towards Regular Configuration in 2D
Pointing. In Proceedings of the SIGCHI Conference on Human Factors in Computing
Systems (Atlanta, Georgia, USA) (CHI ’10). ACM, New York, NY, USA, 1391–1400.
https://doi.org/10.1145/1753326.1753536
[67]
Shumin Zhai, Jing Kong, and Xiangshi Ren. 2004. Speed-accuracy tradeoff in
Fitts’ law tasks: on the equivalency of actual and nominal pointing precision.
International Journal of Human-Computer Studies 61, 6 (2004), 823–856. https:
//doi.org/10.1016/j.ijhcs.2004.09.007
[68]
Xinyong Zhang, Hongbin Zha, and Wenxin Feng. 2012. Extending Fitts’ Law to
Account for the Eects of Movement Direction on 2D Pointing. In Proceedings of
the SIGCHI Conference on Human Factors in Computing Systems (Austin, Texas,
USA) (CHI ’12). ACM, New York, NY, USA, 3185–3194. https://doi.org/10.1145/
2207676.2208737