
Bivariate Effective Width Method to Improve the Normalization Capability for Subjective Speed-accuracy Biases in Rectangular-target Pointing

April 2022


DOI:10.1145/3491102.3517466

Conference: CHI '22: CHI Conference on Human Factors in Computing Systems

Authors:

Shota Yamanaka

Yahoo Japan

Homei Miyashita

Meiji University


Bivariate Effective Width Method to Improve the Normalization Capability for Subjective Speed-accuracy Biases in Rectangular-target Pointing

Shota Yamanaka

Yahoo Japan Corporation

Chiyoda-ku, Tokyo, Japan

Hiroki Usuba

Meiji University

Nakano-ku, Tokyo, Japan

Homei Miyashita

Meiji University

Nakano-ku, Tokyo, Japan

ABSTRACT

The effective width method of Fitts’ law can normalize speed-accuracy biases in 1D target pointing tasks. However, in graphical user interfaces, more meaningful target shapes are rectangular. To empirically determine the best way to normalize the subjective biases, we ran remote and crowdsourced user experiments with three speed-accuracy instructions. We propose to normalize the speed-accuracy biases by applying the effective sizes to existing Fitts’ law formulations including width 𝑊 and height 𝐻. We call this target-size adjustment the bivariate effective width method. We found that, overall, Accot and Zhai’s weighted Euclidean model using the effective width and height independently showed the best fit to the data in which the three instruction conditions were mixed (i.e., the time data measured in all instructions were analyzed with a single regression expression). Our approach enables researchers to fairly compare two or more conditions (e.g., devices, input techniques, user groups) with the normalized throughputs.

CCS CONCEPTS

• Human-centered computing → HCI theory, concepts and models; Empirical studies in HCI.

KEYWORDS

Fitts’ law, pointing, graphical user interface, human motor performance, crowdsourcing

ACM Reference Format:

Shota Yamanaka, Hiroki Usuba, and Homei Miyashita. 2022. Bivariate Effective Width Method to Improve the Normalization Capability for Subjective Speed-accuracy Biases in Rectangular-target Pointing. In CHI Conference on Human Factors in Computing Systems (CHI ’22), April 29-May 5, 2022, New Orleans, LA, USA. ACM, New York, NY, USA, 13 pages. https://doi.org/10.1145/3491102.3517466

1 INTRODUCTION

Evaluations of novel interaction techniques or devices compared with baselines are regularly conducted in the HCI field. Fitts’ law [ ] gives researchers a formalized methodology in which participants point to targets. Because participants may be unintentionally biased towards either speed or accuracy [ ], any such bias has to be normalized in order to compare different input techniques, devices, and user groups (e.g., children vs. older adults [ ]). Fortunately, Fitts’ law has a single metric for user performance that normalizes the speed-accuracy tradeoffs, called throughput TP. The core idea of normalization is to use the effective target width 𝑊𝑒 that reflects the endpoint distribution exhibited by the participants, instead of the nominal target width 𝑊 displayed on the screen [ ].

This is the authors’ preprint version.

CHI ’22, April 29-May 5, 2022, New Orleans, LA, USA

ACM ISBN 978-1-4503-9157-3/22/04...$15.00

https://doi.org/10.1145/3491102.3517466

However, most previous studies on the effective width method have focused on situations where the target is defined by 𝑊 alone, such as 1D ribbon-shaped targets [ ] or 2D circular targets [ ] (Figure 1a–b). In contrast, in realistic graphical user interfaces (GUIs), the shape of more meaningful targets is defined by 𝑊 and height 𝐻 (Figure 1c–d). The importance of testing user performance with rectangular targets is well known [ ], but the characteristics of the effective height 𝐻𝑒 have rarely been discussed [ ]. To our knowledge, how well 𝐻𝑒 normalizes the speed-accuracy biases in rectangular-target pointing has never been studied. This is important because input devices have different precisions in directions collinear and perpendicular to the cursor movement [ ], and comparing device performance with rectangular targets increases external validity (i.e., tasks with higher realism) [1,29].

In this study, we investigated the potential of integrating 𝐻𝑒 into the effective width method for normalizing speed-accuracy tradeoffs. We apply 𝑊𝑒 and 𝐻𝑒 to existing Fitts’ law formulations including 𝑊 and 𝐻. This target-size adjustment is called the bivariate effective width method. While the normalization capability has been shown for 1D targets [ ] and circular targets [ ], the potential of 𝐻𝑒 for rectangular targets has remained unexplored (Figure 1). In this study, we limited our experimental tasks to horizontal movements (Figure 1c)¹.

We ran two experiments: a remote-controlled one with university students and a crowdsourcing one. In both, we provided three subjectively biased speed-accuracy instructions. The remote study was an alternative to a conventional lab-based one. The purpose of the crowdsourced study was to replicate the remote study with much more diverse participants, thereby increasing the validity of the model evaluation. As the purposes for both experimental styles are different, we do not compare the results of these two experiments directly. Our findings can be summarized as follows.

• When we analyzed each subjective bias condition, Accot and Zhai’s weighted Euclidean model [ ] using nominal 𝑊 and 𝐻 showed the best fit in both experiments. Thus, if researchers would like to predict movement times more accurately with one instruction (e.g., balance the speed and accuracy) for one user group using a single input device, we recommend using the nominal 𝑊 and 𝐻.

• When we analyzed the data of the three instructions in a mixed manner (i.e., the time data are analyzed without separating the three bias conditions), Accot and Zhai’s model with 𝑊𝑒 and 𝐻𝑒 showed the best fit in most cases. Hence, if researchers want to compare different input devices, interaction techniques, or user groups, integrating 𝑊𝑒 and 𝐻𝑒 can adequately normalize speed-accuracy tradeoffs.

• When we used 𝑊𝑒 and 𝐻𝑒, the range of TP values for the three instructions was remarkably small. This finding also supports our conclusion on the normalization capability of the bivariate effective width method.

¹ The effectiveness of including 𝐻𝑒 for normalizing the speed-accuracy biases is not yet known, even in a specific (horizontal-movement) task condition. Thus, testing the effect of approaching angle 𝜃 (Figure 1d) is a logical next step after we confirm the effectiveness of integrating 𝐻𝑒.

Task condition | (a) Ribbon-shaped targets | (b) Circular targets | (c) Rectangular targets | (d) Rectangular targets
Movement direction | Single-axis | Multi-directional | Single-axis | Defined by 𝜃
Nominal-size model | Fitts 1954 [18]; MacKenzie 1992 [34] | MacKenzie 1992 [34]; Soukoreff+ 2004 [54] | Crossman 1956 [13]; Accot+ 2003 [1] | Yang+ 2010 [66]; Zhang+ 2012 [68]
Effective-size model | Crossman 1956 [13] | Wobbrock+ 2011 [61] | Murata 1999 [46] | None
Normalization capability test | Zhai+ 2004 [67] | Batmaz+ 2021 [4] | This work | None

Figure 1: Previous studies on pointing models for different task conditions.

2 RELATED WORK

2.1 Fitts’ Law and Effective Width Method

Fitts’ law predicts the movement time MT to point to a target, which is linearly related to the index of difficulty ID:

\[ MT = a + b \cdot ID, \qquad (1) \]

where 𝑎 and 𝑏 are empirical constants. Given that the target distance is 𝐷 and its width is 𝑊, as shown in Figure 2a, MacKenzie’s formulation of ID [34] is widely used in the HCI field:

\[ ID = \log_2\left(\frac{D}{W} + 1\right). \qquad (2) \]

Since any ID formulation using nominal target parameters ignores the actual accuracy of participants, the higher-performance group changes depending on whether we give weight to speed or accuracy [ ]. Because having several user groups with exactly the same error rate (ER) is a rare occurrence, a post-hoc adjustment of accuracy is needed. To enable such comparisons, the effective width method replaces the nominal 𝑊 with the effective width 𝑊𝑒 that takes the distribution of click positions (i.e., endpoints) into account [13], as

\[ W_e = \sqrt{2\pi e}\,\sigma = 4.133\,\sigma, \qquad (3) \]

where 𝜎 is the standard deviation of the endpoints (SDx in Figure 2b). Using this method, 𝑊𝑒 is adjusted so that ∼4% of the clicks fall outside of the target. Then, we obtain the effective index of difficulty IDe by replacing 𝑊 in Equation 2 with 𝑊𝑒: i.e., IDe = log2(𝐷/𝑊𝑒 + 1). SDx can also be used with circular targets [35,61].
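As an illustration of Equations 2 and 3, the following minimal Python sketch computes 𝑊𝑒 and IDe from a list of endpoint x-coordinates. The function names and the sample endpoints are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption of this sketch rather than something stated in the text.

```python
import math
import statistics

# Constant from Equation 3: sqrt(2 * pi * e), approximately 4.133.
K = math.sqrt(2 * math.pi * math.e)

def effective_width(endpoints_x):
    """W_e = 4.133 * SD of the endpoints along the task axis (Equation 3).

    Assumption: the sample standard deviation (n - 1) is used.
    """
    return K * statistics.stdev(endpoints_x)

def effective_id(distance, w_e):
    """ID_e = log2(D / W_e + 1), i.e., Equation 2 with W replaced by W_e."""
    return math.log2(distance / w_e + 1)

# Hypothetical endpoint offsets (pixels) relative to the target center.
xs = [-6.0, -2.0, 0.0, 1.0, 3.0, 4.0]
w_e = effective_width(xs)
id_e = effective_id(380, w_e)
print(f"W_e = {w_e:.2f} px, ID_e = {id_e:.2f} bits")
```

The same two functions apply unchanged to any nominal distance 𝐷; only the endpoint list differs between conditions.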

While the effective width method assumes that endpoints are normally distributed over a target [ ], this assumption has some theoretical issues [ ]. For example, ER is set to 4% arbitrarily, which has no information-theoretic justification. Still, most of the aspects of the effective width method are positive, particularly the fact that IDe enables device or user performances to be compared across different experimental conditions (see Section 3.3 in [67]).

By using IDe, researchers can obtain a unified measure of user performance, TP in bits/s, that integrates speed (in MT) and accuracy (in SDx). A famous definition of TP is

\[ TP = \frac{1}{N_{\mathrm{cond}}} \sum_{i=1}^{N_{\mathrm{cond}}} \frac{ID_{e_i}}{MT_i}, \qquad (4) \]

where 𝑁cond is the number of task conditions and 𝑖 indicates the 𝑖th condition among 𝑁cond [ ]. Readers are directed to [ ] for detailed discussions on the differences between various TP definitions (e.g., whether or not the intercept of Fitts’ law regression is integrated). In this paper, we first aggregate the participants’ data for each task condition and then apply Equation 4, which is one of the possible ways to compute TP [48].
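The throughput computation can be sketched as follows, assuming that Equation 4 is the mean of the per-condition ratios IDe/MT and that the participants’ data have already been aggregated into one (IDe, MT) pair per task condition. The function name and the sample values are hypothetical.

```python
def throughput(conditions):
    """Mean of ID_e / MT over task conditions (Equation 4, as assumed here).

    conditions: list of (id_e_bits, mt_seconds) pairs, one per task condition,
    already aggregated over participants.
    """
    ratios = [id_e / mt for id_e, mt in conditions]
    return sum(ratios) / len(ratios)

# Hypothetical aggregated data: every condition yields 5 bits/s.
conds = [(3.0, 0.60), (4.0, 0.80), (5.0, 1.00)]
tp = throughput(conds)
print(f"TP = {tp:.2f} bits/s")
```

Note that MT must be expressed in seconds for the result to be in bits/s.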

2.2 Modified Versions of Fitts’ Law for Rectangular-target Pointing

We consider only left-right movements and define 𝑊 and 𝐻 as the target sizes on the x- and y-axes (Figure 1c), as in previous studies on rectangular-target pointing [ ]. Crossman proposed the first model to predict MT for such targets by using another regression constant 𝑐 [13]:

\[ MT = a + b \cdot \log_2\left(\frac{D}{W} + 1\right) + c \cdot \log_2\left(\frac{D}{H} + 1\right), \qquad (5) \]

where ID = [log2(𝐷/𝑊 + 1) + 𝑐/𝑏 · log2(𝐷/𝐻 + 1)]. Crossman’s original formulation did not include the “+1” factors. For a fair comparison with other models, we will use this plus-one form, as the previous studies did [ ]. This decision does not affect our conclusions, because these constants have only trivial effects on model fitness (see the theoretical discussions in [ ]). Kvålseth proposed a slightly different model in which the difficulty for the target height was considered in addition to Fitts’ law [31]:

\[ MT = a + b \cdot \log_2\left(\frac{D}{W} + 1\right) + c \cdot \log_2\left(\frac{1}{H}\right), \qquad (6) \]

where ID = [log2(𝐷/𝑊 + 1) + 𝑐/𝑏 · log2(1/𝐻)]. Note that 𝐷/𝑊 + 1 is originally defined as (𝐷 + 𝑊)/𝑊 [ ], and thus, the “+1” factor is included. In comparison, we found no justification to apply “+1” to (1/𝐻) in Equation 6. MacKenzie and Buxton [ ] and Hoffmann and Sheikh [ ] proposed a model using the smaller value of 𝑊 and 𝐻:

\[ ID_{\min} = \log_2\left(\frac{D}{\min(W, H)} + 1\right). \qquad (7) \]

This model indicates that the time is solely affected by the more difficult dimension. Lastly, a well-known successful formulation for rectangular-target pointing is Accot and Zhai’s weighted Euclidean model using a free parameter 𝑐:

\[ ID = \log_2\left(\sqrt{\left(\frac{D}{W}\right)^2 + c \cdot \left(\frac{D}{H}\right)^2} + 1\right). \qquad (8) \]

[Figure 2 panels: (a) a rectangular target annotated with distance (𝐷), width (𝑊), and height (𝐻); (b) click positions on a target with SDx and SDy.]

Figure 2: (a) Parameters of a rectangular-target pointing and (b) computation of SDx and SDy. The "x" marks indicate click positions.
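The candidate formulations (Equations 2 and 5–8) can be written as small Python functions, as in the following sketch. The function names, the regression constants passed as arguments, and the value 𝑐 = 0.2 in the usage line are hypothetical; in practice 𝑎, 𝑏, and 𝑐 are estimated by regression.

```python
import math

def id_mackenzie(d, w):
    """Equation 2: log2(D/W + 1)."""
    return math.log2(d / w + 1)

def mt_crossman(d, w, h, a, b, c):
    """Equation 5 (plus-one form): separate width and height terms."""
    return a + b * math.log2(d / w + 1) + c * math.log2(d / h + 1)

def mt_kvalseth(d, w, h, a, b, c):
    """Equation 6: the height term is log2(1/H), without a "+1"."""
    return a + b * math.log2(d / w + 1) + c * math.log2(1 / h)

def id_min(d, w, h):
    """Equation 7: only the smaller target dimension matters."""
    return math.log2(d / min(w, h) + 1)

def id_accot_zhai(d, w, h, c):
    """Equation 8: weighted Euclidean combination of D/W and D/H."""
    return math.log2(math.sqrt((d / w) ** 2 + c * (d / h) ** 2) + 1)

# One nominal condition (pixels); c = 0.2 is an arbitrary illustrative weight.
d, w, h = 380, 40, 20
print(id_mackenzie(d, w), id_min(d, w, h), id_accot_zhai(d, w, h, c=0.2))
```

Passing effective sizes instead of nominal ones (𝑊𝑒 for `w`, 𝐻𝑒 for `h`) leaves these functions unchanged, which is what makes the target-size adjustment easy to apply to existing formulations.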

2.3 Effective Width and Height for Rectangular-target Pointing

Our idea is to apply 𝑊𝑒 and 𝐻𝑒 to 2D forms of Fitts’ law (Equations 5–8). 𝐻𝑒 is defined in the same way as 𝑊𝑒, i.e., 𝐻𝑒 = 4.133 · SDy, where SDy is the 𝜎 of endpoints perpendicular to the task axis (Figure 2b). This requires that the endpoints on the y-axis are normally distributed over the target and that the endpoints on the x and y axes are uncorrelated, which has been empirically found to be the case [6,27,58].

Using 𝐻𝑒 was proposed by Murata [ ]. He utilized square targets (𝑊 = 𝐻) with the approach angle 𝜃 directed towards the upper-right, but measured SDx and SDy on the screen. He defined the target size as min(𝑊𝑒, 𝐻𝑒) by using the IDmin model. Another approach to replacing 𝑊 is to use the bivariate standard deviation SDxy as 𝜎 in Equation 3 [ ]. We will compare these approaches with our method, which independently applies 𝑊𝑒 and 𝐻𝑒.
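The difference between the approaches can be sketched in Python as follows. The independent method scales SDx and SDy separately, whereas the bivariate approach feeds a single SDxy into Equation 3. The SDxy formula used here (root of the summed squared distances from the centroid, divided by n − 1) is an assumption of this sketch, as the text does not spell it out; the function names and sample points are hypothetical.

```python
import math
import statistics

K = math.sqrt(2 * math.pi * math.e)  # approximately 4.133 (Equation 3)

def effective_sizes(points):
    """Independent method: W_e = K * SDx and H_e = K * SDy."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return K * statistics.stdev(xs), K * statistics.stdev(ys)

def sd_xy(points):
    """Bivariate SD around the centroid (assumed definition, n - 1 denominator)."""
    mx = statistics.fmean(x for x, _ in points)
    my = statistics.fmean(y for _, y in points)
    ss = sum((x - mx) ** 2 + (y - my) ** 2 for x, y in points)
    return math.sqrt(ss / (len(points) - 1))

# Hypothetical click positions (pixels), task axis along x.
pts = [(-6.0, 1.0), (-2.0, -2.0), (0.0, 0.5), (1.0, 3.0), (3.0, -1.5), (4.0, -1.0)]
w_e, h_e = effective_sizes(pts)
w_e_bivariate = K * sd_xy(pts)      # single size used in place of W
size_min = min(w_e, h_e)            # target size for the ID_min variant
print(w_e, h_e, w_e_bivariate, size_min)
```

Under this definition, SDxy² equals SDx² + SDy², so the bivariate size is never smaller than either 𝑊𝑒 or 𝐻𝑒.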

Jagacinski and Monk made an assumption that endpoints follow a bivariate normal distribution [ ]. However, they used only circular targets and assumed that SDx was always equal to SDy. Sheikh and Hoffmann confirmed that MT can be modeled by (1) Fitts’ law with 𝑊 and (2) Fitts’ law that replaces 𝑊 with 𝐻𝑒 [ ]. They tested the fitness for these models separately and did not use 𝑊𝑒; thus, the fitness of a model integrating 𝑊𝑒 and 𝐻𝑒 is unknown.

2.4 Normalization Effect of the Effective Width Method

Zhai et al. gave participants three instructions, namely, Bias = Accurate, Neutral, and Fast, for emphasizing accuracy, balancing speed and accuracy, and emphasizing speed, respectively [ ]. When they analyzed the three biases’ data in a mixed manner, Equation 2 using 𝑊 showed 𝑅² = 0.696 and using 𝑊𝑒 showed 𝑅² = 0.825, which demonstrates the normalization capability of the effective width method.

The TPs for different speed-accuracy instructions will be close to each other. MacKenzie and Isokoski used three biases (Accurate, Neutral, and Fast), and the TPs were 5.70, 5.73, and 5.67 bits/s, respectively (about 1% differences) [ ]. This result shows that using 𝑊𝑒 normalizes the speed-accuracy biases, which enables us to compare the accuracy-normalized performances of user groups having different biases.

However, if we analyze the model fitness for a single speed-accuracy instruction condition, the 𝑅² value obtained using 𝑊𝑒 will be smaller than that obtained using 𝑊. This result has been reported in numerous studies [ ]. We will also check this possible disadvantage in our data analyses.

There are two main approaches for using the effective width method. First, a single instruction (typically “Neutral”) is given to a group of participants. It is inevitable that the participants will have different personal biases, e.g., some will operate a mouse rapidly while others slowly. Using IDe helps to normalize this personal bias by adjusting the error rates to 4%; this is the reason using the effective width method is recommended when measuring the performance of several devices or user groups (i.e., not predicting MTs under new conditions) [ ]. Second, several instructions are given and each participant changes their speed-accuracy balance in an experiment. The effective width method normalizes these intentional biases, yielding invariant TPs between different instruction conditions and yielding a high model fitness when analyzing the data in a mixed manner [ ]. The second approach is investigated in this paper.


3 EXPERIMENT 1: REMOTELY INSTRUCTED

TASK WITH UNIVERSITY STUDENTS

Experiment 1 was designed to be run as a lab-based experiment. Due to COVID-19, we distributed the experimental system to student members of our laboratory and they performed the task in their homes using their own PCs and mice. We developed the experimental system using the Hot Soup Processor programming language.

With lab-based experiments, researchers typically use the same apparatus to evaluate models. If they use different mice, displays, cursor-speed settings, etc., it is difficult to discuss whether a poor model fit comes from (e.g.) inadequate parameters in the model or differences in screen resolutions. It is important to note here that target pointing performance is affected by various settings of the apparatus and PC, such as mouse latency [ ], screen resolution [ ], and cursor speed and acceleration function [ ]. Hence, if it was not possible to draw a clear conclusion about whether using 𝐻𝑒 could normalize the speed-accuracy biases because, for example, the statistical differences between several models were not significant, it would be better to re-run the study in a more controlled lab-based environment.

However, it has been demonstrated that distributed (crowdsourced) user experiments lead to the conclusion that Fitts’ law holds [ ]. For rectangular-target pointing, the best-fit model is not likely to change even when comparing the model fitness repeatedly by random-sampling; i.e., Accot and Zhai’s model has always been the best [ ]. Therefore, it has been well-demonstrated that we can obtain a consistent conclusion on the best-fit model for lab-based and crowdsourced experiments. While we skip conducting a traditional lab-based study, it is worth running remote and crowdsourced experiments to examine our hypothesis. If researchers uncover a negative result when running a lab-based study in the future (e.g., introducing 𝐻𝑒 cannot normalize the speed-accuracy biases), it will give a different type of contribution; e.g., how to control the experimental apparatus could lead to different conclusions.

3.1 Task, Design, and Procedure

The task was to click the red target back and forth. The study was a 3 × 2 × 4 × 5 within-subjects design with the following independent variables and levels: three subjective biases (Bias = Accurate, Neutral, and Fast), two 𝐷s (380 and 640 pixels), four 𝑊s (30, 40, 60, and 90 pixels), and five 𝐻s (20, 30, 40, 60, and 150 pixels). The three Bias conditions were the same as those of previous studies [ ]. Several previous studies have also tested other biases (e.g., extremely accurate/fast [ ]), but in order to avoid an overly high number of task-condition combinations, we decided to examine one accuracy- and one speed-emphasized condition along with a baseline, which was sufficient for our purpose.

One session consisted of 15 cyclic clicks with a fixed 𝐷 × 𝑊 × 𝐻 condition. One block consisted of 40 (2 𝐷 × 4 𝑊 × 5 𝐻) sessions for a fixed Bias condition. The first target was on the left side. When the participant clicked on the target, the colors of the red (target) and white (non-target) rectangles switched. If the participant missed the target, it flashed yellow, and the participant had to aim at it again until they clicked it successfully. We did not give auditory feedback for success or failure. After completing 15 successful clicks, the results of the session (MT and the number of errors) and a message to take a break were displayed. The first three clicks in each session were omitted and we used the remaining 12 clicks (six for each side) in the subsequent analyses. The order of the 40 𝐷 × 𝑊 × 𝐻 conditions was randomized for each block. In total, we recorded 3 Bias × 2 𝐷 × 4 𝑊 × 5 𝐻 × 12 clicks × 18 participants = 25,920 data points.
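The size of the design can be checked with a short enumeration. The following Python sketch builds the full condition grid and counts the analyzed clicks; the level values are taken from the design description and the axis labels of Figures 3 and 4, and the variable names are hypothetical.

```python
import itertools

# Independent variables and their levels (sizes in pixels).
biases = ["Accurate", "Neutral", "Fast"]
distances = [380, 640]
widths = [30, 40, 60, 90]
heights = [20, 30, 40, 60, 150]

# Full factorial grid: one entry per Bias x D x W x H combination.
conditions = list(itertools.product(biases, distances, widths, heights))

clicks_per_session = 12   # 15 clicks minus the first three
participants = 18
total_points = len(conditions) * clicks_per_session * participants

print(len(conditions), total_points)
```

The 120 combinations are also the number of data points used when the three instructions are analyzed in a mixed manner, and 40 of them belong to each Bias condition.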

3.2 Pre-experiment Instructions and Practice

We asked the participants to watch a 2.5-min video in which one of the authors demonstrated the task. At this stage, we told them that there would be three Bias conditions and asked them to perform the tasks differently in terms of speed and accuracy. In addition, to control the cursor configuration, we asked them to set the cursor-speed slider in the Control Panel to default (middle) and turn on the cursor acceleration function (“Enhance pointer precision”), which is the default of the Windows OS. Using a specific configuration on the cursor speed is commonly done in lab-based experiments. However, unlike in lab-based experiments, our participants had different mice and displays. Thus, our decision might negatively affect some participants’ performance, as the combinations of apparatus settings are known to affect target-pointing behavior, and this is a limitation of this study. To solve this issue, a more sophisticated method is needed, e.g., hardware-independent pointing transfer functions [26].

We asked the participants to run an executable file that provided a practice task with only one session for each Bias condition. In this practice, the parameters of (𝐷, 𝑊, 𝐻) = ( 450 ) pixels were fixed to values not used in the data-collection task, as the purpose of the practice was to allow the participants to get used to the three speed/accuracy balances with the set cursor speed. To do so, we set the first Bias condition to Neutral so that the participants could understand the balance between speed and accuracy and then shift it towards faster or slower movements. The order of the subsequent two conditions (Fast and Accurate) was randomized. Then, in the data collection trials, the order of the three Bias conditions was counter-balanced among the 18 participants.

3.3 Participants

We recruited 18 students from our university. All participants used optical mice. Each participant received JPY 5000 (∼USD 48). The main pointing task typically took 30 to 40 min to complete. The participants’ demographics were as follows. Age: ranging from 21 to 24 years, 𝑀 = 22.2 and SD = 0.916. Gender: 10 were male and 8

were female. PC usage history: ranged from 2 to 18 years,

𝑀=

and SD =4.22. All were right-handed and used Windows 10.

4 RESULTS OF EXPERIMENT 1

We removed outlier data for trials in which the movement distance for the first click position was shorter than 𝐷/2 [ ]. We did not use another frequently used criterion that removes trials in which the first click position is more than 2𝑊 away from the target center [ ], because the endpoints for the Fast instruction were expected to be wider than those in previous studies. In addition, we did not remove MT-based outlier trials or participants, as extremely rapid or slow movements were possible depending on Bias. As a result, we removed 35 outlier trials (0.135%). The dependent variables were MT for the first click, ER, SDx, and SDy.

4.1 Normality Test

We tested normality by using the Shapiro-Wilk test (𝛼 = 0.05) before we ran an RM-ANOVA. Although ANOVA is robust against violations of the normality assumption [ ], it is better to log-transform the data for detecting statistical significance more appropriately. Regarding MT, we found that 81 conditions out of 120 (3 Bias × 2 𝐷 × 4 𝑊 × 5 𝐻) passed the normality test, or 67.5%. We then log-transformed the data and obtained 110 conditions (91.7%) that passed the test. After that, we ran the RM-ANOVA with Bonferroni’s 𝑝-value adjustment method for pairwise comparisons.

For the ER data, only seven conditions out of 120 passed the normality test (5.9%). A number of data were 0% and thus we could not log-transform them. Therefore, we used non-parametric ANOVAs with an aligned rank transform [ ] and Tukey’s 𝑝-value adjustment method for pairwise comparisons.

For the SDx data, we found that 90 conditions passed the normality test (75.0%), and 105 conditions (87.5%) of log-transformed data passed the test. For SDy, 103 conditions passed the test (85.8%), and then the log-transformed data from 115 conditions (95.8%) passed the test. We ran RM-ANOVAs for log-transformed SDx and SDy data. Note that the normality test examined whether the 18 participants’ data were normally distributed, and the results were independent of whether the click positions were normally distributed.

4.2 Movement Time

Throughout this paper, for the 𝐹 statistic, the degrees of freedom for the main effects of Bias, 𝐷, 𝑊, and 𝐻, as well as their interactions, were corrected using the Greenhouse-Geisser method when Mauchly’s sphericity assumption was violated (𝛼 = 0.05). Because our focus is on model fitness, we limit our report here to the main effects of the independent variables for simplicity (and more detailed results are included in the supplementary materials).

We found significant main effects of

Bias

(

𝐹2,34 =

53,

𝑝<

001,

𝜂2

𝑝=

84),

𝐷

(

𝐹1,17 =

586

𝑝<

001,

𝜂2

𝑝=

97),

𝑊

(

𝐹2.138,36.35 =

870

𝑝<

001,

𝜂2

𝑝=

98), and

𝐻

(

𝐹4,68 =

01,

𝑝<

001,

𝜂2

𝑝=

79) on MT. Significant interactions (

𝑝<

05)

were found for

Bias×𝐷

Bias×𝑊

𝐷×𝑊

𝑊×𝐻

, and

Bias×𝐷×𝑊

. As expected, MT decreased when the instructions emphasized speed more, when 𝐷 decreased, and when 𝑊 and 𝐻 increased (Figure 3). In particular, these results show that the participants appropriately followed the Bias instructions (Figure 3a).

4.3 Error Rate

We found significant main effects of

Bias

(

𝐹2,34 =

05,

𝑝<

001,

𝜂2

𝑝=

84) and

𝑊

(

𝐹3,51 =

87,

𝑝<

001,

𝜂2

𝑝=

45) on ER, but no significant effect of

𝐷

(

𝐹1,17 =

054,

𝑝=

32,

𝜂2

𝑝=

058) or

𝐻

(

𝐹4,68 =

349,

𝑝=

84,

𝜂2

𝑝=

020). Significant interactions

(

𝑝<

05) were found for

Bias ×𝑊

𝐷×𝑊

𝐷×𝑊×𝐻

, and

Bias ×𝐷×𝑊×𝐻.

ER increased when the instruction emphasized speed more (Figure 4a) and when 𝑊 decreased (c). In comparison, 𝐷 and 𝐻 did not significantly affect ER (Figure 4b and d). The same lack of effect of 𝐷 has been found in previous studies [ ]. In contrast to our results, Accot and Zhai reported that 𝐻 had a significant effect. If we had tested a much smaller 𝐻, such as 8 pixels [ ] or 1 mm [27], this result might have been different.

4.4 Endpoint Variability in SDx and SDy

For

SDx

, we found significant main effects of

Bias

(

𝐹2,34 =

79,

𝑝<

001,

𝜂2

𝑝=

81),

𝑊

(

𝐹3,51 =

891

𝑝<

001,

𝜂2

𝑝=

98),

and

𝐻

(

𝐹4,68 =

792,

𝑝<

01,

𝜂2

𝑝=

18), but no significant effect of

𝐷

(

𝐹1,17 =

799,

𝑝=

11,

𝜂2

𝑝=

14). Significant interactions

(

𝑝<

05) were found for

𝑊×𝐻

and

𝐷×𝑊×𝐻

. For

SDy

, we found

significant main effects of

Bias

(

𝐹2,34 =

85,

𝑝<

001,

𝜂2

𝑝=

61),

𝐷

(

𝐹1,17 =

99,

𝑝<

001,

𝜂2

𝑝=

80),

𝑊

(

𝐹3,51 =

063,

𝑝<

05,

𝜂2

𝑝=

15), and

𝐻

(

𝐹1.718,29.20 =

447

𝑝<

001,

𝜂2

𝑝=

96).

Significant interactions (

𝑝<

05) were found for

𝐷×𝐻

and

𝑊×𝐻

Figure 5 plots the endpoint distributions. We can confirm here that more clicks missed the target when the instructions emphasized speed more. Also, regarding 𝐻𝑒, more clicks were located close to the target center on the y-axis when Bias = Accurate compared with Fast. For example, when 𝐻 = 20 pixels, the SDy for the Accurate condition was 3.414 pixels, while those for the Neutral and Fast conditions were 3.925 and 4.386 pixels, respectively; the SDy increased by 28% at most. This supports our hypothesis that, in addition to SDx, the SDy data change according to the Bias.
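As a worked check of the percentage above, the relative increases in SDy at 𝐻 = 20 pixels, taking the Accurate condition as the baseline, are:

```latex
\[
\frac{SD_y^{\mathrm{Fast}} - SD_y^{\mathrm{Accurate}}}{SD_y^{\mathrm{Accurate}}}
  = \frac{4.386 - 3.414}{3.414} \approx 0.285,
\qquad
\frac{SD_y^{\mathrm{Neutral}} - SD_y^{\mathrm{Accurate}}}{SD_y^{\mathrm{Accurate}}}
  = \frac{3.925 - 3.414}{3.414} \approx 0.150 .
\]
```

The larger of the two ratios corresponds to the reported increase of about 28%. Because 𝐻𝑒 = 4.133 · SDy, the same ratios apply to the effective height.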

Figure 5 also shows that the spread of hits on the y-axis is likely to increase as 𝐻 increases. To validate this, we checked the regression between given target sizes and endpoint variability. Figure 6 shows that SDx and SDy increased with 𝑊 and 𝐻, respectively, with 𝑅² > 0.85. In addition, when the instructions emphasized speed more, the SDx values increased, showing larger intercepts and steeper slopes. However, this relationship did not hold for SDy; e.g., the slope for the Neutral condition was higher than that for Fast. One possible explanation for this is that, when the instruction was Fast, the participants tended to click roughly around the target even when 𝐻 was small, and thus SDy became larger compared with Neutral. This led the y-axis values at low 𝐻 values to be higher for Fast, thus tilting the regression line clockwise and making the slope shallower. Therefore, the slopes of the regression lines are not always higher for the faster Bias instructions. Another tendency was that the fitness for (𝑊, SDx) was greater than that for (𝐻, SDy). This was possibly because we chose an extreme value of 𝐻 = 150 pixels.

4.5 Model Fitness

We discuss the model fitness in a comparative manner. We use an adjusted 𝑅², and in addition, to discuss the model fitness statistically, we calculate AIC [ ]. As a rule of thumb, (a) a lower AIC value indicates a better model and a model with the minimum AIC (AICminimum) is the best; (b) a model with AIC ≤ (AICminimum + 2) is comparable with better models; and (c) a model with AIC ≥ (AICminimum + 10) can be safely rejected [ ]. For simplicity, we


[Figure 3 panels: mean MT in ms. (a) Bias: Accurate 766, Neutral 671, Fast 627. (b) 𝐷 [pixels]: 380: 639, 640: 737. (c) 𝑊 [pixels]: 30: 775, 40: 723, 60: 654, 90: 601. (d) 𝐻 [pixels]: 20: 728, 30: 689, 40: 680, 60: 672, 150: 671.]

Figure 3: Main effects on MT in Experiment 1. Throughout this paper, the error bars show 95% CIs, and the horizontal bars show significant differences (𝑝 < 0.05 at least).

[Figure 4 panels: mean ER in %. (a) Bias: Accurate 1.69, Neutral 4.82, Fast 11.18. (b) 𝐷 [pixels]: 380: 5.75, 640: 6.04. (c) 𝑊 [pixels]: 30: 7.39, 40: 6.43, 60: 5.10, 90: 4.66. (d) 𝐻 [pixels]: 20: 6.09, 30: 5.79, 40: 5.57, 60: 6.10, 150: 5.93.]

Figure 4: Main effects on ER in Experiment 1.

[Figure 5 panels: click point scatter plots; columns 𝐻 = 20, 30, 40, 60, and 150 px; rows Accurate, Neutral, and Fast.]

Figure 5: Click point distributions for the 𝑊 = 40 pixels condition by the 18 participants. The target height increases from left to right. Three Bias conditions are shown at the top (Accurate), middle (Neutral), and bottom (Fast) rows. We aligned the task axis, i.e., click points for the leftward movements are flipped to the right to merge the data when computing SDx and SDxy [ ].

consider an AIC difference greater than 10 to be significant. Table 1 shows the fitness results for the five model candidates².
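The rule of thumb for AIC can be sketched as a small classifier in Python. The function name and the category labels are hypothetical, the threshold of 10 follows the text, and the lower threshold (`comparable_delta`, defaulting to 2) is a conventional choice that should be treated as an assumption of this sketch. The AIC values are the Mixed-condition results for effective target sizes in Table 1.

```python
def classify_by_aic(aic_by_model, comparable_delta=2.0, reject_delta=10.0):
    """Label each model by its AIC difference from the minimum-AIC model.

    comparable_delta is an assumed default; reject_delta follows the text.
    """
    best = min(aic_by_model.values())
    labels = {}
    for name, aic in aic_by_model.items():
        delta = aic - best
        if delta == 0:
            labels[name] = "best"
        elif delta <= comparable_delta:
            labels[name] = "comparable"
        elif delta >= reject_delta:
            labels[name] = "rejected"
        else:
            labels[name] = "intermediate"
    return labels

# Mixed data, effective target sizes (AIC values from Table 1).
mixed_effective = {
    "MacKenzie (SDx)": 1214,
    "MacKenzie (SDxy)": 1292,
    "Crossman": 1195,
    "Kvålseth": 1187,
    "IDmin": 1399,
    "Accot & Zhai": 1179,
}
result = classify_by_aic(mixed_effective)
for model, label in result.items():
    print(f"{model}: {label}")
```

With these values, Kvålseth's model falls between the two thresholds (a difference of 8), which matches the observation that its difference from Accot and Zhai's model is not significant under the simplified criterion.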

When we used the nominal values, Accot and Zhai’s model always had the highest adjusted 𝑅² and lowest AIC values. Thus, Accot and Zhai’s model was the best for all Bias conditions and for Mixed data. When we used the effective target sizes, Accot and Zhai’s model again showed the highest adjusted 𝑅² and lowest AIC values, except for the Fast condition, where MacKenzie’s formulation using SDx gave the best fit. However, as the AIC differences from Crossman’s, Kvålseth’s, and Accot and Zhai’s models were less than 10, we could not actually determine that MacKenzie’s formulation was the best. Accot and Zhai’s model is thus a safe choice for a user experiment with a single instruction. Another insignificant difference was found for the Mixed condition: the difference in AIC between Kvålseth’s (1187) and Accot and Zhai’s models (1179) was less than 10.

² The supplementary material shows more comprehensive results including the free parameter values, non-adjusted 𝑅² values, and regression graphs.

Next, we compare the models using nominal vs. effective target sizes. When we analyzed the single-instruction data, using the nominal values was always significantly better for all three Bias conditions. This is consistent with previous studies on the effective width method [ ]. Therefore, if researchers would like to predict MTs with a single instruction, we recommend using Accot and Zhai’s model with the nominal target sizes. In contrast, for the mixed-instruction data, the effective target sizes gave significantly better model fits, except for the IDmin model. Therefore, if researchers would like to compare several input devices or user


Figure 6: Regression expressions for (a–c) SD_x vs. W and (d–f) SD_y vs. H in Experiment 1 (all axes in pixels).
(a) Accurate, SD_x: y = 0.1685x + 1.614, R² = 0.962
(b) Neutral, SD_x: y = 0.1912x + 2.5673, R² = 0.9348
(c) Fast, SD_x: y = 0.2484x + 2.9625, R² = 0.881
(d) Accurate, SD_y: y = 0.0539x + 2.7879, R² = 0.9004
(e) Neutral, SD_y: y = 0.0643x + 3.1665, R² = 0.916
(f) Fast, SD_y: y = 0.0604x + 3.828, R² = 0.8502

Table 1: Model fitness in Experiment 1. For the three Bias conditions, we regressed 40 data points (2 D × W × H), while for the “Mixed” data analysis, we used 120 data points in total. Only for the effective MacKenzie model, there is a choice as to whether to use SD_x or SD_xy. An asterisk (a blue cell in the original layout) marks the best-fit result for each Bias condition for each {Nominal, Effective} target-size analysis.

| Size | Ref. | Eq. | Accurate adj. R² | Accurate AIC | Neutral adj. R² | Neutral AIC | Fast adj. R² | Fast AIC | Mixed adj. R² | Mixed AIC |
|---|---|---|---|---|---|---|---|---|---|---|
| Nominal | MacKenzie | 2 | 0.8785 | 393.3 | 0.8851 | 386.0 | 0.9097 | 374.1 | 0.6151 | 1344 |
| Nominal | Crossman | 5 | 0.9257 | 374.9 | 0.9284 | 368.4 | 0.9492 | 352.4 | 0.6434 | 1337 |
| Nominal | Kvålseth | 6 | 0.9189 | 378.4 | 0.9195 | 373.1 | 0.9413 | 358.2 | 0.6381 | 1338 |
| Nominal | ID_min | 7 | 0.5556 | 445.2 | 0.5707 | 438.8 | 0.5330 | 439.8 | 0.3857 | 1400 |
| Nominal | Accot & Zhai | 8 | 0.9656* | 344.1* | 0.9748* | 326.7* | 0.9821* | 310.8* | 0.6700* | 1327* |
| Effective | MacKenzie (SD_x) | 2 | 0.8876 | 390.2 | 0.9086 | 376.9 | 0.9353* | 360.8* | 0.8697 | 1214 |
| Effective | MacKenzie (SD_xy) | 2 | 0.7273 | 425.6 | 0.7431 | 418.2 | 0.8256 | 400.4 | 0.7507 | 1292 |
| Effective | Crossman | 5 | 0.9235 | 376.1 | 0.9277 | 368.8 | 0.9211 | 370.0 | 0.8906 | 1195 |
| Effective | Kvålseth | 6 | 0.9209 | 377.4 | 0.9262 | 369.6 | 0.9217 | 369.7 | 0.8974 | 1187 |
| Effective | ID_min | 7 | 0.3518 | 460.3 | 0.3721 | 454.0 | 0.3327 | 454.1 | 0.3943 | 1399 |
| Effective | Accot & Zhai | 8 | 0.9477* | 360.9* | 0.9473* | 356.2* | 0.9242 | 368.4 | 0.9039* | 1179* |

groups, we recommend using the effective target sizes. To check this, we applied the effective target size only for the width (i.e., using W_e and H) to Accot and Zhai’s model and obtained an adjusted R² = 0.8912 and AIC = 1194. Because this AIC exceeds the value obtained with both effective sizes (1179, see Table 1) by more than 10, we found that using both W_e and H_e significantly contributed to the model fitness.

Now, we can visually grasp how the bivariate effective width method improves the fitness of Accot and Zhai’s model (see Figure 7). For the nominal data, the plot points in (a) are clearly shifted on the y-axis depending on the given instructions. Therefore, when we analyzed the Mixed data, the regression line passed between the Accurate and Fast conditions’ plot points. In contrast, the plot points in (b) are less biased by the instructional difference and lie closer to the regression line. This is because the effective width method changes ID in accordance with the actual endpoints. For example, for the nominal data, the ID of Accot and Zhai’s model ranged from 2.40 to 4.63 bits, while the range when using the effective target sizes was 2.21 to 4.77 bits. This feature is important for normalizing the speed-accuracy biases and is consistent with the results of previous studies [5,67].

4.6 Throughput

Figure 8a shows the TPs. In addition, we computed the range of these TP values. We defined the TP difference among the three Bias conditions as $100\% \times (TP_{max} - TP_{min})/TP_{max}$. For example, for MacKenzie’s formulation using the nominal W, the TP difference is $100\% \times (5.459 - 4.468)/5.459 = 18.15\%$. If a certain model “perfectly” normalizes the speed-accuracy biases, the TP difference is 0%, and the TP difference does not reach 100% because TP_min is non-zero; $0\% \le \text{TP difference} < 100\%$. In addition to the model fitness, this TP difference is another intuitive metric to discuss the normalization capability of models. However, note that there is no clear threshold to determine the capability.
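The TP difference defined above can be sketched in a few lines of Python. The function name `tp_difference` is an assumption made for illustration, and the input values are the rounded bar labels from Figure 8a (Accurate, Neutral, Fast), so this is a minimal sketch rather than the authors' analysis script.

```python
def tp_difference(throughputs):
    """Return 100% x (TP_max - TP_min) / TP_max for a list of TPs in bits/s."""
    tp_max = max(throughputs)
    tp_min = min(throughputs)
    if tp_min <= 0:
        raise ValueError("throughputs must be positive")
    return 100.0 * (tp_max - tp_min) / tp_max


# Bar labels read from Figure 8a (Accurate, Neutral, Fast), rounded to 2 decimals.
nominal_mackenzie = [4.47, 5.10, 5.46]
effective_accot_zhai = [4.98, 5.32, 5.26]

print(round(tp_difference(nominal_mackenzie), 1))     # nominal sizes: large gap
print(round(tp_difference(effective_accot_zhai), 1))  # effective sizes: small gap
```

Because the inputs are rounded to two decimals, the results differ slightly from the percentages printed in Figure 8b, which were presumably computed from unrounded throughputs.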

With the effective width method, the TPs for different speed-accuracy biases are close to each other [ ], so a smaller TP difference is preferable. By comparing the nominal and effective


Figure 7: Model fitness using Accot and Zhai’s model with (a) nominal and (b) effective target sizes (x-axis: ID [bits]; y-axis: MT [ms]).
(a) Nominal. Accurate: y = 154.98x + 207.07, R² = 0.9670; Neutral: y = 146.24x + 144.21, R² = 0.9757; Fast: y = 142.57x + 113.27, R² = 0.9823; Mixed: y = 147.93x + 154.85, R² = 0.6747.
(b) Effective. Accurate: y = 171.98x + 108.50, R² = 0.9477; Neutral: y = 166.16x + 77.209, R² = 0.9495; Fast: y = 159.04x + 101.60, R² = 0.9172; Mixed: y = 180.68x + 43.534, R² = 0.9053.

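Figure 8: Throughputs in Experiment 1. (a) TP value and (b) TP difference. Bar labels by model:

| Size | Model | TP Accurate [bits/s] | TP Neutral [bits/s] | TP Fast [bits/s] | TP Mixed [bits/s] | TP difference [%] |
|---|---|---|---|---|---|---|
| Nominal | MacKenzie | 4.47 | 5.10 | 5.46 | 5.01 | 18.15 |
| Nominal | Crossman | 5.21 | 5.95 | 6.37 | 5.84 | 18.17 |
| Nominal | Kvålseth | 3.52 | 4.02 | 4.30 | 3.95 | 18.07 |
| Nominal | ID_min | 5.13 | 5.86 | 6.28 | 5.76 | 18.28 |
| Nominal | Accot & Zhai | 4.68 | 5.35 | 5.73 | 5.25 | 18.19 |
| Effective | MacKenzie (SD_x) | 4.85 | 5.17 | 5.10 | 5.04 | 6.27 |
| Effective | MacKenzie (SD_xy) | 4.14 | 4.44 | 4.42 | 4.33 | 6.77 |
| Effective | Crossman | 5.74 | 6.13 | 6.05 | 5.97 | 6.34 |
| Effective | Kvålseth | 3.74 | 3.83 | 3.58 | 3.72 | 6.53 |
| Effective | ID_min | 6.13 | 6.69 | 6.97 | 6.60 | 12.08 |
| Effective | Accot & Zhai | 4.98 | 5.32 | 5.26 | 5.19 | 6.32 |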

target sizes, we found that the effective values achieved this goal (Figure 8b). MacKenzie’s formulation using SD_x gave the smallest TP difference, while Accot and Zhai’s model gave the second smallest. For the ID_min model, because the model fitness for the Mixed data was the lowest (see Table 1), its TP difference remained notably large even with the effective width method, as shown in Figure 8b. In summary, the effective width method appropriately lowered the performance differences between the three Bias conditions, which demonstrates its normalization capability against speed-accuracy biases.

5 EXPERIMENT 2: CROWDSOURCING USER STUDY

To further validate our bivariate effective width method, we replicated our experiment with another participant group. Because the method is for capturing the central tendency of user performances rather than that of a single person, recruiting plenty of participants will be helpful for observing the full capability of the method. Thus, our next experiment was run via crowdsourcing. We offered the task at Yahoo! Crowdsourcing (https://crowdsourcing.yahoo.co.jp). Almost all the task designs and procedures were the same as in the remote study. The points of difference are described below.

5.1 Participants and Recruitment

We recruited workers who used Windows (Vista or a later version). No other qualifications or special skills were required. We used the “White List” option in the crowdsourcing platform for screening newly created accounts to omit multiple entries by the same persons. This option enabled us to offer the task only to workers who were considered reliable on the basis of their previous task history.

To reduce noise introduced by multiple pointing devices in the crowdsourcing data, we asked the workers to use a mouse if they had one, as a mouse is the most commonly available device other than a touchpad for non-laptop-PC users. Nevertheless, to avoid a possible false report in which all workers might answer that they used a mouse, we explicitly explained that any device was acceptable, and then removed the non-mouse users from the analysis. To increase the ecological validity, the workers were not instructed to change the cursor speed or acceleration-function setting. This decision also helped to omit the time to re-learn a new speed configuration.

After the workers finished all sessions and completed the questionnaire, they uploaded the log data file to a server to receive payment. Each worker received JPY 100 (≈ USD 0.96). It typically took 10 min to complete the task, so the effective hourly payment was approximately JPY 600 (≈ USD 5.8).

In total, 207 mouse users completed the task. Their demographics

were as follows. Age: ranging from 20 to 72 years,

𝑀=

5, and

SD =

21. Gender: 166 were male, 39 were female, and 2 preferred

not to answer. Handedness: 14 were left-handed, and 193 were

right-handed. Windows version: 1 used Vista, 21 used Win7, 5 used

Win8, 5 used Win8.1, and 175 used Win10. PC usage history: ranged

from 1 to 40 years, 𝑀=21.9, and SD =7.00.

5.2 Task and Procedure

There were several points of difference from Experiment 1. To shorten the entire task time, (1) there were no practice sessions, and (2) D was fixed to 640 pixels because testing the other independent variables (Bias, W, and H) had higher priority. Previous studies on rectangular-target pointing also used a single D value [ ]. The numbers of W and H values were reduced: W = 30, 50, and 90 pixels, and H = 20, 40, 70, and 150 pixels. Each session consisted of 19 clicks rather than 15 to increase the reliability of the endpoint distributions. The first five clicks in each session were omitted, and thus, 14 clicks were used for data analyses. Text instructions were given instead of video instructions.

There were three Bias conditions, the same as in the remote study. A block consisted of 12 sessions (3 W × 4 H) with a fixed Bias condition, so each worker completed 36 sessions in total. The order of the 12 W × H conditions was randomized for each block. In total, we recorded 3 Bias × 1 D × 3 W × 4 H × 14 repetitions × 207 workers = 104,328 clicks. As in the remote study, the instructions page informed the participants that there would be three Bias conditions and asked them to perform differently in terms of speed and accuracy.

6 RESULTS OF EXPERIMENT 2

6.1 Screening Outlier Data and Normality Test

We removed outlier data if the distance of the click position was shorter than D/2. There were 149 outliers (0.143%). As a check, we tried to detect workers who had exhibited extremely short or long MTs. The inter-quartile range method [ ], a robust and frequently used method, was utilized for this. It flagged two workers who showed mean MTs of 1358 and 1532 ms across the 36 sessions. These workers seemed to lean towards accuracy more than the other workers, but this did not violate our task instructions and thus their data were not removed.
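The worker-level check can be illustrated with a short Python sketch of the inter-quartile range rule. The 1.5 × IQR fence multiplier, the function name `iqr_outliers`, and all sample values except the two reported means (1358 and 1532 ms) are assumptions for illustration; the text does not state which multiplier was used.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical per-worker mean MTs in ms; only the last two values
# (1358 and 1532 ms) are taken from the text.
mean_mts = [720, 760, 790, 800, 810, 830, 850, 880, 1358, 1532]

flagged = iqr_outliers(mean_mts)
print([mean_mts[i] for i in flagged])
```

Flagging is kept separate from removal here, which mirrors the decision above to inspect the flagged workers and then keep their data.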

Even after we log-transformed the data of MT, SD_x, and SD_y, we found that 0, 4, and 14 of the 36 conditions passed the normality test, respectively (0, 11.1, and 38.9%). Still, we consistently ran RM-ANOVAs, as ANOVA can be used robustly [ ]. For ER, we used non-parametric ANOVAs with an aligned rank transform.

6.2 Movement Time

We found significant main effects of Bias ($F_{1.664, 342.8} = 246$, $p < 0.001$, $\eta_p^2 = 0.55$), W ($F_{1.611, 331.8} = 2870$, $p < 0.001$, $\eta_p^2 = 0.93$), and H ($F_{2.659, 547.8} = 383$, $p < 0.001$, $\eta_p^2 = 0.65$) on MT. Significant interactions ($p < 0.05$) were found for Bias × H, W × H, and Bias × W × H. The MT decreased when the instructions emphasized speed more and when W and H increased (Figure 9). These results demonstrate that the participants appropriately followed the Bias instructions.

6.3 Error Rate

We found significant main effects of Bias ($F_{2, 412} = 262$, $p < 0.001$, $\eta_p^2 = 0.56$), W ($F_{2, 412} = 1711$, $p < 0.001$, $\eta_p^2 = 0.89$), and H ($F_{3, 618} = 420$, $p < 0.001$, $\eta_p^2 = 0.67$) on ER. Significant interactions ($p < 0.05$) were found for Bias × W, Bias × H, W × H, and Bias × W × H. This is interesting because the significant main effect of H (Figure 10c) was not found in the remote study. The larger sample size probably helped to detect the significance.

6.4 Endpoint Variability in SD_x and SD_y

For SD_x, we found significant main effects of Bias ($F_{1.538, 316.9} = 252$, $p < 0.001$, $\eta_p^2 = 0.55$), W ($F_{1.566, 322.5} = 3711$, $p < 0.001$, $\eta_p^2 = 0.95$), and H ($F_{2.899, 597.161} = 46$, $p < 0.001$, $\eta_p^2 = 0.13$). Significant interactions ($p < 0.05$) were found for Bias × W, Bias × H, and W × H. For SD_y, we found significant main effects of Bias ($F_{2, 412} = 20$, $p < 0.001$, $\eta_p^2 = 0.22$) and H ($F_{1.749, 360.3} = 2303$, $p < 0.001$, $\eta_p^2 = 0.92$), but not for W ($F_{1.865, 384.2} = 1.463$, $p = 0.23$, $\eta_p^2 = 0.007$). Significant interactions ($p < 0.05$) were found for Bias × H and W × H. The regression expressions are as follows:

$$\text{Accurate:}\quad SD_x = 1.939 + 0.1542\,W \ (R^2 = 0.9915), \qquad SD_y = 2.310 + 0.07095\,H \ (R^2 = 0.9768) \tag{9}$$

$$\text{Neutral:}\quad SD_x = 2.373 + 0.1702\,W \ (R^2 = 0.9819), \qquad SD_y = 2.626 + 0.07068\,H \ (R^2 = 0.9717) \tag{10}$$

$$\text{Fast:}\quad SD_x = 3.677 + 0.1936\,W \ (R^2 = 0.9612), \qquad SD_y = 3.138 + 0.07043\,H \ (R^2 = 0.9607) \tag{11}$$

The R² values were greater than those in Experiment 1 (R² ranged from 0.85 to 0.96), probably because there were fewer regression points in Experiment 2. The intercepts and slopes for SD_x monotonically increased when the instruction emphasized speed more. For SD_y, this was true only for the intercepts.
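To make the link between these regressions and the effective target sizes concrete, the following Python sketch predicts SD_x and SD_y from Equations 9 to 11 and converts them to effective sizes. It assumes the conventional effective-width definition W_e = 4.133 × SD_x, applied per axis so that H_e = 4.133 × SD_y; the constant and the name `effective_sizes` are assumptions of this sketch, since the definition is not restated in this section.

```python
# Intercept and slope pairs copied from Equations 9-11 (Experiment 2).
SD_REGRESSIONS = {
    "Accurate": {"x": (1.939, 0.1542), "y": (2.310, 0.07095)},
    "Neutral": {"x": (2.373, 0.1702), "y": (2.626, 0.07068)},
    "Fast": {"x": (3.677, 0.1936), "y": (3.138, 0.07043)},
}

# Assumed conventional multiplier, approximately sqrt(2 * pi * e).
EFFECTIVE_WIDTH_FACTOR = 4.133


def effective_sizes(bias, w, h):
    """Predict (W_e, H_e) in pixels for a Bias condition and nominal W, H."""
    ax, bx = SD_REGRESSIONS[bias]["x"]
    ay, by = SD_REGRESSIONS[bias]["y"]
    sd_x = ax + bx * w  # predicted endpoint SD along the movement axis
    sd_y = ay + by * h  # predicted endpoint SD perpendicular to it
    return EFFECTIVE_WIDTH_FACTOR * sd_x, EFFECTIVE_WIDTH_FACTOR * sd_y


for bias in SD_REGRESSIONS:
    w_e, h_e = effective_sizes(bias, w=30, h=20)
    print(f"{bias}: W_e = {w_e:.2f} px, H_e = {h_e:.2f} px")
```

For the smallest target (W = 30, H = 20 pixels), the predicted W_e grows from Accurate to Fast, which matches the monotonic increase in the SD_x intercepts and slopes noted above.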

6.5 Model Fitness

Table 2 shows the results of the five models we examined. Regardless of whether the nominal or effective target sizes were used, Accot and Zhai’s model always had the highest adjusted R² and lowest AIC values for both single- and mixed-instruction data. Recall that, in the remote study, Accot and Zhai’s model was not always the best (see Table 1). If we apply the effective target size only for the width (i.e., using W_e and H) to Accot and Zhai’s model for the Mixed data, we obtain an adjusted R² = 0.9216 and AIC = 346.7 (i.e., no significant difference from using W_e and H_e, where AIC was 339.5). This shows that using both W_e and H_e helped to improve the model fitness, but not as clearly as we observed in the remote study, in which the AIC difference was significant.

Another positive aspect in this crowdsourced experiment was that there were no significant AIC differences between the nominal and effective width methods for the three Bias conditions when Accot and Zhai’s model was used. Previous studies considered the effective width to be inferior to the nominal width for analyzing single-instruction data [ ]. We also found that the adjusted R² values using the nominal width were always higher than those using W_e. Still, the differences are less than 0.02 points, with no significant AIC differences. Thus, even if we analyze a single Bias condition, the prediction accuracy of MTs when using the effective sizes is not significantly lower than when using the nominal sizes.
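The decision rule used in these comparisons (an AIC difference greater than 10 is treated as significant) can be written as a small helper. This is a minimal sketch; the function name `compare_aic` and its return convention are assumptions for illustration, and the AIC values are copied from this section and Table 2.

```python
def compare_aic(aic_a, aic_b, threshold=10.0):
    """Return 'a' or 'b' for the significantly better model, else None.

    A lower AIC indicates a better fit; the gap must exceed `threshold`
    to count as significant.
    """
    if abs(aic_a - aic_b) <= threshold:
        return None
    return "a" if aic_a < aic_b else "b"


# Experiment 2, Mixed data, Accot and Zhai's model:
# (a) W_e and H vs. (b) W_e and H_e.
print(compare_aic(346.7, 339.5))

# Experiment 2, Accurate condition, Accot and Zhai's model:
# (a) nominal vs. (b) effective sizes.
print(compare_aic(93.17, 102.4))

# Experiment 2, Mixed data, Accot and Zhai's model:
# (a) nominal vs. (b) effective sizes.
print(compare_aic(393.6, 339.5))
```

The first two calls return `None` because the gaps (7.2 and 9.23) are below the threshold, while the third returns `'b'`, reflecting the clear advantage of effective sizes for mixed-instruction data.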

6.6 Throughput

Figure 11a shows the TPs, and Figure 11b shows the ranges of these TP values: $100\% \times (TP_{max} - TP_{min})/TP_{max}$. By comparing the nominal and effective target sizes, we can see that the effective values normalized the TP differences more. Kvålseth’s model shows the strongest normalization capability, followed by MacKenzie’s (SD_x), Crossman’s, and Accot and Zhai’s models. As in the remote study, the crowdsourced study empirically showed that the bivariate effective width method lowered the TP differences between the three Bias conditions if we chose the appropriate model formulations. This again demonstrates the normalization capability against speed-accuracy biases.


Figure 9: Main effects on MT in Experiment 2. Mean MT [ms] by (a) Bias: Accurate 869, Neutral 798, Fast 744; (b) W [pixels]: 30 → 899, 50 → 799, 90 → 712; (c) H [pixels]: 20 → 856, 40 → 797, 70 → 781, 150 → 781.

Figure 10: Main effects on ER in Experiment 2. Mean ER [%] by (a) Bias: Accurate 2.12, Neutral 3.56, Fast 8.18; (b) W [pixels]: 30 → 6.21, 50 → 4.32, 90 → 3.34; (c) H [pixels]: 20 → 5.27, 40 → 4.34, 70 → 4.36, 150 → 4.51.

Table 2: Model fitness in Experiment 2. For the three Bias conditions, we regressed the 12 data points (1 D × 3 W × 4 H), while for the “Mixed” data analysis, we used those 36 data points in total. Only for the effective MacKenzie model, there is a choice of using SD_x or SD_xy. An asterisk (a blue cell in the original layout) marks the best-fit result for each Bias condition for each {Nominal, Effective} target-size analysis.

| Size | Ref. | Eq. | Accurate adj. R² | Accurate AIC | Neutral adj. R² | Neutral AIC | Fast adj. R² | Fast AIC | Mixed adj. R² | Mixed AIC |
|---|---|---|---|---|---|---|---|---|---|---|
| Nominal | MacKenzie | 2 | 0.7742 | 127.7 | 0.8070 | 123.4 | 0.8318 | 119.7 | 0.5848 | 404.2 |
| Nominal | Crossman | 5 | 0.9123 | 119.0 | 0.9116 | 116.6 | 0.9420 | 109.5 | 0.6673 | 398.7 |
| Nominal | Kvålseth | 6 | 0.9084 | 119.5 | 0.9081 | 117.1 | 0.9403 | 109.9 | 0.6651 | 399.0 |
| Nominal | ID_min | 7 | 0.6153 | 134.1 | 0.5479 | 133.6 | 0.5509 | 131.5 | 0.4287 | 415.7 |
| Nominal | Accot & Zhai | 8 | 0.9898* | 93.17* | 0.9819* | 97.64* | 0.9795* | 97.04* | 0.7115* | 393.6* |
| Effective | MacKenzie (SD_x) | 2 | 0.8163 | 125.2 | 0.8752 | 118.2 | 0.9064 | 112.6 | 0.8463 | 368.4 |
| Effective | MacKenzie (SD_xy) | 2 | 0.3198 | 140.9 | 0.4171 | 136.7 | 0.5473 | 131.6 | 0.4640 | 413.4 |
| Effective | Crossman | 5 | 0.9157 | 118.5 | 0.9241 | 114.8 | 0.9372 | 110.5 | 0.9002 | 355.4 |
| Effective | Kvålseth | 6 | 0.9147 | 118.6 | 0.9235 | 114.9 | 0.9369 | 110.5 | 0.8997 | 355.6 |
| Effective | ID_min | 7 | 0.3130 | 141.1 | 0.2219 | 140.1 | 0.1752 | 138.8 | 0.3064 | 422.6 |
| Effective | Accot & Zhai | 8 | 0.9780* | 102.4* | 0.9626* | 106.3* | 0.9678* | 102.4* | 0.9359* | 339.5* |

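Figure 11: Throughputs in Experiment 2. (a) TP value and (b) TP difference. Bar labels by model:

| Size | Model | TP Accurate [bits/s] | TP Neutral [bits/s] | TP Fast [bits/s] | TP Mixed [bits/s] | TP difference [%] |
|---|---|---|---|---|---|---|
| Nominal | MacKenzie | 4.31 | 4.70 | 5.03 | 4.68 | 14.33 |
| Nominal | Crossman | 5.25 | 5.72 | 6.13 | 5.70 | 14.34 |
| Nominal | Kvålseth | 2.99 | 3.26 | 3.49 | 3.25 | 14.35 |
| Nominal | ID_min | 4.85 | 5.29 | 5.67 | 5.27 | 14.39 |
| Nominal | Accot & Zhai | 4.53 | 4.94 | 5.29 | 4.92 | 14.36 |
| Effective | MacKenzie (SD_x) | 4.74 | 4.96 | 4.99 | 4.90 | 4.95 |
| Effective | MacKenzie (SD_xy) | 3.87 | 4.14 | 4.24 | 4.09 | 8.66 |
| Effective | Crossman | 5.64 | 5.92 | 5.95 | 5.84 | 5.23 |
| Effective | Kvålseth | 3.70 | 3.81 | 3.69 | 3.73 | 3.20 |
| Effective | ID_min | 5.68 | 6.05 | 6.30 | 6.01 | 9.89 |
| Effective | Accot & Zhai | 4.85 | 5.10 | 5.13 | 5.03 | 5.32 |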

7 GENERAL DISCUSSION

7.1 Capability of Normalizing Speed-accuracy Tradeoffs and Choice of Models

Overall, the results of the remote and crowdsourced experiments indicate that using W_e and H_e appropriately normalized the subjective speed-accuracy biases. This capability was validated by the fact that (1) the regression expression for the data in a mixed manner showed better fits in terms of adjusted R² and AIC compared with using nominal values and (2) the throughput differences between the three Bias conditions were smaller when using W_e and H_e. On the basis of the model fitness results, we recommend using Accot and Zhai’s model (Equation 8). This model was not always optimal for normalizing the TP values (see Figures 8 and 11), but the reliability of the TP data is established on the basis of the fitness of Fitts’ law. As Accot and Zhai’s model showed the best fit when analyzing the mixed-instruction data, it makes sense that this model appropriately normalized the biases.
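As an illustration of how the recommended model could be fitted, the sketch below uses the weighted Euclidean form commonly attributed to Accot and Zhai, MT = a + b · log2(sqrt((D/W)² + η(D/H)²) + 1). The parameter name `eta`, the grid search over it, and the synthetic noise-free data are assumptions of this sketch, not the authors' fitting procedure; passing W_e and H_e in place of W and H gives the effective-size variant.

```python
import math


def accot_zhai_id(d, w, h, eta):
    """Index of difficulty (bits) of the weighted Euclidean model."""
    return math.log2(math.sqrt((d / w) ** 2 + eta * (d / h) ** 2) + 1.0)


def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, rss)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, rss


def fit_accot_zhai(conditions, mts, eta_grid=None):
    """Grid-search eta, fitting a and b by least squares at each step."""
    if eta_grid is None:
        eta_grid = [i / 100 for i in range(101)]  # eta in [0, 1]
    best = None
    for eta in eta_grid:
        ids = [accot_zhai_id(d, w, h, eta) for d, w, h in conditions]
        a, b, rss = linear_fit(ids, mts)
        if best is None or rss < best[3]:
            best = (a, b, eta, rss)
    return best


# Experiment 2 design (D = 640; 3 W x 4 H) with synthetic, noise-free MTs
# generated from hypothetical parameters a = 100 ms, b = 170 ms/bit, eta = 0.2.
conditions = [(640, w, h) for w in (30, 50, 90) for h in (20, 40, 70, 150)]
mts = [100 + 170 * accot_zhai_id(d, w, h, 0.2) for d, w, h in conditions]

a, b, eta, rss = fit_accot_zhai(conditions, mts)
print(f"a = {a:.1f} ms, b = {b:.1f} ms/bit, eta = {eta:.2f}")
```

A grid search keeps the example dependency-free; with real data, a nonlinear least-squares routine would be a more usual choice for estimating all three free parameters at once.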

Previously, using W_e was recommended for comparing different devices or user groups, but if researchers who use rectangular targets were to apply W_e to the baseline (MacKenzie) formulation, they would not observe the high prediction accuracy possible with more appropriate formulations. We demonstrated the first evidence that applying W_e and H_e independently to proper models (Crossman, Kvålseth, or Accot and Zhai) achieved significant improvements in model fitness for the data in a mixed manner. Without such models, researchers have had to use infinite-height or circular targets, which are simplified artificial shapes. This means that they had no appropriate metric to compare different users or devices with realistic rectangular targets, which has clearly been a limitation in the HCI field.

7.2 Implications

Our bivariate effective width method has several possible applications in addition to point-and-click tasks with mice, which will enable the comparison of devices and techniques in more realistic GUI targets that appear in actual situations. Because Fitts’ law holds for drag-and-drop operations [ ], we can compare the performances when participants select texts on a web browser or select multiple cells in a spreadsheet by dragging. In these cases, the font size and kerning would affect the text-selection performance (they act as target sizes), and the cell sizes of the spreadsheet would affect the selection time. It is known that endpoint variability perpendicular to the movement direction differs depending on input devices [ ], and therefore, it is necessary to normalize the speed-accuracy biases for a fair comparison of baseline and novel techniques for drag-and-drop operations, e.g., [ ]. Still, the purpose of the bivariate effective width method is to allow researchers to conduct such comparison studies; the method itself does not directly judge if a given GUI design is good or not, such as whether the cell size is sufficient for rapid and accurate selection.

Another potential implication is for eye-gaze movements, which also follow Fitts’ law [ ] and the effective width method [ ] (note that there is a debate on the applicability of Fitts’ law to gaze-based pointing [ ]). Murata et al. compared different input methods, such as gaze vs. mouse, and reported the time and accuracy separately (gaze was fast but inaccurate compared with mouse) [ ]. Now, researchers can use a unified metric, TP, and can determine a better input technique after normalizing the accuracy. In the future, it would be worth examining the applicability of the bivariate effective width method to touch, gaze, and drag-and-drop operations.

7.3 Limitations and Future Work

Our conclusions are limited by experimental design considerations such as the ranges of D, W, and H used in the two experiments. Moreover, while we followed the conventional methodology of Fitts’ law in that we used only necessary targets, in realistic situations there are additional buttons or icons that users do not want to select (called distractors [ ]), which would have an effect on the users’ pointing performance.

An untested target parameter was the approach angle θ of the cursor towards the target, which is known to affect Fitts’ law performance and model fitness [ ]. We tested only the simplest approach angles of θ = 0° and 180°, where θ = 0° is defined as rightward, whereas prior literature has examined some modifications to models, e.g., Zhang et al.’s model [ ]. Ko et al. demonstrated a way to simplify θ: when θ ranges from 0° to 45°, the x-length of the target is defined as W, while the y-length is defined as W when 45° < θ ≤ 90° [ ]. More recently, Ma et al. proposed using the projected target sizes [ ]. Our future work will include experiments on such models.

While we used the data from all sessions and blocks, we checked if there were progress effects (learning, fatigue, etc.) on the MT results to validate our main claim. In Experiment 1, an RM-ANOVA showed no significant main effects of Block or Session, and no interaction effect between them (all p > 0.7). In contrast, we found a main effect of Block in Experiment 2 (p < 0.05). Pairwise tests with Bonferroni’s p-value adjustment showed that the MT was significantly longer (p < 0.05) for the first block than for the third one: 815 vs. 789 ms (95% CIs were 22 and 19 ms), respectively. The crowd workers seemed to get used to the task and exhibited shorter times in the final block, but the 95% CI error bars overlap each other and thus we consider it unfruitful to discuss this small learning effect. We suspect that the large sample size (207 workers) helped with finding this significant main effect of Block. Still, we could not remove any block’s data, because the three blocks correspond to the three Bias conditions, which is a limitation of this study.

There are several issues relating to remote and crowdsourced user studies, such as inconsistent display sizes and mouse models. Thus, factors affecting the performance of Fitts’ law tasks, such as the mouse-to-cursor latencies [ ], were not controlled. In addition, it is known that crowd workers tend to give the minimum effort in order to finish a task in a short time (called “satisficing” [ ]). Thus, we were concerned about the possibility that some workers may not have, for example, operated their mice carefully even when the instruction was Accurate. Because the core interest of the present study is the subjective biases, it is important that the participants followed the instructions.

To discover possible issues of lack of compliance with instructions, we tried to analyze how differently the participants exhibited MT and ER depending on each Bias condition. However, even if, for example, a participant had exhibited a mean ER of 5% for the Neutral condition and 6% for Accurate, we thought that we should not regard this as a violation of the instructions. This was because error clicks would occur by chance and ER could be affected by the order of the three Bias conditions. The focus of this study was not individual data; rather, we confirmed that the Bias had significant main effects on MT and ER with large effect sizes in both experiments. This demonstrates that, overall, the participants followed the instructions and changed their behavior accordingly. Our future work, of course, will include checking if the findings of this study also hold in lab-based controlled experiments in which the participants are more motivated to follow subjective instructions so that we can strengthen our conclusion that the bivariate effective width method normalizes speed-accuracy biases.

8 CONCLUSION

In this work, we explored the utility of the effective width method when it was applied to the target height in Fitts’ law tasks. The results of remotely conducted and crowdsourced experiments showed that Accot and Zhai’s weighted Euclidean model [1] using W_e and H_e almost always exhibited the best fit for the data mixing the three Bias conditions. Integrating W_e and H_e with bivariate Fitts’ law models normalizes the speed-accuracy biases and thus enables researchers to compare different task conditions. We also confirmed that using the nominal sizes showed (sometimes significantly) better fitness when analyzing the data from a single-instruction condition, which is consistent with previous studies [ ]. Our recommendations are summarized as follows.

• Use Accot and Zhai’s model with W and H when researchers would like to predict MTs under new task conditions with a single input device and single instruction (“point to targets as rapidly and accurately as possible”, i.e., Neutral).

• Use Accot and Zhai’s model with W_e and H_e when researchers would like to compare two or more devices (e.g., mouse vs. touchpad vs. joystick), interaction techniques (e.g., a proposed method vs. baseline point-and-click), or user groups (children vs. young adults vs. older adults) with a single instruction (typically Neutral).

Rectangular objects are perhaps the most frequently arranged targets on desktops and mobile screens. Considering this, while user experiments with infinite-height or circular targets are frequently used, they would be too artificial to measure realistic user performances. It has been claimed that rectangular targets are needed for a better understanding of user behaviors in pointing tasks [ ]. Providing an appropriate metric for rectangular-target pointing that enables data obtained under different conditions to be compared is a useful methodological contribution to the HCI field.

ACKNOWLEDGMENTS

We thank the reviewers of CHI 2022 and International Journal of

Human-Computer Studies for their valuable feedback.

REFERENCES

[1]

Johnny Accot and Shumin Zhai. 2003. Rening Fitts’ Law Models for Bivariate

Pointing. In Proceedings of the SIGCHI Conference on Human Factors in Computing

Systems (Ft. Lauderdale, Florida, USA) (CHI ’03). ACM, New York, NY, USA,

193–200. https://doi.org/10.1145/642611.642646

[2]

Hirotugu Akaike. 1974. A new look at the statistical model identication. IEEE

Trans. Automat. Control 19, 6 (1974), 716–723. https://doi.org/10.1109/TAC.1974.

1100705

[3]

Caroline Appert, Olivier Chapuis, and Michel Beaudouin-Lafon. 2008. Evaluation

of Pointing Performance on Screen Edges. In Proceedings of the Working Confer-

ence on Advanced Visual Interfaces (Napoli, Italy) (AVI ’08). ACM, New York, NY,

USA, 119–126. https://doi.org/10.1145/1385569.1385590

[4]

Anil Ufuk Batmaz and WolfgangStuerzlinger. 2021. The Eect of Pitch in Auditory

Error Feedback for Fitts’ Tasks in Virtual Reality Training Systems. In 2021 IEEE

Virtual Reality and 3D User Interfaces (VR). IEEE, Washington, DC, USA, 85–94.

https://doi.org/10.1109/VR50410.2021.00029

[5]

Xiaojun Bi, Yang Li, and Shumin Zhai. 2013. FFitts Law: Modeling Finger Touch

with Fitts’ Law. In Proceedings of the SIGCHI Conference on Human Factors in

Computing Systems (Paris, France) (CHI ’13). ACM, New York, NY, USA, 1363–

1372. https://doi.org/10.1145/2470654.2466180

[6]

Xiaojun Bi and Shumin Zhai. 2013. Bayesian Touch: A Statistical Criterion

of Target Selection with Finger Touch. In Proceedings of the 26th Annual ACM

Symposium on User Interface Software and Technology (St. Andrews, Scotland,

United Kingdom) (UIST ’13). Association for Computing Machinery, New York,

NY, USA, 51–60. https://doi.org/10.1145/2501988.2502058

[7]

Renaud Blanch and Michael Ortega. 2011. Benchmarking Pointing Techniques

with Distractors: Adding a Density Factor to Fitts’ Pointing Paradigm. In Pro-

ceedings of the SIGCHI Conference on Human Factors in Computing Systems

(Vancouver, BC, Canada) (CHI ’11). ACM, New York, NY, USA, 1629–1638.

https://doi.org/10.1145/1978942.1979180

[8]

Michael Bohan, Mitchell Longsta, Arend Van Gemmert, Miya Rand, and George

Stelmach. 2003. Eects of target height and width on 2D pointing movement

duration and kinematics. Motor control 7 (08 2003), 278–289. Issue 3. https:

//doi.org/10.1123/mcj.7.3.278

[9]

Kenneth P Burnham and David R Anderson. 2003. Model selection and multimodel

inference: a practical information-theoretic approach. Springer Science & Business

Media, Heidelberg, Germany.

[10]

Géry Casiez, Stéphane Conversy, Matthieu Falce, Stéphane Huot, and Nicolas

Roussel. 2015. Looking Through the Eye of the Mouse: A Simple Method for

Measuring End-to-end Latency Using an Optical Mouse. In Proceedings of the 28th

Annual ACM Symposium on User Interface Software & Technology (Charlotte,

NC, USA) (UIST ’15). ACM, New York, NY, USA, 629–636. https://doi.org/10.

1145/2807442.2807454

[11]

Géry Casiez and Nicolas Roussel. 2011. No More Bricolage!: Methods and Tools to

Characterize, Replicate and Compare Pointing Transfer Functions. In Proceedings

of the 24th Annual ACM Symposium on User Interface Software and Technology

(Santa Barbara, California, USA) (UIST ’11). ACM, New York, NY, USA, 603–614.

https://doi.org/10.1145/2047196.2047276

[12]

Olivier Chapuis and Pierre Dragicevic. 2011. Eects of Motor Scale, Visual Scale,

and Quantization on Small Target Acquisition Diculty. ACM Trans. Comput.-

Hum. Interact. 18, 3, Article 13 (Aug. 2011), 32 pages. https://doi.org/10.1145/

1993060.1993063

[13]

Edward R.F.W. Crossman. 1956. The measurement of perceptual load in manual

operations. Ph.D. Dissertation. University of Birmingham.

[14]

Jay L. Devore. 2011. Probability and Statistics for Engineering and the Sciences

(8th ed.). Brooks/Cole, Stamford, CT, USA. ISBN-13: 978-0-538-73352-6.

[15]

Peter Dixon. 2008. Models of accuracy in repeated-measures designs. Journal of

Memory and Language 59, 4 (2008), 447–456.

[16]

Sarah A. Douglas, Arthur E. Kirkpatrick, and I. Scott MacKenzie. 1999. Testing

Pointing Device Performance and User Assessment with the ISO 9241, Part 9

Standard. In Proceedings of the SIGCHI Conference on Human Factors in Computing

Systems (Pittsburgh, Pennsylvania, USA) (CHI ’99). Association for Computing

Machinery, New York, NY, USA, 215–222. https://doi.org/10.1145/302979.303042

[17]

Leah Findlater, Joan Zhang, Jon E. Froehlich, and Karyn Moffatt. 2017. Differences

in Crowdsourced vs. Lab-based Mobile and Desktop Input Performance Data. In

Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems

(Denver, Colorado, USA) (CHI ’17). ACM, New York, NY, USA, 6813–6824. https:

//doi.org/10.1145/3025453.3025820

[18]

Paul M. Fitts. 1954. The information capacity of the human motor system in

controlling the amplitude of movement. Journal of Experimental Psychology 47, 6

(1954), 381–391. https://doi.org/10.1037/h0055392

[19]

P. M. Fitts and B. K. Radford. 1966. Information capacity of discrete motor

responses under different cognitive sets. Journal of Experimental Psychology 71, 4

(1966), 475–482.

[20]

Douglas J. Gillan, Kritina Holden, Susan Adam, Marianne Rudisill, and Laura

Magee. 1990. How Does Fitts’ Law Fit Pointing and Dragging?. In Proceed-

ings of the SIGCHI Conference on Human Factors in Computing Systems (Seat-

tle, Washington, USA) (CHI ’90). ACM, New York, NY, USA, 227–234. https:

//doi.org/10.1145/97243.97278

[21]

Julien Gori. 2018. Modeling the speed-accuracy tradeoff using the tools of information

theory. Ph.D. Thesis. Université Paris-Saclay. https://pastel.archives-

ouvertes.fr/tel-02005752

[22]

Julien Gori, Olivier Rioul, and Yves Guiard. 2018. Speed-Accuracy Tradeoff:

A Formal Information-Theoretic Transmission Scheme (FITTS). ACM Trans.

Comput.-Hum. Interact. 25, 5, Article 27 (Sept. 2018), 33 pages. https://doi.org/

10.1145/3231595

[23]

Julien Gori, Olivier Rioul, Yves Guiard, and Michel Beaudouin-Lafon. 2018. The

Perils of Confounding Factors: How Fitts’ Law Experiments Can Lead to False

Conclusions. In Proceedings of the 2018 CHI Conference on Human Factors in

Computing Systems (Montreal QC, Canada) (CHI ’18). ACM, New York, NY, USA,

Article 196, 10 pages. https://doi.org/10.1145/3173574.3173770

[24]

Sandy J. J. Gould, Anna L. Cox, and Duncan P. Brumby. 2016. Diminished Control

in Crowdsourcing: An Investigation of Crowdworker Multitasking Behavior.

ACM Trans. Comput.-Hum. Interact. 23, 3, Article 19 (June 2016), 29 pages. https:

//doi.org/10.1145/2928269

[25]

Yves Guiard, Halla B. Olafsdottir, and Simon T. Perrault. 2011. Fitts’ Law as an

Explicit Time/Error Trade-Off. In Proceedings of the SIGCHI Conference on Human

Factors in Computing Systems (Vancouver, BC, Canada) (CHI ’11). Association for

Computing Machinery, New York, NY, USA, 1619–1628. https://doi.org/10.1145/

1978942.1979179

[26]

Raiza Hanada, Damien Masson, Géry Casiez, Mathieu Nancel, and Sylvain

Malacria. 2021. Relevance and Applicability of Hardware-Independent Point-

ing Transfer Functions. In Proceedings of the 34th Annual ACM Symposium on

User Interface Software and Technology (Virtual Event, USA) (UIST ’21). As-

sociation for Computing Machinery, New York, NY, USA, 524–537. https:

//doi.org/10.1145/3472749.3474767

[27]

Errol R. Hoffmann and Ilyas H. Sheikh. 1994. Effect of varying target height in

a Fitts’ movement task. Ergonomics 37, 6 (1994), 1071–1088. https://doi.org/10.

1080/00140139408963719

[28]

Richard J. Jagacinski and Donald L. Monk. 1985. Fitts’ Law in Two Dimensions

with Hand and Head Movements. Journal of Motor Behavior 17, 1

(1985), 77–95. https://doi.org/10.1080/00222895.1985.10735338

[29]

Yu-Jung Ko, Hang Zhao, Yoonsang Kim, IV Ramakrishnan, Shumin Zhai, and

Xiaojun Bi. 2020. Modeling Two Dimensional Touch Pointing. In Proceedings

of the 33rd Annual ACM Symposium on User Interface Software and Technology

(Virtual Event, USA) (UIST ’20). Association for Computing Machinery, New York,

NY, USA, 858–868. https://doi.org/10.1145/3379337.3415871

[30]

Steven Komarov, Katharina Reinecke, and Krzysztof Z. Gajos. 2013. Crowdsourc-

ing Performance Evaluations of User Interfaces. In Proceedings of the SIGCHI

Conference on Human Factors in Computing Systems (Paris, France) (CHI ’13).

ACM, New York, NY, USA, 207–216. https://doi.org/10.1145/2470654.2470684

[31]

Tarald O. Kvålseth. 1977. A Generalized Model of Temporal Motor Control

Subject to Movement Constraints. Ergonomics 20, 1 (1977), 41–50. https://doi.

org/10.1080/00140137708931599

[32]

Yan Ma, Shumin Zhai, IV Ramakrishnan, and Xiaojun Bi. 2021. Modeling Touch

Point Distribution with Rotational Dual Gaussian Model. In Proceedings of the

34th Annual ACM Symposium on User Interface Software and Technology (Virtual

Event, USA) (UIST ’21). Association for Computing Machinery, New York, NY,

USA, 858–868. https://doi.org/10.1145/3472749.3474816

[33]

I. Scott MacKenzie. 1991. Fitts’ law as a performance model in human-computer

interaction. Ph.D. Dissertation. University of Toronto.

[34]

I. Scott MacKenzie. 1992. Fitts’ law as a research and design tool in human-

computer interaction. Human-Computer Interaction 7, 1 (1992), 91–139. https:

//doi.org/10.1207/s15327051hci0701_3

[35]

I. Scott MacKenzie. 2018. Fitts’ Law. John Wiley & Sons, Ltd, Hoboken,

NJ, USA, Chapter 17, 347–370. https://doi.org/10.1002/9781118976005.ch17

[36]

I. Scott MacKenzie and William Buxton. 1992. Extending Fitts’ Law to Two-

dimensional Tasks. In Proceedings of the SIGCHI Conference on Human Factors in

Computing Systems (Monterey, California, USA) (CHI ’92). ACM, New York, NY,

USA, 219–226. https://doi.org/10.1145/142750.142794

[37]

I. Scott MacKenzie and William Buxton. 1994. Prediction of pointing and dragging

times in graphical user interfaces. Interacting with Computers 6, 2 (06 1994), 213–

227. https://doi.org/10.1016/0953-5438(94)90025-6

[38]

I. Scott MacKenzie and Poika Isokoski. 2008. Fitts’ Throughput and the Speed-

Accuracy Tradeo. In Proceedings of the SIGCHI Conference on Human Factors

in Computing Systems (Florence, Italy) (CHI ’08). ACM, New York, NY, USA,

1633–1636. https://doi.org/10.1145/1357054.1357308

[39]

I. Scott MacKenzie and Shaidah Jusoh. 2001. An Evaluation of Two Input Devices

for Remote Pointing. In Proceedings of the 8th IFIP International Conference on

Engineering for Human-Computer Interaction (EHCI ’01). Springer-Verlag, Berlin,

Heidelberg, 235–250.

[40]

I. Scott MacKenzie, Tatu Kauppinen, and Miika Silfverberg. 2001. Accuracy

Measures for Evaluating Computer Pointing Devices. In Proceedings of the SIGCHI

Conference on Human Factors in Computing Systems (Seattle, Washington, USA)

(CHI ’01). ACM, New York, NY, USA, 9–16. https://doi.org/10.1145/365024.365028

[41]

I. Scott MacKenzie and Colin Ware. 1993. Lag As a Determinant of Human

Performance in Interactive Systems. In Proceedings of the INTERACT ’93 and

CHI ’93 Conference on Human Factors in Computing Systems (Amsterdam, The

Netherlands) (CHI ’93). ACM, New York, NY, USA, 488–493. https://doi.org/10.

1145/169059.169431

[42]

Michael R. Maniaci and Ronald D. Rogge. 2014. Caring about carelessness: Participant

inattention and its effects on research. Journal of Research in Personality

48 (2014), 61–83. https://doi.org/10.1016/j.jrp.2013.09.008

[43]

María José Blanca Mena, Rafael Alarcón, Jaume Arnau Gras, Roser Bono Cabré,

and Rebecca Bendayan. 2017. Non-normal data: Is ANOVA still a valid option?

Psicothema 29, 4 (2017), 552–557.

[44]

Darius Miniotas, Oleg Špakov, and I. Scott MacKenzie. 2004. Eye Gaze Interaction

with Expanding Targets. In CHI ’04 Extended Abstracts on Human Factors in

Computing Systems (Vienna, Austria) (CHI EA ’04). Association for Computing

Machinery, New York, NY, USA, 1255–1258. https://doi.org/10.1145/985921.

986037

[45]

Motoki Miura and Kenji Saisho. 2014. A Text Selection Technique Using Word

Snapping. Procedia Computer Science 35 (2014), 1644–1651. https://doi.org/10.

1016/j.procs.2014.08.257 Knowledge-Based and Intelligent Information & Engi-

neering Systems 18th Annual Conference, KES-2014 Gdynia, Poland, September

2014 Proceedings.

[46]

Atsuo Murata. 1999. Extending Effective Target Width in Fitts’ Law to a Two-

Dimensional Pointing Task. International Journal of Human-Computer Interaction

11, 2 (1999), 137–152. https://doi.org/10.1207/S153275901102_4

[47]

Atsuo Murata, Toshihisa Doi, Kazushi Kageyama, and Waldemar Karwowski.

2021. Development of an Eye-Gaze Input System With High Speed and Accuracy

through Target Prediction Based on Homing Eye Movements. IEEE Access 9

(2021), 22688–22697. https://doi.org/10.1109/ACCESS.2021.3055514

[48]

Halla B. Olafsdottir, Yves Guiard, Olivier Rioul, and Simon T. Perrault. 2012. A

New Test of Throughput Invariance in Fitts’ Law: Role of the Intercept and of

Jensen’s Inequality. In Proceedings of the 26th Annual BCS Interaction Specialist

Group Conference on People and Computers (Birmingham, United Kingdom) (BCS-

HCI ’12). BCS Learning & Development Ltd., Swindon, GBR, 119–126.

[49]

Xiangshi Ren and Xiaolei Zhou. 2011. An investigation of the usability of the sty-

lus pen for various age groups on personal digital assistants. Behaviour & Informa-

tion Technology 30, 6 (2011), 709–726. https://doi.org/10.1080/01449290903205437

[50]

Olivier Rioul and Yves Guiard. 2012. Power vs. logarithmic model of Fitts’ law:

A mathematical analysis. Mathematical Social Sciences 2012 (12 2012), 85–96.

https://doi.org/10.4000/msh.12317

[51]

Immo Schuetz, T. Scott Murdison, Kevin J. MacKenzie, and Marina Zannoli.

2019. An Explanation of Fitts’ Law-like Performance in Gaze-Based Selec-

tion Tasks Using a Psychophysics Approach. In Proceedings of the SIGCHI

Conference on Human Factors in Computing Systems (Glasgow, Scotland, UK)

(CHI ’19). Association for Computing Machinery, New York, NY, USA, 1–13.

https://doi.org/10.1145/3290605.3300765

[52]

Michail Schwab, Sicheng Hao, Olga Vitek, James Tompkin, Jeff Huang, and

Michelle A. Borkin. 2019. Evaluating Pan and Zoom Timelines and Sliders. In

Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems

(Glasgow, Scotland, UK) (CHI ’19). Association for Computing Machinery, New

York, NY, USA, 1–12. https://doi.org/10.1145/3290605.3300786

[53]

Ilyas H. Sheikh and Errol R. Hoffmann. 1994. Effect of target shape on movement

time in a Fitts task. Ergonomics 37, 9 (1994), 1533–1547. https://doi.org/10.1080/

00140139408964932

[54]

R. William Soukoreff and I. Scott MacKenzie. 2004. Towards a standard for

pointing device evaluation, perspectives on 27 years of Fitts’ law research in

HCI. International Journal of Human-Computer Studies 61, 6 (2004), 751–789.

https://doi.org/10.1016/j.ijhcs.2004.09.001

[55]

Veikko Surakka, Marko Illi, and Poika Isokoski. 2004. Gazing and Frowning as a

New Human–Computer Interaction Technique. ACM Trans. Appl. Percept. 1, 1

(July 2004), 40–56. https://doi.org/10.1145/1008722.1008726

[56]

Roel Vertegaal. 2008. A Fitts Law Comparison of Eye Tracking and Man-

ual Input in the Selection of Visual Targets. In Proceedings of the 10th Inter-

national Conference on Multimodal Interfaces (Chania, Crete, Greece) (ICMI

’08). Association for Computing Machinery, New York, NY, USA, 241–248.

https://doi.org/10.1145/1452392.1452443

[57]

Daniel Vogel and Patrick Baudisch. 2007. Shift: A Technique for Operating Pen-

Based Interfaces Using Touch. In Proceedings of the SIGCHI Conference on Human

Factors in Computing Systems (San Jose, California, USA) (CHI ’07). Association

for Computing Machinery, New York, NY, USA, 657–666. https://doi.org/10.

1145/1240624.1240727

[58]

Feng Wang and Xiangshi Ren. 2009. Empirical Evaluation for Finger Input

Properties in Multi-touch Interaction. In Proceedings of the SIGCHI Conference on

Human Factors in Computing Systems (Boston, MA, USA) (CHI ’09). ACM, New

York, NY, USA, 1063–1072. https://doi.org/10.1145/1518701.1518864

[59]

Alan Travis Welford. 1968. Fundamentals of skill. London: Methuen, North

Yorkshire, UK.

[60]

Jacob O. Wobbrock, Leah Findlater, Darren Gergle, and James J. Higgins. 2011.

The Aligned Rank Transform for Nonparametric Factorial Analyses Using Only

Anova Procedures. In Proceedings of the SIGCHI Conference on Human Factors

in Computing Systems (Vancouver, BC, Canada) (CHI ’11). ACM, New York, NY,

USA, 143–146. https://doi.org/10.1145/1978942.1978963

[61]

Jacob O. Wobbrock, Kristen Shinohara, and Alex Jansen. 2011. The Effects of Task

Dimensionality, Endpoint Deviation, Throughput Calculation, and Experiment

Design on Pointing Measures and Models. In Proceedings of the SIGCHI Conference

on Human Factors in Computing Systems (Vancouver, BC, Canada) (CHI ’11). ACM,

New York, NY, USA, 1639–1648. https://doi.org/10.1145/1978942.1979181

[62]

Charles E. Wright and Francis Lee. 2013. Issues Related to HCI Application of

Fitts’s Law. Human-Computer Interaction 28, 6 (2013), 548–578. https://doi.org/

10.1080/07370024.2013.803873

[63]

Shota Yamanaka. 2018. Effect of Gaps with Penal Distractors Imposing Time

Penalty in Touch-pointing Tasks. In Proceedings of the 20th International Confer-

ence on Human-Computer Interaction with Mobile Devices and Services (Barcelona,

Spain) (MobileHCI ’18). ACM, New York, NY, USA, 8 pages. https://doi.org/10.

1145/3229434.3229435

[64]

Shota Yamanaka. 2021. Comparing Performance Models for Bivariate Pointing

through a Crowdsourced Experiment. In Human-Computer Interaction – INTER-

ACT 2021. Springer International Publishing, Gewerbestr, Switzerland, 76–92.

https://doi.org/10.1007/978-3-030-85616-8_6

[65]

Shota Yamanaka and Hiroki Usuba. 2020. Rethinking the Dual Gaussian Dis-

tribution Model for Predicting Touch Accuracy in On-Screen-Start Pointing

Tasks. Proc. ACM Hum.-Comput. Interact. 4, ISS, Article 205 (Nov. 2020), 20 pages.

https://doi.org/10.1145/3427333

[66]

Huahai Yang and Xianggang Xu. 2010. Bias Towards Regular Configuration in 2D

Pointing. In Proceedings of the SIGCHI Conference on Human Factors in Computing

Systems (Atlanta, Georgia, USA) (CHI ’10). ACM, New York, NY, USA, 1391–1400.

https://doi.org/10.1145/1753326.1753536

[67]

Shumin Zhai, Jing Kong, and Xiangshi Ren. 2004. Speed-accuracy tradeoff in

Fitts’ law tasks: on the equivalency of actual and nominal pointing precision.

International Journal of Human-Computer Studies 61, 6 (2004), 823–856. https:

//doi.org/10.1016/j.ijhcs.2004.09.007

[68]

Xinyong Zhang, Hongbin Zha, and Wenxin Feng. 2012. Extending Fitts’ Law to

Account for the Effects of Movement Direction on 2D Pointing. In Proceedings of

the SIGCHI Conference on Human Factors in Computing Systems (Austin, Texas,

USA) (CHI ’12). ACM, New York, NY, USA, 3185–3194. https://doi.org/10.1145/

2207676.2208737
