
The psychometric function: The lapse rate revisited

June 2012
Journal of Vision 12(6)


DOI:10.1167/12.6.25

Source
PubMed

Authors:

Nicolaas Prins

University of Mississippi

In their influential paper, Wichmann and Hill (2001) have shown that the threshold and slope estimates of a psychometric function may be severely biased when it is assumed that the lapse rate equals zero but lapses do, in fact, occur. Based on a large number of simulated experiments, Wichmann and Hill claim that threshold and slope estimates are essentially unbiased when one allows the lapse rate to vary within a rectangular prior during the fitting procedure. Here, I replicate Wichmann and Hill's finding that significant bias in parameter estimates results when one assumes that the lapse rate equals zero but lapses do occur, but fail to replicate their finding that freeing the lapse rate eliminates this bias. Instead, I show that significant and systematic bias remains in both threshold and slope estimates even when one frees the lapse rate according to Wichmann and Hill's suggestion. I explain the mechanisms behind the bias and propose an alternative strategy to incorporate the lapse rate into psychometric function models, which does result in essentially unbiased parameter estimates.


Department of Psychology, University of Mississippi, University, MS, USA


Keywords: psychophysical methods, psychometric function, lapse rate, maximum-likelihood

Citation: Prins, N. (2012). The psychometric function: The lapse rate revisited. Journal of Vision, 12(6):25, 1–16, http://www.journalofvision.org/content/12/6/25, doi: 10.1167/12.6.25.

Introduction

The psychometric function (PF) relates some behavioral measure (e.g., proportion correct on a detection task) to some quantitative characteristic of a sensory stimulus (e.g., luminance contrast). In the following I will refer to the latter simply as stimulus intensity, though this may not be an appropriate term in many circumstances (e.g., the variable may be spatial or temporal frequency, orientation offset, etc.). A generic formulation of the psychometric function is given by:

$$\psi(x; \alpha, \beta, \gamma, \lambda) = \gamma + (1 - \gamma - \lambda)\, F(x; \alpha, \beta) \tag{1a}$$

(e.g., Wichmann & Hill, 2001; Kingdom & Prins, 2010). Though discredited, the classic high-threshold detection model (e.g., Swets, 1961) provides for an intuitively appealing interpretation of the parameters of Equation 1a. Under the high-threshold model, $F(x; \alpha, \beta)$ describes the probability of detection by an underlying sensory mechanism as a function of stimulus intensity $x$, $\gamma$ corresponds to the guess rate (the probability of a correct response when the stimulus is not detected by the underlying sensory mechanism), and $\lambda$ corresponds to the lapse rate (the probability of an incorrect response, which is independent of stimulus intensity). Several forms of $F(x; \alpha, \beta)$ are in common use, such as the Logistic function, the Weibull function, and the cumulative normal distribution. In this paper, the Weibull function is used exclusively and is given by:

$$F(x; \alpha, \beta) = 1 - \exp\left(-\left(\frac{x}{\alpha}\right)^{\beta}\right) \tag{1b}$$

The parameter $\alpha$ of $F(x; \alpha, \beta)$ determines the function's location and is commonly referred to as the function's ‘threshold.’ The parameter $\beta$ determines the rate of change of performance as a function of stimulus intensity $x$ and is commonly referred to as the ‘slope.’
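As a minimal sketch of Equations 1a and 1b, the two functions can be written directly in Python. The function and argument names (`weibull`, `psi`, `lam`) are illustrative choices made here and are not taken from the paper or its accompanying Matlab code.

```python
import math

def weibull(x, alpha, beta):
    """Weibull F(x; alpha, beta) = 1 - exp(-(x / alpha) ** beta), Equation 1b."""
    return 1.0 - math.exp(-((x / alpha) ** beta))

def psi(x, alpha, beta, gamma, lam):
    """Probability of a correct response, Equation 1a.

    gamma is the guess rate (lower asymptote) and lam the lapse rate,
    so the upper asymptote equals 1 - lam.
    """
    return gamma + (1.0 - gamma - lam) * weibull(x, alpha, beta)

# Illustrative evaluation with threshold 10, slope 3, guess rate 0.5, lapse rate 0.03.
for x in (5.0, 10.0, 20.0):
    print(x, psi(x, alpha=10.0, beta=3.0, gamma=0.5, lam=0.03))
```

Note that the lapse rate lowers only the upper asymptote: at very low intensities the function approaches the guess rate regardless of the lapse rate.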

Even though the high-threshold model has been discredited (e.g., Swets, 1961), Equation 1 is also consistent with the assumptions of signal-detection theory, as proved formally by García-Pérez and Alcalá-Quintana (2007). Either way, few theorists would argue with the notion that whereas the threshold and slope parameters characterize the sensory mechanism that underlies performance, the remaining two parameters do not. Rather, the guess rate characterizes the decision process and the lapse rate characterizes such things as observer vigilance and response error. While the guess rate can generally be assumed to have a value determined by the experimental procedure (e.g., in an m-AFC task the guess rate can be assumed to equal 1/m), the same cannot be said of the lapse rate. As it does not describe the sensory mechanism, researchers generally are not interested in the value of the lapse rate per se.

Journal of Vision (2012) 12(6):25, 1–16. http://www.journalofvision.org/content/12/6/25. doi: 10.1167/12.6.25. ISSN 1534-7362. © 2012 ARVO. Received December 6, 2011; published June 19, 2012.

While the effect that lapses have on parameter estimates has been noted for some time (for example, Manny & Klein addressed the issue as early as 1985), systematic investigations of this effect (e.g., Treutwein & Strasburger, 1999; Wichmann & Hill, 2001) are relatively recent. Wichmann and Hill (2001) advocate allowing the value of the lapse rate to vary alongside the values of the threshold and slope of the PF. They do so based on the results of a large number of simulated experiments. Briefly, Wichmann and Hill produced simulated datasets that were generated by a Weibull function with known parameter values. The generating values for $\alpha$ and $\beta$ were equal to 10 and 3, respectively. The guess rate $\gamma$ was 0.5. The generating lapse rate $\lambda$ was systematically varied from 0 to 0.05 in steps of 0.01. The method of constant stimuli (MOCS) utilizing seven different stimulus placement regimens was used. The seven stimulus placement regimens are shown in Figure 1 (s1 through s7) relative to the generating form of F. The total number of simulated trials (N) in each simulated experiment was evenly distributed among the six stimulus intensities in each of the placement regimens. Each simulated dataset was then fitted with the psychometric function in Equation 1 using a maximum-likelihood criterion. The threshold and slope parameters were free to vary during the fitting process. The lapse rate parameter was either held constant at a fixed value or was allowed to vary within the interval [0, 0.06]. This prior was placed on the lapse rate parameter to reflect beliefs regarding likely values of the lapse rate parameter. Unless the prior is applied, nonsensical negative estimates of the lapse rate might result, as well as unrealistically high estimates of the lapse rate.
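The data-generating step of such a simulated MOCS experiment can be sketched as follows. This is only an illustration under stated assumptions: the six intensities are placed at equally spaced values of F between 0.3 and 0.7, which is a hypothetical placement and not the exact s1 regimen, and the function names are chosen here for readability.

```python
import math
import random

def psi(x, alpha, beta, gamma, lam):
    """Equation 1a with the Weibull function of Equation 1b."""
    return gamma + (1.0 - gamma - lam) * (1.0 - math.exp(-((x / alpha) ** beta)))

def weibull_inverse(p, alpha, beta):
    """Stimulus intensity at which the Weibull F evaluates to p."""
    return alpha * (-math.log(1.0 - p)) ** (1.0 / beta)

def simulate_mocs(intensities, n_total, alpha, beta, gamma, lam, rng):
    """Simulate one MOCS experiment; returns (x, n_correct, n_trials) rows."""
    n_per = n_total // len(intensities)  # trials are split evenly across intensities
    rows = []
    for x in intensities:
        p = psi(x, alpha, beta, gamma, lam)
        n_correct = sum(1 for _ in range(n_per) if rng.random() < p)
        rows.append((x, n_correct, n_per))
    return rows

# Hypothetical placement: six intensities at equally spaced values of F.
f_levels = [0.3, 0.38, 0.46, 0.54, 0.62, 0.7]
intensities = [weibull_inverse(p, 10.0, 3.0) for p in f_levels]
data = simulate_mocs(intensities, 960, 10.0, 3.0, 0.5, 0.03, random.Random(2012))
```

Passing an explicit `random.Random` instance keeps each simulated experiment reproducible, which is convenient when many thousands of datasets are generated.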

Wichmann and Hill (2001) report the threshold and slope not in terms of $\alpha$ and $\beta$, but rather in terms of $F^{-1}_{0.5}$ [that is, the stimulus intensity at which function F (Equation 1b) evaluates to 0.5] and $F'_{0.5}$ (that is, the gradient or first derivative of F evaluated at $F^{-1}_{0.5}$). The true, generating, values of these quantities are 8.85 and 0.118, respectively. Figure 2 was taken from Wichmann and Hill (2001). It shows the median threshold and median slope estimates, each derived based on 2,000 simulated experiments. The light symbols show the estimates when the lapse rate was fixed at zero; the darker symbols show the estimates when the lapse rate was allowed to vary. The different shapes of the symbols in the figure indicate stimulus placement regimen and correspond to the symbols used in Figure 1. The true (generating) values of the threshold (in terms of $F^{-1}_{0.5}$) and slope (in terms of $F'_{0.5}$) are indicated by the horizontal lines.
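The two reporting quantities follow from the Weibull function in closed form: the inverse is $\alpha(-\ln(1 - p))^{1/\beta}$ and the derivative is $(\beta/\alpha)(x/\alpha)^{\beta - 1}\exp(-(x/\alpha)^{\beta})$. The short sketch below, with function names chosen here for illustration, evaluates both at the generating values. The threshold comes out at 8.85; the analytic slope is approximately 0.1175, which agrees with the reported 0.118 to within rounding.

```python
import math

def weibull_inverse(p, alpha, beta):
    """Stimulus intensity at which the Weibull F evaluates to p."""
    return alpha * (-math.log(1.0 - p)) ** (1.0 / beta)

def weibull_derivative(x, alpha, beta):
    """First derivative of the Weibull F with respect to x."""
    return (beta / alpha) * (x / alpha) ** (beta - 1.0) * math.exp(-((x / alpha) ** beta))

# Generating values: threshold parameter 10, slope parameter 3.
threshold_05 = weibull_inverse(0.5, 10.0, 3.0)
slope_05 = weibull_derivative(threshold_05, 10.0, 3.0)
print(threshold_05, slope_05)
```

The same two functions reproduce the other intensities quoted in the text, for example `weibull_inverse(0.1, 10.0, 3.0)` for the lower end of the psi-method's threshold range.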

As is clear from Figure 2, both the threshold and slope estimates are significantly biased when the lapse rate is assumed to equal 0 but the generating lapse rate in fact differs from 0. The severity of the bias is (predictably) mainly a function of whether the stimulus placement regimen included stimulus placements at high intensities (see Wichmann & Hill, 2001, for a detailed argument on why this is so). However, when the lapse rate was allowed to vary, threshold and slope estimates were, in their terminology, “essentially unbiased” (Wichmann & Hill, 2001, p. 1298).

Attempted replication of Wichmann and Hill

García-Pérez and Alcalá-Quintana (2005) have noted that when, during the fitting procedure, the lapse rate estimate is allowed to vary but is constrained to have a value within a narrow range, many lapse rate estimates will be equal to one of the limits of this range. Whether and to which degree this will be the case depends mainly on the number of observations and the stimulus placement regimen (particularly whether the regimen includes placements at high stimulus intensities). Even when high stimulus intensities are included in the regimen, many lapse rate estimates may be equal to one of the limits of this range. Often the lapse rate estimate distribution is bimodal with peaks at both of the limits of the prior. Because I was interested in the distribution of the lapse rate estimates in Wichmann and Hill's (2001) simulations (Wichmann & Hill do not provide these), I attempted to replicate their results shown in Figure 2.

Figure 1. The seven different stimulus placement regimens (s1 through s7) used by Wichmann and Hill (2001). Also shown is the stimulus placement resulting from using the adaptive psi-method (Ψ) averaged across 10,000 simulated runs of 960 trials each (see Method section for more details). In the latter, the area of the symbols in the Figure is proportional to the number of trials presented at the corresponding stimulus intensity. The curve shown is the Weibull with $\alpha = 10$, $\beta = 3$ (i.e., the generating form of F).

Figure 2. Results reported by Wichmann and Hill (2001). Median threshold and slope estimates across 2,000 simulations are shown for seven stimulus placement regimens (represented by the different symbol shapes) as a function of the generating lapse rate ($\lambda_{gen}$). All placement regimens contained six stimulus intensities with the total number of trials (N) evenly distributed among the different stimulus intensities. Light symbols correspond to fits in which the lapse rate was fixed at a value of zero, dark symbols correspond to fits in which the lapse rate was allowed to vary within a rectangular prior. (Reproduced from Figure 3 in: Wichmann and Hill [2001], with kind permission from Springer Science + Business Media B.V.).

Method

Simulations

I repeated Wichmann and Hill's (2001) simulations that are shown in (this paper's) Figure 2, following their procedure closely (some details, such as the exact values for the stimulus placements, were obtained from Hill, 2001). In order to avoid crowding of figures, results will be presented only for placement regimens s1, s6, and s7. These three placement regimens differ with respect to their placement of stimuli in characteristic ways that will prove to affect the behavior of parameter estimates systematically. Regimen s1 places stimuli exclusively near threshold level; regimen s6 is similar to s1 in that respect but also includes a single intensity at a performance level near asymptote. Regimen s7 has intensities around threshold level and one intensity at a very high performance level, but also includes intensities at intermediate levels. Matlab code that can be used to perform all simulations and parameter estimations presented in this paper, as well as those for the other placement regimens used by Wichmann and Hill, is available here: www.palamedestoolbox.org/jovcode.html.

I also performed simulations in which stimulus placement was guided by the adaptive psi-method (Kontsevich & Tyler, 1999). The psi-method selects stimulus intensities on each trial such as to reduce uncertainty in the threshold as well as slope parameter estimates. Briefly, following each trial the psi-method derives a posterior probability distribution across (discrete) values for the threshold and slope parameters based on all previous trials and a user-provided prior distribution. The stimulus intensity to be used on the next trial is then selected such that the expected entropy in the posterior distribution is minimized.

Figure 3. For each simulation, a brute-force search was first performed through a search grid containing 150 possible values for the threshold parameter, 150 possible values for the slope parameter and, in case the lapse rate was free to vary, 13 possible values for the lapse rate parameter. The full 3-D search grid thus contained 292,500 (150 × 150 × 13) PFs. In fits in which the lapse rate was fixed, the search grid was restricted to the (150 × 150) plane that corresponded to the fixed lapse rate. The best-fitting PF in the grid subsequently served as the seed for the iterative Nelder-Mead search. (A) Equal-likelihood contours for the example dataset shown in (C) across the threshold and slope values contained within the search grid. The color code here and in (B) indicates the value of the lapse rate of the PF with the highest likelihood. (B) Likelihood across threshold and slope for the region outlined by the square in (A). This region is centered on the PF in the search grid that has the highest likelihood. The grain of the grid as shown in the Figure corresponds to that of the search grid. (C) The generating curve with lapse rate equal to zero (black curve), some hypothetical data under placement regimen s1, the best-fitting PF contained in the search grid (blue curve, largely obscured by red curve), and the best-fitting PF resulting from the Nelder-Mead iterative search (red curve). Note that while the fitted curve corresponds closely (at least for the stimulus intensities included in s1) to the generating curve in terms of the probability of a correct response ($\psi$ in Equation 1, a function of $\alpha$, $\beta$, $\gamma$, and $\lambda$), it does not in terms of F (inset; a function only of $\alpha$ and $\beta$; see Equation 1). The code which accompanies this paper will produce a Figure such as this for any simulation performed in this paper as well as any of Wichmann and Hill's conditions not reported here.
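The brute-force stage of the fitting procedure summarized in the Figure 3 caption can be sketched as a grid search over the binomial log-likelihood. This is a simplified illustration rather than the Palamedes implementation: the grid here is much coarser than the 150 × 150 × 13 grid described in the caption, the Nelder-Mead refinement is omitted, and the dataset and all function names are hypothetical.

```python
import math

def psi(x, alpha, beta, gamma, lam):
    """Equation 1a with the Weibull function of Equation 1b."""
    return gamma + (1.0 - gamma - lam) * (1.0 - math.exp(-((x / alpha) ** beta)))

def log_likelihood(data, alpha, beta, gamma, lam):
    """Binomial log-likelihood (constant terms dropped) of (x, n_correct, n_trials) rows."""
    total = 0.0
    for x, k, n in data:
        # Clamp to avoid log(0) when the predicted probability reaches 0 or 1.
        p = min(max(psi(x, alpha, beta, gamma, lam), 1e-12), 1.0 - 1e-12)
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total

def linspace(lo, hi, n):
    """n equally spaced values from lo to hi (inclusive)."""
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]

def grid_search(data, alphas, betas, lams, gamma=0.5):
    """Return (log-likelihood, alpha, beta, lam) of the best PF in the grid."""
    best = None
    for lam in lams:
        for alpha in alphas:
            for beta in betas:
                ll = log_likelihood(data, alpha, beta, gamma, lam)
                if best is None or ll > best[0]:
                    best = (ll, alpha, beta, lam)
    return best

# Hypothetical, nearly noise-free dataset generated with alpha=10, beta=3, lam=0.03.
xs = [6.0, 8.0, 10.0, 12.0, 14.0, 18.0]
data = [(x, round(100000 * psi(x, 10.0, 3.0, 0.5, 0.03)), 100000) for x in xs]
ll, alpha_hat, beta_hat, lam_hat = grid_search(
    data, linspace(8.0, 12.0, 41), linspace(2.0, 4.0, 21), linspace(0.0, 0.06, 13))
```

Restricting the lapse rate grid to the interval from 0 to 0.06 is what implements the rectangular prior in this sketch; fixing the lapse rate corresponds to passing a single-element list for `lams`.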

The range of values of the threshold parameter included in the psi-method's parameter space included 51 possible values spaced logarithmically between $F^{-1}_{0.1}$ (= 4.72) and $F^{-1}_{0.9}$ (= 13.21). The range of values of the slope parameter included in the psi-method's parameter space included 41 possible values spaced logarithmically between $\beta = 1$ ($F'_{0.5} = 0.050$ when $\alpha = 10$) and $\beta = 10$ ($F'_{0.5} = 0.360$ when $\alpha = 10$). A uniform prior across these parameter values was used. The range of possible stimulus intensities the psi-method could select from included 21 values spaced logarithmically between $F^{-1}_{0.1}$ (= 4.72) and $F^{-1}_{0.999}$ (= 19.04). The psi-method assumed a Weibull function, with lapse rate equal to 0.025 and a guess rate equal to 0.5. Note that the choice for assumed lapse rate and its correspondence to the generating value affect directly only the exact stimulus intensities used in the simulations, not the parameter estimates I report here, as these are derived based on a maximum likelihood criterion in a separate procedure. The psi-method was implemented using the Palamedes toolbox (Prins & Kingdom, 2009). In order to provide a general idea as to the placement of stimuli when the psi-method is used, the stimulus placements combined across 10,000 simulations where the generating lapse rate equaled 0.03 and the number of trials was 960 are included in Figure 1.

Figure 4. Attempted replication of the results shown in Figure 2. Also shown are ‘half 68% confidence interval widths’ (see text for details). Observed pattern of results for the simulations in which lapse rate was free to vary differs from that reported by Wichmann and Hill shown in their Figure 3 (reproduced here as Figure 2).
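The selection rule of the psi-method, choosing the intensity that minimizes the expected entropy of the posterior, can be sketched for a single trial as follows. This is a minimal illustration and not the Palamedes implementation. The grid sizes follow the description above, but treating the threshold grid as values of the Weibull location parameter is an assumption of the sketch, and all function names are chosen here.

```python
import math

def psi(x, alpha, beta, gamma, lam):
    """Equation 1a with the Weibull function of Equation 1b."""
    return gamma + (1.0 - gamma - lam) * (1.0 - math.exp(-((x / alpha) ** beta)))

def logspace(lo, hi, n):
    """n values spaced logarithmically from lo to hi (inclusive)."""
    step = (math.log(hi) - math.log(lo)) / (n - 1)
    return [math.exp(math.log(lo) + i * step) for i in range(n)]

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def posterior_after(prior, likelihoods):
    """Bayes update; returns (posterior, marginal probability of the response)."""
    joint = [pr * lk for pr, lk in zip(prior, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint], total

def next_intensity(prior, params, intensities, gamma=0.5, lam=0.025):
    """Return (intensity, expected entropy) with the lowest expected posterior entropy."""
    best_x, best_h = None, float("inf")
    for x in intensities:
        p_correct = [psi(x, a, b, gamma, lam) for a, b in params]
        post_c, p_c = posterior_after(prior, p_correct)
        post_i, p_i = posterior_after(prior, [1.0 - p for p in p_correct])
        expected_h = p_c * entropy(post_c) + p_i * entropy(post_i)
        if expected_h < best_h:
            best_x, best_h = x, expected_h
    return best_x, best_h

# 51 threshold values, 41 slope values, 21 candidate intensities, uniform prior.
params = [(a, b) for a in logspace(4.72, 13.21, 51) for b in logspace(1.0, 10.0, 41)]
intensities = logspace(4.72, 19.04, 21)
prior = [1.0 / len(params)] * len(params)
x_next, h_next = next_intensity(prior, params, intensities)
```

After the observer responds, `posterior_after` applied with the likelihoods for the observed response gives the distribution that serves as the prior on the following trial.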

The interdependencies among observations that are introduced by the nature of the psi-method (as well as any other adaptive method) introduce bias in parameter estimates in addition to any bias that may be introduced by other sources (see Kaernbach, 2001, for a detailed explanation of the mechanism behind this bias). In order to separate the effect of stimulus placement per se on the one hand and that of serial dependencies on the other, all psi-method trial runs were retested in a MOCS context (e.g., Kaernbach, 2001). That is, the exact series of stimulus intensities resulting from each psi-method run was used again to simulate a new set of responses. These sets of data would thus have identical placement to those resulting from the psi-method run but would not contain the serial dependency, which is inherently present in adaptive runs.
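The retest step amounts to drawing fresh, independent responses for a fixed list of intensities. A minimal sketch, in which the intensity sequence is a made-up stand-in for the output of an adaptive run and the function names are illustrative:

```python
import math
import random

def psi(x, alpha, beta, gamma, lam):
    """Equation 1a with the Weibull function of Equation 1b."""
    return gamma + (1.0 - gamma - lam) * (1.0 - math.exp(-((x / alpha) ** beta)))

def retest_as_mocs(run_intensities, alpha, beta, gamma, lam, rng):
    """Simulate new, independent responses to a fixed sequence of intensities."""
    return [(x, rng.random() < psi(x, alpha, beta, gamma, lam)) for x in run_intensities]

# Hypothetical sequence of intensities taken from one adaptive run.
adaptive_run = [9.0, 11.5, 8.2, 10.1, 7.4, 9.6]
retest = retest_as_mocs(adaptive_run, 10.0, 3.0, 0.5, 0.03, random.Random(7))
```

Because the intensities are fixed before any of the new responses are drawn, no response can influence a later placement, which removes the serial dependency while leaving the placement itself unchanged.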

Parameter estimation

Maximum likelihood parameter estimates for each of the simulations were derived by the Palamedes toolbox (Prins & Kingdom, 2009). Details of the fitting procedure are described in Figure 3. Median threshold and slope estimates for Wichmann and Hill's (2001) MOCS placement regimens s1, s6, and s7 are presented in Figure 4 for N = 240 and N = 960. Parameter estimates are shown for fits in which the lapse rate was allowed to vary within the prior, fits in which the lapse rate was fixed at a value of zero, and fits in which the lapse rate was fixed at a value of 0.025. In order to provide a measure of the variance of the parameter estimates, ‘half 68% confidence interval widths’ (‘$WCI_{68}/2$’) are also shown in Figure 4. These are simply half the distance between the 16th and 84th percentiles in the distribution of parameter estimates. Insofar as these distributions are normally distributed, these values are comparable to standard errors of estimate. Figure 5 shows scatterplots of parameter estimates obtained with a free lapse rate for s1, s6, s7, and psi-controlled placement regimens using a generating lapse rate of 0.03 and N = 960. Full distributions of parameter estimates in the form of histograms and scatterplots will be produced by the code that accompanies this paper for any of the simulations performed in this paper as well as any of the conditions in Wichmann & Hill's (2001) Figure 3 (reproduced here as Figure 2).
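The half 68% confidence interval width can be computed from a list of parameter estimates as sketched below. The linear-interpolation percentile rule used here is an assumption of the sketch; the accompanying Matlab code may use a different convention, which would change the result only slightly for large numbers of simulations.

```python
import random

def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def half_ci_width(estimates):
    """Half the distance between the 16th and 84th percentiles."""
    return 0.5 * (percentile(estimates, 84) - percentile(estimates, 16))

# For normally distributed estimates this is close to the standard deviation.
rng = random.Random(0)
normal_estimates = [rng.gauss(0.0, 1.0) for _ in range(20000)]
width = half_ci_width(normal_estimates)
```

Unlike a standard deviation, this measure is insensitive to the few extreme estimates that fits with a free lapse rate occasionally produce.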

In Figure 4, my results for the conditions in which the lapse rate estimate was fixed at a value of zero are virtually identical to those of Wichmann and Hill (2001). With the lapse rate fixed at 0.025, I obtain results similar to those obtained by Wichmann and Hill in their Figure 5 (not reproduced here), where they fixed $\lambda$ at values other than zero. Systematic and significant biases are observed in $F^{-1}_{0.5}$ as well as $F'_{0.5}$. The magnitude of bias depends primarily on the difference between the value of the generating lapse rate and the value assumed during the fit. In line with observations made by Klein (2001), fixing the lapse rate at a small (but greater than zero) value avoids the excessive biases in slope found when the lapse rate is fixed at a value of zero.

My results for the fits in which the lapse rate estimate was allowed to vary, however, are quite different from those reported by Wichmann and Hill. Whereas Wichmann and Hill's results indicate a lack of bias when the lapse rate estimate is allowed to vary within the prior window, my results instead display a systematic bias. At low values of the generating lapse rate, threshold estimates are essentially unbiased as long as high values of F are included in the placement regimen (s6, s7 and, not shown here, s3 and s5), but negatively biased when placement regimens are used that do not include high values of F (s1 and, not shown here, s2 and s4). At high values of the generating lapse rate, all placement regimens (including those not shown here) lead to positively biased threshold estimates. Bias in slope estimates is small and consistent across placement regimens but varies systematically with generating lapse rate. From Figure 5 it is clear that the lapse rate estimate is correlated with both the threshold and slope parameters.

Figure 5. Scatterplots showing the relationship between parameter estimates for the three MOCS placement regimens as well as psi-method controlled placements (Ψ). The generating lapse rate was 0.03 and N = 960. The number in each plot indicates the number of simulations (of 2,000) resulting in parameter estimates contained within the range of parameter values included in the plots.

Results for the simulations in which stimulus placement was guided by the psi-method are presented in Figure 6 in a manner similar to Figure 4. When simulations using stimulus placements resulting from psi-method runs are repeated as MOCS (square symbols), the observed pattern of bias is comparable to that shown in Figure 4 for placement regimen s1 (which does not include high values of F; the psi-method runs also tend not to include stimuli placed at high stimulus intensities, see Figure 1). It is also clear that the use of the adaptive psi-method itself affects parameter estimate bias greatly (round symbols), especially when the lapse rate is free to vary and the number of trials is low. As noted above, this is due to the trial-to-trial interdependencies inherent in adaptive methods (Kaernbach, 2001).

Figure 6. Median threshold and slope estimates when stimulus placement was guided by the adaptive psi-method (round symbols). Also shown are median parameter estimates derived from MOCS replications of psi-method controlled runs (square symbols). For each of the four sets of symbols in each graph, ‘half 68% confidence interval widths’ are also shown (see text for details).

Discussion

My results indicate that, contrary to Wichmann and Hill's (2001) claim, significant and systematic bias in parameter estimates remains when the lapse rate is allowed to vary during fitting. This bias is exacerbated when stimulus placement is governed by the adaptive psi-method. I will first discuss in some detail the source of this bias. I will then suggest and test an alternative estimation strategy, which suggests itself based on the considerations of the source of bias in parameter estimates.

Source of bias

I will focus my discussion on the threshold parameter estimates obtained using Wichmann and Hill's (2001) MOCS placement regimens. The argument I present in regard to the source of the bias in these threshold estimates applies equally well to the threshold estimates obtained with placements derived by the psi-method but retested as MOCS. While the argument I present focuses on threshold parameter estimates, it generalizes easily to slope estimates. I will also discuss the additional bias introduced when using the adaptive psi-method to control stimulus placement.

In order to understand the source of the bias observed when the lapse rate was free to vary, let us first note that two distinct patterns of bias in threshold are evident in Figure 4. For placement regimen s1 (and, not shown here, s2 and s4), bias in threshold is an approximately linear function of the generating lapse rate: Whereas at low generating lapse rates threshold estimates tend to underestimate the generating value, at high generating lapse rates threshold estimates tend to overestimate the generating value. Bias in threshold is near zero when the generating lapse rate equals 0.03. For placement regimens s6 and s7 (and, not shown here, s3 and s5), on the other hand, threshold estimates are approximately unbiased at low generating lapse rates (up to about $\lambda_{gen} = 0.03$) but tend to overestimate the generating value when the lapse rate is high.

As will become evident from the argument I present below, the critical feature that sets s1 (as well as s2 and s4) apart from s6 and s7 (as well as s3 and s5) is that whereas the former do not include a stimulus placed at a high intensity, the latter do. The highest stimulus intensity used in regimen s1 was $F^{-1}_{0.7}$. Regimen s6 includes a stimulus placement at $F^{-1}_{0.99}$; s7 includes a stimulus placement at $F^{-1}_{0.998}$. Below I consider the sources of bias in some detail. I do so separately for the two observed patterns of bias.

Source of bias when placement regimen does not include a high intensity stimulus

Whereas this discussion focuses on bias observed under stimulus placement regimen s1, the argument generalizes to stimulus placements derived by the psi-method but retested as MOCS or any other regimen that does not include stimuli placed at a high intensity.

Figure 3c displays the PF that was used as the generating function here and in Wichmann and Hill (2001) (i.e., $\alpha = 10$, $\beta = 3$ [corresponding to $F^{-1}_{0.5} = 8.85$ and $F'_{0.5} = 0.118$], $\gamma = 0.5$) with a lapse rate equal to 0 (black curve). A hypothetical data set is shown by the black symbols in the figure. When the lapse rate is allowed to vary within [0, 0.06], the best-fitting PF is the red curve in the figure. Its estimate of the threshold parameter $\alpha$ equals 9.29, its estimate of the slope parameter $\beta$ equals 3.38, and its estimate of the lapse rate parameter $\lambda$ equals 0.06 (i.e., the upper limit on the lapse rate's prior). These values correspond to $F^{-1}_{0.5} = 8.33$ and $F'_{0.5} = 0.140$. As is clear from Figure 3c, despite having dissimilar parameter values, the generating curve and the best-fitting (red) curve are virtually identical within the range covered by the s1 regimen (which spans $F^{-1}_{0.3} = 7.09$ through $F^{-1}_{0.7} = 10.64$). It is important to note, however, that these curves are similar only in terms of probability of a positive (e.g., ‘correct’) response (i.e., $\psi$ in Equation 1). The functions describing the underlying perceptual process (in which we are interested; F in Equation 1) are quite different, as shown in the Figure inset.

The red and black functions shown in Figure 3c are, in fact, merely the limits of an entire family of PFs that are virtually identical within the s1 placement range. These limits are defined by the boundaries placed on the lapse rate. In case an observer (real or simulated) generates data under placement regimen s1 and according to the generating PF with $\lambda = 0$, all PFs in the family bound by the two functions shown in Figure 3c are, within the tested range, virtually identical to the generating PF. Likewise, any dataset generated under the s1 placement regimen will have an entire family of PFs associated with it that will all have likelihoods very near the maximum in the likelihood function. Indeed, from Figure 3b, which shows the likelihood function across threshold and slope values for the hypothetical dataset shown in Figure 3c, it is clear that the likelihood function lacks a distinct peak but instead has a ridge corresponding to the family of PFs bound by the two functions shown in Figure 3c. Note that the ridge occurs because the lapse rate is free to vary (the value of the lapse rate of the PF with the highest likelihood is indicated by the color code). Note also that the extent of the ridge is constrained by the limits placed on the lapse rate: The ridge would extend farther in both directions if the prior on the lapse rate allowed it.
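The near-equivalence of the two limiting PFs can be checked numerically from the parameter values quoted in the text. The sketch below, whose variable names are chosen here for illustration, compares the generating PF (threshold 10, slope 3, lapse rate 0) with the fitted PF (threshold 9.29, slope 3.38, lapse rate 0.06) at 21 equally spaced intensities across the s1 range of 7.09 to 10.64.

```python
import math

def weibull(x, alpha, beta):
    """Weibull F of Equation 1b."""
    return 1.0 - math.exp(-((x / alpha) ** beta))

def psi(x, alpha, beta, gamma, lam):
    """Probability of a correct response, Equation 1a."""
    return gamma + (1.0 - gamma - lam) * weibull(x, alpha, beta)

generating = {"alpha": 10.0, "beta": 3.0, "lam": 0.0}
fitted = {"alpha": 9.29, "beta": 3.38, "lam": 0.06}

# 21 equally spaced intensities across the range covered by regimen s1.
xs = [7.09 + i * (10.64 - 7.09) / 20 for i in range(21)]

max_psi_gap = max(
    abs(psi(x, gamma=0.5, **generating) - psi(x, gamma=0.5, **fitted)) for x in xs)
max_f_gap = max(
    abs(weibull(x, generating["alpha"], generating["beta"])
        - weibull(x, fitted["alpha"], fitted["beta"])) for x in xs)
print(max_psi_gap, max_f_gap)
```

Within this range the two predicted proportions correct differ by less than 0.01, while the underlying functions F differ by more than 0.05, which is why the data can barely distinguish two PFs whose sensory parameters are clearly different.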

It will be rare to find that the maximum in the likelihood function for data obtained under s1 occurs somewhere other than at either of the limits set by the prior on the lapse rate. Figure 7 shows parameter estimates for 10,000 simulations that were generated by the generating function with $\lambda = 0$ under placement regimen s1 with number of trials equal to N = 960. Figure 7a shows the distribution of lapse rate estimates for the 10,000 simulations. Nearly all lapse rate estimates (9,358 of the 10,000, or 93.6%) are at the limits of the prior, with about an equal number at each end of the prior (N = 4,885 at $\hat{\lambda} = 0$, N = 4,473 at $\hat{\lambda} = 0.06$). Figure 7b shows a histogram of all 10,000 threshold estimates. From Figure 7b we note that thresholds are clearly biased (the generating threshold value is indicated in the figure by the triangle). The bias in threshold estimates is closely linked to the observed distribution of lapse rate estimates. Figure 7c shows the threshold estimates for the 4,885 simulations in which the lapse rate estimate equaled zero. For these simulations the lapse rate estimate was accurate (albeit mostly accidentally so, as I will argue) and we find that for this subset of simulations, the threshold estimates are unbiased. Effectively, these 4,885 simulations were fitted by a PF at the ‘correct’ limit of the family of PFs that would all fit these simulations about equally well.

However, Figure 7d shows the threshold estimates

for the 4,473 simulations in which the lapse rate

Figure 7. Results of 10,000 simulations in the s1 stimulus placement regimen when the lapse rate was free to vary. The generating lapse

rate was zero. The generating threshold was F

1

0:5

¼ 8:85 and is indicated by the triangle in the Figures. (a) distribution of lapse rate

estimates. (b) distribution of threshold estimates in all 10,000 simulations. (c) distribution of threshold estimates in those simulations in

which

k ¼ 0. (d) distribution of threshold estimates in those simulations in which

k ¼ 0.06. (e) distribution of threshold estimates in the

simulations in which 0 ,

k , 0.06.

Journal of Vision (2012) 12(6):25, 1–16 Prins 9

estimate equaled 0.06 (that is, those simulations for

which the best-ﬁtting PFs were at the other, ‘incorrect’

limit of the family of PFs). Note that the median

threshold estimate for these 4,473 is very biased. The

value of the median threshold estimate (

1

0:5

¼ 8.33)

instead corresponds to the threshold of the PF which is

the closest match to the generating PF within the range

covered by the s1 regimen but which also has k ¼ 0.06

(cf. red curve in Figure 3c). The same pattern of results

is observed in the median slope estimates. For the ﬁts in

which the lapse rate estimate was equal to zero, the

median of slope estimates $F'_{0.5}$ was equal to 0.118
(compare to the slope of the generating Weibull,
$F'_{0.5} = 0.118$). However, for the fits in which the lapse
rate estimate was 0.06, the median of slope estimates
$F'_{0.5}$ was equal to 0.139, which corresponds closely to the
slope of the red curve shown in Figure 3c ($F'_{0.5} = 0.140$).

Finally, in Figure 7e are shown the threshold estimates

for the (relatively few) remaining simulations. The lapse

rate estimates for these simulations are between those

for the simulations in Figure 7c and 7d and so is the

median threshold estimate for this subset of simulations.

When the 10,000 simulations were repeated but now

with a generating lapse rate equal to 0.05, the
distribution of lapse rate estimates hardly differed from that
shown in Figure 7a: 4,612 simulations resulted in a
lapse rate estimate of 0 and 5,015 resulted in a lapse
rate estimate of 0.06. The distribution of lapse rate

estimates apparently has little to do with the generating

lapse rate when placement regimen s1 is used. The

median threshold for the 4,612 simulations resulting in

a lapse rate estimate equal to 0 was $F^{-1}_{0.5} = 9.35$, that for
the 5,015 simulations resulting in a lapse rate equal to
0.06 was $F^{-1}_{0.5} = 8.76$. These values correspond closely to
the thresholds of the two functions which are the
(prior-defined) limits of the family of PFs associated
with the generating function with $\lambda = 0.05$. The leftmost

scatterplot in Figure 5 shows the relationships among

parameter estimates observed under placement regimen

s1 in a different manner (the generating lapse rate in the

ﬁgure equaled 0.03).

Source of bias when placement regimen includes a

high intensity stimulus

While threshold estimates in stimulus regimens that

include high stimulus intensities (s6, s7 and, not shown,

s3 and s5) are essentially unbiased when the generating

lapse rate is low, there is a systematic bias in these

estimates when the generating lapse rate is high. It will

prove worthwhile to consider the source of this

asymmetry in some detail. When the placement

regimen contains high stimulus intensities (i.e., those

for which F [Equation 1b] is near unity) and the

generating lapse rate equals zero, very few (if indeed

any at all) incorrect responses will occur at the high

stimulus intensity. Such a result would be consistent

only with F having a value near unity and the lapse rate

having a value near 0. As a result, when the generating

lapse rate is zero and a placement regimen which

includes a high stimulus intensity is used, lapse rate

estimates will be at or near the (‘correct’) value of 0.

Correspondingly, bias in threshold and slope will be

minimal.

However, when the lapse rate is high, several

incorrect responses are expected to be observed at the

high intensity stimulus. This presents an ambiguous

situation: The relatively high number of incorrect

responses could be due either to a high lapse rate or

to a low value of F (or some combination of these two

factors). Stated more precisely, a relatively high

number of incorrect responses at the high stimulus

intensity will be consistent with a relatively broad

family of functions, members of which will display a

wide range of lapse rate values. The manner in which

the three parameters trade off when the placement regimen

includes a high stimulus intensity is very apparent from

Figure 5 (middle two panels: s6 and s7): High lapse rate

estimates tend to go with threshold and slope estimates

that combine to produce high perceptual performance

(i.e., high F) at the high stimulus intensity (i.e., low

threshold/high slope). Similarly, low lapse rate estimates
tend to go with high threshold/shallow slope

estimates. It might be noted in passing that, contrary to

intuitive appeal perhaps, increasing the number of trials

at the highest stimulus intensity will not do anything to

resolve this ambiguity. That is, a proportion of, say,

5% incorrect responses at a high intensity will be

consistent with either a low value of F or a high lapse

rate regardless of the number of observations it is based

on. The ambiguity must instead be resolved by

obtaining accurate estimates of threshold and slope

parameters through observations made at the lower

stimulus intensities. Returning to our argument, the

bias observed when the generating lapse rate is high

arises because of the asymmetry of the window of

allowed lapse rate estimates relative to a high

generating lapse rate. Whereas the window allows

those functions which have lapse rate values that are

much lower than the generating value (which are

coupled with upward biased threshold estimates), it

does not allow those with lapse rate estimates that are

(much) higher than the generating value. Overall, then,

threshold estimates are biased upward when the

generating lapse rate is high.
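The ambiguity described above can be made concrete with a small calculation. The sketch below is only an illustration: it assumes the usual form $\psi = \gamma + (1 - \gamma - \lambda)F$ and a guess rate of $\gamma = 0.5$ (a two-alternative task, which is an assumption made here). The helper names `error_rate` and `F_needed` are hypothetical. It lists several (lapse rate, F) pairs that all predict the same 5% error rate at a high stimulus intensity.

```python
def error_rate(F, lapse, guess=0.5):
    """Expected proportion incorrect, 1 - psi, with psi = guess + (1 - guess - lapse) * F."""
    psi = guess + (1.0 - guess - lapse) * F
    return 1.0 - psi


def F_needed(target_error, lapse, guess=0.5):
    """Value of F that produces `target_error` for a given lapse rate."""
    return (1.0 - guess - target_error) / (1.0 - guess - lapse)


# Every (lapse, F) pair printed here predicts the same 5% error rate,
# so the error count at that one intensity cannot tell the pairs apart.
for lapse in (0.0, 0.02, 0.05):
    F = F_needed(0.05, lapse)
    print(f"lapse = {lapse:.2f}  F = {F:.4f}  error = {error_rate(F, lapse):.4f}")
```

Because the predicted proportion is identical for every pair, collecting more trials at that intensity narrows the estimate of the proportion but not the choice among the pairs.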

Bias introduced by using the adaptive psi-method

Let us ﬁrst consider the bias in estimates when the

stimulus placement was as in the psi-method but

retested in an MOCS context and placement was thus


not contingent upon previous responses. These results

are indicated by the square symbols in Figure 6. The

pattern of results for these conditions is very similar to

that obtained in Figure 4 for placement regimen s1,

which also did not include high stimulus intensities.

That is, bias was obtained when the generating lapse

rate was low or high and this bias was larger (especially

when considering slope estimates) when the lapse rate

was ﬁxed at 0.025 compared to being free to vary. The

source of bias in these conditions is the same as that

discussed above for placement regimens that do not

include high stimulus intensities. When we compare

these results to those obtained in the original psi-method
runs (circular symbols in Figure 6), however, it

becomes clear that the contingencies have the effect of

raising the median threshold estimates such that, while

the median threshold estimates get closer to the

generating value for some of the lower values of the

generating lapse rates, the threshold estimates when

considered overall become more biased. This effect is

especially pronounced when N is low and the lapse rate

is free to vary. As a matter of fact, the pattern of

threshold estimates for these conditions is somewhat

similar to that shown in Figures 2 and 4 for ﬁts in

which the lapse rate was ﬁxed at a value of zero. Slope

estimates, on the other hand, are overall less biased

when the lapse rate is allowed to vary compared to

being ﬁxed.

Again, we ﬁnd that bias in threshold estimates is

closely linked with concurrent deviations of lapse rate

estimates from the true, generating value. When the

lapse rate is free to vary, lapse rate estimates mostly
equal 0 (at $N = 960$, 73.4% of lapse rate estimates
equal 0 when $\lambda_{gen} = 0$ and 52.6% of estimates equal 0
when $\lambda_{gen} = 0.05$). This finding seems to extend an

observation made by Kaernbach (2001). Kaernbach

demonstrated (and meticulously argued) that a bias in

slope estimates results when an adaptive method is

used that selects stimulus intensities such as to

optimize measurement of the threshold but not that

of the slope parameter. He further demonstrated that
the bias in slope estimates is remedied when an

adaptive method is used that selects stimulus intensities

to optimize measurement of the slope as well as the

threshold (see also Kontsevich & Tyler, 1999). The

high degree of bias in lapse rate estimates (and the
closely linked bias in threshold and slope parameter
estimates) obtained here may thus be a result of the
fact that the adaptive method selects stimulus intensities
such as to optimize threshold and slope estimates,
but not the lapse rate estimate. The general rule
appears to be that unless an adaptive procedure

optimizes stimulus selection for the estimation of a

speciﬁc parameter, caution should be exercised when

that parameter is subsequently estimated from the

resulting observations.

Proposed alternative strategy

It has been suggested by some to include a relatively

high proportion of trials at high intensities (e.g.,

Treutwein, 1995). Some of the placement regimens

used in Wichmann and Hill (2001) and here did include

high stimulus intensities and each of these presented 1/6

of the total trials at this high intensity. Whereas this

indeed leads to essentially unbiased parameter estimates

when the generating lapse rate is low, estimates are still

biased when the generating lapse rate is high. I argued

above that a critical element underlying this bias is that

the source of incorrect responses at a high stimulus

intensity is ambiguous. These incorrect responses may

result from either a high lapse rate or a low value of F

(or a combination of the two). As I have argued, merely

increasing the number of observations at high intensities
does nothing to resolve this ambiguity.

This observation suggests an alternative strategy to

incorporating estimates of the lapse rate into our

models. This strategy would involve including a

proportion of trials at a stimulus intensity so high that

it can be reasonably assumed that F at this intensity

effectively equals unity. I will refer to such an intensity

as an Asymptotic Performance Intensity or API.

Critically, the model that is ﬁtted would reﬂect the

assumption that F equals unity at API. This strategy

would remove the ambiguity as to the source of any

incorrect responses at API. Otherwise, the ﬁtting is

performed as in the method advocated by Wichmann

and Hill (2001). I call this strategy ‘joint Asymptotic

Performance Lapse Estimation’ or jAPLE. In jAPLE,

the model that is ﬁtted is given by:

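$$
\psi(x; \alpha, \beta, \gamma, \lambda) =
\begin{cases}
1 - \lambda & \text{when } x = a \\
\gamma + (1 - \gamma - \lambda)\,F(x; \alpha, \beta) & \text{otherwise}
\end{cases}
\qquad (2)
$$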

In Equation 2, stimulus intensity $a$ is an API. In effect,
errors made at $x = a$ will be unambiguously attributed
to lapses. Note that under jAPLE, observations made
at intensities other than $x = a$ also contribute to the

estimation of the lapse rate.
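As an illustration of Equation 2, the following is a minimal sketch of a jAPLE-style fit, not the implementation used for the simulations reported here. It assumes a Weibull form for F and a guess rate of 0.5, and it replaces a proper optimizer with a coarse grid search so that it needs only the standard library. All function names, the API intensity of 30, and the example data are hypothetical. The lapse rate grid spans 0 to 0.06, the limits of the prior window discussed above.

```python
import math


def weibull(x, alpha, beta):
    """Weibull F(x; alpha, beta) = 1 - exp(-(x / alpha) ** beta)."""
    return 1.0 - math.exp(-((x / alpha) ** beta))


def psi_japle(x, alpha, beta, gamma, lam, api):
    """Equation 2: 1 - lam at the API, gamma + (1 - gamma - lam) * F elsewhere."""
    if x == api:
        return 1.0 - lam
    return gamma + (1.0 - gamma - lam) * weibull(x, alpha, beta)


def neg_log_likelihood(params, data, gamma, api):
    """Binomial negative log-likelihood; data holds (x, n_correct, n_trials) tuples."""
    alpha, beta, lam = params
    nll = 0.0
    for x, k, n in data:
        p = psi_japle(x, alpha, beta, gamma, lam, api)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep both logarithms finite
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll


def fit_japle(data, gamma, api, alphas, betas, lams):
    """Return the (alpha, beta, lam) grid point with the lowest negative log-likelihood."""
    grid = ((a, b, l) for a in alphas for b in betas for l in lams)
    return min(grid, key=lambda p: neg_log_likelihood(p, data, gamma, api))


# Hypothetical noiseless data: expected counts (200 trials per intensity)
# from a Weibull with alpha = 10, beta = 3 and a lapse rate of 0.03.
API = 30.0
GAMMA = 0.5
intensities = [6.0, 8.0, 10.0, 12.0, 14.0, API]
data = [(x, 200 * psi_japle(x, 10.0, 3.0, GAMMA, 0.03, API), 200) for x in intensities]

alphas = [8.0 + 0.5 * i for i in range(9)]            # 8.0 ... 12.0
betas = [2.0 + 0.5 * i for i in range(5)]             # 2.0 ... 4.0
lams = [0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06]      # prior window
print(fit_japle(data, GAMMA, API, alphas, betas, lams))
```

Note that the lapse rate appears in both branches of `psi_japle`, so the non-API intensities also inform its estimate, as stated in the text.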

A second alternative method of ﬁtting datasets in

which a proportion of trials is collected at an API is a

two-step procedure. In the ﬁrst step, the lapse rate

estimate is derived based solely on observations made

at the API, under the assumption that incorrect

responses observed there are exclusively due to lapses.

The maximum-likelihood estimate of the lapse rate thus

would simply correspond to the proportion of incorrect

responses observed at the API. In the second step, the

threshold and slope are estimated from observations

made at the non-API intensities while ﬁxing the lapse

rate at the value obtained in the ﬁrst step. I will refer to

this second strategy as isolated-Asymptotic Performance
Lapse Estimation or ‘iAPLE.’ Note that under

iAPLE, observations made at intensities other than

API do not contribute to the estimate of the lapse rate.
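The two-step procedure can be sketched in the same spirit. This is an illustrative sketch under stated assumptions rather than the code behind the reported simulations: F is taken to be a Weibull, the guess rate is 0.5, a coarse grid search stands in for a real optimizer, and the function names and example data are hypothetical. The step-one estimate is left unconstrained here; whether it should be limited to a prior window is not specified in the text.

```python
import math


def weibull(x, alpha, beta):
    """Weibull F(x; alpha, beta) = 1 - exp(-(x / alpha) ** beta)."""
    return 1.0 - math.exp(-((x / alpha) ** beta))


def fit_iaple(data, gamma, api, alphas, betas):
    """Two-step fit; data holds (x, n_correct, n_trials). Returns (alpha, beta, lam)."""
    # Step 1: the lapse rate is the proportion incorrect at the API.
    api_correct = sum(k for x, k, n in data if x == api)
    api_trials = sum(n for x, k, n in data if x == api)
    lam = 1.0 - api_correct / api_trials

    # Step 2: fit threshold and slope to the non-API data with lam held fixed.
    rest = [(x, k, n) for x, k, n in data if x != api]

    def nll(alpha, beta):
        total = 0.0
        for x, k, n in rest:
            p = gamma + (1.0 - gamma - lam) * weibull(x, alpha, beta)
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep both logarithms finite
            total -= k * math.log(p) + (n - k) * math.log(1.0 - p)
        return total

    alpha, beta = min(((a, b) for a in alphas for b in betas),
                      key=lambda ab: nll(*ab))
    return alpha, beta, lam


# Hypothetical data: 6 errors in 200 API trials, plus noiseless expected
# counts at five lower intensities (alpha = 10, beta = 3, lapse rate 0.03).
API = 30.0
data = [(x, 200 * (0.5 + (0.5 - 0.03) * weibull(x, 10.0, 3.0)), 200)
        for x in (6.0, 8.0, 10.0, 12.0, 14.0)]
data.append((API, 194, 200))

alphas = [8.0 + 0.5 * i for i in range(9)]   # 8.0 ... 12.0
betas = [2.0 + 0.5 * i for i in range(5)]    # 2.0 ... 4.0
print(fit_iaple(data, 0.5, API, alphas, betas))
```

In contrast to the joint fit, the lapse estimate here is determined before the lower intensities are examined, so those observations cannot alter it.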

In order to provide a brief demonstration and test of

the proposed methods, I modiﬁed Wichmann and Hill’s

(2001) placement regimens simply by changing the

highest stimulus intensity included in each of the

regimens (whatever that intensity was) to $F^{-1}_{0.9999}$ (that

is, to a stimulus intensity at which an incorrect response

would almost exclusively be due to a lapse).

I also tested the proposed methods with data

collected using the adaptive psi-method. For the

purposes of comparison I used the same number of

total trials (i.e., $N = 240$ and 960) in these simulations

as I did above. However, only 5/6 of the total of N

trials were collected using the adaptive psi-method, the

remaining 1/6 were simulated using a constant stimulus

intensity of $F^{-1}_{0.9999}$. Here too, all simulations were
repeated as MOCS in order to isolate the effect of
intertrial stimulus-response interdependencies. Note

that the total number of trials used in all simulations

remains identical to that used above.

Parameter estimates for all simulations were then

derived under the jAPLE and the iAPLE methods. As a

control, parameter estimates were also derived using

the method proposed by Wichmann and Hill. In the

remainder, I will refer to Wichmann and Hill’s
proposed method as nAPLE (‘non-Asymptotic Performance
Lapse Estimation’). Results for modified placement
regimens s1, s6, and s7 are presented in Figure 8

alongside results using the original placement regimens

ﬁtted nAPLE (which were also shown in Figure 4).

Results for the psi-method controlled stimulus placements
are presented in Figure 9. Figure 10 shows

scatterplots for the modiﬁed placement regimens and

nAPLE, jAPLE, and iAPLE ﬁtting schemes (generating

lapse rate = 0.03 and $N = 960$).

Results indicate that, in terms of bias, the jAPLE

and iAPLE methods outperform nAPLE at both values

of $N$ tested here. As a matter of fact, only at $N = 240$
does either method seem to display a mild but
systematic bias which is dependent on placement
regimen. While Wichmann and Hill’s method of

estimating parameters using the modiﬁed placement

regimens outperforms their method using the original

placement regimens, bias (albeit small) remains even at

the highest value of N tested. In terms of the precision

of estimates, however, performance suffers under the

modiﬁed placement regimens in some conditions. This

is especially evident for slope estimates under placement
regimen s1 when jAPLE and iAPLE are used.
Since jAPLE and iAPLE assume that incorrect
responses at the API result only from lapses, observations
taken at API do not contribute directly to the

estimates of threshold or slope. Especially under s1, the

remaining ﬁve intensities used to estimate threshold

and slope are poorly placed to achieve high precision in

the estimate of the slope.

Inspection of Figure 10 indicates that a trade-off

among the three parameters is much more apparent

under the nAPLE ﬁtting scheme compared to the

methods proposed here, especially under (modiﬁed)

sampling schemes s1 and s6. As discussed here,

incorrect responses made at a high stimulus intensity

may be due either to lapses or to genuine perceptual

misses (i.e., low value of F). In order to disambiguate

the source of such errors, we must obtain accurate

estimates of both threshold and slope from observations
made at lower intensities. Modified placement

regimens s1 and s6 cannot provide these: The ﬁve

intensities that are not at API are all near threshold

value in both s1 and s6.

A very small, but apparently systematic bias

remains, even at high N, when the results from the

psi-method runs are ﬁtted directly. This bias is much

more pronounced when the nAPLE method is used

compared to the methods proposed here. At high N,

parameter estimates are essentially unbiased when

intertrial dependencies are removed by retesting the

psi-method stimulus placements simulations using

MOCS.

A few concluding remarks

Wichmann and Hill (2001) chose to report their results in terms of $F^{-1}_{0.5}$ and $F'_{0.5}$, rather than $\alpha$ and $\beta$ (see Equation 1). The metrics used by Wichmann and Hill have the advantage of allowing numerical comparison of parameter values across different forms of $F$ (Weibull, Logistic, etc.). However, $F^{-1}_{0.5}$ and $F'_{0.5}$ are both non-linear functions of both $\alpha$ and $\beta$, and it is the values of $\alpha$ and $\beta$ which are estimated in the maximum-likelihood estimation procedure. Thus, while maximum-likelihood estimators have the desirable property of being asymptotically unbiased (e.g., Edwards, 1972), this property would apply only to $\alpha$ and $\beta$, not to $F^{-1}_{0.5}$ and $F'_{0.5}$. Moreover, since $F^{-1}_{0.5}$ and $F'_{0.5}$ are both functions of both $\alpha$ and $\beta$, any bias in either $\alpha$ or $\beta$ would result in bias for both of $F^{-1}_{0.5}$ and $F'_{0.5}$. Since my results directly challenge the integrity of Wichmann and Hill’s results I have chosen to report my results in terms of $F^{-1}_{0.5}$ and $F'_{0.5}$ also. However, my pattern of results would be the same whether expressed in terms of $F^{-1}_{0.5}$ and $F'_{0.5}$ or in terms of $\alpha$ and $\beta$. That is, like $F^{-1}_{0.5}$ and $F'_{0.5}$, $\alpha$ and $\beta$ are both biased when estimated by the method proposed by Wichmann and Hill and, like $F^{-1}_{0.5}$ and $F'_{0.5}$, $\alpha$ and $\beta$ are both not (noticeably) biased when the methods I propose are used and the number of observations is sufficient.
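The dependence of the two reported metrics on both fitted parameters can be illustrated for a Weibull $F(x; \alpha, \beta) = 1 - \exp(-(x/\alpha)^{\beta})$. The sketch below is illustrative only; the function names are hypothetical, and the values $\alpha = 10$, $\beta = 3$ are assumptions that are not stated in this section, chosen because they yield a threshold of about 8.85, the generating value quoted for Figure 7.

```python
import math


def weibull_inverse(p, alpha, beta):
    """Intensity at which F equals p: alpha * (-ln(1 - p)) ** (1 / beta)."""
    return alpha * (-math.log(1.0 - p)) ** (1.0 / beta)


def weibull_derivative(x, alpha, beta):
    """dF/dx = (beta / alpha) * (x / alpha) ** (beta - 1) * exp(-(x / alpha) ** beta)."""
    return (beta / alpha) * (x / alpha) ** (beta - 1.0) * math.exp(-((x / alpha) ** beta))


def threshold_and_slope(alpha, beta):
    """Return the threshold (inverse of F at 0.5) and the slope of F at that threshold."""
    x_half = weibull_inverse(0.5, alpha, beta)
    return x_half, weibull_derivative(x_half, alpha, beta)


threshold, slope = threshold_and_slope(10.0, 3.0)
print(round(threshold, 2), round(slope, 4))

# Changing beta alone moves both metrics, not just the slope.
shifted_threshold, shifted_slope = threshold_and_slope(10.0, 3.5)
print(round(shifted_threshold, 2), round(shifted_slope, 4))
```

With these assumed values the slope evaluates to roughly 0.1175, close to the 0.118 quoted above. The second call shows that an error in $\beta$ alone shifts the threshold metric as well as the slope metric.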

Based on the results of my simulations one should

consider a number of issues before deciding whether to


allow the lapse rate to vary while ﬁtting the PF. First, it

is important to realize that bias in threshold and slope

remains even when we allow the lapse rate to vary.

With the lapse rate allowed to vary, the bias in slope

estimates is largely independent of the sampling scheme

chosen, while bias in threshold estimates is very

dependent on sampling scheme (at least when the

generating lapse rate is small). The opposite is true of

biases when the lapse rate is ﬁxed at a small (but greater

than zero) value. In that case threshold bias is largely

independent of sampling scheme while the bias in slope

does depend on sampling scheme. Second, the degree of

bias is very much dependent on the speciﬁc limits set on

the prior window. Just as it is no coincidence that

threshold and slope estimates were unbiased when the

generating value of the lapse rate equaled its ﬁxed,

Figure 8. Median parameter estimates obtained using the methods proposed here (jAPLE and iAPLE). Modified placement regimens were as shown in Figure 1 except that the highest stimulus intensity was changed to $F^{-1}_{0.9999}$. Also shown are median parameter estimates derived from the same simulations using the method advocated by Wichmann and Hill (nAPLE), as well as median parameter estimates using the original placement regimens fitted nAPLE (i.e., these are identical to those shown in Figure 4 and replicated here only for the purpose of easy comparison). Also shown for each condition are ‘half 68% confidence interval widths’ (see text for details).


assumed value, it is also no coincidence that none of the

sampling schemes produce a signiﬁcant bias when the

generating lapse rate equaled 0.03 (i.e., midway
between the limits of the prior). Third, when an

adaptive method is used, bias in threshold may actually

be much greater when the lapse rate is allowed to vary

compared to being ﬁxed at a small, non-zero value.

Fourth, when we allow the lapse rate to vary a stimulus

at an API intensity (that is, one at which performance

has reached asymptotic level) should be included in the

sampling scheme even when, otherwise, stimulus

intensities are selected by an adaptive method. The

placement regimen should contain additional stimuli

placed such as to obtain accurate estimates of both

threshold and slope. Fifth, when an API stimulus is

included one should consider using the proposed

jAPLE or iAPLE ﬁtting method. It is important to

realize, however, that these methods may not always be

Figure 9. Median parameter estimates obtained using the methods proposed here (jAPLE and iAPLE). Stimulus placement of 5/6 of the trials was governed by the psi-method. The remaining trials used stimulus placement at $F^{-1}_{0.9999}$. Also shown are median parameter estimates derived from the same simulated datasets using the method advocated by Wichmann and Hill (nAPLE). The results obtained from data obtained from the psi-method runs directly are shown as round symbols, while square symbols display results from simulations in which stimulus placements obtained in each psi-method run were retested as MOCS. Also shown for each condition are ‘half 68% confidence interval widths’ (see text for details).


possible to implement since the maximum achievable

stimulus intensity may not be at asymptotic performance.
Great care should be taken to ensure that

performance has indeed reached an asymptotic level at

the stimulus intensity chosen as API.

One other possible strategy of dealing with the issue

of the lapse rate might prove valuable. In most

practical cases, bias in the threshold and slope estimates

per se is of no concern. Of theoretical concern,

generally, is not the absolute value of a threshold or

slope, but rather whether differences in parameter

values exist between experimental conditions. In such

cases, rather than ﬁtting thresholds and slopes to

different experimental conditions individually, we may

reparametrize our thresholds and slopes (e.g., Yssaad-Fesselier
& Knoblauch, 2006; Kingdom & Prins, 2010).
For example, in a two-condition experiment, we may
reparametrize our thresholds into a parameter corresponding
to the sum of the thresholds in the two

conditions and a parameter corresponding to the

difference between thresholds. Of theoretical concern

in most research will be the value of the ‘difference

parameter’ while that of the ‘sum parameter’ will,

generally, have few theoretical implications. Of interest,

then, is whether the difference parameter is subject to

bias when assumptions regarding the lapse rate are

violated. The issue of bias in the sum parameter and

difference parameter is the focus of current research in

my lab.
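The sum and difference reparametrization for a two-condition experiment amounts to a simple change of variables, sketched below. The function names and threshold values are hypothetical. The sketch shows only the algebra of the mapping; it does not address whether the difference parameter is biased, which is the open question raised above.

```python
def to_sum_diff(threshold_1, threshold_2):
    """Map two condition thresholds to (sum, difference)."""
    return threshold_1 + threshold_2, threshold_1 - threshold_2


def from_sum_diff(total, diff):
    """Recover the two condition thresholds from (sum, difference)."""
    return (total + diff) / 2.0, (total - diff) / 2.0


# Hypothetical thresholds for two experimental conditions.
total, diff = to_sum_diff(9.0, 8.0)
print(total, diff)
print(from_sum_diff(total, diff))
```

In a fit, the two thresholds would be expressed through `from_sum_diff` so that the optimizer works directly on the sum and difference parameters.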

Acknowledgments

The author is grateful to Stan Klein, Felix Wichmann, Mike Landy, Miguel García-Pérez, and an anonymous reviewer for valuable comments on earlier versions of this paper. The author thanks Felix Wichmann also for his gracious permission to reproduce Figure 3 of Wichmann and Hill (2001).

Figure 10. Scatterplots showing the relationship between parameter estimates for the three modified MOCS placement regimens as well as modified psi-method controlled placements using the different fitting schemes. The generating lapse rate was 0.03 and $N = 960$. Number in each plot indicates the number of simulations (of 2,000) resulting in parameter estimates contained within range of parameter values included in plots.

Commercial relationships: none.

Corresponding author: Nicolaas Prins.

Email: nprins@olemiss.edu.

Address: Department of Psychology, University of

Mississippi, University, Mississippi, USA.

Footnote

Wichmann and Hill (2001) refer to the interval of

allowed lapse rates as a ‘Bayesian prior.’ Constraining

the lapse rate estimates to an interval that reﬂects the

subjective belief concerning the likely values of the

lapse rate does indeed embody a critical feature of

Bayesian reasoning. However, the estimation of parameter
values in Wichmann and Hill and here is
performed by (constrained) maximum-likelihood estimation,
not by Bayesian estimation. For that reason I

will refer to the interval of allowed lapse rates simply as

the ‘prior window’ or the ‘prior.’

References

Edwards, A. W. F. (1972). Likelihood. Baltimore, MD: Johns Hopkins University Press.

García-Pérez, M. A., & Alcalá-Quintana, R. (2005). Sampling plans for fitting the psychometric function. The Spanish Journal of Psychology, 8(2), 256–289.

García-Pérez, M. A., & Alcalá-Quintana, R. (2007). The transducer model for contrast detection and discrimination: Formal relations, implications, and an empirical test. Spatial Vision, 20(1–2), 5–43.

Hill, N. J. (2001). Testing hypotheses about psychometric functions–an investigation of some confidence interval methods, their validity, and their use in the evaluation of optimal sampling strategies. Doctoral thesis, University of Oxford, Oxford, UK. Internet site: http://bootstrap-software.org/psignifit/publications/hill2001.pdf (Accessed May 21, 2012).

Kaernbach, C. (2001). Slope bias of psychometric functions derived from adaptive data. Perception & Psychophysics, 63(8), 1389–1398.

Kingdom, F. A. A., & Prins, N. (2010). Psychophysics: A practical introduction. London: Academic Press: An imprint of Elsevier.

Klein, S. A. (2001). Measuring, estimating, and understanding the psychometric function: A commentary. Perception and Psychophysics, 63(8), 1421–1455.

Kontsevich, L. L., & Tyler, C. W. (1999). Bayesian adaptive estimation of psychometric slope and threshold. Vision Research, 39(16), 2729–2737.

Manny, R. E., & Klein, S. A. (1985). A three alternative tracking paradigm to measure vernier acuity of older infants. Vision Research, 25(9), 1245–1252.

Prins, N., & Kingdom, F. A. A. (2009). Palamedes: Matlab routines for analyzing psychophysical data. Internet site: www.palamedestoolbox.org.

Swets, J. A. (1961). Is there a sensory threshold? Science, 134, 168–177.

Treutwein, B. (1995). Adaptive psychophysical procedures. Vision Research, 35(17), 2503–2522.

Treutwein, B., & Strasburger, H. (1999). Fitting the psychometric function. Perception and Psychophysics, 61(1), 87–106.

Wichmann, F. A., & Hill, N. J. (2001). The psychometric function: I. Fitting, sampling, and goodness of fit. Perception and Psychophysics, 63(8), 1293–1313.

Yssaad-Fesselier, R., & Knoblauch, K. (2006). Modeling psychometric functions in R. Behavior Research Methods, 38(1), 28–41.


Trial-history biases in evidence accumulation can give rise to apparent lapses in decision-making

Article

Jan 2024

Trial history biases and lapses are two of the most common suboptimalities observed during perceptual decision-making. These suboptimalities are routinely assumed to arise from distinct processes. However, previous work has suggested that they covary in their prevalence and that their proposed neural substrates overlap. Here we demonstrate that during decision-making, history biases and apparent lapses can both arise from a common cognitive process that is optimal under mistaken beliefs that the world is changing, i.e., nonstationary. This corresponds to an accumulation-to-bound model with history-dependent updates to the initial state of the accumulator. We test our model’s predictions about the relative prevalence of history biases and lapses, and show that they are robustly borne out in two distinct decision-making datasets of male rats, including data from a novel reaction time task. Our model improves the ability to precisely predict decision-making dynamics within and across trials, by positing a process through which agents can generate quasi-stochastic choices.

Diverse and flexible behavioral strategies arise in recurrent neural networks trained on multisensory decision making

Preprint

Nov 2023

Behavioral variability across individuals leads to substantial performance differences during cognitive tasks, although its neuronal origin and mechanisms remain elusive. Here we use recurrent neural networks trained on a multisensory decision-making task to investigate inter-subject behavioral variability. By uniquely characterizing each network with a random synaptic-weights initialization, we observed a large variability in the level of accuracy, bias and decision speed across these networks, mimicking experimental observations in mice. Performance was generally improved when networks integrated multiple sensory modalities. Additionally, individual neurons developed modality-, choice- or mixed-selectivity; these preferences differed between excitatory and inhibitory neurons, and the concrete composition of each network reflected its preferred behavioral strategy: fast networks contained more choice- and mixed-selective units, while accurate networks had relatively fewer choice-selective units. External modulatory signals shifted the preferred behavioral strategies of networks, suggesting an explanation for the recently observed within-session strategy alternations in mice.

The potential cognitive benefits of musical training from childhood to healthy aging

Thesis

May 2023

Rafael Román-Caballero

There is currently a growing interest in ways to enhance and preserve our cognitive skills through changes in lifestyle. Extensive scientific evidence links several behavioral and environmental factors, such as smoking, alcohol and drug abuse, a sedentary lifestyle, and inadequate nutrition, to an increased risk of cognitive impairment, dementia, and accelerated aging. On the other hand, education, physical exercise, and cognitively stimulating occupations and leisure activities have all been associated with neurocognitive benefits and the prevention of the pervasive consequences of neural aging. Among them, a wealth of studies has associated musical training, and particularly learning to play an instrument, with differences in auditory and sensorimotor skills, as well as in multiple non-musical cognitive capacities: intelligence, visuospatial abilities, processing speed, executive control, attention and vigilance, episodic and working memory, and language.

Tree Shrews as an Animal Model for Studying Perceptual Decision-Making Reveal a Critical Role of Stimulus-Independent Processes in Guiding Behavior

Article

Nov 2022

Decision-making is an essential cognitive process by which we interact with the external world. However, attempts to understand the neural mechanisms of decision-making are limited by the currently available animal models and the technologies that can be applied to them. Here, we build on the renewed interest in using tree shrews (Tupaia belangeri) in vision research and provide strong support for them as a model for studying visual perceptual decision-making. Tree shrews learned very quickly to perform a two-alternative forced choice contrast discrimination task, and they exhibited differences in response time distributions depending on the reward and punishment structure of the task. Specifically, they made occasional fast guesses when incorrect responses were punished by a constant increase in the interval between trials. This behavior was suppressed when faster incorrect responses were discouraged by longer intertrial intervals. By fitting the behavioral data with two variants of racing diffusion decision models, we found that the between-trial delay affected decision-making by modulating the drift rate of a time accumulator. Our results thus provide support for the existence of an internal process that is independent of the evidence accumulation in decision-making and lay a foundation for future mechanistic studies of perceptual decision-making using tree shrews.

The psychometric function: I. Fitting, sampling, and goodness of fit

Article

Nov 2001

The psychometric function relates an observer’s performance to an independent variable, usually some physical quantity of a stimulus in a psychophysical task. This paper, together with its companion paper (Wichmann & Hill, 2001), describes an integrated approach to (1) fitting psychometric functions, (2) assessing the goodness of fit, and (3) providing confidence intervals for the function’s parameters and other estimates derived from them, for the purposes of hypothesis testing. The present paper deals with the first two topics, describing a constrained maximum-likelihood method of parameter estimation and developing several goodness-of-fit tests. Using Monte Carlo simulations, we deal with two specific difficulties that arise when fitting functions to psychophysical data. First, we note that human observers are prone to stimulus-independent errors (or lapses). We show that failure to account for this can lead to serious biases in estimates of the psychometric function’s parameters and illustrate how the problem may be overcome. Second, we note that psychophysical data sets are usually rather small by the standards required by most of the commonly applied statistical tests. We demonstrate the potential errors of applying traditional χ² methods to psychophysical data and advocate use of Monte Carlo resampling techniques that do not rely on asymptotic theory. We have made available the software to implement our methods.
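
For orientation, the stimulus-independent errors mentioned in this abstract are conventionally modeled by adding a lapse parameter to the psychometric function. The block below is a sketch in that conventional notation, not text taken from the abstract: the symbols α (threshold), β (slope), γ (guess rate), λ (lapse rate), F (a sigmoid such as the Weibull or logistic), and the counts k_i correct out of n_i trials at intensity x_i are all notational assumptions.

```latex
\begin{align}
  % psychometric function with guess rate \gamma and lapse rate \lambda
  \psi(x;\alpha,\beta,\gamma,\lambda)
    &= \gamma + (1-\gamma-\lambda)\,F(x;\alpha,\beta) \\
  % binomial log-likelihood of the data, up to an additive constant
  \log L(\alpha,\beta,\gamma,\lambda)
    &= \sum_i \Bigl[ k_i \log \psi(x_i)
       + (n_i-k_i)\log\bigl(1-\psi(x_i)\bigr) \Bigr] + \text{const}
\end{align}
```

Fixing λ = 0 pins the upper asymptote at 1, so an error at a high stimulus intensity can only be accommodated by shifting α or β, which is one way to read the bias the abstract describes.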

Fitting the psychometric function

Article

Feb 1999

A constrained generalized maximum likelihood routine for fitting psychometric functions is proposed, which determines optimum values for the complete parameter set--that is, threshold and slope--as well as for guessing and lapsing probability. The constraints are realized by Bayesian prior distributions for each of these parameters. The fit itself results from maximizing the posterior distribution of the parameter values by a multidimensional simplex method. We present results from extensive Monte Carlo simulations by which we can approximate bias and variability of the estimated parameters of simulated psychometric functions. Furthermore, we have tested the routine with data gathered in real sessions of psychophysical experimenting.

Bayesian adaptive estimation of psychometric slope and threshold

Article

Sep 1999
VISION RES

We introduce a new Bayesian adaptive method for acquisition of both threshold and slope of the psychometric function. The method updates posterior probabilities in the two-dimensional parameter space of psychometric functions and makes predictions based on the expected mean threshold and slope values. On each trial it sets the stimulus intensity that maximizes the expected information to be gained by completion of that trial. The method was evaluated in computer simulations and in a psychophysical experiment using the two-alternative forced-choice (2AFC) paradigm. Threshold estimation within 2 dB (23%) precision requires fewer than 30 trials for a typical 2AFC detection task. To get the slope estimate with the same precision takes about 300 trials.
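
The trial loop described in this abstract (update a posterior over threshold and slope, then choose the next intensity by expected information) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the logistic form of the function, the fixed guess and lapse rates, the parameter grid, and the names `pf`, `update`, and `next_stimulus` are all hypothetical. Maximizing expected information gain is implemented here as minimizing the expected entropy of the posterior.

```python
import math

def pf(x, threshold, slope, guess=0.5, lapse=0.02):
    """Logistic psychometric function; guess/lapse values are assumed, not from the source."""
    core = 1.0 / (1.0 + math.exp(-slope * (x - threshold)))
    return guess + (1.0 - guess - lapse) * core

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def update(prior, grid, x, correct):
    """Bayes update of the posterior over the (threshold, slope) grid after one trial at x."""
    like = [pf(x, t, s) if correct else 1.0 - pf(x, t, s) for t, s in grid]
    post = [p * l for p, l in zip(prior, like)]
    total = sum(post)
    return [p / total for p in post]

def next_stimulus(prior, grid, candidates):
    """Return the candidate intensity with the lowest expected posterior entropy."""
    best_x, best_h = None, float("inf")
    for x in candidates:
        # Predicted probability of a correct response, averaged over the prior.
        p_correct = sum(p * pf(x, t, s) for p, (t, s) in zip(prior, grid))
        h = (p_correct * entropy(update(prior, grid, x, True))
             + (1.0 - p_correct) * entropy(update(prior, grid, x, False)))
        if h < best_h:
            best_x, best_h = x, h
    return best_x

# Hypothetical grid: thresholds -5..5, three slopes, uniform prior.
grid = [(t, s) for t in range(-5, 6) for s in (0.5, 1.0, 2.0)]
prior = [1.0 / len(grid)] * len(grid)
x = next_stimulus(prior, grid, [-4.0, -2.0, 0.0, 2.0, 4.0])
posterior = update(prior, grid, x, correct=True)
```

In an experiment, the `update` and `next_stimulus` calls would alternate on every trial, with `correct` taken from the observer's response.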

Is There a Sensory Threshold?

Article

Jan 1961

Swets JA

Psychophysics: A Practical Introduction

Chapter

Jan 2010

A three alternative tracking paradigm to measure vernier acuity of older infants

Article

Feb 1985
VISION RES

Vernier acuity was measured in infants 1 to 14 months of age using a dynamic three alternative tracking paradigm. The location of the vernier offset would move randomly to one of three screen locations. An observer, unaware of the stimulus location, viewed the infant from behind a screen and guessed the position of the vernier offset. The magnitude of the vernier offset was controlled by a staircase and the results were analyzed using a broken line psychometric function. These thresholds were compared to thresholds obtained from the same infants during the same session with a two alternative forced choice preferential looking paradigm. Differences in the results of these two procedures and those reported previously in the literature are discussed in terms of the differences in the nature of the visual stimuli, one containing motion and one without motion.

Adaptive Psychophysical Procedures

Article

Oct 1995
VISION RES

Bernhard Treutwein

Improvements in measuring thresholds, or points on a psychometric function, have advanced the field of psychophysics in the last 30 years. The arrival of laboratory computers allowed the introduction of adaptive procedures, where the presentation of the next stimulus depends on previous responses of the subject. Unfortunately, these procedures present themselves in a bewildering variety, though some of them differ only slightly. Even someone familiar with several methods cannot easily name the differences, or decide which method would be best suited for a particular application. This review tries to illuminate the historical background of adaptive procedures, explain their differences and similarities, and provide criteria for choosing among the various techniques.

Measuring, estimating, and understanding the psychometric function: A commentary

Article

Nov 2001

Stanley Klein

The psychometric function, relating the subject's response to the physical stimulus, is fundamental to psychophysics. This paper examines various psychometric function topics, many inspired by this special symposium issue of Perception & Psychophysics: What are the relative merits of objective yes/no versus forced choice tasks (including threshold variance)? What are the relative merits of adaptive versus constant stimuli methods? What are the relative merits of likelihood versus up-down staircase adaptive methods? Is 2AFC free of substantial bias? Is there no efficient adaptive method for objective yes/no tasks? Should adaptive methods aim for 90% correct? Can adding more responses to forced choice and objective yes/no tasks reduce the threshold variance? What is the best way to deal with lapses? How is the Weibull function intimately related to the d' function? What causes bias in the likelihood goodness-of-fit? What causes bias in slope estimates from adaptive methods? How good are nonparametric methods for estimating psychometric function parameters? Of what value is the psychometric function slope? How are various psychometric functions related to each other? The resolution of many of these issues is surprising.

Slope bias of psychometric functions derived from adaptive data

Article

Dec 2001

Christian Kaernbach

Several investigators have fit psychometric functions to data from adaptive procedures for threshold estimation. Although the threshold estimates are in general quite correct, one encounters a slope bias that has not been explained up to now. The present paper demonstrates slope bias for parametric and nonparametric maximum-likelihood fits and for Spearman-Kärber analysis of adaptive data. The examples include staircase and stochastic approximation procedures. The paper then presents an explanation of slope bias based on serial data dependency in adaptive procedures. Data dependency is first illustrated with simple two-trial examples and then extended to realistic adaptive procedures. Finally, the paper presents an adaptive staircase procedure designed to measure threshold and slope directly. In contrast to classical adaptive threshold-only procedures, this procedure varies both a threshold and a spread parameter in response to double trials.

Is there a sensory threshold?

Article

Aug 1961

JA Swets

When the effects of the observer's response criterion are isolated, a sensory limitation is not evident.
