1188 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 19, NO. 12, DECEMBER 2000
Modeling the Haemodynamic Response in fMRI
Using Smooth FIR Filters
Cyril Goutte*, Finn Årup Nielsen, and Lars Kai Hansen, Member, IEEE
Abstract—Modeling the haemodynamic response in functional
magnetic resonance (fMRI) experiments is an important aspect of
the analysis of functional neuroimages. This has been done in the
past using parametric response functions from a limited family.
In this contribution, we adopt a semi-parametric approach based
on finite impulse response (FIR) filters. In order to cope with
the increase in the number of degrees of freedom, we introduce
a Gaussian process prior on the filter parameters. We show
how to carry out the analysis by incorporating prior knowledge
on the filters, optimizing hyper-parameters using the evidence
framework, or sampling using a Markov Chain Monte Carlo
(MCMC) approach. We present a comparison of our model with
standard haemodynamic response kernels on simulated data, and
perform a full analysis of data acquired during an experiment
involving visual stimulation.
Index Terms—Evidence, FIR filters, fMRI, haemodynamic re-
sponse, Markov Chain Monte Carlo, neuroimaging, smoothness
prior, Tikhonov regularization.
I. INTRODUCTION
MODELING the haemodynamic response is important for
several reasons. First, an appropriate modeling leads to
better statistical maps. With the increased temporal resolution
of functional magnetic resonance (fMRI) images [as compared
to positron emission tomography (PET)], a binary baseline-ac-
tivation description of the data is insufficient and it is necessary
to take into account the temporal pattern of activation due to the
haemodynamic response to the activation. A second reason is
the possibility of performing simulations with the model. The
predicted behavior obtained from simulation can be used to for-
mulate more explicit hypotheses about the fMRI signal, and pos-
sibly optimize the design and acquisition [1]. A last reason is the
possibility, for some models, to give a physiological interpreta-
tion of the model parameters and, thus, better understand the
neurophysiology [2], [3].
Manuscript received April 10, 2000; revised August 21, 2000. This work
was supported by the EU through a BIOMED II Grant BMH4-CT97-2775, the
Human Brain Project P20 MH57180 and the Danish Research Councils through
the Danish Computational Neural Network Center (CONNECT) and the THOR
Center for Neuroinformatics. The Associate Editor responsible for coordinating
the review of this paper and recommending its publication was X. Hu. Asterisk
indicates corresponding author.
*C. Goutte was with the Department of Mathematical Modeling, Technical
University of Denmark, DK-2800 Lyngby, Denmark. He is now with INRIA
Rhone-Alpes, Zirst Montbonnot - 655 avenue de l’Europe F-38334 Saint Ismier
Cedex France (e-mail: cyril.goutte@inrialpes.fr).
F. Å. Nielsen and L. K. Hansen are with the Department of Mathematical
Modeling, Building 321, Technical University of Denmark, DK-2800 Lyngby,
Denmark.
Publisher Item Identifier S 0278-0062(00)10618-4.
The haemodynamic response is usually, as a first approxima-
tion, modeled as a convolution of the experimental paradigm by
a linear filter, and implemented as a linear time-invariant (LTI)
system. This assumption is usually justified by the observation
of additivity in the fMRI signal [4], which is consistent with
the linear hypothesis. Although several groups have since re-
ported small to strong departures from linearity in a number of
contexts [5]–[8], it is still believed that the linearity assumption
holds in a wide range of experimental conditions [8]. The LTI
approach has been pursued using several types of parametric
models of the filter, for example using a Poisson filter [9], a
Gamma filter [4], [10], a Gaussian filter [11], or a simple delay
[12]. In addition, a number of investigators have used linear fil-
ters to model the haemodynamic response, but as they set, rather
than fit, the parameters, they are somewhat out of the scope of this
study (see, e.g., [6], [13]). In these models, the few parameters
have a specific interpretation, measuring, e.g., delay, strength of
activation, etc.
In this contribution, we use a different standpoint, where we do
not impose a specific shape on the linear filter coefficients. The
haemodynamic response is modeled as an FIR function, a partic-
ular case of autoregressive with exogenous input (ARX) model.
This approach has been pioneered by [14]. Though it is undoubt-
edly parametric in the sense that it fits a number of parameters,
these do not really have a physical or physiological meaning. We
will, therefore, refer to it as a semi-parametric modeling approach.
This approach is much more flexible than the use of a parametric
filter shape. In particular, it can reliably model the early decrease
in signal (initial dip, see, e.g., [15], [16]) or the post-activation
undershoot [17], whereas, e.g., the Poisson, Gamma or Gaussian
filters are intrinsically unable to do so.
As the number of parameters increases, there is a risk that
the model will overfit or that parameters become ill determined.
We deal with this problem by placing a Gaussian Process prior
on the filter coefficients, forcing the filter to be smooth. The
resulting model is determined by the data and three hyper-pa-
rameters which can be set beforehand or again fitted on the data
using a probabilistic argument.
In the following sections, we describe the basic theory for
smooth FIR filters. We discuss a number of topics like boundary
conditions and link to traditional Tikhonov regularization. We
then go on to describe how to use a Bayesian argument to esti-
mate the hyper-parameters, either using the evidence argument
or by integrating over nuisance parameters using Markov Chain
Monte Carlo (MCMC) methods.
We illustrate the workings of this filter using several exper-
iments. We first show how the model can implement some of
the LTI models currently in use in the fMRI literature. We show
0278–0062/00$10.00 © 2000 IEEE
Authorized licensed use limited to: Danmarks Tekniske Informationscenter. Downloaded on November 30, 2009 at 10:29 from IEEE Xplore. Restrictions apply.
that the smooth FIR filter is able to implement additional fea-
tures that these classical models cannot, for example a post-ac-
tivation undershoot. We then perform a full analysis of fMRI
data acquired during a visual stimulation experiment. In partic-
ular, we show how to derive from the resulting filter measures of
support (P-values) for the null hypothesis of no activation, and
meaningful physiological information like the strength or delay
in activation.
II. DATA
The dataset was acquired at Hvidovre Hospital on a 1.5-T
Magnetom Vision MR scanner by Egill Rostrup. The scanning
sequence was a 2-D gradient echo EPI (T2 weighted) with 66
ms echo time and 50 degrees RF flip angle. The images were ac-
quired with a matrix of 128 × 128 pixels, with a FOV of 230 mm,
and 10-mm slice thickness, in a para-axial orientation parallel to
the calcarine sulcus. The region of interest (ROI) will be limited
to a 68 × 82 two-dimensional (2-D) voxel map. The voxel di-
mension is 1.8 × 1.8 × 10 mm.
The visual paradigm consists of a rest period of 20 s of dark-
ness using a light fixation dot, followed by 10 s of full-field
checker board reversing at 8 Hz, and ending with 20 s of rest
(darkness). In total, 150 images were acquired in 50 s, corre-
sponding to a period of approximately 330 ms/image.
The experiment was repeated in ten separate runs containing
150 images each. In order to reduce saturation effects, the first
29 images were discarded, leaving 121 images for each run.
The datasets studied in this article were acquired on the same
subject, but during two separate scanning sessions (d3711 and
d3991), such that, e.g., the position and the shape of the slice
are slightly different. In each case, the dataset was built by com-
bining the ten runs into a single sequence of 1210 images. How-
ever, as the runs were acquired separately, it should be noted that
there cannot be any causality between the activation in one run
and the signal measured in the next. Note also that due to the
haemodynamic delay, the signal measured in activated voxels
will be roughly centered within the remaining 40 s of each run.
In the dataset we use in this article, the brain has first been
masked, and the data was preprocessed using the run-based de-
trending described by [18].
III. THEORY OF SMOOTH FIR FILTERS
Let us consider a fMRI signal $y(t)$ acquired in a given voxel
using a stimulus $x(t)$. The image index $t$ runs between one and
$T$. The finite impulse response (FIR) filter of order $h$ models the
fMRI signal using $h$ linear coefficients $\beta_i$

$$y(t) = \sum_{i=1}^{h} \beta_i\, x(t-i+1) + \varepsilon(t) \qquad (1)$$

where $\mathbf{x}_t = [x(t), x(t-1), \ldots, x(t-h+1)]^\top$ is a vector of
past values of the stimulus.
Assuming independent additive zero mean Gaussian noise,
the likelihood of the model parameters becomes

$$p(\mathbf{y}\,|\,\boldsymbol\beta, \sigma^2) = \prod_{t=1}^{T} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y(t) - \boldsymbol\beta^\top \mathbf{x}_t)^2}{2\sigma^2}\right) \qquad (2)$$

$$= (2\pi\sigma^2)^{-T/2} \exp\left(-\frac{\|\mathbf{y} - X\boldsymbol\beta\|^2}{2\sigma^2}\right) \qquad (3)$$

where $\boldsymbol\beta = (\beta_1, \ldots, \beta_h)^\top$, $X$ is a $T \times h$ matrix
containing the (transposed) input vectors for all values¹ of $t$,
and $\mathbf{y} = (y(1), \ldots, y(T))^\top$ is the vector of measurements,
the target values for our filter. Maximizing the likelihood
with respect to $\boldsymbol\beta$ leads to the well-known maximum-likelihood
(ML) solution

$$\boldsymbol\beta_{\mathrm{ML}} = (X^\top X)^{-1} X^\top \mathbf{y}. \qquad (4)$$
When the ratio of the number of independent data points to the filter
order $h$ is small, the matrix $X^\top X$ tends to be badly conditioned
and the ML solution becomes unstable. It is necessary to regularize
the solution. Alternatively, in a Bayesian context we will
impose constraints on $\boldsymbol\beta$ by specifying a prior $P(\boldsymbol\beta)$. We will
focus on Gaussian priors, of the general form

$$P(\boldsymbol\beta) = (2\pi)^{-h/2}\, |R|^{1/2} \exp\left(-\tfrac{1}{2}\boldsymbol\beta^\top R\, \boldsymbol\beta\right) \qquad (5)$$

where $|R|$ indicates the determinant of a matrix. The posterior
distribution of $\boldsymbol\beta$, conditioned on the data and the hyper-parameters,
becomes

$$p(\boldsymbol\beta\,|\,\mathbf{y}, \sigma^2, R) \propto \exp\left(-\frac{\|\mathbf{y} - X\boldsymbol\beta\|^2}{2\sigma^2} - \frac{1}{2}\boldsymbol\beta^\top R\,\boldsymbol\beta\right) \qquad (6)$$

which is a multivariate Gaussian with precision matrix $A = \sigma^{-2} X^\top X + R$,
and is largest for the maximum a posteriori parameters

$$\boldsymbol\beta_{\mathrm{MAP}} = (X^\top X + \sigma^2 R)^{-1} X^\top \mathbf{y}. \qquad (7)$$

Note that this is also the ridge regression solution when $R$ is a
diagonal matrix with identical elements on the diagonal.
The matrix $R$ implements the constraints that we impose on
the model. Here, we want to obtain smooth filters, i.e., filters
such that neighboring parameters (e.g., $\beta_i$ and $\beta_{i+1}$) have similar
values. This corresponds to saying that neighboring filter parameters
should be somehow correlated. Accordingly, $R$ will be the
inverse of a covariance matrix $\Sigma$ where the covariance is a
decreasing function of the distance between two parameters

$$R = \Sigma^{-1} \quad \text{with} \quad \Sigma_{ij} = v \exp\left(-\eta\, (i-j)^2\right). \qquad (8)$$

In (8), the covariance decreases as a Gaussian parameterized by
$v$ and $\eta$, but any nonnegative decreasing function of the distance
$|i-j|$ could be used. This corresponds to putting a Gaussian
process prior on the filter parameters themselves, rather than on
the predictions [19], [20].
With this expression, the MAP estimate of $\boldsymbol\beta$ becomes
$\boldsymbol\beta_{\mathrm{MAP}} = (X^\top X + \sigma^2 \Sigma^{-1})^{-1} X^\top \mathbf{y}$, which can be efficiently
calculated as $\boldsymbol\beta_{\mathrm{MAP}} = (\Sigma X^\top X + \sigma^2 I)^{-1} \Sigma X^\top \mathbf{y}$, avoiding
the additional inversion of $\Sigma$. The resulting estimate
depends on three hyper-parameters: the noise level $\sigma^2$, the
strength of the prior $v$ and the smoothness factor $\eta$. We will
see below how it is possible to estimate the values of these
parameters using a Bayesian argument.
¹Values of $x(t)$, $t < 1$, can be treated in several ways. For a block design
involving baseline-activation-baseline patterns, they will naturally take the value
of the baseline. Alternatively, all $x(t)$, $t < 1$, can be treated as nuisance
parameters and integrated out of the model.
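As an illustration of the estimation step above, the MAP filter under the Gaussian-process smoothness prior can be sketched in a few lines of NumPy. The hyper-parameter names (v, eta, sigma2 for the prior strength, smoothness factor and noise level) follow our reconstruction of the paper's notation; stimulus values before the first image are padded with the baseline value (zero), as for a block design.

```python
import numpy as np

def smooth_fir_map(y, x, h, v, eta, sigma2):
    """MAP estimate of a smooth FIR filter (sketch).

    y: fMRI time series (length T); x: stimulus (length T).
    Prior covariance on the coefficients: Sigma_ij = v * exp(-eta*(i-j)^2).
    """
    T = len(y)
    # Lagged design matrix; stimulus values before t = 1 are taken as
    # baseline (zero), as for a block design.
    xpad = np.concatenate([np.zeros(h - 1), np.asarray(x, float)])
    X = np.stack([xpad[t:t + h][::-1] for t in range(T)])  # T x h
    # Gaussian-process prior covariance between filter coefficients.
    i = np.arange(h)
    Sigma = v * np.exp(-eta * (i[:, None] - i[None, :]) ** 2)
    # MAP solution written so that Sigma is never inverted:
    # beta = (Sigma X'X + sigma^2 I)^{-1} Sigma X' y
    return np.linalg.solve(Sigma @ X.T @ X + sigma2 * np.eye(h),
                           Sigma @ X.T @ y)
```

Making eta very large recovers ridge regression, while a small eta forces neighboring coefficients toward a common value.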
Fig. 1. Resulting filter using ridge regression (thin line) and the smooth FIR
filter approach (thick line). Top: characteristic length ℓ = 5 s; bottom: ℓ = 7.5 s.
From left to right: increasing levels of regularization. Data: one voxel from the
visual cortex (voxel 429) displaying a large activation.
The hyper-parameter $\eta$ controls the smoothness of the resulting
filter. For large values of $\eta$, $\Sigma_{ij}$ will go to zero very
fast for increasing values of $|i-j|$, such that there will be very
little correlation between parameters: the filter will be very
unsmooth. For $\eta \to \infty$, we recover the ridge regression solution.
For small values of $\eta$, the correlation $\exp(-\eta(i-j)^2)$ will stay
close to one for all $|i-j|$, indicating perfect correlation between the
filter parameters. The filter will be over-smooth. In the limit $\eta \to 0$
all parameters are identical and the filter performs a local averaging
of the stimulus $x$. It is useful to think of $\eta$ as corresponding to a
"characteristic length" of the filter, i.e., the typical length over which
the filter varies. The characteristic length can be defined here
as $\ell = \Delta t / \sqrt{2\eta}$, where $\Delta t$ is the sampling period.
This is quite useful in fMRI modeling because
it is widely believed on the basis of empirical studies [21], [3]
that the haemodynamic response has a characteristic length on
the scale of seconds, typically between 5 and 10 s. In a first
approximation it is then possible to use this prior information such
that $\ell$ corresponds to, e.g., 7 s. For an fMRI experiment where
$\Delta t \approx 0.33$ s, this corresponds to 21 filter parameters.
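Assuming the parameterization $\Sigma_{ij} = v\exp(-\eta(i-j)^2)$ with characteristic length $\ell = \Delta t/\sqrt{2\eta}$ (our reconstruction of the relation used above), converting a desired length in seconds into a smoothness factor is a one-liner:

```python
def eta_from_length(ell_seconds, dt_seconds):
    """Smoothness factor eta such that the characteristic length
    ell = dt / sqrt(2 * eta) matches the requested value (an assumed
    parameterization of the Gaussian covariance)."""
    return 0.5 * (dt_seconds / ell_seconds) ** 2

# A 7-s characteristic length at ~0.33 s/image spans about 21 filter taps.
taps = 7.0 / 0.33
```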
Fig. 1 displays a comparison of smooth filters with the result
of ridge regression. It is quite clear that ridge regression does not
yield smooth filters, and the fluctuation in the parameters is
substantial. It is possible to reduce this fluctuation by increasing the
amount of regularization (from left to right on Fig. 1), but this
also reduces the amplitude of the filter. This fluctuation means
that the filter contains high frequency components, which po-
tentially have a strong influence on summary statistics like the
maximum parameter or the delay, even though they seem to con-
tribute little to the modeling on average. The influence of the
smoothness factor is clear from the comparison between the top
and bottom row in the figure. The smooth filters in the bottom
row are clearly smoother, as expected from the larger characteristic
length (the filters obtained from ridge regression are obviously
identical).
Fig. 2. Comparison of the filters obtained by ridge regression, the smooth FIR
filter without boundary condition (thick solid) and the FIR filter with boundary
conditions (thick dashed). Notice how the endpoint goes (smoothly) to zero.
Data: one voxel from the visual cortex (voxel 429) displaying a large activation.
A. Boundary Conditions
It is apparent from Fig. 1 that the first and (especially) the
last filter parameters can take clearly positive or negative values.
This cannot be avoided using the above equations. However, by
causality, all $\beta_i$ for $i \le 0$ should be zero, as the influence of
the stimulus at time $t$ should be felt only at times $t' \ge t$. This
corresponds to saying that a hypothetical filter parameter $\beta_0$
should be equal to zero. According to our prior, this will have
a decreasing influence on $\beta_1$, $\beta_2$, etc., which will be forced (by
smoothness) to be close to zero. Similarly, it is sensible that the
influence of an activation should vanish in the past, hence, vanishing
filter parameters for large delays. This can again be
implemented by forcing an additional parameter $\beta_{h+1}$ to be zero.
In practice, we are still interested only in estimating the values
of the $h$ filter parameters $\beta_1 \ldots \beta_h$. This is done by defining
a $(h+2) \times (h+2)$ covariance matrix $\widetilde\Sigma$, such that
$\widetilde\Sigma_{ij} = v\exp(-\eta(i-j)^2)$, for $i, j = 0, \ldots, h+1$. The matrix
$R$ is then defined as the central part of $\widetilde\Sigma^{-1}$, i.e., taking away the
first and last rows and columns. This operation can be easily
defined mathematically by introducing the $(h+2) \times h$ matrix
$P$, constructed as the superposition of a row of zeros on
top, a $h \times h$ unit matrix in the central rows and a row of zeros
at the bottom. We then have $R = P^\top \widetilde\Sigma^{-1} P$.
Note that we cannot use the same trick as above to avoid
inverting the parameter covariance matrix. However, $\widetilde\Sigma$ is a band-diagonal
(Toeplitz) matrix, such that efficient methods exist to
perform an inversion in quadratic time instead of cubic for general
matrices. Furthermore, $h$ is usually quite small such that
inversion of a $(h+2) \times (h+2)$ matrix is quite fast.
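A sketch of one possible reading of this construction, with the Gaussian covariance form assumed as before: build the extended covariance over the padded index range, invert it, and keep the central block via the zero-padded selection matrix.

```python
import numpy as np

def boundary_precision(h, v, eta):
    """Prior precision R for a smooth FIR filter with zero boundary
    conditions: the central h x h block of the inverse of the extended
    (h+2) x (h+2) covariance over indices 0 .. h+1."""
    i = np.arange(h + 2)
    Sigma_ext = v * np.exp(-eta * (i[:, None] - i[None, :]) ** 2)
    # Selection matrix: a row of zeros, the h x h identity, a row of zeros.
    P = np.vstack([np.zeros(h), np.eye(h), np.zeros(h)])
    return P.T @ np.linalg.inv(Sigma_ext) @ P
```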
Fig. 2 shows the effect of the boundary conditions. The hyper-
parameters are set to the same values as the left-most bottom
plot on Fig. 1. The smooth FIR filter obtained above had clearly
negative values for the first delays as well as for the longer de-
lays (around 75). This effect disappears when boundary condi-
tions are used. In particular, for large delays, the coefficients go
smoothly to zero, as expected.
B. Link to Tikhonov Regularization
Regularization is often performed using Tikhonov regular-
ization, which imposes a constraint on derivatives of the target
function. In the context of this work, this would correspond
to imposing smoothness by constraining the derivatives of the
filter. The regularized solution is then obtained by minimizing
the penalized cost

$$E(\boldsymbol\beta) = \|\mathbf{y} - X\boldsymbol\beta\|^2 + \lambda \sum_i \left(\beta_i^{(r)}\right)^2 \qquad (9)$$

where $r$ is the order of the derivatives used for smoothing.
Of course the true derivatives are unknown, such that we typically
use instead the central differences approximation, where,
e.g., the first derivative (gradient) is approximated by the difference
between neighboring filter coefficients: $\beta_i^{(1)} \approx \beta_{i+1} - \beta_i$.
The regularized cost can then be formulated as

$$E(\boldsymbol\beta) = \|\mathbf{y} - X\boldsymbol\beta\|^2 + \lambda\, \boldsymbol\beta^\top R\, \boldsymbol\beta \qquad (10)$$

Note that we have kept the notation $R$ because equation (10) can
actually be obtained (up to an additive and a multiplicative
constant) as the negative logarithm of the product of the likelihood
(2) and the prior (5), i.e., as the log-posterior, with $\lambda R$ playing the
same role as the prior precision in (5). The expression of $R$ depends
on the derivative used. For $r = 1$ (gradient) and $r = 2$ (curvature),
we have $R = D_r^\top D_r$, with

$$D_1 = \begin{pmatrix} -1 & 1 & & \\ & -1 & 1 & \\ & & \ddots & \ddots \end{pmatrix} \quad \text{and} \quad D_2 = \begin{pmatrix} 1 & -2 & 1 & & \\ & 1 & -2 & 1 & \\ & & \ddots & \ddots & \ddots \end{pmatrix} \qquad (11)$$

Fig. 3. Comparison of the neighboring influences for ridge regression,
Tikhonov regularization on the gradient ["Tikhonov (1)"] and the curvature
["Tikhonov (2)"] and the smooth filter approach.
Tikhonov regularization is, thus, implemented by using a band-diagonal
regularization matrix $R$. However, whereas for smooth
FIR filters $R$ has nonzero elements on (almost) all diagonals,
the number of diagonals used by Tikhonov regularization
depends on the order of the derivative and the approximation
used. This means that whereas for smooth FIR filters the effect
of one given parameter is far reaching, it is really limited for
Tikhonov regularization (2 neighbors for $r = 2$ here).
By plotting one row (or one column) of the regularization ma-
trix, one can picture the influence of the values of neighboring
parameters on the solution. This is done on Fig. 3 for ridge re-
gression, Tikhonov regularization on the gradient and curva-
tures, and the smooth FIR filter. The influence of the smooth
filter can be seen here as a generalization of the Tikhonov ap-
proach of regularizing on the (approximate) amplitude of higher
order derivatives.
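The band structure underlying this comparison is easy to generate numerically. Below, the regularization matrix is formed as D'D from finite-difference operators, a standard construction for derivative penalties (the exact scaling used in the paper may differ).

```python
import numpy as np

def tikhonov_precision(h, order):
    """Regularization matrix R = D'D built from the finite-difference
    operator of the given order (1: gradient, 2: curvature)."""
    D = np.eye(h)
    for _ in range(order):
        D = np.diff(D, axis=0)  # each pass maps beta_i -> beta_{i+1} - beta_i
    return D.T @ D
```

Interior rows of the order-2 matrix carry the (1, -4, 6, -4, 1) stencil, so each coefficient interacts with only two neighbors on each side, in contrast with the long-range coupling of the smooth-filter prior.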
C. Error Bars
The posterior distribution of $\boldsymbol\beta$ makes it possible to estimate
the uncertainty of $\boldsymbol\beta_{\mathrm{MAP}}$. Note however that (6) is a multivariate
Fig. 4. fMRI time series (+) measured in one strongly activated voxel (voxel
429), with the activation estimated by the smooth FIR filter (solid) and the
prediction error bars on this activation, obtained from (12) (dotted). The error
bars appear slightly overestimated because the first run has a larger amplitude
than the last nine runs, thus inflating the apparent noise level.
Gaussian distribution with a general covariance matrix, such that
the individual components of $\boldsymbol\beta$ are correlated. In that context it
is not easy to represent the uncertainty graphically. Conditional
error bars obtained from $p(\beta_i\,|\,\boldsymbol\beta_{-i}, \mathbf{y})$, where $\boldsymbol\beta_{-i}$
contains all filter parameters except $\beta_i$, give a good idea of how
close the filter parameters should be to each other, but greatly
underestimate the possible range of variation of $\beta_i$. This range
is well estimated by the marginal error bars obtained from $p(\beta_i\,|\,\mathbf{y})$,
but these error bars overlook the fact that filter parameters are
very correlated with each other, such that it is impossible for
example that $\beta_i$ lies at the top of its marginal error bar while $\beta_{i+1}$
lies at the bottom.
The conditional error bars are easily obtained from the posterior
precision matrix $A = \sigma^{-2} X^\top X + R$: the conditional variance of
$\beta_i$ is $1/A_{ii}$, where $A_{ii}$ is the $i$th diagonal element of $A$, while
the marginal variances are the diagonal elements of $A^{-1}$.
It is especially interesting to find error bars on the resulting
predictions of the MAP filter, i.e., the estimated (de-noised) re-
sponse pattern. Using the Gaussian noise assumption, the posterior
for the prediction $y^*$ associated with an input $\mathbf{x}^*$ is also
Gaussian

$$p(y^*\,|\,\mathbf{x}^*, \mathbf{y}) = \mathcal{N}\!\left(\boldsymbol\beta_{\mathrm{MAP}}^\top \mathbf{x}^*,\; \sigma^2 + \mathbf{x}^{*\top} A^{-1} \mathbf{x}^*\right) \qquad (12)$$
This is illustrated for a particular time series (measured in the
visual cortex) in Fig. 4. We have plotted three runs out of 10, and
we see clearly that the data fits well inside the error bars. These
actually seem slightly overestimated, for two main reasons. First
the Gaussian assumption might be violated, though there is little
evidence on this data of outliers. Second, the impression is actu-
ally due to the fact that we represent only three of ten runs on the
figure. Over the 10 runs, 79 measurements exceed the interval
given by the estimate plus or minus 1.96 standard deviations.
This should be compared to an expected 5% of 1210, or 61.
The time series modeled by the smooth FIR filter shows a
clear post-activation undershoot, followed by an overshoot of
similar amplitude. It should be noted that this might be a pre-
processing artefact. Note also that (on this data at least) it is not
possible to observe an “initial dip.”
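The prediction bands of Fig. 4 can be sketched from the Gaussian posterior over the filter, with the design matrix X, prior covariance Sigma and noise level sigma2 as in the preceding sections; the 1.96 factor gives the nominal 95% interval.

```python
import numpy as np

def prediction_bands(X, y, Sigma, sigma2):
    """MAP prediction with +/- 1.96 std error bars for each row of X,
    from the Gaussian posterior over the filter coefficients."""
    A = X.T @ X / sigma2 + np.linalg.inv(Sigma)   # posterior precision
    beta = np.linalg.solve(A, X.T @ y / sigma2)   # posterior mean (MAP)
    mean = X @ beta
    # predictive variance: noise level plus parameter uncertainty
    var = sigma2 + np.einsum('ti,ij,tj->t', X, np.linalg.inv(A), X)
    band = 1.96 * np.sqrt(var)
    return mean, mean - band, mean + band
```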
D. Significance of Activation
In the context of functional neuroimaging, it is not sufficient
to estimate the haemodynamic response in each location of the
brain. One has to use the estimated model in the purpose of
finding regions that are activated by a given stimulus sequence.
This is traditionally done by testing the null hypothesis of no
activation using various statistical tests. In the context of this
study, the null hypothesis takes the form H$_0$: $\boldsymbol\beta = 0$, i.e.,
the filter parameters are identically equal to zero. The alternative
hypothesis is H$_1$: $\boldsymbol\beta \ne 0$. In a Bayesian context, this
problem is fundamentally ill posed. The posterior probability
for each hypothesis, $P(\mathrm{H}_0\,|\,\mathbf{y})$ and $P(\mathrm{H}_1\,|\,\mathbf{y})$, can, of course, be
estimated, but as H$_0$ corresponds to a single point in parameter
($\boldsymbol\beta$) space, the associated volume is zero, yielding $P(\mathrm{H}_0\,|\,\mathbf{y}) = 0$
and rejection of the null hypothesis in favor of the alternative
H$_1$.
In a Bayesian context, the comparison of a point hypothesis
with an interval hypothesis will, thus, usually lead to the adop-
tion of the latter. In order to derive a measure of support for our
point-null hypothesis in a Bayesian context, we will use the con-
cept of highest posterior density (HPD), described, e.g., by [22]
and used in a functional neuroimaging context by [12]. Given
a posterior density function $p(\boldsymbol\beta\,|\,\mathbf{y})$, the HPD region of content
$1-\alpha$ is the region $R_\alpha$ of parameter space such that [22, section
2.8]
1) $P(\boldsymbol\beta \in R_\alpha\,|\,\mathbf{y}) = 1 - \alpha$;
2) $p(\boldsymbol\beta_1\,|\,\mathbf{y}) \ge p(\boldsymbol\beta_2\,|\,\mathbf{y})$ for all $\boldsymbol\beta_1 \in R_\alpha$, $\boldsymbol\beta_2 \notin R_\alpha$.
For a given significance level $\alpha$, we can test whether H$_0$ lies
within the HPD region of content $1-\alpha$. If so, the null hypothesis
would be accepted at level $\alpha$. Otherwise H$_0$ would be rejected
and the voxel declared activated. Alternatively, we can use the
HPD as a measure of support by calculating the volume of the
region $C = \{\boldsymbol\beta : p(\boldsymbol\beta\,|\,\mathbf{y}) \le p(0\,|\,\mathbf{y})\}$. This is the region in
parameter space that lies outside the equiprobability curve going
through 0. Clearly, if H$_0$ is to be accepted at level $\alpha$, this volume
will be larger than $\alpha$, as the HPD of content $1-\alpha$ contains 0. This
region is large when zero is close to the MAP (hence, H$_0$ should not
be rejected), and small when zero is far from the MAP (and H$_0$
should be rejected). It, thus, seems justified to use the volume of $C$
as a measure of support for the null hypothesis.
It is important to note that the use of the HPD to construct a
measure of support for point hypotheses is not exempt from some
of the logical flaws of other traditional measures of support like
$P$-values or Bayes factors [23], [24]. In particular, it potentially
suffers from inconsistency in some pathological cases. How-
ever, it has been noted that in a number of standard situations, it
yields results that are similar to classical statistical tests [22].
In our case, the posterior density of the filter parameters is
Gaussian, such that it is possible to get an efficient closed-form
solution for the measure of support of H$_0$ in each voxel. With
the notation $z = \boldsymbol\beta_{\mathrm{MAP}}^\top A\, \boldsymbol\beta_{\mathrm{MAP}}$, we have

$$P = 1 - \mathrm{gammainc}\!\left(\frac{z}{2}, \frac{h}{2}\right) \qquad (13)$$

where gammainc is the two-parameter (regularized) incomplete Gamma
function, and we have adopted the notation $P$ for the
support for H$_0$ by similarity with traditional $p$-values.
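For a Gaussian posterior, this measure of support is the probability that a chi-square variable with h degrees of freedom exceeds the Mahalanobis distance of zero from the MAP. A sketch using SciPy's regularized upper incomplete gamma function (assumed to match the gammainc convention in the text):

```python
import numpy as np
from scipy.special import gammaincc

def support_p(beta_map, A):
    """HPD-based measure of support for H0: beta = 0, for a Gaussian
    posterior N(beta_map, inv(A)).  Equals P(chi2_h >= beta' A beta)."""
    z = float(beta_map @ A @ beta_map)
    h = len(beta_map)
    return gammaincc(h / 2.0, z / 2.0)  # regularized upper incomplete gamma
```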
E. Estimation of the Delay
Whereas standard parametric models of the haemodynamic
response have one or several parameters representing the delay
(parameter of the Poisson, mean of the Gaussian, ratio of the
Gamma parameters), the FIR filter does not model this directly.
It is necessary to estimate the delay from the many filter param-
eters. One approach is to use the group delay described, e.g., by
Oppenheim and Schafer [25]
$$\tau = \Delta t\; \frac{\sum_{i=1}^{h} i\, \beta_i}{\sum_{i=1}^{h} \beta_i} \qquad (14)$$

i.e., the average of the delay in each filter parameter, weighted
by the parameter values. In some situations, this estimate will be
unreliable. This is the case for example when the denominator of
(14) is small or when the filter has high frequency components.
Note that by construction, the smooth FIR filter contains only
low frequency components, such that the latter does not occur.
Furthermore, the denominator will take small values when the
mean filter coefficient is close to zero, indicating a nonactivated
voxel. Overall, the estimation of the group delay in activated
regions will give a reliable idea of the delay implemented by
the FIR filter.
A second interesting measure is the delay necessary to reach
90% of activation after onset of the stimulus, or to return
within 10% of maximum activation after offset of the stimulus.
This delay has been reported to be between 5 and 8 s [21].
Note that linear filters implement symmetrical responses, such
that the shape of the activation (and, thus, delay) after stimulus
onset is identical to the deactivation after stimulus offset. For
block design involving binary baseline-activation stimulus, the
delay is easily calculated from the cumulative sums of the filter
parameters.
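The group-delay estimate is a coefficient-weighted average of the tap delays; a minimal sketch, with the sampling period dt used to convert taps into seconds:

```python
import numpy as np

def group_delay(beta, dt):
    """Coefficient-weighted average delay of an FIR filter, in seconds.
    Unreliable when sum(beta) is close to zero (nonactivated voxel)."""
    i = np.arange(1, len(beta) + 1)
    return dt * float(i @ beta) / float(np.sum(beta))
```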
F. Tuning of Hyper-Parameters
For given values of $\sigma^2$, $v$ and $\eta$, we have been able to give the
expression of the posterior (6) and derive the MAP estimate of
the filter parameters and some error bars. We will now see how
we can find proper values for these parameters, using again a
probabilistic approach. In a fully Bayesian approach, we would
integrate over “nuisance” parameters to obtain the posterior dis-
tribution of interest. If we are interested in $\boldsymbol\beta$, for example, we
would integrate over the three hyper-parameters, after endowing
them with suitable priors (i.e., reflecting prior knowledge or lack
thereof). In the context of this study, it is impractical to carry out
the marginalization analytically. Classical MCMC techniques
[26] are able to perform numerical integration, but these tech-
niques are computationally intensive and not practical for a full
brain (or even a full slice) analysis. We give a quick guideline
and an example of application in the appendix to this article.
In the following, we will use an intermediate approach,
and select the hyper-parameters according to their likelihood
$p(\mathbf{y}\,|\,\sigma^2, v, \eta)$. Note that using uniform priors
on the hyper-parameters,² the posterior distribution of
the hyper-parameters is proportional to the likelihood,
$p(\sigma^2, v, \eta\,|\,\mathbf{y}) \propto p(\mathbf{y}\,|\,\sigma^2, v, \eta)$. The hyper-parameters
that we wish to optimize will, thus, be chosen so as to maximize
the likelihood, also known as the evidence. This is obtained by
integrating over the distribution of the weights

$$p(\mathbf{y}\,|\,\sigma^2, v, \eta) = \int p(\mathbf{y}\,|\,\boldsymbol\beta, \sigma^2)\, P(\boldsymbol\beta\,|\,v, \eta)\, d\boldsymbol\beta \qquad (15)$$

As the product of the two terms inside the integral has a
Gaussian form, integration can be performed analytically,
leading to

$$p(\mathbf{y}\,|\,\sigma^2, v, \eta) = (2\pi)^{-T/2} \left|\sigma^2 I + X \Sigma X^\top\right|^{-1/2} \exp\left(-\tfrac{1}{2}\, \mathbf{y}^\top \left(\sigma^2 I + X \Sigma X^\top\right)^{-1} \mathbf{y}\right) \qquad (16)$$
The evidence (16) can be optimized over several hyper-pa-
rameters using standard nonlinear optimization techniques [27].
As this can still be computationally intensive, we can use an ap-
proximation from the so-called “evidence framework” [28], [29,
sec. 10.4], which provides a re-estimation formula for the noise
level $\sigma^2$ and the prior strength $v$. In that framework, these hyper-parameters
can be estimated iteratively. For example, given a
noise level $\sigma^2$, we estimate the filter $\boldsymbol\beta_{\mathrm{MAP}}$, and the noise
level is updated according to the resulting filter fit [29,
sec. 10.4].
The model size $h$ can also be thought of as playing the role
of a hyper-parameter. A sensible choice is to take $h$ sufficiently
large such that the corresponding filter contains the entire haemodynamic
response. In this paper we have chosen to take $h = 60$,
corresponding to 20 s. Additional experiments using $h = 75$
(corresponding to 25 s) showed little difference in the results.
As a comparison, the length of the filter in SPM99 (from the
file SPM_HRF.M) is 32 s.
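Evidence maximization is conveniently done on the log scale. A sketch of the log evidence corresponding to (16), computed directly from the T x T marginal covariance (fine for a single voxel; the Woodbury/determinant-lemma rewriting would be preferred for efficiency):

```python
import numpy as np

def log_evidence(X, y, Sigma, sigma2):
    """Log marginal likelihood log p(y | sigma2, Sigma) of the linear
    model y = X beta + noise, beta ~ N(0, Sigma): the (log) evidence."""
    T = len(y)
    C = sigma2 * np.eye(T) + X @ Sigma @ X.T  # marginal covariance of y
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (T * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(C, y))
```

In practice one would maximize this over the three hyper-parameters with a standard nonlinear optimizer, or use the evidence-framework re-estimation formulas mentioned above.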
IV. EXPERIMENTS
A. Can Smooth FIR Filter Estimate Standard Kernel Shapes?
In a first experiment, we look at the ability of the smooth filter
to recover the shape of traditional linear filters: the Poisson filter
proposed by [9], the Gamma filter of [10] and the Gaussian filter
[11].
We use the same sequence of 1210 images with ten runs of
31 baseline, 30 activation and 60 baseline images. Hence, the paradigm
²Technically, the priors are uniform on the log-domain.
(a) (b)
(c)
Fig. 5. Smooth FIR filter obtained on data generated by convolution of a square wave by a fixed-shape kernel. (a) Poisson, (b) Gamma, and (c) Gaussian filters.
Top: paradigm (dotted), signal (dashed), noisy data (dots) and modeled signal (solid); Bottom: generating filter (dashed) and estimated smooth FIR filter (solid).
Notice that the target and modeled signals (top row) are almost indistinguishable.
is a vector with 1210 elements and consists of a series of square
waves. For all three filters, the mean is taken to be 18 images or
6 s, while the variance of the Gamma and Gaussian filters are set
equal to 70. The variance of the Poisson filter is by construction
equal to the mean, i.e., 18. All filter parameters were scaled
such that the amplitude of the signal was roughly the same as
what we observed in activated voxels in the actual experiment.
Additive white noise of variance 400 was added to the
convolved signal, giving a signal-to-noise ratio between 2.24
and 2.91 dB (i.e., the variance of the signal is only around 40%
larger than that of the noise), cf. Fig. 5.
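The simulated data can be reproduced along these lines; the discretized Gamma kernel (shape and scale chosen to give mean 18 and variance 70 in image units) and the amplitude value are our assumptions, not the authors' exact settings.

```python
import numpy as np
from math import lgamma

def gamma_kernel(length, mean, var):
    """Gamma-shaped FIR kernel sampled at taps 1..length, with the
    given mean and variance in samples (shape k = mean^2/var,
    scale theta = var/mean)."""
    k, theta = mean ** 2 / var, var / mean
    i = np.arange(1, length + 1)
    return np.exp((k - 1) * np.log(i) - i / theta
                  - k * np.log(theta) - lgamma(k))

def simulate(paradigm, kernel, amplitude, noise_var, rng):
    """Convolve the paradigm with the kernel and add white Gaussian noise."""
    signal = amplitude * np.convolve(paradigm, kernel)[:len(paradigm)]
    return signal + rng.normal(0.0, np.sqrt(noise_var), len(paradigm))

# ten runs of 31 baseline, 30 activation and 60 baseline images
run = np.concatenate([np.zeros(31), np.ones(30), np.zeros(60)])
paradigm = np.tile(run, 10)  # 1210 images
data = simulate(paradigm, gamma_kernel(60, 18.0, 70.0),
                amplitude=100.0, noise_var=400.0,
                rng=np.random.default_rng(0))
```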
Fig. 5 shows the results obtained for the three filters. The resulting smooth FIR filter was estimated more accurately for the Gamma and Gaussian generating filters. The Poisson filter
is more difficult to estimate because the “length scale” of the
filter varies widely depending on the delay. There are steep
changes in the filter coefficients around the maximum, while
the second half of the filter is virtually flat. So in a way this
is the least smooth of the three filters. Note that the estima-
tion of the Gamma filter is also slightly impaired for the same
reason: the variation in filter coefficients is faster before the
mode, and smoother afterwards (middle plot). Note however
that when taking into account the rather large amount of noise
in the data, the fit is quite satisfactory in all cases. Furthermore, the misfit in the filter parameters does not prevent the smooth FIR filter from modeling the data almost exactly (cf. top row in Fig. 5).
This simulation indicates that for the three basic filter shapes,
the smooth FIR filter is able to recover the filter shape effi-
ciently. In particular, in all three cases studied here, the recov-
ered FIR filter showed little or no post-activation undershoot, in
accordance with the strictly positive target filters. This suggests
that any difference between the smooth FIR filter and these clas-
sical filters observed on real data is not due to the inability of
the FIR filter to reconstruct the true fMRI response, but rather
GOUTTE et al.: MODELING THE HAEMODYNAMIC RESPONSE IN fMRI USING SMOOTH FIR FILTERS 1195
Fig. 6. (a) Smooth FIR filter obtained for a particular voxel (bold solid), together with the best fit obtained using a Poisson shape (solid), a Gamma shape (dash-dotted), and a Gaussian shape (dashed). (b) Fitted signal, for one run, of the four filters from (a), plotted together with the data, averaged across the ten runs.
Fig. 7. Brain maps indicating the mean filter coefficient in each voxel. The colorbar is common to both images. The two maps are from the same subject, but two different scanning sessions. The differing shapes are due to different alignments in the two sessions. Notice the good agreement between the activation patterns, indicating good reproducibility. Voxels from the primary visual cortex (and, to a lesser extent, the supplementary visual cortex) display a strong (and asymmetric) response to the stimulus. Notice that locally the haemodynamic response yields a “negative activation” (white dots above the activated area). The horizontal lines at rows 14 and 66 in (a) indicate the voxels that we will study in more detail further down (cf. Fig. 8).
to the built-in limitations of these classical filters. In particular,
the modeling of the post-activation undershoot observed on real
data is not due to the Gaussian process prior used to constrain
the FIR coefficients, but reflects a feature that standard filter
shapes are unable to model.
B. What is the Shape of the Haemodynamic Response?
Let us now take the opposite standpoint and compare the
smooth filter obtained on real data to the best fit using the other
three standard kernel shapes. On the same data, we estimate the maximum a posteriori filter parameters, as well as the Poisson, Gamma, and Gaussian filters that best fit the data.
The results are presented on Fig. 6, where we have plotted the
result of the smooth FIR filter together with the best fit obtained
using the three standard kernel shapes introduced above.
One obvious result is that the Poisson filter seems to be quite
inappropriate for estimating the haemodynamic filter. This is
due to the fact that the one-parameter filter has identical mean
and variance. In some particular cases, notably when TR is large
and the filter only covers a few images, this might not be too
limiting. However, this clearly introduces a strong constraint on
the shape of the filter, which leads to an inappropriate filter on
our data. The Poisson parameter is here 17.8, corresponding to
a mean activation delay of 5.9 s, which is reasonable. We would, on the other hand, prefer a wider filter, as the three other filters are wider; but due to the restriction of the Poisson
Fig. 8. The smooth FIR filters obtained on the rows indicated in black on Fig. 7 (rows 14 and 66). The X-axis (VOXELS) runs along the “cut” indicated in Fig. 7; the Y-axis (DELAY) runs along the delays in the FIR filter (like the X-axis in, e.g., Figs. 1 and 2). Notice the strong activation in row 66, in the middle of the range, which corresponds to voxels from the primary visual cortex (V1). There is also a more limited response in the lateral areas. By contrast, the filters in row 14 are almost flat, indicating no activation.
filter, this would increase the activation delay beyond reasonable values. For comparison, the activation delays are 5.3 s and 5.9 s
for the Gamma and Gaussian filters, respectively, and 5.8 s for
the smooth FIR filter. But the width, measured by the standard
deviation, is 1.4 s for the Poisson filter, versus 1.8 s and 1.9 s
for Gamma and Gaussian.
A second salient feature is that, by construction, none of the
three basic filter shapes is able to model the post activation
undershoot evidenced by the smooth filter. The ability of the
Gaussian filter to model the first activation “bump” nicely and to go to zero quickly afterwards gives it a slightly better fit to the data.
By construction, the Gamma filter is skewed and has signifi-
cant mass in the tail (i.e., for large delays). This proves to be
harmful, as it introduces additional misfit around the post-activation undershoot (the filter cannot go to zero fast enough).
This is also the reason why the maximum of the filter seems to
be reached slightly ahead of what is expected. Because of the
skewness, the maximum of the filter is attained noticeably earlier
(at 4.7 s) than the mean (5.9 s). Furthermore, in order to mini-
mize the misfit around the post-activation undershoot, the mode
has to be shifted toward zero.
This result indicates that the smooth FIR filter will be able
to model additional features in the data, when traditional filter
shapes fail. This is important because we know from previous
studies (e.g., [17]) that there is a post-activation undershoot in
fMRI data. It might also be possible to model the initial negative
response if it is present in the data (cf. Section V-A).
C. Full Analysis
The slice that we study in this dataset contains 3891 voxels.
We estimate the smooth FIR filter in each voxel, using 60
delays. This leads to 3891 distinct filters. Note however that in
the MAP estimation procedure (7), the fMRI signal comes into the picture through a single voxel-dependent vector, while the remaining matrix is identical for all voxels. An important methodological question is whether the
hyper-parameters (noise level, prior strength, and length scale) should be kept constant across
the brain or locally estimated. The local estimation requires a
significant increase in computation which makes the estimation
process impractical on current workstations for several thou-
sand voxels. Accordingly, we will here adopt a hybrid approach,
which is computationally easier. The length scale is fixed from a priori knowledge to a value corresponding to a characteristic length of 7 s; the noise level is set by iterative re-estimation, which converges very fast; and the prior strength is fixed for the whole brain to a value optimized using the evidence on a given activated voxel.
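For a single voxel, this kind of MAP estimation can be sketched as Gaussian-process-regularized least squares. The sketch below is ours, not the paper's exact parameterization of (7): we assume a squared-exponential prior covariance, and the names `sigma2` (noise variance), `nu` (prior strength), and `h` (length scale, in images; 21 images corresponds to roughly 7 s at this dataset's sampling of three images per second) are our own.

```python
import numpy as np

def design_matrix(paradigm, n_delays):
    """Lagged copies of the paradigm: X[t, k] = paradigm[t - k]."""
    n = len(paradigm)
    X = np.zeros((n, n_delays))
    for k in range(n_delays):
        X[k:, k] = paradigm[: n - k]
    return X

def map_fir(paradigm, y, n_delays=60, sigma2=400.0, nu=1.0, h=21.0):
    """MAP smooth FIR filter under a GP prior w ~ N(0, K) on the coefficients."""
    d = np.arange(n_delays)
    # Squared-exponential prior covariance: nearby delays are correlated,
    # which is what makes the estimated filter smooth.
    K = nu * np.exp(-0.5 * (d[:, None] - d[None, :]) ** 2 / h**2)
    X = design_matrix(paradigm, n_delays)
    # Posterior mean in "kernel" form, K X' (X K X' + sigma2 I)^{-1} y,
    # which avoids inverting the ill-conditioned matrix K directly.
    S = X @ K @ X.T + sigma2 * np.eye(len(y))
    return K @ (X.T @ np.linalg.solve(S, y))
```

Under the Gaussian noise model this is algebraically identical to the ridge-like form (X'X + sigma2 K^{-1})^{-1} X'y, but numerically better behaved when K is nearly singular.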
Note that using a global set of hyper-parameters does not
imply that the filter itself should be constant. As argued by [10],
the characteristics of the filter should vary spatially. However, it
is desirable to impose some constraint on the filter such that the
filters would not vary unreasonably from one voxel to the
next. The use of global hyper-parameters, beyond its computa-
tional justification, forces such a high level constraint between
the filters.
The results are summarized in Fig. 7. As we have 60
filter parameters/voxel, it is necessary to design a summary
statistic in each voxel for presentation purposes. In Fig. 7
we present the mean filter coefficient. The rationale behind
this choice is that positive responses will display a positive
mean coefficient, even when the post-activation undershoot
is taken into account. Alternatively, we could use the most
extreme coefficient, or the mean absolute coefficient, but
the latter loses the sign of the activation, and we know
from previous studies [30], [18] that some areas display a
negative BOLD response to the stimulus.
In order to represent the filters themselves, additional dimensions are required to accommodate the filter delay and the coefficient values. Accordingly, we will illustrate the difference between the filters on two rows of voxels, one from a nonactivated
area (row 14), and one taken from a cut through the visual cortex
corresponding to row 66 in the summary images (Fig. 7). The re-
sults are presented in Fig. 8. In row 14, the filters are almost flat,
reflecting the fact that there is no activation. Small fluctuations
around zero reflect the presence of noise. On the other hand, the
Fig. 9. Brain maps indicating the log support p in each voxel, calculated using (13). Values have been thresholded at 10 and superimposed over a reference background. (a) displays concentrated activation in the primary visual cortex, as well as in lateral areas. In (b), the lateral activation is more diffuse; on the other hand, artefactual false positives appear (see, e.g., arrow), probably due to movement effects. The colorbar indicates the (base 10) log of p.
filters identified in the voxels corresponding to the primary vi-
sual cortex (V1), in the middle of the range on Fig. 8(b), display
a strong positive activation, followed by a post-activation under-
shoot, modeled by a series of negative filter coefficients around
40 to 60 images delay. Voxels located in the lateral visual cortex
display a moderate positive activation. In some cases, the esti-
mated filter displays a corresponding under-shoot, but the am-
plitudes are so limited that the relevance of this feature is clearly
debatable.
Let us investigate the significance of activation using the
highest probability density approach outlined above. Fig. 9
presents the location of the voxels for which p < 0.001,
superimposed on a background reference. This allows for a
quantitative characterization of the activation pattern outlined
above (Fig. 7). In both experiments there is a clear activation
in the primary visual cortex. Notice that the asymmetric nature
of the activation reproduces well in both experiments. Another
finding is that the negative activations that were apparent above the main (positive) activation area turn out to be highly significant
in Fig. 9. We also note that whereas some significant activation
is present on both sides of the lateral visual cortex in the
first experiment [d3711, Fig. 9(a)], only traces of significant
activation are observed in the second experiment. On the other
hand, experiment d3991 displays a higher number of scattered artefactual activations, including a very consistent area [arrow
on Fig. 9(b)] which could be due to movement artefacts.
Finally, the activated area may seem larger than the cortical
area (visual cortex) especially for d3711. One factor explaining
this is the spatial blurring in the hemodynamic signal, which to
our knowledge is still imperfectly understood.
We will now characterize the delay in activation modeled
by the FIR filters, using the group delay [25] described earlier.
Fig. 10 plots the resulting delays on brain maps where only the
voxels that exceeded the threshold used in Fig. 9 (p < 0.001) are
retained, and superimposed on the background reference. There
is a similarity in the spread of the delays, as well as in the fact
that the group delay seems longer in the posterior region of acti-
vation. There is however a striking difference in the actual delay
values. In the first experiments, the delays range roughly be-
tween nine images (3 s) and 18 images (6 s), while in the second,
the range is between 15 and 24 images (5–8 s). This difference
could be the sign of an inconsistency in the time registration
during the experiments. The activation periods might have ac-
tually occurred earlier than registered in experiment d3711, or
later than registered in d3991, or a combination of both.
Apart from this possible inconsistency, Fig. 10 reveals that
the estimation of the group delay yields values that seem
biologically plausible, though noticeably smaller than the
on-line/off-line delay. The estimates seem to be locally similar,
but display noticeable differences at larger scale, some regions
reacting with shorter delays. This difference in activation delay
was previously spotted using clustering [18].
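The group delay of a causal FIR filter h is the negative derivative of its phase response, which can be evaluated as Re{Σ_k k·h_k·e^(−jωk) / Σ_k h_k·e^(−jωk)}; at ω = 0 this reduces to the coefficient centroid. A small sketch follows, under our assumption that the delay is evaluated near DC (the text cites [25] but does not restate the evaluation frequency):

```python
import numpy as np

def group_delay(h, omega=0.0):
    """Group delay (in samples) of a causal FIR filter h at frequency omega."""
    k = np.arange(len(h))
    e = np.exp(-1j * omega * k)
    # tau(omega) = Re{ sum(k h_k e^{-j w k}) / sum(h_k e^{-j w k}) }
    return float(np.real(np.sum(k * h * e) / np.sum(h * e)))
```

Note that the estimate is numerically fragile when the filter coefficients sum to nearly zero, as in the near-flat filters of non-activated voxels; restricting the delay maps to supra-threshold voxels, as done above, avoids this regime.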
V. DISCUSSION
A. Initial Dip
In addition to modeling the post-activation undershoot, as
shown, e.g., in Fig. 6, the smooth FIR filter is potentially able
to model the initial negative response or initial dip [15], [16],
Fig. 10. Brain maps indicating the group delay measured in each activated voxel (p < 0.001), cf. Fig. 9. The delays in the activated areas range from nine to 18 images for d3711 (3–6 s) and from 15 to 24 images for d3991 (5–8 s), indicating a possible inconsistency in time registration during the experiments. The colorbar indicates delay (in images) between nine images (3 s) and 24 images (8 s).
[31]–[33]. Despite the relatively low field intensity, the initial dip was observed here in a number of voxels, mainly in experiment d3991. Due to the temporal alignment problem that
we have uncovered above, it was difficult to check the repro-
ducibility of the observed negative response across the two experiments.
However, we can report that this feature has been observed
in 20%–30% of the activated voxels, in both experiments, after
correction of the temporal alignment. A complete analysis of
the initial dip is beyond the scope of the current article and will
be reported elsewhere.
B. Dependency on Neuronal Activation
The relationship between the experimental paradigm and the
observed images involves both a neuronal activation induced by
the paradigm, and the hemodynamic response to this activation,
which leads to the actual measurements. By using the paradigm
as input to the linear filter, we assume that the pattern of activa-
tion follows the paradigm closely. Although this is a reasonable
assumption in the context of a strong visual stimulation, in other
cases the stimulus might not be the same as the actual neuronal
activation.
This limitation is common to all the parametric models men-
tioned in this study and has, therefore, little relevance for the
comparison between these methods. It should, however, be kept
in mind when more complicated experiments (perhaps involving
cognitive tasks) are involved. To determine both the neuronal
activation and the hemodynamic response would require blind
deconvolution using a latent variable model. The model of [34]
is an attempt to do this. It is, however, outside of the scope of
this article.
C. Computational Issues
One of the biggest challenges of this approach lies in the practical implementation for whole-brain analysis, or for the analysis of a reasonable subset of voxels, e.g., after sieving with an
omnibus F-test. While the calculation of the MAP estimate for a
given set of hyper-parameters is straightforward and
not more computationally intensive than traditional approaches
based on ridge regression or singular value decomposition, the
tuning of the hyper-parameters is usually time-consuming. While
we have argued for example that the parameter controlling the
typical length scale could be set a priori to between 5 and 10 s
(7 s here), there is no guarantee that this is optimal in any sense.
A full nonlinear optimization over two or three hyper-param-
eters is feasible for a limited number of voxels, but too compu-
tationally demanding for a whole volume or even a slice with
current computing facilities. Similarly, sampling from the pos-
terior using an MCMC technique is only practical for a limited
subset of voxels.
One simplification would be the use of fixed hyper-parameters for the whole volume. In that case, only one nonlinear optimization or Markov chain would be needed to yield a set of
hyper-parameters which are applied to all voxels. Though some
researchers (e.g., [10]) have argued that the characteristics of the
haemodynamic response vary spatially, note that having fixed
hyper-parameters would allow the filters themselves to be spa-
tially different, while tying them at a higher level in a hierar-
chical manner. A drawback of this approach is that it might lead
to averaging some of the characteristics like the noise level or
the length scale. Typically, nonactivated voxels could be mod-
eled using large length scales, corresponding to flat filters, while
activated voxels would benefit from the flexibility introduced by
smaller length scales. Note that this is not necessarily a problem
as far as the predictions themselves are concerned, as suggested
by Fig. 5.
In the experiments described above, we have adopted an intermediate approach, where the length scale is set a priori to 7 s, the regularization strength is optimized once and
for all based on some activated voxels, while the noise level
is estimated locally using an iterative procedure similar to the
re-estimation formulas in the so-called “evidence framework”
(e.g., [28], [29, sec. 10.4]). There is an obvious benefit in terms
of computational time. In our Matlab implementation running on a 450 MHz Pentium II, the full estimation of the filters and
associated measures of support takes around 90 s for 4000
voxels. An added benefit of the local estimation of the noise level is that we
do not need to make the assumption that the noise is spatially
stationary.
VI. CONCLUSION
In this paper, the use of smooth FIR filters for analyzing
functional magnetic resonance imaging data was described.
Smoothness is implemented using a correlated Gaussian prior,
and analysis is carried out using Bayesian inference. The
smooth FIR filter has a number of advantages over standard
(Poisson, Gamma, Gaussian) parametric families for modeling
the haemodynamic response. In particular, it can model a long
post-activation undershoot or the initial negative response.
The generality and flexibility of the smooth FIR approach was
illustrated on simulated data. A full analysis of data acquired
during a visual stimulation experiment with high temporal
resolution was performed. The ability of the smooth FIR filter
to find activated regions was demonstrated using a measure of
support derived from the highest posterior density approach.
APPENDIX
SAMPLING VIA MCMC
As noted in Section III-F, an ideal Bayesian analysis would
not optimize parameters, but obtain distributions of the relevant
quantities by integrating over nuisance parameters. This can be
useful here in at least three contexts:
1) Marginalize the hyper-parameters to obtain the posterior of the filter parameters, in order to obtain the maximum posterior parameters or the covariance of these parameters;
2) Marginalize the hyper-parameters in the distribution of the prediction (12), in order to obtain a predictive distribution conditioned only on the actual data;
3) Obtain the posterior distribution of the hyper-parameters conditioned on the data, in order to check, for example, whether the hyper-parameters are well determined by the data.
Numerical integration methods will be necessary for all
three problems, and can in principle be easily performed using
Fig. 11. (a) Histograms for the three hyper-parameters obtained from sampling the posterior. (b) Bivariate samples of the prior strength and the length scale h (in the log domain) as dots, superimposed on a contour plot of their joint density conditioned on a noise variance of 400.
MCMC [26], in particular Metropolis–Hastings, or hybrid
Monte Carlo [35] if derivatives are available.
A. Priors
The first step is to set up priors for the three hyper-parameters that we use here. We will put a Gamma prior on the variance of the noise and on the prior strength. For normalized data, the mean and shape factor are chosen such that it is unlikely that the noise level far exceeds the variance of the data, while there is significant mass toward zero in order to allow small noise levels (or little regularization). Accordingly, we will scale the prior with the empirical variance calculated on the actual data.
For the length scale, we have strong a priori information
suggesting that typical length scales for the haemodynamic re-
sponse should be between 5 and 10 s. However, we want to
allow larger length scales, which might be useful for nonacti-
vated voxels, where the underlying filter should be uniformly
zero. Accordingly, we will model this prior information using
a log-normal distribution, such that the log of the characteristic length covers this range over two standard deviations, which determines the parameters of the log-normal prior.
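These hyper-prior choices can be sketched as follows. Since the exact mean and shape values are elided above, the numbers below are illustrative placeholders, and the parameterization (scipy's `a`/`scale` for the Gamma, `s`/`scale` for the log-normal) is our own:

```python
import numpy as np
from scipy.stats import gamma, lognorm

def make_priors(y, length_median_s=7.0, length_sd_log=0.35):
    var_y = float(np.var(y))
    # Gamma priors on the noise variance and the prior strength, scaled by
    # the empirical data variance; shape a=1 puts significant mass toward
    # zero while keeping the prior mean at var(y). Placeholder values.
    noise_prior = gamma(a=1.0, scale=var_y)
    strength_prior = gamma(a=1.0, scale=var_y)
    # Log-normal prior on the length scale: median ~7 s, with the log-sd
    # chosen so that 5-10 s spans roughly two standard deviations.
    length_prior = lognorm(s=length_sd_log, scale=length_median_s)
    return noise_prior, strength_prior, length_prior
```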
B. Sampling from the Posterior
We will now illustrate how to sample from the posterior of the
hyper-parameters , in order to check how well
determined these hyper-parameters are. The posterior is easily
obtained (from Bayes’ rule) as it is proportional to the product
of the evidence (16) by the priors described previously. As the
derivative of the evidence with respect to the length scale is nontrivial, we will
simply use a Metropolis–Hastings algorithm [36], [26] with a
Gaussian proposal (in the log domain), which has the advantage
of being symmetric.
After setting the proposal such that we get an acceptance rate between 50% and 60%, we run the chain for 1000 iterations and discard the first 50 samples as “burn-in.” The histograms of
the sample distribution for the three hyper-parameters are pre-
sented in Fig. 11(a). The log scale gives a good indica-
tion of the relative spread of the hyper-parameters around their
mean. Clearly, the noise level is very well determined by the
data. The prior strength is badly determined, meaning that a
wide range of values have large probability. The situation for the
length scale is somewhat intermediate. These results show that
it is sensible to optimize the noise level to a fixed value, as its
posterior distribution is close to a delta function. On the other
hand, it would be interesting to integrate over the prior strength and the length scale, which have broader marginal distributions.
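A minimal Metropolis–Hastings sampler of the kind described above uses a symmetric Gaussian random walk in the log domain of the positive hyper-parameters. In this sketch, `log_post` stands in for the unnormalized log posterior (the evidence (16) plus the log priors), and the step size and iteration counts are illustrative:

```python
import numpy as np

def metropolis_log(log_post, theta0, step=0.3, n_iter=1000, burn_in=50, seed=0):
    """Random-walk Metropolis-Hastings over positive hyper-parameters,
    proposing with a symmetric Gaussian in the log domain."""
    rng = np.random.default_rng(seed)
    log_theta = np.log(np.asarray(theta0, dtype=float))
    lp = log_post(np.exp(log_theta))
    samples, n_accept = [], 0
    for _ in range(n_iter):
        prop = log_theta + step * rng.standard_normal(log_theta.shape)
        lp_prop = log_post(np.exp(prop))
        # The proposal is symmetric in log space; the extra .sum() terms are
        # the Jacobian of the log transform, since log_post is a density
        # over the parameters themselves, not over their logs.
        if np.log(rng.uniform()) < (lp_prop + prop.sum()) - (lp + log_theta.sum()):
            log_theta, lp = prop, lp_prop
            n_accept += 1
        samples.append(np.exp(log_theta))
    return np.array(samples[burn_in:]), n_accept / n_iter
```

The step size would be tuned, as in the text, until the acceptance rate falls in the desired range.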
In Fig. 11(b), we investigate the joint distribution of the prior strength and the length scale h.
The background contour plot has been obtained from the expression of the posterior by setting the noise variance to 400.
As our previous investigation showed that the noise level is
well determined in the neighborhood of this value, this gives
a probably accurate description of the marginal joint distribution. The sample (dots in Fig. 11(b)) seems
to support this approximation, and indicates that, due to correlation between the two hyper-parameters, they are slightly better determined by the data than is suggested by the marginal histograms.
This result shows that obtaining a sample from the hyper-pa-
rameters posterior is potentially useful. Unfortunately, as indi-
cated earlier, it is not computationally possible to perform this
sampling on a large scale.
ACKNOWLEDGMENT
The authors would like to thank E. Rostrup for making his
visual stimulation dataset available to them. They also thank
J. Kershaw for stimulating discussions on highest posterior
density regions and C. Rasmussen for discussions on general
Bayesian matters.
REFERENCES
[1] P. A. Bandettini and R. W. Cox, “Functional contrast in event-related
fMRI: Interstimulus dependency and blocked design comparison,” in
Proceedings of the Fourth International Conference on Functional Map-
ping of the Human Brain. ser. NeuroImage, T. Paus, A. Gjedde, and A.
Evans, Eds. New York: Academic, May 1998, pt. 2 of 3, p. 522.
[2] C. G. Thomas and R. S. Menon, “Amplitude response and stimulus
presentation frequency response of human primary visual cortex using
BOLD EPI at 4 T,” Magn. Reson. Med., vol. 40, no. 2, pp. 203–209,
1998.
[3] R. B. Buxton, E. C. Wong, and L. R. Frank, “Dynamics of blood flow
and oxygenation changes during brain activation: The balloon model,”
Magn. Reson. Med, vol. 39, no. 6, pp. 855–864, 1998.
[4] G. M. Boynton, S. A. Engel, G. H. Glover, and D. J. Heeger, “Linear
systems analysis of functional magnetic resonance imaging in human
V1,” J. Neurosci., vol. 16, no. 13, pp. 4207–4221, 1996.
[5] J. R. Binder, S. M. Rao, T. A. Hammeke, J. A. Frost, P. A. Bandettini,
and J. S. Hyde, “Effects of stimulus on signal response during functional
magnetic-resonance-imaging of auditory-cortex,” Cogn. Brain Res., vol.
2, no. 1, pp. 31–38, 1994.
[6] A. M. Dale and R. L. Buckner, “Selective averaging of rapidly presented
individual trials using fMRI,” Human Brain Mapping, vol. 5, no. 5, pp.
329–340, 1997.
[7] M. D. Robson, J. L. Dorosz, and J. C. Gore, “Measurements of the tem-
poral fMRI response of the human auditory cortex to trains of tones,”
NeuroImage, vol. 7, no. 3, pp. 185–198, 1998.
[8] G. H. Glover, “Deconvolution of impulse response in event-related
BOLD fMRI,” NeuroImage, vol. 9, no. 4, pp. 416–429, 1999.
[9] K. J. Friston, P. Jezzard, and R. Turner, “The analysis of functional MRI
time-series,” Human Brain Mapping, vol. 1, pp. 153–174, 1994.
[10] N. Lange and S. L. Zeger, “Non-linear Fourier time series analysis for
human brain mapping by functional magnetic resonance imaging,” J.
Roy. Statistical Soc., ser. C, Appl. Stat., vol. 46, no. 1, pp. 1–30, 1997.
[11] J. C. Rajapakse, F. Kruggel, J. M. Maisog, and D. Y. von Cramon, “Mod-
eling hemodynamic response for analysis of functional MRI time-se-
ries,” Human Brain Mapping, vol. 6, pp. 283–300, 1998.
[12] J. Kershaw, B. A. Ardekani, and I. Kanno, “Application of Bayesian
inference to fMRI data analysis,” IEEE Trans. Med. Imag., vol. 18, pp.
1138–1153, Dec. 1999.
[13] M. S. Cohen, “Parametric analysis of fMRI data using linear systems
methods,” NeuroImage, vol. 6, no. 2, pp. 93–103, Aug. 1997.
[14] F. Å. Nielsen, L. K. Hansen, P. Toft, C. Goutte, N. Lange, S. C. Strother,
N. Mørch, C. Svarer, R. Savoy, B. Rosen, E. Rostrup, and B. Peter,
“Comparison of two convolution models for fMRI time series,” in
Friberg et al. [37], L. Friberg, A. Gjedde, S. Holm, N. A. Lassen,
and M. Nowak, Eds. New York: Academic, May 1997, pt. 2 of 4 in
NeuroImage, vol. 5, p. S473.
[15] R. S. Menon, S. Ogawa, X. Hu, J. P. Strupp, P. Anderson, and K. Ugurbil,
“BOLD based functional MRI at 4 Tesla includes a capillary bed con-
tribution: Echo-planar imaging correlates with previous optical imaging
using intrinsic signals,” Magn. Reson. Med, vol. 33, no. 3, pp. 453–459,
1995.
[16] E. Yacoub and X. Hu, “Detection of the early negative response in fMRI
at 1.5 tesla,” Magn. Reson. Med, vol. 41, no. 6, pp. 1088–1092, 1999.
[17] G. Krüger, A. Kleinschmidt, and J. Frahm, “Dynamic MRI sensitized
to cerebral blood oxygenation and flow during sustained activation of
human visual cortex,” Magn. Reson. Med, vol. 35, no. 6, pp. 797–800,
1996.
[18] C. Goutte, L. K. Hansen, M. G. Liptrot, and E. Rostrup, “Fea-
ture space clustering for fMRI meta-analysis,” IMM, Tech. Rep.
IMM-REP-1999-13, 1999.
[19] C. E. Rasmussen, “Evaluation of Gaussian processes and other methods
for nonlinear regression,” Ph.D. dissertation, Dept. Comput. Sci., Univ.
Toronto, Toronto, Canada, 1996.
[20] C. K. I. Williams, “Prediction with Gaussian processes: From linear re-
gression to linear prediction and beyond,” in Learning and Inference in
Graphical Models, M. I. Jordan, Ed. Norwell, MA: Kluwer, 1998.
[21] P. A. Bandettini, A. Jesmanowicz, E. C. Wong, and J. S. Hyde, “Pro-
cessing strategies for time-course data sets in functional MRI of the
human brain,” Magn. Reson. Med, vol. 30, no. 2, pp. 161–173, August
1993.
[22] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Anal-
ysis. New York: Wiley, 1992.
[23] M. J. Schervish, “P-values: What they are and what they are not,” Amer. Statistician, vol. 50, no. 3, pp. 203–206, Aug. 1996.
[24] M. Lavine and M. J. Schervish, “Bayes factors: What they are and what
they are not,” Amer. Statistician, vol. 53, no. 2, pp. 119–122, May 1999.
[25] A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Pro-
cessing. Englewood Cliffs, NJ: Prentice-Hall, 1989.
[26] D. J. C. MacKay, “Introduction to Monte Carlo methods,” in Learning
in Graphical Models. ser. NATO SCIENCE: D Behavioral and Social
Sciences, M. I. Jordan, Ed. Dordrecht, The Netherlands: Kluwer Aca-
demic, 1998, vol. 89.
[27] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery,
Numerical Recipes in C, 2nd ed. Cambridge, U.K.: Cambridge Univ.
Press, 1992.
[28] D. MacKay, “A practical Bayesian framework for backprop networks,”
Neural Computation, vol. 4, pp. 448–472, 1992.
[29] C. M. Bishop, Neural Networks for Pattern Recognition. Oxford,
U.K.: Clarendon, 1995.
[30] C. Goutte, P. Toft, E. Rostrup, F. Å. Nielsen, and L. K. Hansen, “On
clustering fMRI time series,” NeuroImage, vol. 9, no. 3, pp. 298–310,
1999.
[31] X. Hu, T. H. Le, and K. Ugurbil, “Evaluation of the early response
in fMRI in individual subjects using short stimulus duration,” Magn.
Reson. Med, vol. 37, no. 6, pp. 877–884, 1997.
[32] E. Yacoub, T. H. Le, K. Ugurbil, and X. Hu, “Further evaluation of
the initial negative response in functional magnetic resonance imaging,”
Magn. Reson. Med, vol. 41, no. 3, pp. 436–441, 1999.
[33] G. M. Hathout, B. Varjavand, and R. K. Gopi, “The early response in
fMRI: A modeling approach,” Magn. Reson. Med, vol. 41, no. 3, pp.
550–554, 1999.
[34] P. A. d. F. R. Højen-Sørensen, L. K. Hansen, and C. E. Rasmussen,
“Bayesian modeling of fMRI time series,” in Advances in Neural In-
formation Processing Systems 12: Proceedings of the 1999 Conference,
S. A. Solla, T. K. Leen, and K.-R. Müller, Eds. Cambridge, MA: MIT
Press, 2000, pp. 754–760.
[35] R. M. Neal, Bayesian Learning for Neural Networks. New York:
Springer, 1996, vol. 118 of Lecture Notes in Statistics.
[36] S. Chib and E. Greenberg, “Understanding the Metropolis–Hastings al-
gorithm,” Amer. Statistician, vol. 49, no. 4, pp. 327–335, 1995.
[37] L. Friberg, A. Gjedde, S. Holm, N. A. Lassen, and M. Nowak, Eds., Pro-
ceedings of the Third International Conference on Functional Mapping
of the Human Brain. New York: Academic, May 1997, part 2 of 4 in
NeuroImage.
... Deconvolution and methods alike are aiming to estimate neuronal activity by undoing the blurring effect of the hemodynamic response, characterized as a hemodynamic response function (HRF). 1 Given the inherently ill-posed nature of hemodynamic deconvolution, due to the strong temporal low-pass characteristics of the HRF, the key is to introduce additional constraints in the estimation problem that are typically expressed as regularizers. For instance, the so-called Wiener deconvolution O R I G I N A L R E S E A R C H A R T I C L E is expressing a "minimal energy" constraint on the deconvolved signal and has been used in the framework of psychophysiological interaction analysis to compute the interaction between a seed's activity-inducing timecourse and an experimental modulation (5)(6)(7)(8)(9). ...
... with up to second-order harmonics per cardiac (f_c,i) and respiratory (f_r,i) component, whose frequencies were randomly generated following normal distributions with variance 0.04 and means i·f_r and i·f_c, for i = [1, 2]. We set the fundamental frequencies to f_r = 0.3 Hz for the respiratory component (71) and f_c = 1.1 Hz for the cardiac component (72). ...
... Note that the term deconvolution is also alternatively employed to refer to the estimation of the hemodynamic response shape assuming a known activity-inducing signal or neuronal activity (1)(2)(3)(4). ...
Article
Full-text available
Deconvolution of the hemodynamic response is an important step to access short timescales of brain activity recorded by functional magnetic resonance imaging (fMRI). Although conventional deconvolution algorithms have been around for a long time (e.g., Wiener deconvolution), recent state-of-the-art methods based on sparsity-pursuing regularization are attracting increasing interest to investigate brain dynamics and connectivity with fMRI. This technical note revisits the main concepts underlying two main methods, paradigm free mapping and total activation, in the most accessible way. Despite their apparent differences in the formulation, these methods are theoretically equivalent as they represent the synthesis and analysis sides of the same problem, respectively. We demonstrate this equivalence in practice with their best-available implementations using both simulations, with different signal-to-noise ratios, and experimental fMRI data acquired during a motor task and resting state. We evaluate the parameter settings that lead to equivalent results and showcase the potential of these algorithms compared to other common approaches. This note is useful for practitioners interested in gaining a better understanding of state-of-the-art hemodynamic deconvolution and aims to answer questions that practitioners often have regarding the differences between the two methods.
... HRF estimation with a data-driven approach or a smoothness constraint has been previously explored. The adopted methods included various constraints through, for example, a Gaussian process prior (Goutte et al., 2000; Ciuciu et al., 2003), cubic smoothing splines (Zhang et al., 2007), Tikhonov regularization (Zhang et al., 2007; Casanova et al., 2008; Casanova et al., 2009; Zhang et al., 2012), spatial regularization (Badillo et al., ...
... 2013; Chaari et al., 2013; Zhang et al., 2018), cross validation (Zhang et al., 2013), nonlinear optimization (Pedregosa et al., 2015). Some of these methods were limited to individual-level modeling (Goutte et al., 2000; Ciuciu et al., 2003; Zhang et al., 2007; Chaari et al., 2013; Pedregosa et al., 2015), and their applicability, performance, and computational feasibility at the population level remain to be explored. Other methods were confined to the region level (Chaari et al., 2013; Zhang et al., 2012; Badillo et al., 2013; Zhang et al., 2013; Zhang et al., 2018) or resorted to information extraction through dimension reduction of the HRF to two or three morphological features (Zhang et al., 2012; Zhang et al., 2013). ...
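The smoothness-constrained FIR estimation that these excerpts discuss can be sketched as a Tikhonov-regularized least-squares problem: a design matrix of lagged stimulus impulses, plus a penalty on second differences of the filter. This is a generic illustration of the idea, not the exact Gaussian-process formulation of Goutte et al. (2000) or any other cited method; the penalty weight `lam`, the HRF length, and the event timing are assumed toy values.

```python
import numpy as np

def fir_design(onsets, n_scans, n_lags):
    """FIR design matrix: column j is the stimulus impulse train delayed by j scans."""
    s = np.zeros(n_scans)
    s[onsets] = 1.0
    X = np.zeros((n_scans, n_lags))
    for j in range(n_lags):
        X[j:, j] = s[: n_scans - j]
    return X

def smooth_fir(y, X, lam=1.0):
    """Penalized least squares: ||y - X h||^2 + lam ||D h||^2, D = second differences."""
    n_lags = X.shape[1]
    D = np.diff(np.eye(n_lags), n=2, axis=0)  # second-difference operator
    return np.linalg.solve(X.T @ X + lam * D.T @ D, X.T @ y)

# Toy recovery: smooth gamma-like "true" HRF, regularly spaced events, mild noise
rng = np.random.default_rng(1)
lags = np.arange(15.0)
h_true = (lags / 4.0) ** 2 * np.exp(-lags / 4.0)
onsets = np.arange(5, 195, 20)
X = fir_design(onsets, 200, 15)
y = X @ h_true + 0.05 * rng.normal(size=200)
h_hat = smooth_fir(y, X)
```

The `lam * D.T @ D` term plays the same role as the Gaussian process prior covariance or a cubic smoothing spline: it leaves smooth filter shapes nearly unpenalized while shrinking jagged, noise-driven ones.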
Preprint
Full-text available
Typical FMRI analyses assume a canonical hemodynamic response function (HRF) with a focus on the overshoot peak height, while other morphological aspects are largely ignored. Thus, in most reported analyses, the overall effect is reduced from a curve to a single scalar. Here, we adopt a data-driven approach to HRF estimation at the whole-brain voxel level, without assuming a profile at the individual level. Then, we estimate the BOLD response in its entirety with a smoothness constraint at the population level to improve predictive accuracy and inferential efficiency. Instead of using just the scalar that represents the effect magnitude, we assess the whole HRF shape, which reveals additional information that may prove relevant for many aspects of a study, as well as for cross-study reproducibility. Through a fast event-related FMRI dataset, we demonstrate the extent of under-fitting and information loss that occurs when adopting the canonical approach. We also address the following questions: 1) How much does the HRF shape vary across regions, conditions, and clinical groups? 2) Does an agnostic approach improve sensitivity to detect an effect compared to an assumed HRF? 3) Can examining HRF shape help validate the presence of an effect complementing statistical evidence? 4) Could the HRF shape provide evidence for whole-brain BOLD response during a simple task?
... Previous work on HRF estimation has explored data-driven approaches. Some methods adopted a smoothness constraint using, for example, a Gaussian process prior (Goutte et al., 2000; Ciuciu et al., 2003; Eickenberg et al., 2017), cubic smoothing splines (Zhang et al., 2007), B-splines (Degras and Lindquist, 2014), a canonical HRF combined with its temporal derivative (Elbau et al., 2018), wavelet bases (Van De Ville et al., 2004; Khalidov et al., 2011), a biophysically informed HRF (Rosa et al., 2015), Tikhonov regularization (Zhang et al., 2007; Casanova et al., 2008; Casanova et al., 2009; Zhang et al., 2012), spatial regularization (Badillo et al., 2013; Chaari et al., 2013; Zhang et al., 2018), cross validation (Zhang et al., 2013), and nonlinear optimization (Pedregosa et al., 2015). Some of these methods were applied to individual-level modeling for task-based experiments (Goutte et al., 2000; Ciuciu et al., 2003; Zhang et al., 2007; Chaari et al., 2013; Pedregosa et al., 2015) and for resting state data (Wu et al., 2021; Cherkaoui et al., 2021). ...
... Some methods adopted a smoothness constraint using, for example, a Gaussian process prior (Goutte et al., 2000; Ciuciu et al., 2003; Eickenberg et al., 2017), cubic smoothing splines (Zhang et al., 2007), B-splines (Degras and Lindquist, 2014), a canonical HRF combined with its temporal derivative (Elbau et al., 2018), wavelet bases (Van De Ville et al., 2004; Khalidov et al., 2011), a biophysically informed HRF (Rosa et al., 2015), Tikhonov regularization (Zhang et al., 2007; Casanova et al., 2008; Casanova et al., 2009; Zhang et al., 2012), spatial regularization (Badillo et al., 2013; Chaari et al., 2013; Zhang et al., 2018), cross validation (Zhang et al., 2013), and nonlinear optimization (Pedregosa et al., 2015). Some of these methods were applied to individual-level modeling for task-based experiments (Goutte et al., 2000; Ciuciu et al., 2003; Zhang et al., 2007; Chaari et al., 2013; Pedregosa et al., 2015) and for resting state data (Wu et al., 2021; Cherkaoui et al., 2021). Other methods have been adopted at the region level (Chaari et al., 2013; Zhang et al., 2012; Badillo et al., 2013; Zhang et al., 2013; Zhang et al., 2018) or developed for information extraction through dimension reduction of the HRF to two or three morphological features (Zhang et al., 2012; Zhang et al., 2013). ...
Article
Typical fMRI analyses often assume a canonical hemodynamic response function (HRF) that primarily focuses on the peak height of the overshoot, neglecting other morphological aspects. Consequently, reported analyses often reduce the overall response curve to a single scalar value. In this study, we take a data-driven approach to HRF estimation at the whole-brain voxel level, without assuming a response profile at the individual level. We then employ a roughness penalty at the population level to estimate the response curve, aiming to enhance predictive accuracy, inferential efficiency, and cross-study reproducibility. By examining a fast event-related FMRI dataset, we demonstrate the shortcomings and information loss associated with adopting the canonical approach. Furthermore, we address the following key questions: 1. To what extent does the HRF shape vary across different regions, conditions, and participant groups? 2. Does the data-driven approach improve detection sensitivity compared to the canonical approach? 3. Can analyzing the HRF shape help validate the presence of an effect in conjunction with statistical evidence? 4. Does analyzing the HRF shape offer evidence for whole-brain response during a simple task?
... For task-based fMRI the underlying effects can be further described based on properties of local hemodynamic response functions (HRF) estimated from the data [49][50][51]. Finite impulse response models [52] are an example of such HRF estimation methods. Possibilities for presenting such HRF results range from unthresholded mapping of quantitative HRF properties to detailed local depiction of HRF curves and quantitative HRF comparison based on a priori knowledge of the brain regions involved in the processing of a task [49]. ...
Article
Full-text available
Many functional magnetic resonance imaging (fMRI) studies and presurgical mapping applications rely on mass-univariate inference with subsequent multiple comparison correction. Statistical results are frequently visualized as thresholded statistical maps. This approach has inherent limitations including the risk of drawing overly-selective conclusions based only on selective results passing such thresholds. This article gives an overview of both established and newly emerging scientific approaches to supplement such conventional analyses by incorporating information about subthreshold effects with the aim to improve interpretation of findings or leverage a wider array of information. Topics covered include neuroimaging data visualization, p-value histogram analysis and the related Higher Criticism approach for detecting rare and weak effects. Further examples from multivariate analyses and dedicated Bayesian approaches are provided.
... For trial-level neural pattern similarity analyses we constructed similar GLMs as we did for the univariate analyses, but modeled each trial of interest as a separate regressor in line with the Least Squares Separate approach 73 . Specifically, an instance of either autobiographical memory recall or directed eCFT was modeled using a finite impulse response (FIR) function 74,75 with three 4-s windows (i.e., two TRs; see Supplementary Fig. S1b). The FIR modeling approach was chosen over the canonical double-Gamma hemodynamic response function to better characterize the temporal dynamics of recollections and mental simulations over an extended period of time (i.e., 12 s), as in other studies of autobiographical memory with extended recall phases 76,77 . ...
Article
Full-text available
Episodic counterfactual thinking (eCFT) is the process of mentally simulating alternate versions of experiences, which confers new phenomenological properties to the original memory and may be a useful therapeutic target for trait anxiety. However, it remains unclear how the neural representations of a memory change during eCFT. We hypothesized that eCFT-induced memory modification is associated with changes to the neural pattern of a memory primarily within the default mode network, moderated by dispositional anxiety levels. We tested this proposal by examining the representational dynamics of eCFT for 39 participants varying in trait anxiety. During eCFT, lateral parietal regions showed progressively more distinct activity patterns, whereas medial frontal neural activity patterns became more similar to those of the original memory. Neural pattern similarity in many default mode network regions was moderated by trait anxiety, where highly anxious individuals exhibited more generalized representations for upward eCFT (better counterfactual outcomes), but more distinct representations for downward eCFT (worse counterfactual outcomes). Our findings illustrate the efficacy of examining eCFT-based memory modification via neural pattern similarity, as well as the intricate interplay between trait anxiety and eCFT generation.
... It is important to stress that p(w|y, X, λ, ν) is conditional on the point estimates of λ and ν, and that it may fail to be even close to a reasonable approximation of ∫∫ p(w, ν, λ|X, y) dν dλ [44]. ... properties of the BOLD signal [47]. It can similarly be relevant to assume smoothness across spectral channels in analyses that model relations between average stimulus power spectra and BOLD fMRI responses. ...
Preprint
Full-text available
Regression is a principal tool for relating brain responses to stimuli or tasks in computational neuroscience. This often involves fitting linear models with predictors that can be divided into groups, such as distinct stimulus feature subsets in encoding models or features of different neural response channels in decoding models. When fitting such models, it can be relevant to impose differential shrinkage of the different groups of regression weights. Here, we explore a framework that allows for straightforward definition and estimation of such models. We present an expectation-maximization algorithm for tuning hyperparameters that control shrinkage of groups of weights. We highlight properties, limitations, and potential use-cases of the model using simulated data. Next, we explore the model in the context of a BOLD fMRI encoding analysis and an EEG decoding analysis. Finally, we discuss cases where the model can be useful and scenarios where regularization procedures complicate model interpretation.
... A popular technique for modeling the hemodynamic response function related to specific events is the finite impulse response (FIR) technique, which estimates response amplitudes at each time point of a specific time window. FIR is more flexible than using a parametric filter shape (Goutte et al., 2000) as it is not biased towards the hemodynamic response function used in traditional fMRI analyses. However, one limitation of FIR models is the risk of overfitting the model due to the large number of parameters, which can lead to noise being modeled instead of signal. ...
Preprint
Full-text available
Cortical function is complex, nuanced, and involves information processing in a multimodal and dynamic world. However, previous functional magnetic resonance imaging (fMRI) research has generally characterized static activation differences between strictly controlled proxies of real-world stimuli that do not encapsulate the complexity of everyday multimodal experiences. Of primary importance to the field of neuroimaging is the development of techniques that distill complex spatiotemporal information into simple, behaviorally relevant representations of neural activation. Herein, we present a novel 4D spatiotemporal clustering method to examine dynamic neural activity associated with events (specifically the onset of human faces in audiovisual movies). Results from this study showed that 4D spatiotemporal clustering can extract clusters of fMRI activation over time that closely resemble the known spatiotemporal pattern of human face processing without the need to model a hemodynamic response function. Overall, this technique provides a new and exciting window into dynamic functional processing across both space and time using fMRI that has wide applications across the field of neuroscience.
... After the modified time series is obtained for each ROI, a Finite Impulse Response (FIR) analysis is performed to extract the hemodynamic response for each ROI. FIR modeling is a model-free approach to obtaining the hemodynamic response from the time series data using the GLM framework [29]. The design matrix for the FIR model consists of a train of impulses at successive TRs. ...
Article
Full-text available
In this article, we try to explore and understand the neurodynamics of the decision-making process for mobile application downloading. We begin the model development in a rather unorthodox fashion. Patterns of brain activation regions are identified, across participants, at different time instances of the decision-making process. Region-wise activation knowledge from previous studies is used to put together the entire process model like a cognitive jigsaw puzzle. We find that there is indeed a common dynamic set of activation patterns that is consistent across people and apps. That is to say, not only are there consistent patterns of activation, there is also a consistent change from one pattern to another across time as people make the app adoption decision. Moreover, this pattern is clearly different for decisions that end in adoption than for decisions that end with no adoption.
Article
Regression is a principal tool for relating brain responses to stimuli or tasks in computational neuroscience. This often involves fitting linear models with predictors that can be divided into groups, such as distinct stimulus feature subsets in encoding models or features of different neural response channels in decoding models. When fitting such models, it can be relevant to allow differential shrinkage of the different groups of regression weights. Here, we explore a framework that allows for straightforward definition and estimation of such models. We present an expectation-maximization algorithm for tuning hyperparameters that control shrinkage of groups of weights. We highlight properties, limitations, and potential use-cases of the model using simulated data. Next, we explore the model in the context of a BOLD fMRI encoding analysis and an EEG decoding analysis. Finally, we discuss cases where the model can be useful and scenarios where regularization procedures complicate model interpretation.
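The differential shrinkage described in the two abstracts above amounts to ridge regression with one penalty per predictor group. A minimal sketch with fixed, hand-picked penalties follows; the cited work tunes these hyperparameters with an expectation-maximization algorithm, which is omitted here, and all data shapes, group labels, and penalty values are illustrative assumptions.

```python
import numpy as np

def grouped_ridge(X, y, groups, lams):
    """Ridge with per-group shrinkage: minimize ||y - Xw||^2 + sum_g lam_g ||w_g||^2.

    groups: one integer label per column of X; lams: penalty for each label.
    """
    penalty = np.diag([lams[g] for g in groups])
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

# Columns 0-2 carry signal (light shrinkage); columns 3-5 are noise (heavy shrinkage)
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 6))
w_true = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
y = X @ w_true + 0.5 * rng.normal(size=200)
w_hat = grouped_ridge(X, y, groups=[0, 0, 0, 1, 1, 1], lams={0: 0.1, 1: 100.0})
```

With a diagonal penalty built from the group labels, the informative group keeps near-unbiased weights while the noise group is shrunk toward zero, which is the behavior the EM procedure learns automatically.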
Article
Full-text available
It is typically assumed that large networks of neurons exhibit a large repertoire of nonlinear behaviours. Here we challenge this assumption by leveraging mathematical models derived from measurements of local field potentials via intracranial electroencephalography and of whole-brain blood-oxygen-level-dependent brain activity via functional magnetic resonance imaging. We used state-of-the-art linear and nonlinear families of models to describe spontaneous resting-state activity of 700 participants in the Human Connectome Project and 122 participants in the Restoring Active Memory project. We found that linear autoregressive models provide the best fit across both data types and three performance metrics: predictive power, computational complexity and the extent of the residual dynamics unexplained by the model. To explain this observation, we show that microscopic nonlinear dynamics can be counteracted or masked by four factors associated with macroscopic dynamics: averaging over space and over time, which are inherent to aggregated macroscopic brain activity, and observation noise and limited data samples, which stem from technological limitations. We therefore argue that easier-to-interpret linear models can faithfully describe macroscopic brain dynamics during resting-state conditions.
Article
Optical imaging studies have provided evidence of an initial increase in deoxyhemoglobin following the onset of neuronal stimulation/activation and demonstrated that this initial increase could be spatially more specific to the site of neuronal activity. These studies also raised the possibility of improving the specificity of fMRI by selective mapping of this early response. Previous MR studies reported the observation of this early response but were limited in scope and not in full agreement. This paper presents a more extensive study that (a) demonstrates the initial signal decrease in individual subjects and (b) examines its dependence on stimulus duration and subject. Binocular visual stimulation experiments were performed on 14 subjects using echo-planar imaging (EPI) with high temporal resolution. An initial signal decrease was consistently observed in regions that were more localized than those displaying the delayed positive response. In agreement with previous fMRI and optical imaging findings, the maximum signal decrease was 1-2% and occurred at approximately 2 s after the onset of the stimulus, depending on the subject. For stimuli longer than 3.0 s, the temporal dynamics and the amount of signal change of the early response was essentially independent of the stimulus duration, while the delayed response and the post-stimulus undershoot increased both in terms of magnitude and rise time as the duration of the stimulus increased; this observation is concordant with the recent optical imaging study.
Article
Two features distinguish the Bayesian approach to learning models from data. First, beliefs derived from background knowledge are used to select a prior probability distribution for the model parameters. Second, predictions of future observations are made by integrating the model's predictions with respect to the posterior parameter distribution obtained by updating this prior to take account of the data. For neural network models, both these aspects present difficulties: the prior over network parameters has no obvious relation to our prior knowledge, and integration over the posterior is computationally very demanding. I address the problem by defining classes of prior distributions for network parameters that reach sensible limits as the size of the network goes to infinity. In this limit, the properties of these priors can be elucidated. Some priors converge to Gaussian processes, in which functions computed by the network may be smooth, Brownian, or fractionally Brownian. Other priors converge to non-Gaussian stable processes. Interesting effects are obtained by combining priors of both sorts in networks with more than one hidden layer.
Conference Paper
This paper describes a sequence of Monte Carlo methods: importance sampling, rejection sampling, the Metropolis method, and Gibbs sampling. For each method, we discuss whether the method is expected to be useful for high-dimensional problems such as arise in inference with graphical models. After the methods have been described, the terminology of Markov chain Monte Carlo methods is presented. The chapter concludes with a discussion of advanced methods, including methods for reducing random walk behaviour.
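The Metropolis method surveyed in the paper above (and analyzed in reference [36]) reduces to a few lines for a random-walk proposal. This is a generic sketch for a user-supplied log-density, not tied to any particular fMRI model; the step size and target below are illustrative choices.

```python
import numpy as np

def metropolis(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis: Gaussian proposals, accepted with prob min(1, ratio)."""
    rng = np.random.default_rng(seed)
    x, logp = x0, log_target(x0)
    out = np.empty(n_samples)
    for i in range(n_samples):
        prop = x + step * rng.normal()      # symmetric random-walk proposal
        logp_prop = log_target(prop)
        if np.log(rng.uniform()) < logp_prop - logp:  # Metropolis acceptance test
            x, logp = prop, logp_prop
        out[i] = x                          # rejected proposals repeat the current state
    return out

# Sampling a standard normal: log density up to an additive constant
samples = metropolis(lambda x: -0.5 * x * x, x0=0.0, n_samples=20000)
```

Because the proposal is symmetric, the Hastings correction term cancels and only the log-density ratio enters the acceptance test; an asymmetric proposal would require the full Metropolis-Hastings ratio.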
Book
Introduction; Inferences Concerning a Single Mean from Observations Assuming Common Known Variance; Inferences Concerning the Spread of a Normal Distribution from Observations Having Common Known Mean; Inferences When Both Mean and Standard Deviation are Unknown; Inferences Concerning the Difference Between Two Means; Inferences Concerning a Variance Ratio; Analysis of the Linear Model; A General Discussion of Highest Posterior Density Regions; H.P.D. Regions for the Linear Model: A Bayesian Justification of Analysis of Variance; Comparison of Parameters; Comparison of the Means of k Normal Populations; Comparison of the Spread of k Distributions; Summarized Calculations of Various Posterior Distributions
Article
P values (or significance probabilities) have been used in place of hypothesis tests as a means of giving more information about the relationship between the data and the hypothesis than does a simple reject/do not reject decision. Virtually all elementary statistics texts cover the calculation of P values for one-sided and point-null hypotheses concerning the mean of a sample from a normal distribution. There is, however, a third case that is intermediate to the one-sided and point-null cases, namely the interval hypothesis, that receives no coverage in elementary texts. We show that P values are continuous functions of the hypothesis for fixed data. This allows a unified treatment of all three types of hypothesis testing problems. It also leads to the discovery that a common informal use of P values as measures of support or evidence for hypotheses has serious logical flaws.