Network: Computation in Neural Systems 14 (2003) 177–187
How much information is associated with a particular stimulus?
Daniel A Butts
Department of Neurobiology, Harvard Medical School, 220 Longwood Avenue, Boston,
MA 02115, USA
E-mail: daniel_butts@hms.harvard.edu
Received 21 August 2002, in final form 24 October 2002
Published 28 January 2003
Online at stacks.iop.org/Network/14/177
Abstract
Although the Shannon mutual information can be used to reveal general features
of the neural code, it cannot directly address which symbols of the code are
significant. Further insight can be gained by using information measures that
are specific to particular stimuli or responses. The specific information is a
previously proposed measure of the amount of information associated with a
particular response; however, as I show, it does not properly characterize the
amount of information associated with particular stimuli. Instead, I propose
a new measure: the stimulus-specific information (SSI), defined to be the
average specific information of responses given the presence of a particular
stimulus. Like other information theoretic measures, the SSI does not rely on
assumptions about the neural code, and is robust to non-linearities of the system.
To demonstrate its applicability, the SSI is applied to data from simulated visual
neurons, and identifies stimuli consistent with the neuron’s linear kernel. While
the SSI reveals the essential linearity of the visual neurons, it also successfully
identifies the well-encoded stimuli in a modified example where linear analysis
techniques fail. Thus, I demonstrate that the SSI is an appropriate measure of
the information associated with particular stimuli, and provides a new unbiased
method of analysing the significant stimuli of a neural code.
1. Introduction
Information theory provides measures for comparing different coding schemes of neurons
and neural ensembles, avoiding both biases caused by preconceptions of the neural code and
complications inherent in analysing neuronal systems with non-linearities and under conditions
of complex stimuli. As a result, information theory has been used to analyse neural data in a
variety of sensory systems (for a review, see Borst and Theunissen 1999). Such studies typically
make comparisons between the Shannon mutual information given different classifications of
stimulus and response ensembles, but do not address which stimuli and responses within these
ensembles are significant in information transmission.
As a result, DeWeese and Meister (1999) proposed an information theoretic measure of
the significance of particular symbols in the neural code: the specific information. The specific
information of a particular response is defined as the reduction in uncertainty in the stimulus
gained by the observation of that response. Since the mutual information represents the average
reduction in the uncertainty of the stimulus gained by one measurement, specific information
is intuitively a good representation of the degree to which a given response contributes to the
overall mutual information. Furthermore, DeWeese and Meister (1999) show that specific
information has unique properties that are appropriate for a measure of the information of a
response.
Specific information can be applied to both particular stimuli and particular responses due
to the symmetry between stimulus and response in information measures. However, because of
the asymmetry of stimulus and response with respect to causality (i.e., stimuli cause responses
and not vice versa), here I show that specific information does not provide a good measure
of stimulus significance. I propose a new measure, the ‘stimulus-specific information’ (SSI),
which is defined to be the average reduction in uncertainty gained from one observation given a particular
stimulus.
Since the definition of the SSI relies on a measure of the information associated
with responses (i.e., specific information), I will first define specific information (proposed
by DeWeese and Meister 1999). Using a simple example, I will then show that it does not
provide a good measure of the information associated with particular stimuli, and motivate the
definition of the SSI.
To demonstrate its effectiveness, the SSI is applied to data from realistic simulations of
neurons in the lateral geniculate nucleus (LGN) presented with full-field white-noise stimuli
(Keat et al 2001), and to a modified version of this model that defies typical linear analyses.
In both cases, the SSI identifies the most significant stimuli and offers additional insight into
the underlying neural code.
2. Decomposing mutual information into response-specific information
Consider a system presented with an ensemble of stimuli S and whose behaviour can be
classified into a set of responses R. The Shannon mutual information between the stimulus S
and response R ensembles of this system is given by

    I[R, S] = \sum_{s \in S} \sum_{r \in R} p(s, r) \log_2 \frac{p(s, r)}{p(s)\, p(r)}    (1)

where p(s, r) is the joint probability distribution, i.e., the probability of simultaneously
observing the stimulus s ∈ S and the response r ∈ R. The joint probability distribution
p(s, r) can be computed by counting the frequency of each stimulus–response pair over a
sufficient amount of time, over which the ‘natural’ ensemble of stimuli is sampled with the
probability given by the prior distribution p(s).
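To make the counting procedure concrete, here is a minimal Python sketch (the function and variable names are my own invention, not the paper’s code) that builds the joint table from paired observations and evaluates equation (1); rows index stimuli and columns index responses:

import numpy as np

def joint_from_counts(stimuli, responses, n_s, n_r):
    # Count each stimulus-response pair, then normalize to probabilities.
    counts = np.zeros((n_s, n_r))
    for s, r in zip(stimuli, responses):
        counts[s, r] += 1
    return counts / counts.sum()

def mutual_information(joint):
    # Equation (1): I[R,S] in bits from a joint probability table p(s, r).
    joint = np.asarray(joint, dtype=float)
    p_s = joint.sum(axis=1, keepdims=True)     # prior p(s)
    p_r = joint.sum(axis=0, keepdims=True)     # marginal p(r)
    nz = joint > 0                             # treat 0 log 0 as 0
    return np.sum(joint[nz] * np.log2(joint[nz] / (p_s @ p_r)[nz]))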
While the mutual information can be used to objectively evaluate different coding schemes
through different classifications of the stimulus and response ensembles (S and R), it represents
an average over the entire set of stimuli s ∈ S and responses r ∈ R. It is often of interest to know
which particular stimuli are effectively encoded by the system, and which particular responses
communicate information about the stimuli. Such questions can be addressed by decomposing
I[R, S] into measures that represent the contributions of specific stimuli or responses to the
mutual information, i.e.
    I[R, S] = \sum_{s \in S} p(s)\, i(s) = \sum_{r \in R} p(r)\, i(r).    (2)
In this sense, the mutual information is explicitly a weighted average over individual
contributions from particular stimuli or particular responses.
There are arbitrarily many ways to perform such decompositions, meaning that there
is not one single measure that represents the ‘specific’ information associated with a particular
stimulus or response. As a result, an appropriate measure must be chosen that properly signifies
the role of particular stimuli and responses in information transmission.
DeWeese and Meister (1999) argue that the information of a response must be additive,
since intuitively, information should accumulate over consecutive independent measurements
such that the total information from multiple measurements is equal to the sum of information
gained from each measurement separately. They show that there is only one decomposition of
mutual information that is additive, the specific information, given by

    i_sp(r) = H[S] - H[S|r]    (3)

where the entropy of the prior distribution is given by H[S] = -\sum_s p(s) \log_2 p(s) and
the conditional entropy associated with a particular response r is given by
H[S|r] = -\sum_s p(s|r) \log_2 p(s|r).
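As an illustration, equation (3) translates directly into code; the sketch below follows the same conventions as the earlier one (rows of the joint table index stimuli, columns index responses):

def entropy(p):
    # Entropy in bits of a probability vector; zero entries contribute nothing.
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def specific_information(joint, r):
    # Equation (3): i_sp(r) = H[S] - H[S|r] for response column r.
    joint = np.asarray(joint, dtype=float)
    p_s = joint.sum(axis=1)                        # prior p(s)
    p_s_given_r = joint[:, r] / joint[:, r].sum()  # posterior p(s|r)
    return entropy(p_s) - entropy(p_s_given_r)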
The specific information has a straightforward intuitive meaning with regard to the process
of information transmission. Entropy is a measure of the uncertainty in probability distributions
(see Cover and Thomas 1991): broad distributions have large entropy, and narrow distributions
(where the variable in question is more localized to particular values) have little entropy. The
specific information is a difference in entropy of stimulus distributions (equation (3)), and thus
represents the amount that the initial uncertainty of the stimulus is reduced by observing the
response r. Furthermore, specific information is calibrated to mutual information: the specific
information is a direct measure of the amount learned from a given measurement, while the mutual
information is the amount learned from a measurement averaged over all possible responses.
Thus, the specific information serves as an information theoretic measure through which
the significance of different responses can be compared. Responses with a high specific
information reduce the uncertainty in the stimulus the most and are thus most significant to
the system, and responses that do not reduce the amount of uncertainty in the stimulus are less
significant.
3. The significance of a stimulus is different to that of a response
Shannon’s mutual information is a statistical measure of the interdependence of two random
variables, which, in the study of sensory systems, is usually stimulus and response. As a
result of this generality, expressions for information theoretic quantities do not distinguish
between stimulus and response (for example, see equation (1)), meaning that tools applicable
to responses can be applied to stimuli. For example, the specific information of a stimulus can
be calculated by interchanging response and stimulus in equation (3):
    i_sp(s) = H[R] - H[R|s].    (4)
Is the specific information an appropriate measure of the information associated with a
particular stimulus? Below I show that, due to the asymmetry of stimulus and response with
respect to causality, a stimulus that is significant to a system does not have the same properties
as a significant response.
Consider a simple joint probability distribution p(r, s) where there are only two stimuli
and two responses, with the probabilities of each stimulus–response pair given by

           r_1    r_2
    s_1    1/4    1/2
    s_2    1/4    0

Without a measurement, the probability of stimulus s_1 is 3/4 and the probability of s_2 is 1/4, i.e.,
the prior distribution is p(s) = {3/4, 1/4} with an entropy H[S] = 2 - (3/4) log_2 3 = 0.81 bits.
Before addressing the question of stimulus significance, consider the significant responses
of this system. Each response is equally likely, but conveys different amounts of information
about the stimuli. Observation of the first response r_1 designates an ambiguous situation, with
equal probability of either stimulus: p(s|r_1) = {1/2, 1/2}: H[S|r_1] = 1 bit. The second response,
however, clearly designates s_1, since p(s|r_2) = {1, 0}: H[S|r_2] = 0 bits. The information
content of each response is reflected in its specific information, with i_sp(r_1) = -0.19 bits
(uninformative), and i_sp(r_2) = H[S] = 0.81 bits. The total mutual information (equation (1))
can be calculated from the specific informations as their weighted average over the ensemble
of responses (equation (2)): I[R, S] = 0.31 bits.
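These numbers can be checked with the hypothetical sketches introduced above:

joint = np.array([[1/4, 1/2],     # p(s_1, r_1), p(s_1, r_2)
                  [1/4, 0.0]])    # p(s_2, r_1), p(s_2, r_2)

print(entropy(joint.sum(axis=1)))       # H[S]      ~  0.81 bits
print(specific_information(joint, 0))   # i_sp(r_1) ~ -0.19 bits
print(specific_information(joint, 1))   # i_sp(r_2) ~  0.81 bits
print(mutual_information(joint))        # I[R,S]    ~  0.31 bits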
This example was devised to place an intuitive notion of stimulus significance at odds
with the specific information applied to stimuli. With s_2 present, the response is completely
predictable (r_1), whereas the presence of s_1 could result in either response. As a result, the
specific information applied to stimuli is higher for s_2: i_sp(s_1) = 0.08 bits versus i_sp(s_2) = 1 bit.
However, recall that responses encode information about stimuli, and not the reverse.
Though s_2 has the maximal specific information, neither response specifies it: observation of
r_1 gives an equal probability of s_1 and s_2, and observation of r_2 unambiguously designates s_1.
As a result, from an information transmission standpoint, s_1 is encoded more effectively than
s_2, in contrast to their specific informations.
Why does specific information fail to select the more effectively encoded stimulus?
Specific information is largest for those stimuli that have few responses associated with them,
without regard to whether these responses are informative. For example, consider a neuron
(such as the example discussed later in this paper) that is unresponsive (does not fire) to many
stimuli. These stimuli would have a large specific information because ‘not firing’ is the only
response associated with them. But observing a ‘not firing’ response would be very ambiguous
since there are many stimuli that do not cause the neuron to fire. Thus, the specific information
of a stimulus that the neuron does not respond to would be relatively high, but one would not
say that it was well encoded.
4. Stimulus-specific information
I therefore propose that the most informative stimuli are those that cause the most informative
responses, and are thus well encoded by the system. The responses associated with a stimulus
s are given by the conditional distribution p(r|s), and the information conveyed by a particular
response r is given by its specific information i_sp(r). Thus, I propose an information theoretic
measure of stimulus significance called the SSI, given by

    i_ssi(s) \equiv \sum_r p(r|s)\, i_sp(r) = \sum_r p(r|s) \{H[S] - H[S|r]\}.    (5)

Since the specific information i_sp(r) is the reduction in uncertainty of the stimulus gained by
the particular observation r ∈ R, the SSI i_ssi(s) is the average reduction of uncertainty gained
from one measurement given the stimulus s ∈ S.
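In code, equation (5) is a one-line weighting of the specific information; the sketch below reuses the hypothetical specific_information() helper defined earlier:

def stimulus_specific_information(joint, s):
    # Equation (5): i_ssi(s) is the p(r|s)-weighted average of i_sp(r).
    joint = np.asarray(joint, dtype=float)
    p_r_given_s = joint[s, :] / joint[s, :].sum()
    return sum(p * specific_information(joint, r)
               for r, p in enumerate(p_r_given_s) if p > 0)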
Figure 1. Mutual information between full-field flicker stimuli and neuronal spikes. (A) A three-
frame stimulus word [-1, -1, +1] and a one-frame response word [1, 1] are associated at a given
latency λ between the end of the stimulus word and the beginning of the response word. (B) Mutual
information is calculated between stimulus and response given a choice of L_S = 10, L_R = 1, and
Δt = 1/2, and is shown as a function of latency λ.
When calculated using the example of the last section, we see that i_ssi(s_1) = 0.48 bits, and
i_ssi(s_2) = -0.19 bits. Like the specific information, the weighted average of i_ssi(s) over the
stimulus ensemble s ∈ S gives the mutual information, and the value of the SSI is calibrated to
the mutual information and specific information. For example, the value of i_ssi(s_1) is consistent
with the fact that an observation completely determines the stimulus (giving close to 1 bit of
information) half of the time, and otherwise is not informative. The SSI of s_2 demonstrates
that the only possible measurement that can result when s_2 is presented is uninformative.
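Numerically, with the hypothetical sketches and the 2 × 2 joint table from earlier:

print(stimulus_specific_information(joint, 0))   # i_ssi(s_1) ~  0.48 bits
print(stimulus_specific_information(joint, 1))   # i_ssi(s_2) ~ -0.19 bits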
5. Simulated visual neurons
To demonstrate the use of SSI, I perform an information theoretic calculation on simulated
visual neurons. Data were generated using a model proposed by Keat et al (2001), which
accurately reproduces the timing of single action potentials as well as their statistical
distribution over multiple trials. Using model parameters that simulate the behaviour of cat
LGN neurons, spike trains were generated in response to random full-field flicker stimuli
presented at 128 Hz (see methods).
Information quantities between the full-field flicker stimulus S and resulting spike trains
R can be calculated as a function of several parameters: the length of the stimulus word L_S,
the length and time resolution of the response word (L_R and Δt), and the latency λ between
them (see figure 1(A)). This method of calculating information is similar to those of Liu et al
(2001). For this paper, I will use L_S = 10 frames, L_R = 1 frame, and Δt = 0.5 frames, where
L_S is chosen to be as large as possible given the limited number of data (see methods), and
L_R and Δt are sufficiently representative of results found with longer response words (data
not shown). General questions about the dependence of mutual information on stimulus and
response word length and resolution are addressed elsewhere (Liu et al 2001).
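Schematically, the stimulus–response pairing might be tabulated as below (a sketch with invented names; for simplicity the response is binned at one bin per frame, whereas the paper uses Δt = 0.5 frames):

def word_pairs(frames, spike_bins, L_S=10, L_R=1, lag_frames=2):
    # Pair each L_S-frame stimulus word with the L_R-frame response word that
    # begins lag_frames later; at 7.8 ms per frame, lag_frames ~ 2 roughly
    # corresponds to the -16.5 ms latency discussed below.
    pairs = []
    for t in range(L_S, len(frames) - lag_frames - L_R + 1):
        stim_word = tuple(frames[t - L_S:t])    # stimulus word ending at time t
        resp_word = tuple(spike_bins[t + lag_frames:t + lag_frames + L_R])
        pairs.append((stim_word, resp_word))
    return pairs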
The mutual information I[R, S] is shown in figure 1(B) as a function of latency λ. It
peaks at a latency of -16.5 ms, when the average response carries 0.37 bits about the stimulus.
Since I[R, S] can be decomposed into SSI (see equation (2)), the total mutual information can
be thought of as an average that includes both stimuli that are well encoded and those that are
not. To determine the contribution of each stimulus to the total mutual information, the SSI is
calculated for each of the 2^10 = 1024 stimuli.
Figure 2. Information measures of particular stimuli. (A) SSI i_ssi(s) and specific information
i_sp(s) are calculated for the 2^10 stimuli at a latency λ of -16.5 ms and their distributions are
displayed as histograms in the left and right frames respectively. Bins with greater than 10 elements
are cut off to focus on the few outlying stimuli. The five stimuli with the largest SSI are labelled
A-E and shown in the inset of the left frame. Two of them, A and C, have the lowest specific
information (right frame). (B) To identify the features of the stimulus ensemble that are well
encoded, an average stimulus is calculated by weighting each stimulus by its SSI (circles). This
SSI-weighted average stimulus is approximately proportional to the spike-triggered average stimulus
of the neuron up to a scale factor (solid curve). Note that the spike-triggered average stimulus is
scaled so it is directly comparable to the units of average SSI.
The left panel of figure 2(A) shows a histogram of the SSI distribution of the stimulus
ensemble. The minimum SSI was 0.085 bits (leaving the lowest bin between 0 and 0.05 bits
empty). However, over half the stimuli (599 out of 1024) have SSIs in the next lowest bin
(between 0.05 and 0.1 bits). On the other extreme, the top five stimuli are well separated from
the rest of the distribution, and are shown in the inset. Note that the ‘best’ stimulus (A) is a
simple off-to-on transition which occurs at -47.8 ms. Other stimuli with lower SSI have a
slightly different off-to-on latency (C and D), or an ‘on’ frame instead of an ‘off’ frame at
large latencies (B and E).
For comparison, the right panel of figure 2(A) shows the distribution of specific
informations for the same stimuli (i_sp(s), given by equation (4)). Specific information gives
nearly the opposite classification of stimuli as SSI: 486 out of 1024 stimuli are clustered in
the largest occupied bin (0.74 bits). At the same time, stimuli with some of the highest SSIs
have some of the lowest specific informations; stimuli A and C have the two lowest specific
informations (figure 2(A), right), and stimuli B, D, and E are lost in the broader distribution.
As explained above, this is due to the fact that many stimuli rarely elicit a spike, meaning that
they are associated with only one response (‘no spike’), and have high specific information.
At the same time, stimuli that often result in a spike cause an uncertain response since they
may or may not elicit a spike.
Of course, the five stimuli with the largest SSIs are only presented 0.5% of the time,
and almost 90% of the stimuli have an SSI less than 1 bit but account for almost half of
the information conveyed. To determine the important features in the stimulus ensemble
that lead to informative responses, I calculate an average stimulus based on each stimulus’s
fractional contribution to the mutual information, p(s) i_ssi(s)/I[R, S]. This ‘SSI-weighted
average’ stimulus is shown as circles in figure 2(B), and is compared to a more common
measure of significant stimuli: the spike-triggered average stimulus (STA), shown as a solid
curve. Note that these results apply for a given latency λ = -16.5 ms, meaning that since the
SSI-weighted average has a resolution of one frame (7.8 ms), there are ten discrete points in
this average (L_S = 10). The SSI-weighted average calculated for other latencies is in similar
agreement to the STA (data not shown).
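A sketch of this weighting (hypothetical names; stim_words would be an array of the 1024 ten-frame words, and p_s, i_ssi, and mutual_info their estimated quantities):

def ssi_weighted_average(stim_words, p_s, i_ssi, mutual_info):
    # Each stimulus contributes its fractional share of the mutual
    # information, p(s) * i_ssi(s) / I[R,S]; the weights sum to one.
    weights = p_s * i_ssi / mutual_info
    return weights @ stim_words               # average over the ensemble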
The close agreement between the spike-triggered and SSI-weighted average stimulus
results from a combination of two factors:

(1) the ability of particular stimuli to evoke spikes in the Keat model is proportional to a linear
convolution with the stimulus (see the methods for more details); and
(2) the majority of information in the responses of this neuron is carried by spikes (found by
calculating the specific information of responses i_sp(r) of equation (3)—data not shown).

As a result of these two conditions, the SSI of a given stimulus is roughly proportional to the
number of spikes that were associated with it, i.e., i_ssi(s) ≈ p(spike|s) i_sp(spike) ∝ p(spike|s).
Thus, each stimulus contributes to the spike-triggered and SSI-weighted averages in proportion
to the number of spikes that each stimulus evokes, resulting in the same shape shown in
figure 2(B).
6. Specific surprise identifies a different aspect of stimulus significance
Thus, the SSI gives an appropriate characterization of the ‘best encoded’ stimuli, in stark
contrast to the specific information. However, as mentioned earlier, there are many possible
stimulus-specific decompositions of mutual information (equation (2)), some of which might
provide alternative but reasonable classifications. For example, in their discussion of a measure
of the information associated with individual symbols (i.e., stimuli and responses), DeWeese
and Meister (1999) consider the ‘specific surprise’, given by (when applied to stimuli)

    i_sur(s) = \sum_r p(r|s) \log_2 \frac{p(r|s)}{p(r)}.    (6)
Note that this is the Kullback–Leibler distance between the conditional probability p(r|s) and
the marginal distribution p(r). Below, I demonstrate that the specific surprise provides an
alternative measure of stimulus significance, but it highlights other properties of the stimulus
in a way that lacks the intuitive meaning of the SSI as a measure of the information associated
with a stimulus.
Figure 3. The specific surprise. (A) The distribution of specific surprises over the stimulus
ensemble is calculated for the same visual neuron as in figure 2. Though the distribution is very
similar to the SSI distribution (figure 2(A), left)—with the same top five stimuli, there are subtle
differences, such as their order (A-B-D-E-C). (B) The specific-surprise-weighted average stimulus
(filled circles) is compared to the spike-triggered average stimulus (solid curve, scaled for the best
fit) and the best fit of the SSI-weighted average from figure 2(B) (dotted curve).

Specific surprise, in the form of equation (6), compares the marginal distribution p(r) to
the conditional distribution p(r|s). As discussed above in relation to specific information, such
a comparison is not intuitively meaningful with respect to causality. However, using Bayes’
law, specific surprise can be re-expressed as

    i_sur(s) = \sum_r p(r|s) \left[ \log_2 \frac{1}{p(s)} - \log_2 \frac{1}{p(s|r)} \right].    (7)
The logarithm of the reciprocal probability of a stimulus is often referred to as its ‘surprise’,
since rarer stimuli are more ‘surprising’. In this form, specific surprise has a causal meaning:
the reduction in surprise of a particular stimulus gained from each response, averaged over
all responses associated with that stimulus. Thus, whereas the SSI weighs each response r
by its effect over the stimulus ensemble (through the change of H[S] to H[S|r]), the specific
surprise measures the effect of the response on just the stimulus in question (through the
change of p(s) to p(s|r)).
How does using a different response weight change the evaluation of stimulus significance?
In figure 3(A), the specific surprise is calculated for the simulated visual neuron. It provides
classifications of the stimuli similar to the SSI, and identifies the same top five stimuli, though
with a different order (A-B-D-E-C). The subtle differences between the two measures in
this example are reflected in a comparison between the SSI-weighted average of figure 2(B)
and the specific-surprise-weighted average, calculated in the same way, shown in figure 3(B).
The specific-surprise-weighted average is shown as solid circles, and the best scaling of the
spike-triggered average is shown as a solid curve. In the meantime, since the SSI-weighted
average fit the STA almost exactly (figure 2(B)), the STA scaled to the SSI-weighted average
is shown as a dotted curve. Note that the surprise-weighted average stimulus is different
in shape from the STA, and it has a smaller magnitude compared to the SSI-weighted
average, meaning that larger specific surprise does not as closely correspond to particular
features of the stimulus.
The specific surprise and the SSI behave more distinctly in the simple example of the 2 × 2
joint probability distribution described earlier in this paper. In this case, the specific surprise is
higher for the second stimulus s_2 (which only had one ambiguous response r_1 associated with
it): i_sur(s_1) = 0.08 bits and i_sur(s_2) = 1 bit. This contrasts with the intuitive notion that s_1 is
better encoded than s_2, since s_2 is never clearly designated by a response. Does this mean that
specific surprise is a bad measure of a well-encoded stimulus?
In fact, it is simply a different measure with a different interpretation: while s_1 is
unambiguously denoted by an observation of r_2 (changing its probability: 3/4 → 1), the
measurement of r_1 decreases its likelihood (3/4 → 1/2), nearly cancelling the change in surprise:
i_sur(s_1) = 0.08 = (1/3)(-0.59) + (2/3)(0.42) bits. In the meantime, the specific surprise of s_2 is
relatively large (i_sur(s_2) = 1 bit) because the only response associated with s_2 is r_1, and it
leads to an increase in the probability of observing s_2: 1/4 → 1/2. Thus, each possible response
contributes to the specific surprise of a particular stimulus s in relation to how much its
observation increases the likelihood of s.
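Equation (6) can be checked on the same 2 × 2 table with a short sketch (hypothetical code, reusing the joint table from earlier):

def specific_surprise(joint, s):
    # Equation (6): the Kullback-Leibler distance between p(r|s) and p(r).
    joint = np.asarray(joint, dtype=float)
    p_r = joint.sum(axis=0)
    p_r_given_s = joint[s, :] / joint[s, :].sum()
    nz = p_r_given_s > 0
    return np.sum(p_r_given_s[nz] * np.log2(p_r_given_s[nz] / p_r[nz]))

print(specific_surprise(joint, 0))   # i_sur(s_1) ~ 0.08 bits
print(specific_surprise(joint, 1))   # i_sur(s_2) = 1 bit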
Why do the specific surprise and the SSI give similar results with the visual neuron? For
a response to be informative (i.e., have a large specific information i_sp(r)), the entropy of
its conditional distribution p(s|r) must be less than that of the prior p(s), meaning that the
probability of some stimuli must be increased. As a result, in many cases, an informative
response will result in the largest changes in probability of the stimuli that are best encoded,
and the specific surprise and the SSI will be in qualitative agreement. However, as the simple
2 × 2 example made clear, this is not always the case.

So, the specific surprise and the SSI are measuring fundamentally different aspects of
stimulus significance. Since the SSI is explicitly measuring an average difference in entropies,
it has intuitive meaning with regard to information transmission. As a result, the SSI in both
the simple 2 × 2 example and the visual neuron fulfils this intuition of what a well-encoded
stimulus is, in contrast to the specific surprise.
7. Non-linear neurons
In the example of the visual neuron, the SSI correctly identified the stimuli significant to the
neuron, in a way that was validated by linear analysis of the neuron, i.e., the spike-triggered
average. In this way, the agreement between the spike-triggered and SSI-weighted average
stimulus demonstrated that the neuron is essentially linear. Thus, while this example validates
the performance of the SSI, it makes clear that linear analysis would have been sufficient in
identifying the significant stimuli in this case.
Information theory is most useful in studying neurons with non-linear properties, or those
that do not encode the bulk of their information in single spikes. To demonstrate the potential
utility of the SSI, the visual neuron model discussed above is modified to exhibit non-linear
behaviour. This modified model retains the same spiking properties, but is designed such
that it has, on average, the same response to a stimulus as it does to the opposite stimulus
(see the methods for more details). For example, an on-to-off transition will now evoke the
same number of spikes as an off-to-on transition with the same latency. As a result, the spike-
triggered average stimulus of this neuron is zero, since a spike is just as likely to be evoked by a
particular stimulus as its opposite. In this case, important stimuli can only be identified through
higher-order statistics, such as the spike-triggered covariance used by Arcas et al (2000).

Of course, differences in higher-order statistics are detected by information theoretic
measures, and the SSI distribution for the modified non-linear neuron is shown in figure 4
(calculated in the same way as the distribution of figure 2(A)). The six stimuli with the largest
SSI are shown in the inset, demonstrating the pairing of each stimulus with its opposite.

Figure 4. A non-linear neuron. The distribution of the SSI for the 2^10 stimuli is calculated for a
non-linear neuron whose spike-triggered average is zero. This neuron is designed such that opposite
stimuli evoke the same response, resulting in pairs of stimuli that have the same SSI (inset).
8. Conclusions
I have shown that the SSI is an appropriate and reliable measure of the information associated
with a particular stimulus. As with other information measures, the SSI is calculated without
particular assumptions about the coding scheme, and is robust to non-linearities in the system
being studied. As a result, the SSI is particularly useful in identifying the stimuli that are
significant to a neuron where linear analyses break down. The SSI is also useful in cases
where neural responses other than individual spikes carry information, though such examples
were not considered in this paper.
Unfortunately, unlike the specific information of a response proposed by DeWeese and
Meister (1999), the SSI does not have a mathematical quality that shows it to be the only
possible measure of information associated with a stimulus. As previously discussed, there
are many possible stimulus-specific decompositions of mutual information; as a result, I have
chosen a decomposition that has an intuitive meaning with respect to information transmission,
and furthermore have shown it gives expected results in both simple constructed examples and
realistic examples using neuronal data.
The SSI has the same drawbacks as other information measures: a significant number
of data are required in order to properly estimate the underlying probability distributions
needed by such calculations. However, since the same data used to calculate Shannon’s
mutual information can also be used for the specific information of responses and the SSI of
stimuli, these specific measures can extend the applicability and use of information studies in
neuroscience.
Methods
Simulated visual neurons
Data for the information calculations performed on the example visual neuron in this paper
were generated using a model of visual neurons proposed by Keat et al (2001). This
phenomenological model is able to reproduce both the precise spike timing and variability
of observed cat LGN neurons, and thus generated ‘realistic’ neuronal data for the purpose of
evaluating information quantities.
The simulated neuron is presented with either black or white frames (with equal probability)
at 128 Hz, and generates a neural spike train. The basis of the neural response is a linear
convolution between the stimulus s(t) and a linear kernel K(τ):

    g(t) = \int_{-\infty}^{0} d\tau\, K(\tau)\, s(t - \tau).
This function of time g(t) is modified by adding correlated Gaussian noise and a term that
represents spike-dependent effects, which accounts for a neuronal refractory period. Spikes
occur when the total of g(t), noise, and spike-dependent effects exceeds a threshold.
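The generation step can be caricatured as follows; this is only a loose schematic under invented parameters (the kernel shape, noise level, and threshold are placeholders rather than the fitted values of Keat et al 2001, the noise is white rather than correlated, and the spike-dependent refractory term is omitted):

import numpy as np

rng = np.random.default_rng(0)
n_frames = 10_000
stim = rng.choice([-1.0, 1.0], size=n_frames)    # random black/white frames
tau = np.arange(20)
kernel = np.exp(-tau / 3.0) - 0.5 * np.exp(-tau / 6.0)    # placeholder biphasic kernel
g = np.convolve(stim, kernel, mode="full")[:n_frames]     # linear drive g(t)
noise = rng.normal(0.0, 0.1, size=n_frames)      # stands in for correlated noise
spikes = (g + noise) > 0.5                       # threshold crossing -> spike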
For the information calculations, a large data set consisting of 32 million stimulus frames
(representing roughly 70 h in real time) was used. This copious data set allowed for an
information analysis of long stimulus and response words simultaneously, though information
theoretic analysis is possible with far fewer data (see Liu et al 2001).
Non-linear neurons
To generate a neuron with a spike-triggered average of zero, the same model as above was
used, with one modification: the linear convolution g(t) was squared before plugging it into
the rest of the model. As a result, opposite stimuli, which result in g_0 and -g_0 in the linear
model, now have the same effect on the response: (g_0)^2. Other parameters of the model were
scaled so that this model had the same overall spike rate.
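In the schematic above, this modification amounts to one line, with a placeholder threshold standing in for the rescaled parameters:

spikes_nonlinear = (g**2 + noise) > 0.25   # opposite stimuli now drive identically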
Acknowledgments
I am grateful to Mark Goldman for extensive input and comments on this manuscript. This
work was supported by an NSF Postdoctoral Fellowship in Biological Informatics.
References
Arcas B A Y, Fairhall A L and Bialek W 2000 What can a single neuron compute? Adv. Neural Inform. Process. Syst.
13 75–81
Borst A and Theunissen F E 1999 Information theory and neural coding Nature Neurosci. 2 947–57
Cover T M and Thomas J A 1991 Elements of Information Theory (New York: Wiley)
Dan Y, Alonso J M, Usrey W M and Reid R C 1998 Coding of visual information by precisely correlated spikes in
the lateral geniculate nucleus Nature Neurosci. 1 501–7
DeWeese M R and Meister M 1999 How to measure the information gained from one symbol Network: Comput.
Neural Syst. 10 325–40
Keat J, Reinagel P, Reid R C and Meister M 2001 Predicting every spike: a model for the responses of visual neurons
Neuron 30 803–17
Liu R C, Tzonev S, Rebrick S and Miller K D 2001 Variability and information in a neural code of the cat lateral
geniculate nucleus J. Neurophysiol. 86 2789–806
Reinagel P and Reid R C 2000 Temporal coding of visual information in the thalamus J. Neurosci. 20 5392–400
Theunissen F, Roddey J C, Stufflebeam S, Clague H and Miller J P 1995 Information theoretic analysis of dynamical
encoding by four identified primary sensory interneurons in the cricket cercal system J. Neurophysiol. 75 1345–64
... A fundamental question on neurons endowed with such tuning curves relates to the relationship between the tuning curve and the information transfer profile of the neuron across feature values. Although this relationship has been explored in neural responses across different sensory modalities (Bezzi, Samengo, Leutgeb, & Mizumori, 2002;Butts, 2003;Butts & Goldman, 2006;DeWeese & Meister, 1999;Montgomery & Wehr, 2010), the question on the relationship between spatial information transfer and spatial tuning curve within the place field of hippocampal place cells has not been quantitatively assessed. Neurons in the hippocampus receive spatial information about a given arena and a substantial fraction of them respond to different spatial locations in the same arena (Andersen, Morris, Amaral, Bliss, & O'Keefe, 2006;Moser, Kropff, & Moser, 2008;Moser, Moser, & McNaughton, 2017;Moser, Rowland, & Moser, 2015;O'Keefe, 1976;O'Keefe & Dostrovsky, 1971). ...
... An ideal information metric that fulfills this requirement is the stimulus-specific information (SSI), a measure that was specifically defined to convey the amount of information that the responses of a neuron convey about a particular stimulus. SSI is defined as the average specific information across all the neural firing rates that are elicited when the animal traverses a particular spatial location, with specific information referring to the information that a particular firing rate response provides about which spatial location was being traversed (Butts, 2003;Butts & Goldman, 2006;DeWeese & Meister, 1999;Montgomery & Wehr, 2010). We employed SSI as the principal metric to assess the relationship between rate-based spatial tuning curves and spatial information transfer. ...
... To calculate the SSI, the spatial stimulus and the firing rate response were segregated into 80 and 40 bins, respectively. The SSI was calculated using the expression given below (Butts, 2003;Butts & Goldman, 2006;Montgomery & Wehr, 2010): ...
Article
Full-text available
The relationship between the feature-tuning curve and information transfer profile of individual neurons provides vital insights about neural encoding. However, the relationship between the spatial tuning curve and spatial information transfer of hippocampal place cells remains unexplored. Here, employing a stochastic search procedure spanning thousands of models, we arrived at 127 conductance-based place-cell models that exhibited signature electrophysiological characteristics and sharp spatial tuning, with parametric values that exhibited neither clustering nor strong pairwise correlations. We introduced trial-to-trial variability in responses and computed model tuning curves and information transfer profiles, using stimulus-specific (SSI) and mutual (MI) information metrics, across locations within the place field. We found spatial information transfer to be heterogeneous across models, but to reduce consistently with increasing levels of variability. Importantly, whereas reliable low-variability responses implied that maximal information transfer occurred at high-slope regions of the tuning curve, increase in variability resulted in maximal transfer occurring at the peak-firing location in a subset of models. Moreover, experience-dependent asymmetry in place-field firing introduced asymmetries in the information transfer computed through MI, but not SSI, and the impact of activity-dependent variability on information transfer was minimal compared to activity-independent variability. We unveiled ion-channel degeneracy in the regulation of spatial information transfer, and demonstrated critical roles for N-methyl-d-aspartate receptors, transient potassium and dendritic sodium channels in regulating information transfer. Our results demonstrate that trial-to-trial variability, tuning-curve shape and biological heterogeneities critically regulate the relationship between the spatial tuning curve and spatial information transfer in hippocampal place cells.
... By 27 observing the activity of individual neurons, it became evident that different stimulus frequencies did not evoke equal 28 responses (Fig. 1F), suggesting that fS1 neurons carry information about vibration frequency in the number of spikes 29 they fire. 30 31 To explicitly test the latter hypothesis, we computed the stimulus-specific information (SSI) as a measure of frequency 32 encoding by the evoked Δf/f 0 responses (Butts, 2003;Butts and Goldman, 2006). The majority of responding neurons 33 (>85% of 725 cells, 19 mice, 42 fields of view) were found to be informative about frequency (i.e. had responses with 34 significant SSI, p<0.01, permutation test), which is in stark contrast to the 3% of neurons in the primate S1 responding to 35 fingertip vibrations in the same frequency range (Harvey et al., 2013). ...
... 30 31 To explicitly test the latter hypothesis, we computed the stimulus-specific information (SSI) as a measure of frequency 32 encoding by the evoked Δf/f 0 responses (Butts, 2003;Butts and Goldman, 2006). The majority of responding neurons 33 (>85% of 725 cells, 19 mice, 42 fields of view) were found to be informative about frequency (i.e. had responses with 34 significant SSI, p<0.01, permutation test), which is in stark contrast to the 3% of neurons in the primate S1 responding to 35 fingertip vibrations in the same frequency range (Harvey et al., 2013). To characterize their frequency dependent tuning, 36 mean responses were fit with a descriptive function ( Fig. 2A-H). ...
... Stimulus specific information 42 43 As a measure of frequency ( ) encoding by a neuron's evoked Δf/f 0 responses ( ) we calculated the stimulus specific 44 information ( ) according to the method of Butts and Goldman (Butts, 2003;Butts and Goldman, 2006). The 45 ...
Preprint
Sensing vibrations that propagate through solid substrates conveys fundamental information about moving objects and other nearby dynamic events. Here we report that neurons responsive to substrate vibrations applied to the mouse forelimb reveal a new way of representing frequency information in the primary somatosensory cortex (S1). In contrast to vibrotactile stimulation of primate glabrous skin, which produces temporally entrained spiking and frequency independent firing rates, we found that mouse S1 neurons rely on a different coding scheme: their spike rates are conspicuously tuned to a preferred frequency of the stimulus. Histology, peripheral nerve block and optogenetic tagging experiments furthermore reveal that these responses are associated with the activation of mechanoreceptors located in deep subdermal tissue of the distal forelimb. We conclude that the encoding of frequency information of substrate-borne vibrations in the mouse S1 might be analogous to the representation of pitch of airborne sound in auditory cortex.
... A fundamental question on neurons endowed with such tuning curves relates to the relationship between the turning curve and the sensory information transfer profile of the neuron across feature values. Although this relationship has been explored in neural responses across different sensory modalities (Bezzi, Samengo, Leutgeb, & Mizumori, 2002;Butts, 2003;Butts & Goldman, 2006;DeWeese & Meister, 1999;Montgomery & Wehr, 2010), the question on the relationship between spatial information transfer and spatial tuning curve within the place field of hippocampal place cells has not been quantitatively assessed. ...
... To calculate the SSI the spatial stimulus and the firing rate response were segregated into 80 and 40 bins, respectively. The SSI was calculated using the expression given below (Butts, 2003;Butts & Goldman, 2006;Montgomery & Wehr, 2010): ...
... The copyright holder for this preprint this version posted September 19, 2020. ; https://doi.org/10.1101/2020.09.17.301747 doi: bioRxiv preprint particular firing rate response (Butts, 2003;Butts & Goldman, 2006;Montgomery & Wehr, 2010). Before employing !" ! for computing the SSI, bias in !" ! ...
Preprint
The relationship between the feature-tuning curve and information transfer profile of individual neurons provides vital insights about neural encoding. However, the relationship between the spatial tuning curve and spatial information transfer of hippocampal place cells remains unexplored. Here, employing a stochastic search procedure spanning thousands of models, we arrived at 127 conductance-based place-cell models that exhibited signature electrophysiological characteristics and sharp spatial tuning, with parametric values that exhibited neither clustering nor strong pairwise correlations. We introduced trial-to-trial variability in responses and computed model tuning curves and information transfer profiles, using stimulus-specific (SSI) and mutual (MI) information metrics, across locations within the place field. We found spatial information transfer to be heterogeneous across models, but to reduce consistently with increasing degrees of variability. Importantly, whereas reliable low-variability responses implied that maximal information transfer occurred at high-slope regions of the tuning curve, increase in variability resulted in maximal transfer occurring at the peak-firing location in a subset of models. Moreover, experience-dependent asymmetry in place-field firing introduced asymmetries in the information transfer computed through MI, but not SSI, and the impact of activity-dependent variability on information transfer was minimal compared to activity-independent variability. Biophysically, we unveiled a many-to-one relationship between different ion channels and information transfer, and demonstrated critical roles for N-methyl-D-aspartate receptors, transient potassium and dendritic sodium channels in regulating information transfer. Our results emphasize the need to account for trial-to-trial variability, tuning-curve shape and biological heterogeneities while assessing information transfer, and demonstrate ion-channel degeneracy in the regulation of spatial information transfer.
... Here, the MI is close to zero for many time bins and shows peaks in time bins that are nonoverlapping between neurons, which means when averaged, the mean MI will be at a low value (as in Fig 8A, red). Second, we decomposed the MI into stimulus-specific information (I SSI ; [54][55][56]), which measures how much information about each stimulus is provided by the response. Note that the conventionally computed MI is the weighted average of I SSI across all stimuli. ...
... Information theoretic analyses. We used stimulus-specific information (I SSI ) [54][55][56] to estimate the amount of information that each recorded neuron provided about each stimulus. We also computed the weighted average of I SSI across stimuli to determine overall information content, which is conventionally referred to as the MI between the stimulus and response. ...
Article
Full-text available
Early in auditory processing, neural responses faithfully reflect acoustic input. At higher stages of auditory processing, however, neurons become selective for particular call types, eventually leading to specialized regions of cortex that preferentially process calls at the highest auditory processing stages. We previously proposed that an intermediate step in how nonselective responses are transformed into call-selective responses is the detection of informative call features. But how neural selectivity for informative call features emerges from nonselective inputs, whether feature selectivity gradually emerges over the processing hierarchy, and how stimulus information is represented in nonselective and feature-selective populations remain open question. In this study, using unanesthetized guinea pigs (GPs), a highly vocal and social rodent, as an animal model, we characterized the neural representation of calls in 3 auditory processing stages—the thalamus (ventral medial geniculate body (vMGB)), and thalamorecipient (L4) and superficial layers (L2/3) of primary auditory cortex (A1). We found that neurons in vMGB and A1 L4 did not exhibit call-selective responses and responded throughout the call durations. However, A1 L2/3 neurons showed high call selectivity with about a third of neurons responding to only 1 or 2 call types. These A1 L2/3 neurons only responded to restricted portions of calls suggesting that they were highly selective for call features. Receptive fields of these A1 L2/3 neurons showed complex spectrotemporal structures that could underlie their high call feature selectivity. Information theoretic analysis revealed that in A1 L4, stimulus information was distributed over the population and was spread out over the call durations. In contrast, in A1 L2/3, individual neurons showed brief bursts of high stimulus-specific information and conveyed high levels of information per spike. These data demonstrate that a transformation in the neural representation of calls occurs between A1 L4 and A1 L2/3, leading to the emergence of a feature-based representation of calls in A1 L2/3. Our data thus suggest that observed cortical specializations for call processing emerge in A1 and set the stage for further mechanistic studies.
... Here, the MI is close to zero for many time bins, and shows peaks in time bins that are non-overlapping between neurons, which means when averaged, the mean MI will be at a low value (as in Fig 8A, red). Second, we decomposed the MI into stimulus-specific information (ISSI; [54][55][56]), which measures how much information about each stimulus is provided by the response. ...
... We used stimulus-specific information (ISSI) [54][55][56] to estimate the amount of information that each recorded neuron provided about each stimulus. We also computed the weighted average of ISSI across stimuli to determine overall information content, 890 which is conventionally referred to as the mutual information (MI) between the stimulus and response. ...
Preprint
Full-text available
Early in auditory processing, neural responses faithfully reflect acoustic input. At higher stages of auditory processing, however, neurons become selective for particular call types, eventually leading to specialized regions of cortex that preferentially process calls at the highest auditory processing stages. We previously proposed that an intermediate step in how non-selective responses are transformed into call-selective responses is the detection of informative call features. But how neural selectivity for informative call features emerges from non-selective inputs, whether feature selectivity gradually emerges over the processing hierarchy, and how stimulus information is represented in non-selective and feature-selective populations remain open questions. In this study, using unanesthetized guinea pigs, a highly vocal and social rodent, as an animal model, we characterized the neural representation of calls in three auditory processing stages: the thalamus (vMGB), and thalamorecipient (L4) and superficial layers (L2/3) of primary auditory cortex (A1). We found that neurons in vMGB and A1 L4 did not exhibit call-selective responses and responded throughout the call durations. However, A1 L2/3 neurons showed high call-selectivity with about a third of neurons responding to only one or two call types. These A1 L2/3 neurons only responded to restricted portions of calls suggesting that they were highly selective for call features. Receptive fields of these A1 L2/3 neurons showed complex spectrotemporal structures that could underlie their high call feature selectivity. Information theoretic analysis revealed that in A1 L4 stimulus information was distributed over the population and was spread out over the call durations. In contrast, in A1 L2/3, individual neurons showed brief bursts of high stimulus-specific information, and conveyed high levels of information per spike. These data demonstrate that a transformation in the neural representation of calls occurs between A1 L4 and A1 L2/3, leading to the emergence of a feature-based representation of calls in A1 L2/3. Our data thus suggest that observed cortical specializations for call processing emerge in A1, and set the stage for further mechanistic studies.
... Since S i is essentially S i shifted in time, we can assume that H(S i ) is equal to H(S i ). Hence, the aforementioned difference becomes the mutual information I( Figure 3.c), which removes from the information contained in S i the information contained in S i knowing already A i [29,14,21,6,11]. Note that the results involving the mutual information I(A i ; S i ) are consistent with those obtained with the Kraskov-Stögbauer-Grassberger (KSG) method for estimating the mutual information for time series with continuous data [20], which was used in addition to [5] to verify our findings. ...
Preprint
Full-text available
The mastery of skills such as playing tennis or balancing an inverted pendulum implies a very accurate control of movements to achieve the task goals. Traditional accounts of skilled action control that focus on either routinization or perceptual control make opposite predictions about the ways we achieve mastery. The notion of routinization emphasizes the decrease of the variance of our actions, whereas the notion of perceptual control emphasizes the decrease of the variance of the states we visit, but not of the actions we execute. Here, we studied how participants managed control tasks of varying levels of complexity, which consisted in controlling inverted pendulums of different lengths. We used information-theoretic measures to compare the predictions of alternative theoretic accounts that focus on routinization and perceptual control, respectively. Our results indicate that the successful performance of the control task strongly correlates with the decrease of state variability and the increase of action variability. As postulated by perceptual control theory, the mastery of skills consists in achieving stable control goals by flexible means.
... There are two common, sensible quantities we can define when we want to consider the information overlap between an random variable and an outcome: the information gain, also known as specific information and the surprise (DeWeese and Meister, 1999;Butts, 2003). These two quantities are usually defined separately in the cognitive sciences and neuroscience (Williams, 2011); however, we can unify them after relaxing the symmetry of the mutual information as done above: ...
Preprint
Full-text available
Information theory is of importance to machine learning, but the notation for information-theoretic quantities is sometimes opaque. The right notation can convey valuable intuitions and concisely express new ideas. We propose such a notation for machine learning users and expand it to include information-theoretic quantities between events (outcomes) and random variables. We apply this notation to a popular information-theoretic acquisition function in Bayesian active learning which selects the most informative (unlabelled) samples to be labelled by an expert. We demonstrate the value of our notation when extending the acquisition function to the core-set problem, which consists of selecting the most informative samples \emph{given} the labels.
... However, the MI does not capture how the fidelity of the encoding depends on the stimulus. Some authors have proposed stimulus-specific variants of MI [111,112,16,113]. ...
Preprint
Full-text available
A central goal of neuroscience is to understand the representations formed by brain activity patterns and their connection to behavior. The classical approach is to investigate how individual neurons encode the stimuli and how their tuning determines the fidelity of the neural representation. Tuning analyses often use the Fisher information to characterize the sensitivity of neural responses to small changes of the stimulus. In recent decades, measurements of large populations of neurons have motivated a complementary approach, which focuses on the information available to linear decoders. The decodable information is captured by the geometry of the representational patterns in the multivariate response space. Here we review neural tuning and representational geometry with the goal of clarifying the relationship between them. The tuning induces the geometry, but different sets of tuned neurons can induce the same geometry. The geometry determines the Fisher information, the mutual information, and the behavioral performance of an ideal observer in a range of psychophysical tasks. We argue that future studies can benefit from considering both tuning and geometry to understand neural codes and reveal the connections between stimulus, brain activity, and behavior.
... where S is the discrete random variable formed by the set of the animal's spatial locations s, and R is the discrete random variable formed by the set of possible spike count responses r [39][40][41] . This was corrected for estimation bias by subtracting an analytical estimate of the bias 42 . ...
Article
Full-text available
By investigating the topology of neuronal co-activity, we found that mnemonic information spans multiple operational axes in the mouse hippocampus network. High-activity principal cells form the core of each memory along a first axis, segregating spatial contexts and novelty. Low-activity cells join co-activity motifs across behavioral events and enable their crosstalk along two other axes. This reveals an organizational principle for continuous integration and interaction of hippocampal memories.
Article
Full-text available
Correlated firing among neurons is widespread in the nervous system. Precisely correlated spiking, occurring on a millisecond time scale, has recently been observed among neurons in the lateral geniculate nucleus with overlapping receptive fields. We have used an information-theoretic analysis to examine the role of these correlations in visual coding. Considerably more information can be extracted from two cells if temporal correlations between them are considered. The percentage increase in information depends on the degree of correlation; the average increase is approximately 20% for strongly correlated pairs. Thus, precise temporal correlation could be used as an additional information channel from thalamus to visual cortex.
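One way to quantify such an effect (a sketch, not the authors' analysis) is to compare the information carried by the joint response of the pair with the sum of the informations carried by each cell alone:

import numpy as np

def mi(p_xy):
    # Mutual information (bits) of a 2-D joint distribution.
    px, py = p_xy.sum(axis=1), p_xy.sum(axis=0)
    nz = p_xy > 0
    return np.sum(p_xy[nz] * np.log2(p_xy[nz] / np.outer(px, py)[nz]))

def percent_joint_gain(p_sr1r2):
    # p_sr1r2[s, r1, r2] = p(s, r1, r2).
    # Extra information (%) from decoding the pair jointly rather than
    # adding the two single-cell informations.
    S, R1, R2 = p_sr1r2.shape
    I_joint = mi(p_sr1r2.reshape(S, R1 * R2))   # I(S; R1, R2)
    I_sum = mi(p_sr1r2.sum(axis=2)) + mi(p_sr1r2.sum(axis=1))
    return 100.0 * (I_joint - I_sum) / I_sum

A positive value indicates synergy from the correlations; a negative value indicates redundancy between the two cells.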
Article
Full-text available
The amount of information a sensory neuron carries about a stimulus is directly related to response reliability. We recorded from individual neurons in the cat lateral geniculate nucleus (LGN) while presenting randomly modulated visual stimuli. The responses to repeated stimuli were reproducible, whereas the responses evoked by nonrepeated stimuli drawn from the same ensemble were variable. Stimulus-dependent information was quantified directly from the difference in entropy of these neural responses. We show that a single LGN cell can encode much more visual information than had been demonstrated previously, ranging from 15 to 102 bits/sec across our sample of cells. Information rate was correlated with the firing rate of the cell, for a consistent rate of 3.6 ± 0.6 bits/spike (mean ± SD). This information can primarily be attributed to the high temporal precision with which firing probability is modulated; many individual spikes were timed with better than 1 msec precision. We introduce a way to estimate the amount of information encoded in temporal patterns of firing, as distinct from the information in the time-varying firing rate at any temporal resolution. Using this method, we find that temporal patterns sometimes introduce redundancy but often encode visual information. The contribution of temporal patterns ranged from −3.4 to +25.5 bits/sec or from −9.4 to +24.9% of the total information content of the responses.
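A minimal sketch of the entropy-difference ("direct") calculation described here: responses are discretized into binary words, the total entropy is taken from the word distribution over the nonrepeated stimulus, and the noise entropy from the across-trial word distributions at each time point. Bin width, word length, and names are illustrative choices.

import numpy as np
from collections import Counter

def word_entropy(words):
    # Entropy (bits) of a list of hashable response "words".
    counts = np.array(list(Counter(words).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def direct_information_rate(unique_resp, repeat_resp, dt, L):
    # unique_resp: binary array (n_bins,) from the nonrepeated stimulus.
    # repeat_resp: binary array (n_trials, n_bins) from repeats of one stimulus.
    # Words are L consecutive bins of width dt (seconds). Returns bits/s.
    words_total = [tuple(unique_resp[i:i + L])
                   for i in range(len(unique_resp) - L + 1)]
    H_total = word_entropy(words_total)
    # noise entropy: across trials at each time, then averaged over time
    n_trials, n_bins = repeat_resp.shape
    H_noise = np.mean([word_entropy([tuple(repeat_resp[t, i:i + L])
                                     for t in range(n_trials)])
                       for i in range(n_bins - L + 1)])
    return (H_total - H_noise) / (L * dt)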
Article
Full-text available
A central theme in neural coding concerns the role of response variability and noise in determining the information transmission of neurons. This issue was investigated in single cells of the lateral geniculate nucleus of barbiturate-anesthetized cats by quantifying the degree of precision in and the information transmission properties of individual spike train responses to full-field, binary (bright or dark) flashing stimuli. We found that neuronal responses could be highly reproducible in their spike timing (approximately 1-2 ms standard deviation) and spike count (a variance/mean ratio of approximately 0.3, compared with the 1.0 expected for a Poisson process). This degree of precision only became apparent when an adequate length of the stimulus sequence was specified to determine the neural response, emphasizing that the variables relevant to a cell's response must be controlled to observe the cell's intrinsic response precision. Responses could carry as much as 3.5 bits/spike of information about the stimulus, a rate that was within a factor of two of the limit the spike train could transmit. Moreover, there appeared to be little sign of redundancy in coding: on average, longer response sequences carried at least as much information about the stimulus as would be obtained by adding together the information carried by shorter response sequences considered independently. There also was no direct evidence found for synergy between response sequences. These results could largely, but not entirely, be explained by a simple model of the response in which one filters the stimulus by the cell's impulse response kernel, thresholds the result at a fairly high level, and incorporates a postspike refractory period.
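A hedged sketch of the "simple model" mentioned in the last sentence: filter the stimulus with an impulse-response kernel, spike on crossings of a fairly high threshold, and impose a postspike refractory period. The kernel shape, threshold, and refractory values below are placeholders, not fitted parameters.

import numpy as np

rng = np.random.default_rng(1)
dt = 0.001                                   # 1 ms bins
stim = rng.choice([-1.0, 1.0], size=5000)    # binary full-field flicker

t = np.arange(0.0, 0.1, dt)
kernel = np.exp(-t / 0.02) - 0.5 * np.exp(-t / 0.04)  # placeholder biphasic kernel
drive = np.convolve(stim, kernel)[:stim.size]         # filtered stimulus

threshold = 2.0 * drive.std()                # a "fairly high" threshold
refractory = 3                               # bins (3 ms) of absolute refractoriness
spikes = np.zeros(stim.size)
last = -refractory
for i, g in enumerate(drive):
    if g > threshold and i - last >= refractory:
        spikes[i] = 1.0
        last = i
print(int(spikes.sum()), "spikes")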
Article
1. The stimulus/response properties of four identified primary sensory interneurons in the cricket cercal sensory system were studied using electrophysiological techniques. These four cells are thought to represent a functionally discrete subunit of the cercal system: they are the only cells that encode information about stimulus direction to higher centers for low intensity stimuli. Previous studies characterized the quantity of information encoded by these cells about the direction of air currents in the horizontal plane. In the experiments reported here, we characterized the quantity and quality of information encoded in the cells' elicited responses about the dynamics of air current waveforms presented at their optimal stimulus directions. The total sample set included 22 cells. 2. This characterization was achieved by determining the cells' frequency sensitivities and encoding accuracy using the methods of stochastic systems analysis and information theory. The specific approach used for the analysis was the "stimulus reconstruction" technique, in which a functional expansion was derived to transform the observed spike train responses into the optimal estimate (i.e., "reconstruction") of the actual stimulus. A novel derivation of the crucial equations is presented. The reverse approach is compared with the more traditional forward analysis, in which an expansion is derived that transforms the stimulus to a prediction of the spike train response. Important aspects of the application of these analytical approaches are considered. 3. All four interneurons were found to have identical frequency tuning, as assessed by the accuracy with which different frequency components of stimulus waveforms could be reconstructed with a linear expansion. The interneurons encoded significant information about stimulus frequencies between 5 and 80 Hz, with peak sensitivities at approximately 15 Hz. 4. All four interneurons were found to have identical stimulus/response latencies. The mean latency between a stimulus component and the corresponding elicited spike was 17 ms. All four interneurons also had identical integration times. The integration time, measured as the duration of stimulus that could affect the probability of spiking, was approximately 50 ms. 5. The accuracy of the encoding can be expressed as a signal-to-noise ratio, where the noise is a scaled difference between the original signal and the best estimate of the signal. Peak signal-to-noise ratios of approximately 1 were obtained for the cells across all stimulus power levels, using only the linear expansion term. Analysis of the data indicated that the consideration of second-order nonlinear transformations of the stimulus would not have increased the calculated encoding accuracy. 6. The encoding accuracy also can be expressed in the information theoretic units of bits/second, which characterizes the information transmission rate of the cell. Bits/second values varied between 10 and 80 for the 22 different cells in our experimental set. The information rate values were highly correlated with the mean spike rates of the interneurons, but were not correlated with the stimulus power levels. However, normalizing the absolute information rates by the mean spike rate in each case yielded a measure of bits/spike that was remarkably invariant across all experiments. The measured bits/spike rate was approximately 1 for all experiments. This result is discussed in the context of recent theoretical studies on optimal encoding. 7. Although the dynamic sensitivities of the four interneurons were identical, their directional sensitivities are known to be orthogonal. Thus the cells are complementary to one another from a functional standpoint: whereas a particular cell will be insensitive to air currents from some directions, one or more of the other three cells will be sensitive to stimuli from those directions...
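The bits/second figures in such reconstruction analyses are often obtained from the stimulus-response coherence, which lower-bounds the information rate via R >= -integral of log2(1 - gamma^2(f)) df. A toy sketch with synthetic signals follows; the linear encoder and noise level are assumptions, and scipy is used for the coherence estimate.

import numpy as np
from scipy.signal import coherence

fs = 1000.0                                   # sampling rate, Hz
rng = np.random.default_rng(2)
stim = rng.standard_normal(100_000)           # white-noise stimulus
resp = (np.convolve(stim, np.ones(20) / 20.0, mode="same")
        + 0.5 * rng.standard_normal(stim.size))  # toy linear encoder + noise

f, g2 = coherence(stim, resp, fs=fs, nperseg=1024)
g2 = np.clip(g2, 0.0, 1.0 - 1e-12)            # guard against log2(0)
rate = -np.sum(np.log2(1.0 - g2)) * (f[1] - f[0])  # bits/s, lower bound
print(f"{rate:.1f} bits/s")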
Article
Information theory quantifies how much information a neural response carries about the stimulus. This can be compared to the information transferred in particular models of the stimulus-response function and to maximum possible information transfer. Such comparisons are crucial because they validate assumptions present in any neurophysiological analysis. Here we review information-theory basics before demonstrating its use in neural coding. We show how to use information theory to validate simple stimulus-response models of neural coding of dynamic stimuli. Because these models require specification of spike timing precision, they can reveal which time scales contain information in neural coding. This approach shows that dynamic stimuli can be encoded efficiently by single neurons and that each spike contributes to information transmission. We argue, however, that the data obtained so far do not suggest a temporal code, in which the placement of spikes relative to each other yields additional information.
Article
Information theory provides a powerful framework to analyse how neurons represent sensory stimuli or other behavioural variables. A recurring question regards the amount of information conveyed by a specific neuronal response. Here we show that the commonly used definition for this quantity has a serious flaw: the information accumulated during subsequent observations of neural activity fails to combine additively. Additivity is a highly desirable property, both on theoretical grounds and for the practical purpose of analysing population codes. We propose an alternative measure for the information per observation and prove that this is the only definition that satisfies additivity. The old and the new definitions measure very different aspects of the neural code, which is illustrated with visual responses from a motion-sensitive neuron in the primate cortex. Our analysis allows additional interpretation of several published results, which suggests that the neurons studied are operating far from their information capacity.
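The additivity property at stake here can be checked numerically: for the surprise-based definition, the information from a pair of observations, averaged over the second observation, equals the information from the first plus the average information gained by the second. A toy check (the joint distribution is random and the names are illustrative):

import numpy as np

rng = np.random.default_rng(3)
p = rng.random((4, 2, 2))
p /= p.sum()                          # toy joint p(s, r1, r2)

def surprise(post, prior):            # KL(post || prior) in bits
    nz = post > 0
    return np.sum(post[nz] * np.log2(post[nz] / prior[nz]))

p_s = p.sum(axis=(1, 2))
for r1 in range(2):
    p_sr1 = p[:, r1, :].sum(axis=1)           # p(s, r1)
    post1 = p_sr1 / p_sr1.sum()               # p(s | r1)
    avg_joint = avg_second = 0.0
    for r2 in range(2):
        p_sr1r2 = p[:, r1, r2]
        w = p_sr1r2.sum() / p_sr1.sum()       # p(r2 | r1)
        post12 = p_sr1r2 / p_sr1r2.sum()      # p(s | r1, r2)
        avg_joint += w * surprise(post12, p_s)
        avg_second += w * surprise(post12, post1)
    # additivity: <i(r1, r2)> = i(r1) + <i(r2 | r1)>
    print(np.isclose(avg_joint, surprise(post1, p_s) + avg_second))

The check prints True for every r1; repeating it with the specific information (H(S) - H(S|r)) in place of the surprise generally fails, which is the flaw the abstract describes.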
Article
In the early visual system, neuronal responses can be extremely precise. Under a wide range of stimuli, cells in the retina and thalamus fire spikes very reproducibly, often with millisecond precision on subsequent stimulus repeats. Here we develop a mathematical description of the firing process that, given the recent visual input, accurately predicts the timing of individual spikes. The formalism is successful in matching the spike trains from retinal ganglion cells in salamander, rabbit, and cat, as well as from lateral geniculate nucleus neurons in cat. It adapts to many different response types, from very precise to highly variable. The accuracy of the model allows a compact description of how these neurons encode the visual stimulus.
DeWeese M R and Meister M 1999 How to measure the information gained from one symbol Network: Comput. Neural Syst. 10 325–40
Keat J, Reinagel P, Reid R C and Meister M 2001 Predicting every spike: a model for the responses of visual neurons Neuron 30 803–17
Liu R C, Tzonev S, Rebrik S and Miller K D 2001 Variability and information in a neural code of the cat lateral geniculate nucleus J. Neurophysiol. 86 2789–806
Reinagel P and Reid R C 2000 Temporal coding of visual information in the thalamus J. Neurosci. 20 5392–400
Theunissen F, Roddey J C, Stufflebeam S, Clague H and Miller J P 1995 Information theoretic analysis of dynamical encoding by four identified primary sensory interneurons in the cricket cercal system J. Neurophysiol. 75 1345–64
Cover T M and Thomas J A 1991 Elements of Information Theory (New York: Wiley)