INSTITUTE OF PHYSICS PUBLISHING NETWORK: COMPUTATION IN NEURAL SYSTEMS
Network: Comput. Neural Syst. 14 (2003) 177–187 PII: S0954-898X(03)52703-8
How much information is associated with a particular
stimulus?
Daniel A Butts
Department of Neurobiology, Harvard Medical School, 220 Longwood Avenue, Boston,
MA 02115, USA
E-mail: daniel_butts@hms.harvard.edu
Received 21 August 2002, in final form 24 October 2002
Published 28 January 2003
Online at stacks.iop.org/Network/14/177
Abstract
Although the Shannon mutual information can be used to reveal general features
of the neural code, it cannot directly address which symbols of the code are
significant. Further insight can be gained by using information measures that
are specific to particular stimuli or responses. The specific information is a
previously proposed measure of the amount of information associated with a
particular response; however, as I show, it does not properly characterize the
amount of information associated with particular stimuli. Instead, I propose
a new measure: the stimulus-specific information (SSI), defined to be the
average specific information of responses given the presence of a particular
stimulus. Like other information theoretic measures, the SSI does not rely on
assumptions about the neural code, and is robust to non-linearities of the system.
To demonstrate its applicability, the SSI is applied to data from simulated visual
neurons, and identifies stimuli consistent with the neuron’s linear kernel. While
the SSI reveals the essential linearity of the visual neurons, it also successfully
identifies the well-encoded stimuli in a modified example where linear analysis
techniques fail. Thus, I demonstrate that the SSI is an appropriate measure of
the information associated with particular stimuli, and provides a new unbiased
method of analysing the significant stimuli of a neural code.
1. Introduction
Information theory provides measures for comparing different coding schemes of neurons
and neural ensembles, avoiding both biases caused by preconceptions of the neural code and
complications inherent in analysing neuronal systems with non-linearities and under conditions
of complex stimuli. As a result, information theory has been used to analyse neural data in a
variety of sensory systems (for a review, see Borst and Theunissen 1999). Such studies typically
make comparisons between the Shannon mutual information given different classifications of
stimulus and response ensembles, but do not address which stimuli and responses within these
ensembles are significant in information transmission.
As a result, DeWeese and Meister (1999) proposed an information theoretic measure of
the significance of particular symbols in the neural code: the specific information. The specific
information of a particular response is defined as the reduction in uncertainty in the stimulus
gained by the observation of that response. Since the mutual information represents the average
reduction in the uncertainty of the stimulus gained by one measurement, specific information
is intuitively a good representation of the degree to which a given response contributes to the
overall mutual information. Furthermore, DeWeese and Meister (1999) show that specific
information has unique properties that are appropriate for a measure of the information of a
response.
Specific information can be applied to both particular stimuli and particular responses due
to the symmetry between stimulus and response in information measures. However, because of
the asymmetry of stimulus and response with respect to causality (i.e., stimuli cause responses
and not vice versa), here I show that specific information does not provide a good measure
of stimulus significance. I propose a new measure, the ‘stimulus-specific information’ (SSI),
which is defined to be the average reduction in uncertainty of one observation given a particular
stimulus.
Since the definition of the SSI relies on a measure of the information associated
with responses (i.e., specific information), I will first define specific information (proposed
by DeWeese and Meister 1999). Using a simple example, I will then show that it does not
provide a good measure of the information associated with particular stimuli, and motivate the
definition of the SSI.
To demonstrate its effectiveness, the SSI is applied to data from realistic simulations of
neurons in the lateral geniculate nucleus (LGN) presented with full-field white-noise stimuli
(Keat et al 2001), and to a modified version of this model that defies typical linear analyses.
In both cases, the SSI identifies the most significant stimuli and offers additional insight into
the underlying neural code.
2. Decomposing mutual information into response-specific information
Consider a system presented with an ensemble of stimuli S and whose behaviour can be
classified into a set of responses R. The Shannon mutual information between the stimulus S
and response R ensembles of this system is given by
I[R, S] = \sum_{s \in S} \sum_{r \in R} p(s, r) \log_2 \frac{p(s, r)}{p(s) p(r)}     (1)
where p(s, r) is the joint probability distribution, i.e., the probability of simultaneously
observing the stimulus s ∈ S and the response r ∈ R. The joint probability distribution
p(s, r) can be computed by counting the frequency of each stimulus–response pair over a
sufficient amount of time, over which the ‘natural’ ensemble of stimuli is sampled with the
probability given by the prior distribution p(s).
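As a concrete illustration, equation (1) can be evaluated directly from an estimated joint probability table. Below is a minimal Python sketch, assuming p_joint is a NumPy array with rows indexed by stimulus, columns by response, and entries normalized to sum to one:

```python
import numpy as np

def mutual_information(p_joint):
    """I[R, S] in bits from a joint distribution p(s, r) (equation (1))."""
    p_s = p_joint.sum(axis=1, keepdims=True)    # marginal p(s)
    p_r = p_joint.sum(axis=0, keepdims=True)    # marginal p(r)
    nz = p_joint > 0                            # 0 log 0 -> 0 by convention
    return float(np.sum(p_joint[nz] * np.log2(p_joint[nz] / (p_s * p_r)[nz])))
```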
While the mutual information can be used to objectively evaluate different coding schemes
through different classifications of the stimulus and response ensembles (S and R), it represents
an average over the entire set of stimuli s ∈ S and responses r ∈ R. It is often of interest to know
which particular stimuli are effectively encoded by the system, and which particular responses
communicate information about the stimuli. Such questions can be addressed by decomposing
I[R, S] into measures that represent the contributions of specific stimuli or responses to the
mutual information, i.e.
I[R, S] = \sum_{s \in S} p(s) i(s) = \sum_{r \in R} p(r) i(r).     (2)
In this sense, the mutual information is explicitly a weighted average over individual
contributions from particular stimuli or particular responses.
There are arbitrarily many ways to perform such decompositions, meaning that there
is no single measure that represents the 'specific' information associated with a particular
stimulus or response. As a result, an appropriate measure must be chosen that properly signifies
the role of particular stimuli and responses in information transmission.
DeWeese and Meister (1999) argue that the information of a response must be additive,
since intuitively, information should accumulate over consecutive independent measurements
such that the total information from multiple measurements is equal to the sum of information
gained from each measurement separately. They show that there is only one decomposition of
mutual information that is additive, the specific information, given by

i_{sp}(r) = H[S] - H[S|r]     (3)

where the entropy of the prior distribution is given by H[S] = -\sum_s p(s) \log_2 p(s) and
the conditional entropy associated with a particular response r is given by
H[S|r] = -\sum_s p(s|r) \log_2 p(s|r).
The specific information has a straightforward intuitive meaning with regards to the process
of information transmission. Entropy is a measure of the uncertainty in probability distributions
(see Cover and Thomas 1991): broad distributions have large entropy, and narrow distributions
(where the variable in question is more localized to particular values) have little entropy. The
specific information is a difference in entropy of stimulus distributions (equation (3)), and thus
represents the amount that the initial uncertainty of the stimulus is reduced by observing the
response r. Furthermore, specific information is calibrated to mutual information: the specific
information is a direct measure of the amount learned by a given measurement, while the mutual
information is the amount learned by a measurement on average over all possible responses.
Thus, the specific information serves as an information theoretic measure through which
the significance of different responses can be compared. Responses with a high specific
information reduce the uncertainty in the stimulus the most and are thus most significant to
the system, and responses that do not reduce the amount of uncertainty in the stimulus are less
significant.
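A minimal sketch of equation (3), in the same conventions as the earlier snippet (every response is assumed to occur with nonzero probability):

```python
def entropy(p):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def specific_information(p_joint):
    """i_sp(r) = H[S] - H[S|r] (equation (3)) for every response r."""
    p_r = p_joint.sum(axis=0)                     # marginal p(r)
    H_S = entropy(p_joint.sum(axis=1))            # prior entropy H[S]
    return np.array([H_S - entropy(p_joint[:, j] / p_r[j])
                     for j in range(p_joint.shape[1])])
```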
3. The significance of a stimulus is different to that of a response
Shannon’s mutual information is a statistical measure of the interdependence of two random
variables, which, in the study of sensory systems, is usually stimulus and response. As a
result of this generality, expressions for information theoretic quantities do not distinguish
between stimulus and response (for example, see equation (1)), meaning that tools applicable
to responses can be applied to stimuli. For example, the specific information of a stimulus can
be calculated by interchanging response and stimulus in equation (3):
i_{sp}(s) = H[R] - H[R|s].     (4)
Is the specific information an appropriate measure of the information associated with a
particular stimulus? Below I show that, due to the asymmetry of stimulus and response with
respect to causality, a stimulus that is significant to a system does not have the same properties
as a significant response.
Consider a simple joint probability distribution p(r, s) where there are only two stimuli
and two responses, with the probabilities of each stimulus–response pair given by

             r_1     r_2
    s_1      1/4     1/2
    s_2      1/4     0

Without a measurement, the probability of stimulus s_1 is 3/4 and the probability of s_2 is 1/4,
i.e., the prior distribution is p(s) = {3/4, 1/4} with an entropy H[S] = 2 - (3/4) log_2 3 = 0.81 bits.
Before addressing the question of stimulus significance, consider the significant responses
of this system. Each response is equally likely, but conveys different amounts of information
about the stimuli. Observation of the first response r_1 designates an ambiguous situation, with
equal probability of either stimulus: p(s|r_1) = {1/2, 1/2}, so H[S|r_1] = 1 bit. The second response,
however, clearly designates s_1, since p(s|r_2) = {1, 0}: H[S|r_2] = 0 bits. The information
content of each response is reflected in its specific information, with i_sp(r_1) = -0.19 bits
(uninformative), and i_sp(r_2) = H[S] = 0.81 bits. The total mutual information (equation (1))
can be calculated from the specific informations as their weighted average over the ensemble
of responses (equation (2)): I[R, S] = 0.31 bits.
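Using the sketches above, the numbers of this example can be reproduced directly:

```python
p_joint = np.array([[1/4, 1/2],      # p(s1, r1), p(s1, r2)
                    [1/4, 0.0]])     # p(s2, r1), p(s2, r2)
i_sp_r = specific_information(p_joint)
print(i_sp_r)                        # approximately [-0.19, 0.81] bits
print(p_joint.sum(axis=0) @ i_sp_r)  # 0.31 bits = I[R, S], per equation (2)
```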
This example was devised to place an intuitive notion of stimulus significance at odds
with the specific information applied to stimuli. With s_2 present, the response is completely
predictable (r_1), whereas the presence of s_1 could result in either response. As a result, the
specific information applied to stimuli is higher for s_2: i_sp(s_1) = 0.08 bits versus i_sp(s_2) = 1 bit.
However, recall that responses encode information about stimuli, and not the reverse.
Though s_2 has the maximal specific information, neither response specifies it: observation of
r_1 gives an equal probability of s_1 and s_2, and observation of r_2 unambiguously designates s_1.
As a result, from an information transmission standpoint, s_1 is encoded more effectively than
s_2, in contrast to their specific informations.
Why does specific information fail to select the more effectively encoded stimulus?
Specific information is largest for those stimuli that have few responses associated with them,
without regard to whether these responses are informative. For example, consider a neuron
(such as the example discussed later in this paper) that is unresponsive (does not fire) to many
stimuli. These stimuli would have a large specific information because ‘not firing’ is the only
response associated with them. But observing a ‘not firing’ response would be very ambiguous
since there are many stimuli that do not cause the neuron to fire. Thus, the specific information
of a stimulus that the neuron does not respond to would be relatively high, but one would not
say that it was well encoded.
4. Stimulus-specific information
I therefore propose that the most informative stimuli are those that cause the most informative
responses, and are thus well encoded by the system. The responses associated with a stimulus
s are given by the conditional distribution p(r|s), and the information conveyed by a particular
response r is given by its specific information i_sp(r). Thus, I propose an information theoretic
measure of stimulus significance called the SSI, given by

i_{ssi}(s) \equiv \sum_r p(r|s) i_{sp}(r) = \sum_r p(r|s) \{H[S] - H[S|r]\}.     (5)

Since the specific information i_sp(r) is the reduction in uncertainty of the stimulus gained by
the particular observation r ∈ R, the SSI i_ssi(s) is the average reduction of uncertainty gained
from one measurement given the stimulus s ∈ S.
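Continuing the same sketch, equation (5) is a conditional average of the specific informations:

```python
def stimulus_specific_information(p_joint):
    """i_ssi(s) = sum_r p(r|s) i_sp(r) (equation (5)) for every stimulus s."""
    i_sp_r = specific_information(p_joint)
    p_r_given_s = p_joint / p_joint.sum(axis=1, keepdims=True)  # rows: p(r|s)
    return p_r_given_s @ i_sp_r
```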
Figure 1. Mutual information between full-field flicker stimuli and neuronal spikes. (A) A three-
frame stimulus word [-1, -1, +1] and a one-frame response word [1, 1] are associated at a given
latency λ between the end of the stimulus word and beginning of the response word. (B) Mutual
information is calculated between stimulus and response given a choice for L_S = 10, L_R = 1,
and Δt = 1/2, and is shown as a function of latency λ.
When calculated using the example of the last section, we see that i_ssi(s_1) = 0.48 bits, and
i_ssi(s_2) = -0.19 bits. Like the specific information, the weighted average of i_ssi(s) over the
stimulus ensemble s ∈ S gives the mutual information, and the value of the SSI is calibrated to
the mutual information and specific information. For example, the value of i_ssi(s_1) is consistent
with the fact that an observation completely determines the stimulus (giving close to 1 bit of
information) half of the time, and otherwise is not informative. The SSI of s_2 demonstrates
that the only possible measurement that can result when s_2 is presented is uninformative.
5. Simulated visual neurons
To demonstrate the use of SSI, I perform an information theoretic calculation on simulated
visual neurons. Data were generated using a model proposed by Keat et al (2001), which
accurately reproduces the timing of single action potentials as well as their statistical
distribution over multiple trials. Using model parameters that simulate the behaviour of cat
LGN neurons, spike trains were generated in response to random full-field flicker stimuli
presented at 128 Hz (see methods).
Information quantities between the full-field flicker stimulus S and resulting spike trains
R can be calculated as a function of several parameters: the length of the stimulus word L_S,
the length and time resolution of the response word (L_R and Δt), and the latency λ between
them (see figure 1(A)). This method of calculating information is similar to those of Liu et al
(2001). For this paper, I will use L_S = 10 frames, L_R = 1 frame, and Δt = 0.5 frames, where
L_S is chosen to be as large as possible given the limited number of data (see methods), and
L_R and Δt are sufficiently representative of results found with longer response words (data
not shown). General questions about the dependence of mutual information on stimulus and
response word length and resolution are addressed elsewhere (Liu et al 2001).
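This pairing can be sketched as follows; the bookkeeping here is a simplified illustration that assumes stimulus frames and response bins share a common time base, with the lag given in bins (negative when the response word begins before the stimulus word ends), rather than the exact procedure used for the calculations in this paper:

```python
def word_pairs(stim_frames, resp_bins, L_S, L_R, lag):
    """Pair each stimulus word of L_S frames with the response word of
    L_R bins that begins `lag` bins after the stimulus word ends."""
    pairs = []
    for t in range(len(stim_frames) - L_S + 1):
        start = t + L_S + lag
        if 0 <= start and start + L_R <= len(resp_bins):
            pairs.append((tuple(stim_frames[t:t + L_S]),
                          tuple(resp_bins[start:start + L_R])))
    return pairs
```

A normalized histogram over such pairs then estimates the joint distribution p(s, r).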
The mutual information I[R, S] is shown in figure 1(B) as a function of latency λ. It
peaks at a latency of -16.5 ms, when the average response carries 0.37 bits about the stimulus.
Since I[R, S] can be decomposed into SSI (see equation (2)), the total mutual information can
be thought of as an average that includes both stimuli that are well encoded and those that are
not. To determine the contribution of each stimulus to the total mutual information, the SSI is
calculated for each of the 2^10 = 1024 stimuli.
Figure 2. Information measures of particular stimuli. (A) SSI i_ssi(s) and specific information
i_sp(s) are calculated for the 2^10 stimuli at a latency λ of -16.5 ms and their distributions are
displayed as histograms in the left and right frames respectively. Bins with greater than 10 elements
are cut off to focus on the few outlying stimuli. The five stimuli with the largest SSI are labelled
A–E and shown in the inset of the left frame. Two of them, A and C, have the lowest specific
information (right frame). (B) To identify the features of the stimulus ensemble that are well
encoded, an average stimulus is calculated by weighting each stimulus by its SSI (circles). This
SSI-weighted average stimulus is approximately proportional to the spike-triggered average
stimulus of the neuron up to a scale factor (solid curve). Note that the spike-triggered average
stimulus is scaled so it is directly comparable to the units of average SSI.
The left panel of figure 2(A) shows a histogram of the SSI distribution of the stimulus
ensemble. The minimum SSI was 0.085 bits (leaving the lowest bin between 0 and 0.05 bits
empty). However, over half the stimuli (599 out of 1024) have SSIs in the next lowest bin
(between 0.05 and 0.1 bits). At the other extreme, the top five stimuli are well separated from
the rest of the distribution, and are shown in the inset. Note that the 'best' stimulus (A) is a
simple off-to-on transition which occurs at -47.8 ms. Other stimuli with lower SSI have a
slightly different off-to-on latency (C and D), or an 'on' frame instead of an 'off' frame at
large latencies (B and E).
For comparison, the right panel of figure 2(A) shows the distribution of specific
informations for the same stimuli (i_sp(s), given by equation (4)). Specific information gives
nearly the opposite classification of stimuli to the SSI: 486 out of 1024 stimuli are clustered in
the largest occupied bin (0.74 bits). At the same time, stimuli with some of the highest SSIs
have some of the lowest specific informations; stimuli A and C have the two lowest specific
informations (figure 2(A), right), and stimuli B, D, and E are lost in the broader distribution.
As explained above, this is due to the fact that many stimuli rarely elicit a spike, meaning that
they are associated with only one response (‘no spike’), and have high specific information.
At the same time, stimuli that often result in a spike cause an uncertain response since they
may or may not elicit a spike.
Of course, the five stimuli with the largest SSIs are only presented 0.5% of the time,
and almost 90% of the stimuli have an SSI less than 1 bit but account for almost half of
the information conveyed. To determine the important features in the stimulus ensemble
that lead to informative responses, I calculate an average stimulus based on each stimulus's
fractional contribution to the mutual information, p(s) i_ssi(s)/I[R, S]. This 'SSI-weighted
average stimulus' is shown as circles in figure 2(B), and is compared to a more common
measure of significant stimuli: the spike-triggered average stimulus (STA), shown as a solid
curve. Note that these results apply for a given latency λ = -16.5 ms; since the SSI-weighted
average has a resolution of one frame (7.8 ms), there are ten discrete points in this average
(L_S = 10). The SSI-weighted average calculated for other latencies is in similar agreement
with the STA (data not shown).
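In terms of the earlier sketches, the SSI-weighted average is a weighted sum over stimulus words; stim_words is assumed to hold the 1024 words as rows of a ±1 matrix:

```python
def ssi_weighted_average(stim_words, p_s, i_ssi, I_total):
    """Average stimulus, each word weighted by its fractional contribution
    p(s) i_ssi(s) / I[R, S] to the mutual information."""
    return (p_s * i_ssi / I_total) @ stim_words
```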
The close agreement between the spike-triggered and SSI-weighted average stimulus
results from a combination of two factors:
(1) the ability of particular stimuli to evoke spikes in the Keat model is proportional to a linear
convolution with the stimulus (see the methods for more details); and
(2) the majority of information in the responses of this neuron is carried by spikes (found by
calculating the specific information of responses i_sp(r) of equation (3); data not shown).
As a result of these two conditions, the SSI of a given stimulus is roughly proportional to the
number of spikes associated with it, i.e., i_ssi(s) ≈ p(spike|s) i_sp(spike) ∝ p(spike|s).
Thus, each stimulus contributes to the spike-triggered and SSI-weighted averages in proportion
to the number of spikes that each stimulus evokes, resulting in the same shape shown in
figure 2(B).
6. Specific surprise identifies a different aspect of stimulus significance
Thus, the SSI gives an appropriate characterization of the ‘best encoded’ stimuli, in stark
contrast to the specific information. However, as mentioned earlier, there are many possible
stimulus-specific decompositions of mutual information (equation (2)), some of which might
provide alternative but reasonable classifications. For example, in their discussion of a measure
of the information associated with individual symbols (i.e., stimuli and responses), DeWeese
and Meister (1999) consider the ‘specific surprise’, given by (when applied to stimuli)
i_{sur}(s) = \sum_r p(r|s) \log_2 \frac{p(r|s)}{p(r)}.     (6)

Note that this is the Kullback–Leibler distance between the conditional probability p(r|s) and
the marginal distribution p(r). Below, I demonstrate that the specific surprise provides an
alternative measure of stimulus significance, but it highlights other properties of the stimulus
in a way that lacks the intuitive meaning of the SSI as a measure of the information associated
with a stimulus.
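A sketch of equation (6) in the same conventions as the earlier snippets:

```python
def specific_surprise(p_joint):
    """i_sur(s) = sum_r p(r|s) log2[p(r|s)/p(r)] (equation (6)), in bits."""
    p_r = p_joint.sum(axis=0)
    p_r_given_s = p_joint / p_joint.sum(axis=1, keepdims=True)
    out = np.zeros(p_joint.shape[0])
    for i, row in enumerate(p_r_given_s):
        nz = row > 0                       # skip responses that never occur
        out[i] = np.sum(row[nz] * np.log2(row[nz] / p_r[nz]))
    return out
```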
Specific surprise, in the form of equation (6), compares the marginal distribution p(r) to
the conditional distribution p(r|s). As discussed above in relation to specific information, such
Figure 3. The specific surprise. (A) The distribution of specific surprises over the stimulus
ensemble is calculated for the same visual neuron as in figure 2. Though the distribution is very
similar to the SSI distribution (figure 2(A), left), with the same top five stimuli, there are subtle
differences, such as their order (A–B–D–E–C). (B) The specific-surprise-weighted average
stimulus (filled circles) is compared to the spike-triggered average stimulus (solid curve, scaled
for the best fit) and the best fit of the SSI-weighted average from figure 2(B) (dotted curve).
a comparison is not intuitively meaningful with respect to causality. However, using Bayes'
law, specific surprise can be re-expressed as

i_{sur}(s) = \sum_r p(r|s) \left[ \log_2 \frac{1}{p(s)} - \log_2 \frac{1}{p(s|r)} \right].     (7)

The logarithm of the reciprocal probability of a stimulus is often referred to as its 'surprise',
since rarer stimuli are more 'surprising'. In this form, specific surprise has a causal meaning:
the reduction in surprise of a particular stimulus gained from each response, averaged over
all responses associated with that stimulus. Thus, whereas the SSI weighs each response r
by its effect over the stimulus ensemble (through the change of H[S] to H[S|r]), the specific
surprise measures the effect of the response on just the stimulus in question (through the
change of p(s) to p(s|r)).
How does using a different response weight change the evaluation of stimulus significance?
In figure 3(A), the specific surprise is calculated for the simulated visual neuron. It provides
classifications of the stimuli similar to the SSI, and identifies the same top five stimuli, though
with a different order (A–B–D–E–C). The subtle differences between the two measures in
this example are reflected in a comparison between the SSI-weighted average of figure 2(B)
and the specific-surprise-weighted average, calculated in the same way, shown in figure 3(B).
The specific-surprise-weighted average is shown as solid circles, and the best scaling of the
spike-triggered average is shown as a solid curve. Meanwhile, since the SSI-weighted
average fit the STA almost exactly (figure 2(B)), the STA scaled to the SSI-weighted average
is shown as a dotted curve. Note that the surprise-weighted average stimulus has a different
shape to the STA and a smaller magnitude than the SSI-weighted average, meaning that
larger specific surprise does not correspond as closely to particular features of the stimulus.
The specific surprise and the SSI behave more distinctly in the simple example of the 2 × 2
joint probability distribution described earlier in this paper. In this case, the specific surprise is
higher for the second stimulus s_2 (which only had one ambiguous response r_1 associated with
it): i_sur(s_1) = 0.08 bits and i_sur(s_2) = 1 bit. This contrasts with the intuitive notion that s_1 is
better encoded than s_2, since s_2 is never clearly designated by a response. Does this mean that
specific surprise is a bad measure of a well-encoded stimulus?
Figure 4. A non-linear neuron. The distribution of the SSI for the 2^10 stimuli is calculated for a
non-linear neuron whose spike-triggered average is zero. This neuron is designed such that
opposite stimuli evoke the same response, resulting in pairs of stimuli that have the same SSI
(inset).
In fact, it is simply a different measure with a different interpretation: while s_1 is
unambiguously denoted by an observation of r_2 (changing its probability: 3/4 → 1), the
measurement of r_1 decreases its likelihood (3/4 → 1/2), nearly cancelling the change in surprise:
i_sur(s_1) = 0.08 = (1/3)(-0.59) + (2/3)(0.42) bits. In the meantime, the specific surprise of s_2 is
relatively large (i_sur(s_2) = 1 bit) because the only response associated with s_2 is r_1, and it
leads to an increase in the probability of observing s_2: 1/4 → 1/2. Thus, each possible response
contributes to the specific surprise of a particular stimulus s in relation to how much its
observation increases the likelihood of s.
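With the 2 × 2 table and the specific surprise sketch from earlier, these values can be checked directly:

```python
print(specific_surprise(p_joint))   # approximately [0.08, 1.0] bits
```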
Why do the specific surprise and the SSI give similar results with the visual neuron? For
a response to be informative (i.e., have a large specific information i_sp(r)), the entropy of
its conditional distribution p(s|r) must be less than that of the prior p(s), meaning that the
probability of some stimuli must be increased. As a result, in many cases, an informative
response will result in the largest changes in probability of the stimuli that are best encoded,
and the specific surprise and the SSI will be in qualitative agreement. However, as the simple
2 × 2 example made clear, this is not always the case.
So, the specific surprise and the SSI measure fundamentally different aspects of
stimulus significance. Since the SSI explicitly measures an average difference in entropies,
it has intuitive meaning with regard to information transmission. As a result, the SSI in both
the simple 2 × 2 example and the visual neuron fulfils this intuition of what a well-encoded
stimulus is, in contrast to the specific surprise.
7. Non-linear neurons
In the example of the visual neuron, the SSI correctly identified the significant stimuli to a
neuron, in a way that was validated by linear analysis of the neuron, i.e., the spike-triggered
average. In this way, the agreement between the spike-triggered and SSI-weighted average
stimulus demonstrated that the neuron is essentially linear. Thus, while this example validates
the performance of the SSI, it makes clear that linear analysis would have been sufficient in
identifying the significant stimuli in this case.
Information theory is most useful in studying neurons with non-linear properties, or those
that do not encode the bulk of their information in single spikes. To demonstrate the potential
utility of the SSI, the visual neuron model discussed above is modified to exhibit non-linear
behaviour. This modified model retains the same spiking properties, but is designed such
that it has, on average, the same response to a stimulus as it does to the opposite stimulus
(see the methods for more details). For example, an on-to-off transition will now evoke the
same number of spikes as an off-to-on transition with the same latency. As a result, the spike-
triggered average stimulus of this neuron is zero, since a spike is just as likely to be evoked by a
particular stimulus as its opposite. In this case, important stimuli can only be identified through
higher-order statistics, such as the spike-triggered covariance used by Arcas et al (2000).
Of course, differences in higher-order statistics are detected by information theoretic
measures, and the SSI distribution for the modified non-linear neuron is shown in figure 4
(calculated in the same way as the distribution of figure 2(A)). The six stimuli with the largest
SSI are shown in the inset, demonstrating the pairing of each stimulus with its opposite.
8. Conclusions
I have shown that the SSI is an appropriate and reliable measure of the information associated
with a particular stimulus. As with other information measures, the SSI is calculated without
particular assumptions about the coding scheme, and is robust to non-linearities in the system
being studied. As a result, the SSI is particularly useful in identifying the stimuli that are
significant to a neuron where linear analyses break down. The SSI is also useful in cases
where neural responses other than individual spikes carry information, though such examples
were not considered in this paper.
Unfortunately, unlike the specific information of a response proposed by DeWeese and
Meister (1999), the SSI does not have a mathematical quality that shows it to be the only
possible measure of information associated with a stimulus. As previously discussed, there
are many possible stimulus-specific decompositions of mutual information; as a result, I have
chosen a decomposition that has an intuitive meaning with respect to information transmission,
and furthermore have shown it gives expected results in both simple constructed examples and
realistic examples using neuronal data.
The SSI has the same drawbacks as other information measures: a significant number
of data are required in order to properly estimate the underlying probability distributions
needed by such calculations. However, since the same data as are used to calculate Shannon’s
mutual information can also be used for the specific information of responses and the SSI of
stimuli, these specific measures can extend the applicability and use of information studies in
neuroscience.
Methods
Simulated visual neurons
Data for the information calculations performed on the example visual neuron in this paper
were generated using a model of visual neurons proposed by Keat et al (2001). This
phenomenological model is able to reproduce both the precise spike timing and variability
of observed cat LGN neurons, and thus generated 'realistic' neuronal data for the purpose of
evaluating information quantities.
The simulated neuron is presented with either black or white frames (with equal probability)
at 128 Hz, and generates a neural spike train. The basis of the neural response is a linear
convolution between the stimulus s(t) and a linear kernel K(τ):

g(t) = \int_{-\infty}^{0} d\tau \, K(\tau) s(t - \tau).
This function of time g(t) is modified by adding correlated Gaussian noise and a term that
represents spike-dependent effects, which accounts for a neuronal refractory period. Spikes
occur when the total of g(t), noise, and spike-dependent effects exceeds a threshold.
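This generative scheme can be caricatured in a few lines. The following is a toy sketch rather than the Keat et al (2001) model itself: the kernel and threshold are placeholders, the noise is white rather than correlated, and spike-dependent effects are omitted:

```python
rng = np.random.default_rng(0)

def simulate_spikes(stimulus, kernel, threshold=1.0, noise_sd=0.2):
    """Toy linear-threshold neuron: convolve, add noise, threshold."""
    g = np.convolve(stimulus, kernel)[:len(stimulus)]    # g(t) = K * s
    noisy = g + noise_sd * rng.standard_normal(len(g))   # noise stand-in
    return (noisy > threshold).astype(int)               # 1 = spike in bin
```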
For the information calculations, a large data set consisting of 32 million stimulus frames
(representing roughly 70 h in real time) was used. This copious data set allowed for an
information analysis of long stimulus and response words simultaneously, though information
theoretic analysis is possible with far fewer data (see Liu et al 2001).
Non-linear neurons
To generate a neuron with a spike-triggered average of zero, the same model as above was
used, with one modification: the linear convolution g(t) was squared before plugging it into
the rest of the model. As a result, opposite stimuli, which result in g_0 and -g_0 in the linear
model, now have the same effect on the response: (g_0)^2. Other parameters of the model were
scaled so that this model had the same overall spike rate.
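In the same toy sketch, the modification amounts to squaring g(t) before thresholding, so that a stimulus and its negation drive identical responses:

```python
def simulate_spikes_nonlinear(stimulus, kernel, threshold=1.0, noise_sd=0.2):
    """As above, but g(t) is squared, so s and -s drive identical responses."""
    g = np.convolve(stimulus, kernel)[:len(stimulus)]
    noisy = g**2 + noise_sd * rng.standard_normal(len(g))
    return (noisy > threshold).astype(int)
```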
Acknowledgments
I am grateful to Mark Goldman for extensive input and comments on this manuscript. This
work was supported by an NSF Postdoctoral Fellowship in Biological Informatics.
References
Arcas B A Y, Fairhall A L and Bialek W 2000 What can a single neuron compute? Adv. Neural Inform. Process. Syst.
13 75–81
Borst A and Theunissen F E 1999 Information theory and neural coding Nature Neurosci. 2 947–57
Cover T M and Thomas J A 1991 Elements of Information Theory (New York: Wiley)
Dan Y, Alonso J M, Usrey W M and Reid R C 1998 Coding of visual information by precisely correlated spikes in
the lateral geniculate nucleus Nature Neurosci. 1 501–7
DeWeese M R and Meister M 1999 How to measure the information gained from one symbol Network: Comput.
Neural Syst. 10 325–40
Keat J, Reinagel P, Reid R C and Meister M 2001 Predicting every spike: a model for the responses of visual neurons
Neuron 30 803–17
Liu R C, Tzonev S, Rebrick S and Miller K D 2001 Variability and information in a neural code of the cat lateral
geniculate nucleus J. Neurophysiol. 86 2789–806
Reinagel P and Reid R C 2000 Temporal coding of visual information in the thalamus J. Neurosci. 20 5392–400
Theunissen F, Roddey J C, Stufflebeam S, Clague H and Miller J P 1995 Information theoretic analysis of dynamical
encoding by four identified primary sensory interneurons in the cricket cercal system J. Neurophysiol. 75 1345–64