Shape-coding in IT cells generalizes over contrast and mirror reversal, but not figure-ground reversal

October 2001
Nature Neuroscience 4(9):937-42

DOI:10.1038/nn0901-937

Source
PubMed

Authors:

Gordon C Baylis

Western Kentucky University

We assessed how the visual shape preferences of neurons in the inferior temporal cortex of awake, behaving monkeys generalized across three different stimulus transformations. Stimulus-preferences of particular cells among different polygon displays were correlated across reversed contrast polarity or mirror reversal, but not across figure-ground reversal. This corresponds with psychological findings on human shape judgments. Our results imply that neurons in inferior temporal cortex respond to components of visual shape derived only after figure-ground assignment of contours, not to the contours themselves.

Gordon C. Baylis 1,2 and Jon Driver

University of Plymouth, Plymouth Institute of Neuroscience, 12 Kirkby Place, Plymouth, PL4 8AA, UK

Department of Psychology, University of South Carolina, Columbia, South Carolina 29208, USA

Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK

Correspondence should be addressed to G.C.B. (gordon@pion.ac.uk) or J.D. (j.driver@ucl.ac.uk)

Inferotemporal (IT) cortex is involved in visual shape representation and visual object recognition, based on evidence from single-cell recording (refs. 1–9), functional imaging (refs. 10, 11) and lesion studies (refs. 9, 12). In comparison with earlier visual areas, cells in IT have larger receptive fields and show more abstract preferences for complex shape properties (refs. 2, 4, 6, 9, 13), but exactly how this region represents shape remains controversial (refs. 3–5, 13). Here we examined shape representation within IT in relation to figure–ground reversal, as well as other stimulus manipulations that served as control comparisons.

The figure–ground assignment of a given visual display can dramatically alter the shape that human observers perceive (examples, top of Fig. 1). Adjacent figure and ground regions defined by a common contour are perceived as very different. Human observers typically recognize the figure later (for example, the face in the top row of Fig. 1), but not the ground (white shape in that row), even for judgments based on exactly the same shared contour (refs. 14–18). Moreover, they rate a mirror image of the figure as more similar to the original figure–ground display than an image of the ground in isolation. This arises even though the ground probe shares exactly the same curved contour as in the originally exposed display, whereas the mirror image of the figure has a mirror-reversed contour. These phenomena also arise for shapes made by unfamiliar contours (refs. 15–20; see below), not just for profiles of meaningful shapes. Such effects reveal the influence of one-sided edge assignment on visual shape perception in humans (refs. 15–20).

Here we examined how the shape preferences of IT cells in the primate brain may relate to these psychological phenomena. Specifically, we tested how the preferences of individual IT cells for stimuli drawn from a population of pseudorandom two-dimensional shapes would generalize across three different transformations: figure–ground reversal, reversal of contrast-polarity and mirror-image reflection about the vertical (Fig. 1a–h). All shapes were polygons with straight edges at the top, bottom and along one side, and with a pseudo-randomly curved contour on the other side (refs. 15–20). The curved contours of possible polygons differed in their identity and location (left or right side of polygon). It is possible that any selectivity in the responses of IT cells to these stimuli is determined just by these physical differences among the displays. Alternatively, IT responses might show patterns that are more like shape judgments in human observers, where figure and ground regions are perceived to have very different shapes despite their common defining contour, with the mirror image of any figure being perceived as more similar to that figure than its ground (as confirmed for the present stimuli also; see below). For the displays used here, exactly the same curved contour was present across a reversal of figure–ground assignment (Fig. 1, compare a to e, c to g, b to f, and d to h), yet this contour produces shapes that look very different to human observers when figure and ground are reversed (refs. 14–20).

The curved contour was necessarily on opposite sides of the figure region versus the adjacent ground region within any display (stimuli a–h, Fig. 1). Our further manipulation of mirror-imaging (see also ref. 6) controlled for this, as the curved contour of any mirror image of an original figure is on the same side as the curved contour of the original ground (stimuli a–h, Fig. 1). The figure and ground region of each individual display also differed in contrast polarity (one white, the other black). Our orthogonal manipulation of reversing contrast-polarity (see also ref. 8) controlled for this, as a contrast-reversal of an original figure has the same polarity as the original ground (stimuli a–h, Fig. 1).

We recorded activity from IT cells in monkeys to determine their firing rates for the different stimuli, and to determine how these rates correlated across the three different stimulus transforms. We also required human observers to rate the similarity of the displays across the same transforms.
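As a rough illustration of the design, the stimulus set and the three transforms can be written down as a small enumeration. This is only a sketch under stated assumptions: the tuple encoding, the 0/1 codes and the names `STIMULI`, `TRANSFORMS`, `transformed` and `paired_subsets` are hypothetical and are not the labels (1a–4h) used in Fig. 1. Each display is treated as one of four contour types crossed with three binary attributes, and each transform flips exactly one attribute.

```python
from itertools import product

# Hypothetical encoding: (contour_type, contrast, mirror, figure_side).
# The 0/1 codes are illustrative and do not map onto the letters a-h in Fig. 1.
CONTOURS = (1, 2, 3, 4)
STIMULI = list(product(CONTOURS, (0, 1), (0, 1), (0, 1)))

# Index of the binary attribute that each transform flips.
TRANSFORMS = {"contrast": 1, "mirror": 2, "figure_ground": 3}


def transformed(stimulus, transform):
    """Return the display obtained by flipping one binary attribute."""
    i = TRANSFORMS[transform]
    flipped = list(stimulus)
    flipped[i] = 1 - flipped[i]
    return tuple(flipped)


def paired_subsets(transform):
    """Split the 32 displays into 16 (original, transformed) pairs."""
    i = TRANSFORMS[transform]
    return [(s, transformed(s, transform)) for s in STIMULI if s[i] == 0]


if __name__ == "__main__":
    print(len(STIMULI))  # 4 contour types x 2 x 2 x 2 displays
    for name in TRANSFORMS:
        print(name, len(paired_subsets(name)))
```

Because the three attributes are crossed orthogonally, every transform uses all 32 displays but pairs them differently, which is the property the correlation analyses rely on.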

articles

nature neuroscience • volume 4 no 9 • september 2001 937

RESULTS

We recorded from 88 cells in areas TEa, TEm and TE3 (ref. 20) of 2 awake monkeys while they viewed displays drawn from 32 possibilities (stimuli a–h, Fig. 1; equivalent transforms were implemented for types 2, 3 and 4, thus yielding 8 × 4 = 32 stimuli in total). Eighty-nine percent of cells (78/88) showed significant differences in mean evoked firing rate in the interval 100 to 600 ms after stimulus onset, as a function of which of the 32 possible stimuli were shown (at p < 0.01 or better). We then examined how the shape preferences revealed by these differential evoked firing rates correlated across the three transformations we had applied to the stimuli. Most cells showed significant and substantial correlations in stimulus preferences across reversals of contrast polarity and across mirror imaging, but not across figure–ground reversal.

Fig. 1. Example stimuli. Top, classic figure–ground display, together with its components. Humans rate a mirror image of the figure as more similar to the original figure–ground display than the original ground in isolation. Stimuli a–f, Visual displays for the single-cell recording experiment, showing how 8 different displays were generated from one particular curved contour (type 1 is shown). Bottom right, 3 additional types of curved contours (2–4); each of these analogously generated 8 different displays (2a–h, 3a–h, 4a–h). Three aspects of the displays were manipulated orthogonally, in a 2 × 2 × 2 fashion illustrated by the layout of shapes 1a–h, which shows one 2 × 2 table of possible displays in the ‘front’ plane (b, d, f, h), with another 2 × 2 table of possible displays in the ‘back’ plane (a, c, e, g). All displays comprised either a white filled polygon on a black background, or vice versa. The difference between examples in the front plane and the back plane in the illustration depicts this contrast-polarity transform for otherwise equivalent displays. Each display also appeared in mirror image form. The difference between examples in adjacent columns for each of the 2 × 2 tables in the schematic illustrates this mirror-reversal transform. Finally, a given curved contour could have the figural region (as defined by surroundedness; refs. 14, 17) on its left or right, leading to the figure–ground transform between examples in adjacent rows for each 2 × 2 table in the schematic. Only the figure–ground transform leaves the curved contour entirely unchanged.

We first show the correlations across these three transforms for one illustrative neuron (Fig. 2). All 32 stimuli contributed to each of these correlations, but stimuli were paired differently for each correlation (see Methods). We plot the mean firing rates of this cell (in the 100–600 ms interval after stimulus onset) for all 32 stimuli (Fig. 3; the same nomenclature is used for these stimuli as in Fig. 1). The pattern of firing rates was similar across the contrast-reversal and mirror-image transforms, but not across the figure–ground reversal. (Compare the four appropriate pairings of graphs, each with four bars, across each of these transforms, Fig. 3.) Peri-stimulus time histograms for this cell in response to the 16 stimuli generated from types 1 and 2 show that responses were similar across mirror-image and especially contrast-reversal transforms, but that they differed markedly across the figure–ground transform (Fig. 4). For instance, type 2 received a much more vigorous response than type 1 in version b, but the opposite ordering applied for version f (the figure–ground transform of version b).

Fig. 2. Correlation plots for a single cell. Plots show mean firing rates (in the period 100–600 ms after the stimulus) for an illustrative neuron, for particular stimuli along the x-axis, and for transformed versions of the same stimuli along the y-axis. (a) Contrast reversal transform. (b) Mirror reversal. (c) Figure–ground reversal. The total set of 32 stimuli all contribute to each plot. For each plot, this set was divided into 2 subsets of 16, with each member of one subset providing a transformed version for one member of the other subset. (a, b) Stimuli that induced a particular firing rate led to a similar rate when transformed (correlations of 0.92 and 0.68 respectively, for this particular cell). No such relationship is apparent in (c) across the figure–ground transform (R = 0.0). Both axes show response (spikes/s).

Fig. 3. Firing rates in response to the 32 stimuli for a single cell. Histograms show mean firing rate in the 100–600 ms period after stimulus onset for the illustrative cell from Fig. 2, now shown for each individual stimulus. The layout in the illustration has the same 2 × 2 × 2 arrangement as in the schematic in the center of Fig. 1, and uses the same nomenclature for the 32 different stimuli (variants a–h on contour types 1–4). Hence, comparing laterally adjacent pairs of histograms addresses the mirror image transform of the stimuli; comparing histograms between the apparent ‘front’ and ‘back’ plane addresses contrast reversal; vertically adjacent histograms represent a figure–ground reversal. The pattern seen within each of the paired histograms stays similar across both the contrast and mirror-image transforms, but not across the figure–ground reversals, hence the correlations in Fig. 2 for the same cell.

Fig. 4. Peri-stimulus time histograms of firing, using 20 ms bins, for the illustrative cell. Response to the 8 variants of type 1 and type 2 shapes. (Detailed responses to 16 stimuli are shown here, rather than all 32; type 3 and type 4 data (Fig. 3) are omitted for brevity.) 2 × 2 × 2 layout and nomenclature for the stimuli are as in Figs. 2 and 3. The y-axis brace represents a firing rate of 100 spikes/s; the bar on the x-axis represents the first 500 ms of stimulus presentation time.

In histograms of the distributions of correlations for all cells in the population across the three transformations, most cells showed significant positive correlations in shape preference across reversed contrast polarity in the display (mean correlation coefficient, R = 0.59), and likewise across mirror-image reflection of the presented shape (mean R = 0.46; Fig. 5). In contrast, correlation coefficients for figure–ground reversal were typically low, and centered around zero (mean R = 0.04). Chi-square tests, comparing the number of cells showing significant correlations across the different transforms, found many more such correlations for contrast versus figure–ground reversal (χ² = 80.4, p < 0.0001), and for mirror-image versus figure–ground reversal (χ² = 44.96, p < 0.0001). In addition, correlations were somewhat more pronounced for contrast than mirror image reversal (χ² = 9.44, p < 0.05), in accord with the human similarity ratings reported below.

The poor generalization across figure–ground reversal was found equivalently for cells that showed a significant correlation across both contrast and mirror-image reversal (black, bottom histogram of Fig. 5) and those that did not (white, Fig. 5); these distributions for figure–ground reversal did not differ. We also assessed how the correlation coefficients developed as a function of time after stimulus onset. For contrast-polarity and mirror-image transformations, the average coefficients climbed rapidly, reaching asymptote at around 200–300 ms post stimulus onset. For figure–ground reversals, the correlation coefficient averaged very close to zero throughout the trial.

Finally, we examined how the average firing rates changed during the trial for preferred versus non-preferred stimuli, and how this generalized across the three transformations. To do this, we first determined for each cell which of the 32 stimuli produced the maximal mean firing rate in a 100–600 ms time bin after stimulus onset; this defined the preferred stimulus for that cell. We also identified the non-preferred stimulus for each cell, producing the lowest mean firing rate in the same time window. We show the mean firing rates across all cells for their preferred versus non-preferred stimuli, at different times after stimulus onset (Fig. 6a). Firing rates for contrast-polarity and mirror-image reversals of these stimuli show how the preference was largely maintained across both these transformations (Fig. 6b and c). In contrast, the preference disappeared across the figure–ground transformation, consistent with our other findings (Fig. 6d). We confirmed the site of cellular recordings by histology (Fig. 7).

Fig. 6. Population response to preferred versus non-preferred stimuli across the three stimulus transforms. Mean firing rate across the population of neurons, for successive 20-ms time bins, with standard errors. (a) Responses to the preferred (blue) and non-preferred (red) stimulus selected for each neuron. (b) Responses to the contrast-reversed versions of each neuron’s preferred versus non-preferred stimuli, showing that the preference is maintained. (c) Responses to the mirror imaged versions of each neuron’s preferred versus non-preferred stimuli, again showing that the preference is still maintained. (d) Responses to the figure–ground reversed versions of each neuron’s preferred versus non-preferred stimuli, showing that the preference is now abolished.

In a matching task (see Methods) on the same shapes as used in our physiological work, 12 human observers selected the untransformed figure on 87.5% of trials, the contrast-reversed version on 68.3% of trials and the mirror-reversed version on 54.2% of trials. The latter two transforms were each selected significantly more often (p < 0.01) than the figure–ground reversal (only 19.7%). Moreover, the figure–ground reversal was not selected any more often than a shape with an entirely different contour (20.3%), and most selections for either of these two types arose when they were the only two alternatives presented (see Methods). These data confirm that human perception of shape generalizes well across contrast reversal, and fairly well across mirror reversal. In contrast, figure–ground reversal alters shape perception as much as the generation of a new shape from an entirely different contour. Our findings for the shape preferences of IT cells in the primate brain closely parallel these aspects of human perception.
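The preferred versus non-preferred comparison reported in the Results (Fig. 6) can be sketched for a single cell as follows. This is a minimal illustration, not the analysis code used in the study: the function names and the firing rates are made up, and only the b-to-f pairing for figure–ground reversal is taken from Fig. 1.

```python
def preferred_and_nonpreferred(rates):
    """Return the stimuli with the highest and lowest mean firing rate."""
    preferred = max(rates, key=rates.get)
    nonpreferred = min(rates, key=rates.get)
    return preferred, nonpreferred


def preference_after_transform(rates, partner):
    """Rate difference (preferred minus non-preferred) before and after a transform.

    rates:   dict mapping stimulus label -> mean rate (spikes/s), 100-600 ms window
    partner: dict mapping stimulus label -> label of its transformed version
    """
    pref, nonpref = preferred_and_nonpreferred(rates)
    original = rates[pref] - rates[nonpref]
    transformed = rates[partner[pref]] - rates[partner[nonpref]]
    return original, transformed


# Made-up rates for one hypothetical cell (not data from the paper).
rates = {"1b": 6.0, "2b": 38.0, "1f": 12.0, "2f": 11.0}
# Figure-ground partners: version b pairs with version f (see Fig. 1).
figure_ground_partner = {"1b": "1f", "1f": "1b", "2b": "2f", "2f": "2b"}

print(preference_after_transform(rates, figure_ground_partner))  # (32.0, -1.0)
```

In this toy case a 32 spikes/s preference shrinks to roughly zero after the transform, which is the pattern the population data show for figure–ground reversal but not for contrast or mirror reversal.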

Fig. 5. Correlations of ranked stimulus preferences for each of the three transforms in the cell population. Histograms show the population distributions of Spearman rank-order correlations in firing rate (for 100–600 ms following stimulus onset) between transformed versions of the stimuli. Each bar indicates the number of cells from the population showing a particular size of correlation. Most cells show reliable positive correlations (with 15 degrees of freedom) across the contrast-reversal transform and mirror-reversal transform. Correlations for the figure–ground transform are much lower overall, averaging near zero. For the figure–ground reversal plot, cells that showed significant correlations for both contrast and mirror-image reversal are represented in black; those that did not, in white.

DISCUSSION

The shape preferences of IT cells generalized well across contrast reversals of the stimuli and across mirror imaging of the stimuli, but not across figure–ground reversals. The two-dimensional polygons we used varied in their particular curved contour. The other lines (three straight edges) were held constant in the stimulus sets assessed for each transformation, as we tested for generalization across contrast reversal, mirror imaging or figure–ground reversal. These three transformations have very different influences on the critical curved contour. Contrast reversal changes the polarity of this critical contour. Mirror reversal reflects this contour about the vertical. Only figure–ground reversal leaves the critical curved contour itself unchanged. (Its relative position with respect to the body of the shape changes, but this applies equally to the mirror-reversal transform.) Thus, if the selective responses of IT cells had been caused primarily by just the curved contour that distinguished the various displays physically, then we should have found maximum generalization across figure–ground reversal, as only this keeps the curved contour constant. However, the opposite result was found, with generalization absent only for the figure–ground transform.

This demonstrates that the selectivity of IT responses is not determined simply by the distinctive contours in a display, contrary to simple edge-based models of shape recognition discussed elsewhere (refs. 5, 22). Instead, coding in IT follows principles similar to those observed for human shape judgments. Human observers rate the mirror image of an original figure as more similar to the original display than the original ground (refs. 18, 19), as confirmed here for the displays used in our physiological work. This arises even though the ground shares the same informative contour as the original figure, and hence has the ‘profile’ of the original figure embedded in it as background. We found here that IT cells likewise generalized more strongly across mirror imaging than across figure–ground reversal. Our findings for mirror imaging are consistent with other single-cell evidence; the contrast-reversal findings also agree with previous studies (refs. 8, 23). The additional comparison with figure–ground reversal here reveals that the selective responses of IT neurons correspond with psychological observations (refs. 15–20) on how one-sided assignment of edges to figures constrains human perception of contoured shapes, and accord with human similarity ratings.

A longstanding question in vision research (refs. 14, 15, 18–20) is why figures and their abutting grounds are perceived as so different in shape, despite the common contour. One computational proposal is that the visual system may decompose shapes into convex parts (refs. 16, 19, 24). A convexity in the outline of a figure (for example, the ‘nose’ in the face profile at the top left of Fig. 1) will produce a corresponding concavity in the abutting ground region of the image, and vice-versa. This will lead to different convex parts on either side of a given contour. Our finding that IT neurons were driven by the figural shapes resulting from one-sided edge assignment, not by contours per se, seems consistent with shape representation within IT in terms of the layout of such component parts (refs. 5, 19, 24).

We re-analyzed the data in terms of another transform among the stimuli, to assess this hypothesis further. We correlated stimulus preferences across a transform that can be illustrated with reference to Fig. 1, comparing the lower left stimulus in the front panel (stimulus f, Fig. 1) to the top right stimulus in the back panel (stimulus c, Fig. 1), and so on. We thus compared pairs of stimuli that had the same contrast polarity and faced in the same direction, but had just the curved edge itself (not the shape as a whole) reflected. Thus, after figural assignment, the two members of the pair should have different convex parts. (Indeed, the change in convex parts is the same as the change for a figure–ground reversal, except that the parts now ‘point’ in the same direction.) We found that the (null) correlations across this transform were equivalent to those for our standard figure–ground reversal transform, averaging 0.09 versus 0.04, respectively, with no difference between the population distributions of these correlations.

Taken together, our results accord with theories of object recognition that propose (refs. 15–20, 25, 26) that one-sided edge assignment precedes shape description in the visual system, with decomposition into component parts proceeding only for the figural side of any contour. A rival account proposes instead that part decomposition initially arises for both sides of every contour. The present study finds no support for the latter view at the level of IT responses, as the cells did not respond differentially to the presence of their preferred ‘profile’ in the background to the current stimulus (for example, Fig. 6d). Instead, our results show that shape description in IT cortex is entirely constrained by one-sided assignment of contours to figural objects.

METHODS

Animals and surgery. The experiment was conducted with two male macaque monkeys (Macaca fascicularis, 4.8 and 5.8 kg). With aseptic surgery, we placed a recording chamber and inserted a scleral coil in the left eye. All procedures were approved by the Institutional Animal Care and Use Committee.

Recording techniques. The activity of single neurons was recorded with epoxy-insulated tungsten microelectrodes (FHC, Brunswick, Maine) as the monkey sat in a primate chair, using standard techniques for single-cell recording. Action potentials of single cells were amplified using BAK neurophysiological hardware, passed through a dual-window discriminator, with output TTL pulses timed to a resolution of 0.1 ms by the computer controlling the experiment. Maintenance of fixation was confirmed using the scleral search-coil technique, measuring eye position with an accuracy of 30′ every 16 ms. Data were rejected from trials during which the monkey was not fixating appropriately when the stimulus appeared, or during which eye movements of more than 2° occurred in the first 600 ms following stimulus onset.

X-radiographs were used to locate the position of the microelectrode on each recording track relative to bony landmarks. The position of cells was reconstructed from the X-ray coordinates taken, together with serial 50-µm histological sections showing the micro-lesions made at the end of some of the microelectrode tracks. Recording sites were all located within the lower bank of the superior temporal sulcus and in the adjacent dorsal part of the inferior temporal gyrus. All recording sites were localized within cytoarchitectonic areas TEa, TEm and TE3, as described previously (Fig. 7).

Fig. 7. Regions of IT cortex in which the cells were recorded, drawn on sections from the brain of monkey A. Top, locations of these coronal sections are shown on a schematic monkey brain.

Stimulus presentation and task. The 32 visual stimuli (Fig. 1) were stored digitally on a computer disk, and displayed on a Sony video monitor using a Data Translation video framestore (512 × 480 pixels; Marlboro, Massachusetts). Maximum and minimum luminances on the screen were 5.2 and 0.22 footlamberts, respectively. The exposed shape was either white on a black background, or vice versa (Fig. 1). Each shape averaged 2.8° in width and 3.5° in height, with the center of the curved contour located centrally at fixation.

Before a trial, the whole screen was gray. The screen then went black or white for 1 s, so that the subsequent figural stimulus could later appear against this background with the opposite polarity, which produces entirely unambiguous figural assignment in human observers. This preliminary change to the luminance of the whole screen was unrelated to our comparisons (see also the baseline data, before onset of the experimental stimulus, in Fig. 6). After 1 s, a central fixation dot of opposite polarity to the rest of the screen appeared for 500 ms. The fixation dot was followed by the experimental shape for 1 s, then the screen returned to gray for 3.5 s, before the start of the next trial. The monkeys performed a simple visual task during testing (adapted from ref. 29) to ensure that they fixated the stimuli. If the shape shown centrally was any one of the 32 in the experimental set (a black or white shape, Fig. 1), then the monkeys could obtain a fruit juice reward during its exposure, provided they were fixating within 1° of the central location. If the central shape was a red square (11% of trials, excluded from analyses), then the monkey had to withhold licking to avoid ingesting aversive hypertonic saline. A 0.5-s signal buzzer preceded the presentation of the stimulus. (This sounded concurrently with the central fixation point.) Thus, if the monkey fixated correctly before the stimulus appeared, he had sufficient time to discriminate black or white experimental shapes from the red square, and then obtain fruit juice while it was still available (during the central stimulus).

Before the experiment, the monkeys had been trained on a simple visual discrimination task. They viewed the monitor with a central fixation point, and could lick to obtain fruit juice when a white or black circle was presented centrally for 1 s immediately after the buzzer, but had to withhold a lick when a red square was presented instead. Having mastered this with greater than 95% accuracy, they were trained to maintain central fixation. After several weeks of this training (without any exposure to the experimental stimuli like those in Fig. 1), the experimental trials were run in blocks of 36, each comprising the 32 experimental stimuli, plus 4 trials with red squares, all in random order. For each cell studied, three to seven blocks of trials were run, each with different random orders of stimuli.

Analyses. The two monkeys gave the same pattern of results, and so are considered jointly here. For each trial, the number of action potentials occurring in a 500-ms period starting 100 ms after stimulus onset was initially considered. This period was chosen because most of the neurons studied typically showed vigorous responses to visual stimuli with latencies just above 100 ms, and the monkeys consistently held central fixation for the first 600 ms of stimulation. To test whether a neuron was showing selectivity among the set of 32 shapes, analysis of variance was performed on the response rates to the different stimuli. Only those cells (78/88) that showed a significant effect of stimulus (at p < 0.001 or better) were included in the further analyses (31 from one monkey, 47 from the other), as only these could address our experimental questions.
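
As an illustration only, the spike-count window and the selectivity test described above can be sketched in Python. The function names, the half-open window convention and the toy numbers are assumptions made for this sketch, not the analysis code used in the study. Only the F statistic is computed; obtaining the p value would additionally require the F distribution (for example from a statistics library).

```python
from statistics import mean


def count_in_window(spike_times_ms, start_ms=100.0, duration_ms=500.0):
    """Count spikes with start <= t < start + duration.

    Times are in ms relative to stimulus onset. The half-open interval
    is an assumption of this sketch.
    """
    end_ms = start_ms + duration_ms
    return sum(1 for t in spike_times_ms if start_ms <= t < end_ms)


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA.

    `groups` holds one list of per-trial responses for each stimulus.
    Assumes at least two groups and non-zero within-group variance.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Toy usage: three hypothetical stimuli, three trials each.
f_value = one_way_anova_f([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

In the study the test would run over 32 stimulus groups per cell, with three to seven trials per group depending on how many blocks were recorded.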

For each of our three orthogonal transformations of the stimuli (contrast polarity, mirror-imaging and figure–ground reversal), the set of 32 stimuli can be divided into two subsets of 16, one subset providing transformed versions of each member of the other subset. To calculate the influence of one specific transformation (such as contrast reversal) on the stimulus selectivity of a single cell, we correlated the firing rates to the 16 stimuli in one subset against those for the corresponding members of the other subset (Fig. 2; data from one illustrative cell), using Spearman’s rank-order correlation. This was initially done for firing rate across the 100–600-ms time bin following stimulus onset, for every cell (Fig. 5).
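
A minimal sketch of this pairing and rank correlation, in Python. The key layout `(contour, polarity, mirror, figure_ground)`, the function names and the handling of ties by average ranks are assumptions for illustration; the text specifies only that Spearman's rank-order correlation was applied to the 16 stimulus pairs defined by each transform.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def correlation_across_transform(rates, axis):
    """Correlate responses to each stimulus with its transformed partner.

    `rates` maps (contour, polarity, mirror, figure_ground) to a firing
    rate, with the last three entries coded 0 or 1 (hypothetical coding).
    `axis` selects the transform: 1 polarity, 2 mirror, 3 figure-ground.
    """
    x, y = [], []
    for key in sorted(rates):
        if key[axis] == 0:
            partner = key[:axis] + (1,) + key[axis + 1:]
            x.append(rates[key])
            y.append(rates[partner])
    return spearman(x, y)
```

With 4 contours and 3 binary factors the dictionary holds 32 entries, so each call correlates 16 pairs, matching the split described in the text.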

To see how the correlations (and thus the generalization of stimulus selectivity across a particular transform) developed over time, we next calculated the correlation coefficients for time bins of increasing extent. These were calculated for spikes in response to each stimulus in the first 20 ms, then the first 40 ms, and so on, up to 500 ms after the stimulus period. Average correlations climbed rapidly to an asymptote around 200–300 ms after stimulus onset for contrast and mirror-image transforms, but remained near zero throughout the trial for figure–ground reversal.
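
The growing time bins can be sketched as cumulative spike counts, here in Python. The function name and the `start_ms` parameter are assumptions of this sketch: the text does not state the reference point of the first 20-ms bin unambiguously, so it is left adjustable.

```python
def cumulative_counts(spike_times_ms, start_ms=0.0, step_ms=20.0, max_ms=500.0):
    """Spike counts in windows of increasing extent.

    Returns a list of (window_end_ms, count) for the windows
    [start, start + 20), [start, start + 40), ... up to start + max_ms.
    The reference point `start_ms` is an assumption of this sketch.
    """
    n_steps = int(round(max_ms / step_ms))
    result = []
    for k in range(1, n_steps + 1):
        end_ms = start_ms + k * step_ms
        count = sum(1 for t in spike_times_ms if start_ms <= t < end_ms)
        result.append((end_ms, count))
    return result


# Toy usage with five hypothetical spike times (ms).
windows = cumulative_counts([5, 25, 30, 450, 520])
```

For each window end, the per-stimulus counts could then be passed to a rank correlation across a given transform, giving one coefficient per cell and per window.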

As another way to study the effects of stimulus transforms on the stimulus selectivity of the cell population, we examined how the difference in responses to the preferred and the non-preferred stimulus developed over time. This was done for each cell by calculating the response to its optimal stimulus in successive 20-ms time bins. A similar time course of firing rate was then calculated for the ‘non-preferred’ stimulus for each cell. These values were then averaged across the population of 78 cells to produce the diagram shown in Fig. 6a. Analogous procedures were used to plot the responses to contrast-reversed versions of the same two stimuli (Fig. 6b), mirror-reversed versions (Fig. 6c) or figure–ground reversed versions (Fig. 6d).
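
A hedged Python sketch of this time-course calculation follows. All names are hypothetical, and taking the stimulus with the lowest mean rate as the 'non-preferred' stimulus is an assumption; this passage does not state the selection criterion.

```python
def preferred_and_non_preferred(mean_rates):
    """Return (stimulus with highest mean rate, stimulus with lowest).

    Using the minimum as the 'non-preferred' stimulus is an assumption.
    """
    best = max(mean_rates, key=mean_rates.get)
    worst = min(mean_rates, key=mean_rates.get)
    return best, worst


def binned_rate(trials, bin_ms=20.0, n_bins=25):
    """Mean firing rate (spikes/s) in successive bins from stimulus onset.

    `trials` is a non-empty list of spike-time lists (ms after onset).
    """
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            b = int(t // bin_ms)
            if 0 <= b < n_bins:
                counts[b] += 1
    scale = 1000.0 / (bin_ms * len(trials))
    return [c * scale for c in counts]


def population_average(per_cell_courses):
    """Average equal-length time courses across cells, bin by bin."""
    n_cells = len(per_cell_courses)
    return [sum(column) / n_cells for column in zip(*per_cell_courses)]
```

Applying `binned_rate` to the preferred and non-preferred stimulus of every cell and then `population_average` to each set would give two curves of the kind plotted in each panel of Fig. 6.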

For the human shape-judgment task, observers were presented with a sample shape for 400 ms, and two test stimuli were then added to the display at bottom left and bottom right. They were asked to judge which of these two test stimuli was more similar in shape to the sample. The test stimuli were two different shapes drawn with equal probability without replacement from the following set: the original shape, a contrast reversal, a mirror reversal, a figure–ground reversal and a shape with a different contour. Observers indicated by pressing a left or right key which test shape was more like the sample shape. The 12 repetitions of each of 20 possible permutations of test pairings were averaged together to generate overall preferences for each type of transform when presented at test.
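
The count of 20 follows from choosing an ordered (left, right) pair of different shapes from the five test types, 5 × 4 = 20, so 12 repetitions give 240 trials per sample. A small Python sketch makes this concrete; the names and the scoring rule (fraction of presentations on which a type was chosen) are assumptions for illustration, not the authors' procedure.

```python
from itertools import permutations

TEST_TYPES = [
    "original",
    "contrast reversal",
    "mirror reversal",
    "figure-ground reversal",
    "different contour",
]
REPETITIONS = 12


def ordered_pairings():
    """All (left, right) pairings of two different test types."""
    return list(permutations(TEST_TYPES, 2))


def preference_scores(choices):
    """Fraction of presentations on which each test type was chosen.

    `choices` is a list of (left, right, chosen) tuples. Assumes every
    test type was shown at least once.
    """
    shown = {t: 0 for t in TEST_TYPES}
    picked = {t: 0 for t in TEST_TYPES}
    for left, right, chosen in choices:
        shown[left] += 1
        shown[right] += 1
        picked[chosen] += 1
    return {t: picked[t] / shown[t] for t in TEST_TYPES}
```

Each test type appears in 8 of the 20 pairings (4 on the left, 4 on the right), so each is shown 96 times across the 12 repetitions.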

ACKNOWLEDGEMENTS

G.C.B. was supported by grants from the National Institutes of Health (R29 NS27296) and the National Science Foundation (SBR 96-16555). J.D. was supported by the Biotechnology and Biological Sciences Research Council (UK).

RECEIVED 27 APRIL; ACCEPTED 30 JULY 2001

1. Baylis, G. C., Rolls, E. T. & Leonard, C. M. Functional subdivisions of the temporal lobe neocortex. J. Neurosci. 7, 330–342 (1987).
2. Desimone, R., Schein, S. J., Moran, J. & Ungerleider, L. G. Contour, color and shape analysis beyond the striate cortex. Vision Res. 24, 441–452 (1985).
3. DiCarlo, J. J. & Maunsell, J. H. R. Form representation in monkey inferotemporal cortex is virtually unaltered by free viewing. Nat. Neurosci. 3, 814–821 (2000).
4. Logothetis, N. K., Pauls, J. & Poggio, T. Shape representation in the inferior temporal cortex of monkeys. Curr. Biol. 5, 552–563 (1995).
5. Riesenhuber, M. & Poggio, T. Nat. Neurosci. 3, 1199–1204 (2000).
6. Rollenhagen, J. E. & Olson, C. R. Mirror-image confusion in single neurons of the macaque inferotemporal cortex. Science 287, 1506–1508 (2000).
7. Rolls, E. T., Judge, S. J. & Sanghera, M. K. Activity of neurons in the inferotemporal cortex of the alert monkey. Brain Res. 130, 229–238 (1977).
8. Sary, G., Vogels, R. & Orban, G. Cue-invariant shape selectivity of macaque inferior temporal neurons. Science 260, 995–997 (1993).
9. Tanaka, K. Inferotemporal cortex and object vision. Annu. Rev. Neurosci. 19, 109–139 (1996).
10. Malach, R. et al. Object-related activity revealed by functional magnetic resonance imaging in human occipital cortex. Proc. Natl. Acad. Sci. USA 92, 8135–8139 (1995).
11. Farah, M. J. & Aguirre, G. K. Imaging visual recognition: PET and fMRI studies of functional anatomy of human visual recognition. Trends Cogn. Sci. 3, 179–185 (1999).
12. Farah, M. J. Visual Agnosia (MIT Press, Cambridge, Massachusetts, 1990).
13. Plaut, D. C. & Farah, M. J. Visual object representation: Interpreting neurophysiological data within a computational framework. J. Cogn. Neurosci. 2, 320–343 (1990).
14. Rubin, E. Visuell Wahrgenommene Figuren (Gyldendalske Boghandel, Copenhagen, Denmark, 1915).
15. Baylis, G. C. & Driver, J. One-sided edge-assignment in vision: 1. Figure-ground segmentation and attention to objects. Curr. Dir. Psychol. Sci. 4, 140–146 (1995).
16. Driver, J. & Baylis, G. C. One-sided edge-assignment in vision: 2. Part decomposition, shape description, and attention to objects. Curr. Dir. Psychol. Sci. 4, 201–206 (1995).
17. Driver, J. & Baylis, G. C. Edge-assignment and figure-ground segmentation in short-term visual matching. Cognit. Psychol. 31, 248–306 (1996).
18. Baylis, G. C. & Cale, E. The figure has a shape, but the ground does not: Evidence from covert testing of shape recognition. J. Exp. Psychol. Hum. Percept. Perform. 27, 633–643 (2001).
19. Hoffman, D. D. & Richards, W. A. Parts of recognition. Cognition 18, 65–96 (1984).
20. Baylis, G. C. & Driver, J. Obligatory edge-assignment in vision: the role of figure and part segmentation in symmetry detection. J. Exp. Psychol. Hum. Percept. Perform. 6, 1323–1342 (1995).
21. Seltzer, B. & Pandya, D. N. Afferent cortical connections and architectonics of the superior temporal sulcus and surrounding cortex in the rhesus monkey. Brain Res. 149, 1–24 (1978).
22. Pinker, S. Visual cognition: an introduction. Cognition 18, 1–64 (1984).
23. Rolls, E. T. & Baylis, G. C. Size and contrast have only small effects on the responses to faces of neurons in the cortex of the superior temporal sulcus of the monkey. Exp. Brain Res. 65, 38–48 (1986).
24. Biederman, I. Recognition-by-components: a theory of human image understanding. Psychol. Rev. 94, 115–147 (1987).
25. Nakayama, K., Shimojo, S. & Silverman, G. H. Stereoscopic depth: its relation to image segmentation, grouping, and the recognition of occluded objects. Perception 18, 55–68 (1989).
26. Palmer, S. & Rock, I. Rethinking perceptual organisation: the role of uniform connectedness. Psychon. Bull. Rev. 1, 29–55 (1994).
27. Peterson, M. A. Object recognition processes can and do operate before figure-ground organisation. Curr. Dir. Psychol. Sci. 3, 105–111 (1994).
28. Robinson, D. A. A method of measuring eye-movements using a scleral search coil in a magnetic field. IEEE Trans. Biomed. Eng. 101, 131–145 (1963).
29. Baylis, G. C., Rolls, E. T. & Leonard, C. M. Selectivity between faces in the responses of a population of neurons in the cortex of the superior temporal sulcus of the monkey. Brain Res. 342, 91–102 (1985).


