brief communications
Early binding of feature pairs for visual perception

Alex O. Holcombe and Patrick Cavanagh

Vision Sciences Laboratory, Department of Psychology, Harvard University, 33 Kirkland Street, Cambridge, Massachusetts 02138, USA

Correspondence should be addressed to A.O.H. (aholcombe@psy.ucsd.edu)

If features such as color and orientation are processed separately by the brain at early stages1,2, how does the brain subsequently match the correct color and orientation? We found that spatially superposed pairings of orientation with either color or luminance could be reported even for extremely high rates of presentation, which suggests that these features are coded in combination explicitly by early stages, thus eliminating the need for any subsequent binding of information. In contrast, reporting the pairing of spatially separated features required rates an order of magnitude slower, suggesting that perceiving these pairs requires binding at a slow, attentional stage.

To determine the temporal resolution of the perception of feature pairs, we combined color and orientation features, either spatially superimposed or spatially separated (Fig. 1). In the superimposed condition, half the trials consisted of a display alternating between a red patch tilted leftward and a green patch tilted rightward (sinusoidal gratings tilted 45°, presented in a semicircular window). In the other trials, the pairing of color and orientation was reversed. Observers reported whether the red was paired with rightward or with leftward tilt.

Observers fixated 0.2° away from the straight edge of the semicircular windowed grating, which was 8.5° in diameter. The red and green stimuli were set to equiluminance separately for each observer at each temporal frequency with a minimum motion method3. Following this setting, the red and green 0.59 cycles/degree gratings had peaks of equal luminance and troughs of equal luminance, and the luminance of the peak and trough of each cycle differed by about 38 cd/m². The trough of the mean red grating was 16.4 cd/m², CIE (x = 0.59, y = 0.35), and its peak was 54.2 cd/m² (x = 0.41, y = 0.34). The mean green grating had a trough of 17.4 cd/m² (x = 0.30, y = 0.55) and a peak of 56 cd/m² (x = 0.32, y = 0.39). These values ensured that the sum of the bright red and dark green gave the same yellow as the sum of the dark red and bright green, thus camouflaging the orientations of the green and red gratings in the sum. Perfect camouflage in the sum was verified with each observer and, for one observer (AH), was tested over a range of contrasts bracketing the critical values to determine the sensitivity to possible unintended display deviations. These control data showed that a contrast deviation of greater than 5% would be required to bring performance to the 75% threshold, whereas the computed value fell within 2% of the optimal value (the value producing chance performance).

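To make the camouflage constraint concrete, the following sketch (ours, not from the study) plugs the mean values quoted above into the standard CIE xyY-to-XYZ conversion and compares the two possible sums. The function names are our own; with the per-observer settings, the two sums were equated exactly.

```python
# A minimal sketch, assuming the quoted mean settings; the actual displays
# were calibrated separately for each observer and temporal frequency.

def xyY_to_XYZ(x, y, Y):
    """Convert CIE xyY chromaticity + luminance to XYZ tristimulus values."""
    return (x * Y / y, Y, (1.0 - x - y) * Y / y)

def vec_sum(a, b):
    return tuple(p + q for p, q in zip(a, b))

red_trough   = xyY_to_XYZ(0.59, 0.35, 16.4)   # dark red
red_peak     = xyY_to_XYZ(0.41, 0.34, 54.2)   # bright red
green_trough = xyY_to_XYZ(0.30, 0.55, 17.4)   # dark green
green_peak   = xyY_to_XYZ(0.32, 0.39, 56.0)   # bright green

# Over one cycle, pairing A sums bright red with dark green, while pairing B
# sums dark red with bright green. Camouflage requires the same yellow sum.
sum_A = vec_sum(red_peak, green_trough)
sum_B = vec_sum(red_trough, green_peak)
print(sum_A)  # ~(74.9, 71.6, 44.6)
print(sum_B)  # ~(73.6, 72.4, 44.5) -- nearly identical; per-observer
              # settings brought the two sums into exact agreement
```
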
The first and last pairings in the sequence were masked; the trial began with extremely rapid presentation of the stimuli, and gradually slowed to the intended temporal frequency (a few hundred milliseconds), after which the presentation rate gradually increased, ending the trial with a postmask.

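A sketch of how such a masking schedule might be generated follows. The paper specifies only that the transition took a few hundred milliseconds; the function name and all parameter values (mask_hz, ramp_steps, test_steps) are our assumptions.

```python
import numpy as np

def ramp_schedule(test_hz, mask_hz=60.0, ramp_steps=8, test_steps=20):
    """Per-pairing durations (s): fast premask -> test rate -> fast postmask."""
    slow_down = np.linspace(mask_hz, test_hz, ramp_steps)  # premask ramp
    hold      = np.full(test_steps, test_hz)               # judged portion
    speed_up  = np.linspace(test_hz, mask_hz, ramp_steps)  # postmask ramp
    return 1.0 / np.concatenate([slow_down, hold, speed_up])

durations = ramp_schedule(test_hz=18.8)  # e.g., the mean color-condition threshold
```
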
In the spatially separated condition, the same pairings were used, but the color was presented as a uniform patch of saturated red or green, and the orientation was presented as an adjacent achromatic patch of tilted bars. Another block of the same experiment tested judgments of brightness and orientation pairings. A dark (30 cd/m²) semicircular windowed grating alternated with a bright (86 cd/m²) grating on a noise background. The difference between the peak and trough of both gratings was always 55 cd/m², which ensured that the sum of the two gratings was the same regardless of the brightness–orientation pairing.

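A quick way to see why the fixed 55 cd/m² modulation camouflages the brightness–orientation pairing is to sum the two frames directly. In this sketch (ours; it treats the quoted 30 and 86 cd/m² values as mean luminances, which is our assumption, and uses an arbitrary spatial scale), the summed image is identical whichever orientation carries the bright mean:

```python
import numpy as np

# Orthogonal sinusoidal gratings (arbitrary spatial scale).
x = np.linspace(0.0, 2.0 * np.pi, 256)
xx, yy = np.meshgrid(x, x)
left_tilt  = np.sin(xx + yy)
right_tilt = np.sin(xx - yy)

amp = 55.0 / 2.0  # half the fixed 55 cd/m^2 peak-trough modulation

# Pairing A: dark left-tilted frame, then bright right-tilted frame.
# Pairing B: dark right-tilted frame, then bright left-tilted frame.
sum_A = (30.0 + amp * left_tilt) + (86.0 + amp * right_tilt)
sum_B = (30.0 + amp * right_tilt) + (86.0 + amp * left_tilt)

assert np.allclose(sum_A, sum_B)  # summed frames identical for both pairings
```
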
The critical rates for threshold accuracy (75%) in reporting the pairings were slower than 3 Hz for each observer in the spatially separated condition (Fig. 1). In the spatially superimposed condition, thresholds were nearly ten times better. The average for four observers in the color condition was 18.8 Hz, and the average in the brightness condition was a remarkable 35.5 Hz. Because each alternation cycle contained both pairings, this latter value corresponded to only 14 ms for each feature pair (1/(2 × 35.5 Hz) ≈ 14 ms).

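The paper does not describe how the 75% critical rates were derived; one standard approach would be to fit a psychometric function to accuracy as a function of alternation rate and read off the rate giving 75% correct. A hypothetical sketch (the rates and accuracies below are invented for illustration, as is the parameterization):

```python
import numpy as np
from scipy.optimize import curve_fit

def psychometric(rate_hz, midpoint, slope):
    """Accuracy falls from 1.0 toward chance (0.5) as the rate rises."""
    return 0.5 + 0.5 / (1.0 + np.exp(slope * (rate_hz - midpoint)))

rates = np.array([2.5, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0])      # invented
acc   = np.array([1.0, 0.98, 0.95, 0.85, 0.70, 0.60, 0.55])     # invented

(midpoint, slope), _ = curve_fit(psychometric, rates, acc, p0=[15.0, 0.3])

# With this parameterization, 75% correct falls exactly at the midpoint.
critical_rate_hz = midpoint
print(round(critical_rate_hz, 1))
```
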
Several studies suggest that features are more likely to be processed together if they form part of a single object or group4. The critical rates for the spatially separated conditions may have been slower than rates for the superimposed conditions because the separated features appeared to be part of separate objects or groups. To test this possibility, we devised several variations that grouped the features together, based on displays shown to have effects in earlier reaction time studies4. However, even when the separated features were linked into a common object or surface, critical rates remained low (Fig. 2). These results do not rule out the possibility that an object context could provide a threshold advantage in the region of a dozen milliseconds, but any possible advantage (statistically insignificant in this study) was small compared to that afforded by superposition (300 ms).

Fig. 1. The critical rates for 75% accuracy in pairing spatially superimposed features (depicted by the second and fourth icons along the horizontal axis) are nearly ten times faster than rates for pairing spatially separated features (first and third icons). Each bar represents the mean of the same four observers. Small vertical bars, 1 s.e.m.

When the superimposed features were alternated at rates above approximately 10 Hz, observers reported that they did not experience the brief individual stimulus presentations separately. Instead, some observers reported that the two feature pairings slowly alternated in their awareness, or that one dominated. Others said that both seemed available simultaneously, as if they had been presented transparently. Still, underlying this percept is a high-speed process able to read out the combined values of color (or brightness) and orientation within the brief presentation of a single pairing. If this process were not able to resolve each brief presentation, it would be faced with the patterns summed across two or more intervals. Because of the way the stimuli were constructed, the pairings were camouflaged when the intervals were summed (red right plus green left is indistinguishable from red left plus green right). Although the alternate pairings had to be read out separately in the consecutive intervals, the experience of each pairing extended over time, as reflected in the reports of transparency and slow rivalry. It seems that subsequent processes before awareness integrate the paired representations over a relatively long interval.

The high pairing rates for spatially superimposed feature pairs suggest, among other possibilities, that some features may be assessed in combination from early levels. In this case, there is no binding problem. For example, nonlinear ON- and OFF-channels as early as retinal ganglion cells separate stimuli brighter and darker than the background, even at high flicker rates5,6. Once the bright and dark stimuli are represented in separate channels so that they do not cancel, orientation analysis can proceed independently of the flicker. In the case of our color–orientation pairings, color-opponent cells may have a similar involvement by separating the red and green stimuli, even at high presentation rates7. In contrast, in the case of the spatially separated condition, it is unlikely that any one cell is selective for the orientation at one location and the brightness or color at another. Judging the pairings of spatially separate features would therefore require a combination of the responses of disparate neurons selected by later, slower and attentive stages8,9.

ACKNOWLEDGEMENTS

This work was supported by an NEI NRSA graduate fellowship to A.O.H. and EY09258 to P.C.

RECEIVED 24 AUGUST; ACCEPTED 12 DECEMBER 2000

1. Zeki, S. M. Nature 274, 423–428 (1978).
2. Treisman, A. & Gelade, G. Cognit. Psychol. 12, 97–136 (1980).
3. Cavanagh, P., MacLeod, D. I. A. & Anstis, S. M. J. Opt. Soc. Am. A 4, 1428–1438 (1987).
4. Scholl, B. Cognition (in press).
5. Kuffler, S. W. J. Neurophysiol. 16, 37–68 (1953).
6. Lee, B. B., Martin, P. R. & Valberg, A. J. Physiol. (Lond.) 414, 245–263 (1989).
7. Gur, M. & Snodderly, M. Vision Res. 37, 377–382 (1997).
8. Verstraten, F. A. J., Cavanagh, P. & Labianca, A. Vision Res. 40, 3651–3664 (2000).
9. Lu, Z. & Sperling, G. Vision Res. 35, 2697–2722 (1995).

Fig. 2. The critical rates for 75% accuracy do not differ significantly when spatially separated features are linked to form a group or object (white bars, unlinked versions as depicted by the left icon of each pair; black bars, linked versions as depicted by the right icon of each pair). Small vertical bars, 1 s.e.m. Averages of data from 25 observers are shown, except for the middle conditions, which show the average of data from 7 observers. The feature pairings of the dumbbell and lozenge-shaped stimuli alternated between combinations of light or dark on one end and left or right tilt on the other. The feature patches of the dumbbell were sharp-edged as depicted, whereas those of the lozenge stimulus were smoothly graded into their surround. The left edge of the wedge stimulus alternated between right- and left-tilted, whereas the right region graded smoothly from the gray to, alternately, either red or green.