Article

Abstract

A general explanation for the observer's ability to judge the mean size of simple geometrical figures, such as circles, was advanced. Results indicated that, contrary to what would be predicted by statistical averaging, the precision of mean size perception decreases with the number of judged elements. Since mean size discrimination was insensitive to how total size differences were distributed among individual elements, the observer appears to have only limited cognitive access to the sizes of individual elements, which are pooled together in a compulsory manner before size information reaches awareness. Confirming the associative law of addition, observers are indeed sensitive to the mean, not to the sizes of individual elements. All existing data can be explained by an almost general theory, namely the Noise and Selection (N&S) Theory, formulated in exact quantitative terms and implementing two familiar psychophysical principles: the size of an element cannot be measured with absolute accuracy, and only a limited number of elements can be taken into account in the computation of the average size. It was concluded that the computation of ensemble characteristics is not necessarily a tool for surpassing the capacity limitations of perceptual processing.


... They are also formed for high-level features such as facial expression (Haberman & Whitney, 2007), identity (Roberts et al., 2019), lifelikeness (Yamanashi Leib et al., 2016), and economic value (Yamanashi Leib et al., 2020). They are formed more accurately if included items are similar to each other (Ariely, 2001;Corbett et al., 2012;Dakin, 2001;Im & Halberda, 2013;Maule & Franklin, 2015;Solomon, 2010;Solomon et al., 2011;Sweeny et al., 2013;Utochkin & Tiurina, 2014) and if more items are included during averaging due to noise cancelation (Allik et al., 2013;Baek & Chong, 2020a;Brezis et al., 2018;Haberman & Whitney, 2010;Lee et al., 2016;Parkes et al., 2001;Robitaille & Harris, 2011;Solomon, 2010;Solomon et al., 2011). 1 Ensemble representations are ubiquitous and useful for many important visual functions. They can be used for learning and identifying categories (Khayat & Hochstein, 2019;Utochkin, 2015). ...
... Existing observer models of ensemble perception provide strong quantitative predictions that fit the performance of human observers well in some ensemble tasks. Most of the models account for averaging (Allik et al., 2013;Baek & Chong, 2020a;Parkes et al., 2001;Solomon et al., 2011), and only a few of them go further to account for variability perception (e.g., Morgan et al., 2008;Solomon, 2010). These models are focused on quantifying observer inefficiencies in a set of parameters that could account for the patterns of performance found in experimental data. ...
... These models are focused on quantifying observer inefficiencies in a set of parameters that could account for the patterns of performance found in experimental data. For example, the precision of average computation was shown to depend on limits imposed by the early noise involved with individual representations (Allik et al., 2013;Baek & Chong, 2020a;Im & Halberda, 2013;Parkes et al., 2001;Solomon et al., 2011), the number of items sampled from the entire set that can be sufficient to accomplish the observed error magnitude given the set distribution (Allik et al., 2013;Im & Halberda, 2013;Parkes et al., 2001;Solomon et al., 2011), late noise involved after sampling information about individual items and applied to readout ensemble summaries from this sample (Baek & Chong, 2020a;Parkes et al., 2001;Solomon et al., 2011), and distributed attention (Baek & Chong, 2020a). Whereas these models suggest accurate quantitative predictions about average and variance discrimination (Allik et al., 2013;Baek & Chong, 2020a;Parkes et al., 2001;Solomon, 2010;Solomon et al., 2011), their computational algorithms should not necessarily be taken for the mechanism of ensemble representation and their parameter estimates should not be taken to reflect independent representational states within that mechanism. ...
Article
Full-text available
Ensemble representations have been considered as one of the strategies that the visual system adopts to cope with its limited capacity. Thus, they include various statistical summaries such as mean, variance, and distributional properties and are formed over many stages of visual processing. The present study proposes a population-coding model of ensemble perception to provide a theoretical and computational framework for these various facets of ensemble perception. The proposed model consists of a simple feature layer and a pooling layer. We assumed ensemble representations as population responses in the pooling layer and decoded various statistical properties from population responses. Our model successfully predicted averaging performance in orientation, size, color, and motion direction across different tasks. Furthermore, it predicted variance discrimination performance and the priming effects of feature distributions. Finally, it explained the well-known variance and set-size effects and has a potential for explaining the adaptation and clustering effects.
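A rough sense of how such a two-layer pooling model works can be conveyed with a small Python sketch. This is only an illustration of the general scheme (Gaussian tuning curves, summed population responses, centroid read-out); the channel count, tuning width, and noise level are invented and it is not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Feature layer: channels with Gaussian tuning over a size axis (arbitrary units).
preferred = np.linspace(0.0, 1.0, 64)   # preferred sizes of the 64 channels
tuning_sd = 0.08                         # tuning width (made-up value)

def channel_responses(size):
    """Noisy response of every feature channel to one item of a given size."""
    mean_resp = np.exp(-0.5 * ((size - preferred) / tuning_sd) ** 2)
    return mean_resp + rng.normal(0.0, 0.05, preferred.size)   # early noise

def pooled_mean_estimate(sizes):
    """Pooling layer: sum the channel responses over items and read the mean
    out as the response-weighted centroid of the channels' preferred sizes."""
    pooled = np.sum([channel_responses(s) for s in sizes], axis=0)
    pooled = np.clip(pooled, 0.0, None)   # ignore negative (noise-driven) activity
    return np.sum(pooled * preferred) / np.sum(pooled)

item_sizes = rng.uniform(0.3, 0.7, size=8)   # one ensemble of eight item sizes
print(f"true mean {item_sizes.mean():.3f}  decoded mean {pooled_mean_estimate(item_sizes):.3f}")
```

Other summary statistics, such as variance, would be read out in the same spirit from the spread of activity across channels in the pooled response.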
... For example, it was observed that the accuracy of the mean size discrimination remained approximately constant with the increase of the set size from 4 to 16 (Ariely, 2001, Fig. 4). Subsequent studies confirmed the observation that the accuracy of the mean size discrimination is typically independent of the number of elements in the set (Allik et al., 2013;Chong & Treisman, 2005b). This independence was considered as evidence that the number of elements does not affect the accuracy with which the mean value of a perceptual attribute can be determined (Alvarez, 2011;Ariely, 2001;Chong & Treisman, 2005b). ...
... This independence was considered as evidence that the number of elements does not affect the accuracy with which the mean value of a perceptual attribute can be determined (Alvarez, 2011;Ariely, 2001;Chong & Treisman, 2005b). However, this was a mistake, because simple statistical considerations lead to the conclusion that if the information from N elements is pooled together arithmetically, the accuracy is expected to increase in proportion to the square root of N (Allik et al., 2013;Fouriezos et al., 2008;Sorkin et al., 1991). Thus, a nearly constant discrimination precision demonstrates, in fact, a significant drop in the accuracy with which each additional element is processed. ...
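The square-root argument is easy to verify with a short simulation. The sketch below is a toy ideal observer, not a model from any of the cited papers, and the per-item noise level is arbitrary; it simply shows that if N noisy item representations were pooled arithmetically, the standard deviation of the mean estimate would shrink as sigma/sqrt(N), so discrimination precision should improve with set size rather than stay constant:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.15        # assumed representational noise per item (arbitrary units)
true_mean = 1.0

for n in (1, 2, 4, 8, 16):
    # Encode n items with independent noise, average them, and repeat many times.
    estimates = rng.normal(true_mean, sigma, size=(20000, n)).mean(axis=1)
    print(f"N={n:2d}  SD of mean estimate = {estimates.std():.4f}  "
          f"(prediction sigma/sqrt(N) = {sigma / np.sqrt(n):.4f})")
```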
... The observed discrimination precision of the mean value can be obtained assuming that only two to three elements out of N are attended, and information recorded from them is pooled together for finding their mean size (Myczek & Simons, 2008). However, inattentional feature blindness cannot be separated from the representational noise based on the discrimination precision alone (Allik et al., 2013). Because these two factors have a trade-off effect on the discrimination precision, the slope of the psychometric discrimination function or any of its characteristics is an ambiguous indicator (Allik et al., 2013). ...
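That trade-off can be stated in one line: if K of the elements are sampled and each carries independent noise sigma, the mean estimate has a standard deviation of roughly sigma/sqrt(K), so very different combinations of noise and sampling predict the same psychometric slope. A hypothetical illustration (the numbers are not estimates from any study):

```python
import numpy as np

def mean_estimate_sd(sigma, k):
    """SD of an arithmetic mean when k items, each with noise sigma, are pooled."""
    return sigma / np.sqrt(k)

# Two very different accounts predict the same observable precision:
print(mean_estimate_sd(sigma=0.10, k=8))   # noisier items, all 8 elements pooled
print(mean_estimate_sd(sigma=0.05, k=2))   # cleaner items, only 2 elements attended
# Both print ~0.0354, so the discrimination slope alone cannot tell them apart.
```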
Article
Visual perception is capable of pooling multiple local orientation signals into a single, more accurate summary orientation. However, there is still a lack of systematic inquiry into which summary statistics are implemented in that process. Here, the task was to recognize in which direction, clockwise or counter-clockwise, the mean orientation of a set of randomly distributed Gabor patches (N = 1, 2, 4, and 8) was rotated from the implicit vertical. The mean orientation discrimination accuracy did not improve with the number N of elements in proportion to the square root of N, as could be expected if noisy internal representations were arithmetically averaged. The Proportion of Informative Elements (PIE), defined as the percentage of elements having an orientation different from the vertical, also affected the discrimination precision, violating the arithmetic averaging rules. The decrease in the orientation discrimination precision with the increase of the PIE suggests that the orientation pooling could be more adequately described by a quadratic or higher power mean. Thus, we parameterized the averaging process in terms of the power parameter of the generalized mean formula. The results indicate that different pooling rules may apply in different trials, for example, the arithmetic mean in some and the maximal deviation rule in others. It is concluded that pooling of orientation information is a relatively inaccurate process for which different perceptual cues and their combination rules can be used.
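For concreteness, the generalized (power) mean that the abstract refers to is M_p = (mean(x^p))^(1/p) for non-negative values x. A minimal sketch, assuming the pooled quantities are absolute orientation deviations from vertical (which sidesteps circularity) and using made-up example values:

```python
import numpy as np

def power_mean(values, p):
    """Generalized (power) mean of non-negative values: (mean(v**p))**(1/p)."""
    v = np.asarray(values, dtype=float)
    return (np.mean(v ** p)) ** (1.0 / p)

# Absolute deviations (deg) of 8 Gabors from vertical -- made-up example values.
deviations = np.array([1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 12.0])

for p in (1, 2, 4, 16):
    print(f"p={p:2d}  pooled deviation = {power_mean(deviations, p):.2f} deg")
# p=1 gives the arithmetic mean (3.5 deg); as p grows the summary is increasingly
# dominated by the largest deviation (12 deg), approximating a max-deviation rule.
```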
... They are also formed for high-level features such as facial expression (Haberman & Whitney, 2007), identity (Roberts et al., 2019), lifelikeness (Yamanashi Leib et al., 2016), and economic value (Yamanashi Leib et al., 2020). They are formed more accurately if included items are similar to each other (Ariely, 2001;Corbett et al., 2012;Dakin, 2001;Im & Halberda, 2013;Maule & Franklin, 2015;Sweeny et al., 2013;Solomon, 2010;Solomon et al., 2011;Utochkin & Tiurina, 2014) and if more items are included during averaging due to noise cancellation (Allik et al., 2013;Baek & Chong, 2020a;Brezis et al., 2018;Haberman & Whitney, 2010;Lee et al., 2016;Parkes et al., 2001;Robitaille & Harris, 2011;Solomon, 2010;Solomon et al., 2011) 1 . ...
... Existing observer models of ensemble perception provide strong quantitative predictions that fit the performance of human observers well in some ensemble tasks. Most of the models account for averaging (Allik et al., 2013;Baek & Chong, 2020a;Parkes et al. 2001;Solomon et al., 2011), and only a few of them go further to account for variability perception (e.g., Morgan et al., 2008;Solomon, 2010). These models are focused on quantifying parameters of internal representations that underlie the observed performance. ...
... These models are focused on quantifying parameters of internal representations that underlie the observed performance. For example, the precision of average computation was shown to depend on early noise involved with individual representations, the number of averaged items (Allik et al., 2013;Baek & Chong, 2020a;Parkes et al. 2001;Solomon et al., 2011), late noise involved with average computation (Baek & Chong, 2020a;Parkes et al. 2001;Solomon et al., 2011), and distributed attention (Baek & Chong, 2020a). ...
Preprint
Full-text available
Ensemble representations have been considered as one of the strategies that the visual system adopts to cope with its limited capacity. Thus, they include various statistical summaries such as mean, variance, and distributional properties and are formed over many stages of visual processing. The current study proposes a population coding model of ensemble perception to provide a theoretical and computational framework for these various facets of ensemble perception. The proposed model consists of a simple feature layer and a pooling layer. We assumed ensemble representations as population responses in the pooling layer and decoded various statistical properties from population responses. Our model successfully predicted averaging performance in orientation, size, color, and motion direction across different tasks. Furthermore, it predicted variance discrimination performance and the priming effects of feature distributions. Finally, it explained the well-known variance and set-size effects and has the potential to explain the adaptation and clustering effects.
... Accounting for this finding, Kanaya et al. (2018) put forth an amplification hypothesis of perceptual averaging, stating that physically salient items (in their case the largest or highest-frequency items) are more heavily weighted than less salient items in the determination of such summary statistics. More specifically, this hypothesis rests on the idea that perceptual averaging occurs via the sampling of just a subset of items (e.g., Allik et al., 2013;Myczek & Simons, 2008) approximately equal to the square root of all items (e.g., Dakin, 2001;Whitney & Yamanashi Leib, 2018), rather than through exhaustive sampling of all items in a set, as others have argued (e.g., Ariely, 2001;Chong & Treisman, 2005a;Chong et al., 2008). ...
... Moreover, with respect to the fate of the less salient (i.e., the non-matching items), our results are consistent with Iakovlev and Utochkin (2020), in that we too demonstrate that while salient items overcontribute to estimations of the mean, such judgments are not based solely on these items. Indeed, while we cannot directly speak to whether individuals employed exhaustive sampling of all presented items (e.g., Ariely, 2001;Chong & Treisman, 2005a;Chong et al., 2008) or instead relied on just a partial sample of the items (e.g., Allik et al., 2013;Dakin, 2001;Myczek & Simons, 2008;Whitney & Yamanashi Leib, 2018), it is worth noting that to achieve the observed bias of approximately 3º using the latter strategy, on average, individuals would have needed to sample memory-matching items over non-matching items at a rate of 3 to 2 (i.e., sampling 3 items from a distribution centered at 15º and 2 items from a distribution centered at -15º would yield an average value of 3º). ...
... As such, we ultimately arrive at the same conclusion as Iakovlev and Utochkin (2020) in that we come to three possible routes by which memory-driven selection may bias perceptual averaging. First, in line with the amplification hypothesis and partial sampling theories of perceptual averaging more generally (e.g., Allik et al., 2013;Dakin, 2001;Myczek & Simons, 2008;Whitney & Yamanashi Leib, 2018), it is possible that through memory-guided selection, memory-matching items more freely gain access to a privileged sample of items to which perceptual average calculations are based. For example, to achieve the bias of ~3º observed in Experiments 1 and 3 (where color was most relevant to the VWM task), from a partial sampling perspective, this would imply that memory-matching items were sampled over non-matching items at a rate of 3 to 2. Alternatively, the observed bias may instead be accounted for by an exhaustive sampling account of perceptual averaging (e.g., Ariely, 2001;Chong & Treisman, 2005a;Chong et al., 2008). ...
Article
Full-text available
The process by which multiple items within an object grouping are rapidly summarized along a given visual dimension into a single mean value (i.e., perceptual averaging) has increasingly been shown to interact dynamically with visual working memory (VWM). Commonly, this interaction is studied with respect to the influence of perceptual averaging over VWM, but it is also the case that VWM can support perceptual averaging. Here, we argue that, in the presence of memory-matching elements, VWM exerts an obligatory influence over perceptual averaging even when it is detrimental to do so. Over four experiments, we tested our hypothesis by having individuals perform a mean orientation estimation task while concurrently maintaining a colored object in VWM. We anticipated that mean orientation reports would be attracted to the local mean of memory-matching items if such items are prioritized in perceptual average judgments. This was indeed the case as we observed a persistent bias in mean orientation judgments toward the subset mean of items matching the VWM item color, despite color being entirely irrelevant to the mean orientation task. Our results thus highlight a goal-invariant influence of VWM over perceptual averaging, which we attribute to amplification through memory-driven selection. (PsycInfo Database Record (c) 2021 APA, all rights reserved).
... The threshold defined as the slope of the discrimination function increases when the representational noise becomes larger. But it also increases when the fraction of used information becomes smaller, and there is no way to tell which of these two possible factors caused a specific discrimination function slope (Allik et al., 2013). Thus, progress in understanding ensemble perception is in finding a unique indicator of inattentional feature blindness that cannot be confused with representational noise. ...
... Allik et al., 2013, p. 38). This uncertainty led to the presentation of explanations in which representational noise played no significant role, or in which a massive involvement of inattentional blindness was presumed, in addition, of course, to noisy representations (Allik et al., 2013;Dakin, 2001;Solomon et al., 2016;Solomon et al., 2011). In extreme cases, the proposed models assumed that a substantial fraction of elements were missed because of attentional failure. ...
... An indicator by which inattentional feature blindness can be detected without confusion with representational noise would also challenge model builders. It is no longer sufficient to merely demonstrate that a good model fit can be obtained based on the assumption that a considerable number of elements were neglected (e.g., Allik et al., 2013;Dakin, 2001;Solomon & Morgan, 2017;Solomon et al., 2011). More direct tests are needed to demonstrate the degree to which inattentional blindness plays any role in ensemble perception. ...
Article
In ensemble displays, two principal factors determine the precision with which the mean value of some perceptual attribute, such as size and orientation, can be discriminated: inefficiency and representational noise of each element. Inefficiency is mainly caused by biased inference, or by inattentional (feature) blindness (i.e., some elements or their features are not processed). Here, we define inattentional feature blindness as an inability to perceive the value(s) of certain feature(s) of an object while the presence of the object itself may be registered. Separation of the effects of inattentional (feature) blindness and perceptual noise has escaped traditional analytic methods because of their trade-off effects on the slope of the psychometric discrimination function. Here, we propose a method that can separate the effects of inattentional feature blindness from that of the representational noise. The basic idea is to display a set of elements from which only one contains information relevant for solving the task, while all other elements are “dummies” carrying no useful information because they do not differ from the reference. If the single informative element goes unprocessed, the correct answer can only be given by a random guess. The guess rate can be modeled similarly to the lapse rate, traditionally represented by λ. As an illustration, we present evidence that the presence versus lack of inattentional feature blindness in orientation pooling depends on the feature types present in the display.
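The proposed guess-rate parameter behaves just like the conventional lapse rate in a psychometric function. A minimal sketch, assuming a clockwise/counter-clockwise judgement modeled with a cumulative Gaussian; the parameter values are placeholders rather than estimates from the article:

```python
import numpy as np
from scipy.stats import norm

def p_clockwise(delta, sigma, lam):
    """Probability of a 'clockwise' response for a tilt delta (deg).
    With probability lam the single informative element goes unprocessed and the
    observer guesses (50%); otherwise the response follows a cumulative Gaussian
    with representational noise sigma."""
    return lam * 0.5 + (1.0 - lam) * norm.cdf(delta / sigma)

deltas = np.array([-8, -4, -2, 0, 2, 4, 8], dtype=float)   # example tilts (deg)
print(p_clockwise(deltas, sigma=3.0, lam=0.0))   # no inattentional feature blindness
print(p_clockwise(deltas, sigma=3.0, lam=0.4))   # informative element missed on 40% of trials
```

Because the guess rate flattens the function symmetrically while noise stretches it, the two parameters leave distinguishable signatures when the display contains a single informative element, which is the point of the proposed method.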
... The ability to extract the mean value of a visual feature from a set of items spans across low-level features, such as size (e.g., Allik et al., 2013;Ariely, 2001;Corbett et al., 2012;Corbett & Melcher, 2014;Im & Halberda, 2013;Luo & Zhao, 2018;Tiurina & Utochkin, 2019;Tokita et al., 2016), line length (Bauer, 2017;Utochkin, Khvostov, & Stakina, 2018), orientation (J. A. Solomon, 2010;Utochkin et al., 2018;Witt, 2019) and hue (Maule & Franklin, 2015;Michael, de Gardelle, & Summerfield, 2014;Tong et al., 2015) to high-level features, such as emotion and gender (e.g., Haberman & Whitney, 2007), facial expressions (e.g., Griffiths et al., 2018;Li et al., 2016;Wolfe et al., 2015), and lifelikeness (e.g., Yamanashi Leib, Kosovicheva, & Whitney, 2016). Size is perhaps one of the most studied features, and is tested with various stimuli: typically with dots/circles (e.g., Ariely, 2001;Chong & Treisman, 2003) but also with concrete illustrations of strawberries and lollipops (Yang, Tokita, & Ishiguchi, 2018). ...
... For small set sizes of two to three items, viewers show a lot of uncertainty, taking longer to estimate the mean of the ensemble and often being very inaccurate, with their estimates deviating more from the true mean value than estimates for larger set sizes. Typically, viewers perform better with more items in the ensemble (Allik et al., 2013;Baek & Chong, 2020a;Haberman & Whitney, 2010). ...
... For example, participants respond to average discrimination tasks faster when they are first exposed to an array with the same variance as the subsequent array that they need to extract the average from (Michael et al., 2014). Constant variance also serves as a buffer, as sensitivity to the mean is less influenced by the shape of the statistical distribution (e.g., uniform, normal, bimodal; Allik et al., 2013;Chong & Treisman, 2003;Haberman & Whitney, 2009). It is interesting that our visual system is robust against the shape of distributions, compared to how much the shape of a distribution affects students' assessment of variance in histograms (Cooper & Shore, 2008;Kaplan et al., 2014;Meletiou-Mavrotheris & Lee, 2002). ...
Article
Full-text available
In the age of big data, we are constantly inventing new data visualizations to consolidate massive amounts of numerical information into smaller and more digestible visual formats. These data visualizations use various visual features to convey quantitative information, such as spatial position in scatter plots, color saturation in heat maps, and area in dot maps. These data visualizations are typically composed of ensembles, or groups of related objects, that together convey information about a data set. Ensemble perception, or one’s ability to perceive summary statistics from an ensemble, such as the mean, has been used as a foundation for understanding and explaining the effectiveness of certain data visualizations. However, research in data visualization has revealed some perceptual biases and conceptual difficulties people face when trying to utilize the information in these graphs. In this tutorial review, we will provide a broad overview of research conducted in ensemble perception, discuss how principles of ensemble encoding have been applied to the research in data visualization, and showcase the barriers graphs can pose to learning statistical concepts, using histograms as a specific example. The goal of this tutorial review is to highlight possible connections between three areas of research—ensemble perception, data visualization, and statistics education—and to encourage research in the practical applications of ensemble perception in solving real-world problems in statistics education.
... Although the reasons for this discrepancy are still unclear, in both cases participants relied on some items more than on others. This finding relates to the notion of capacity that has been put forward in early cognitive models of attention and working memory, and that has also been part of recent theoretical accounts of ensemble perception (Allik et al., 2013;Solomon, May, & Tyler, 2016). ...
... In the context of extracting a set-average, capacity can be defined as the number of items pooled together in the estimation (Allik, Toom, Raidvee, Averin, & Kreegipuu, 2013;Dakin, 2001;Solomon, May, & Tyler, 2016). Whereas this definition assumes an all-or-none selection of some items and not others, an alternative view involving distributed attention can be also considered. ...
... By contrast, in the orientation averaging task we expect either a fixed (or improved) precision with the set size of the array, as a result of averaging the encoding noise. To validate these conclusions, we used computational modeling to fit the data with two models, namely (1) the limited-capacity (subsampling) model (Allik et al., 2013;Solomon, May, & Tyler, 2016) and (2) the distributed attention or 'zoom lens' model (Baek & Chong, 2020a), and extracted the capacity or attention parameters for the two tasks. Finally, we examined the weights given to the mid-range and extreme values and compared them across the tasks (Vandormael et al., 2017;Vanunu et al., 2020). ...
Article
Full-text available
Recent research has established that humans can extract the average perceptual feature over briefly presented arrays of visual elements or the average of a rapid temporal sequence of numbers. Here we compared the extraction of the average over briefly presented arrays, for a perceptual feature (orientations) and for numerical values (1-9 digits), using an identical experimental design for the two tasks. We hypothesized that the averaging of numbers, more than of orientations, would be constrained by capacity limitations. Arrays of Gabor elements or digits were simultaneously presented for 300 ms and observers were required to estimate the average on a continuous response scale. In each trial the elements were sampled from normal distributions (of various means) and we varied the set size (4-12). We found that while for orientation the averaging precision remained constant with set size, for numbers it decreased with set size. Using computational modeling we also extracted capacity parameters (the number of elements that are pooled in the average extraction). Despite marked heterogeneity between observers, the capacity for orientations (around eight items) was much larger than for numbers (around four items). The orientation task also had a larger fraction of participants relying on distributed attention to all elements. Our study thus supports the idea that numbers more than perceptual features are subject to capacity or attentional limitations when observers need to evaluate the average over an ensemble of stimuli.
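A common way to turn such data into a capacity estimate is to assume that response variance is early noise divided by the number of pooled items plus a late-noise term, and to find the capacity that best reproduces the set-size curve. The grid search below is only a schematic stand-in for the authors' model fitting, and the "observed" values are invented:

```python
import numpy as np

def predicted_sd(k, sigma_early, sigma_late):
    """Response SD when k items are pooled: early noise averages out, late noise does not."""
    return np.sqrt(sigma_early ** 2 / k + sigma_late ** 2)

set_sizes = np.array([4, 6, 8, 10, 12])
observed_sd = np.array([7.1, 7.0, 6.9, 7.0, 6.8])   # hypothetical response SDs (deg)

best = None
for capacity in range(1, 13):                        # candidate capacity limits
    k = np.minimum(set_sizes, capacity)              # never pool more than 'capacity' items
    for s_early in np.linspace(1.0, 20.0, 96):
        for s_late in np.linspace(0.0, 10.0, 51):
            err = np.sum((predicted_sd(k, s_early, s_late) - observed_sd) ** 2)
            if best is None or err < best[0]:
                best = (err, capacity)
print("best-fitting capacity:", best[1])
```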
... An opposite theory argues that ensemble summary representations are supported by a limited-capacity mechanism associated with the bottleneck of focused attention and working memory. This theory suggests that only a small subset of items is sampled and integrated to accomplish proxy statistics for the entire ensemble (Allik et al., 2013;Maule & Franklin, 2016;Solomon, 2010). Based on different model estimates, it was suggested that the visual system might integrate a fixed number of items (Maule & Franklin, 2016) or a flexible number of items depending on the number of observed items, such as the square root of the presented items (Dakin, 2001;Gorea, Belkoura, & Solomon, 2014;Solomon, 2010;Whitney & Yamanashi Leib, 2018) or about half of them (Allik et al., 2013). ...
... This theory suggests that only a small subset of items is sampled and integrated to accomplish proxy statistics for the entire ensemble (Allik et al., 2013;Maule & Franklin, 2016;Solomon, 2010). Based on different model estimates, it was suggested that the visual system might integrate a fixed number of items (Maule & Franklin, 2016) or a flexible number of items depending on the number of observed items, such as the square root of the presented items (Dakin, 2001;Gorea, Belkoura, & Solomon, 2014;Solomon, 2010;Whitney & Yamanashi Leib, 2018) or about half of them (Allik et al., 2013). ...
... Unlike the theories suggesting parallel exhaustive processing of all items, the limited-capacity models should explain the rules used to select an appropriate sample. While some such theories propose random sampling (Allik et al., 2013;Marchant, Simons, & de Fockert, 2013;Maule & Franklin, 2016), others suggest non-random factors that can favor specific items for ensemble sampling. As the limited-capacity sampling theories are organically related to focused attention, it is natural to suppose that factors guiding attentional deployment in a scene can be used for item prioritization in sampling. ...
Article
Full-text available
Ensemble statistics are often thought of as a reliable impression of numerous items despite limited capacities to consciously represent each individual. However, whether all items equally contribute to ensemble summaries (e.g., mean) and whether they might be affected by known limited-capacity processes, such as focused attention, is still debated. We addressed these questions via a recently described “amplification effect,” a systematic bias of perceived mean (e.g., average size) towards the more salient “tail” of a feature distribution (e.g., larger items). In our experiments, observers adjusted the mean orientation of sets of items varying in set size. We made some of the items more salient or less salient by changing their size. While the whole orientation distribution was fixed, the more salient subset could be shifted relative to the set mean or differ in range. We measured the bias away from the set mean and the standard deviation (SD) of errors, as it is known to reflect the physical range from which ensemble information is sampled. We found that bias and SD changes followed the shifts and range changes in salient subsets, providing evidence for amplification. However, these changes were weaker than those expected from sampling only salient items, suggesting that less salient items were also sampled. Importantly, the SD decreased as a function of set size, which is only possible if the number of sampled elements increased with set size. Overall, we conclude that orientation summary statistics are sampled from an entire ensemble and modulated by the amplification effect of attention.
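In model terms, amplification amounts to replacing the uniform average with a salience-weighted one. The sketch below is purely illustrative (the orientations, the choice of salient items, and the 3:1 weight ratio are all made up) and shows how over-weighting a salient subset pulls the reported mean toward that subset without ignoring the other items:

```python
import numpy as np

orientations = np.array([-10.0, -5.0, 0.0, 5.0, 10.0, 20.0, 22.0, 24.0])  # deg
salient = np.array([0, 0, 0, 0, 0, 1, 1, 1], dtype=bool)   # the larger (salient) items

uniform_mean = orientations.mean()

# Give salient items, say, three times the weight of non-salient ones (hypothetical ratio).
weights = np.where(salient, 3.0, 1.0)
amplified_mean = np.average(orientations, weights=weights)

print(f"true mean {uniform_mean:.1f} deg, salience-weighted mean {amplified_mean:.1f} deg")
# The weighted mean is biased toward the mean of the salient subset (22 deg),
# qualitatively mirroring the reported amplification effect.
```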
... The resulting ensemble representation is an average signal that is more precisely represented than the individual representations in a set (Alvarez & Oliva, 2009;Galton, 1907;Sun & Chong, 2020). With this reduction process as a key subject of interest, a number of computational models have been proposed for mean orientation (e.g., Parkes et al., 2001) and mean size (e.g., Allik et al., 2013;Baek & Chong, 2020a;Solomon et al., 2011). Although these models have different assumptions in the incorporation of components, such as early noise, late noise, and attention, their basic computational scheme concerns reducing multiple inputs into one summary variable. ...
... For example, studies that investigate mean size perception show multiple items in the display set, followed by a single item as a probe. Observers are asked to identify whether the mean of the display set is larger or smaller than the probe, or they are asked to reproduce the mean by adjusting the probe's size to approximate the mean of the display set (Allik et al., 2013;Ariely, 2001;Bauer, 2009, 2017;Chong & Treisman, 2003;Lee, Baek, & Chong, 2016;Li & Yeh, 2017;Marchant, Simons, & de Fockert, 2013;Oriet & Hozempa, 2016). ...
... Similarly, Webb, Ledgeway, and McGraw (2007) showed that population coding algorithms that read out from directionally tuned activity (e.g., maximum likelihood) predicted human motion perception better than statistical estimates of central tendency, while Brezis, Bronfman, and Usher (2015) used a population coding-based model to approximate numerical averaging. Statistical summary as a population response calls into question the assumption of subsampling models where only a few items are used to compute the mean (Allik et al., 2013;Myczek & Simons, 2008;Solomon et al., 2011). Although a few item samples will be able to predict the mean of the population to a reliable degree for any type of distribution, it is hard to make sense of how distributional information can still be preserved when only a few items are selected and used in the encoding of ensemble information. ...
Article
Full-text available
Ongoing discussions on perceptual averaging have the implicit assumption that individual representations are reduced into a single prototypical representation. However, some evidence suggests that the mean representation may be more complex. For example, studies that use a single item probe to estimate mean size often show biased estimations. To this end, we investigate whether the mean representation of size is reduced to a single mean or includes other properties of the set. Participants estimate the mean size of multiple circles in the display set by adjusting the mean size of the circles in the probe set that followed. Across 3 experiments, we vary the similarity of set-size, variance, and skewness between the display and probe sets and examine how property congruence affects mean estimation. Altogether, we find that keeping properties consistent between the 2 compared sets improves mean estimation accuracy. These results suggest that mean representation is not simply encoded as a single mean but includes properties such as numerosity, variance, and the shape of a distribution. Such multiplex nature of summary representation could be accounted for by a population summary that captures the distributional properties of a set rather than a single summary statistic. (PsycInfo Database Record (c) 2020 APA, all rights reserved)
... Early accounts suggested that ensemble perception is similar to "gist" perception, in that people form emergent perceptions of a group from a coarse visual analysis of the group as a whole (Ariely, 2001;Chong & Treisman, 2003;Parkes et al., 2001). Contemporary evidence is mixed, however, regarding whether ensemble perception unfolds as a function of distributed attention across an entire stimulus set (Baek & Chong, 2020) or whether perceivers sample from a subset of items within a stimulus set to generate summary statistics for the whole group (Allik et al., 2013;Goldenberg et al., 2021;Maule & Franklin, 2016). The latter account (selective summarization) suggests that ensemble percepts may be differentially influenced by the most salient members of a group (Kanaya et al., 2018;Maule & Franklin, 2016). ...
... Aligned with this theoretical perspective, we have argued that if ensemble perceptions operate (at least in part) to maximize social functioning in groups, then this should be reflected in its operating characteristics. Building on evidence that perceivers base their ensemble perceptions on a subsample of visible group members (Allik et al., 2013;Goldenberg et al., 2021;Maule & Franklin, 2016), we proposed, and found evidence, that ensemble perceptions are biased by the contributions of individuals in a group who are most relevant to the perceiver. That is, rather than equally weighting all individuals in a group, perceivers in these studies preferentially attended to those who were higher in socialidentity relevance. ...
Article
Full-text available
Research in vision science suggests that people possess a perceptual mechanism—ensemble perception—which enables them to rapidly identify the characteristics of groups (e.g., emotion, sex-ratio, race-ratio). This work examined whether ensemble perceptions of groups are driven by the characteristics of group members whose behavior is most likely to impact the perceiver. Specifically, we predicted that more self-relevant group members would be weighted more heavily in ensemble perceptions than less self-relevant group members. Study 1 (n = 83) found that young adult participants’ ensemble perceptions of emotion were biased in favor of more self-relevant (younger adult) group members’ emotional expressions, compared to less self-relevant (older adult) group members’ emotional expressions, and that these ensemble perceptions informed judgments of belonging in the group. Study 2 recruited older (n = 94) and younger (n = 97) adult participants and again found a general pattern of bias in favor of more self-relevant (younger adult) group members’ emotional expressions in ensemble perceptions of emotion and that these ensemble perceptions informed evaluations of belonging in the group. Finally, Study 3 (n = 193) directly manipulated the self-relevance of older and younger adult group members and found that the extent of bias in ensemble perceptions of emotion depended on whether younger or older adults were made more self-relevant. Results suggest that incidental cues of social identity can bias ensemble perceptions of emotion and influence downstream judgments of belonging.
... One prevailing view argues that ensemble statistics are extracted via mechanisms that operate in parallel, treating all elements the same [6, 21, 22]. For example, in one class of models, ensemble statistics are computed by pooling over many local features that are initially processed in parallel (Fig. 1) [23–26], but see [27–30]. Models of this kind rest on the implicit assumption that ensemble perception operates in a spatially uniform field, i.e., that elements at the fovea or far in the periphery contribute equally to ensemble perception (Fig. 1A). ...
... Similarly, alternative models based on sub-sampling, where statistics are derived from a small randomly selected subset of items [27–30], fail to account for the influence of the whole ensemble uncertainty on the magnitude of the bias. Thus, existing models must incorporate an additional component, whose implementation may vary depending on the model, to account for the highly spatially anisotropic and uncertainty-weighted nature of ensemble perception. ...
Article
Full-text available
Background: The human brain can rapidly represent sets of similar stimuli by their ensemble summary statistics, like the average orientation or size. Classic models assume that ensemble statistics are computed by integrating all elements with equal weight. Challenging this view, here, we show that ensemble statistics are estimated by combining parafoveal and foveal statistics in proportion to their reliability. In a series of experiments, observers reproduced the average orientation of an ensemble of stimuli under varying levels of visual uncertainty. Results: Ensemble statistics were affected by multiple spatial biases, in particular, a strong and persistent bias towards the center of the visual field. This bias, evident in the majority of subjects and in all experiments, scaled with uncertainty: the higher the uncertainty in the ensemble statistics, the larger the bias towards the element shown at the fovea. Conclusion: Our findings indicate that ensemble perception cannot be explained by simple uniform pooling. The visual system weights information anisotropically from both the parafovea and the fovea, taking the intrinsic spatial anisotropies of vision into account to compensate for visual uncertainty.
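Weighting foveal and parafoveal estimates "in proportion to their reliability" corresponds to the standard inverse-variance combination rule from cue-integration models. A minimal sketch under that assumption, with placeholder uncertainty values rather than measurements from the study:

```python
import numpy as np

def reliability_weighted_mean(estimates, sds):
    """Combine estimates with weights proportional to 1/variance (inverse-variance weighting)."""
    w = 1.0 / np.asarray(sds, dtype=float) ** 2
    return np.sum(w * np.asarray(estimates, dtype=float)) / np.sum(w)

foveal_est, foveal_sd = 10.0, 2.0         # orientation (deg) of the element at the fovea
parafoveal_est, parafoveal_sd = 0.0, 6.0  # ensemble mean estimated from the parafovea

combined = reliability_weighted_mean([foveal_est, parafoveal_est],
                                      [foveal_sd, parafoveal_sd])
print(f"combined estimate: {combined:.2f} deg")
# As parafoveal uncertainty grows, the relative weight on the foveal element grows,
# which qualitatively reproduces the reported bias toward the central element.
```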
... One prevailing view argues that ensemble statistics are extracted via mechanisms that operate in parallel, treating all elements the same (6, 21, 22). For example, in one class of models, ensemble statistics are computed by pooling over many local features that are initially processed in parallel (Figure 1; 23–26, but see 27–30). Models of this kind rest on the implicit assumption that ensemble perception operates in a spatially uniform field, i.e., that elements at the fovea or far in the periphery contribute equally to ensemble perception (Figure 1A). ...
... Similarly, alternative models based on sub-sampling, where statistics are derived from a small randomly selected subset of items (27–30), fail to account for the influence of the whole ensemble uncertainty on the magnitude of the bias. Thus, existing models must incorporate an additional component, whose implementation may vary depending on the model, to account for the highly spatially anisotropic and uncertainty-weighted nature of ensemble perception. ...
Preprint
Full-text available
The human brain can rapidly represent sets of similar stimuli by their ensemble summary statistics, like the average orientation or size. Classic models assume that ensemble statistics are computed by integrating all elements with equal weight. Challenging this view, here we show that ensemble statistics are estimated by combining parafoveal and foveal statistics in proportion to their reliability. In a series of experiments, observers reproduced the average orientation of an ensemble of stimuli under varying levels of visual uncertainty. Ensemble statistics were affected by multiple spatial biases, in particular a strong and persistent bias toward the center of the visual field. This bias, evident in the majority of subjects and in all experiments, scaled with uncertainty: the higher the uncertainty in the ensemble statistics, the larger the bias towards the element shown at the fovea. Our findings indicate that ensemble perception cannot be explained by simple uniform pooling. The visual system weights information anisotropically from both the parafovea and the fovea, taking the intrinsic spatial anisotropies of vision into account to compensate for visual uncertainty.
... On the other hand, sub-sampling models assume that only a subset of the stimuli is integrated into the mean estimation (e.g., Allik et al., 2013;Solomon et al., 2011). Conventional sub-sampling models with equal weighting clearly do not fit the recency findings in sequential mean estimation tasks, which calls for more sophisticated implementations of sub-sampling models with different integration rules, such as recency- or fixation-weighted schemes. ...
... Instead of a fixed directional prediction, FIM predicts that a higher set size could positively, neutrally, or negatively impact estimation accuracy. Indeed, existing studies reported conflicting results on the impact of set size on ensemble coding tasks (e.g., Allik et al., 2013;Ariely, 2001;Baek & Chong, 2020;Dakin, 2001;Haberman & Whitney, 2009;Robitaille & Harris, 2011). More importantly, FIM suggests that the impact of set size on mean estimation accuracy could vary with set size ranges. ...
Article
Full-text available
The mean estimation task, which explicitly asks observers to estimate the mean feature value of multiple stimuli, is a fundamental paradigm in research areas such as ensemble coding and cue integration. The current study uses computational models to formalize how observers summarize information in mean estimation tasks. We compare model predictions from our Fidelity-based Integration Model (FIM) and other models on their ability to simulate observed patterns in within-trial weight distribution, across-trial information integration, and set size effects on mean estimation accuracy. Experiments show non-equal weighting within trials in both sequential and simultaneous mean estimation tasks. Observers implicitly overestimated trial means below the global mean and underestimated trial means above the global mean. Mean estimation performance declined and stabilized with increasing set sizes. FIM successfully simulated all observed patterns, while other models failed. FIM's information sampling structure provides a new way to interpret the capacity limit in visual working memory and sub-sampling strategies. As a model framework, FIM offers task-dependent modeling for various ensemble coding paradigms, facilitating research synthesis across different studies in the literature.
... Another phenomenon in ensemble perception worth mentioning is that the report variance is conserved across set size (e.g., Ariely, 2001;Chong & Treisman 2005) all the way down to one item (Allik, Toom, Raidvee, Averin, & Kreegipuu, 2013;Brielmann & Pelli, 2020). The finding that increased set size fails to decrease the variability of average reports suggests that the variance of reporting is limited by a late noise that arises after combining the members of the ensemble. ...
... As in previous studies (e.g., Allik et al., 2013;Ariely, 2001;Chong & Treisman 2005;Brielmann & Pelli, 2020), we found that reporting the average across items instead of the value of one item failed to decrease the variability of reports. This finding strengthens our previous suggestion that the variance of reporting is limited by a late noise that arises after combining the members of the ensemble. ...
Article
Full-text available
How many pleasures can you track? In a previous study, we showed that people can simultaneously track the pleasure they experience from two images. Here, we push further, probing the individual and combined pleasures felt from seeing four images in one glimpse. Participants (N = 25) viewed 36 images spanning the entire range of pleasure. Each trial presented an array of four images, one in each quadrant of the screen, for 200 ms. On 80% of the trials, a central line cue pointed, randomly, at some screen corner either before (precue) or after (postcue) the images were shown. The cue indicated which image (the target) to rate while ignoring the others (distractors). On the other 20% of trials, an X cue requested a rating of the combined pleasure of all four images. Later, for baseline reference, we obtained a single-pleasure rating for each image shown alone. When precued, participants faithfully reported the pleasure of the target. When postcued, however, the mean ratings of images that are intensely pleasurable when seen alone (pleasure >4.5 on a 1–9 scale) dropped below baseline. Regardless of cue timing, the rating of the combined pleasure of four images was a linear transform of the average baseline pleasures of all four images. Thus, while people can faithfully track two pleasures, they cannot track four. Instead, the pleasure of otherwise above-medium-pleasure images is diminished, mimicking the effect of a distracting task.
... Other models assume that the number of items being averaged depends on the total number of items in the ensemble. Thus, depending on the estimate, the visual system has been proposed to average all items of the ensemble (Baek, Chong, 2020a), approximately half of the entire ensemble (Allik et al., 2013), or the square root of the size of the presented set (Whitney, Yamanashi Leib, 2018). There are also different assumptions about the principle by which items are selected. ...
... There are also different assumptions about the principle by which items are selected. Most models assume that items are selected at random (Allik et al., 2013), but there is evidence in favor of selecting the items that attract attention most strongly, that is, the most perceptually salient items (Kanaya et al., 2018). In addition, non-random selection of items based on various strategies has been proposed, but this idea has been criticized on empirical and logical grounds. ...
Article
Full-text available
Ensemble perception usually refers to the observer's ability to estimate, within a short time and with a high degree of accuracy, generalized statistical properties of a set of objects (mean, variance, numerosity). This review describes the phenomenology of ensemble perception and the methods used to study it. Competing accounts of the mechanisms of information selection and processing for computing ensemble statistics are described: one assumes coarse processing of all objects at once, while the other assumes detailed processing of only a few selected objects, with the estimated properties then generalized to the remaining objects. The review traces the development of views on the nature of the internal representations through which ensemble statistics become accessible: from the idea of representing a single value (for example, the mean value of a feature) to the relatively recent idea of a "rich" representation that approximately reproduces the entire distribution of features of the presented objects. The role of ensemble representations in the organization of perception and in solving a number of perceptual tasks is considered. Finally, the review discusses potential neurophysiological correlates of ensemble perception and promising theoretical models of its neural mechanisms.
... Research in ensemble perception has frequently used idealobserver models to investigate its possible underlying mechanism and integration efficiency (e.g., Allik, Toom, Raidvee, Averin, & Kreegipuu, 2013;Baek & Chong, 2020;Myczek & Simons, 2008;Parkes et al., 2001;Solomon et al., 2011). While these modelling approaches vary in terms of decision structures, model components and/or assumptions, they all have highlighted the important role of internal noise to achieve better goodness-of-fit of a model. ...
... On the other hand, Myczek and Simons (2008) showed that simulations of various subsampling strategies using one to two set members could sufficiently explain observed performance, suggesting that ensemble perception can be produced cognitively via a serial inspection of a few set members and need not be automatic or unconstrained by limited attentional capacity. Others have argued for a middle ground account in which the visual system does indeed engage in ensemble processing, which has a limited capacity (Allik et al., 2013;de Fockert & Marchant, 2008;Haberman & Whitney, 2009;Maule & Franklin, 2016). ...
Article
Full-text available
The accurate perception of human crowds is integral to social understanding and interaction. Previous studies have shown that observers are sensitive to several crowd characteristics such as average facial expression, gender, identity, joint attention, and heading direction. In two experiments, we examined ensemble perception of crowd speed using standard point-light walkers (PLW). Participants were asked to estimate the average speed of a crowd consisting of 12 figures moving at different speeds. In Experiment 1, trials of intact PLWs alternated with trials of scrambled PLWs with a viewing duration of 3 seconds. We found that ensemble processing of crowd speed could rely on local motion alone, although a globally intact configuration enhanced performance. In Experiment 2, observers estimated the average speed of intact-PLW crowds that were displayed at reduced viewing durations across five blocks of trials (between 2500 ms and 500 ms). Estimation of fast crowds was precise and accurate regardless of viewing duration, and we estimated that three to four walkers could still be integrated at 500 ms. For slow crowds, we found a systematic deterioration in performance as viewing time reduced, and performance at 500 ms could not be distinguished from a single-walker response strategy. Overall, our results suggest that rapid and accurate ensemble perception of crowd speed is possible, although sensitive to the precise speed range examined.
... Although both studies found a small trend in the direction predicted by the Ensemble model, there was no compelling evidence to reject the Independent Observations model. Studies of visual perception and ensemble coding have often generated conceptually analogous findings, where increasing set size led to a relatively constant sensitivity (Allik et al., 2013;Alvarez, 2011;Ariely, 2001;Chong & Treisman, 2005). ...
Article
Full-text available
A photo lineup, which is a cross between an old/new and a forced-choice recognition memory test, consists of one suspect, whose face was either seen before or not, and several physically similar fillers. First, the participant/witness must decide whether the person who was previously seen is present (old/new) and then, if present, choose the previously seen target (forced choice). Competing signal-detection models of eyewitness identification performance make different predictions about how certain variables will affect a witness’s ability to discriminate previously seen (guilty) suspects from new (innocent) suspects. One key variable is the similarity of the fillers to the suspect in the lineup, and another key variable is the size of the lineup (i.e., the number of fillers). Previous research investigating the role of filler similarity has supported one model, known as the Ensemble model, whereas previous research investigating the role of lineup size has supported a competing model, known as the Independent Observations model. We simultaneously manipulated these two variables (filler similarity and lineup size) and found a pattern that is not predicted by either model. When the fillers were highly similar to the suspect, increasing lineup size reduced discriminability, but when the fillers were dissimilar to the suspect, increasing lineup size enhanced discriminability. The results suggest that each additional filler adds noise to the decision-making process and that this noise factor is minimized by maximizing filler dissimilarity.
... A recurring question in the ensemble literature asks if ensemble representations are built on global processing of all items in the relevant subset or if performance can be based on a smaller sample from that set. There are multiple claims that the level of accuracy, seen in the data, can be achieved if observers efficiently sampled only a small handful of random items (e.g., Allik et al., 2013;Gorea et al., 2014;Myczek & Simons, 2008;Solomon, 2010). This could allow ensemble summary statistics to be computed using mechanisms whose capacity would not exceed those of focused attention and/or working memory. ...
Article
The visual system can rapidly calculate the ensemble statistics of a set of objects; for example, people can easily estimate an average size of apples on a tree. To accomplish this, it is not always useful to summarize all the visual information. If there are various types of objects, the visual system should select a relevant subset: only apples, not leaves and branches. Here, we ask what kind of visual information makes a “good” ensemble that can be selectively attended to provide an accurate summary estimate. We tested three candidate representations: basic features, preattentive object files, and full-fledged bound objects. In four experiments, we presented a target and several distractor sets of differently colored objects. We found that conditions where a target ensemble had at least one unique color (basic feature) provided ensemble averaging performance comparable to the baseline displays without distractors. When the target subset was defined as a conjunction of two colors or color-shape partly shared with distractors (so that they could be differentiated only as preattentive object files), subset averaging was also possible but less accurate than in the baseline and feature conditions. Finally, performance was very poor when the target subset was defined by an exact feature relationship, such as in the spatial conjunction of two colors (spatially bound object). Overall, these results suggest that distinguishable features and, to a lesser degree, preattentive object files can serve as the representational basis of ensemble selection, while bound objects cannot.
... Aligning our results with the insights from previous studies (Myczek and Simons, 2008;Solomon et al., 2011;Allik et al., 2013), we infer computational mechanisms of ensemble perception: serial and focused-attention processing may underlie the slow and selective ensemble computation. However, this does not entirely preclude the involvement of parallel processing in computing an ensemble. ...
Preprint
Full-text available
The visual system is capable of computing summary statistics of multiple visual elements at a glance. While numerous studies have demonstrated ensemble perception across different visual features, the timing at which the visual system forms an ensemble representation remains unclear. This is mainly because most previous studies did not uncover time-resolved neural representations during ensemble perception. Here we used orientation ensemble discrimination tasks along with EEG recordings to decode orientation representations over time while human observers discriminated an average of multiple orientations. We observed alternation in orientation representations over time, with stronger neural representations of the individual elements in a set of orientations, but we did not observe significantly strong representations of the average orientation at any time points. We also found that a cumulative average of the orientation representations over approximately 500 ms converged toward the average orientation. More importantly, this cumulative orientation representation significantly correlated with the individual difference in the perceived average orientation. These findings suggest that the visual system gradually extracts an orientation ensemble, which may be represented as a cumulative average of transient orientation signals, through selective processing of a subset of multiple orientations that occurs over several hundred milliseconds.
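The idea of an ensemble emerging as a cumulative average of transient signals can be sketched with simulated decoder outputs (not EEG data); at each time point the decode reflects one momentarily dominant item plus noise, yet the running average drifts toward the set mean over a few hundred milliseconds:

```python
import numpy as np

rng = np.random.default_rng(2)

item_orientations = np.array([-20.0, -5.0, 10.0, 25.0])   # deg; hypothetical set
true_mean = item_orientations.mean()                        # 2.5 deg

# Simulate a decoded orientation every 10 ms: each decode reflects one randomly
# dominant item plus decoding noise, rather than the mean orientation itself.
times_ms = np.arange(0, 500, 10)
decoded = rng.choice(item_orientations, size=times_ms.size) + rng.normal(0, 5, times_ms.size)

# Running (cumulative) average of the decoded signal over time.
cumulative = np.cumsum(decoded) / np.arange(1, times_ms.size + 1)
print(f"cumulative decode at  50 ms: {cumulative[4]:6.1f} deg")
print(f"cumulative decode at 500 ms: {cumulative[-1]:6.1f} deg  (true mean {true_mean:.1f} deg)")
```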
... Furthermore, this ability was not sensitive to set size, as observers could produce a summary statistic that was accurate for as many as 16 simple shapes and complex stimuli (e.g., Haberman & Whitney, 2007;Maule & Franklin, 2015). In contrast, some studies have found that larger set sizes confer a benefit to ensemble coding performance in terms of both accuracy (e.g., Allik et al., 2013;Baek & Chong, 2020;Solomon, 2010) and response time (Robitaille & Harris, 2011). Interestingly, observers typically demonstrate worse accuracy when performing a member-identification task ("Was this item in the display you saw previously?") compared to when they were asked to report an average feature of a display (Ariely, 2001;Haberman & Whitney, 2007, 2009). ...
Article
Full-text available
Ensemble coding (the brain’s ability to rapidly extract summary statistics from groups of items) has been demonstrated across a range of low-level (e.g., average color) to high-level (e.g., average facial expression) visual features, and even on information that cannot be gleaned solely from retinal input (e.g., object lifelikeness). There is also evidence that ensemble coding can interact with other cognitive systems such as long-term memory (LTM), as observers are able to derive the average cost of items. We extended this line of research to examine if different sensory modalities can interact during ensemble coding. Participants made judgments about the average sweetness of groups of different visually presented foods. We found that, when viewed simultaneously, observers were limited in the number of items they could incorporate into their cross-modal ensemble percepts. We speculate that this capacity limit is caused by the cross-modal translation of visual percepts into taste representations stored in LTM. This was supported by findings that (a) participants could use similar stimuli to form capacity-unlimited ensemble representations of average screen size and (b) participants could extract the average sweetness of displays when items were viewed in sequence, with no capacity limitation (suggesting that spatial attention constrains the number of necessary visual cues an observer can integrate in a given moment to trigger cross-modal retrieval of taste). Together, the results of our study demonstrate that there are limits to the flexibility of ensemble coding, especially when multiple cognitive systems need to interact to compress sensory information into an ensemble representation.
... Intuitively, a mean representation can be generated by linearly averaging the representations of individual items [9-11], either over all items [4,5] or over a subset of items [12,13], where individual items were treated independently. On the other hand, ensemble perception is largely influenced by the internal structure of the stimulus set, such as the variance [14], the Gestalt grouping [15], and the number of items [16], suggesting interdependencies of individual items during ensemble perception. ...
Article
Full-text available
Statistically summarizing information from a stimulus array into an ensemble representation (e.g., the mean) improves the efficiency of visual processing. However, little is known about how the brain computes the ensemble statistics. Here, we propose that ensemble processing is realized by nonadditive integration, rather than linear averaging, of individual items. We used a linear regression model approach to extract EEG responses to three levels of information: the individual items, their local interactions, and their global interaction. The local and global interactions, representing nonadditive integration of individual items, elicited rapid and independent neural responses. Critically, only the neural representation of the global interaction predicted the precision of the ensemble perception at the behavioral level. Furthermore, spreading attention over the global pattern to enhance ensemble processing directly promoted rapid neural representation of the global interaction. Taken together, these findings advocate a global, nonadditive mechanism of ensemble processing in the brain.
... Indeed, there is an ongoing debate about the way in which people compute these group averages. Some researchers suggest ensemble perception occurs by encoding all of the items in the group in a distributed manner (Baek & Chong, 2020), while others argue that ensemble perception occurs by sampling a subset of items (Allik et al., 2013;Maule & Franklin, 2015), with participants preferentially sampling the most salient items (Goldenberg et al., 2021;Kanaya et al., 2018;Sweeny et al., 2013). This debate about how ensemble representations are computed may be particularly relevant when it comes to people's perception of groups of faces, which are both socially salient and visually complex objects. ...
Article
Full-text available
When looking at groups of people, we can extract information from the different faces to derive properties of the group, such as its average facial emotion, although how this average is computed remains a matter of debate. Here, we examined whether our participants’ personal familiarity with the faces in the group, as well as the intensity of the facial expressions, biased ensemble perception. Participants judged the average emotional expression of ensembles of four different identities whose expressions depicted either neutral, angry, or happy emotions. For the angry and happy expressions, the intensity of the emotion could be either low (e.g., slightly happy) or high (very happy). When all the identities in the ensemble were unfamiliar, the presence of any high intensity emotional face biased ensemble perception towards its emotion. However, when a familiar face was present in the ensemble, perception was biased towards the familiar face’s emotion regardless of its intensity. These findings reveal that how we perceive the average emotion of a group is influenced by both the emotional intensity and familiarity of the faces comprising the group, supporting the idea that different faces may be weighted differently in ensemble perception. These findings have important implications for how the judgements we make about a group’s overall emotional state may be biased by individuals within the group.
... Other theories portray ensemble perception as sampling with quite low efficiency in terms of the number of items that carry useful information. Although the accuracy of summary statistical judgments reliably exceeds that of judgments of individual features, numerous computational models claim that this level of accuracy could be accomplished if observers efficiently sampled only a small fraction of items (Allik, Toom, Raidvee, Averin, & Kreegipuu, 2013;Dakin, 2001;Gorea, Belkoura, & Solomon, 2014;Im & Halberda, 2013;Marchant, Simons, & de Fockert, 2013;Maule & Franklin, 2016;Myczek & Simons, 2008;Solomon, 2010;Solomon, Morgan, & Chubb, 2011). In most such theories, an otherwise-ideal observer randomly picks a few items whose representations are corrupted by early noise, takes the arithmetic mean of this sample, and finally applies late noise. ...
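For concreteness, a minimal Python sketch of this class of sampling observer is given below. The sample size k, the early and late noise levels, and the stimulus values are arbitrary illustrative assumptions, not parameters taken from any of the studies cited above.

    import numpy as np

    rng = np.random.default_rng(0)

    def sampling_observer(items, k=3, early_sd=0.10, late_sd=0.05):
        # Pick k items at random, corrupt each with early noise,
        # average them, then add late (decision) noise.
        sample = rng.choice(items, size=min(k, len(items)), replace=False)
        noisy = sample + rng.normal(0.0, early_sd, size=sample.shape)
        return noisy.mean() + rng.normal(0.0, late_sd)

    # 2AFC mean-size discrimination between two 8-item displays whose
    # true means differ by 5% (illustrative values only).
    base = np.full(8, 1.00)
    test = np.full(8, 1.05)
    correct = np.mean([sampling_observer(test) > sampling_observer(base)
                       for _ in range(10_000)])
    print(f"proportion correct with k = 3 samples: {correct:.2f}")

Varying k in such a simulation shows how a small random sample can already support fairly accurate mean discrimination, which is the point the claims above rest on.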
Article
Full-text available
Many studies have shown that observers can accurately estimate the average feature of a group of objects. However, the way the visual system relies on the information from each individual item is still under debate. Some models suggest that some or all items are sampled and averaged arithmetically. Another strategy implies "robust averaging," in which middle elements gain greater weight than outliers. One version of a robust averaging model was recently suggested by Teng et al. (2021), who studied motion direction averaging in skewed feature distributions and found systematic biases toward their modes. They interpreted these biases as evidence for robust averaging and suggested a probabilistic weighting model based on minimization of the virtual loss function. In four experiments, we replicated systematic skew-related biases in another feature domain, namely, orientation averaging. Importantly, we show that the magnitude of the bias is not determined by the locations of the mean or mode alone, but is substantially defined by the shape of the whole feature distribution. We test a model that accounts for such distribution-dependent biases and robust averaging in a biologically plausible way. The model is based on well-established mechanisms of spatial pooling and population encoding of local features by neurons with large receptive fields. Both the loss-function model and the population coding model with a winner-take-all decoding rule accurately predicted the observed patterns, suggesting that the pooled population response model can be considered a neural implementation of the computational algorithms of information sampling and robust averaging in ensemble perception.
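A minimal sketch of the pooling-plus-winner-take-all idea described in this abstract follows; it is an illustration only, not the authors' fitted model, and the channel spacing and tuning width (tuning_sd) are arbitrary assumptions.

    import numpy as np

    def pooled_wta_estimate(orientations_deg, tuning_sd=20.0):
        # Each item activates orientation-tuned channels via a circular
        # Gaussian tuning curve; responses are summed (pooled) across items
        # and the preferred orientation of the maximally responding channel
        # is read out (winner-take-all).
        channels = np.arange(0, 180)                # preferred orientations
        pooled = np.zeros(channels.shape, dtype=float)
        for theta in orientations_deg:
            diff = np.abs(channels - theta)
            diff = np.minimum(diff, 180 - diff)     # circular (180 deg) distance
            pooled += np.exp(-0.5 * (diff / tuning_sd) ** 2)
        return channels[np.argmax(pooled)]

    # With a skewed set of orientations the pooled read-out is pulled toward
    # the mode rather than sitting at the arithmetic mean.
    skewed = [80, 82, 84, 86, 110]
    print(pooled_wta_estimate(skewed), np.mean(skewed))

With the skewed set above the pooled read-out lands near the cluster of similar orientations while the arithmetic mean is pulled toward the outlier, reproducing the kind of mode-ward bias the abstract describes.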
... Both studies found a small trend in the direction predicted by the ensemble model, but the results provide no compelling reason to reject the independent observations model. This result is not unlike findings from the ensemble coding literature where several studies found relatively constant sensitivity with increasing set size (Allik et al., 2013;Alvarez, 2011;Ariely, 2001;Chong & Treisman, 2003). ...
Article
Full-text available
Police investigators worldwide use lineups to test an eyewitness's memory of a perpetrator. A typical lineup consists of one suspect (who is innocent or guilty) plus five or more fillers who resemble the suspect and who are known to be innocent. Although eyewitness identification decisions were once biased by police pressure and poorly constructed lineups, decades of social science research led to the development of reformed lineup procedures that provide a more objective test of memory. Under these improved testing conditions, cognitive models of memory can be used to better understand and ideally enhance eyewitness identification performance. In this regard, one question that has bedeviled the field for decades is how similar the lineup fillers should be to the suspect to optimize performance. Here, we model the effects of manipulating filler similarity to better understand why such manipulations have the intriguing effects they do. Our findings suggest that witnesses rely on a decision variable consisting of the degree to which the memory signal for a particular face in the lineup stands out relative to the crowd of memory signals generated by the set of faces in the lineup. The use of that decision variable helps to explain why discriminability is maximized by choosing fillers that match the suspect on basic facial features typically described by the eyewitness (e.g., age, race, gender) but who otherwise are maximally dissimilar to the suspect. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
... Therefore, we showed that an observer's performance in the one-item condition should not be taken as a default measure of the "ensemble-blind" mode of visual working memory. Second, although performance in the full-set condition rarely exceeded the level of optimal summation of individual change signals, summation itself implies that some decisions are based only on the sum of evidence rather than on any of the individual items (Allik et al., 2013;Haberman & Whitney, 2011), at least if the optimal decision rule is used. Our demonstration is in line with the previous evidence for the role of ensemble statistics in working memory tasks (Brady & Alvarez, 2011, 2015a;Corbett, 2017;Utochkin & Brady, 2020). ...
Preprint
Full-text available
Growing empirical evidence shows that ensemble information (e.g., the average feature or feature variance of a set of objects) affects visual working memory for individual items. Recently, Harrison, McMaster, and Bays (2021) used a change detection task to test whether observers explicitly rely on ensemble representations to improve their memory for individual objects. They found that sensitivity to simultaneous changes in all memorized items (which also entail changes in set summary statistics) rarely exceeded a level predicted by the so-called optimal summation model. This model implies simple integration of evidence for change from all individual items but without any additional evidence coming from the ensemble. Here, we argue that performance at the optimal summation level does not rule out the use of ensemble information. First, in two experiments, we show that, even if evidence from only one item is available at test, the statistics of the whole memory set affect performance. Second, we argue that the optimal decision strategy described by Harrison et al. is at least partly ensemble-based, whereas a strictly item-based strategy (the so-called minimum rule) predicts a much lower sensitivity level that both our observers and those of Harrison et al. (2021) consistently outperformed. We conclude that observers can encode ensemble information into working memory and rely on it at test.
... Consequently, even an increase in the exposure time may not result in improving the precision of individual representations. Prior studies investigating the extraction of a mean representation from sets of dots with different diameters found that the maximum number of dots that could be processed was approximately four (Allik et al., 2013;Gorea et al., 2014). Because face information is more complex than dots, the number of faces that can be involved in processing may be even fewer. ...
Article
Full-text available
Individuals can perceive the mean emotion or mean identity of a group of faces. It has been considered that individual representations are discarded when extracting a mean representation; for example, the “element-independent assumption” asserts that the extraction of a mean representation does not depend on recognizing or remembering individual items. The “element-dependent assumption” proposes that the extraction of a mean representation is closely connected to the processing of individual items. The processing mechanism of mean representations and individual representations remains unclear. The present study used a classic member-identification paradigm and manipulated the exposure time and set size to investigate the effect of attentional resources allocated to individual faces on the processing of both the mean emotion representation and individual representations in a set and the relationship between the two types of representations. The results showed that while the precision of individual representations was affected by attentional resources, the precision of the mean emotion representation did not change with it. Our results indicate that two different pathways may exist for extracting a mean emotion representation and individual representations and that the extraction of a mean emotion representation may have higher priority. Moreover, we found that individual faces in a group could be processed to a certain extent even under extremely short exposure time and that the precision of individual representations was relatively poor but individual representations were not discarded.
... If the mean nearest-neighbor distance is judged, then we are in fact talking about the extraction of summary statistics, or what is often called ensemble perception (Allik, Toom, Raidvee, Averin, & Kreegipuu, 2013;Ariely, 2001;Chong & Treisman, 2003;Whitney & Leib, 2018). Because the perceptual system cannot count the exact number of elements without support from a symbolic system, it needs to rely on some sort of summary statistics characterizing how close or distant, on average, elements are from one another. ...
Article
Full-text available
The occupancy model (OM) was proposed to explain how the spatial arrangement of dots in sparse random patterns affects their perceived numerosity. The model’s central thesis maintained that each dot seemingly fills or occupies its surrounding area within a fixed radius r₀, and the total area collectively occupied by all the dots determines their apparent number. Because the perceptual system is not adapted for the precise estimation of area, it looks likely that the OM is just a convenient computational algorithm that does not necessarily correspond to the processes that actually take place in the perceptual system. As an alternative, the proximity model (PM) was proposed, which instead relies on a binomial function with the probability β characterizing the perceptual salience with which each element can be registered by the perceptual system. It was also assumed that the magnitude of β is proportional to the distance between a dot and its nearest neighbor. A simulation experiment demonstrated that the occupancy area computed according to the OM can almost perfectly be replicated by the mean nearest neighbor distance. It was concluded that proximity between elements is a critical factor in determining their perceived numerosity, but the exact algorithm that is used for the measure of proximities is yet to be established.
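The reported equivalence lends itself to a brief simulation. The sketch below is an illustration only; the occupancy radius r0, display size, and pattern parameters are arbitrary assumptions rather than values from the cited work. It compares the occupied area with the mean nearest-neighbor distance for dot patterns of fixed numerosity but varying spatial spread.

    import numpy as np

    rng = np.random.default_rng(1)

    def occupied_area(dots, r0=30, size=500):
        # Occupancy model: area (in pixels) of the union of disks of
        # radius r0 centered on the dots, approximated on a pixel grid.
        yy, xx = np.mgrid[0:size, 0:size]
        covered = np.zeros((size, size), dtype=bool)
        for x, y in dots:
            covered |= (xx - x) ** 2 + (yy - y) ** 2 <= r0 ** 2
        return covered.sum()

    def mean_nn_distance(dots):
        # Proximity-related statistic: mean distance to the nearest neighbor.
        d = np.linalg.norm(dots[:, None, :] - dots[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        return d.min(axis=1).mean()

    # For 16-dot patterns of varying spread the two measures track each other
    # closely: more dispersed patterns give larger nearest-neighbor distances,
    # less disk overlap, and hence a larger occupied area.
    areas, nn_dists = [], []
    for spread in rng.uniform(60, 200, size=40):
        dots = rng.uniform(250 - spread, 250 + spread, size=(16, 2))
        areas.append(occupied_area(dots))
        nn_dists.append(mean_nn_distance(dots))
    print(np.corrcoef(areas, nn_dists)[0, 1])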
... The first section encompasses what can arguably be coined lower-level ensemble perception, including ensemble properties such as size (Haberman & Suresh, 2020;Allik et al., 2014;Ariely, 2001;Corbett & Oriet, 2011;Khayat & Hochstein, 2019;Morgan et al., 2008;Solomon, 2010), position (Alvarez & Oliva, 2008;Sun et al., 2021), color and contrast (Bauer, 2009;Khayat & Hochstein, 2019;Rajendran et al., 2020), etc. Of course, the term "low-level" is loaded, and this grouping is not intended to suggest (or exclude) any particular model of visual processing; it does not preclude feedback or necessitate strict feedforward hierarchical processing. ...
... We did not find that suppressing coarse processing disturbed ensemble processing, despite their similarities. This might be because the averaging process involves two stages (Allik, Toom, Raidvee, Averin, & Kreegipuu, 2013;Baek & Chong, 2020a;Parkes et al., 2001;Solomon, 2010;Solomon, Morgan, & Chubb, 2011): encoding individual items and averaging them. The flicker adaptation might have only influenced the stage of encoding individual items without affecting averaging. ...
Article
Ensemble perception is efficient because it summarizes redundant and complex information. However, it loses the fine details of individual items during the averaging process. Such characteristics of ensemble perception are similar to those of coarse processing. Here, we tested whether extracting an average of a set was similar to coarse processing. To manipulate coarse processing, we used the fast flicker adaptation known as suppressing coarse information processed by the magnocellular pathway. We hypothesized that if computing the average of a set relied on coarse processing, the precision of an averaging task should decrease after adaptation compared to baseline (no-adaptation). Across experiments with various features (orientation in Experiment 1, size in Experiment 2, and facial expression in Experiment 3), we found that suppressing coarse information did not disrupt the performance of the averaging tasks. Rather, adaptation increased the precision of mean representation. The precision of mean representation might have increased because fine information was relatively enhanced after adaptation. Our results suggest that the quality of ensemble representation relies on that of individual items.
... How these summaries are computed, however, is still a matter of debate. While some argue that perceivers encode all items in a distributed manner (Baek & Chong, 2020), others argue that only a subset of objects within a set is encoded (Allik et al., 2013;Maule & Franklin, 2016), a subset recently estimated to equal the square root of the number of items in the set (Whitney & Yamanashi Leib, 2018). ...
Article
Full-text available
How do people go about reading a room or taking the temperature of a crowd? When people catch a brief glimpse of an array of faces, they can only focus their attention on some of the faces. We propose that perceivers preferentially attend to faces exhibiting strong emotions, and that this generates a crowd emotion amplification effect—estimating a crowd’s average emotional response as more extreme than it is. Study 1 (N = 50) documents the crowd amplification effect. Study 2 (N = 50) replicates the effect even when we increase exposure time. Study 3 (N = 50) uses eye-tracking to show that attentional bias to emotional faces drives amplification. These findings have important implications for many domains in which individuals have to make snap judgments regarding a crowd’s emotionality, from public speaking to controlling crowds.
... These local representations then presumably feed into a summary representation of the set via a pooling mechanism, at which point local information is then lost in favor of a percept or judgment at the gist level (Haberman & Whitney, 2009, 2011). This characterization of the ensemble mechanism is consistent with the feedforward architecture of the visual system, and simulations that feature this type of approach are able to approximate human perception quite well (Allik et al., 2013;Baek & Chong, 2020;Ji et al., 2020;Sweeny et al., 2015). In this characterization, global-level information is the outcome of the ensemble process, not an input. ...
Article
Most visual scenes contain information at different spatial scales, including the local and global, or the detail and gist. Global processes have become increasingly implicated in research examining summary statistical perception, initially as the output of ensemble coding, and more recently as a gating mechanism for selecting which information is included in the averaging process itself. Yet local and global processing are known to be rapidly integrated by the visual system, and it is plausible that global-level information, like spatial organization, may be included as an input during ensemble coding. We tested this hypothesis using an ensemble shape-perception task in which observers evaluated the mean aspect ratios of sets of ellipses. In addition to varying the aspect ratios of the individual shapes, we independently varied the spatial arrangements of the sets so that they had either flat or tall organizations at the global level. We found that observers made precise summary judgments about the average aspect ratios of the sets by integrating information from multiple shapes. More importantly, global flat and tall organizations were incorporated into ensemble judgments about the sets; summary judgments were biased in the directions of the global spatial arrangements on each trial. This global-to-local integration even occurred when the global organizations were masked. Our results demonstrate that the process of summary representation can include information from both the local and global scales. The gist is not just an output of ensemble representation – it can be included as an input to the mechanism itself.
... In situations such as this, the rapid extraction of summary statistics of the elements (numerical returns or emotional expressions), in particular their average, has obvious advantages. Research over the last two decades has convincingly demonstrated that humans have a remarkable ability to extract summary statistics from large sets of visual elements, briefly presented together or in fast sequence, with regard to visual properties such as size, orientation and even emotional expression (Allik, Toom, Raidvee, Averin, & Kreegipuu, 2013;Ariely, 2001;Chong & Treisman, 2005;Dakin, 2001;Haberman & Whitney, 2011;Khayat & Hochstein, 2018;Robitaille & Harris, 2011;Parkes et al., 2001). Similarly, it has been reported that humans can extract the arithmetic average (the simplest summary statistic) from rapid numerical sequences (Malmi & Samson, 1983). For example, Brezis et al. (2017) reported that human observers are able to provide quite accurate estimates of the numerical average for sequences of 2-digit numbers (sequence length 6-12) that are presented at a rate of 4/sec. ...
Preprint
Full-text available
We examine the ability of observers to extract summary statistics (such as the mean and the relative-variance) from rapid numerical sequences (two-digit numbers presented at a rate of 4/sec). In four experiments, we find that the participants show a remarkable ability to extract such summary statistics and that their precision in the estimation of the sequence-mean improves with the sequence-length (subject to individual differences). Using model selection for individual participants we find that, when they only estimate the sequence average, most participants rely on a holistic process of frequency-based estimation, and there is a minority who rely on a rule-based and capacity-limited mid-range strategy. When both the sequence-average and the relative variance are estimated, about half of the participants rely on these two strategies. Importantly, the holistic strategy appears more efficient in terms of its precision. We discuss implications for the domains of two-pathway numerical processing and decision-making.
... On the other hand, the individual sizes ended up being spaced unequally in terms of the entire ensemble so that the absolute step size between items within the large set was always bigger than that within the small set according to the large set/small set mean ratio. Such asymmetry resulting from our size generation algorithm, in fact, complied with Weber's law that perceived difference between two sizes is approximately proportional to their sizes in the domains of both individual size and mean size (e.g., Allik et al., 2013). This way of size generation for individual circles in the small and large sets was implemented in previous work and has been shown to ensure that perceived variability of individual members (e.g., variance or range) is roughly the same across the categories (e.g., Khvostov & Utochkin, 2019). ...
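A brief sketch of this kind of multiplicative, ratio-preserving size spacing is given below; the central sizes, set size, and step ratio are arbitrary illustrative values rather than the cited experiments' actual parameters.

    import numpy as np

    def make_set(center_size, n_items=8, step_ratio=1.10):
        # Item sizes are spaced by a constant ratio around a central value,
        # so the absolute step between neighboring sizes scales with the
        # set's overall magnitude while the relative step stays constant.
        exponents = np.arange(n_items) - (n_items - 1) / 2.0
        return center_size * step_ratio ** exponents

    small = make_set(center_size=1.0)
    large = make_set(center_size=2.0)
    print(np.diff(small))            # absolute steps in the small set
    print(np.diff(large))            # twice as large in the large set
    print(large[1:] / large[:-1])    # the same 1.10 ratio in both sets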
Article
Full-text available
Ensemble representations are often described as efficient tools when summarizing features of multiple similar objects as a group. However, it can sometimes be more useful not to compute a single summary description for all of the objects if they are substantially different, for example when they belong to entirely different categories. It was proposed that the visual system can efficiently use the distributional information of ensembles to decide whether simultaneously displayed items belong to single or several different categories. Here we directly tested how the feature distribution of items in a visual array affects an ability to discriminate individual items (Experiment 1) and sets (Experiments 2–3) when participants were instructed explicitly to categorize individual objects based on the median of size distribution. We varied the width (narrow or fat) as well as the shape (smooth or two-peaked) of distributions in order to manipulate the ease of ensemble extraction from the items. We found that observers unintentionally relied on the grand mean as a natural categorical boundary and that their categorization accuracy increased as a function of the size differences among individual items and a function of their separation from the grand mean. For ensembles drawn from two-peaked size distributions, participants showed better categorization performance. They were more accurate at judging within-category ensemble properties in other dimensions (centroid and orientation) and less biased by superset statistics. This finding corroborates the idea that the two-peaked feature distributions support the “segmentability” of spatially intermixed sets of objects. Our results emphasize important roles of ensemble statistics (mean, range, distribution shape) in explicit visual categorization.
... Further, their 16-element stimulus had only four of these distinct and discriminable hues, each repeated four times. Thus, if considering the number of distinct hues in the stimulus, the sampling rate estimated by Maule and Franklin is consistent with our modeling results, which would indicate roughly two samples for the four-element stimulus, as well as previous estimates from other feature dimensions when only a small number of items were available to the observer (Allik et al., 2013;Haberman & Whitney, 2009;Sweeny & Whitney, 2014). It is also possible that the small number of distinct hues in Maule and Franklin's study may have encouraged a more cognitive strategy: Tokita et al. (2016) proposed that observers employ different strategies when estimating summary statistics from small sets and similar strategies with larger sets. ...
Article
Full-text available
Color serves both to segment a scene into objects and background and to identify objects. Although objects and surfaces usually contain multiple colors, humans can readily extract a representative color description, for instance, that tomatoes are red and bananas yellow. The study of color discrimination and identification has a long history, yet we know little about the formation of summary representations of multicolored stimuli. Here, we characterize the human ability to integrate hue information over space for simple color stimuli varying in the amount of information, stimulus size, and spatial configuration of stimulus elements. We show that humans are efficient at integrating hue information over space beyond what has been shown before for color stimuli. Integration depends only on the amount of information in the display and not on spatial factors such as element size or spatial configuration in the range measured. Finally, we find that observers spontaneously prefer a simple averaging strategy even with skewed color distributions. These results shed light on how human observers form summary representations of color and make a link between the perception of polychromatic surfaces and the broader literature of ensemble perception.
... Debriefing surveys and above-chance performance suggest that they tried to do the task. But either responses in the animacy ensemble task were in fact based on the information from very few items (see Myczek & Simons, 2008;Allik et al., 2013), or individual animacy representations were too noisy to build a reliable impression about overall display animacy. Thus, observers failed to successfully integrate animacy information from multiple items. ...
Preprint
Many studies have shown that people can rapidly and efficiently categorize the animacy of individual objects and scenes, even with few visual features available. Does this necessarily mean that the visual system has an unlimited capacity to process animacy across the entire visual field? We tested this in an ensemble task requiring observers to judge the relative numerosity of animate vs. inanimate items in briefly presented sets of multiple objects. We generated a set of morphed “animacy continua” between pairs of animal and inanimate object silhouettes and tested them in both individual object categorization and ensemble enumeration. For the ensemble task, we manipulated the ratio between animate and inanimate items present in the display and we also presented two types of animacy distributions: “segmentable” (including only definitely animate and definitely inanimate items) or “non-segmentable” (middle-value, ambiguous morph pictures were shown along with the definite “extremes”). Our results showed that observers failed to integrate animacy information from multiple items, as they showed very poor performance in the ensemble task and were not sensitive to the distribution type despite their categorization rate for individual objects being near 100%. A control condition using the same design with color as a category-defining dimension elicited both good individual object and ensemble categorization performance and a strong effect of the segmentability type. We conclude that good individual categorization does not necessarily allow people to build ensemble animacy representations, thus showing the limited capacity of animacy perception.
Preprint
Full-text available
This report targets the claim that gist representations of visual stimuli, called "ensemble averages", are perceptual representations of statistics pertaining to stimuli. We report predictions of a mathematical model based on classical memory architectures which assumes ensemble averages are statistical approximations to stimuli, and that those approximations are constructed within short-term memory. We report results of three new experiments that test those predictions. The results support the memory model and contradict the consensus view that representations of ensemble averages are computed early in perceptual processing via parallel processing or neural pooling, suggesting instead that they are computed via control processes acting on item representations held in visual short-term memory. We conclude that the flight toward new mechanisms that has occurred within the ensemble representation literature is ill-advised, and suggest that one first carefully consider what well-established memory models can accomplish in the ensemble "perception" domain.
Article
Growing empirical evidence shows that ensemble information (e.g., the average feature or feature variance of a set of objects) affects visual working memory for individual items. Recently, Harrison, McMaster, and Bays (2021) used a change detection task to test whether observers explicitly rely on ensemble representations to improve their memory for individual objects. They found that sensitivity to simultaneous changes in all memorized items (which also globally changed set summary statistics) rarely exceeded a level predicted by the so-called optimal summation model within the signal-detection framework. This model implies simple integration of evidence for change from all individual items and no additional evidence coming from the ensemble. Here, we argue that performance at the level of optimal summation does not rule out the use of ensemble information. First, in two experiments, we show that, even if evidence from only one item is available at test, the statistics of the whole memory set affect performance. Second, we argue that optimal summation itself can be conceptually interpreted as one of the strategies of holistic, ensemble-based decision. We also redefine the reference level for the item-based strategy as the so-called "minimum rule," which predicts performance far below the optimum. We found that both our observers and those of Harrison et al. (2021) consistently outperformed this level. We conclude that observers can rely on ensemble information when performing visual change detection. Overall, our work clarifies and refines the use of signal-detection analysis in measuring and modeling working memory.
Book
Full-text available
This Element outlines the recent understanding of ensemble representations in perception in a holistic way aimed to engage the general audience, novel and expert alike. The Element highlights the ubiquitous nature of this summary process, paving the way for a discussion of the theoretical and cortical underpinnings, and why ensemble encoding should be considered a basic, inherently necessary component of human perception. Following an overview of the topic, including a brief history of the field, the Element introduces overarching themes and a corresponding outline of the present work.
Chapter
This chapter provides an overall perspective on human decision making to human factors practitioners, developers of decision tools, product designers, and others who are interested in how people make decisions and how decision making might be improved. It presents a broad set of prescriptive and descriptive approaches. The chapter introduces principles of rational choice suggested by classical decision theory, followed by a discussion of research on human decision making which has led to the new perspectives of behavioral decision theory and behavioral economics, and naturalistic decision models. It addresses the topic of decision support and problem solving. The main idea of adaptive decision behavior, or contingent decision behavior, is that an individual decision maker uses different strategies in different situations. The chapter also addresses methods of supporting or improving group decision making. Expert systems are developed to capture knowledge for a very specific and limited domain of human expertise.
Article
People can extract summary statistical information from groups of similar objects, an ability called ensemble perception. However, not every object in a group is weighted equally. For example, in ensemble emotion perception, faces far from fixation were weighted less than faces close to fixation. Yet the contribution of foveal input in ensemble emotion perception is still unclear. In two experiments, groups of faces with varying emotions were presented for 100 ms at three different eccentricities (0°, 3°, 8°). Observers reported the perceived average emotion of the group. In two conditions, stimuli consisted of a central face flanked by eight faces (flankers) (central-present condition) and eight faces without the central face (central-absent condition). In the central-present condition, the emotion of the central face was either congruent or incongruent with that of the flankers. In Experiment 1, flanker emotions were uniform (identical flankers); in Experiment 2 they were varied. In both experiments, performance in the central-present condition was superior at 3° compared to 0° and 8°. At 0°, performance was superior in the central-absent (i.e., no foveal input) compared to the central-present condition. Poor performance in the central-present condition was driven by the incongruent condition where the foveal face strongly biased responses. At 3° and 8°, performance was comparable between central-present and central-absent conditions. Our results showed how foveal input determined the perceived emotion of face ensembles, suggesting that ensemble perception fails when salient target information is available in central vision.
Article
To efficiently process complex visual scenes, the visual system often summarizes statistical information across individual items and represents them as an ensemble. However, due to the lack of techniques to disentangle the representation of the ensemble from that of the individual items constituting the ensemble, whether there exists a specialized neural mechanism for ensemble processing and how ensemble perception is computed in the brain remain unknown. To address these issues, we used a frequency-tagging EEG approach to track brain responses to periodically updated ensemble sizes. Neural responses tracking the ensemble size were detected in parieto-occipital electrodes, revealing a global and specialized neural mechanism of ensemble size perception. We then used the temporal response function to isolate neural responses to the individual sizes and their interactions. Notably, while the individual sizes and their local and global interactions were encoded in the EEG signals, only the global interaction contributed directly to the ensemble size perception. Finally, distributed attention to the global stimulus pattern enhanced the neural signature of the ensemble size, mainly by modulating the neural representation of the global interaction between all individual sizes. These findings advocate a specialized, global neural mechanism of ensemble size perception and suggest that global interaction between individual items contributes to ensemble perception.
Article
Ensemble perception of a crowd of stimuli is very accurate, even when individual stimuli are invisible due to crowding. This capacity for high-precision ensemble perception may be an evolved compensatory mechanism for the limited attentional resolution caused by crowding. Thus the relationship of crowding and ensemble coding is like two sides of the same coin, wherein attention may be a critical factor in their coexistence. The present study investigated whether crowding and ensemble coding were similarly modulated by attention, which can promote our understanding of their relation. Experiment 1 showed that diverting attention away from the target harmed the performance in both crowding and ensemble perception tasks regardless of stimulus density, but crowding was more severely harmed. Experiment 2 showed that directing attention toward the target bar enhanced the performance of crowding regardless of stimulus density. Ensemble perception of high-density bars was also enhanced but to a lesser extent, while ensemble perception of low-density bars was harmed. Together, our results indicate that crowding is strongly modulated by attention, whereas ensemble perception is only moderately modulated by attention, which conforms to the adaptive view.
Article
Full-text available
We examine the ability of observers to extract summary statistics (such as the mean and the relative-variance) from rapid numerical sequences of two-digit numbers presented at a rate of 4/s. In four experiments (total N = 100), we find that the participants show a remarkable ability to extract such summary statistics and that their precision in the estimation of the sequence-mean improves with the sequence-length (subject to individual differences). Using model selection for individual participants we find that, when only the sequence-average is estimated, most participants rely on a holistic process of frequency-based estimation, with a minority who rely on a (rule-based and capacity-limited) mid-range strategy. When both the sequence-average and the relative variance are estimated, about half of the participants rely on these two strategies. Importantly, the holistic strategy appears more efficient in terms of its precision. We discuss implications for the domains of two-pathway numerical processing and decision-making.
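To make the contrast between the two strategy families concrete, here is an illustrative simulation; the item noise level, memory capacity, and sequence parameters are arbitrary assumptions and not the fitted values from this study.

    import numpy as np

    rng = np.random.default_rng(2)

    def holistic_estimate(seq, item_noise=2.0):
        # Holistic (frequency-based) strategy: average every item,
        # each encoded with some noise.
        return np.mean(seq + rng.normal(0, item_noise, size=len(seq)))

    def midrange_estimate(seq, capacity=4, item_noise=2.0):
        # Capacity-limited mid-range strategy: keep only a few items and
        # report the midpoint of the smallest and largest remembered values.
        kept = rng.choice(seq, size=min(capacity, len(seq)), replace=False)
        kept = kept + rng.normal(0, item_noise, size=len(kept))
        return (kept.min() + kept.max()) / 2.0

    errors = {"holistic": [], "mid-range": []}
    for _ in range(5_000):
        seq = rng.integers(10, 100, size=10).astype(float)
        errors["holistic"].append(abs(holistic_estimate(seq) - seq.mean()))
        errors["mid-range"].append(abs(midrange_estimate(seq) - seq.mean()))
    print({k: round(float(np.mean(v)), 2) for k, v in errors.items()})

In such a simulation the holistic strategy typically yields the smaller average error, in line with the abstract's observation that it is the more precise of the two.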
Article
When forming impressions of groups, people's impressions tend to reflect the average rating of the group members. However, group impressions have also been found to depart from an unweighted average of member ratings. For example, in the recently reported group attractiveness effect, groups were found to be more attractive than would be expected based on the average rating of the group members. In contrast, other studies have found that groups are rated as less attractive than would be expected. In two experiments, we found evidence for a group extremity effect that can help explain these prior findings. In this group extremity effect, group ratings of negatively and positively evaluated groups of faces are more extreme in either direction than would be expected based on the average ratings of the members. For negatively evaluated groups of faces, group ratings were significantly more negative than would be expected based on the average ratings of the group members. For positively evaluated groups of faces, group ratings were significantly more positive than would be expected based on the average ratings of the group members. The group extremity effect was larger for groups with more variability in the ratings of the group members, suggesting that attention to extreme group members underlies the effect. These data demonstrate how the biases involved in evaluating individuals based on appearance can be amplified when rating groups.
Article
For efficient use of limited capacity, the visual system summarizes redundant information and prioritizes relevant information, strategies known respectively as ensemble perception and selective attention. Although previous studies showed a close relationship between these strategies, the specific mechanisms underlying the relationship have not been determined. We investigated how attention modulated mean-size computation. Fourteen people participated in this study. We hypothesized that attention biases mean-size computation by increasing the contribution (weighted averaging) and the apparent size (perceptual enlargement) of an attended item. Consistent with this hypothesis, our results showed that estimated mean sizes were biased toward the attended size and overestimated regardless of the attended size, supporting weighted averaging and perceptual enlargement, respectively. Taken together, the observed effects of selective attention on mean-size computation signify a close relationship between the two optimization mechanisms to achieve efficient management of the visual system’s limited capacity.
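Expressed as a toy computation (the weight and enlargement factor are arbitrary assumptions used only to illustrate the two hypothesized effects), the account reads roughly as follows:

    import numpy as np

    def attended_mean(sizes, attended_idx, w_attended=2.0, enlargement=1.3):
        # The attended item is perceptually enlarged and also given extra
        # weight when the items are averaged.
        sizes = np.asarray(sizes, dtype=float).copy()
        weights = np.ones_like(sizes)
        sizes[attended_idx] *= enlargement       # perceptual enlargement
        weights[attended_idx] = w_attended       # weighted averaging
        return np.average(sizes, weights=weights)

    sizes = [1.0, 1.2, 1.4, 1.6]                 # true mean = 1.30
    print(attended_mean(sizes, attended_idx=0))  # attend smallest: about 1.36
    print(attended_mean(sizes, attended_idx=3))  # attend largest: about 1.55

Both estimates exceed the true mean while still shifting in the direction of the attended item's size, which mirrors the combination of overestimation and attended-size bias described in the abstract.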
Article
Full-text available
Increasing numbers of studies have explored human observers' ability to rapidly extract statistical descriptions from collections of similar items (e.g., the average size and orientation of a group of tilted Gabor patches). Determining whether these descriptions are generated by mechanisms that are independent from object-based sampling procedures requires that we investigate how internal noise, external noise, and sampling affect subjects' performance. Here we systematically manipulated the external variability of ensembles and used variance summation modeling to estimate both the internal noise and the number of samples that affected the representation of ensemble average size. The results suggest that humans sample many more than one or two items from an array when forming an estimate of the average size, and that the internal noise that affects ensemble processing is lower than the noise that affects the processing of single objects. These results are discussed in light of other recent modeling efforts and suggest that ensemble processing of average size relies on a mechanism that is distinct from segmenting individual items. This ensemble process may be more similar to texture processing.
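The variance-summation logic can be written down compactly. The sketch below uses a generic equivalent-noise formulation (the exact parameterization in the cited study may differ, and the parameter values are arbitrary assumptions): the variance of the observer's mean estimate equals the internal noise variance plus the external stimulus variance divided by the number of items sampled.

    import numpy as np

    def estimate_variance(sigma_ext, sigma_int, n_samples):
        # Variance of the observer's estimate of the mean:
        # internal noise plus external variance reduced by sampling.
        return sigma_int ** 2 + sigma_ext ** 2 / n_samples

    # Squared thresholds are linear in squared external noise, with slope
    # 1 / n_samples and intercept sigma_int ** 2, so both parameters can be
    # recovered from thresholds measured at several external-noise levels.
    ext = np.array([0.0, 0.05, 0.1, 0.2, 0.4])
    obs = estimate_variance(ext, sigma_int=0.1, n_samples=8)
    slope, intercept = np.polyfit(ext ** 2, obs, 1)
    print(1 / slope, np.sqrt(intercept))   # recovers n_samples = 8, sigma_int = 0.1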
Article
Full-text available
The visual system rapidly represents the mean size of sets of objects (Ariely, 2001). Here, we investigated whether mean size is explicitly encoded by the visual system, along a single dimension like texture, numerosity, and other visual dimensions susceptible to adaptation. Observers adapted to two sets of dots with different mean sizes, presented simultaneously in opposite visual fields. After adaptation, two test patches replaced the adapting dot sets, and participants judged which test appeared to have the larger average dot diameter. They generally perceived the test that replaced the smaller mean size adapting set as being larger than the test that replaced the larger adapting set. This differential aftereffect held for single test dots (Experiment 2) and high-pass filtered displays (Experiment 3), and changed systematically as a function of the variance of the adapting dot sets (Experiment 4), providing additional support that mean size is adaptable and is therefore an explicitly encoded dimension of visual scenes.
Article
Full-text available
The human observer is surprisingly inaccurate in discriminating proportions between two spatially overlapping sets of randomly distributed elements moving in opposite directions. It was shown that observers took into account an equivalent of 74 % of all moving elements when the task was to estimate their relative number, but only an equivalent of 21 % of the same elements when the task was to discriminate between opposite directions. It was concluded that, in the motion direction discrimination task, a large proportion of the signal from all of the elements was inaccessible to the observers, whereas the majority of the signal was accessible in a numerosity task. This type of perceptual limitation belongs to the attentional blindness category, where a strong sensory signal cannot be noticed when processing is diverted by parallel events. In addition, we found no evidence for the common-fate principle, as the ability to discriminate numerical proportions remained the same, irrespective of whether all estimated elements were moving coherently in one direction or unpredictably in opposite directions.
Article
Full-text available
We have a remarkable ability to accurately estimate average featural information across groups of objects, such as their average size or orientation. It has been suggested that, unlike individual object processing, this process of feature averaging occurs automatically and relatively early in the course of perceptual processing, without the need for objects to be processed to the same extent as is required for individual object identification. Here, we probed the processing stages involved in feature averaging by examining whether feature averaging is resistant to object substitution masking (OSM). Participants estimated the average size (Experiment 1) or average orientation (Experiment 2) of groups of briefly presented objects. Masking a subset of the objects using OSM reduced the extent to which these objects contributed to estimates of both average size and average orientation. Contrary to previous findings, these results suggest that feature averaging benefits from late stages of processing, subsequent to the initial registration of featural information. (PsycINFO Database Record (c) 2012 APA, all rights reserved).
Article
Full-text available
Why do the equally spaced dots in figure 1 appear regularly spaced? The answer 'because they are' is naive and ignores the existence of sensory noise, which is known to limit the accuracy of positional localization. Actually, all the dots in figure 1 have been physically perturbed, but in the case of the apparently regular patterns to an extent that is below threshold for reliable detection. Only when retinal pathology causes severe distortions do regular grids appear perturbed. Here, we present evidence that low-level sensory noise does indeed corrupt the encoding of relative spatial position, and limits the accuracy with which observers can detect real distortions. The noise is equivalent to a Gaussian random variable with a standard deviation of approximately 5 per cent of the inter-element spacing. The just-noticeable difference in positional distortion between two patterns is smallest when neither of them is perfectly regular. The computation of variance is statistically inefficient, typically using only five or six of the available dots.
Article
Full-text available
( This reprinted article originally appeared in Psychological Review, 1927, Vol 34, 273–286. The following is a modified version of the original abstract which appeared in PA, Vol 2:527. ) Presents a new psychological law, the law of comparative judgment, along with some of its special applications in the measurement of psychological values. This law is applicable not only to the comparison of physical stimulus intensities but also to qualitative judgments, such as those of excellence of specimens in an educational scale. The law is basic for work on Weber's and Fechner's laws, applies to the judgments of a single observer who compares a series of stimuli by the method of paired comparisons when no "equal" judgments are allowed, and is a rational equation for the method of constant stimuli.
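In the notation of standard textbook treatments (rather than the 1927 paper's own symbols), the law can be written as

    S_j - S_i = z_{ij}\,\sqrt{\sigma_i^2 + \sigma_j^2 - 2\,r_{ij}\,\sigma_i\sigma_j} \qquad \text{(general form)}

    S_j - S_i = z_{ij}\,\sigma\sqrt{2} \qquad \text{(Case V: } \sigma_i = \sigma_j = \sigma,\; r_{ij} = 0\text{)}

where S_i and S_j are the scale values of the two stimuli, sigma_i and sigma_j their discriminal dispersions, r_ij their correlation, and z_ij the normal deviate corresponding to the observed proportion of "j judged greater than i".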
Article
Full-text available
Different laboratories have achieved a consensus regarding how well human observers can estimate the average orientation in a set of N objects. Such estimates are not only limited by visual noise, which perturbs the visual signal of each object's orientation, they are also inefficient: Observers effectively use only √N objects in their estimates (e.g., S. C. Dakin, 2001; J. A. Solomon, 2010). More controversial is the efficiency with which observers can estimate the average size in an array of circles (e.g., D. Ariely, 2001, 2008; S. C. Chong, S. J. Joo, T.-A. Emmanouil, & A. Treisman, 2008; K. Myczek & D. J. Simons, 2008). Of course, there are some important differences between orientation and size; nonetheless, it seemed sensible to compare the two types of estimate against the same ideal observer. Indeed, quantitative evaluation of statistical efficiency requires this sort of comparison (R. A. Fisher, 1925). Our first step was to measure the noise that limits size estimates when only two circles are compared. Our results (Weber fractions between 0.07 and 0.14 were necessary for 84% correct 2AFC performance) are consistent with the visual system adding the same amount of Gaussian noise to all logarithmically transduced circle diameters. We exaggerated this visual noise by randomly varying the diameters in (uncrowded) arrays of 1, 2, 4, and 8 circles and measured its effect on discrimination between mean sizes. Efficiencies inferred from all four observers significantly exceed 25% and, in two cases, approach 100%. More consistent are our measurements of just-noticeable differences in size variance. These latter results suggest between 62 and 75% efficiency for variance discriminations. Although our observers were no more efficient comparing size variances than they were at comparing mean sizes, they were significantly more precise. In other words, our results contain evidence for a non-negligible source of late noise that limits mean discriminations but not variance discriminations.
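The relation between the reported Weber fractions and the implied noise on log-transduced diameters follows from a standard signal-detection calculation; the short sketch below (using scipy's normal quantile function) is offered as an illustration of that relation, not as the authors' own fitting procedure.

    import numpy as np
    from scipy.stats import norm

    def internal_sd_from_weber(weber_fraction, pc=0.84):
        # With log transduction, the signal separating the two 2AFC
        # intervals is ln(1 + w); proportion correct in 2AFC is
        # Phi(d' / sqrt(2)) with d' = ln(1 + w) / sigma, so sigma follows
        # by inverting the normal CDF at the criterion performance level.
        dprime = norm.ppf(pc) * np.sqrt(2)
        return np.log1p(weber_fraction) / dprime

    for w in (0.07, 0.14):
        print(w, round(internal_sd_from_weber(w), 3))

For Weber fractions of 0.07 to 0.14 at 84% correct, this works out to a Gaussian noise of roughly 0.05 to 0.09 in log-diameter units per item.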
Article
Full-text available
From the principle that subjective dissimilarity between 2 stimuli is determined by their ratio, Fechner derives his logarithmic law in 2 ways. In one derivation, ignored and forgotten in modern accounts of Fechner's theory, he formulates the principle in question as a functional equation and reduces it to one with a known solution. In the other derivation, well known and often criticized, he solves the same functional equation by differentiation. Both derivations are mathematically valid (the much-derided "expedient principle" mentioned by Fechner can be viewed as merely an inept way of pointing at a certain property of the differentiation he uses). Neither derivation uses the notion of just-noticeable differences. But if Weber's law is accepted in addition to the principle in question, then the dissimilarity between 2 stimuli is approximately proportional to the number of just-noticeable differences that fit between these stimuli: The smaller Weber's fraction the better the approximation, and Weber's fraction can always be made arbitrarily small by an appropriate convention. We argue, however, that neither the 2 derivations of Fechner's law nor the relation of this law to thresholds constitutes the essence of Fechner's approach. We see this essence in the idea of additive cumulation of sensitivity values. Fechner's work contains a surprisingly modern definition of sensitivity at a given stimulus: the rate of growth of the probability-of-greater function, with this stimulus serving as a standard. The idea of additive cumulation of sensitivity values lends itself to sweeping generalizations of Fechnerian scaling.
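In schematic form, the functional-equation route runs as follows (a compact restatement, not Fechner's own notation): the ratio principle, combined with additive cumulation of dissimilarities along the continuum, forces a logarithmic solution,

$$D(\varphi_1,\varphi_2)=F\!\left(\frac{\varphi_1}{\varphi_2}\right),\qquad F(xy)=F(x)+F(y)\;\Rightarrow\;F(x)=k\log x,\qquad \psi=k\log\frac{\varphi}{\varphi_0}.$$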
Article
Full-text available
An optimal agent will base judgments on the strength and reliability of decision-relevant evidence. However, previous investigations of the computational mechanisms of perceptual judgments have focused on integration of the evidence mean (i.e., strength), and overlooked the contribution of evidence variance (i.e., reliability). Here, using a multielement averaging task, we show that human observers process heterogeneous decision-relevant evidence more slowly and less accurately, even when signal strength, signal-to-noise ratio, category uncertainty, and low-level perceptual variability are controlled for. Moreover, observers tend to exclude or downweight extreme samples of perceptual evidence, as a statistician might exclude an outlying data point. These phenomena are captured by a probabilistic optimal model in which observers integrate the log odds of each choice option. Robust averaging may have evolved to mitigate the influence of untrustworthy evidence in perceptual judgments.
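One simple way to produce the robust-averaging behaviour described here is to integrate per-element log odds under category likelihoods that include a small outlier (lapse) component, which saturates the log odds for extreme samples. The sketch below illustrates that idea under assumed parameters; it is not the authors' fitted model.

```python
import numpy as np
from scipy.stats import norm

def log_odds(x, mu_a=1.0, mu_b=-1.0, sigma=1.0, lapse=0.05, support=10.0):
    """Log odds that a single evidence sample x came from category A rather
    than B. The small uniform 'lapse' component (an assumption made here for
    illustration) makes the log odds saturate for extreme samples, so
    outliers are effectively downweighted relative to a linear average."""
    like_a = (1 - lapse) * norm.pdf(x, mu_a, sigma) + lapse / (2 * support)
    like_b = (1 - lapse) * norm.pdf(x, mu_b, sigma) + lapse / (2 * support)
    return np.log(like_a) - np.log(like_b)

def choose_a(samples):
    """Decision rule: sum the log odds over all elements and choose A if the
    accumulated evidence favours it."""
    return np.sum(log_odds(np.asarray(samples))) > 0

# A single extreme element sways the linear average to the wrong side,
# while the robust log-odds integrator is barely affected.
samples = [0.4, 0.3, 0.5, -6.0]        # one outlier pointing to B
print(np.mean(samples) > 0, choose_a(samples))
```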
Article
Full-text available
People are sensitive to the summary statistics of the visual world (e.g., average orientation/speed/facial expression). We readily derive this information from complex scenes, often without explicit awareness. Given the fundamental and ubiquitous nature of summary statistical representation, we tested whether this kind of information is subject to the attentional constraints imposed by change blindness. We show that information regarding the summary statistics of a scene is available despite limited conscious access. In a novel experiment, we found that while observers can suffer from change blindness (i.e., not localize where change occurred between two views of the same scene), observers could nevertheless accurately report changes in the summary statistics (or "gist") about the very same scene. In the experiment, observers saw two successively presented sets of 16 faces that varied in expression. Four of the faces in the first set changed from one emotional extreme (e.g., happy) to another (e.g., sad) in the second set. Observers performed poorly when asked to locate any of the faces that changed (change blindness). However, when asked about the ensemble (which set was happier, on average), observer performance remained high. Observers were sensitive to the average expression even when they failed to localize any specific object change. That is, even when observers could not locate the very faces driving the change in average expression between the two sets, they nonetheless derived a precise ensemble representation. Thus, the visual system may be optimized to process summary statistics in an efficient manner, allowing it to operate despite minimal conscious access to the information presented.
Article
Full-text available
The present research concerns the hypothesis that intuitive estimates of the arithmetic mean of a sample of numbers tend to increase as a function of the sample size; that is, they reflect a systematic sample size bias. A similar bias has been observed when people judge the average member of a group of people on an inferred quantity (e.g., a disease risk; see Price, 2001; Price, Smith, & Lench, 2006). Until now, however, it has been unclear whether it would be observed when the stimuli were numbers, in which case the quantity need not be inferred, and "average" can be precisely defined as the arithmetic mean. In two experiments, participants estimated the arithmetic mean of 12 samples of numbers. In the first experiment, samples of from 5 to 20 numbers were presented simultaneously and participants quickly estimated their mean. In the second experiment, the numbers in each sample were presented sequentially. The results of both experiments confirmed the existence of a systematic sample size bias.
Article
Full-text available
This paper explores the nature of the representations used for computing mean visual size of an array of visual objects of different sizes. In Experiment 1 we found that mean size judgments are accurately made even when the individual objects (circles) upon which those judgments were based were distributed between the two eyes. Mean size judgments were impaired, however, when a subset of the constituent objects involved in the estimation of mean size were rendered invisible by interocular suppression. These findings suggest that mean size is computed from relatively refined stimulus information represented at stages of visual processing beyond those involved in binocular combination and interocular suppression. In Experiment 2 we used an attentional blink paradigm to learn whether this refined information was susceptible to the constraints of attention. Accuracy of mean size judgments was unchanged when one of the two arrays of circles was presented within a rapid serial visual presentation sequence, regardless of task requirement (single vs. dual task) and the array's time of presentation relative to the brief appearance of a target that was the focus of attention. Evidently the refined stimulus information used for computing mean size remains available even in the absence of focused attention.
Article
Full-text available
We frequently encounter groups of similar objects in our visual environment: a bed of flowers, a basket of oranges, a crowd of people. How does the visual system process such redundancy? Research shows that rather than code every element in a texture, the visual system favors a summary statistical representation of all the elements. The authors demonstrate that although it may facilitate texture perception, ensemble coding also occurs for faces-a level of processing well beyond that of textures. Observers viewed sets of faces varying in emotionality (e.g., happy to sad) and assessed the mean emotion of each set. Although observers retained little information about the individual set members, they had a remarkably precise representation of the mean emotion. Observers continued to discriminate the mean emotion accurately even when they viewed sets of 16 faces for 500 ms or less. Modeling revealed that perceiving the average facial expression in groups of faces was not due to noisy representation or noisy discrimination. These findings support the hypothesis that ensemble coding occurs extremely fast at multiple levels of visual analysis.
Article
Full-text available
In this brief response to commentaries by Ariely (2008) and Chong, Joo, Emmanouil, and Treisman (2008) on our earlier article, we highlight the two key assumptions underlying earlier claims about statistical summary representations of object size and argue that existing studies have not met either of them. We note why statistical summary representations of size are different from such representations of motion or orientation, and we emphasize the need for simulations of performance to exclude focused attention explanations for judgments of average size.
Article
Full-text available
Myczek and Simons (2008) have shown that findings attributed to a statistical mode of perceptual processing can, instead, be explained by focused attention to samples of just a few items. Some new findings raise questions about this claim. (1) Participants, given conditions that would require different focused attention strategies, did no worse when the conditions were randomly mixed than when they were blocked. (2) Participants were significantly worse at estimating the mean size when given small samples than when given the whole display. (3) One plausible suggested strategy--comparing the largest item in each display, rather than the mean size--was not, in fact, used. Distributed attention to sets of similar stimuli, enabling a statistical-processing mode, provides a coherent account of these and other phenomena.
Article
Full-text available
18 male Os made magnitude estimates of average length (AL) and average inclination (AI) on arrays of lines representing mean lengths of 8-73 in. and mean inclinations of 10-80°. Each array on which AL was judged contained 6 horizontal lines of different lengths, and each array on which AI was judged contained 6 lines of constant length sloping at different angles in the 1st quadrant. AL was judged in proportion to a continuously present standard line 12, 36, or 60 in. long, and AI in proportion to a standard inclined at 20, 45, or 70° above horizontal. Relationships between subjective and stimulus magnitude at both group and individual levels were adequately described by linear functions for AI (slopes of .80-1.20 and Ψ intercepts of -6.29 to 10.88) and by power functions for AL (exponents of .84-1.19). Under the medium and large AL standards, linear functions were equally satisfactory. An interpretation was given in terms of the relationship between subjective average and the continuum averaged. (16 ref.)
Article
Full-text available
Considers experimental research that has used probability theory and statistics as a framework within which to study human statistical inference. Experiments have investigated estimates of proportions, means, variances, and correlations, both of samples and of populations. In some experiments, parameters of populations were stationary; in others, the parameters changed over time. The experiments also investigated the determination of sample size and trial-by-trial predictions of events to be sampled from a population. In general, the results indicate that probability theory and statistics can be used as the basis for psychological models that integrate and account for human performance in a wide range of inferential tasks. (115 ref.) (PsycINFO Database Record (c) 2006 APA, all rights reserved).
Article
Full-text available
The cinematograms of 12 two-state elements arranged in the clock positions in space and in a sequence of adjacent 100-ms frames in time were used as stimuli. Some positions in each frame (or all 12 of them) could be labeled as "domain" ones, and every element that was T frames and S positions (clockwise or counterclockwise) apart from a domain element could repeat the latter's state with probability P. The probability of the rotation direction identification was obtained as a function of T, S, P, number of frames, and the domain positions selection scheme. A generalized version of the reversed phi phenomenon was obtained: if P less than .5, then the psychometric value lies below .5 level. All the data can be accounted for by a simple model according to which the choice of direction is based on the counts of the different types of dipoles, each type being characterized by the probability and the weight of its count: In most situations all dipoles but the shortest ones (connecting the neighboring elements of successive frames) can be ignored.
Article
One set of properties that may emerge when attention is distributed over an array of similar items is the general statistics of the set. We investigated mean size perception using three different methods: a change detection "flicker" task, implicit priming, and a dual task paradigm. In the change detection task, participants discriminated a change of size from a change of location of an array of circles of different sizes. The changes either did or did not result in a mean change of size. Performance was more efficient when the mean size changed than when it did not, suggesting that the mean is perceptually represented. In the implicit priming task, we used a same-different size judgment on two circles, preceded by a prime display containing 12 circles of two different sizes. One or both of the target circles could match either the mean size (that was never presented), or one of the two primed sizes, or a size that was 20% larger or smaller than a presented size. The priming benefit was as large when the targets matched the mean size of the prime display as when they matched a size that was actually presented. To manipulate the deployment of attention, we used a dual task paradigm in which the secondary task required either global or focused attention. Thresholds for judging the mean size of circles in the array were lower when the concurrent task required global compared to local attention, even when the secondary tasks were closely matched in difficulty. The results support the proposal that we preserve statistical information from sets of similar objects rather than representing all the detailed information and that this information is best extracted when attention is globally deployed. Means of sets may be one ingredient of a schematic representation that could explain our coherent perception of a visual scene.
Article
Ss lifted a sequence of six weights with instructions to judge their average heaviness. A model that took the judgment to be an average of the stimulus values, weighted for serial position, gave a satisfactory quantitative account of the data. The later weights in each sequence had greater influence on the judgment, a recency effect.
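A serial-position-weighted average of this kind is easy to state explicitly; the weights below are hypothetical values chosen only to show how rising weights reproduce a recency effect.

```python
import numpy as np

def judged_average(stimulus_values, weights):
    """Serial-position-weighted average of the lifted weights, the model form
    described in the abstract; the weights themselves are free parameters."""
    w = np.asarray(weights, dtype=float)
    x = np.asarray(stimulus_values, dtype=float)
    return np.sum(w * x) / np.sum(w)

# Hypothetical weights that rise over serial position produce the reported
# recency effect: later items pull the judgment more than earlier ones.
heaviness = [200, 200, 200, 200, 200, 600]          # grams, last item heavy
print(judged_average(heaviness, weights=[1, 1, 1, 1, 1, 1]),
      judged_average(heaviness, weights=[0.6, 0.7, 0.8, 0.9, 1.0, 1.2]))
```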
Article
An investigation is described of the types of averaging response made to several values of a variable when these are presented symbolically or graphically. Symbolic information was presented to subjects on small white cards on which were typed either 10 or 20 two-digit numbers. Graphical information was presented as 10 or 20 points on graph paper ruled in tenths of an inch. Sets of data differed according to whether they were normally or skew distributed about the arithmetic mean value, and in the extent of scatter about the mean. The results confirmed previous work in showing that error of judgment increased as scatter increased, and to a greater extent with symbolic than with graphical material. Error was also greater with skew than with normal information. Comparison between the results obtained with the two types of distribution showed that the previously reported finding that error also increased with increasing amount of information was incorrect and was due to inadequate experimental control of scatter for different amounts of information. Analysis of individual performances showed that subjects' responses differed in type and that their introspective reports related to their performance. In particular, the proportion of judgments which could be classed as of the 'arithmetic mean' type varied significantly with the methods described. Further experiments are suggested to try to discover which of two possible models of mental averaging is correct and why error increased with increased scatter in the information.
Article
Perhaps the most basic fact about human visual encoding of relative spatial position is that the length or separation discrimination threshold increases with the mean length or separation being judged. In this study, the cause of this increase was investigated by measuring the effect of a parallel flanking line on the perceived separation of a pair of target lines. A standard separation discrimination paradigm was used with the flanking line placed outside the target pair. Perceived target separation was increased by the presence of the flanking line whenever the distance to the flanking line was less than the mean target separation. Modeling this effect as the product of a Gaussian weighting function times the distance to the flanking line, we inferred the size of the position integration area. The increase in the position integration area with increasing separation was found to be sufficient to account for the concomitant increase in separation discrimination thresholds, i.e., for Weber’s law for separation.
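In schematic form (a paraphrase of the model described, with the exact parameterisation left open), the flank-induced shift and the resulting Weber behaviour can be written as

$$\Delta S(d)\;\propto\; d\,\exp\!\left(-\frac{d^{2}}{2\sigma^{2}}\right),\qquad \sigma\propto S\;\Rightarrow\;\text{JND}(S)\propto S,$$

where \(d\) is the distance to the flanking line, \(S\) the mean target separation, and \(\sigma\) the width of the position integration area, assumed here to grow roughly in proportion to the separation being judged.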
Article
A certain class of problems does not as yet appear to be solved according to scientific rules, though they are of much importance and of frequent recurrence. Two examples will suffice. (1) A jury has to assess damages. (2) The council of a society has to fix on a sum of money, suitable for some particular purpose. Each voter, whether of the jury or of the council, has equal authority with each of his colleagues. How can the right conclusion be reached, considering that there may be as many different estimates as there are members? That conclusion is clearly not the average of all the estimates, which would give a voting power to "cranks" in proportion to their crankiness. One absurdly large or small estimate would leave a greater impress on the result than one of reasonable amount, and the more an estimate diverges from the bulk of the rest, the more influence would it exert. I wish to point out that the estimate to which least objection can be raised is the middlemost estimate, the number of votes that it is too high being exactly balanced by the number of votes that it is too low. Every other estimate is condemned by a majority of voters as being either too high or too low, the middlemost alone escaping this condemnation. The number of voters may be odd or even. If odd, there is one middlemost value; thus in 11 votes the middlemost is the 6th; in 99 votes the middlemost is the 50th. If the number of voters be even, there are two middlemost values, the mean of which must be taken; thus in 12 votes the middlemost lies between the 6th and the 7th; in 100 votes between the 50th and the 51st. Generally, in 2n-1 votes the middlemost is the nth; in 2n votes it lies between the nth and the (n + 1)th.
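A small numerical illustration (figures invented for the example) of why the middlemost estimate resists the influence of an extravagant voter:

```python
import statistics

# Galton's point in miniature: one extravagant estimate drags the mean but
# leaves the middlemost (median) value essentially untouched.
estimates = [900, 1000, 1100, 1200, 50000]   # hypothetical damages, pounds
print(statistics.mean(estimates))            # 10840 -> swayed by the crank
print(statistics.median(estimates))          # 1100  -> the middlemost vote
```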
Article
An investigation is described of the ability to estimate averages of several values of a variable presented either symbolically or graphically. Symbolic information was presented to subjects on small white cards on which were typed between 10 and 20 two-digit numbers. Graphical information was presented as 10 to 20 points on graph paper ruled in tenths of an inch. In most cases described, subjects were allowed ten seconds to examine the information for judgement. With symbolic information they were allowed unlimited time after this in which to form a judgement, but with graphical information the examination time included the setting of a cursor line at a position across the graph corresponding to the judged average value. The information presented varied not only in amount but also in scatter of the values about their mean. The results show that for both symbolic and graphical information, the error of judgement increases with increasing amount and scatter of material presented. The effects of increasing amounts of information are much more marked with high scatter than with low scatter material. Differences between subjects were negligible amongst students but were pronounced among a group of chemical process operators. Scatter was much the most important variable affecting averaging accuracy, and some interesting results were obtained when the material presented contained one item markedly different in value from its fellows. The results are discussed and further experiments are suggested to increase an understanding of the mechanisms of statistical judgements.
Article
When two elements are presented closely aligned, the average saccade endpoint will generally be located in between these two elements. This 'global effect' has been explained in terms of the center of gravity account which states that the saccade endpoint is based on the relative saliency of the different elements in the visual display. In the current study, we tested one of the implications of the center of gravity account: when two elements are presented closely aligned with the same size and the same distance from central fixation, the saccade should land on the intermediate location, irrespective of the stimulus size. To this end, two equally-sized elements were presented simultaneously and participants were required to execute an eye movement to the visual information presented on the display. Results showed that the strongest global effect was observed in the condition with smaller stimuli, whereas the saccade averaging was weaker when larger stimuli were presented. In a second experiment, in which only one element was presented, we observed that the width of the distribution of saccade endpoints is influenced by stimulus size in that the distribution is broader with smaller stimuli. We conclude that perfect saccade averaging is not always the default response by the oculomotor system. There appears to be a tendency to initiate an eye movement towards one of the visual elements, which becomes stronger with increasing stimulus size. This effect might be explained by an increased uncertainty in target localization for smaller stimuli, resulting in a higher probability of the merging of two stimulus representations into one representation.
Article
Recent psychophysical investigations showed that humans have the ability to compute the mean size of a set of visual objects. The investigations suggest that the visual system is able to form an overall, statistical representation of a set of objects, while the information about individual members of the set is lost. We proposed a neural model that computes the mean size of a set of similar objects. The model is a feedforward, two-dimensional neural network with three layers. Computer simulations showed that the presented model of statistical processing is able to form abstract numerical representation and to compute the mean size independently from the visual appearance of objects. This is achieved in a fast, parallel manner without serial scanning of the visual field. The mean size is computed indirectly by comparing the total activity in the input layer and in the third layer. Therefore, the information about the size of individual elements is lost. An extended model is able to hold statistical information in the working memory and to handle the computation of the mean size for surfaces with empty interiors.
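The ratio idea at the heart of the model can be sketched in a few lines; the connected-components count below is only a stand-in for the network's intermediate layer, and the example image is invented.

```python
import numpy as np
from scipy.ndimage import label

def mean_size_from_pooled_activity(binary_image):
    """Schematic version of the ratio idea described in the abstract: one
    pool sums all size-related activity (total active area), another pool
    effectively registers one unit per object, and the mean size falls out
    as the ratio of the two pooled signals."""
    total_area = binary_image.sum()              # summed input-layer activity
    _, n_objects = label(binary_image)           # one count per object
    return total_area / max(n_objects, 1)

# Two squares of area 9 and one of area 25 -> mean size (25 + 9 + 9) / 3
img = np.zeros((20, 20), dtype=int)
img[1:4, 1:4] = 1
img[1:4, 10:13] = 1
img[10:15, 10:15] = 1
print(mean_size_from_pooled_activity(img))       # about 14.33
```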
Article
Although we intuitively believe that salient or distinctive objects will capture our attention, surprisingly often they do not. For example, drivers may fail to notice another car when trying to turn or a person may fail to see a friend in a cinema when looking for an empty seat, even if the friend is waving. The study of attentional capture has focused primarily on measuring the effect of an irrelevant stimulus on task performance. In essence, these studies explore how well observers can ignore something they expect but know to be irrelevant. By contrast, the real-world examples above raise a different question: how likely are subjects to notice something salient and potentially relevant that they do not expect? Recently, several new paradigms exploring this question have found that, quite often, unexpected objects fail to capture attention, a phenomenon known as ‘inattentional blindness’. This review considers evidence for the effects of irrelevant features both on performance (‘implicit attentional capture’) and on awareness (‘explicit attentional capture’). Taken together, traditional studies of implicit attentional capture and recent studies of inattentional blindness provide a more complete understanding of the varieties of attentional capture, both in the laboratory and in the real world.
Article
Despite several processing limitations that have been identified in the visual system, research shows that statistical information about a set of objects could be perceived as accurately as the information about a single object. It has been suggested that extraction of summary statistics represents a different mode of visual processing, which employs a parallel mechanism free of capacity limitations. Here, we demonstrate, using reaction time measures, that increasing the number of stimuli in the set results in faster reaction times and better accuracy for estimating the mean tendency of a set. These results provide clear evidence that extraction of summary statistics relies on a distributed attention mode that operates across the whole display at once and that this process benefits from larger samples across which the summary statistics are calculated.
Article
We tested Ariely's (2001) proposal that the visual system represents the overall statistical properties of sets of objects against alternative accounts of rapid averaging involving sub-sampling strategies. In four experiments, observers could rapidly extract the mean size of a set of circles presented in an RSVP sequence, but could not reliably identify individual members. Experiment 1 contrasted performance on a member identification task with performance on a mean judgment task, and showed that the tasks could be dissociated based on whether the test probe was presented before or after the sequence, suggesting that member identification and mean judgment are subserved by different mechanisms. In Experiment 2, we confirmed that when given a choice between a probe corresponding to the mean size of the set and a foil corresponding to the mean of the smallest and largest items only, the former is preferred to the latter, even when observers are explicitly instructed to average only the smallest and largest items. Experiment 3 showed that a test item corresponding to the mean size of the set could be reliably discriminated from a foil but the largest item in the set, differing by an equivalent amount, could not. In Experiment 4, observers rejected test items dissimilar to the mean size of the set in a member identification task, favoring test items that corresponded to the mean of the set over items that were actually shown. These findings suggest that mean representation is accomplished without explicitly encoding individual items.
Article
Six observers were asked to indicate in which of two opposite directions, to the right or to the left, an entire display appeared to move, based on the proportion of rightward vs. leftward motion elements, each of which was distinctly visible. The performance of each observer was described by Thurstone's discriminative-processes model and a Bernoulli trial model, which fitted the empirical psychometric functions equally well. Although formally it was impossible to discriminate between these two models, treating the observer as a counting device that measures a randomly selected subsample of all available motion elements had certain advantages. According to the Bernoulli trial model, decisions about the global motion direction in displays of 12-800 elements were based on taking into account about 4±2 randomly chosen moving-dot elements. This small number is not due to cancellation of the opposite motion vectors, since motion-direction recognition performance did not improve after the compared motion directions were made orthogonal. This may indicate that the motion pooling mechanism studied in our experiment is strongly limited in capacity.
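A minimal version of the counting (Bernoulli trial) model, with the sample size m as its only free parameter; the values printed are illustrative rather than fits to the reported data.

```python
import numpy as np
from scipy.stats import binom

def p_correct(prop_rightward, m):
    """Counting-model sketch: the observer inspects m randomly chosen
    elements and reports the direction of the majority, guessing on ties.
    prop_rightward is the proportion of elements actually moving rightward."""
    k = np.arange(m + 1)
    pk = binom.pmf(k, m, prop_rightward)
    majority_right = pk[k > m - k].sum()
    ties = pk[k == m - k].sum()
    return majority_right + 0.5 * ties

# With only ~4 elements sampled, the psychometric function stays shallow
# even though every element's direction is individually visible.
for m in (2, 4, 16):
    print(m, round(p_correct(0.6, m), 3))
```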
Article
Previous research suggests that sets of similar items are represented using a rapid averaging mechanism that automatically extracts statistical properties within 50 ms. However, typically in these studies, displays are not masked, so it is possible that the sets are available for longer than this duration. In the present study, using masked displays, we (a) tested a newly proposed strategy for extracting the mean size of a set of circles, and (b) more precisely evaluated the time course of rapid averaging. The results indicate that when viewing conditions are poor, performance can be explained by assuming that observers rely on information from previous trials. In this study, observers required at least a 200-ms exposure time in order to derive the average size of a set of circles without relying on information from previously-viewed sets, suggesting that rapid averaging is not as fast as previously assumed and, therefore, that it may not be an automatic process.
Article
The visual system can only accurately represent a handful of objects at once. How do we cope with this severe capacity limitation? One possibility is to use selective attention to process only the most relevant incoming information. A complementary strategy is to represent sets of objects as a group or ensemble (e.g. represent the average size of items). Recent studies have established that the visual system computes accurate ensemble representations across a variety of feature domains and current research aims to determine how these representations are computed, why they are computed and where they are coded in the brain. Ensemble representations enhance visual cognition in many ways, making ensemble coding a crucial mechanism for coping with the limitations on visual processing.
Article
When required to identify the orientation of an item outside the center of the visual field, the mean orientation predicts performance better than the orientation of any individual item in that region. Here I examine whether the visual system also preserves the variance of orientations in these so-called "crowded" displays. In Experiment 1, I determined the separation between items necessary to prevent neighbors from interfering with discrimination between different orientations in a single target item. In Experiment 2, I used this separation and measured the effect of orientation variance on discrimination between mean orientations in these consequently uncrowded displays. In Experiment 3, I measured the relationship between the just-noticeable difference in variance and the smaller of two orientation variances in uncrowded displays. Finally, in Experiments 4 and 5, I reduced the separation between items and measured the effect of crowding on mean and variance discriminations. When considered together, the results of all these experiments imply that the visual system computes orientation variances with both more efficiency and greater precision than it computes orientation means. Although crowding made it difficult for some observers to discriminate between small amounts of orientation variance, it had no other significant effect on visual estimates of mean orientation and orientation variance.
Article
Stevens’s power law (Ψ ∝ Φ^β) captures the relationship between physical (Φ) and perceived (Ψ) magnitude for many stimulus continua (e.g., luminance and brightness, weight and heaviness, area and size). The exponent (β) indicates whether perceptual magnitude grows more slowly than physical magnitude (β < 1), directly as physical magnitude (β ≈ 1), or more quickly than physical magnitude (β > 1). These exponents are typically determined using judgments of single stimuli. Miller and Sheldon (1969) found that the validity of Stevens’s power law could be extended to the case where the mean of a property in an ensemble of items was judged (i.e., average length or average tilt, where β ≈ 1). The present experiments investigate the extension of this finding to perceived brightness with β ≈ 0.33 and find evidence consistent with predictions made by Miller and Sheldon.
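One schematic reading of the averaging extension (an assumption made here for exposition, not the papers' exact analysis): if each element is first transduced by the power law and the judged average is the mean of the transduced values, then

$$\bar\Psi=\frac{k}{n}\sum_{i=1}^{n}\Phi_i^{\beta},\qquad \Phi_{\text{match}}=\left(\frac{1}{n}\sum_{i=1}^{n}\Phi_i^{\beta}\right)^{1/\beta},$$

and the power-mean inequality implies \(\Phi_{\text{match}}\) falls below the arithmetic mean of the \(\Phi_i\) whenever \(\beta<1\) (as with brightness at \(\beta\approx0.33\)), while for \(\beta\approx1\) (length, tilt) the judged average simply tracks the physical mean.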
Article
There is a great deal of structural regularity in the natural environment, and such regularities confer an opportunity to form compressed, efficient representations. Although this concept has been extensively studied within the domain of low-level sensory coding, there has been limited focus on efficient coding in the field of visual attention. Here we show that spatial patterns of orientation information ("spatial ensemble statistics") can be efficiently encoded under conditions of reduced attention. In our task, observers monitored for changes to the spatial pattern of background elements while they were attentively tracking moving objects in the foreground. By using stimuli that enable us to dissociate changes in local structure from changes in the ensemble structure, we found that observers were more sensitive to changes to the background that altered the ensemble structure than to changes that did not alter the ensemble structure. We propose that reducing attention to the background increases the amount of noise in local feature representations, but that spatial ensemble statistics capitalize on structural regularities to overcome this noise by pooling across local measurements, gaining precision in the representation of the ensemble.
Article
Myczek and Simons (2008) have described a computational model that subsamples a few items from a set with high accuracy, showing that this approach can do as well as, or better than, a model that captures statistical representations of the set. Although this is an intriguing existence proof, some caution should be taken before we consider their approach as a model for human behavior. In particular, I propose that such simulation-based research should be based on a more expanded range of phenomena and that it should include more accurate representations of errors in judgments.
Article
An absolute scale of performance is set up in terms of the performance of an ideal picture pickup device, that is, one limited only by random fluctuations in the primary photo process. Only one parameter, the quantum efficiency of the primary photo process, locates position on this scale. The characteristic equation for the performance of an ideal device has the form BC²α² = constant, where B is the luminance of the scene, and C and α are respectively the threshold contrast and angular size of a test object in the scene. This ideal type of performance is shown to be satisfied by a simple experimental television pickup arrangement. By means of the arrangement, two parameters, storage time of the eye and threshold signal-to-noise ratio, are determined to be 0.2 seconds and five respectively. Published data on the performance of the eye are compared with ideal performance. In the ranges of B (10⁻⁶ to 10² foot-lamberts), C (2 to 100 percent), and α (2' to 100'), the performance of the eye may be matched by an ideal device having a quantum efficiency of 5 percent at low lights and 0.5 percent at high lights. This is of considerable technical importance in simplifying the analysis of problems involving comparisons of the performance of the eye and man-made devices. To the extent that independent measurements of the quantum efficiency of the eye confirm the values (0.5 percent to 5.0 percent), the performance of the eye is limited by fluctuations in the primary photo process. To the same extent, other mechanisms for describing the eye that do not take these fluctuations into account are ruled out. It is argued that the phenomenon of dark adaptation can be ascribed only in small part to the primary photo process and must be mainly controlled by a variable gain mechanism located between the primary photo process and the nerve fibers carrying pulses to the brain.
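Rearranging the characteristic equation makes the trade-off explicit:

$$B\,C^{2}\alpha^{2}=k\;\Rightarrow\;C=\frac{1}{\alpha}\sqrt{\frac{k}{B}},$$

so, for an ideal device, a 100-fold increase in scene luminance lowers the threshold contrast of a fixed-size test object by a factor of 10, while halving the object's angular size must be compensated by doubling the contrast or quadrupling the luminance.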
Article
Global motion perception from a sequence of random dot patterns has been studied by means of the competition technique which consists of making a normally less salient motion path in a superimposed multiple-path stimulus more powerful by adding luminous energy to elements forming this path. The perceived motion direction of a sequence of random dot patterns can be dramatically changed by increasing luminance of some fraction of dots leaving all spatial and temporal intervals between dots unchanged. The threshold luminance increment delta I that is required in order to change the perceived motion direction indicates that differently oriented local motion vectors are resolved into a single common motion vector along which the whole pattern appears to move. An inverse spatial proximity rule was discovered: within a certain spatial limit the motion strength of a particular motion path is proportional to the distance between stimulus elements forming this path.
Article
Psychometric functions were collected to measure biases and sensitivities in certain classical illusory configurations, such as the Müller-Lyer. We found that sensitivities (thresholds or just noticeable differences) were generally not affected by the introduction of illusory biases, and the implications of this for theories of the illusions are discussed. Experiments on the Müller-Lyer figure showed that the effect depends upon mis-location of the ends of the figure, rather than upon a global expansion as demanded by the size-constancy theory. A new illusion is described in which the perceived position of a dot is displaced towards the centre of a surrounding cluster of dots, even though it is clearly discriminable from other members of the cluster by their colour. We argue that illusions illustrate powerful constraints upon visual processing: they arise when subjects are instructed to carry out a task to which the visual system is not adapted.
Article
We have measured the overall statistical efficiency of human subjects discriminating the amplitude of visual pattern signals added to noisy backgrounds. By changing the noise amplitude, the amount of intrinsic noise can be estimated and allowed for. For a target containing a few cycles of a spatial sinusoid of about 5 cycles per degree, the overall statistical efficiency is as high as 0.7 ± 0.07, and after correction for intrinsic noise, efficiency reaches 0.83 ± 0.15. Such a high figure leaves little room for residual inefficiencies in the neural mechanisms that handle these patterns.
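The standard equivalent-noise bookkeeping behind such estimates can be sketched as follows; the fitting approach and the numbers below are assumptions for illustration, not the paper's data.

```python
import numpy as np

def efficiency_and_equivalent_noise(noise_levels, threshold_energies, d_prime=1.0):
    """Equivalent-noise analysis sketch (a standard method, assumed here
    rather than copied from the paper): threshold signal energy grows
    linearly with external noise spectral density, E_t = d'^2 (N0 + Neq) / F.
    A straight-line fit then gives the efficiency F from the slope and the
    equivalent intrinsic noise Neq from the intercept/slope ratio."""
    slope, intercept = np.polyfit(noise_levels, threshold_energies, 1)
    F = d_prime ** 2 / slope
    Neq = intercept / slope
    return F, Neq

# Hypothetical data: external noise spectral densities and measured
# threshold energies (arbitrary units), not values from the paper.
N0 = np.array([0.0, 1.0, 2.0, 4.0, 8.0])
E_t = np.array([0.9, 2.1, 3.4, 6.0, 11.0])
print(efficiency_and_equivalent_noise(N0, E_t))
```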
Article
This paper examines how observers estimate the overall orientation of spatially disorganised textures containing variable orientation. Experiments used asymmetrical distributions of orientations to separate the predictions from different models of average orientation estimation. Stimuli were composed of two spatially intermingled sets of oriented patches, each set having Gaussian distributed element orientation. The threshold separation of the means of the two sets was determined for a variety of tasks. Discrimination of these textures from a reference composed of two sets with the same mean orientation was well predicted by discrimination of orientation variability. A single interval judgement of which set contained more elements required a greater separation of the set orientations and suggested that the sets must be resolved in the orientation domain for independent representation of their properties. That resolution is required to perform this task further suggests that orientational skew is not coded. Threshold offsets for judgement of average orientation were re-expressed as shifts of four candidate features for coding the central tendency of texel orientations. Comparison with similar thresholds for single distributions of orientations indicated that average orientation is assigned to the centroid of a set of orientation measures.