Conference Paper · PDF Available

Neurally Inspired Mechanisms for the Dynamic Visual Attention Map Generation Task

Authors:

Abstract

A model for dynamic visual attention is briefly introduced in this paper. A PSM (problem-solving method) for a generic “Dynamic Attention Map Generation” task, which obtains a Dynamic Attention Map from a dynamic scene, is proposed. Our approach enables tracking the objects that hold attention in accordance with a set of characteristics defined by the observer. This paper mainly focuses on those subtasks of the model inspired by neural mechanisms, such as accumulative computation and lateral interaction. The subtasks which incorporate these biologically plausible capacities are called “Working Memory Generation” and “Thresholded Permanency Calculation”.
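The two neurally inspired subtasks named above rest on the permanency (charge/discharge) effect: pixels where the observed features indicate activity are recharged towards a maximum value, the remaining pixels are progressively discharged, and only the pixels whose accumulated charge survives a final threshold enter the Dynamic Attention Map. The following Python sketch illustrates that mechanism under assumed parameter names (charge_max, discharge_step and threshold are illustrative choices, not values taken from the paper):

```python
import numpy as np

def accumulative_computation(prev_charge, motion_mask,
                             charge_max=255, discharge_step=16):
    """One charge/discharge step of the permanency (accumulative) memory.

    prev_charge : 2-D array with the charge accumulated so far.
    motion_mask : boolean 2-D array, True where motion was detected
                  in the current frame.
    """
    charge = prev_charge.copy().astype(np.int32)
    charge[motion_mask] = charge_max                 # recharge moving pixels
    charge[~motion_mask] -= discharge_step           # discharge static pixels
    return np.clip(charge, 0, charge_max)

def thresholded_permanency(charge, threshold=128):
    """Keep only the pixels whose accumulated charge exceeds the threshold;
    the surviving pixels form a binary dynamic attention map."""
    return charge >= threshold

# toy usage: a 4x4 scene where motion repeatedly appears in the top-left corner
charge = np.zeros((4, 4), dtype=np.int32)
for _ in range(3):                                   # three consecutive frames
    motion = np.zeros((4, 4), dtype=bool)
    motion[0, 0] = True
    charge = accumulative_computation(charge, motion)
attention_map = thresholded_permanency(charge)
print(attention_map)
```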
... Another top-down model of dynamic visual attention (Lopez et al., 2003) segments objects in a scene and continuously focuses attention on shapes that possess the desired features, eliminating those that are unnecessary. The observed characteristics relate to the shape and motion of the elements appearing in the dynamic scene; thus, this model is useful for real-world observation (Lopez et al., 2003). Fig. 6 illustrates the result of applying this model to a predefined "dynamic attention map generation task", where the observer's attention is directed to moving objects in accordance with a set of desired features (Lopez et al., 2003). The processing procedure, named PSM (Problem-Solving Method), can be divided into the following six subtasks. ...
Article
Full-text available
In the context of surveillance, visual attention refers to the ability to rapidly locate the most salient or relevant target in biological or artificial vision systems. To support the basic research aimed at determining the principles by which humans and animals accomplish agile sensing, a survey of bio-inspired visual attention in target detection is presented in this paper. We first discuss some relevant fundamental concepts and techniques. Then several up-to-date algorithms and implementations are introduced, covering both bottom-up and top-down mechanisms. The paper also gives a tutorial on existing research and outlines future challenges.
... The Guided Search model [164, 25] is another example that integrates image-based stimuli and task-dependent knowledge into an overall activation map which corresponds to the saliency map. In a recent work [89], a very similar model of attention with a top-down component has been reported. Olshausen et al. have presented a computational model that simulates the shifter-circuits theory [109], that is, the segregation of objects of interest and the routing of the corresponding visual information to higher stages of the visual cortex. ...
... The extension of the visual attention model with a top-down component can build on previous works, e.g. [96, 164, 89]. • Applying visual attention to the visual navigation of autonomous mobile systems, such as mobile robots. ...
Article
Visual Attention: From Bio-Inspired Modeling to Real-Time Implementation
Visual attention is the ability of a vision system, be it biological or artificial, to rapidly select the most salient and thus the most relevant data about the environment in which the system is operating. The main goal of this visual mechanism is to drastically reduce the amount of visual information that must be processed by high-level and thus complex tasks, such as object recognition, which leads to a considerable speed-up of the entire vision process. This thesis copes with various aspects related to visual attention, ranging from biologically inspired computational modeling of this visual behavior to its real-time realization on dedicated hardware, and its successful application to solving real computer vision tasks. Indeed, the contributions brought together in this thesis can be divided into four main parts. The first part deals with the computational modeling of visual attention by assessing the significance of novel features like depth and motion to the visual attention mechanism. Thereby, two models have been conceived and validated, namely the 3D and the dynamic models of visual attention. In the second part, the biological plausibility of various versions of the visual attention model is evaluated. Therefore, the performance of our visual attention model is compared with human visual attention behavior, assuming that human visual attention is intimately linked to eye movements. The third part of the thesis covers our contribution to the realization of a real-time operating system of visual attention. Indeed, the computational model of visual attention is implemented on a highly parallel architecture conceived for general-purpose image processing, which allows real-time requirements to be met. Last but not least, the visual attention model has been successfully applied to speed up, but also to increase the performance of, various real tasks related to computer vision. Thereby, image compression, color image segmentation, visual object tracking, and automatic traffic sign detection and recognition largely benefit from the salient scene information provided by the proposed visual attention algorithm. Specifically, they use this information to automatically adjust their internal parameters according to scene contents, thus considerably enhancing the quality of the achieved results.
... Our group has been carrying out research for several years on the study of motion in image sequences. All this research has allowed us to tackle applications such as the recognition of silhouettes of moving objects in noisy environments [Fer03a], the classification of moving objects according to their motion characteristics, such as speed or acceleration [Fer01a] and [Fer01b], and applications related to selective visual attention [Lop03a], [Lop04]. In all these works, the solution to these problems has been approached using a series of biologically inspired methods based on two fundamental mechanisms: (1) accumulative computation [Fer92], [Fer95a], [Fer97] and [Mir03b]; and (2) a generalized version of the computation performed by algorithmic lateral inhibition (ALI) networks [Mir01], [Fer01a], [Fer01b], [Del02], [Fer03b] and [Fer03c]. ...
... Later on, these object parts (also called zones, patches or spots) will be treated as whole objects. In previous papers by our research team, some algorithms for segmenting the image into different objects have been proposed, based on motion detection, the permanency effect and lateral interaction [13,26]. Thus, given the satisfactory results of those algorithms, we propose to solve the current problem by incorporating mechanisms of charge and discharge (based on the permanency effect), as well as mechanisms of lateral interaction. ...
Article
A new computational architecture of dynamic visual attention is introduced in this paper. Our approach defines a model for the generation of an active attention focus on a dynamic scene captured from a still or moving camera. The aim is to obtain the objects that keep the observer’s attention in accordance with a set of predefined features, including color, motion and shape. The solution proposed to the selective visual attention problem consists in decomposing the input images of an indefinite sequence of images into its moving objects, by defining which of these elements are of the user’s interest, and by keeping attention on those elements through time. Thus, the three tasks involved in the attention model are introduced. The Feature-Extraction task obtains those features (color, motion and shape features) necessary to perform object segmentation. The Attention-Capture task applies the criteria established by the user (values provided through parameters) to the extracted features and obtains the different parts of the objects of potential interest. Lastly, the Attention-Reinforcement task maintains attention on certain elements (or objects) of the image sequence that are of real interest.
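Read as a per-frame pipeline, the three tasks can be outlined as follows. The sketch below is only a structural illustration with assumed names (extract_features, capture_attention, reinforce_attention and the user_params dictionary are hypothetical), not the authors' implementation:

```python
from dataclasses import dataclass, field

@dataclass
class AttentionState:
    """Elements currently holding the observer's attention (illustrative)."""
    focus: dict = field(default_factory=dict)   # element id -> reinforcement score

def extract_features(frame):
    """Feature-Extraction: colour, motion and shape features per element.
    Placeholder returning a list of (element_id, features) pairs."""
    return [("element-0", {"colour": "red", "moving": True, "area": 120})]

def capture_attention(elements, user_params):
    """Attention-Capture: keep only the elements whose features satisfy
    the criteria supplied by the user through parameters."""
    return [eid for eid, f in elements
            if f["moving"] == user_params["want_motion"]
            and f["area"] >= user_params["min_area"]]

def reinforce_attention(state, captured, gain=1, decay=1):
    """Attention-Reinforcement: strengthen elements seen again, let the
    others decay, so attention is maintained only on persistent elements."""
    for eid in captured:
        state.focus[eid] = state.focus.get(eid, 0) + gain
    for eid in list(state.focus):
        if eid not in captured:
            state.focus[eid] -= decay
            if state.focus[eid] <= 0:
                del state.focus[eid]
    return state

# per-frame driver (toy stand-in for an indefinite image sequence)
state = AttentionState()
user_params = {"want_motion": True, "min_area": 100}
for frame in range(5):
    elements = extract_features(frame)
    captured = capture_attention(elements, user_params)
    state = reinforce_attention(state, captured)
print(state.focus)
```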
... The aim of the Motion feature extraction subtask is to calculate the dynamic (motion) features of the image pixels, that is to say, in our case, the presence of motion. From our experience [36][37][38][39][40][41] we know several methods to obtain that information. Indeed, to diminish the effect of illumination-change noise on motion detection, the variation in grey-level bands at each image pixel is computed. ...
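The snippet above motivates detecting motion from changes of grey-level band rather than from raw grey-level differences, so that small illumination fluctuations inside a band are ignored. A minimal sketch of that idea, assuming eight equally wide bands (the number of bands is an illustrative choice):

```python
import numpy as np

def grey_level_band(frame, n_bands=8):
    """Quantise an 8-bit grey-level image into n_bands equally wide bands."""
    return (frame.astype(np.uint16) * n_bands // 256).astype(np.uint8)

def motion_presence(prev_frame, curr_frame, n_bands=8):
    """A pixel is marked as 'motion present' only if its grey-level band
    changed between frames; small intra-band illumination changes are ignored."""
    return grey_level_band(prev_frame, n_bands) != grey_level_band(curr_frame, n_bands)

# toy usage: a slight illumination change (no band change) vs. a real change
prev = np.array([[100, 100]], dtype=np.uint8)
curr = np.array([[105, 200]], dtype=np.uint8)
print(motion_presence(prev, curr))   # [[False  True]]
```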
Article
This paper describes a method for visual surveillance based on biologically motivated dynamic visual attention in video image sequences. Our system is based on the extraction and integration of local (pixels and spots) as well as global (objects) features. Our approach defines a method for the generation of an active attention focus on a dynamic scene for surveillance purposes. The system segments in accordance with a set of predefined features, including gray level, motion and shape features, giving rise to two classes of objects: vehicle and pedestrian. The solution proposed to the selective visual attention problem consists of decomposing the input images of an indefinite sequence of images into their moving objects, defining which of these elements are of the user's interest at a given moment, and keeping attention on those elements through time. Feature extraction and integration are solved by incorporating mechanisms of charge and discharge—based on the permanency effect—as well as mechanisms of lateral interaction. All these mechanisms have proved to be good enough to segment the scene into moving objects and background.
... Our approach starts by obtaining the object's parts from their grey-level bands. Later on, these object parts (also called zones, patches or spots) will be treated as whole objects, incorporating lateral interaction methods (Caballero et al., 2001; López et al., 2003). In this proposal, the patches present in the Working Memory are constructed from the Interest Map compared with the Grey Level Bands Map. ...
Article
A new computational model for active visual attention is introduced in this paper. The method extracts motion and shape features from video image sequences, and integrates these features to segment the input scene. The aim of this paper is to highlight the importance of the motion features present in our algorithms in the task of refining and/or enhancing scene segmentation in the method proposed. The estimation of these motion parameters is performed at each pixel of the input image by means of the accumulative computation method, using the so-called permanency memories. The paper shows some examples of how to use the “motion presence”, “module of the velocity” and “angle of the velocity” motion features, all obtained from accumulative computation method, to adjust different scene segmentation outputs in this dynamic visual attention method.
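As a rough illustration of how a permanency memory can yield the “module of the velocity” and “angle of the velocity” features, the sketch below assumes a charge/discharge scheme in which a moving pixel is recharged to a maximum and static pixels decay by a fixed step; the displacement between the newest and oldest charged positions of a trail, divided by the number of frames implied by their charge difference, then approximates speed and direction. The decoding rule is an assumption made for illustration, not the paper's exact formulation:

```python
import numpy as np

def velocity_from_permanency(charge, charge_max=255, discharge_step=16,
                             frame_time=1.0):
    """Estimate module and angle of velocity from a permanency-memory trail.

    charge : 2-D permanency map of a single moving blob. The newest position
    holds charge_max; older positions hold progressively smaller values."""
    ys, xs = np.nonzero(charge)
    if len(xs) < 2:
        return 0.0, 0.0
    newest = np.unravel_index(np.argmax(charge), charge.shape)
    oldest_idx = np.argmin(charge[ys, xs])
    oldest = (ys[oldest_idx], xs[oldest_idx])
    # charge difference -> number of frames separating the two positions
    frames = (charge[newest] - charge[oldest]) / discharge_step
    if frames == 0:
        return 0.0, 0.0
    dy, dx = newest[0] - oldest[0], newest[1] - oldest[1]
    speed = np.hypot(dx, dy) / (frames * frame_time)   # module of the velocity
    angle = np.degrees(np.arctan2(dy, dx))             # angle of the velocity
    return speed, angle

# toy trail: a blob that moved three pixels to the right over three frames
permanency = np.array([[0, 207, 223, 239, 255]], dtype=np.int32)
print(velocity_from_permanency(permanency))   # speed 1.0 pixel/frame, angle 0.0
```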
... The name dynamic selective visual attention (DSVA) embraces a set of image-processing mechanisms for focusing vision on those regions of the image where there are relevant local space-time events. These DSVA mechanisms help find, using an active search process, the relevant information at each moment to perform the interaction task with the system [1], [2]. This paper focuses in particular on the behavior of sensitivity and stability in our visual attention method. ...
Conference Paper
Full-text available
This paper describes the relationship between sensitivity and stability in a dynamic selective visual attention method. In this proposal, sensitivity is associated with short-term memory and stability with long-term memory. First, all the mechanisms necessary to provide sensitivity to the system are included in order to keep attention in the short-term memory. Frame by frame, attention is captured on elements constructed from image pixels that fulfil the requirements established by the user, obtained after feature integration. Then, stability is provided by including mechanisms to reinforce attention, such that elements meeting the user's predefined requirements are strengthened until they are configured as the system's attention centre, stored in the long-term memory.
Conference Paper
Full-text available
A new method for active visual attention is briefly introduced in this paper. The method extracts motion and shape features from indefinite image sequences, and integrates these features to segment the input scene. The aim of this paper is to highlight the importance of the accumulative computation method for motion features extraction in the active selective visual attention model proposed. We calculate motion presence and velocity at each pixel of the input image by means of accumulative computation. The paper shows an example of how to use motion features to enhance scene segmentation in this active visual attention method.
Article
Full-text available
In a recent article, knowledge modelling at the knowledge level for the task of moving-object detection in image sequences was introduced. In this paper, the algorithmic lateral inhibition (ALI) method is applied to the generic dynamic and selective visual attention (DSVA) task with the objective of moving-object detection, labelling and further tracking. The four basic subtasks of our DSVA proposal, namely feature extraction, feature integration, attention building and attention reinforcement, are described in detail by inferential CommonKADS schemes. It is shown that the ALI method, in its various forms, that is to say, recurrent and non-recurrent, temporal, spatial and spatial-temporal, may perfectly well be used as a problem-solving method in most of the subtasks involved in the DSVA task.
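As a rough illustration of the kind of spatial lateral inhibition that the ALI method generalises, the sketch below shows a classical centre-surround step in which each cell's activity is reduced by a weighted sum of its neighbours' activities. This is a generic lateral-inhibition update, not the inferential ALI scheme described in the article:

```python
import numpy as np

def lateral_inhibition_step(activity, inhibition_weight=0.125):
    """One generic spatial lateral-inhibition step on a 2-D activity map:
    each cell is inhibited by the summed activity of its 8 neighbours."""
    padded = np.pad(activity.astype(float), 1, mode="edge")
    neighbour_sum = sum(
        np.roll(np.roll(padded, dy, axis=0), dx, axis=1)[1:-1, 1:-1]
        for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)
    )
    return np.clip(activity - inhibition_weight * neighbour_sum, 0, None)

# toy usage: an isolated peak survives, while a uniform surround suppresses itself
activity = np.array([[1, 1, 1],
                     [1, 5, 1],
                     [1, 1, 1]], dtype=float)
print(lateral_inhibition_step(activity))   # only the central peak stays active
```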
Article
Full-text available
Reports 5 experiments conducted with 52 paid Ss in which detection of a visual signal required information to reach a system capable of eliciting arbitrary responses required by the experimenter. Detection latencies were reduced when Ss received a cue indicating where the signal would occur. This shift in efficiency appears to be due to an alignment of the central attentional system with the pathways to be activated by the visual input. It is also possible to describe these results as being due to a reduced criterion at the expected target position. However, this ignores important constraints about the way in which expectancy improves performance. A framework involving a limited-capacity attentional mechanism seems to capture these constraints better than the more general language of criterion setting. Using this framework, it was found that attention shifts were not closely related to the saccadic eye movement system. For luminance detection, the retina appears to be equipotential with respect to attention shifts, since costs to unexpected stimuli are similar whether foveal or peripheral. (26 ref)
Article
Full-text available
Because the visual system cannot process all of the objects, colors, and features present in a visual scene, visual attention allows some visual stimuli to be selected and processed over others. Most research on visual attention has focused on spatial or location-based attention, in which the locations occupied by stimuli are selected for further processing. Recent research, however, has demonstrated the importance of objects in organizing (or segregating) visual scenes and guiding attentional selection. Because of the long history of spatial attention research, theories of spatial attention are more mature than theories of other visual processes, such as object segregation and object attention. In the present paper, I outline a biased competition account of object segregation and attention, following similar accounts that have been developed for spatial attention (Desimone and Duncan, 1995). In my biased competition account, I seek to understand how some objects can be segregated and selected over other objects in a complex visual scene. Under this account, there are two sources of visual information that allow an object to be processed over other objects: bottom-up information carried by the physical stimulus and top-down information based on an observer's goals. I use the biased competition account to combine many diverse findings from the object segregation and attention literatures into a common framework.
Article
Full-text available
An important component of routine visual behavior is the ability to find one item in a visual world filled with other, distracting items. This ability to perform visual search has been the subject of a large body of research in the past 15 years. This paper reviews the visual search literature and presents a model of human search behavior. Built upon the work of Neisser, Treisman, Julesz, and others, the model distinguishes between a preattentive, massively parallel stage that processes information about basic visual features (color, motion, various depth cues, etc.) across large portions of the visual field and a subsequent limited-capacity stage that performs other, more complex operations (e.g., face recognition, reading, object identification) over a limited portion of the visual field. The spatial deployment of the limited-capacity process is under attentional control. The heart of the guided search model is the idea that attentional deployment of limited resources is guided by the output of the earlier parallel processes. Guided Search 2.0 (GS2) is a revision of the model in which virtually all aspects of the model have been made more explicit and/or revised in light of new data. The paper is organized into four parts: Part 1 presents the model and the details of its computer simulation. Part 2 reviews the visual search literature on preattentive processing of basic features and shows how the GS2 simulation reproduces those results. Part 3 reviews the literature on the attentional deployment of limited-capacity processes in conjunction and serial searches and shows how the simulation handles those conditions. Finally, Part 4 deals with shortcomings of the model and unresolved issues.
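The central computational idea here is that the limited-capacity stage is deployed to the locations with the largest activation in a map that combines bottom-up feature contrast with top-down, task-driven feature weights. The sketch below is a generic illustration of that combination with assumed names (activation_map, top_down_weights), not the GS2 simulation itself:

```python
import numpy as np

def activation_map(feature_maps, top_down_weights, bottom_up_weight=1.0):
    """Combine preattentive feature maps into a single activation map.

    feature_maps     : dict mapping feature name -> 2-D map of local contrast.
    top_down_weights : dict mapping feature name -> task-driven weight
                       (e.g. emphasise 'colour' when searching for a red target)."""
    shape = next(iter(feature_maps.values())).shape
    activation = np.zeros(shape)
    for name, fmap in feature_maps.items():
        bottom_up = bottom_up_weight * fmap               # stimulus-driven term
        top_down = top_down_weights.get(name, 0.0) * fmap # task-driven term
        activation += bottom_up + top_down
    return activation

def next_fixation(activation):
    """Attention is deployed to the currently most active location."""
    return np.unravel_index(np.argmax(activation), activation.shape)

# toy usage: colour contrast favours (0, 1), orientation contrast favours (1, 0);
# a top-down weight on colour decides where attention goes first
colour = np.array([[0.1, 0.9], [0.2, 0.1]])
orientation = np.array([[0.1, 0.2], [0.8, 0.1]])
act = activation_map({"colour": colour, "orientation": orientation},
                     top_down_weights={"colour": 2.0})
print(next_fixation(act))   # (0, 1)
```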
Article
Bacteria employ restriction enzymes to cut or restrict DNA at or near specific words in a unique way. Many restriction enzymes cut the two strands of double-stranded DNA at different positions, leaving overhangs of single-stranded DNA. Two pieces of DNA may be rejoined or ligated if their terminal overhangs are complementary. Using these operations, fragments of DNA, or oligonucleotides, may be inserted into and deleted from a circular piece of plasmid DNA. We propose an encoding for the transition table of a Turing machine in DNA oligonucleotides and a corresponding series of restrictions and ligations of those oligonucleotides that, when performed on circular DNA encoding an instantaneous description of a Turing machine, simulate the operation of the Turing machine encoded in those oligonucleotides. DNA-based Turing machines have been proposed by Charles Bennett, but they invoke imaginary enzymes to perform the state-symbol transitions. Our approach differs in that every operation can be performed using commercially available restriction enzymes and ligases.
Article
A large body of evidence suggests that visual attention selects objects as well as spatial locations. If attention is to be regarded as truly object based, then it should operate not only on object representations that are explicit in the image, but also on representations that are the result of earlier perceptual completion processes. Reporting the results of two experiments, we show that when attention is directed to part of a perceptual object, other parts of that object enjoy an attentional advantage as well. In particular, we show that this object-specific attentional advantage accrues to partly occluded objects and to objects defined by subjective contours. The results corroborate the claim that perceptual completion precedes object-based attentional selection. The world consists of objects and surfaces. It is reasonable to suppose, therefore, that the human visual system has evolved to represent and operate on visual information in terms of objects and surfaces. Recent evidence supports this idea in showing that visual attention—the process of selecting a salient or task-relevant subset of visual information for deeper processing than the rest—can act on an object-based representation …
Article
The operation of attention in the visual field has often been compared to a spotlight. We propose that a more apt analogy is that of a zoom or variable-power lens. Two experiments focused upon the following questions: (1) Can the spatial extent of the attentional focus be made to vary in response to precues? (2) As the area of the attentional focus increases, is there a decrease in processing efficiency for stimuli within the focus? (3) Is the boundary of the focus sharply demarked from the residual field, or does it show a gradual dropoff in processing resources? Subjects were required to search eight-letter circular displays for one of two target letters and reaction times were recorded. One to four adjacent display positions were precued by underlines at various stimulus onset asynchronies before display presentation. A response competition paradigm was used, in which the “other target” was used as a noise letter in noncued as well as cued locations. The results were in good agreement with the zoom lens model.
Chapter
Usually we have assumed that the neuron is the atom in the architecture of the nervous system. However, there is such a wealth of dendro-dendritic connections and synaptic mechanisms that it seems essential to distinguish different styles of analog microcomputation. In this paper we look inside the synaptic structure for a local process of accumulation of persistent activity and its discharge towards the spike trigger zone. To illustrate the usefulness of this information-processing behaviour in image motion analysis, an architecture for the extraction and selection of length-velocity ratio (LVR) invariants is proposed, simulated and partially evaluated.
Article
A new manner of relating formal language theory to the study of informational macromolecules is initiated. A language is associated with each pair of sets where the first set consists of double-stranded DNA molecules and the second set consists of the recombinational behaviors allowed by specified classes of enzymatic activities. The associated language consists of strings of symbols that represent the primary structures of the DNA molecules that may potentially arise from the original set of DNA molecules under the given enzymatic activities. Attention is focused on the potential effect of sets of restriction enzymes and a ligase that allow DNA molecules to be cleaved and reassociated to produce further molecules. The associated languages are analysed by means of a new generative formalism called a splicing system. A significant subclass of these languages, which we call the persistent splicing languages, is shown to coincide with a class of regular languages which have been previously studied in other contexts: the strictly locally testable languages. This study initiates the formal analysis of the generative power of recombinational behaviors in general. The splicing system formalism allows observations to be made concerning the generative power of general recombination and also of sets of enzymatic activities that include general recombination.
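For readers unfamiliar with the formalism, a splicing rule is commonly written as a quadruple (u1, u2, u3, u4): from strings x = x1 u1 u2 x2 and y = y1 u3 u4 y2 it generates x1 u1 u4 y2, mimicking cut-and-religate behaviour. The sketch below renders this textbook form of the operation; it is a generic illustration rather than the precise definition used in the article:

```python
def splice(x, y, rule):
    """Apply one splicing rule (u1, u2, u3, u4) to strings x and y.

    If x = x1 + u1 + u2 + x2 and y = y1 + u3 + u4 + y2, the result is
    x1 + u1 + u4 + y2 (first cut site taken in each string, for simplicity)."""
    u1, u2, u3, u4 = rule
    i = x.find(u1 + u2)
    j = y.find(u3 + u4)
    if i < 0 or j < 0:
        return None                      # rule not applicable
    x1 = x[:i]
    y2 = y[j + len(u3) + len(u4):]
    return x1 + u1 + u4 + y2

# toy usage with a restriction-site flavour: cut both strings inside a
# 'GAATTC'-like site and religate the left part of x with the right part of y
print(splice("aaaGAATTCbbb", "cccGAATTCddd", ("G", "AATTC", "G", "AATTC")))
# -> 'aaaGAATTCddd'
```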
Article
To understand the motion of non-rigid objects, image processing and computer vision techniques are essential for motion analysis. Lateral interaction in accumulative computation (LIAC) for extracting non-rigid shapes from an image sequence has recently been presented, as well as its application to segmentation from motion. In this paper, we introduce a modified version of the original multi-layer architecture. This version uses the basic parameters of the LIAC model to build up spatio-temporally, to the desired extent, the shapes of all the moving objects present in a sequence of images. The influence of the LIAC model parameters is explained in this paper, and we finally show some examples of the usefulness of the proposed model.
Article
Some of the major computer vision techniques make use of neural nets. In this paper we present a novel model based on neural networks, called lateral interaction in accumulative computation (LIAC). This model is based on a series of one-layer neuronal models, namely the local accumulative computation model, the double time-scale model and the recurrent lateral interaction model. The usefulness of the LIAC model in the general task of motion detection may be appreciated by means of some significant examples of object detection in indefinite sequences of synthetic and real images.
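As a rough sketch of the two ingredients that give the LIAC model its name, per-pixel accumulative computation and lateral interaction among neighbouring pixels, the code below recharges pixels where motion is detected, discharges the rest, and then lets connected charged pixels share their charge so that a whole moving shape is supported. The sharing rule and the parameter names are illustrative assumptions, not the exact multi-layer LIAC formulation:

```python
import numpy as np
from scipy import ndimage

def liac_step(charge, motion_mask, charge_max=255, discharge_step=32):
    """One simplified LIAC-style step: accumulative computation followed by a
    lateral-interaction pass that homogenises charge inside each connected
    group of charged pixels."""
    # accumulative computation (permanency effect)
    charge = charge.astype(np.int32).copy()
    charge[motion_mask] = charge_max
    charge[~motion_mask] = np.clip(charge[~motion_mask] - discharge_step, 0, None)

    # lateral interaction: connected charged pixels share their maximum charge
    labels, n = ndimage.label(charge > 0)
    for lab in range(1, n + 1):
        region = labels == lab
        charge[region] = charge[region].max()
    return charge

# toy usage: two neighbouring pixels moved on different frames; lateral
# interaction keeps them at the same (maximum) charge, so the shape stays whole
charge = np.zeros((1, 4), dtype=np.int32)
charge = liac_step(charge, np.array([[True, False, False, False]]))
charge = liac_step(charge, np.array([[False, True, False, False]]))
print(charge)   # [[255 255   0   0]]
```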