Figure 3. Angular definition: view from above

Source publication
Article
Full-text available
In this paper, a human speaker tracking method on audio and video data is presented. It is applied to conversation tracking with a robot. Audiovisual data fusion is performed in a two-step process. Detection is performed independently on each modality: face detection based on skin color on video data and sound source localization based on the time...
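As a rough illustration of the detection step on the video side, here is a minimal sketch of skin-color-based face candidate detection using OpenCV; the HSV thresholds, morphology kernel, and minimum blob area are illustrative assumptions, not the paper's actual color model:

```python
import cv2
import numpy as np

def detect_skin_candidates(frame_bgr):
    """Return bounding boxes of skin-colored blobs (candidate faces)."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Illustrative skin-tone range in HSV; the paper's thresholds may differ.
    mask = cv2.inRange(hsv, np.array([0, 40, 60]), np.array([25, 180, 255]))
    # Remove small speckles before extracting connected regions.
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    # OpenCV 4.x returns (contours, hierarchy).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 500]
```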

Similar publications

Article
Full-text available
Resting state networks (RSNs) in the human brain were recently detected using high-density electroencephalography (hdEEG). This was done by using an advanced analysis workflow to estimate neural signals in the cortex and to assess functional connectivity (FC) between distant cortical regions. FC analyses were conducted either using temporal (tICA)...
Article
Full-text available
In this study, a high-quality, asbestos-free brake pad was produced from locally sourced raw materials. The disc brake friction lining with the geometrical specification of the Mitsubishi L-300 was produced using palm kernel shell and coconut shell powder as base material, polyester resin as binder material, graphite as lubricant, metal chips and carbides as...
Article
Full-text available
When observers in a virtual sound environment are in motion relative to a source, the virtual-auditory display must rapidly track the user's head position and update the location-cueing acoustic filters, known as head-related transfer functions (HRTFs), in order to accurately reflect the source's location relative to the current head orientation and p...
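As a sketch of the update step this abstract describes, a renderer can pick the measured HRTF pair nearest to the head-relative source azimuth and convolve it with the source signal; the database layout (`azimuths`, `hrtf_db`) and the nearest-neighbor selection are assumptions here, not the study's actual interpolation scheme:

```python
import numpy as np

def render_binaural(signal, source_az, head_az, azimuths, hrtf_db):
    """Convolve a mono signal with the HRTF pair nearest to the source
    azimuth expressed relative to the current head orientation (degrees)."""
    rel_az = (source_az - head_az + 180.0) % 360.0 - 180.0   # wrap to [-180, 180)
    idx = int(np.argmin(np.abs(np.asarray(azimuths) - rel_az)))  # nearest measured direction
    h_left, h_right = hrtf_db[idx]                           # pair of impulse responses
    return np.stack([np.convolve(signal, h_left),
                     np.convolve(signal, h_right)])
```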
Article
Full-text available
Cognitive control processes are advantageous when routines would not lead to the desired outcome, but this can be ill-advised when automated behavior is advantageous. The aim of this study was to identify neural dynamics related to the ability to adapt to different cognitive control demands – a process that has been referred to as ‘metacontrol.’ A...
Article
Full-text available
Due to the ill-posed and underdetermined character of bioluminescence tomography (BLT) reconstruction, a priori anatomical information obtained from computed tomography (CT) or magnetic resonance imaging (MRI) is usually incorporated to improve the reconstruction accuracy. The organs need to be segmented, which is time-consuming and challenging...

Citations

... The sound is localized in the horizontal dimension using cross-correlation and the Interaural Time Difference (ITD) method of [8]. First, a sound buffer of 33 milliseconds is retrieved from the two microphones. ...
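A minimal sketch of this localization step, assuming two synchronized microphone channels, a sampling rate `fs`, a microphone spacing `mic_distance`, a far-field source, and a speed of sound of 343 m/s (the exact formulation of [8] may differ):

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def estimate_azimuth(left, right, fs, mic_distance):
    """Estimate the horizontal source angle from a ~33 ms stereo buffer
    via the lag of the cross-correlation peak (the ITD)."""
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)   # lag in samples; sign gives the side
    itd = lag / fs                             # seconds
    # Far-field model: itd = mic_distance * sin(theta) / c
    sin_theta = np.clip(itd * SPEED_OF_SOUND / mic_distance, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta))

# Example: a 33 ms buffer at 16 kHz is 528 samples per channel.
```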
Conference Paper
Full-text available
This paper describes a fast audiovisual attention model applied to human detection and localization on a companion robot. Its originality lies in combining static and dynamic modalities over two analysis paths in order to guide the robot's gaze towards the most probable locations of human beings, based on the concept of saliency. Visual, depth and audio data are acquired using an RGB-D camera and two horizontal microphones. Adapted state-of-the-art methods are used to extract relevant information and fuse it via two-dimensional Gaussian representations. The obtained saliency map represents human positions as the most salient areas. Experiments have shown that the proposed model can provide a mean F-measure of 66 percent with a mean precision of 77 percent for human localization using bounding box areas on 10 manually annotated videos. The corresponding algorithm is able to process 70 frames per second on the robot (edit: on a single CPU thread).
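The fusion via two-dimensional Gaussian representations can be pictured with a short sketch that sums one 2D Gaussian per detected cue into a common saliency map; the grid size, weights, and spreads are illustrative, not the paper's parameters:

```python
import numpy as np

def saliency_map(detections, shape=(120, 160)):
    """Sum one 2D Gaussian per detection (x, y, weight, sigma) into a map;
    the most salient areas then correspond to likely human positions."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    sal = np.zeros(shape)
    for x, y, w, sigma in detections:
        sal += w * np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
    # Normalize so the strongest cue peaks at 1 (guard against an empty map).
    peak = sal.max()
    return sal / peak if peak > 0 else sal
```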
Conference Paper
Robots are destined to live with humans and perform tasks for them. To do so, an adapted representation of the world, including human detection, is required. Evidential grids enable the robot to handle partial information and ignorance, which can be useful in various situations. This paper deals with an audiovisual perception scheme for a robot in an indoor environment (apartment, house, ...). As the robot moves, it must take into account its environment and the humans present. This article presents the key stages of the multimodal fusion: an evidential grid is built from each modality using a modified Dempster combination rule, and a temporal fusion is performed using an evidential filter based on an adapted version of the generalized Bayesian theorem. This enables the robot to keep track of the state of its environment. A decision can then be made on the robot's next move depending on its mission and the extracted information. The system is tested in a simulated environment under realistic conditions.
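For context, the standard (unmodified) Dempster combination over a two-element frame {Free, Occupied} with ignorance Theta looks as follows when applied cell-wise to two grids; the paper's modified rule and its evidential filter are not reproduced here:

```python
import numpy as np

def dempster_combine(m1, m2):
    """Combine two evidential grids cell-wise with Dempster's rule.
    m1, m2: dicts of equal-shape arrays holding the masses
    'F' (Free), 'O' (Occupied), and 'T' (Theta, i.e. ignorance)."""
    # Conflict mass: one source supports Free where the other supports Occupied.
    k = m1["F"] * m2["O"] + m1["O"] * m2["F"]
    # Renormalize by 1 - k; the epsilon guards against total conflict (k = 1).
    norm = np.maximum(1.0 - k, 1e-9)
    f = (m1["F"] * m2["F"] + m1["F"] * m2["T"] + m1["T"] * m2["F"]) / norm
    o = (m1["O"] * m2["O"] + m1["O"] * m2["T"] + m1["T"] * m2["O"]) / norm
    t = (m1["T"] * m2["T"]) / norm
    return {"F": f, "O": o, "T": t}
```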