Fig 4 - uploaded by Tianshu Qu
Directivity of the sound source in the median plane.

Source publication
Article
Full-text available
A measurement of head-related transfer functions (HRTFs) with high spatial resolution was carried out in this study. HRTF measurement is difficult in the proximal region because of the lack of an appropriate acoustic point source. In this paper, a modified spark gap was used as the acoustic sound source. Our evaluation experiments showed that the s...

Similar publications

Article
Full-text available
The inverse-filter residual signal is obtained by removing the effects of the supraglottal structures from the acoustic voice signal. The residual signal, which approximates the excitation in a voice model, consists of a series of periodic spikes that coincide with the start position of the glottal pulse and, consequently, with the initial posit...
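The standard way to obtain such a residual is linear-prediction (LPC) inverse filtering; the abstract does not name its exact method, so the following is only a minimal numpy sketch of that common approach: an all-pole model A(z) is estimated with the autocorrelation method (Levinson-Durbin), and filtering the signal by A(z) exposes the spiky excitation.

```python
import numpy as np

def levinson_durbin(r, order):
    """LPC coefficients a (a[0] = 1) from autocorrelation r via Levinson-Durbin."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
        a[1:i + 1] = a[1:i + 1] + k * a[i - 1::-1]  # symmetric reflection update
        err *= 1.0 - k * k
    return a

def lpc_residual(x, order=10):
    """Inverse-filter x with its estimated all-pole model; returns the residual.
    (A real implementation would window the frame first.)"""
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
    a = levinson_durbin(r, order)
    return np.convolve(x, a)[:len(x)]   # A(z) is FIR, so this is the inverse filter

# Toy check: impulse train through one resonance; residual spikes mark pulse onsets
fs = 8000
ar = [1.0, -2 * 0.98 * np.cos(2 * np.pi * 500 / fs), 0.98 ** 2]
x = np.zeros(400)
x[::80] = 1.0                            # 100 Hz "glottal" impulse train
for n in range(len(x)):                  # run the all-pole synthesis filter 1/A(z)
    for j in (1, 2):
        if n >= j:
            x[n] -= ar[j] * x[n - j]
print(np.argsort(np.abs(lpc_residual(x)))[-5:])  # indices near multiples of 80
```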
Article
Full-text available
This work shows the performance of an improved LS3/5A-based speaker, originally designed by the BBC around the 1960s. To enhance the speaker's performance, its design follows acoustic and filter parameter adjustment, such as the Thiele-Small parameters. Testing is carried out using the frequency response and sound pressure level (SPL). The results...
Article
Full-text available
Here we discuss mechanisms of attention in distributed software systems for robotics as well as their advantages towards more intelligent and capable robots. We start with a short overview of human attention, including filter and resource theories, as well as with a discussion of attention mechanisms used in software frameworks dedicated to roboti...
Article
Full-text available
In this paper, Acoustic Echo Cancellation (AEC) is investigated using a finite impulse response (FIR) adaptive filter, with an analysis of the mean square error (MSE) and its convergence properties. It is the result of a project in the course Fundamentals of Signal Processing at Chongqing University of Posts and Telecommunications. It focuses on Normaliz...
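The abstract is cut off at the algorithm name, presumably the Normalized Least Mean Square (NLMS) filter that also appears in the near-field HRTF work cited further below. Purely as a hedged illustration (none of this code comes from the paper), a minimal NLMS echo canceller looks like this:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in echo path: an exponentially decaying FIR "room" response
L = 64
h_true = rng.standard_normal(L) * np.exp(-0.05 * np.arange(L))

N = 20000
x = rng.standard_normal(N)             # far-end (loudspeaker) signal
d = np.convolve(x, h_true)[:N]         # echo captured by the microphone

# NLMS adaptive filter
w = np.zeros(L)                        # echo-path estimate
mu, eps = 0.5, 1e-6                    # step size and regularizer
mse = np.empty(N)                      # learning curve for convergence analysis
buf = np.zeros(L)                      # most recent input samples, newest first
for n in range(N):
    buf = np.roll(buf, 1)
    buf[0] = x[n]
    e = d[n] - w @ buf                 # residual echo sent back to the far end
    w += mu * e * buf / (buf @ buf + eps)   # normalized gradient step
    mse[n] = e * e

print("final MSE:", 10 * np.log10(mse[-1000:].mean()), "dB")
print("misalignment:",
      10 * np.log10(np.sum((h_true - w) ** 2) / np.sum(h_true ** 2)), "dB")
```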

Citations

... Therefore, existing near-field HRTF databases obtained from measurements only contain data for a small number (usually less than ten) of source distances. See for example [4], [7], and [8]. ...
... This measurement setup can be uncomfortable for human listeners. At the time of writing, numerous far-field HRTF datasets [4,6,7,8,9,10] have been developed, but only a few NF HRTF datasets exist [11,12,13]. In general, NF HRTFs are acquired by three categories of approaches. ...
Conference Paper
Full-text available
The head-related transfer function (HRTF) is an essential component in creating an immersive listening experience over headphones for virtual reality (VR) and augmented reality (AR) applications. The metaverse combines VR and AR to create immersive digital experiences, and users are very likely to interact with virtual objects in the near field (NF). The HRTFs of such objects are highly individualized and depend on both direction and distance. Hence, a significant number of HRTF measurements at different distances in the NF would be needed. Using conventional static stop-and-go HRTF measurement methods to acquire these measurements would be time-consuming and tedious for human listeners. In this paper, we propose a continuous measurement system targeted at the NF that efficiently captures HRTFs in the horizontal plane within 45 seconds. Comparative experiments are performed on a head and torso simulator (HATS) and on human listeners to evaluate system consistency and robustness.
... Anthropometric features can also be combined with listening tests to give listeners recommended HRTFs, which can reduce perceptual errors (Pelzer et al., 2020). To date, experimental measurement of individual HRTFs is still the most accurate approach, and many databases have been established (Algazi et al., 2001; Qu et al., 2009; Watanabe et al., 2014). However, HRTF measurements can only be carried out in a limited number of discrete spatial directions. ...
Article
Individual head-related transfer functions (HRTFs) are usually measured with high spatial resolution or modeled with anthropometric parameters. This study proposed an HRTF individualization method that uses only spatially sparse measurements and a convolutional neural network (CNN). The HRTFs were represented by two-dimensional images, in which the horizontal and vertical ordinates indicated direction and frequency, respectively. The CNN was trained using the HRTF images measured at specific sparse directions as input and the corresponding images with high spatial resolution as output, based on a prior HRTF database. The HRTFs of a new subject can then be recovered by the trained CNN from the sparsely measured HRTFs. Objective experiments showed that, when using 23 directions to recover individual HRTFs at 1250 directions, the spectral distortion (SD) was around 4.4 dB; when using 105 directions, the SD reduced to around 3.8 dB. Subjective experiments showed that the individualized HRTFs recovered from 105 directions had a smaller discrimination proportion than the baseline method and were perceptually indistinguishable in many directions. This method combines the spectral and spatial characteristics of HRTFs for individualization, which has potential for improving the virtual reality experience.
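For reference, the spectral distortion figure quoted above is commonly defined as the RMS, over frequency bins, of the dB magnitude difference between measured and recovered HRTFs. A small sketch of that metric (my own illustration, not code from the study):

```python
import numpy as np

def spectral_distortion(h_ref, h_est, eps=1e-12):
    """Spectral distortion (dB) between reference and estimated HRTF magnitudes.

    h_ref, h_est: arrays of shape (n_directions, n_bins), linear magnitude.
    Returns one SD value per direction, RMS-averaged over frequency bins.
    """
    log_ratio = 20.0 * np.log10((np.abs(h_ref) + eps) / (np.abs(h_est) + eps))
    return np.sqrt(np.mean(log_ratio ** 2, axis=-1))

# Toy check: a flat 1 dB magnitude offset yields an SD of exactly 1 dB
ref = np.ones((4, 128))
est = ref * 10 ** (-1 / 20)
print(spectral_distortion(ref, est))  # ~[1. 1. 1. 1.]
```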
... To do binaural rendering, we choose the head-related transfer function (HRTF), which describes the acoustic transfer function from a sound source to the ears under free-field conditions in the frequency domain, expressing the comprehensive filtering effect of the physiological structure on sound waves [12]. So far, most available HRTF data were measured manually under strict conditions, as in the CIPIC database [13], the PKU database [14], or the ARI database [15]. Because the HRTF depends significantly on the pinna, head, trunk, and other parts of the human body, it poses an individualization problem: it differs from person to person. ...
Article
With the rise of the concept of the metaverse and the development of virtual reality (VR), the effect of 3D audio has become an important factor influencing the immersive experience. Considering the widely used binaural 3D audio, it is important to evaluate its quality, as different representation methods exist. However, there is no reference stimulus when evaluating binaural 3D audio, making it difficult to compare different representation methods. This paper proposes a multi-attribute evaluation method for binaural 3D audio without a reference stimulus. Using this method, a comparative experiment with channel-based, scene-based, and object-based binaural 3D audio was conducted. The results show that binaural 3D audio in these three representation methods is similar in audio quality but differs greatly in spatial sense, giving solid data support to the feature comparison of channel-based, scene-based, and object-based methods. Statistical analysis also shows that the evaluation method is valid and can be used for other evaluation requirements of binaural 3D audio.
... In order to create smoothly varying source motion in azimuth, we built upon an algorithm for synthesizing head-related transfer functions (HRTFs) described by Gamper (2013). Gamper used a published database of high-resolution KEMAR HRTF measurements in three-dimensional (3D) space (Qu et al., 2009) and developed an algorithm for computing high-quality interpolations of the space between the discrete measurements. Using this method, Gamper reported a root-mean-square error (RMSE) of no more than 6 dB between the spectral features of the original HRTF and the interpolated HRTF that was retrodicted with 50% of the measurements removed [cf. ...
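Gamper's interpolation proper weights the nearest measured HRTFs via barycentric coordinates on a triangulated measurement grid; the sketch below substitutes a much simpler inverse-distance weighting over azimuth only, just to make the leave-some-out RMSE evaluation described above concrete. All names and parameters here are illustrative assumptions.

```python
import numpy as np

def interp_hrtf(az_query, az_grid, hrtf_mag, k=2, p=2):
    """Inverse-distance interpolation of HRTF magnitudes over azimuth.

    A stand-in for the barycentric scheme in Gamper (2013); hrtf_mag has
    shape (n_directions, n_bins), az_grid is in degrees.
    """
    d = np.abs((az_grid - az_query + 180.0) % 360.0 - 180.0)  # circular distance
    if d.min() < 1e-9:
        return hrtf_mag[np.argmin(d)]            # exact grid hit
    near = np.argsort(d)[:k]
    w = 1.0 / d[near] ** p
    return (w / w.sum()) @ hrtf_mag[near]

def rmse_db(ref, est, eps=1e-12):
    diff = 20 * np.log10(np.abs(ref) + eps) - 20 * np.log10(np.abs(est) + eps)
    return np.sqrt(np.mean(diff ** 2))

# Leave-one-out check on a smooth toy "database" (72 azimuths, 64 bins)
az = np.arange(0, 360, 5.0)
mags = 1.0 + 0.3 * np.cos(np.deg2rad(az))[:, None] * np.ones((1, 64))
errs = [rmse_db(mags[i], interp_hrtf(az[i], np.delete(az, i), np.delete(mags, i, 0)))
        for i in range(len(az))]
print(f"mean leave-one-out RMSE: {np.mean(errs):.3f} dB")
```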
Article
Source motion was examined as a cue for segregating concurrent speech or noise sources. In two different headphone-based tasks, motion detection (MD) and speech-on-speech masking (SI), one source among three was designated as the target only by imposing sinusoidal variation in azimuth during the stimulus presentation. For MD, the listener was asked which of the three concurrent sources was in motion during the trial. For SI, the listener was asked to report the words spoken by the moving speech source. MD performance improved as the amplitude of the sinusoidal motion (i.e., displacement in azimuth) increased over the range of values tested (±5° to ±30°) for both modulated noise and speech targets, with better performance found for speech. SI performance also improved as the amplitude of target motion increased. Furthermore, SI performance improved as word position progressed throughout the sentence. Performance on the MD task was correlated with performance on the SI task across individual subjects. For the SI conditions tested here, these findings are consistent with the proposition that listeners first detect the moving target source, then focus attention on the target location as the target sentence unfolds.
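To make the motion cue concrete: the only trajectory parameter the abstract specifies is the peak azimuth displacement (±5° to ±30°); the modulation rate and duration below are assumed placeholders. A trajectory like this would then drive frame-by-frame HRTF selection for rendering:

```python
import numpy as np

fs = 44100                    # audio sample rate (Hz)
dur = 2.0                     # stimulus duration in seconds (assumed)
amp_deg = 30.0                # +/-30 deg, the largest displacement tested
rate_hz = 0.5                 # sinusoidal motion rate: NOT given in the abstract

t = np.arange(int(fs * dur)) / fs
azimuth = amp_deg * np.sin(2 * np.pi * rate_hz * t)   # target azimuth in degrees

# Rendering typically quantizes this to the nearest measured HRTF per short
# frame and cross-fades between frames to avoid audible switching artifacts.
frame = 512
az_per_frame = azimuth[::frame]
print(az_per_frame[:8])
```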
... and torso scans available [22]. [104] and of the FABIAN mannequin acoustically measured and numerically calculated [105]. ...
... They are also commonly used in scientific and unusual applications when sound source directivity must be neglected [4][5][6]. The state of the art contains engineering reports with designs of traditional multi-transducer sound sources in arrays [7][8][9] and novel designs such as balloon dielectric elastomer sound sources [10], spark sources [11], or mimicked dodecahedrons [12]. While new sound sources are being developed, classic conventional speaker arrays are not sufficiently described in the literature and require thorough revision and the development of essential design principles. ...
Article
Full-text available
Omnidirectional sound sources are standard devices used in numerous acoustic measurements, such as those described in the ISO 3382, ISO 140, or ISO 354 standards. They are used when information on sound diffraction at an object is required. State-of-the-art findings describe several engineering designs of omnidirectional sound sources; some commercial applications can also be found. However, there is no universal design method for this kind of sound source, neither in terms of the size and number of transducers nor in terms of general electroacoustic principles. This paper describes the use of the Finite Element Method (FEM) to derive the directivity patterns of different speaker arrays, such as spherical speaker arrays and the most popular polyhedrons. The number of transducers studied in the paper varies from 4 to 36. The influence of transducer size and enclosure size was also preliminarily investigated. The simulation results were assessed with new, strict omnidirectionality quality measures, and the influence of the number or size of transducers on the final performance of an omnidirectional sound source was verified.
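The abstract does not spell out its strict omnidirectionality quality measures, but one simple instance of such a measure is the worst-case deviation of the simulated directivity from the spherical mean level. The sketch below is my own illustration, not the paper's definition; a real implementation would also weight sample points by solid angle.

```python
import numpy as np

def directivity_deviation_db(p_sphere, eps=1e-12):
    """Deviation of a simulated source from omnidirectionality.

    p_sphere: complex (or magnitude) pressures sampled around the source at
    one frequency, shape (n_points,). Returns the maximum deviation (dB) of
    any direction from the mean level over the sampling sphere.
    """
    spl = 20 * np.log10(np.abs(p_sphere) + eps)
    return np.max(np.abs(spl - spl.mean()))

# Toy usage: a nearly uniform radiator with a +/-0.25 dB directivity ripple
theta = np.linspace(0, np.pi, 181)
p = 10 ** ((0.25 * np.cos(4 * theta)) / 20)
print(f"max deviation: {directivity_deviation_db(p):.2f} dB")  # ~0.25 dB
```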
... The sampling rate of the sound stimulus was 22.05 kHz. Virtual localization was generated using head-related transfer functions (Qu et al., 2009). Three locations were simulated: left-lateralized 90°, right-lateralized 90°, and in the middle at 0°; these locations were designated left, right, and middle, respectively (Fig. 1A). ...
Article
Full-text available
Spatial hearing in humans is a high-level auditory process that is crucial for rapidly localizing sounds in the environment. Both neurophysiological models in animals and neuroimaging evidence from awake human subjects suggest that the localization of auditory objects takes place mainly in the posterior auditory cortex. However, whether this cognitive process is preserved during sleep remains unclear. To fill this research gap, we investigated the sleeping brain's capacity to identify sound locations by recording simultaneous electroencephalographic (EEG) and magnetoencephalographic (MEG) signals during wakefulness and non-rapid eye movement (NREM) sleep in human subjects. Using the frequency-tagging paradigm, the subjects were presented with a basic syllable sequence at 5 Hz and a location change that occurred every three syllables, resulting in a sound localization shift at 1.67 Hz. The EEG and MEG signals were used for sleep scoring and neural tracking analyses, respectively. Neural tracking responses at 5 Hz, reflecting basic auditory processing, were observed during both wakefulness and NREM sleep, although the responses during sleep were weaker than those during wakefulness. Cortical responses at 1.67 Hz, which correspond to the sound location change, were observed during wakefulness regardless of attention to the stimuli but vanished during NREM sleep. These results indicate, for the first time, that sleep preserves basic auditory processing but disrupts the higher-order brain function of sound localization.
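The arithmetic behind the two tag frequencies is simple: syllables at 5 Hz with a location change on every third syllable gives change events at 5/3 ≈ 1.67 Hz. A small sketch of that design (sequence construction plus a spectral sanity check; details such as sequence length are assumed):

```python
import numpy as np

syll_rate = 5.0                      # syllables per second
n_syll = 300                         # 60 s of stimulation (assumed length)
locations = ["left", "middle", "right"]

# A new location every three syllables -> localization shifts at 5/3 Hz
rng = np.random.default_rng(1)
seq, cur = [], "middle"
for _ in range(n_syll // 3):
    cur = rng.choice([l for l in locations if l != cur])
    seq += [cur] * 3
print(seq[:9])

# Sanity check: the change events form a periodic train tagged at 1.67 Hz
changes = np.zeros(n_syll)
changes[::3] = 1.0                   # one location shift per three syllables
spec = np.abs(np.fft.rfft(changes - changes.mean()))
freqs = np.fft.rfftfreq(n_syll, d=1.0 / syll_rate)
print(f"tag frequency: {freqs[np.argmax(spec)]:.2f} Hz")   # 1.67 Hz
```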
... We calculated the SD at different distances and azimuths between our measured HRTFs and the SCUT database [2], and between the PKU-IOA database [15] and the SCUT database as a reference. Both are near-field HRTF databases measured on KEMAR. ...
Conference Paper
Full-text available
Near-field head-related transfer functions (HRTFs) depend on both source direction (azimuth/elevation) and distance. The acquisition procedure for near-field HRTF data on a dense spatial grid is time-consuming and prone to measurement errors. Therefore, existing databases only cover a few discrete source distances. Building on the fact that continuous-azimuth acquisition of HRTFs has been made possible by the Normalized Least Mean Square (NLMS) adaptive filtering method, in this work we applied the NLMS algorithm to measuring near-field HRTFs under continuous variation of source distance. We developed and validated a novel measurement setup that allows the acquisition of near-field HRTFs for source distances ranging from 20 to 120 cm in one recording. We then evaluated the measurement accuracy by analyzing the estimation error of the adaptive filtering algorithm and the key characteristics of the measured HRTFs associated with near-field binaural rendering.
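To see what "continuous variation of source distance" asks of the adaptive filter, it helps to write down the time-varying free-field path the NLMS algorithm has to track: a propagation delay r(t)/c and 1/r(t) spherical spreading. The sweep duration and linear trajectory below are assumptions for illustration only, and HRTF filtering is omitted; this merely synthesizes the kind of moving "plant" such a setup would have to identify.

```python
import numpy as np

c, fs = 343.0, 48000                 # speed of sound (m/s), sample rate (Hz)
dur = 10.0                           # assumed duration of one 20->120 cm sweep
n = int(fs * dur)
t = np.arange(n) / fs

r = 0.20 + 1.00 * t / dur            # source distance in meters, 0.20 -> 1.20
gain = 0.20 / r                      # 1/r spreading, referenced to 20 cm
emit = (t - r / c) * fs              # fractional emission index per output sample

# Render a probe signal through this moving path (linear fractional delay)
rng = np.random.default_rng(0)
x = rng.standard_normal(n)
k = np.floor(emit).astype(int)
frac = emit - k
ok = k >= 0                          # drop samples emitted before t = 0
k1 = np.minimum(k + 1, n - 1)
y = np.zeros(n)                      # microphone signal seen by the identifier
y[ok] = gain[ok] * ((1 - frac[ok]) * x[k[ok]] + frac[ok] * x[k1[ok]])
```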
... It produces an acoustic wave with a shape close to that of the nanosecond-pulse discharge. Several studies [24][25][26][27][28][29] have been conducted on the acoustic radiation from spark discharges with current pulse widths exceeding 1 μs. Due to the large deposited energy, the sound pressure can exceed 1000 Pa even at 10 cm from the source. ...
Article
Full-text available
A single nanosecond-pulse discharge can produce a high-intensity pulsed acoustic wave. The pulse width of the acoustic wave is much wider than that of the current, more than 20 μs at 30 cm from the source, which is the basis of synthesizing low-frequency sound by repetitively nanosecond-pulse discharges. The investigations of electroacoustic characteristics and the sound formation process of the single nanosecond-pulse discharge are vital to advance this technology. In this paper, an experimental platform for the single nanosecond-pulse discharge was built, and time-domain waveforms of the voltage, the current, and the sound pressure were measured. The effects of electrode shape, current limiting resistors, and current pulse width on the acoustic wave were discussed. To analyze the formation process of the acoustic wave, the gas densities near the electrodes at different moments after the discharge were diagnosed by laser Schlieren photography. The result shows that the formation of the acoustic wave is much slower than the discharge. A two-stage model was developed to qualitatively describe the formation process of the acoustic wave, and numerical calculations were carried out using thermodynamic and hydrodynamic equations. At the end of the discharge, a huge pressure difference is formed inside and outside the gas channel due to the Joule heating, which can be considered as a shock wave. During the outward propagation, the wave tail is elongated by the difference in sound velocity at each point, and the thickness of the shock wave increases due to the dissipation. This eventually leads to the half-duration of more than 20 μs.