Figure 2 - uploaded by Sergei Aleinik
Directivity patterns for Delay & Sum and the proposed method for  

Source publication
Article
This paper presents a new adaptive technique for speech capturing in adverse conditions using microphone arrays. The proposed technique is based on frequency-domain alignment of microphone signals with the output of the fixed beamformer directed to the target speaker. This alignment procedure improves pattern directivity and reduces sidelobes. The...

Contexts in source publication

Context 1
... course, this conclusion is valid for any number of microphones. Figure 2 shows the resulting directivity patterns of an 8-microphone MA for a conventional Delay & Sum FBF and the proposed method when only phase was used in (5). The distance between the microphones was 5 cm, and the input signal was a 2000 Hz harmonic plane wave. ...
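As an illustration of the Delay & Sum curve described in Context 1, the following minimal Python sketch evaluates the far-field response of a uniform linear array with 8 microphones, 5 cm spacing and a 2000 Hz plane wave. The speed of sound (343 m/s), the broadside steering direction, the angle convention (measured from broadside) and all function names are assumptions, since the excerpt does not state them. The proposed alignment method itself is not reproduced here.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not given in the excerpt


def delay_and_sum_response(theta_deg, n_mics=8, spacing=0.05,
                           freq=2000.0, steer_deg=0.0):
    """Normalized magnitude response of a uniform linear array
    to a plane wave arriving from theta_deg (0 = broadside)."""
    k = 2.0 * math.pi * freq / SPEED_OF_SOUND  # wavenumber, rad/m
    # Residual phase step between adjacent microphones after the
    # steering delays have been compensated.
    psi = k * spacing * (math.sin(math.radians(theta_deg))
                         - math.sin(math.radians(steer_deg)))
    total = sum(cmath.exp(1j * m * psi) for m in range(n_mics))
    return abs(total) / n_mics


def to_db(value, floor=1e-6):
    """Convert a magnitude to dB, clamping very small values."""
    return 20.0 * math.log10(max(value, floor))


if __name__ == "__main__":
    for angle in range(-90, 91, 10):
        level = to_db(delay_and_sum_response(angle))
        print(f"{angle:4d} deg  {level:8.2f} dB")
```

With these assumed values the wavelength is about 17 cm, so the 5 cm spacing is below half a wavelength and the pattern has a single main lobe at the steering direction.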
Context 2
... means that the resulting directivity pattern is equal to the square of the initial one. These conclusions are confirmed by Figure 3, which shows the resulting directivity patterns for the same parameters of the microphone array and the input signal as in Figure 2 when only the magnitude of is used in (5). We can see that the main lobe of the dashed curve is slightly narrower, and the level of sidelobes for the dashed curve on the dB scale is two times lower than that for the solid curve. ...
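The factor of two on the dB scale follows directly from squaring the pattern. As a short worked step, writing D(θ) for the normalized magnitude of the initial pattern (a symbol assumed here for illustration, not taken from the source):

```latex
D_{\mathrm{out}}(\theta) = D^{2}(\theta)
\quad\Longrightarrow\quad
20\log_{10} D_{\mathrm{out}}(\theta)
  = 20\log_{10} D^{2}(\theta)
  = 2\cdot 20\log_{10} D(\theta)
```

For example, a sidelobe at −13 dB in the initial pattern would sit at −26 dB in the squared one, while the main-lobe peak at 0 dB is unchanged.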
Context 3
... can conclude that the phase mainly reduces the width of the main lobe, while the magnitude has a greater effect on the amplitude of the sidelobes. Of course, mainlobe width reduction looks very useful. Moreover, our investigations have shown that the use of the complex allows both mainlobe width and sidelobe reduction. However, the data in Figs. 2 and 3 were obtained using harmonic signals without interfering noise, i.e. in ideal conditions. On the other hand, our experiments showed that, first, the phase method works poorly in the presence of high-level additive noise. Second, the use of the phase increases parasitic musical noise in the output speech signal. In the end we left the ...
Context 4
... patterns (DPs) shown in Figures 2 and 3 were obtained using an artificial single-frequency harmonic signal and therefore do not provide complete information about the performance of the proposed method. For clarification, we conducted a series of experiments using model and real WBGN as the input signal for the MA. Figure 4 shows the DPs when the input signal is artificial WBGN from the direction . ...

Similar publications

Article
This article presents a 16-channel microphone-array recorder/processor that allows for a simultaneous and non-invasive detection of oral, oronasal and nasal segments in speech. Such devices and methods have not been used in the research on the articulation of sounds in the world’s languages. In this paper analysis of Polish nasalized vowel was pres...
Conference Paper
Far-field speech recognition in noisy and reverberant conditions remains a challenging problem despite recent deep learning breakthroughs. This problem is commonly addressed by acquiring a speech signal from multiple microphones and performing beamforming over them. In this paper, we propose to use a recurrent neural network with long short-term me...

Citations

... The interfering noise (music) was emitted by a loudspeaker (ϕ_n = −56°, D_n = 5 m); the input signal-to-noise ratio was −6 dB. We try to separate speech from noise using the equidistant linear 8-microphone MA described in [11,12]: 8 equally spaced microphones were placed with 5 cm inter-microphone spacing and 35 cm total aperture length; the microphone signals were sampled at 16 kHz; we used a standard Overlap-and-Add (OLA) technique with a frame length of 512 samples, 50% overlap and a Hann window. Figure 1 waveform C demonstrates the performance of the Zelinski post-filtering: interfering noise is almost totally suppressed. ...
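The framing parameters quoted above (512-sample frames, 50% overlap, Hann window) can be sketched as follows. This is a minimal illustration, not the cited implementation: it assumes a periodic Hann window applied once at analysis, no synthesis window, and an identity `process` step standing in for the frequency-domain processing. All names are hypothetical.

```python
import math

FRAME = 512        # frame length in samples (32 ms at 16 kHz)
HOP = FRAME // 2   # 50% overlap


def hann_periodic(n_points):
    """Periodic Hann window; copies shifted by n_points / 2 sum to 1."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_points)
            for n in range(n_points)]


def overlap_add(signal, process=lambda frame: frame):
    """Window each frame, apply `process`, and overlap-add the results."""
    window = hann_periodic(FRAME)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - FRAME + 1, HOP):
        frame = [signal[start + n] * window[n] for n in range(FRAME)]
        frame = process(frame)  # placeholder for per-frame processing
        for n in range(FRAME):
            out[start + n] += frame[n]
    return out


# With identity processing the interior of the signal is reconstructed;
# the first and last half-frames are covered by only one window.
signal = [math.sin(0.01 * n) for n in range(4 * FRAME)]
restored = overlap_add(signal)
interior_error = max(abs(restored[n] - signal[n])
                     for n in range(HOP, len(signal) - HOP))
print(f"max interior reconstruction error: {interior_error:.2e}")
```

The periodic form of the window is chosen because two copies offset by half a frame add up to exactly one, so unmodified frames pass through the analysis and overlap-add steps unchanged.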
Article
In this paper we propose a novel fast algorithm for calculating the transfer function of the Zelinski post-filter in a microphone array. The proposed algorithm requires less memory and fewer arithmetical multiplications. We demonstrate that for the “classical” algorithm computational complexity increases quadratically as a function of the number of microphones in the array. In contrast, the computational complexity of the proposed algorithm increases linearly. This provides a considerable acceleration in the calculation of the post-filter transfer function in real-time systems.
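To illustrate how a quadratic pairwise sum can be reduced to linear cost, the sketch below uses a standard algebraic identity: the sum of the real parts of all pairwise cross-spectra equals half of the squared magnitude of the channel sum minus the summed channel powers. This is an assumption about one possible route to the linear complexity mentioned in the abstract, not necessarily the cited algorithm. It works on a single frequency bin of a single frame and omits the time smoothing used in practice. All function names are hypothetical.

```python
import random


def pairwise_cross_sum(x):
    """Sum of Re{X_i * conj(X_j)} over all pairs i < j: O(N^2)."""
    n = len(x)
    return sum((x[i] * x[j].conjugate()).real
               for i in range(n) for j in range(i + 1, n))


def fast_cross_sum(x):
    """Same quantity in O(N), using
    |sum X_i|^2 = sum |X_i|^2 + 2 * sum_{i<j} Re{X_i * conj(X_j)}."""
    total = sum(x)
    power = sum(abs(v) ** 2 for v in x)
    return 0.5 * (abs(total) ** 2 - power)


def zelinski_gain(x, cross_sum=fast_cross_sum):
    """Zelinski-style gain for one bin: averaged cross-spectrum
    divided by averaged auto-spectrum."""
    n = len(x)
    numerator = 2.0 / (n * (n - 1)) * cross_sum(x)
    denominator = sum(abs(v) ** 2 for v in x) / n
    return numerator / denominator if denominator > 0.0 else 0.0


random.seed(0)
bins = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(8)]
slow = zelinski_gain(bins, cross_sum=pairwise_cross_sum)
fast = zelinski_gain(bins, cross_sum=fast_cross_sum)
print(f"pairwise: {slow:.6f}  fast: {fast:.6f}")
```

Both versions return the same gain up to rounding. For eight identical channel values the gain is 1, as expected for a fully coherent signal.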
... Multichannel speech enhancement is a popular approach for improving ASR robustness in noisy conditions. Table 4 reports the results of various beamforming and spatial post-filtering techniques, namely minimum variance distortionless response (MVDR) beamforming with diagonal loading (Mestre and Lagunas, 2003), delay-and-sum (DS) beamforming (Cohen et al., 2010), Zelinski's post-filter (Zelinski, 1988), its modification by Simmer et al. (1994), and multichannel alignment (MCA) based beamforming (Stolbov and Aleinik, 2015). Apart from the MVDR beamformer, which provides a very large improvement on simulated data but no improvement on real data, all tested techniques provide similar improvement on real and simulated data. ...
Article
Speech enhancement and automatic speech recognition (ASR) are most often evaluated in matched (or multi-condition) settings where the acoustic conditions of the training data match (or cover) those of the test data. Few studies have systematically assessed the impact of acoustic mismatches between training and test data, especially concerning recent speech enhancement and state-of-the-art ASR techniques. In this article, we study this issue in the context of the CHiME-3 dataset, which consists of sentences spoken by talkers situated in challenging noisy environments recorded using a 6-channel tablet based microphone array. We provide a critical analysis of the results published on this dataset for various signal enhancement, feature extraction, and ASR backend techniques and perform a number of new experiments in order to separately assess the impact of different noise environments, different numbers and positions of microphones, or simulated vs. real data on speech enhancement and ASR performance. We show that, with the exception of minimum variance distortionless response (MVDR) beamforming, most algorithms perform consistently on real and simulated data and can benefit from training on simulated data. We also find that training on different noise environments and different microphones barely affects the ASR performance, especially when several environments are present in the training data: only the number of microphones has a significant impact. Based on these results, we introduce the CHiME-4 Speech Separation and Recognition Challenge, which revisits the CHiME-3 dataset and makes it more challenging by reducing the number of microphones available for testing.
... It is claimed that Zelinski post-filtering is good for spatially uncorrelated noise suppression and works less well in the case of spatially correlated noise [2]. In [6][7] a novel method called "Multichannel alignment" has been proposed and investigated in detail. This method suppresses both spatially correlated and uncorrelated noise. ...
... In [6][7] a new multichannel alignment method has been proposed. In this method, first, not a single but N transfer functions are calculated as follows: ...
... In our experiments we used the MA fully described in [7]: an equidistant microphone array with 8 microphones and inter-microphone spacing of 5 cm (i.e. the total aperture length is 35 cm). We used a sampling frequency of 16 kHz and the well-known Overlap-and-Add (OLA) technique with a frame length of 512 samples, 50% overlap and a Hann window. We chose the smoothing constant 9 . ...
Conference Paper
In this paper we present the results of a comparative study of algorithms for speech signal processing in a microphone array. We compared the multichannel alignment method and two modifications of the well-known Zelinski post-filtering. Comparisons were performed using artificial and real signals recorded in real noisy environments. The experiments helped us to devise recommendations for choosing a suitable method of signal processing for different noise conditions.
Conference Paper
This paper describes a new optimized method for calculating the Zelinski post-filter transfer function for a microphone array. The optimized algorithm requires less memory and fewer arithmetical multiplications. We demonstrate that for the known algorithm computational complexity increases quadratically as a function of the number of microphones. In contrast, the computational complexity of the proposed algorithm increases linearly. This provides a considerable acceleration in the calculation of the post-filter transfer function.
Conference Paper
This paper presents a new method for estimating signal-to-noise ratio based on adaptive signal decomposition. Statistical simulation shows that the proposed method has lower variance and bias than the known signal-to-noise ratio measures. We discuss the parameters and characteristics of the proposed method and its practical implementation.