Francis Grenez's research while affiliated with Université Libre de Bruxelles and other places


Publications (44)


Principal component analysis of the spectrogram of the speech signal: Interpretation and application to dysarthric speech
  • Article

July 2019 · 90 Reads · 28 Citations · Computer Speech & Language · Francis Grenez

The article concerns the interpretation of the principal components of the spectrogram of the speech signal and its application to the description of dysarthric speech. Each principal component is a linear combination of the frame spectra. We show that the first principal component of the spectrogram is closely related to the long-term average spectrum (LTAS) and the second principal component is the difference of two weighted sums of frame spectra reporting open and close vowel frame spectra respectively. We investigate articulation deficits in dysarthric speakers via cues obtained from principal components of the spectrogram of connected speech because long-term average spectra have been claimed to inform about speaker settings of the vocal tract.
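The relation between the principal components and the frame spectra can be illustrated with a minimal sketch. It assumes a spectrogram stored as a (frequency bins × frames) array and an uncentered decomposition via the singular value decomposition; the abstract does not state the centering or normalization actually used, and the function names are hypothetical.

```python
import numpy as np


def spectrogram_principal_components(spec, n_components=2):
    """Return (components, weights) for a spectrogram of shape (n_freq, n_frames).

    Each component is a linear combination of the frame spectra:
    components = spec @ weights. Assumption: uncentered PCA via SVD.
    """
    spec = np.asarray(spec, dtype=float)
    _, s, vt = np.linalg.svd(spec, full_matrices=False)
    k = n_components
    # spec = U S V^T, hence U[:, :k] = spec @ (V[:, :k] / S[:k]).
    weights = vt[:k].T / s[:k]
    components = spec @ weights
    # The sign of a singular vector is arbitrary; orient each component
    # so that its entries sum to a non-negative value.
    signs = np.where(components.sum(axis=0) < 0, -1.0, 1.0)
    return components * signs, weights * signs


def long_term_average_spectrum(spec):
    """Average of the frame spectra over time (LTAS)."""
    return np.asarray(spec, dtype=float).mean(axis=1)


# Toy spectrogram: one common spectral shape with frame-dependent gain.
rng = np.random.default_rng(0)
shape = np.exp(-3.0 * np.linspace(0.0, 1.0, 64))
toy_spec = shape[:, None] * (1.0 + 0.3 * rng.random((64, 200)))
pcs, frame_weights = spectrogram_principal_components(toy_spec)
ltas = long_term_average_spectrum(toy_spec)
print("correlation of PC1 with LTAS:", np.corrcoef(pcs[:, 0], ltas)[0, 1])
```

In this sketch the weights of the second component take both signs, so that component is a difference of two weighted sums of frame spectra; which frames fall on each side depends on the data.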



On the harmonic-to-noise ratio as an acoustic cue of vocal timbre of Parkinson speakers

September 2016 · 114 Reads · 10 Citations · Biomedical Signal Processing and Control · Christophe Mertens · Francis Grenez · [...]

The relevance of the harmonic-to-noise ratio (HNR) and glottal cycle length jitter as cues of the vocal timbre of Parkinson speakers is investigated. HNR and vocal cycle length jitter are known to be suitable cues for the evaluation of the vocal timbre of dysphonic speakers. However, the question whether they are relevant descriptors of the voice quality of Parkinson speakers is still unanswered. Empirical mode decomposition (EMD) has been used to estimate the HNR by decomposing the log-magnitude spectrum of the speech signal into its harmonic, contour and noise components. Cycle length jitter has been estimated via the break-up by empirical mode decomposition of the cycle length time series into the intonation contour as well as the perturbations owing to tremor and jitter. HNR and cycle length jitter values of vowels [a] sustained by 205 Parkinson and 74 control speakers are in the same interval respectively and the differences are not statistically significant. Also, the standard deviations of the per-frame HNR values of an utterance do not differ statistically significantly between control and Parkinson speakers.


A time-scale based objective function for EEG signal parameters optimization

December 2015 · 16 Reads

This paper concerns the optimization of EEG signal parameters for epileptic seizure detection. In a previous study, a macroscopic model was used to model various waveforms of the EEG signal, and its parameters were optimized by means of a genetic algorithm (GA). The GA-based parameter estimation minimizes an objective function that compares the desired waveform (the real EEG signal) with the waveform provided by the model, both in the time domain and in the frequency domain. In the present study, we propose a time-scale based representation for the objective function as an alternative to the time and frequency based objective function used in the earlier study. The proposed objective function takes into account the non-stationary nature of the EEG signal. The performance of the proposed wavelet-based objective function is compared to that of the spectral objective function.
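A time-scale objective function of this kind might be sketched as follows. The abstract does not name the wavelet or the way scales are weighted, so the Haar transform, the per-scale weights, and the function names below are assumptions made for illustration only.

```python
import numpy as np


def haar_dwt(x, levels):
    """Orthonormal Haar wavelet decomposition.

    Returns the detail coefficients of each level (finest first),
    followed by the final approximation coefficients.
    """
    approx = np.asarray(x, dtype=float)
    coeffs = []
    for _ in range(levels):
        if approx.size % 2:  # drop a trailing sample if the length is odd
            approx = approx[:-1]
        even, odd = approx[0::2], approx[1::2]
        coeffs.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    coeffs.append(approx)
    return coeffs


def time_scale_objective(target, model_output, levels=4, scale_weights=None):
    """Weighted squared error between the wavelet coefficients of two signals.

    scale_weights holds one weight per detail level plus one for the
    approximation (hypothetical parameter; defaults to all ones).
    """
    target_coeffs = haar_dwt(target, levels)
    model_coeffs = haar_dwt(model_output, levels)
    if scale_weights is None:
        scale_weights = np.ones(levels + 1)
    return float(sum(
        weight * np.sum((a - b) ** 2)
        for weight, a, b in zip(scale_weights, target_coeffs, model_coeffs)
    ))
```

With equal weights and a signal length divisible by 2 to the power `levels`, the orthonormal Haar transform preserves energy, so this error equals the plain time-domain squared error. Unequal weights are what would let a search procedure such as a GA emphasize particular scales.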


Figure 1: Categorization: cycle length time series, vocal jitter, neurological tremor, physiological tremor and trend (in that order) for a fragment of vowel [a] sustained by a Parkinson speaker.
Figure 2: Raw and smoothed typical estimated neurological tremor frequency time series for a fragment of vowel [a] sustained by a Parkinson speaker.

Vocal tremor analysis via AM-FM decomposition of empirical modes of the glottal cycle length time series
  • Conference Paper
  • Full-text available

September 2015 · 160 Reads · 4 Citations

The presentation concerns a method that obtains the size and frequency of vocal tremor in speech sounds sustained by normal speakers and patients suffering from neurological disorders. The glottal cycle lengths are tracked in the temporal domain via salience analysis and dynamic programming. The cycle length time series is then decomposed into a sum of oscillating components by empirical mode decomposition, the instantaneous envelopes and frequencies of which are obtained via an AM-FM decomposition. Based on their average instantaneous frequencies, the empirical modes are then assigned to four categories (intonation, physiological tremor, neurological tremor as well as jitter) and summed within each category. The within-category size of the cycle length perturbations is estimated via the standard deviation of the empirical mode sum divided by the average cycle length. The tremor frequency within the neurological tremor category is obtained via a weighted instantaneous average of the mode frequencies followed by a weighted temporal average. The method is applied to two corpora of vowels sustained by 123 and 74 control and 456 and 205 Parkinson speakers respectively.
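The categorization and size estimate described above can be sketched as follows, assuming the empirical modes have already been computed and the cycle length series has been resampled to a uniform rate `fs`. The band edges are hypothetical placeholders (the abstract does not give them), and the analytic signal stands in for the AM-FM decomposition, whose details are not stated.

```python
import numpy as np

# Hypothetical band edges in Hz; not taken from the abstract.
BAND_EDGES_HZ = {
    "intonation": (0.0, 2.0),
    "physiological tremor": (2.0, 4.0),
    "neurological tremor": (4.0, 12.0),
    "jitter": (12.0, np.inf),
}


def analytic_signal(x):
    """Analytic signal computed with the FFT (one-sided spectrum)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)


def mean_instantaneous_frequency(mode, fs):
    """Average instantaneous frequency of one mode, in Hz."""
    phase = np.unwrap(np.angle(analytic_signal(mode)))
    return float(np.mean(np.diff(phase)) * fs / (2.0 * np.pi))


def perturbation_sizes(modes, cycle_lengths, fs):
    """Std of the summed modes of each category over the mean cycle length."""
    cycle_lengths = np.asarray(cycle_lengths, dtype=float)
    sums = {name: np.zeros(cycle_lengths.size) for name in BAND_EDGES_HZ}
    for mode in modes:
        freq = mean_instantaneous_frequency(mode, fs)
        for name, (low, high) in BAND_EDGES_HZ.items():
            if low <= freq < high:
                sums[name] += np.asarray(mode, dtype=float)
                break
    mean_length = cycle_lengths.mean()
    return {name: float(np.std(total) / mean_length)
            for name, total in sums.items()}
```

For example, a 6 Hz sinusoidal mode of amplitude 0.2 ms superimposed on an 8 ms average cycle length would give a neurological tremor size of about 1.8 % under these assumed bands.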


Multiband vocal dysperiodicities analysis using empirical mode decomposition in the log-spectral domain

September 2014 · 34 Reads · 5 Citations · Biomedical Signal Processing and Control

In this paper, empirical mode decomposition (EMD) is proposed as an alternative to decompose the log magnitude spectrum of the speech signal into its harmonic, envelope and noise components. The acoustic measure named harmonic-to-noise ratio (HNR) is used to summarize the degree of disturbance in the speech signal and consequently to evaluate the overall quality of the disordered voices produced by dysphonic speakers. Most approaches to HNR estimation involve the isolation of individual speech cycles or pseudo-harmonics/rahmonics in the speech spectrum/cepstrum; however, this isolation cannot be carried out reliably in speech produced by severely hoarse speakers and may result in inaccurate HNR estimation. The EMD-based approach used in this study incorporates an appropriate procedure that automatically estimates the thresholds used by the clustering algorithm without knowledge of the fundamental frequency. The frequency range of the harmonic and noise components is divided into ten equally spaced intervals and the harmonic-to-noise ratios (HNRs) within each interval are used as independent variables to summarize the amount of perceived hoarseness. The proposed method is evaluated on a corpus comprising 251 normophonic and dysphonic speakers. Multiple correlation analysis carried out on HNRs from the different frequency bands shows that multi-band analysis based on empirical mode decomposition results in statistically significantly higher correlation of predicted scores with scores of perceived hoarseness over full-band analysis. Principal component analysis is carried out on the HNR measures obtained in the ten frequency bands. More than 97% of the total variance is explained by the first two principal components, PC1 and PC2.
Experimental results show that the first principal component is interpretable in terms of the degree of the severity of hoarseness whereas the second principal component indicates whether the voice is high-pitched or low-pitched. It is shown that the first two principal components result in a high predictability of hoarseness scores.
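The multi-band step can be illustrated with a short sketch that assumes the harmonic and noise components are already available as magnitude spectra on a uniform frequency grid. Defining the per-band HNR as an energy ratio in decibels is an assumption, since the abstract does not give the formula, and the function name is hypothetical.

```python
import numpy as np


def multiband_hnr_db(harmonic_mag, noise_mag, n_bands=10):
    """Harmonic-to-noise ratio in dB within equally spaced frequency bands.

    Assumes uniformly spaced frequency bins, so that bands with equal bin
    counts are equally wide, and non-zero noise energy in every band.
    """
    harmonic_mag = np.asarray(harmonic_mag, dtype=float)
    noise_mag = np.asarray(noise_mag, dtype=float)
    if harmonic_mag.shape != noise_mag.shape:
        raise ValueError("harmonic and noise spectra must have the same shape")
    hnr = []
    for idx in np.array_split(np.arange(harmonic_mag.size), n_bands):
        harmonic_energy = np.sum(harmonic_mag[idx] ** 2)
        noise_energy = np.sum(noise_mag[idx] ** 2)
        hnr.append(10.0 * np.log10(harmonic_energy / noise_energy))
    return np.array(hnr)
```

The resulting ten values per speaker could then serve as the independent variables of a regression or as the input of a principal component analysis, as the abstract describes.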





Development and perceptual assessment of a synthesizer of disordered voices

October 2012 · 47 Reads · 35 Citations · The Journal of the Acoustical Society of America

A synthesizer is based on a nonlinear wave-shaping model of the glottal area, an algebraic model of the glottal aerodynamics as well as concatenated-tube models of the trachea and vocal tract. Voice disorders are simulated by way of models of vocal frequency jitter and tremor, vocal amplitude shimmer and tremor, as well as pulsatile additive noise. Six experiments have been carried out to assess the synthesizer perceptually. Three experiments involve the perceptual categorization of male synthetic and human stimuli and one the auditory discrimination between synthetic and human tokens. A fifth experiment reports the auditory discrimination between synthetic tokens with different levels of additive and modulation noise. A sixth experiment reports the scoring by expert listeners of male synthetic stimuli on equal-appearing interval scales grade-roughness-breathiness (GRB). A first objective is to demonstrate the ability of the synthesizer to simulate vowel sounds that are valid exemplars of speech sounds produced by humans with voice disorders. A second objective is to learn how human expert raters perceptually map vocal frequency, additive and modulation noise as well as vowel categories into scores on GRB scales.


Citations (26)


... Azab and Khasawneh [36] used the spectrogram to detect malware files. Kachaa et al. [37] analyzed the different conditions of dysarthric speech, which is a speech disorder related to muscle weakness, using the spectrogram of voice signals to interpret the different states of this disorder. Zeng et al. [38] extracted the spectrogram of arm movements and used this feature to classify the movements. ...

Reference:

A Feature-Reduction Scheme Based on a Two-Sample t-Test to Eliminate Useless Spectrogram Frequency Bands in Acoustic Event Detection Systems
Principal component analysis of the spectrogram of the speech signal: Interpretation and application to dysarthric speech
  • Citing Article
  • July 2019

Computer Speech & Language

... However, it has been found that sustained phonation carries more disease discriminatory information and is more convenient to perform than isolated words and sentences [11,12]. Previous studies have investigated articulatory and rhythmic aspects for the sustained phonation task, where typical acoustic features include voice quality and rhythm [13,14,15]. However, these features do not consider the time-frequency properties of the voice [12,16]. ...

On the harmonic-to-noise ratio as an acoustic cue of vocal timbre of Parkinson speakers
  • Citing Article
  • September 2016

Biomedical Signal Processing and Control

... Finally, other important correlates of phonation instability are tremors. Classically tremor in voice [15] has been divided into three bands, known as physiological (between 2-4 Hz), neurological (5-8 Hz) and flutter (9-12 Hz). All these are semantic under the point of view of neuromotor disorders. ...

Vocal tremor analysis via AM-FM decomposition of empirical modes of the glottal cycle length time series

... The general procedure for the assessment of disordered voices is described by the flowchart in Figure 1. In the proposed method, the EMD algorithm is used in the log spectral domain for the estimation of the glottal source signal from the speech signal [16,17]. By means of the EMD algorithm, the logarithm of the magnitude spectrum of the speech signal is decomposed to oscillatory modes, called intrinsic mode functions (IMFs), that are clustered in two classes (the spectral envelope and the harmonic component) by a simple thresholding. ...

Multiband vocal dysperiodicities analysis using empirical mode decomposition in the log-spectral domain
  • Citing Article
  • September 2014

Biomedical Signal Processing and Control

... Previous studies have demonstrated that the excitatory and inhibitory gain (A and B) parameters of NMM play a central role in transitioning to seizure-like activity [45,46]. Building on these observations, the majority of studies have considered estimating the combination of A and B from epileptic EEG data [6,12,16,28] using NMMs. In [12] the combination of parameters a and b was additionally explored. ...

Early detection of epileptic seizures based on parameter identification of neural mass model
  • Citing Article
  • November 2013

Computers in Biology and Medicine

... For example, a lower f0 prompts earlier identification of the subharmonic (the lower frequency) as the true pitch, and the pitch drops more quickly in frequency- than amplitude-modulated tokens (Sun and Xu, 2002; Bergan and Titze, 2001). In addition, Fraj et al. (2012) assessed the perception of simulated disordered voice using additive pulsatile noise, frequency jitter and tremor, and amplitude shimmer and tremor. They found that the degree of roughness is positively associated with pulsatile noise and frequency jitter while negatively related to the vocal frequency. ...

Development and perceptual assessment of a synthesizer of disordered voices
  • Citing Article
  • October 2012

The Journal of the Acoustical Society of America

... The realistic assumption of far-field sources requires that the angles of incidence of the rays be identical for the reflected path and the masked direct signal. Decomposing this common angle into elevation and azimuth angles, the pseudorange delay induced by the alternate reception is given by [66]: ...

Statistical study of NLOS-Multipath in Urban Canyons
  • Citing Article

... We introduce a new type of transducer which is based on the electro-magnetic transducer principle. With this new type of transducer we are able to investigate different types of prototype waveforms (e.g., Liljencrants-Fant (LF) model [43], Hanquinet-Grenez-Schoentgen (HGS) model [44]) and can continuously change the frequency of this excitation signal. Furthermore, we design a prototype housing and attach a coupler disk to optimally transfer as much energy as possible through the neck tissue into the vocal tract. ...

Synthesis of Disordered Voices
  • Citing Conference Paper
  • February 2006

... The possibilities of the VIP technique in FIR digital filter design problems had not actually been investigated. In [5], the VIP technique was applied only to minimize the statistical coefficient wordlength, and in [6] negative results were obtained when using it for design with a given real coefficient wordlength. This paper shows that the VIP technique gives excellent results for FIR linear phase digital filter design, both with a minimum total number of adders in multiplierless structures and with optimal magnitude responses in a minimax sense at the given coefficient wordlength. ...

Design of f.i.r. linear phase digital filters to minimise the statistical word length of the coefficients
  • Citing Article
  • October 1977

IEE Journal on Electronic Circuits and Systems

... To prevent the abovementioned geometry degradation problem, the GNSS ray-tracing method was proposed (Ercek et al. 2006;Miura et al. 2013). Instead of excluding the NLOS signals, this method corrects NLOS errors in them by simulating the GNSS signal traveling path using the 3D model of the surrounding infrastructures which can be either stored beforehand or built onboard (Wen et al. 2018;Pugliese et al. 2023). ...

NLOS-multipath effects on Pseudo-Range estimation in urban canyons for GNSS applications
  • Citing Conference Paper
  • December 2006