Fig 1 - uploaded by Emmanuel Ferragne
Accents of the British Isles.  

Source publication
Article
Full-text available
We illustrate how a high-dimension feature space typically used in speech technology can be adapted to the phonetic description of vowels in 13 accents of the British Isles. In a previous work (Ferragne & Pellegrino, 2010), we carried out a formant investigation of the vowel systems of the British Isles; due to erroneous formant estimation, two-thi...

Contexts in source publication

Context 1
... accents (the Inner London subset) was not included in the analysis after auditory assessment by a British phonetician. The remaining sample therefore consists of 13 accents, each represented by 10 male and 10 female speakers on average, for a total of 261 speakers. Table 1 shows the abbreviations that will be used to designate the accents and Fig. 1 shows a map of the British Isles. A list of 11 /hVd/ words was read five times by the participants: heed, hid, head, had, hard, hod, hoard, hood, who'd, Hudd, heard. We will not delve into the advantages and drawbacks of such word lists; suffice it to say that they neutralize information-related phonetic variation (e.g. varying degrees ...
Context 2
... a typically northern English accent, ncl should not normally have distinct phonemes for Hudd and hood. The typical vowel in heard should show a high degree of frontness (Watt & Allen, 2003). The vowel system of ncl is illustrated in Fig. 10. The shortest distance in the tree diagram (i.e. between hood and Hudd) seems to support the lack of FOOT-STRUT split. It is slightly greater than the distance found between these two vowels in other accents lacking the distinction; for instance, the tree distance between hood and Hudd for eyk, lan, and lvp is closer to 50 than 70. One ...
Context 3
... the nwa vowel system, the 11 test words are assumed to contain distinct phonemes. On the phonetic level, we expect rather front qualities for heard and Hudd and a back quality in who'd. Fig. 11 shows the nwa vowel system. The similarity between Hudd, heard, and front vowels is attested in both graphs. The vowels in Hudd and heard are very close to each other in the dendrogram (less so in the MDS plot) and their distance, slightly less than 70, seems typical of an absence of spectral difference. In spite of this, the vowels can ...
Context 4
... not be analysed here. From Ferragne and Pellegrino (2010), we know that our roi sample quite unexpectedly has the same phoneme in hood and Hudd. Phonetically, rhoticity is expected to influence the measurements, especially in heard since the latter has a strong tendency to be r-coloured throughout (Ferragne, 2008). The vowel system of roi in Fig. 12 shows that hood and Hudd are very close to each other. Auditorily, the lack of FOOT-STRUT split is confirmed: this is, according to Hickey (2004), not typical of Dublin English in general. Although the distance between them is quite substantial, it is worth noticing that the three words containing a graphic <r> tend to cluster ...
Context 5
... shl system is expected to show the FOOT-GOOSE merger and optionally, in terms of realization, some retraction for the KIT vowel. We anticipate that rhoticity may affect the measurements; Ferragne and Pellegrino (2010) showed that, as opposed to gla, rhoticity is maintained here by all speakers. In Fig. 13 the vowel system of shl is illustrated. The shortest distance (both in the tree and the MDS plot) can be found between hood and who'd, which supports the FOOT-GOOSE merger (see however Section 3.3). Notice that, contrary to gla, the vowel of hid does not cluster with central vowels, which is borne out by our perceptual impression. ...
Context 6
... accent of Ulster (uls) is expected to show a vowel system very close to that of Scottish English in that hood and who'd should have the same phoneme. On the phonetic level, these two vowels are quite front, and rhoticity may affect the quality of the vowel in hard, hoard, and heard. The vowel system of uls is shown in Fig. 14. There is rather good agreement between the tree and the MDS plot. According to the dendrogram, the shortest distance can be found between hood and who'd. This is not exactly the case in the MDS plot although the distance between hood and who'd is among the shortest. Both graphs in Fig. 14 show that hood and who'd pattern with front ...
Context 7
... hoard, and heard. The vowel system of uls is shown in Fig. 14. There is rather good agreement between the tree and the MDS plot. According to the dendrogram, the shortest distance can be found between hood and who'd. This is not exactly the case in the MDS plot although the distance between hood and who'd is among the shortest. Both graphs in Fig. 14 show that hood and who'd pattern with front vowels as anticipated. The separate group composed of hard, hoard, and heard may arise from their proximity in terms of rhoticity. It must be borne in mind that rhoticity sometimes surfaces as an r-colouring spanning the whole vowel. Therefore, in such cases, our attempt to play down the ...
Context 8
... smoothing density estimates (Everitt et al., 2001, pp. 16-20) were computed from the raw duration of the restricted set of vowels under study. The difference between the two duration distribution estimates in each accent is expressed in Fig. 15 as the Jensen-Shannon divergence (multiplied by 10^4 for the sake of legibility), which is a symmetric version of the Kullback-Leibler (Kullback, 1968) distance used to gauge the distance between two probability distributions. A rapid look at Fig. 15 shows that four accents (ean, gla, nwa, sse) exhibit a strong divergence between the ...
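The computation described in this context can be sketched as follows. This is a minimal illustration, not the authors' code: the duration values, the evaluation grid, the Gaussian kernel, the fixed bandwidth, and the base-2 logarithm are all assumptions made for the example, since the excerpt does not specify them.

```python
import math

def gaussian_kde(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `samples` at each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]

def normalise(density):
    """Rescale grid values so they sum to 1 (a discrete probability distribution)."""
    total = sum(density)
    return [d / total for d in density]

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL divergence of p and q from their mixture."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical vowel durations in milliseconds (invented for illustration only).
durations_a = [92, 101, 88, 110, 97, 105, 94, 99]
durations_b = [148, 160, 139, 171, 155, 150, 163, 144]

grid = [float(x) for x in range(0, 401, 5)]   # assumed evaluation grid (ms)
bandwidth = 15.0                              # assumed smoothing bandwidth (ms)

p = normalise(gaussian_kde(durations_a, grid, bandwidth))
q = normalise(gaussian_kde(durations_b, grid, bandwidth))

# Scaled by 10^4, mirroring the scaling mentioned for Fig. 15.
print(js_divergence(p, q) * 1e4)
```

With a base-2 logarithm the divergence is bounded between 0 (identical distributions) and 1 (non-overlapping distributions), which makes values easy to compare across accents.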
Context 9
... The difference between the two duration distribution estimates in each accent is expressed in Fig. 15 as the Jensen-Shannon divergence (multiplied by 10^4 for the sake of legibility), which is a symmetric version of the Kullback-Leibler (Kullback, 1968) distance used to gauge the distance between two probability distributions. A rapid look at Fig. 15 shows that four accents (ean, gla, nwa, sse) exhibit a strong divergence between the two vowels, which very likely reflects a reliable distinction in duration. Conversely, the other 9 accents probably have no robust durational differences between the analysed ...
Context 10
... far as hood and Hudd are concerned (brm, eyk, lan, lvp, ncl, and roi), the low divergence in Fig. 15 and a careful inspection of the density estimates (there is an almost perfect overlap in each accent) suggest that duration could in any case not constitute a reliable cue to distinguish the two vowels. So the general picture tends to confirm that in the accents we have just mentioned, Hudd and hood (i.e. FOOT and STRUT) are one single ...
Context 11
... the biggest divergences in Fig. 15 (ean, nwa, and sse) exemplify pairs of vowels whose acoustic proximity is mainly disambiguated thanks to duration. In Fig. 15, we did not expect any difference in duration between hid and head in crn because there is absolutely no indication in the literature (Wells, 1982) that one of these two vowels should be longer than the other. ...
Context 12
... the biggest divergences in Fig. 15 (ean, nwa, and sse) exemplify pairs of vowels whose acoustic proximity is mainly disambiguated thanks to duration. In Fig. 15, we did not expect any difference in duration between hid and head in crn because there is absolutely no indication in the literature (Wells, 1982) that one of these two vowels should be longer than the other. So, the spectral similarity between hid and head in crn, as it is suggested by the dendrogram, is not counterbalanced by a ...
Context 13
... the mapping from the distance matrix to the final visual output. This aspect requires further investigation because different techniques yield very different results, as evidenced by the comparison between dendrograms and MDS plots. If we compare the shortest pairwise distance per accent from the matrix (the vowel pairs in question are shown Fig. 15) with those computed with hierarchical clustering or MDS, it appears that the tree diagrams in Section 3 are more faithful to shortest dissimilarities. Depending on the MDS technique, more or less weight will be given to the accuracy of small or large distances (Izenman, 2008); thus, a genuine benchmark of the various MDS algorithms ...
Context 14
... between two individual distance matrices (see Section 2.2) adequately reflects their proximity. Therefore, for each accent, mean vocalic distance matrices were computed; and distances between pairs of accents were expressed as 1 minus a correlation coefficient which was used to build a hierarchical clustering tree, again with average linkage (Fig. 16). Simultaneously, the distances between the original 261 vowel matrices were computed (as correlations) yielding a proximity matrix between the 261 speakers. The matrix was further converted to a dissimilarity matrix and submitted to non-metric MDS. In Fig. 17, for each accent, mean MDS coordinates were used to express accent centroids, ...
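The accent-level comparison described in this context can be sketched as below. It is a hedged illustration under stated assumptions: the excerpt does not say which correlation coefficient was used or how the matrices were vectorised, so the sketch assumes Pearson's r over the upper triangle of each mean vowel distance matrix. The matrices themselves are invented, and the non-metric MDS step is not shown.

```python
import math
from itertools import combinations

def upper_triangle(matrix):
    """Flatten the strictly upper triangle of a square symmetric matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def accent_distance(mat_a, mat_b):
    """Distance between two accents: 1 minus the correlation of their matrices."""
    return 1.0 - pearson(upper_triangle(mat_a), upper_triangle(mat_b))

def average_linkage(labels, dist):
    """Agglomerative clustering with average linkage.

    `dist` maps frozenset({a, b}) to the distance between labels a and b.
    Returns the merges in order as (cluster_1, cluster_2, height) tuples.
    """
    def between(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(label,) for label in labels]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: between(*pair))
        merges.append((c1, c2, between(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical mean vowel distance matrices (3 vowels per accent, invented values).
mean_matrices = {
    "gla": [[0, 40, 90], [40, 0, 60], [90, 60, 0]],
    "shl": [[0, 45, 95], [45, 0, 58], [95, 58, 0]],
    "lan": [[0, 90, 50], [90, 0, 70], [50, 70, 0]],
    "lvp": [[0, 85, 55], [85, 0, 72], [55, 72, 0]],
}

labels = list(mean_matrices)
dist = {
    frozenset((a, b)): accent_distance(mean_matrices[a], mean_matrices[b])
    for a, b in combinations(labels, 2)
}
merges = average_linkage(labels, dist)
for c1, c2, height in merges:
    print(c1, c2, round(height, 3))
```

Because the distance is based on correlation, two accents count as close when their vowel systems have the same internal shape, regardless of the absolute scale of the distances within each system.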
Context 15
... which was used to build a hierarchical clustering tree, again with average linkage (Fig. 16). Simultaneously, the distances between the original 261 vowel matrices were computed (as correlations) yielding a proximity matrix between the 261 speakers. The matrix was further converted to a dissimilarity matrix and submitted to non-metric MDS. In Fig. 17, for each accent, mean MDS coordinates were used to express accent centroids, and the 95% confidence interval of the mean represent the ...
Context 16
... Fig. 16, going from the root to the leaves, the first split separates Scottish and Irish varieties (gla, shl, roi, and uls) from the rest. This group of 4 accents further shows a division between the 2 Irish and the 2 Scottish accents. As for the remaining accents, lan and lvp constitute a separate group. Then, the following split draws a ...
Context 17
... As for the remaining accents, lan and lvp constitute a separate group. Then, the following split draws a distinction between linguistically southern (sse, brm, and ean) and northern (eyk, ncl, and nwa) accents: this rough bipartition may sound inaccurate for most dialectologists; that is why we will return to it shortly (Section 4.3). The MDS plot (Fig. 17) agrees with the dendrogram in that it highlights the separate group composed of lan and lvp. It also accurately reproduces the short distances between eyk, ncl, and nwa. The Scottish and Irish varieties no longer pattern together: while a cluster comprising gla, shl, and uls seems to emerge, roi, contrary to Fig. 16, does not pattern ...
Context 18
... (Section 4.3). The MDS plot (Fig. 17) agrees with the dendrogram in that it highlights the separate group composed of lan and lvp. It also accurately reproduces the short distances between eyk, ncl, and nwa. The Scottish and Irish varieties no longer pattern together: while a cluster comprising gla, shl, and uls seems to emerge, roi, contrary to Fig. 16, does not pattern with ...
Context 19
... (while it is generally thought to be rather northern) can be readily explained by our auditory analysis: only half of the speakers in the sample have one single phoneme for FOOT and STRUT words (which is a typically northern trait) while the remaining half have two separate phonemes (typically southern). Note however that the position of brm in Fig. 17 is more ambiguous than in Fig. 16. It is not surprising to see nwa pattern with the accents of the north of England; as Penhallurick (2004, p. 103) explains, North Wales is geographically close to (linguistically) northern English accents. However, although Penhallurick mentions the possibility of FOOT rhyming with STRUT in North ...
Context 20
... be rather northern) can be readily explained by our auditory analysis: only half of the speakers in the sample have one single phoneme for FOOT and STRUT words (which is a typically northern trait) while the remaining half have two separate phonemes (typically southern). Note however that the position of brm in Fig. 17 is more ambiguous than in Fig. 16. It is not surprising to see nwa pattern with the accents of the north of England; as Penhallurick (2004, p. 103) explains, North Wales is geographically close to (linguistically) northern English accents. However, although Penhallurick mentions the possibility of FOOT rhyming with STRUT in North Wales, we have found no such systemic ...
Context 21
... between nwa and northern accents. This proximity is probably best accounted for by similarities on the phonetic, realizational level. For instance, the proximity between had and hard found in nwa is also typical of the eyk sample, the rather front quality of heard is attested in brm, eyk, lvp, and ncl. While roi and uls cluster together in Fig. 16, our previous work (Ferragne, 2008), which involved more vowel types, showed that uls was closer to gla and shl (than it was to roi). The grouping of uls, gla, and shl in Ferragne (2008) was more consistent with the literature (Hughes et al., 2005; Wells, 1982) because, among other possible reasons, there was an influx of Scots ...
Context 22
... with the literature (Hughes et al., 2005; Wells, 1982) because, among other possible reasons, there was an influx of Scots settlers in Ulster in the 17th century (Hickey, 2004). Our current result, with Irish accents on one side and Scottish accents on the other, probably arises because fewer vowel types were analysed. Here again the MDS plot in Fig. 17 seems to contradict Fig. 16 since roi appears near the accents of the south of ...
Context 23
... et al., 2005; Wells, 1982) because, among other possible reasons, there was an influx of Scots settlers in Ulster in the 17th century (Hickey, 2004). Our current result, with Irish accents on one side and Scottish accents on the other, probably arises because fewer vowel types were analysed. Here again the MDS plot in Fig. 17 seems to contradict Fig. 16 since roi appears near the accents of the south of ...

Citations

... Other major works on the acoustics of BrE include Deterding (1997), who analysed the formants of Standard Southern British English (SSBE) from the MARSEC database where words in citation form are compared with those from words in connected speech, Hawkins and Midgley (2005) who also discovered that the frequencies of the vowels vary with age group, Ferragne and Pellegrino (2010), and Bjelakovic (2016). The magnitude of the changes that the English language has undergone over time has led Wells (2001) and Lindsey (2014), as cited by Bjelakovic (2016), to think of a possible modification of the phonetic symbols that are used to represent RP nucleus in some words. ...
Article
Full-text available
This study aims at studying the duration and the formants of the monophthongs /i/, /ɪ/, /a/, /æ/, /ɔ/ and /ɒ/ in the production of 19 undergraduate Cameroonian ESL (CamESL) students. The sounds put in hvd environment were read by the students in a calm environment, recorded and analysed using PRAAT version 6.1.16. The analysis of duration revealed that the students clearly distinguished between long and short vowels, but the study of the formants of the sounds indicated that no major distinction was made between the sound pairs, therefore resulting in the partial merger. The plot of the vowels also revealed a significant within-gender dispersion. It can therefore be concluded that CamESL learners’ productions were characterised by partial merger, inaccurate tongue advancement and within-gender dispersion.
... Linguistic features consider the syntactic structure of transcribed words, articulatory attributes represent place and manner of human phoneme articulation, and phonetic distance refers to distance matrix among acoustic features for the set of phoneme utterances. 6,7 Linguistic features may not be applicable in certain circumstances, such as for very short duration speech samples with insufficient phonemes, or low resource languages without text transcription. On the other hand, acoustic accent recognition is usually based only on acoustic features of the speech, which describe the physical characteristics of the sound produced. ...
... Most of the research involving the extraction of paralinguistic information from speech, has been traditionally focused on emotion recognition, speaker diarization, and speaker identification. [1][2][3][4][5][6][7][8][9][10][11][12] The proposed models predominantly convert short-term features of speech extracted from small stationary time windows, into a long-term feature vector, before applying a classifier for discrimination. The fixed-size long-term vector is calculated by statistical modelling over short-term features for complete utterance, and contains information of a longer duration of speech irrespective of the linguistic content. ...
Article
Full-text available
Social background profiling of speakers refers to estimating the geographical origin of speakers by their speech features. Methods for accent profiling that use linguistic features, require phoneme alignment and transcription of the speech samples. This paper proposes a purely acoustic accent profiling model, composed of multiple convolutional networks with global average-pooling layers, to classify the temporal sequence of acoustic features. The bottleneck representations of the convolutional networks, trained with the original signals and their low-pass filtered copies, are fed to a Support Vector Machine classifier for final prediction. The model has been analysed for a speech dataset of Indian speakers from social backgrounds spread across India. It has been shown that up to 85% accuracy is achievable for classifying the geographic origin of speakers corresponding to regional Indian languages; 17% higher than the benchmark deep learning model using the same features. Results have also indicated that classification of accents is easier using the second language of the speakers, as compared to their native language.
... Although thresholding may be applied to discard aberrant formant values [25], such a procedure may lead to discarding a large number of vowels (see e.g. [26]) and can hardly be applied to nasal vowels. In order to avoid including erroneous measures in analyses, we use MFCC measures to reliably capture the spectral characteristics of target vowels, following [26]. ...
... [26]) and can hardly be applied to nasal vowels. In order to avoid including erroneous measures in analyses, we use MFCC measures to reliably capture the spectral characteristics of target vowels, following [26]. A custom Praat [26] script is applied to extract on each target vowel 12 MFCCs (not including coefficient 0 related to the overall level of energy) on a 15 ms frame centred on the middle of the vowel, using a filter bank spaced by 100 Mel. ...
Conference Paper
Full-text available
This paper presents a study on the temporal and spectral characteristics of oral and nasal vowels in French read speech, produced by speakers from 25 to 90 years. The study utilizes a set of eight vowel categories and combines classical vowel space-related metrics with MFCC parametrization to accurately include nasal vowels. In total 15,375 vowels were analyzed from data from 37 French speakers (20 females, 17 males). Our data showed (a) a slowing down (increase in vowel duration) with age and (b) no clear change in the overall centralization/dispersion of vowels with increasing age. However, for older male speakers, there was a decrease in the variability within vowel categories and a moderate increase in the separation between categories. Additionally, both male and female speakers demonstrated a greater acoustic distinction between oral and nasal vowels with aging.
... Since they were 'borrowed' from the speech processing literature into linguistic studies, cepstral coefficients have proven to be an informative measure regarding the strength of various types of contrasts [9], including the voicing contrast in fricatives. Thus, cepstral coefficients were previously successful in categorizing English obstruent bursts [10], English vowels [11], Romanian fricatives [12,13], Russian sibilant fricatives [14], Azerbaijani fricatives [15], and Greek fricatives [16]. The potential advantages of using this measure in phonetic studies have been discussed extensively in recent literature. ...
... The potential advantages of using this measure in phonetic studies have been discussed extensively in recent literature. For instance, [11] recommended Mel-frequency cepstral coefficients (MFCCs) as a means to compute distances between vowels. This method yielded a very good estimate of the acoustic distance between 13 different accents of the British Isles, leading the authors to conclude that "the argument that MFCCs cannot be wrong (while formants can) provides strong support for the use of MFCCs in phonetic studies, if only for practical reasons" (p. ...
... Recent studies attempt to detect and describe which cepstral coefficients encode selected types of information (Spinu & Lilley, 2016;Spinu, Kochetov, & Lilley, 2018), but straightforward interpretation of cepstrum in terms of articulatory gestures is still very limited. However, different variants of cepstral coefficients constitute a good model for the signal and prove to be effective for classification (Ferragne & Pellegrino, 2010;Bunnell, Polikoff, & McNicholas, 2004;Spinu, Vogel, & Timothy Bunnell, 2012;Spinu & Lilley, 2016;Spinu et al., 2018), which is the basis for CAST systems (Shahin et al., 2020;Krecichwost, Mocko, & Badura, 2021). Apart from cepstral analysis, some authors develop new features that provide original information (Koenig et al., 2013) or allow for low-dimensional speech description (Plummer & Reidy, 2018). ...
Article
Full-text available
This study addresses Polish retroflex sibilants /ʂ, ʐ/ produced by preschool children. The aims of our research were (1) to explore acoustic characteristics of normal and distorted (dental and interdental) articulation patterns of retroflex fricatives and (2) to define and verify new acoustic features of frication noise. We extracted and analyzed a set of 80 acoustic features, including full-spectrum-based metrics (linear cepstral coefficients, mel-frequency cepstral coefficients, spectral moments) and noise-based metrics (noise energies, fricative formants, and original features: noise cepstral coefficients and fricative formant relations). The analysis involved linear mixed-effects models and Spearman's rank correlation over speech samples from 42 Polish children (21 with normal and 21 with distorted pronunciation). Normal articulation of Polish retroflex sibilants proved to be acoustically distinguishable from non-normative interdental and dental articulation. Significant acoustic differences between the classes considered were found both in the full-spectrum-based and noise-based features, including our proposed measures (p < 0.05). Thirty-six of 80 analyzed features proved significantly correlated (Spearman's |ρ| > 0.5; p < 0.05) with tongue position and front-cavity size. More evident cues for articulation pathologies were found in the voiceless sibilant /ʂ/ than in /ʐ/. Our study confirms that metrics describing the structure of frication noise bring information distinctive in particular articulatory oppositions for a more comprehensive acoustic description of sibilants.
... To mitigate these shortcomings, acoustic approaches have been developed for investigating language variation (Huckvale, 2007;Ferragne and Pellegrino, 2010;Strycharczuk et al., 2020;Bartelds et al., 2020). However, these studies either exclusively focused on the vowels (ignoring differences in the consonants), or were negatively influenced by non-linguistic variation in the speech signal. ...
Preprint
Full-text available
Deep acoustic models represent linguistic information based on massive amounts of data. Unfortunately, for regional languages and dialects such resources are mostly not available. However, deep acoustic models might have learned linguistic information that transfers to low-resource languages. In this study, we evaluate whether this is the case through the task of distinguishing low-resource (Dutch) regional varieties. By extracting embeddings from the hidden layers of various wav2vec 2.0 models (including new models which are pre-trained and/or fine-tuned on Dutch) and using dynamic time warping, we compute pairwise pronunciation differences averaged over 10 words for over 100 individual dialects from four (regional) languages. We then cluster the resulting difference matrix in four groups and compare these to a gold standard, and a partitioning on the basis of comparing phonetic transcriptions. Our results show that acoustic models outperform the (traditional) transcription-based approach without requiring phonetic transcriptions, with the best performance achieved by the multilingual XLSR-53 model fine-tuned on Dutch. On the basis of only six seconds of speech, the resulting clustering closely matches the gold standard.
... Another ACCDIST-based system was demonstrated to observe accent variation among a larger number of accents from across the British Isles in Ferragne and Pellegrino (2010), which also took advantage of the variation in phonetic realisations. In their study, Ferragne and Pellegrino took controlled wordlist data and created an ACCDIST-based model of the vowel systems of 261 speakers who represented 13 accents from the Accents of the British Isles (ABI) corpus (D'Arcy et al., 2004). ...
Article
Full-text available
Dialect variation spans different linguistic levels of analysis. Two examples include the typical phonetic realisations produced and the typical range of intonational choices made by individuals belonging to a given dialect group. Taking the modelling principles of a specific automatic accent recognition system, the work here characterises and observes the variation that exists within these two levels of analysis among eight Arabic dialects. Using a method that has previously shown promising performance on English accent varieties, we first model the segmental level of analysis from recordings of Arabic speakers to capture the variation in the phonetic realisations of the vowels and consonants. In doing so, we show how powerful this model can be in distinguishing between Arabic dialects. This paper then shows how this modelling approach can be adapted to instead characterise prosodic variation among these same dialects from the same speech recordings. This allows us to inspect the relative power of the segmental and prosodic levels of analysis in separating the Arabic dialects. This work opens up the possibility of using these modelling frameworks to study the extent and nature of phonetic and prosodic variation across speech corpora.
... Many studies have been conducted to investigate differences between accents in British English and have focused on a variety of accents, such as Cockney, Welsh, Received Pronunciation, and Estuary English, demonstrating the great diversity in the way English is pronounced in different regions of the UK (Ferragne & Pellegrino, 2010;Gordon, 2004;Grice, Ladd, & Arvaniti, 2002;Gussenhoven, 2008). ...
Article
This study is concerned with comparing the pronunciation in Southern Welsh, a Celtic language, and Cockney, an English dialect, regarding the place of articulation. The study uses a comparative method to shed light on the similarities and differences between the two accents. The data were collected from YouTube videos of speakers of Southern Welsh and Cockney and the consonant sound systems were analysed and compared. This study answers two main research questions: Do Southern Welsh and Cockney accents have the same consonants? What are the phonological differences between Southern Welsh and Cockney regarding place of articulation? The findings show that there are some phonological differences between Southern Welsh and Cockney in terms of bilabial, labiodental, dental, alveolar, lateral, palatal, velar, and uvular sounds. However, they are similar in terms of post-alveolar and glottal sounds. Awareness of these phonological differences is important for EFL learners to develop strong competencies in dealing with these accents which are gaining an increasing popularity due to the unprecedented spread of social media networks and applications.
... There is no notable technical innovation; this is perhaps why I waited until the end of my thesis to undertake this type of work, when it probably should have been the first analysis of my doctorate! The other article on the phonetic description of accents published the same year (Ferragne and Pellegrino, 2010a) promised, for its part, a far more stimulating story. Drawing on distances between vowels computed in a parameter space typical of speech technology, the Mel-Frequency Cepstral Coefficients (MFCC), this article attempted to present a new way of carrying out the phonetic description of vowels. ...
... The article on distances (Ferragne and Pellegrino, 2010a) in the MFCC space did not have the impact I had hoped for. It obviously suffered from the weaknesses I have just mentioned. ...
Thesis
Full-text available
Since scientists’ individual epistemological preferences infallibly shape the output of their research, this thesis starts with a presentation of the author’s position with respect to a number of methodological issues pertaining to the field of contemporary phonetics. Such concepts as corpora, experimental techniques, quantitative methods, and the role of technology are discussed with the aim of making the author’s scientific values and biases more explicit. The following chapters offer a selection of research works the author has carried out since his PhD in 2008. They show an evolution from corpus-based acoustic phonetics to more experimental protocols involving a great diversity of instruments and data types. From the automatic classification and acoustic-articulatory description of British Isles accents to the development of the gradient phonemicity hypothesis; from the study of speech rhythm to psycholinguistic experiments with French learners of English, the thesis covers the main findings and highlights how this wide array of interests and methods has served two consistent goals: an agnostic approach to new puzzles, and the possibility to efficiently help students develop their own scientific identity. The final part of the thesis addresses the forthcoming paradigm shift that deep learning will bring about in many academic fields with illustrations from the author’s recent work.