FIG 3 - uploaded by Saswati Rabha
(Color online) Illustration of waveform and VTC evidence for (a) unaspirated nasal, (b) aspirated nasal, and example of data annotation with (c) frication, aspiration, and vowels marked for /sʰam/, (d) nasal, aspiration, vowel marked for /mʰa/.

Source publication
Article
Full-text available
Unlike aspiration in stops, occurrence of aspiration in non-stop consonants is quite rare. Most of the languages that have aspirated non-stop consonants are low-resource languages. Hence, data driven, quantitative, and statistical analysis of their aspiration phenomena is fairly limited. Rabha and Angami are considered in this study, as previous st...

Contexts in source publication

Context 1
... normalized values in the VTC evidence (V) are the highest for voiced bars due to the complete closure of the vocal tract. Conversely, VTC values are the lowest for low vowels, which are produced with a wide-open vocal tract configuration (Sarma and Prasanna, 2014; Fig. 3). In the case of nasals, too, the vocal tract is completely closed, and the presence of the nasal tract introduces a low first formant at around 250 Hz. This results in a high VTC value for nasals as well. Figure 3 shows the waveform and corresponding VTC values for a nasal sound followed by a low vowel. It can be seen that the VTC evidence ...
Context 2
... results in a high VTC value for nasals as well. Figure 3 shows the waveform and corresponding VTC values for a nasal sound followed by a low vowel. It can be seen that the VTC evidence has higher values in the nasal region, and has lower values in the vowel region. ...
Context 3
... explore this nature of the VTC values to discriminate between the aspirated and unaspirated nasals. Figures 3(a) and 3(b) show the waveform and the VTC evidence for an unaspirated and an aspirated nasal. It can be clearly seen that the VTC has lower values for the aspirated nasal compared to the unaspirated one. ...

Citations

... A linear decision-level fused model was obtained by linearly weighting the prediction probabilities from individual L_0 classifiers by α, β, and γ, respectively [30,43,44]. The final prediction (P_F) was obtained by the weighted summation of individual prediction probabilities using Eq. (3) [45]. ...
Article
Full-text available
Cancer is a heterogeneous disease, and patients with tumors from different organs can share similar epigenetic and genetic alterations. Therefore, it is crucial to identify the novel subgroups of patients with similar molecular characteristics. It is possible to propose a better treatment strategy when the heterogeneity of the patient is accounted for during subgroup identification, irrespective of the tissue of origin. This work proposes a machine learning (ML) based pipeline for subgroup identification in pan-cancer. Here, mRNA, miRNA, DNA methylation, and protein expression features from pan-cancer samples were concatenated and non-linearly projected to a lower dimension using an ML algorithm. This data was then clustered to identify multi-omics-based novel subgroups. The clinical characterization of these ML subgroups indicated significant differences in overall survival (OS) and disease-free survival (DFS) (p-value<0.0001). The subgroups formed by the patients from different tumors shared similar molecular alterations in terms of immune microenvironment, mutation profile, and enriched pathways. Further, decision-level and feature-level fused classification models were built to identify the novel subgroups for unseen samples. Additionally, the classification models were used to obtain the class labels for the validation samples, and the molecular characteristics were verified. To summarize, this work identified novel ML subgroups using multi-omics data and showed that the patients with different tumor types could be similar molecularly. We also proposed and validated the classification models for subgroup identification. The proposed classification models can be used to identify the novel multi-omics subgroups, and the molecular characteristics of each subgroup can be used to design appropriate treatment regimen.
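The linear decision-level fusion mentioned in the excerpt above can be illustrated with a minimal sketch. The excerpt does not reproduce Eq. (3), so the form P_F = α·P_1 + β·P_2 + γ·P_3 is an assumption here, as are the function names `fuse_probabilities` and `predict_class`, the example probabilities, and the example weights.

```python
def fuse_probabilities(prob_sets, weights):
    """Return the weighted sum of per-classifier class-probability vectors.

    prob_sets: one probability vector per base classifier (hypothetical input).
    weights:   one weight per classifier, e.g. (alpha, beta, gamma).
    """
    if len(prob_sets) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    n_classes = len(prob_sets[0])
    fused = [0.0] * n_classes
    for probs, weight in zip(prob_sets, weights):
        if len(probs) != n_classes:
            raise ValueError("all classifiers must score the same classes")
        for k, p in enumerate(probs):
            fused[k] += weight * p
    return fused


def predict_class(fused):
    """Pick the index of the class with the largest fused score."""
    return max(range(len(fused)), key=fused.__getitem__)


# Illustrative values only: three base classifiers scoring two classes.
base_probs = [[0.7, 0.3], [0.4, 0.6], [0.2, 0.8]]
alpha, beta, gamma = 0.5, 0.3, 0.2  # assumed weights, chosen to sum to 1
fused = fuse_probabilities(base_probs, (alpha, beta, gamma))
print(fused, predict_class(fused))
```

With weights that sum to 1, the fused vector remains a valid probability distribution; the excerpt does not say whether the cited works impose that constraint.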
... Additionally, secondary articulations such as palatalization or aspiration can complexify the articulatory and acoustic structure observed in fricatives. Though typologically rare, phonologically aspirated voiceless fricatives involve, for instance, the production of both frication and aspiration noise, leading to further challenges in their characterization (Rabha et al., 2019). ...
... Similarly, the method we propose can also be useful for studying fricatives with secondary features such as palatalization (as in Russian) or aspiration (as in Korean). In both cases, clearly identifying the frication noise section can be crucial for identifying the phoneme (Rabha et al., 2019). ...
Article
Acoustic variation is central to the study of speaker characterization. In this respect, specific phonemic classes such as vowels have been particularly studied, compared to fricatives. Fricatives exhibit important aperiodic energy, which can extend over a high-frequency range beyond that conventionally considered in phonetic analyses, often limited up to 12 kHz. We adopt here an extended frequency range up to 20.05 kHz to study a corpus of 15 812 fricatives produced by 59 speakers in Russian, a language offering a rich inventory of fricatives. We extracted two sets of parameters: the first is composed of 11 parameters derived from the frequency spectrum and duration (acoustic set) while the second is composed of 13 mel frequency cepstral coefficients (MFCCs). As a first step, we implemented machine learning methods to evaluate the potential of each set to predict gender and speaker identity. We show that gender can be predicted with a good performance by the acoustic set and even more so by MFCCs (accuracy of 0.72 and 0.88, respectively). MFCCs also predict individuals to some extent (accuracy = 0.64) unlike the acoustic set. In a second step, we provide a detailed analysis of the observed intra- and inter-speaker acoustic variation.
... The final classification probability (P_L) was obtained by the weighted summation of individual prediction probabilities using Eq. (2) [57]. ...
Article
Full-text available
Non-small Cell Lung Cancer (NSCLC) is a heterogeneous disease with a poor prognosis. Identifying novel subtypes in cancer can help classify patients with similar molecular and clinical phenotypes. This work proposes an end-to-end pipeline for subgroup identification in NSCLC. Here, we used a machine learning (ML) based approach to compress the multi-omics NSCLC data to a lower dimensional space. This data is subjected to consensus K-means clustering to identify the five novel clusters (C1–C5). Survival analysis of the resulting clusters revealed a significant difference in the overall survival of clusters (p-value: 0.019). Each cluster was then molecularly characterized to identify specific molecular characteristics. We found that cluster C3 showed minimal genetic aberration with a high prognosis. Next, classification models were developed using data from each omic level to predict the subgroup of unseen patients. Decision‑level fused classification models were then built using these classifiers, which were used to classify unseen patients into five novel clusters. We also showed that the multi-omics-based classification model outperformed single-omic-based models, and the combination of classifiers proved to be a more accurate prediction model than the individual classifiers. In summary, we have used ML models to develop a classification method and identified five novel NSCLC clusters with different genetic and clinical characteristics.
... A linear decision-level fused model was obtained by linearly weighting the prediction probabilities from individual L_0 classifiers by α, β, and γ, respectively (Pavlidis et al., 2002; Potamianos et al., 2003; Oh and Kang, 2017). The final prediction (P_F) was obtained by the weighted summation of individual prediction probabilities using (3) (Rabha et al., 2019). ...
Preprint
Full-text available
Cancer is a heterogeneous disease and patients with tumors from different organs can share similar epigenetic and genetic alterations. Therefore, it is crucial to identify the novel subgroup of patients with similar molecular characteristics. It is possible to propose a better treatment strategy when the heterogeneity of the patient is accounted for during subgroup identification irrespective of the tissue of origin. In this work, mRNA, miRNA, DNA methylation, and protein expression features from pan-cancer samples were concatenated and non-linearly projected to lower dimension using machine learning (ML) algorithm. This data was then clustered to identify multi-omics based novel subgroups. The clinical characterization of these ML subgroups indicated significant differences in overall survival (OS) and disease free survival (DFS) (p-value < 0.0001). The subgroups formed by the patients from different tumors shared similar molecular alterations in terms of immune microenvironment, mutation profile, and enriched pathways. Further, decision-level and feature-level fused classification models were built to identify the novel subgroups for unseen samples. Additionally, the classification models were used to obtain the class labels for the validation samples and the molecular characteristics were verified. To summarize, this work identified novel ML subgroups using multi-omics data and showed that patients with different tumor types can be similar molecularly. We also proposed and validated the classification models for subgroup identification. The proposed classification models can be used to identify the novel multi-omics subgroups and the molecular characteristics of each subgroup can be used to design appropriate treatment regimen.
... In the case of the linear decision-level fused model, the prediction probabilities obtained from L_0 models were weighted by α, β, and γ, respectively [17,29]. The final classification probability (P_L) was obtained by the weighted summation of individual prediction probabilities using equation (2) [56]. ...
Preprint
Full-text available
NSCLC is a heterogeneous disease with a poor prognosis. Identifying novel subtypes can help classify patients with similar molecular and clinical phenotypes and treatment responses. This study uses a machine learning-based approach to compress the multi-omics (mRNA, miRNA, methylation, and protein expression) NSCLC data to a lower dimensional space. This data is subjected to consensus K-means clustering to identify the five novel clusters (C1-C5). Survival analysis of the resulting clusters revealed a significant difference in the overall survival of different clusters (p-value: 0.019). Each cluster was then molecularly characterized to identify specific molecular characteristics. We found that cluster C3 showed minimal genetic aberration with a high prognosis. Next, classification models were developed using data from each omic level to predict the subgroup of unseen patients. A decision-level model was then built using these classifiers, which was used to classify unseen patients into five novel clusters. We also showed that the multi-omics-based classification model outperformed single-omic-based models, and the combination of classifiers proved to be a more accurate prediction model than the individual classifiers. In summary, we have used ML models to develop a classification method and identified five novel NSCLC clusters with different genetic and clinical characteristics. This classification can prove to help decide the therapeutic option for NSCLC patients.
... Additionally, secondary articulations such as palatalization or aspiration can complexify the articulatory and acoustic structure observed in fricatives. Though typologically rare, phonologically aspirated voiceless fricatives involve, for instance, the production of both frication and aspiration noise, leading to further challenges in their characterization (Rabha et al., 2019). ...
... Similarly, the method we propose can also be useful for studying fricatives with secondary features such as palatalization (as in Russian) or aspiration (as in Korean). In both cases, clearly identifying the frication noise section can be crucial for identifying the phoneme (Rabha et al., 2019). ...
Article
Full-text available
This paper shows that machine learning techniques are very successful at classifying the Russian voiceless non-palatalized fricatives [f], [s], and [ʃ] using a small set of acoustic cues. From a data sample of 6320 tokens of read sentences produced by 40 participants, temporal and spectral measurements are extracted from the full sound, the noise duration, and the middle 30 ms windows. Furthermore, 13 mel-frequency cepstral coefficients (MFCCs) are computed from the middle 30 ms window. Classifiers based on single decision trees, random forests, support vector machines, and neural networks are trained and tested to distinguish between these three fricatives. The results demonstrate that, first, the three acoustic cue extraction techniques are similar in terms of classification accuracy (93% and 99%) but that the spectral measurements extracted from the full frication noise duration result in slightly better accuracy. Second, the center of gravity and the spectral spread are sufficient for the classification of [f], [s], and [ʃ] irrespective of contextual and speaker variation. Third, MFCCs show a marginally higher predictive power over spectral cues (<2%). This suggests that both sets of measures provide sufficient information for the classification of these fricatives and their choice depends on the particular research question or application.
... The analysis of allophones, representing very short fragments of speech, which are defined as variants of phonemes, is a field of speech analysis that still poses many challenges (Mitterer et al., 2018;Recasens, 2012). Such a "microscopic" approach to speech analysis opens new perspectives in numerous areas, e.g., pre-lexical processing in spoken-word recognition, allophonic and phonemic identity in speech recognition (Rabha et al., 2019), automatic evaluation of pronunciation (Shahin and Ahmed, 2019), differences between speakers' native regional accents (Aubanel and Nguyen, 2010), or the evaluation of speech articulatory disorders (Jiao et al., 2017) as it allows a more in-depth speech analysis. ...
Article
The approach proposed in this study includes methods specifically dedicated to the detection of allophonic variation in English. This study aims to find an efficient method for automatic evaluation of aspiration in the case of Polish second-language (L2) English speakers' pronunciation when whole words are analyzed instead of particular allophones extracted from words. Sample words including aspirated and unaspirated allophones were prepared by experts in English phonetics and phonology. The datasets created include recordings of words pronounced by nine native English speakers of standard southern British accent and 20 Polish L2 English users. Complete unedited words are treated as input data for feature extraction and classification algorithms such as k-nearest neighbors, naive Bayes method, long-short term memory, and convolutional neural network (CNN). Various signal representations, including low-level audio features, the so-called mid-term and feature trajectory, and spectrograms, are tested in the context of their usability for the detection of aspiration. The results obtained show high potential for an automated evaluation of pronunciation focused on a particular phonological feature (aspiration) when classifiers analyze whole words. Additionally, CNN returns satisfying results for the automated classification of words containing aspirated and unaspirated allophones produced by Polish L2 speakers.
... Aspirated nasals and aspirated fricatives are typologically rare in world languages and the literature to characterize them is very limited [1]. Aspirated nasals are reported in 0.2% of the total languages mentioned in the PHOIBLE database whereas aspirated fricatives are reported in 1% of the total languages [2]. Unlike Korean, which has some literature available on the acoustic qualities of its aspirated fricatives [3], [4], aspirated fricatives in Rabha and aspirated nasals in Angami are rarely described [2], [5], [6]. ...
... Aspirated nasals are reported in 0.2% of the total languages mentioned in the PHOIBLE database whereas aspirated fricatives are reported in 1% of the total languages [2]. Unlike Korean, which has some literature available on the acoustic qualities of its aspirated fricatives [3], [4], aspirated fricatives in Rabha and aspirated nasals in Angami are rarely described [2], [5], [6]. Hence, in this work, we conduct an acoustic study on aspiration in the widely documented aspirated fricatives in Korean and compare the study with the aspiration in the rarely documented aspirated fricatives and aspirated nasals in Rabha and Angami respectively. ...
... Korean is the national language of South Korea and North Korea and is spoken by 77,264,890 speakers worldwide [12]. Along with Korean [5], [13], a phonological distinction between unaspirated /s/ and aspirated /sʰ/ has been observed in the Rabha language as well [2]. Also, nasals in Angami have shown contrast based on aspiration, thus contributing to /m/-/mʰ/, /n/-/nʰ/ and /N/-/Nʰ/ pairs in the language [2]. ...
Conference Paper
Full-text available
This paper focuses on the phonetic analysis of Korean and Rabha fricatives and Angami nasals. Though aspirated consonants have been studied earlier, very few studies were found for the comparative study of aspirated fricatives and aspirated nasals. Previous literature has suggested the presence of aspirated fricatives. As there are limited studies on aspirated nasals, this paper tries to investigate the properties of aspirated nasals by comparing them with the aspiration in Korean aspirated fricative and analyses the feature that might distinguish between the aspirated and unaspirated counterparts of both the consonants. Features such as Intensity, Duration, Centre of Gravity (COG), F1 onset and Spectral tilt (H1-H2) are used to investigate whether there is a distinction between the aspirated and unaspirated fricatives and nasals. Results confirm that COG is a distinctive acoustic cue to discriminate the aspirated and unaspirated counterparts.
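The abstract above names Centre of Gravity (COG) as the distinguishing cue but does not define how it was computed. A common definition is the mean frequency weighted by the power spectrum, and the sketch below assumes that definition. The function name `spectral_cog`, the `power` parameter, and the test tone are all hypothetical; a naive DFT is used only to keep the example dependency-free, and an FFT library would be the usual choice for real recordings.

```python
import cmath
import math


def spectral_cog(samples, sample_rate, power=2.0):
    """Spectral centre of gravity in Hz: sum(f * |X(f)|^power) / sum(|X(f)|^power).

    Assumed definition; power=2.0 weights each frequency by the power spectrum.
    """
    n = len(samples)
    numerator = 0.0
    denominator = 0.0
    # Only bins from 0 Hz up to the Nyquist frequency are needed for real input.
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        weight = abs(coeff) ** power
        freq = k * sample_rate / n
        numerator += freq * weight
        denominator += weight
    if denominator == 0.0:
        raise ValueError("signal has no spectral energy")
    return numerator / denominator


# Illustrative input: a pure 1000 Hz tone sampled at 8000 Hz (8 full cycles).
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(64)]
print(round(spectral_cog(tone, rate), 3))
```

For a pure tone that falls exactly on a DFT bin, the COG coincides with the tone frequency; for noise-like segments such as frication or aspiration it summarizes where the spectral energy is concentrated.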
Article
Through comparison of regular sound correspondences in three closely related Tibeto-Burman (TB) languages, Ersu, Lizu, and Duoxu (collectively “ELD”), informed by external comparison with other TB languages and recent phonetic analyses of the production of voiceless nasals, we reconstruct *fricative-nasal sequences in their common ancestor, Proto-ELD. In the development of these historic clusters, two pathways of change can be recognized. Their difference lies in the divergent relative phasing of velic and oral gestures in the original fricative-nasal sequences: (i) fricative weakening (from a tight cluster): *FN > N̥ > h̃ > x; (ii) fricative strengthening (from a loose cluster): *F-n > *F-t > t > k or *F‑n > s. The different reflexes observed in Ersu, Lizu, and Duoxu represent different points along these two developmental pathways. These reconstructions and pathways of development have implications for our understanding of both universal (phonetic) and language-specific aspects of change in fricative-nasal sequences. The first pathway makes it possible to explore the process of nasal devoicing beyond voiceless nasals so as to enrich our understanding of nasal devoicing in natural languages. The co-existence of two opposite pathways of change, on the other hand, provides insights into the morphological and syllabic structure of words with contiguous fricative-nasal sequences in ELD languages at different points in time – insights that may be valuable in examining the history of other languages and language families beyond the ELD cluster.