Figure - available from: International Journal of Speech Technology
(a) Time-domain waveform; (b) corresponding spectrogram and the cry modes present in it, for an infant suffering from asthma

Source publication
Article
Full-text available
In this paper, spectrographic analysis of the infant cries is reported. For the spectrographic analysis of the infant cries ten different cry modes are used to analyze differences in different pathological cries. A comparison of spectrograms of the adult speech signal and infant cry signals is given. Based on differences in the distribution of ener...

Similar publications

Article
Full-text available
The decision on financing approval in sharia cooperatives has a high risk of the inability of customers to pay their credit obligations at maturity or referred to as bad credit. To maintain and minimize risk, an accurate method is needed to determine the financing agreement. The purpose of this study is to classify sharia cooperative loan history d...
Article
Full-text available
Support Vector Machine (SVM) is mainly used to classify data into two categories. To solve multi-category problems using SVM, researchers have used two approaches. The first approach is based on solving multiple binary SVM classifiers, whereas the other is based on solving a single optimization problem. In this paper, we have used the first app...
Article
Full-text available
This paper proposed a method of extracting the winter wheat area by combining support vector machine (SVM) with variable fuzzy sets. This method mainly aims to deal with mixed pixels in remote sensing images. The SVM classifier can accurately identify pure winter wheat pixels with training samples. However, winter wheat area information in mixed pi...
Article
Full-text available
UV-Vis spectroscopy has been used as a promising method for coffee quality evaluation including in authentication of several high-economic coffee types. In this paper, we have compared the abilities of linear discriminant analysis (LDA) and support vector machines classification (SVMC) methods for Luwak coffee classification. UV-Vis spectral data o...
Article
Full-text available
Several studies have addressed early detection of breast cancer so that patients' lives can be saved. Previous studies were based on mammogram images, but mammogram images sometimes give false detections that may endanger the patient's health. It is necessary to find an alternative method which is easier to implement and efficient also...

Citations

... Vibrato is defined to occur when there are at least four rapid up-and-down movements of fundamental frequency within one expiratory utterance 1,52 and has been studied in the context of congenital heart disease 1 , deafness 47 , and birth asphyxia 1,47 . ...
Preprint
Full-text available
Since the 1960s, neonatal clinicians have known that newborns suffering from certain neurological conditions exhibit altered crying patterns such as the high-pitched cry in birth asphyxia 1,2. Despite an annual burden of over 1.5 million infant deaths and disabilities 3,4, early detection of neonatal brain injuries due to asphyxia remains a challenge, particularly in developing countries where the majority of births are not attended by a trained physician 5. Here we report on the first inter-continental clinical study to demonstrate that neonatal brain injury can be reliably determined from recorded infant cries using an AI algorithm we call Roseline. Previous and recent work has been limited by the lack of a large, high-quality clinical database of cry recordings, constraining the application of state-of-the-art machine learning. We develop a new training methodology for audio-based pathology detection models and evaluate this system on a large database of newborn cry sounds acquired from geographically diverse settings: 5 hospitals across 3 continents. Our system extracts interpretable acoustic biomarkers that support clinical decisions and is able to accurately detect neurological injury from newborns' cries with an AUC of 92.5% (88.7% sensitivity at 80% specificity). Cry-based neurological monitoring opens the door for low-cost, easy-to-use, non-invasive and contact-free screening of at-risk babies, especially when integrated into simple devices like smartphones or neonatal ICU monitors. This would provide a reliable tool where there are no alternatives, and would also curtail the need to regularly subject newborns to physically exhausting or radiation-exposing assessments such as brain CT scans. This work sets the stage for embracing the infant cry as a vital sign and indicates the potential of AI-driven sound monitoring for the future of affordable healthcare.
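The operating point quoted in the abstract above (sensitivity at a fixed specificity) can be illustrated with a minimal sketch. This is not the Roseline evaluation code; the function name `sensitivity_at_specificity`, the thresholding rule (predict positive when the score exceeds the threshold), and the toy scores are all assumptions made for illustration.

```python
def sensitivity_at_specificity(scores, labels, target_specificity=0.80):
    """Best sensitivity among thresholds whose specificity meets the target.

    Hypothetical helper: labels are 1 (pathological) or 0 (healthy), and a
    sample is predicted positive when its score is strictly above the threshold.
    """
    neg = [s for s, y in zip(scores, labels) if y == 0]
    pos = [s for s, y in zip(scores, labels) if y == 1]
    if not neg or not pos:
        raise ValueError("need at least one positive and one negative sample")

    best = 0.0
    # Candidate thresholds: below every score, plus each distinct score.
    for t in [float("-inf")] + sorted(set(scores)):
        specificity = sum(1 for s in neg if s <= t) / len(neg)
        if specificity >= target_specificity:
            sensitivity = sum(1 for s in pos if s > t) / len(pos)
            best = max(best, sensitivity)
    return best


# Toy data (invented): five healthy and five pathological scores.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.35, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
toy_result = sensitivity_at_specificity(toy_scores, toy_labels, 0.80)
```

On the toy data, the threshold 0.4 keeps four of five healthy samples below it and leaves four of five pathological samples above it.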
... The Mel scale is approximately linear for frequencies below 1 kHz and logarithmic for frequencies above 1 kHz [16]. The Mel scale for a given frequency (f) in Hz is calculated as shown in equation (1): ...
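Equation (1) of the citing paper is not reproduced in this excerpt. As an illustration only, the sketch below uses one widely used form of the Hz-to-Mel mapping, m = 2595 log10(1 + f/700); whether the citing paper uses exactly this form is an assumption. The function names `hz_to_mel` and `mel_to_hz` are hypothetical.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert Hz to Mel with the common 2595 * log10(1 + f/700) form (assumed)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel under the same assumed formula."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# Under this form, 1000 Hz maps to roughly 1000 Mel, and doubling the
# frequency above 1 kHz adds far less than 1000 Mel (logarithmic region).
mel_1k = hz_to_mel(1000.0)
mel_2k = hz_to_mel(2000.0)
```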
... As the next step, our workflow analyses each tone unit independently of the other. With our SNR levels, the phonetic-scale (10-20 ms) structure of the audio is corrupted, i.e. the short-windowed spectrogram contains too much noise to distinguish between speech, noise, and infant cry automatically [63]. Therefore, the analysis would be better conducted at a syllabic scale (100-250 ms), where the spectrum is more robust to high noise levels and the study of energy modulations is more reliable [62,64,65]. ...
Article
Full-text available
Infant cry is one of the first distinctive and informative life signals observed after birth. Neonatologists and automatic assistive systems can analyse infant cry to detect pathologies early. These analyses extensively use reference expert-curated databases containing annotated infant-cry audio samples. However, these databases are not publicly accessible because of their sensitive data. Moreover, the recorded data can under-represent specific phenomena or the operational conditions required by other medical teams. Additionally, building these databases requires significant investments that few hospitals can afford. This paper describes an open-source workflow for infant-cry detection, which identifies audio segments containing high-quality infant-cry samples with no other overlapping audio events (e.g. machine noise or adult speech). It requires minimal training because it trains an LSTM-with-self-attention model on infant-cry samples automatically detected from the recorded audio through cluster analysis and HMM classification. The audio signal processing uses energy and intonation acoustic features from 100-ms segments to improve spectral robustness to noise. The workflow annotates the input audio with intervals containing infant-cry samples suited for populating a database for neonatological and early diagnosis studies. On 16 min of hospital phone-audio recordings, it reached sufficient infant-cry detection accuracy in 3 neonatal care environments (nursery: 69%, sub-intensive: 82%, intensive: 77%) involving 20 infants subject to heterogeneous cry stimuli, and had substantial agreement with an expert's annotation. Our workflow is a cost-effective solution, particularly suited for a sub-intensive care environment, scalable to monitor from one to many infants. It allows a hospital to build and populate an extensive high-quality infant-cry database with a minimal investment.
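The energy feature over 100-ms segments mentioned in the abstract above can be sketched as follows. This is only an assumption-laden illustration, not the published workflow: the function name `frame_log_energy`, the non-overlapping framing, the dB scaling, and the synthetic test signal are all choices made here.

```python
import math


def frame_log_energy(samples, sample_rate, frame_ms=100.0):
    """Mean-square energy in dB for consecutive non-overlapping frames.

    Hypothetical helper; a trailing partial frame is dropped.
    """
    frame_len = int(sample_rate * frame_ms / 1000.0)
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")

    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        mean_square = sum(x * x for x in frame) / frame_len
        # Small floor avoids log10(0) on silent frames.
        energies.append(10.0 * math.log10(mean_square + 1e-12))
    return energies


# Synthetic example: 0.5 s of silence followed by 0.5 s of a 440 Hz tone.
rate = 8000
silence = [0.0] * (rate // 2)
tone = [math.sin(2.0 * math.pi * 440.0 * n / rate) for n in range(rate // 2)]
energies = frame_log_energy(silence + tone, rate)
```

Here the first five frames sit at the floor value, while the tone frames are close to -3 dB, the mean-square level of a unit-amplitude sine.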
... Few studies target RDS as a single pathology group; Matikolaie et al. [20] proposed an NCDS to distinguish newborns suffering from RDS from healthy newborns and obtained 73.80% accuracy. Chittora et al. [73] presented a spectrographic comparison of the RDS cries, where a double harmonic break was presented, suggesting that a resonant study of the cry signal would be helpful in analyzing the RDS cries. Moreover, Lederman et al. [74] classified preterm infants suffering from RDS against healthy preterm infants and achieved 63% accuracy using hidden Markov models. ...
Article
Full-text available
Crying is the only means of communication for a newborn baby with its surrounding environment, but it also provides significant information about the newborn’s health, emotions, and needs. The cries of newborn babies have long been known as a biomarker for the diagnosis of pathologies. However, to the best of our knowledge, exploring the discrimination of two pathology groups by means of cry signals is unprecedented. Therefore, this study aimed to identify septic newborns with Neonatal Respiratory Distress Syndrome (RDS) by employing the Machine Learning (ML) methods of Multilayer Perceptron (MLP) and Support Vector Machine (SVM). Furthermore, the cry signal was analyzed from the following two different perspectives: 1) the musical perspective by studying the spectral feature set of Harmonic Ratio (HR), and 2) the speech processing perspective using the short-term feature set of Gammatone Frequency Cepstral Coefficients (GFCCs). In order to assess the role of employing features from both short-term and spectral modalities in distinguishing the two pathology groups, they were fused in one feature set named the combined features. The hyperparameters (HPs) of the implemented ML approaches were fine-tuned to fit each experiment. Finally, by normalizing and fusing the features originating from the two modalities, the overall performance of the proposed design was improved across all evaluation measures, achieving accuracies of 92.49% and 95.3% by the MLP and SVM classifiers, respectively. The MLP classifier was outperformed in terms of all evaluation measures presented in this study, except for the Area Under Curve of Receiver Operator Characteristics (AUC-ROC), which signifies the ability of the proposed design in class separation. The achieved results highlighted the role of combining features from different levels and modalities for a more powerful analysis of the cry signals, as well as including a neural network (NN)-based classifier. Consequently, attaining a 95.3% accuracy for the separation of two entangled pathology groups of RDS and sepsis elucidated the promising potential for further studies with larger datasets and more pathology groups.
Article
In this study, Dunstan’s infant cry data set is pre-processed with the feature vector approach, including MFCC (19 features) and energy (one feature). By using extracted features and Support Vector Machine (SVM), Multilayer Perceptron (MLP), and Convolutional Neural Network (CNN) classifiers, five classes of infant cry (“Neh” = hungry; “Eh” = need to burp; “Owh” = tired; “Eairh” = stomach cramp; “Heh” = physical discomfort) are distinguished. The proposed MLP and CNN structures are analyzed according to the loss and the accuracy based on the epoch; moreover, to evaluate the performance of classifiers AUC-ROC, Confusion matrix, accuracy, f1_score, recall, and precision have been used. All three classifiers are analyzed, and their results show that the CNN-designed model has the best performance. Results show that the performance will improve by increasing the complexity of the model. With this approach, classifiers are run 10 times, and the average accuracy for SVM for SMOTE and non-SMOTE data are obtained with tolerance 0.823 ± 0.02, 0.861 ± 0.02, respectively. These accuracies for MLP are 0.876 ± 0.01, 0.892 ± 0.01, and finally, for CNN, are 0.921 ± 0.005, 0.911 ± 0.005. At the best condition, an accuracy of 92.1% is obtained for five classes of infant cries by the proposed CNN structure.
Article
The very first cry of an infant gives vital information about the infant's health, and as infants grow the acoustics of the cry change with the development of their vocal tract system. This motivates learning the factors that cause an infant to cry, which would have a large impact in medical and household settings. Infant cry recordings are frequently used for non-invasive infant health inspection and monitoring. Automated approaches for forecasting health status, on the other hand, are highly dependent on the features extracted. In this paper, the diagnostic feasibility of time-domain features to detect and discriminate various cry-cause factors of cry signals is investigated. Mean, peak value, RMS, crest factor, impulse factor, shape factor, energy, and clearance factor are the features employed in this work. Among the features investigated, RMS is found to be more effective than all other features in detecting cry-cause factors, with a probability value (P) of 2.23307 × 10⁻⁶, and it offers an accuracy of 91.67%, sensitivity of 90%, and specificity of 93.33%.
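The eight time-domain features named in the abstract above can be sketched as below. The abstract does not give formulas, so the definitions used here are the conventional signal-analysis ones and are an assumption, as is the function name `time_domain_features`.

```python
import math


def time_domain_features(x):
    """Compute common time-domain descriptors of a signal (assumed definitions)."""
    n = len(x)
    if n == 0:
        raise ValueError("empty signal")

    abs_x = [abs(v) for v in x]
    peak = max(abs_x)
    if peak == 0.0:
        raise ValueError("silent signal: ratio features are undefined")

    mean_abs = sum(abs_x) / n
    energy = sum(v * v for v in x)
    rms = math.sqrt(energy / n)
    mean_sqrt_abs = sum(math.sqrt(a) for a in abs_x) / n

    return {
        "mean": sum(x) / n,
        "peak": peak,                          # largest absolute amplitude
        "rms": rms,
        "crest_factor": peak / rms,
        "impulse_factor": peak / mean_abs,
        "shape_factor": rms / mean_abs,
        "energy": energy,                      # sum of squared samples
        "clearance_factor": peak / mean_sqrt_abs ** 2,
    }


# A unit square wave: every ratio feature equals 1 and the mean is 0.
square_features = time_domain_features([1.0, -1.0, 1.0, -1.0])
```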
Conference Paper
A powerful automatic recognition system can have a great impact in decreasing infant mortality through early diagnosis of the different pathologies affecting newborn life. In this paper, an automatic infant cry classification model is developed to distinguish between normal and asphyxiated infants. This research studies the performance of discrete-wavelet-derived Mel frequency cepstrum coefficients in estimating features from infant cry signals. The extracted features are fed to a binary support vector machine classifier, and classification accuracy is computed to examine the efficiency of the developed model compared to the conventional Mel frequency cepstrum feature estimation technique. Results show that using wavelet-derived Mel frequency cepstrum features produced a higher classification accuracy of 98.5% in discriminating normal and asphyxia infant cry signals.