Example of the structure of the folksonomy, showing only some of the concepts (culture, politics, international, …), superclasses (sports, arts, …), classes (local sports, general, …), etc.

Source publication

Summary of the resources inventory, audio, and speech segments (SSG)

Inventory of components involved in the system configuration

Rates that have the greatest effect on the classification, according to...

Configurations implemented in the adiUP system

NIST SNR and WADA SNR of speech signals with regard to the signal length

Multilingual audio information management system based on semantic knowledge in complex environments

Article

Dec 2020

This paper proposes a multilingual audio information management system based on semantic knowledge in complex environments. The complex environment is defined by the limited resources (financial, material, human, and audio resources); the poor quality of the audio signal taken from an internet radio channel; the multilingual context (Spanish, Frenc...

Implementation of a System for Assessing the Quality of Spoken English Pronunciation Based on Cognitive Heuristic Computing

Article

Jul 2022
Comput Intell Neurosci

This paper investigates quality assessment of spoken English pronunciation using a cognitive heuristic computing approach and designs a corresponding pronunciation quality assessment system for practical training. Using the general Goodness of Pronunciation (GOP) assessment algorithm as a benchmark, the shortcomings of the traditional GOP method are explored through statistical experiments, and the validity of the overall posterior probability output by the speech model for pronunciation quality assessment is verified. There is no common algorithmic framework for the analysis of rhythm, so the paper proposes two algorithms for its two main factors: an F0 similarity algorithm based on dynamic time regularization for intonation, and a stop similarity algorithm based on forced alignment for pauses. After framing, a Hamming window is applied to smooth the signal, reduce the side-lobe size after fast Fourier transform (FFT) processing, and reduce spectral leakage. Compared with an ordinary rectangular window, the Hamming window yields a higher-quality spectrum. For speech recognition modeling combined with CTC, recognition rates are comparable when BLSTM and the bidirectional gated recurrent unit (BGRU) are used as the hidden-layer unit, and training with BGRU takes 23% less time than with BLSTM. In addition, the BGRU-CTC model is improved by using a 2-BGRU-CTC model with 256 hidden-layer nodes, which reduces the phoneme recognition error rate to 33%. Experiments also verify the effectiveness of the algorithmic framework and of the proposed phoneme segment feature and rhyme similarity algorithm.
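
The effect of the Hamming window on spectral leakage can be illustrated with a small sketch. It compares a rectangular and a Hamming window on a test tone that falls between two DFT bins, which is the case where leakage is worst. The frame length, tone frequency, guard width, and helper names are illustrative assumptions and are not taken from the paper; a naive DFT stands in for the FFT to keep the example dependency-free.

```python
import cmath
import math


def hamming(n_samples):
    """Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]


def dft_magnitudes(frame):
    """Naive O(N^2) DFT magnitudes for the non-negative frequency bins."""
    n_samples = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * n / n_samples)
                    for n, x in enumerate(frame)))
            for k in range(n_samples // 2 + 1)]


def leakage_ratio(mags, peak_bin, guard=3):
    """Fraction of spectral energy more than `guard` bins away from the peak."""
    total = sum(m * m for m in mags)
    far = sum(m * m for k, m in enumerate(mags) if abs(k - peak_bin) > guard)
    return far / total


# Hypothetical test frame: a sine at 10.5 bins, i.e. between DFT bins 10 and 11.
N = 128
frame = [math.sin(2 * math.pi * 10.5 * n / N) for n in range(N)]

rect_mags = dft_magnitudes(frame)  # rectangular window = no weighting
hamm_mags = dft_magnitudes([x * w for x, w in zip(frame, hamming(N))])

rect_leak = leakage_ratio(rect_mags, peak_bin=10)
hamm_leak = leakage_ratio(hamm_mags, peak_bin=10)
print(f"rectangular leakage: {rect_leak:.4f}, Hamming leakage: {hamm_leak:.6f}")
```

The Hamming window trades a wider main lobe for much lower side lobes, so less energy spreads into distant bins than with the rectangular window.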

The Modular Design of an English Pronunciation Level Evaluation System Based on Machine Learning

Article

Jun 2022

To improve the accuracy of English pronunciation level evaluation, we study a modular, machine-learning-based design for an English pronunciation level evaluation system. The S3C2440A chip serves as the system’s main processor, and spoken English recordings are sent to the evaluation module through the speech upload module. In the evaluation module, the pronunciation signal is filtered by the multilayer wavelet feature scale transformation method, and the intonation, speed, pitch, rhythm, and emotion features are decomposed and extracted. Test results show that the misjudgment rate for different mispronunciations is less than 1% when the system is used to evaluate the English pronunciation level, indicating high evaluation accuracy. Theories related to speech recognition, speech scoring, and pronunciation correction algorithms are studied in depth, and an assisted learning system based on the AP scoring method and pronunciation resonance peak comparison technology is proposed to address inaccurate pronunciation scoring and the lack of effective feedback when speech recognition technology is applied to oral learning. The English pronunciation training system implements all basic functions, such as pronunciation following of English phonetic symbols and words, real-time pronunciation evaluation, and pronunciation correction of English phonemes, and works as expected. In testing, the system achieved high accuracy in pronunciation scoring: its similarity to experts’ scoring exceeds 90% for vowel and word pronunciation, and the efficiency of pronunciation correction reaches 80%, which can improve learners’ pronunciation level to a certain extent.

Optimization of Music Feature Recognition System for Internet of Things Environment Based on Dynamic Time Regularization Algorithm

Article

May 2021
COMPLEXITY

Hong Kai

Music feature recognition is difficult because music theory knowledge is complex, varied, and shaped by musical specialization, so we designed a music feature recognition system based on Internet of Things (IoT) technology. The physical sensing layer of the system places sound sensors at different locations to collect the original music signals and uses a digital signal processor to analyze and process them. The network transmission layer transmits the processed music signals to the music signal database in the application layer of the system. The music feature analysis module of the application layer uses the dynamic time regularization algorithm to obtain the maximum similarity between the test template and the reference template, which realizes feature recognition of the music signal and determines the music pattern and music emotion corresponding to the music feature content according to the recognition result. The experimental results show that the system operates stably, can capture high-quality music signals, and can correctly identify music style features and emotion features. The results of this study can meet the needs of composers’ assisted creation and music researchers’ analysis of large amounts of music data, and they can be extended to deep music learning research, human-computer interaction music creation, application-based music creation, and other fields.
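
As a rough illustration of the template matching described in this abstract, the sketch below assumes that "dynamic time regularization" denotes the standard dynamic time warping (DTW) algorithm and treats maximum similarity as minimum DTW distance. The function names, the one-dimensional feature sequences, and the template labels are hypothetical and are not taken from the paper.

```python
def dtw_distance(test, reference):
    """Classic DTW distance between two 1-D sequences (absolute-difference cost)."""
    inf = float("inf")
    rows, cols = len(test), len(reference)
    # cost[i][j] = minimal accumulated cost aligning test[:i] with reference[:j]
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            local = abs(test[i - 1] - reference[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # test sample repeated
                                     cost[i][j - 1],      # reference sample repeated
                                     cost[i - 1][j - 1])  # both advance
    return cost[rows][cols]


def best_match(test, templates):
    """Label of the reference template with the smallest DTW distance to `test`."""
    return min(templates, key=lambda label: dtw_distance(test, templates[label]))


# Hypothetical reference templates (e.g. a pitch contour per music pattern).
templates = {
    "rising": [0, 1, 2, 3, 4],
    "falling": [4, 3, 2, 1, 0],
}
# A time-stretched rising contour; DTW absorbs the difference in tempo.
test_contour = [0, 0, 1, 2, 2, 3, 4, 4]
print(best_match(test_contour, templates))
```

Because DTW allows samples to be repeated along the alignment path, a test signal played slower or faster than the reference can still match it closely, which is the reason it suits template comparison of signals with varying tempo.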

Special issue on developing nature-inspired intelligence by neural systems

Article

Dec 2020
NEURAL COMPUT APPL

ESTIMADOR DE RELACIÓN SEÑAL A RUIDO USANDO REDES NEURONALES PARA RECONOCEDORES AUTOMÁTICOS DE VOZ EN USO HOSPITALARIO

Conference Paper

Nov 2023

Franco Bautista

A signal-to-noise ratio (SNR) estimator is developed to determine the quality of speech recordings entering automatic speech recognition (ASR) systems for transcription in a medical context. Because an SNR value must be obtained from a single signal, without a reference, voice activity detection models are built using a neural network with a hybrid architecture (CNN and BiLSTM) and two training sets, one with data similar to that of the final use case and another with more general examples. An algorithm that uses this information estimates the SNR with a maximum error of 3 dB for signals with an SNR of 0 dB or higher. To define the value above which signals are considered suitable for the ASR system, the Word Error Rate (WER) of 1000 automatic transcriptions of laboratory studies is computed and the results are correlated with their respective estimated SNR values. The rejection threshold is set at 15 dB for an expected WER of 10% or less.
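
One simple way to turn a voice activity mask into a reference-free SNR estimate is sketched below. This is an assumption for illustration, not the authors' algorithm: it treats non-speech samples as pure noise, subtracts their mean power from the mean power of the speech regions, and compares the result with the 15 dB rejection threshold stated in the abstract. The function names and the synthetic signal are hypothetical.

```python
import math


def estimate_snr_db(samples, speech_mask):
    """Estimate SNR in dB from one signal and a per-sample speech/non-speech mask.

    Assumes noise is stationary and also present under the speech segments.
    """
    speech_sq = [x * x for x, is_speech in zip(samples, speech_mask) if is_speech]
    noise_sq = [x * x for x, is_speech in zip(samples, speech_mask) if not is_speech]
    if not speech_sq or not noise_sq:
        raise ValueError("need both speech and non-speech samples")
    noise_power = sum(noise_sq) / len(noise_sq)
    # Speech regions contain speech plus noise, so remove the noise share.
    speech_power = sum(speech_sq) / len(speech_sq) - noise_power
    if speech_power <= 0:
        return float("-inf")  # speech not distinguishable from noise
    if noise_power == 0:
        return float("inf")   # noiseless recording
    return 10 * math.log10(speech_power / noise_power)


def is_acceptable(snr_db, threshold_db=15.0):
    """Apply the 15 dB rejection threshold quoted in the abstract."""
    return snr_db >= threshold_db


# Synthetic example: noise power 1 everywhere, speech-region power 11,
# so the speech-only power is 10 and the SNR is about 10 dB.
noise_part = [1.0 if n % 2 == 0 else -1.0 for n in range(100)]
speech_part = [math.sqrt(11) if n % 2 == 0 else -math.sqrt(11) for n in range(100)]
signal = noise_part + speech_part
mask = [False] * 100 + [True] * 100

snr = estimate_snr_db(signal, mask)
print(f"estimated SNR: {snr:.2f} dB, acceptable: {is_acceptable(snr)}")
```

In the system described, the mask would come from the CNN and BILSTM voice activity detection models rather than being known in advance.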

Agricultural and rural ecological management system based on big data in complex system

Article

Feb 2021

The rural agricultural ecosystem has an important influence on the development of our country’s economy, society, and ecological environment. In recent years, our country has paid increasing attention to rural agriculture, and scientific, reasonable management of the agricultural ecosystem has become a widely shared concern. To address the management of the rural agro-ecosystem, this article takes big data as its research background and constructs an index system for ecological management of the rural agro-ecosystem based on complex system theory. The experiment uses fertilizer consumption, degree of water pollution, degree of pest infestation, carbon and nitrogen absorption, and agricultural economic benefits of a rural agricultural ecosystem in a certain area as the indicators of the ecological management system. Data mining technology is used to collect and process relevant data from the network, the agricultural ecosystem is analyzed through complex systems, and the data for each indicator are then calculated and analyzed. The results show that fertilizer consumption in the rural area that introduced the agricultural ecological management system was 70 kg/m2 lower than in another rural area, and water pollution and insect pests were two degrees lower than in the other rural area. From 2019 to 2020, following the introduction of the system, the carbon absorption of the agricultural ecosystem increased to 166 mt and 241 mt, and the nitrogen absorption increased to 1134 mt and 1430 mt. Under the influence of the system, the total agricultural economic income of rural D and rural E, which introduced it, was 12.51 million, 1.98 million more than that of the other three rural areas. It can be seen that the agricultural ecological management system has a positive effect on the rural agricultural ecosystem.
