Figure - available from: Neural Computing and Applications
Example of the structure of the folksonomy, showing only some of the concepts (culture, politics, international, …), superclasses (sports, arts, …), classes (local sports, general, …), etc.

Source publication
Article
This paper proposes a multilingual audio information management system based on semantic knowledge in complex environments. The complex environment is defined by the limited resources (financial, material, human, and audio resources); the poor quality of the audio signal taken from an internet radio channel; the multilingual context (Spanish, Frenc...

Citations

... Experiments show that the method captures speech errors better than knowledge-based and data-driven speech rules, but at a higher computational cost. DTCWT is used to extract lip texture features because of its displacement invariance and good orientation selectivity [6]. The Canberra distance between adjacent frames of lip texture features is utilized as visual dynamic features. ...
Article
This paper analyzes and investigates the quality assessment of spoken English pronunciation using a cognitive heuristic computing approach and designs a corresponding spoken pronunciation quality assessment system for practical training. Using the general Goodness of Pronunciation assessment algorithm as a benchmark, the shortcomings of the traditional Goodness of Pronunciation method are explored through statistical experiments, and the validity of the overall posterior probability output from the speech model for pronunciation quality assessment is verified. For the analysis of rhythm, there is no common algorithm framework; this paper proposes an F0 similarity algorithm based on dynamic time regularization and a stop similarity algorithm based on forced alignment for the two main factors of rhythm, intonation and pause, respectively. After framing, Hamming window processing is used to smooth the signal, reduce the side lobe size after fast Fourier transform processing, and reduce spectrum leakage. Compared with the ordinary rectangular window function, the Hamming window yields a higher quality spectrum. Combined with CTC for speech recognition modeling, the recognition rates are comparable when BLSTM and the bidirectional gated recurrent unit (BGRU) are used as the hidden layer unit, respectively, and training with BGRU takes 23% less time than with BLSTM; in addition, the BGRU-CTC model is improved by using a 2-BGRU-CTC model with 256 hidden layer nodes, so that the error rate of phoneme recognition is reduced to 33%. The effectiveness of the algorithm framework is also verified through experiments, which further supports the effectiveness of the proposed phoneme segment feature and rhyme similarity algorithm.
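The framing and Hamming-window step mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the cited paper's implementation: the function names (`hamming`, `frame_signal`, `dft_magnitude`, `leakage_ratio`), the frame length, and the test tone are all assumptions chosen for the example, and a plain DFT stands in for the fast Fourier transform.

```python
import cmath
import math


def hamming(n_points):
    """Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    if n_points == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
        for n in range(n_points)
    ]


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames (incomplete tail frames are dropped)."""
    last_start = len(signal) - frame_len
    return [signal[s:s + frame_len] for s in range(0, last_start + 1, hop)]


def dft_magnitude(frame):
    """Magnitude spectrum for bins 0..N/2, computed with a direct DFT."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def leakage_ratio(frame, peak_bin, guard=2):
    """Fraction of spectral energy that falls outside peak_bin +/- guard bins."""
    mags = dft_magnitude(frame)
    total = sum(m * m for m in mags)
    near = sum(m * m for k, m in enumerate(mags) if abs(k - peak_bin) <= guard)
    return 1.0 - near / total


if __name__ == "__main__":
    n = 64
    # Hypothetical tone that does not fall on an exact DFT bin, so it leaks.
    tone = [math.sin(2 * math.pi * 10.5 * i / n) for i in range(n)]
    windowed = [x * w for x, w in zip(tone, hamming(n))]
    print("rectangular leakage:", leakage_ratio(tone, 10))
    print("hamming leakage:    ", leakage_ratio(windowed, 10))
```

Leaving a frame unweighted is the rectangular-window case, so comparing the two leakage values gives a rough feel for why a tapered window produces a cleaner spectrum.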
... Computers have been widely used in language evaluation for language learning as well as speech recognition, and speech recognition technology is an important manifestation in determining the level of language learning. A large amount of language signal data in the language learning process, the complexity of speech variation, and the high dimensionality of speech feature parameters lead to more difficult recognition of speech features [1]. The computational volume of speech evaluation and recognition is too large, which requires higher hardware resources as well as software resources to realize the high-speed processing of massive speech signals [2]. ...
Article
To improve the accuracy of English pronunciation level evaluation, we study a modular English pronunciation level evaluation system based on machine learning. The S3C2440A chip is used as the main processor of the system, and the spoken English recordings are sent to the evaluation module through the speech upload module. In the evaluation module, the pronunciation signal is filtered by the multilayer wavelet feature scale transformation method, and the intonation, speed, pitch, rhythm, and emotion features are decomposed and extracted. The test results show that the misjudgment rate of different mispronunciations is less than 1% when the system is used to evaluate the English pronunciation level, which indicates high evaluation accuracy. Speech recognition theory, speech scoring, and pronunciation correction algorithms are studied in depth, and an assisted learning system based on the AP scoring method and pronunciation resonance peak comparison technology is proposed to address inaccurate pronunciation scoring and the lack of effective feedback when speech recognition technology is applied to oral learning. The English pronunciation training system has achieved all basic functions such as pronunciation following, real-time pronunciation evaluation, and pronunciation correction of English phonemes and works as expected. After testing, the system has achieved high accuracy in pronunciation scoring, and the similarity with experts’ scoring is over 90% for vowel and word pronunciation; the efficiency of pronunciation correction reaches 80%, which can improve learners’ pronunciation level to a certain extent.
... Finally, the effect of quantization error is considered in the whole model framework, and the quantization error generated when thresholding continuous values into binary hash codes is added to the optimization objective to obtain hash codes with better expressiveness. The effectiveness of the framework is validated experimentally based on relevant data sets [7]. ...
Article
Because music feature recognition is difficult, owing to the complex and varied music theory knowledge that music specialization involves, we designed a music feature recognition system based on Internet of Things (IoT) technology. The physical sensing layer of the system places sound sensors at different locations to collect the original music signals and uses a digital signal processor to carry out music signal analysis and processing. The network transmission layer transmits the completed music signals to the music signal database in the application layer of the system. The music feature analysis module of the application layer uses the dynamic time regularization algorithm to obtain the maximum similarity between the test template and the reference template to realize the feature recognition of the music signal and determine the music pattern and music emotion corresponding to the music feature content according to the recognition result. The experimental results show that the system operates stably, can capture high-quality music signals, and can correctly identify music style features and emotion features. The results of this study can meet the needs of composers’ assisted creation and music researchers’ analysis of a large amount of music data, and the results can be further transferred to deep music learning research, human-computer interaction music creation, application-based music creation, and other fields for expansion.
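The template-matching step described in the abstract above can be illustrated with a small sketch. It assumes that "dynamic time regularization" refers to the technique usually called dynamic time warping (DTW); the abstract does not give the algorithm's details, so the function names (`dtw_distance`, `best_template`), the absolute-difference local cost, and the toy one-dimensional templates are all hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences (lower means more similar).

    Assumption: absolute difference as the local cost and the standard
    step pattern (match, insertion, deletion).
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] repeats against b[j-1]
                cost[i][j - 1],      # b[j-1] repeats against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def best_template(test, references):
    """Return the name of the reference template with the smallest DTW cost."""
    return min(references, key=lambda name: dtw_distance(test, references[name]))


if __name__ == "__main__":
    # Hypothetical feature contours; real systems would use multi-dimensional features.
    references = {"rising": [1, 2, 3], "falling": [3, 2, 1]}
    print(best_template([1, 2, 3, 3], references))
```

Maximum similarity corresponds here to minimum accumulated cost, which is why `best_template` takes the minimum rather than the maximum.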
... For the area "Analysis of speech with nature-inspired intelligence," there are two works. The first work shows a system for multilingual audio information management [4], which is based on a scalable architecture of automatic machine learning methodologies for semantic knowledge analysis in complex environments. The accuracies ranged from 68% to 81%, depending on the language. ...
Conference Paper
A signal-to-noise ratio (SNR) estimator is developed to determine the quality of speech recordings entering automatic speech recognition (ASR) systems for transcription in a medical context. Because an SNR value must be obtained from a single signal, without a reference, voice activity detection models are built using a neural network with a hybrid architecture (CNN and BiLSTM) and two training sets, one with data similar to that of the final use case and another with more general examples. An algorithm that uses this information estimates the SNR with a maximum error of 3 dB for signals with an SNR of 0 dB or higher. To define the value above which signals are considered suitable for input to the ASR system, the Word Error Rate (WER) of 1000 automatic transcriptions of laboratory studies is computed and the results are correlated with their respective estimated SNR values. The rejection threshold is set at 15 dB for an expected WER of 10% or less.
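As a rough illustration of the two quantities this abstract relates, the sketch below computes a word error rate with a word-level edit distance and applies a 15 dB acceptance threshold to an already estimated SNR. The function names (`word_error_rate`, `accept_recording`) and the example sentences are hypothetical, the SNR estimator itself is not reproduced, and treating a value of exactly 15 dB as accepted is an assumption.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Row-by-row Levenshtein distance over words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def accept_recording(snr_db, threshold_db=15.0):
    """Accept a recording for ASR when its estimated SNR meets the threshold.

    Assumption: the boundary value itself counts as acceptable.
    """
    return snr_db >= threshold_db


if __name__ == "__main__":
    print(word_error_rate("the patient has a fever", "the patient has fever"))
    print(accept_recording(12.0))
```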
Article
The rural agricultural ecosystem has an important influence on the development of our country’s economy, society and ecological environment. In recent years, our country has paid more and more attention to rural agriculture, and scientific and reasonable management of the agricultural ecosystem is a widely shared concern. To address the management problem of the rural agro-ecosystem, this article uses big data as the research background and constructs an index system for the ecological management of the rural agro-ecosystem based on complex system theory. The experiment uses fertilizer consumption, water pollution degree, pest degree, carbon and nitrogen absorption and agricultural economic benefits in a rural agricultural ecosystem in a certain area as the indicators of the ecological management system. Data mining technology is used to collect and process relevant data from the network, the agricultural ecosystem is analyzed through complex systems theory, and the data for the various indicators are then calculated and analyzed. The final result showed that the consumption of fertilizer in the rural area that introduced the agricultural ecological management system was 70 kg/m² less than that in another rural area, and the water pollution and insect pests were two degrees lower than in the other rural area. From 2019 to 2020, due to the introduction of the agricultural ecological management system, the carbon absorption of the agricultural ecosystem increased to 166 mt and 241 mt, and the nitrogen absorption increased to 1134 mt and 1430 mt. Under the influence of the agricultural ecological management system in this area, the total agricultural economic income of rural D and rural E, which introduced the system, was 12.51 million, which was 1.98 million more than the other three rural areas. It can be seen that the agricultural ecological management system has a positive effect on the rural agricultural ecological system.