Character error rate (CER) comparison of the GridLSTM model and the baseline. Bold values in the table indicate the minimum CER and the highest sensitivity, specificity, and F1 in the experiment.

Source publication
Article
The Recurrent Neural Network (RNN) utilizes dynamically changing time information through time cycles, so it is well suited to tasks with time-sequence characteristics. However, as the number of layers increases, the vanishing gradient problem occurs in the RNN. The Grid Long Short-Term Memory (GridLSTM) recurrent neural network can alleviate th...
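The grid structure is easier to see in code than in prose. Below is a minimal, illustrative PyTorch sketch of a 2-D grid-style cell in which a gated (LSTM) update runs along the depth axis as well as the time axis; the class name Grid2DCell and the exact wiring are assumptions made for illustration and do not reproduce the authors' GridLSTM model.

import torch
import torch.nn as nn

class Grid2DCell(nn.Module):
    """Illustrative 2-D grid-style cell (not the paper's exact GridLSTM).

    The hidden states carried along the time and depth axes are concatenated
    and fed to one LSTMCell per axis, so the depth direction also receives a
    gated, additive memory update instead of a plain nonlinear squash.
    """

    def __init__(self, hidden_size: int):
        super().__init__()
        self.time_cell = nn.LSTMCell(2 * hidden_size, hidden_size)
        self.depth_cell = nn.LSTMCell(2 * hidden_size, hidden_size)

    def forward(self, time_state, depth_state):
        h_time, c_time = time_state      # state carried along the time axis
        h_depth, c_depth = depth_state   # state carried along the depth axis
        shared = torch.cat([h_time, h_depth], dim=-1)
        time_state = self.time_cell(shared, (h_time, c_time))
        depth_state = self.depth_cell(shared, (h_depth, c_depth))
        return time_state, depth_state

# One update at a single grid point (batch of 8, hidden size 64):
cell = Grid2DCell(hidden_size=64)
zeros = torch.zeros(8, 64)
time_state, depth_state = cell((zeros, zeros), (zeros, zeros))

Because the depth direction also has a forget-gated cell state, the signal travelling down through the layers is no longer repeatedly squashed, which is the property the abstract appeals to.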

Context in source publication

Context 1
... In this experiment, each recurrent layer was set to two layers, and the output layer was a fully connected layer. The speech recognition evaluation indicators of the different models are shown in Table 1. GridLSTM has the lowest CER on the LibriSpeech corpus and the highest overall F1 score. ...
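For reference, CER is the Levenshtein (edit) distance between the recognized character sequence and the reference transcript, divided by the reference length. A minimal, self-contained sketch (the function name is illustrative, not taken from the paper):

def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = (substitutions + insertions + deletions) / len(reference)."""
    m, n = len(reference), len(hypothesis)
    # dp[i][j] = edit distance between reference[:i] and hypothesis[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n] / max(m, 1)

# character_error_rate("grid lstm", "grud lstm") -> 1/9, i.e. about 0.111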

Similar publications

Article
Long Short-Term Memory (LSTM) is a modification of the Recurrent Neural Network (RNN) designed to deal with the issues of exploding and vanishing gradients and makes it possible to manage long-term information. To tackle these problems, modifications were made to the RNN by providing memory cells that can store information for long periods. In this...
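The gating mechanism summarized above follows the standard textbook LSTM equations, reproduced here for orientation rather than taken from this particular article:

f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)          (forget gate)
i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)          (input gate)
o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)          (output gate)
\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)   (candidate memory)
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t    (cell state)
h_t = o_t \odot \tanh(c_t)                         (hidden state)

Because the cell state c_t is updated additively and scaled by the forget gate, gradients can flow across many time steps without being repeatedly squashed, which is what tackles the exploding and vanishing gradient issues described above.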
Conference Paper
Data-driven techniques, especially artificial intelligence (AI) based deep learning (DL) techniques, have attracted more and more attention in the manufacturing sector because of the rapid growth of the industrial Internet of Things (IoT) and Big Data. A tremendous amount of research on DL techniques has been applied to machine health monitoring, but still...
Preprint
Recurrent neural networks are key tools for sequential data processing. Existing architectures support only a limited class of operations that these networks can apply to their memory state. In this paper, we address this issue and introduce a recurrent neural module called Deep Memory Update (DMU). This module is an alternative to well-established...
Article
Demand prediction for on-demand food delivery (ODFD) is of great importance to the operation and transportation resource utilization of ODFD platforms. This paper addresses short-term ODFD demand prediction using an end-to-end deep learning architecture. The problem is formulated as a spatial–temporal prediction. The proposed model is composed of c...
Article
Forecasting sales is a prerequisite for successful management; it avoids overproduction and therefore costly storage. The goal of this work is to provide a reliable sales forecasting model to help organizations make strategic and operational decisions. Indeed, this work accounts for the profitability of the use of techniques based on artificial i...

Citations

... BLSTM, shown in Algorithm 5, is an extension of the traditional LSTM that improves model performance on various problems. BLSTM processes the input in two directions [21][22]. MLSTM, which appears in Algorithm 6, is a hybrid network that combines the multiplicative RNN (MRNN) with LSTM's gating framework. By using connections from the intermediate state in the MRNN to each gate in the LSTM, it increases flexibility [23][24]. ...
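For the bidirectional part (BLSTM), PyTorch exposes the behaviour directly; the sketch below is purely illustrative (the sizes are arbitrary) and does not cover the multiplicative MLSTM variant:

import torch
import torch.nn as nn

# Bidirectional LSTM: the sequence is processed forward and backward and the
# two hidden states are concatenated, so every position sees both past and
# future context (hence the feature size doubles at the output).
blstm = nn.LSTM(input_size=40, hidden_size=128, num_layers=2,
                batch_first=True, bidirectional=True)

x = torch.randn(8, 100, 40)          # dummy (batch, time, features) input
outputs, (h_n, c_n) = blstm(x)
print(outputs.shape)                 # torch.Size([8, 100, 256])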
Article
Localization in wireless sensor networks (WSNs) plays a crucial role in various applications that rely on spatial information. This paper introduces the Multi-Strategy Fusion for Localization model, which integrates optimization techniques (ABO, DSA, EHO, and KNN) and neurocomputing techniques (BP, MTLSTM, BILSTM, and Autoencoder) to enhance localization accuracy in WSNs. The work is divided into three phases: data collection, model building, and implementation. The first and the last are carried out in the field, while the second is done in the laboratory. The three phases involve a few general steps.

(1) The Data Collection Phase includes four steps: (a) Deploy three anchors at known locations, forming an equilateral triangle. (b) Each anchor starts broadcasting its location. (c) Using an ordinary sensor, the RSSI of each anchor is measured at every possible location where the signal of the three anchors can reach it. (d) Data is logged to a CSV file containing the measuring location and the RSSI of the three anchors and their locations.

(2) The Model Building Phase includes (a) preprocessing of the collected data and (b) building a model based on the optimization and neurocomputing techniques.

(3) The Implementation Phase includes five steps: (a) Convey the sensors to the target field. (b) Manually deploy anchors according to the distribution plan. (c) Randomly deploy ordinary sensors. (d) Each ordinary sensor starts in initialization mode; when receiving a signal from three anchors, a sensor computes its location and stores it for future use, after which it turns into operational mode. (e) A sensor in operational mode attaches the location to sensed data each time it sends it to the sink or neighbors according to routing protocols (routing is not considered in this study).

ABO and DSA optimization techniques show similar performance, with lower Mean Squared Error (MSE) values compared to EHO and KNN. ABO and DSA also have similar Mean Absolute Error (MAE) values, indicating lower average absolute errors. BP emerges as the top performer among the neurocomputing techniques, demonstrating better accuracy with lower MSE and MAE values compared to MTLSTM, BILSTM, and Autoencoder. Finally, the Multi-Strategy Fusion for Localization model offers an effective approach to enhance localization accuracy in wireless sensor networks. The paper focuses on addressing the correlation between wireless device positions and signal intensities to improve the localization process. The obtained results and provided justification emphasize the significance and value of the model in the field of localization in WSNs. The model represents a valuable contribution to the development of localization techniques and to improving their accuracy to meet the needs of various applications. It opens up opportunities for utilization in diverse domains such as environmental monitoring, healthcare, smart cities, and disaster management, enhancing its practical applications and significance.
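As a rough illustration of the KNN branch of such a fingerprinting pipeline, the RSSI readings of the three anchors can be regressed directly onto coordinates; the data values, neighbour count, and weighting below are invented for the example and are not the paper's configuration:

import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Fingerprint database from the data collection phase: each row holds
# (RSSI_anchor1, RSSI_anchor2, RSSI_anchor3) measured at a known (x, y).
rssi_db = np.array([[-52.0, -61.0, -70.0],
                    [-55.0, -58.0, -67.0],
                    [-60.0, -54.0, -63.0],
                    [-66.0, -50.0, -59.0]])
positions = np.array([[1.0, 2.0],
                      [2.0, 2.5],
                      [3.0, 3.0],
                      [4.0, 3.5]])

knn = KNeighborsRegressor(n_neighbors=3, weights="distance")
knn.fit(rssi_db, positions)

# A sensor in operational mode estimates its position from a fresh reading.
print(knn.predict([[-58.0, -56.0, -65.0]]))   # approx. [[x, y]]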
... However, as the error is backpropagated to the lower layers, the gradient often gets smaller, and eventually the weights at the lower layers remain essentially unchanged. This is the issue commonly suffered by RNNs, known as the vanishing gradient (Fei and Tan, 2018). The RNN does not work effectively for learning long-term patterns or for constructing deep networks due to the vanishing and exploding gradient problems. ...
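A back-of-the-envelope illustration of the effect: if the effective per-step recurrent Jacobian norm is below one, the backpropagated gradient decays geometrically with sequence length (the 0.9 and the 50 steps below are arbitrary numbers, not values from the cited work):

steps = 50
grad = 1.0
recurrent_factor = 0.9           # effective per-step Jacobian norm < 1
for _ in range(steps):
    grad *= recurrent_factor     # repeated multiplication during BPTT
print(f"gradient magnitude after {steps} steps: {grad:.6f}")
# ~0.005154: the update reaching the earliest layers is almost zero.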
... In addition, a well-trained discriminator network may confidently reject the samples produced by the generator due to this problem. The challenge of optimizing the generator is compounded by the fact that the discriminator does not share any information, which can damage the overall learning capacity of the model [81]. • Mode collapse: Mode collapse poses a critical challenge in GAN training, leading to the generator consistently producing identical outputs. ...
Article
Advancements in technology have improved human well-being but also enabled new avenues for criminal activities, including digital exploits like deep fakes, online fraud, and cyberbullying. Detecting and preventing such activities, especially for law enforcement agencies needing photo profiles for covert operations, is imperative. Yet, conventional methods relying on authentic images are hindered by data protection laws. To address this, alternatives like generative adversarial networks, stable diffusion, and pixel recurrent neural networks can generate synthetic images. However, evaluating synthetic image quality is complex due to the varied techniques. Metrics are crucial, offering objective measures to compare techniques and identify areas for enhancement. This article underscores metrics’ significance in evaluating synthetic images produced by generative adversarial networks. By analyzing metrics and datasets used, researchers can comprehend the strengths, weaknesses, and areas for further research on generative adversarial networks. The article ultimately enhances image generation precision and control by detailing dataset preprocessing and quality metrics for synthetic images.
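One widely used metric of this kind is the Fréchet Inception Distance (FID); the sketch below computes its standard closed form from pre-extracted feature statistics (feature extraction itself is omitted, and this is a generic textbook implementation rather than code from the article):

import numpy as np
from scipy import linalg

def frechet_distance(mu_real, cov_real, mu_fake, cov_fake):
    """FID = ||mu_r - mu_f||^2 + Tr(C_r + C_f - 2 (C_r C_f)^(1/2)),
    evaluated on feature means and covariances of real and synthetic images."""
    diff = mu_real - mu_fake
    covmean = linalg.sqrtm(cov_real @ cov_fake)
    if np.iscomplexobj(covmean):       # numerical noise can introduce tiny
        covmean = covmean.real         # imaginary parts; drop them
    return float(diff @ diff + np.trace(cov_real + cov_fake - 2.0 * covmean))

# Toy check with random 8-dimensional "features":
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 8))
fake = rng.normal(loc=0.3, size=(500, 8))
print(frechet_distance(real.mean(0), np.cov(real, rowvar=False),
                       fake.mean(0), np.cov(fake, rowvar=False)))

Lower values mean the synthetic feature distribution sits closer to the real one.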
... It is structured with three gates (forget, input, and output gates) instead of plain neurons, and it retains the context of the data. This strengthens long-term dependencies; equivalently, the vanishing gradient problem of the DNN is alleviated by the network structure [5]. Specifically, data flow is controlled by the three gates, centered on the memory cell idea shown in Fig. 7. ...
Article
In recent years, the occurrence of local torrential rain has increased, and an accurate prediction model is required. Atmospheric water vapor measurement based on the zenith total delay (ZTD) produced by the precise point positioning processing employed in the Global Navigation Satellite System (GNSS) is effective for forecasting. Recently, there has been a lot of research into applying deep learning to forecasting; however, it has not yet become practical. In this paper, we introduce a Long Short-Term Memory (LSTM) neural network to effectively model the discrete time series of the rain rate, with the ZTD and the meteorological sensing data serving as explanatory variables. A key message from this analysis is that a deep learning model has the capability to follow climate variation even for a short-term event, despite its spatial locality.
... However, the LSTM-based model is only taking advantage of adjacent contexts (Yu et al., 2019). We assumed that if no correlation between the adjacent covariates existed, it might face the vanishing gradient problem during backward propagation and further affect the predictive performance (Fei and Tan, 2018). To mitigate this issue, the second reorganized method (referred to as LGD-LSTM) was proposed. ...
Article
The accurate and cost-effective mapping of soil texture is essential for agricultural development and environmental activities. Soil texture exhibits high spatial heterogeneity which poses challenges for recent Digital Soil Mapping (DSM) methods in achieving accurate predictions. Feature engineering methods, extensively used to capture complex soil-forming relationships and enhance prediction accuracy, often involve labor-intensive processes. Additionally, the engineered "discrete" feature cannot reflect interactions between environmental covariates or dependencies. To address the challenges, this study proposes a novel Local-Global Dependency Long Short-Term Memory model (LGD-LSTM) to enhance soil texture predictions at various soil depths. Firstly, a covariate reorganization method has been devised to generate multiple sets of input. Subsequently, several Long Short-Term Memory models (LSTM) have been employed to extract the interdependencies among the covariates. Finally, predictions are generated using a fully-connected layer. Cross-validation was conducted within this experiment to analyze prediction accuracy: the average explained variation (R²) ranged from 0.66 to 0.73, and the root mean square error (RMSE) ranged from 6.52% to 10.89%. The results indicated that the LGD-LSTM model offers distinct advantages over other digital soil mapping methods, including Random Forests (RF), Convolutional Neural Network (CNN), and the standard Long Short-Term Memory model (LSTM). In summary, this LGD-LSTM method demonstrates superior performance with relatively high accuracy, ensuring its applicability in effectively representing spatial variations in soil texture. Furthermore, it presents a novel option for DSM applications, enhancing the field's methodology and potential impact.
... However, they suffer from several limitations due to the sequential processing of input data and the challenges associated with BPTT, especially while processing datasets with long dependencies. The training process of LSTM and GRU models also suffers from vanishing and exploding gradient problems [28,69]. While processing long sequences, the gradient descent algorithm (using BPTT) may not update the model parameters as the gradient information is lost (either approaches zero or infinity). ...
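A common mitigation for the exploding side of this problem is gradient-norm clipping during BPTT; the model, data, and threshold in the sketch are placeholders chosen only to make it runnable:

import torch
import torch.nn as nn

# Tiny illustrative setup: a one-layer LSTM with a linear regression head.
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
head = nn.Linear(16, 1)
params = list(lstm.parameters()) + list(head.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

x = torch.randn(4, 200, 8)                    # long dummy sequences
target = torch.randn(4, 1)

output, _ = lstm(x)
loss = nn.functional.mse_loss(head(output[:, -1]), target)
loss.backward()

# Rescale the gradient if its global norm exceeds 1.0, so one long sequence
# cannot blow up the update; the vanishing side is addressed architecturally
# (gating), not by clipping.
torch.nn.utils.clip_grad_norm_(params, max_norm=1.0)
optimizer.step()
optimizer.zero_grad()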
Article
Transformer architectures have widespread applications, particularly in Natural Language Processing and Computer Vision. Recently, Transformers have been employed in various aspects of time-series analysis. This tutorial provides an overview of the Transformer architecture, its applications, and a collection of examples from recent research in time-series analysis. We delve into an explanation of the core components of the Transformer, including the self-attention mechanism, positional encoding, multi-head, and encoder/decoder. Several enhancements to the initial Transformer architecture are highlighted to tackle time-series tasks. The tutorial also provides best practices and techniques to overcome the challenge of effectively training Transformers for time-series analysis.
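The self-attention core referred to in the tutorial reduces to a few lines; this is the standard scaled dot-product form rather than code taken from the tutorial itself:

import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5    # (batch, seq_q, seq_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)          # attention weights
    return weights @ v                               # (batch, seq_q, d_v)

q = k = v = torch.randn(2, 10, 64)
print(scaled_dot_product_attention(q, k, v).shape)   # torch.Size([2, 10, 64])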
... Every part maps each input through the hidden state to evaluate the output sequentially, utilizing the concept of the Back Propagation algorithm by computing errors from the different layers to the output layer. To overcome the problems arising from the RNN, the Long Short-Term Memory (LSTM) was introduced, which supports long-duration memory capacity to store elements and supports long-term dependencies [45,46]. ...
Article
Sugarcane (Saccharum officinarum L.) is one of the principal sources of sugar and is also known as the main cash crop of India. About 19.07% of the world's total sugar requirement is fulfilled by India. Traditionally, statistical approaches have been utilized for crop yield prediction, which is tedious and time-consuming. In this direction, the present work proposed a novel hybrid CNN-Bi-LSTM_CYP deep learning-based approach that includes convolutional layers to extract the relevant spatial information, passed in sequence to Bi-LSTM layers that recognize the phenological long-term and short-term bidirectional dependencies in the dataset to predict the sugarcane crop yield. The experimentation was performed and validated on the historical dataset from 1950 to 2019 for the major sugarcane-producing states of India. The preliminary results showed that the CNN-Bi-LSTM_CYP method performed well (RMSE: 4.05, MSE: 16.40) in comparison to traditional Stacked-LSTM (RMSE: 8.8, MSE: 77.79), ARIMA (RMSE: 5.9, MSE: 34.80), GPR (RMSE: 10.1, MSE: 103.3), and Holt-Winters time-series (RMSE: 9.9, MSE: 99.7) techniques. The study concluded that the predicted sugar yield has a minimal relative error with respect to the ground truth data for the CNN-Bi-LSTM_CYP approach, proving the proposed model's efficiency.
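A generic sketch of the convolution-into-bidirectional-LSTM pattern described above (layer sizes, kernel widths, and the class name are placeholders, not the authors' CNN-Bi-LSTM_CYP configuration):

import torch
import torch.nn as nn

class CNNBiLSTMRegressor(nn.Module):
    """Conv1d layers extract local patterns from the input sequence, a
    bidirectional LSTM models long- and short-term dependencies in both
    directions, and a linear head outputs the yield estimate."""

    def __init__(self, n_features: int = 6, hidden: int = 64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.bilstm = nn.LSTM(32, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, x):                    # x: (batch, time, n_features)
        z = self.conv(x.transpose(1, 2))     # Conv1d expects (batch, channels, time)
        out, _ = self.bilstm(z.transpose(1, 2))
        return self.head(out[:, -1])         # predict from the last time step

print(CNNBiLSTMRegressor()(torch.randn(8, 70, 6)).shape)   # torch.Size([8, 1])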
... Due to the vanishing gradient problem, the RNN performs less effectively in terms of prediction than the DNN, whereas the ability of the RNN to memorize previous inputs results in a better prediction than the CNN models. Fei et al. [42] reported that the increase in error in the RNN is owing to the vanishing gradient, consistent with this study. The mechanical strength of the dissimilar explosive clads predicted by the RNN is 4% less accurate than that of the DNN model (R² = 0.9146). ...
Article
In this study, the tensile and shear strengths of aluminum 6061-differently grooved stainless steel 304 explosive clads are predicted using deep learning algorithms, namely the convolutional neural network (CNN), deep neural network (DNN), and recurrent neural network (RNN). The explosive cladding process parameters, such as the loading ratio (mass of the explosive/mass of the flyer plate, R: 0.6–1.0), standoff distance, D (5–9 mm), preset angle, A (0°–10°), and groove in the base plate, G (V/Dovetail), were varied in 60 explosive cladding trials. The deep learning algorithms were trained in a Python environment using the tensile and shear strengths acquired from 80% of the experiments, using trial and previous results. The remaining experimental findings are used to evaluate the developed models. The DNN model successfully predicts the tensile and shear strengths with an accuracy of 95% and less than 5% deviation from the experimental result.
... The RNN enhances the hidden layer with a feedback connection. As a result, the RNN can access information from all past instances through the recurrent feedback connection [71]. However, the RNN can only transmit information to its nearby successors. ...
... They discovered that the gradient explosion and vanishing gradient problems are the main causes of the difficulty in RNN training. In order to address this issue, [72] presented a long short-term memory recurrent neural network, which Alex Graves recently enhanced and advanced [71]. Long-term dependency is a common issue with traditional RNNs when attempting to connect past information to new information [73]. ...
... where x_t is the input vector at any time t, h_{t-1} is the previous hidden state, and h_t is the cell output. Through a particular gating unit, the LSTM keeps the error at a more constant level. Through an iterative process in which errors are propagated back and weights are changed by gradient descent, the memory unit learns when to permit data to enter, exit, or be erased [71]. Figure 2.19 illustrates the LSTM architecture [76]. ...
Thesis
In recent years, there has been great interest in the field of Anomalous Event Detection Systems (AEDS). Providing security for a person is a key issue in every community nowadays, and classical closed-circuit television is considered insufficient since it needs a person to stay awake and constantly monitor the cameras. For these reasons, the development of an automated security system that can identify suspicious activities in real time and quickly aid victims is required. This work attempts to design and implement an AEDS that would improve detection accuracy and reduce processing time. Four different models were proposed for fast and accurate detection of anomalies. The first model (AEDS-SUM15-DL) is based on combining the features of every 15 consecutive frames to generate new vectors that represent the anomalous features. The second model (AEDS-H265-SUM15-DL) is based on the combination of H265 with the suggested framework (AEDS-SUM15-DL). The third model (AEDS-ML-PCA-DL) is based on the combination of Machine Learning (ML) Principal Component Analysis (PCA) and Deep Learning (DL). The fourth model (AEDS-ML-GA-DL) is based on the combination of Machine Learning (ML) Genetic Algorithm (GA) and Deep Learning (DL). The models were trained and tested on the challenging UCF-Crime dataset. The AUC values obtained for the four models were 93.61%, 90.16%, 94.21%, and 94.58%, respectively, while the classification accuracies were 86.06%, 79.81%, 88.46%, and 89.90%, respectively. Comparing the results of the suggested models with previous work shows that the model adopting the genetic algorithm gives the best results, while the results of the model adopting H265 were comparatively poor.
... The residual dense block is composed of three groups of convolution units, global residuals, and dense connections. Through the residual dense block, the problems of vanishing and exploding gradients can be alleviated [21], which greatly helps to separate the negative pressure wave signal from noise and improves the noise reduction performance of the model. Its structure is shown in Figure 4. ...
... Taking the 12 km pipeline as an example, the first station receives the negative pressure wave at 10:45:40, and the last station receives it at 10:45:45. According to localization formula (21), the following can be obtained: ...
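The excerpt does not reproduce formula (21) itself. For orientation, a commonly used two-sensor negative pressure wave localization formula (not necessarily identical to equation (21) of the cited paper) is

x = (L + v \Delta t) / 2,  with  \Delta t = t_1 - t_2,

where x is the distance of the pig from the upstream (first) station, L is the distance between the two stations, v is the propagation speed of the negative pressure wave, and t_1 and t_2 are the arrival times at the upstream and downstream stations, respectively.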
Article
In the pigging operation of an oil pipeline, the traditional pigging localization method can only achieve fixed-point detection and cannot track continuously. To ensure the normal operation of the pigging process and to track and locate the pig in a timely manner, a tracking and localization method for an oil pipeline pig based on a noise-reduction autoencoder is proposed. First, using the advantages of U-Net (U-shaped Convolutional Neural Network) multi-scale feature extraction, combined with coding and residual dense blocks, a U-Net combined with Residual Dense Block U-shaped Network (RDBU) was proposed for noise reduction, and then real-time localization was calculated based on the negative pressure wave localization formula for pigs. The experimental results show that, compared with the traditional method, the SNR of the RDBU is increased by 0.9 dB and the root mean square error is reduced by 14.92%. The denoising algorithm in this paper can effectively eliminate the noise from the negative pressure wave signal of pigs.