A Bidirectional Long Short-Term Memory Unit [47]

Source publication
Article
Full-text available
With the advancement of cloud computing technologies, there is an ever-increasing demand for the maximum utilization of cloud resources. It increases the computing power consumption of the cloud’s systems. Consolidation of cloud’s Virtual Machines (VMs) provides a pragmatic approach to reduce the energy consumption of cloud Data Centers (DC). Effec...

Context in source publication

Context 1
... processes inputs in the forward direction, and the other unit processes inputs in the backward direction. It is an extension of the traditional LSTM that preserves both past and future information. This substantially increases the information accessible to the network, so the network can better understand the context. Fig. 6 shows the block diagram of the Bi-LSTM network. The forward-layer LSTM in a Bi-LSTM at time step t takes the input data sequence $X_t$ and the past hidden state $\overrightarrow{h}_{t-1}$, and from them computes the present hidden state $\overrightarrow{h}_t$. The hidden state value is updated and processed following the internal equation ...
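For reference, a standard way to write the Bi-LSTM hidden-state computation sketched above (generic notation; the cited paper's exact equations may differ slightly):

```latex
\overrightarrow{h}_t = \mathrm{LSTM}_{\mathrm{fwd}}\!\left(X_t,\; \overrightarrow{h}_{t-1}\right), \qquad
\overleftarrow{h}_t = \mathrm{LSTM}_{\mathrm{bwd}}\!\left(X_t,\; \overleftarrow{h}_{t+1}\right), \qquad
h_t = \left[\, \overrightarrow{h}_t \,;\, \overleftarrow{h}_t \,\right]
```

The final hidden state concatenates the forward and backward states, which is what gives the network access to both past and future context at every time step.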

Similar publications

Preprint
Full-text available
Virtual Machine (VM) instance price prediction in cloud computing is an emerging and important research area. VM instance’s price prediction is used for different purposes such as reducing energy consumption, maintaining Service Level Agreement (SLA), and balancing workload at cloud data centers. In this paper, we propose a Seasonal Auto-Regressive...
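The model name in this abstract is truncated; assuming a seasonal ARIMA-style formulation along the lines described, a minimal forecasting sketch with statsmodels could look as follows (the price series is synthetic and the model orders are placeholders, not the authors' settings):

```python
# Minimal seasonal-ARIMA forecasting sketch for a VM instance price series.
# Synthetic hourly data; (p,d,q)(P,D,Q,s) orders are illustrative placeholders.
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(0)
hours = pd.date_range("2024-01-01", periods=24 * 14, freq="h")
prices = pd.Series(0.10 + 0.02 * np.sin(2 * np.pi * hours.hour / 24)
                   + rng.normal(0, 0.005, len(hours)), index=hours)

model = SARIMAX(prices, order=(1, 1, 1), seasonal_order=(1, 1, 1, 24))
result = model.fit(disp=False)

forecast = result.forecast(steps=24)   # predicted prices for the next 24 hours
print(forecast.head())
```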

Citations

... An efficient MapReduce scheduler was presented in [18] to improve resource utilization by assigning the right mix of tasks to each VM. A hybrid Recurrent Neural Network (RNN) was developed in [19] to predict virtual machine workload for task scheduling. However, an energy-efficient resource allocation algorithm was not designed. ...
Article
Full-text available
Cloud computing has emerged as a smart and popular paradigm for individuals and organizations to facilitate access to and use of computing resources over the web. With the rapid growth of cloud computing technology, efficiently running big data applications within minimal time has become a significant challenge. In this dynamic and scalable environment, effective resource allocation and task scheduling of big data applications play pivotal roles in optimizing performance, enhancing efficiency, and ensuring cost-effectiveness. In environments involving remote computing, task scheduling is a crucial consideration. In order to effectively accomplish resource-optimal task scheduling and minimize overall task execution time, a novel technique called Multicriteria Generalized Regressive Neural Federated Learning (MGRNFL) is developed to address the particular issues in cloud systems. Tasks from several users arrive at the cloud server at the start of the procedure. The cloud server's job scheduler then uses Multicriteria Federated Learning to carry out resource-optimal task scheduling. Federated learning (FL) is a decentralized machine learning technique that enables model training across several tasks gathered from cloud computing customers. This decentralized approach primarily focuses on learning from datasets to obtain a global model by aggregating the results of local models. The proposed technique involves two steps: local training models and global aggregation models. In the local training model, the task scheduler determines the resource-optimal virtual machine in the cloud server using a Generalized Regression Neural Network (GRNN) based on multicriteria functions of the virtual machine, such as energy, memory, CPU, and bandwidth. Based on these objective functions, resource-efficient virtual machines are determined to schedule multiple user tasks. The locally updated models are then combined and fed into the global aggregation model. The weighted total of the locally updated results is calculated within the global aggregation model. The algorithm iterates through this process until the maximum number of iterations is reached. In order to schedule incoming tasks, the resource-optimal virtual machine is found. Various quantitative criteria are used for the experimental evaluation, including makespan, throughput in relation to the number of tasks, and task scheduling efficiency.
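The global aggregation step described above (a weighted total of locally updated models) can be illustrated with a minimal federated-averaging sketch; weighting by per-client task count is an assumption, and the GRNN training itself is abstracted away:

```python
# Minimal sketch of weighted global aggregation over local model parameters,
# in the spirit of the federated scheme described above (not the authors' code).
import numpy as np

def aggregate(local_params, weights):
    """Weighted average of parameter vectors produced by local training rounds."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()          # normalize so the weights sum to 1
    stacked = np.stack(local_params)           # shape: (n_clients, n_params)
    return (weights[:, None] * stacked).sum(axis=0)

# Example: three local updates, weighted by the number of tasks each client saw.
local_params = [np.array([0.2, 1.0]), np.array([0.4, 0.8]), np.array([0.3, 0.9])]
task_counts = [50, 120, 30]
global_params = aggregate(local_params, task_counts)
print(global_params)
```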
... In 2021, Md. Ebtidaul Karim et al. [42] proposed an integrated prediction model named BHyPreC based on a Recurrent Neural Network architecture. This framework places a Bidirectional LSTM (Bi-LSTM) on top of stacked LSTM and Gated Recurrent Unit (GRU) layers. ...
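A rough Keras sketch of the kind of recurrent stack this citation describes (a Bi-LSTM on top of stacked LSTM and GRU layers); the layer sizes and exact ordering are illustrative guesses, not the BHyPreC authors' configuration:

```python
# Illustrative stack: LSTM and GRU layers with a Bi-LSTM on top, ending in a
# single regression output (e.g., next-step CPU utilization of a VM).
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Input, LSTM, GRU, Bidirectional, Dense

window, n_features = 30, 1          # placeholder input window length and feature count

model = Sequential([
    Input(shape=(window, n_features)),
    LSTM(64, return_sequences=True),
    GRU(32, return_sequences=True),
    Bidirectional(LSTM(16)),
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```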
Article
Full-text available
Cloud Computing (CC) generally exhibits varying workload patterns. This autoscaling feature of CC has been extensively managed through predictive cloud resource management approaches. For this reason, a solitary forecasting model is not sufficient to forecast the various workload patterns of CC web applications. With the intention of developing a proficient CC model, this paper implements a novel cloud workload forecasting model with the efficiency of DL algorithms like the Deep Belief Network (DBN) and the Lion Algorithm (LA). Here, the previous workload models are assessed and trained via DBN, and the prediction efficacy is enhanced by optimizing the DBN's hidden neurons through LA. The proposed workload forecast model for the cloud is accomplished by tuning the significant parameters via the Adaptive Lion Algorithm Assisted-DBN (ALAA-DBN) model. Besides, a well-organized resource provisioning pattern is followed to manage under/over-provisioning issues. The proposed CC model also considers providing better Quality of Service (QoS) to the users without violating the Service Level Agreement (SLA). Finally, the significance of the proposed ALAA-DBN is verified through a comparative analysis with state-of-the-art models.
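The idea of tuning the DBN's hidden-neuron count with a metaheuristic can be sketched generically; the random-search loop below merely stands in for the Lion Algorithm, and the error function is a synthetic placeholder:

```python
# Generic stand-in for metaheuristic tuning of a hidden-layer size:
# sample candidate neuron counts and keep the one with the lowest validation error.
import random

def validation_error(n_hidden):
    # Placeholder objective: in practice, train the predictor with `n_hidden`
    # neurons on a held-out split and return its error (e.g., RMSE).
    return abs(n_hidden - 96) / 100.0     # synthetic error surface for illustration

def tune_hidden_neurons(candidates, iterations=20, seed=0):
    rng = random.Random(seed)
    best_n, best_err = None, float("inf")
    for _ in range(iterations):
        n = rng.choice(candidates)
        err = validation_error(n)
        if err < best_err:
            best_n, best_err = n, err
    return best_n, best_err

print(tune_hidden_neurons(list(range(16, 257, 16))))
```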
... Some state-of-the-art methods were selected to further evaluate the performance of the DSTNW, including ARIMA [3], LSTM [18], and TCN [26]. To achieve a fair comparison, all corresponding parameters used are the authors' recommended ones for each method. ...
... One can find from Table 2 that the proposed method exhibits the best performance among the selected methods ARIMA [3], LSTM [18], and TCN [26]. Compared with these comparative models, the algorithm we propose has better prediction performance and higher prediction accuracy. ...
... ARIMA [3] is a linear model in essence, while the cloud load prediction sequence has nonlinear characteristics. LSTM [18] performs poorly in extracting shallow information and is prone to the vanishing gradient problem, which makes accurate prediction difficult. ...
Article
Full-text available
When resource demand increases and decreases rapidly, container clusters in the cloud environment need to adjust the number of containers in a timely manner to ensure service quality. Resource load prediction has become a prominent challenge with the widespread adoption of cloud computing. A novel cloud computing load prediction method, the Double-channel residual Self-attention Temporal convolutional Network with Weight adaptive updating (DSTNW), has been proposed in order to make the response of the container cluster more rapid and accurate. A Double-channel Temporal Convolution Network model (DTN) has been developed to capture long-term sequence dependencies and enhance feature extraction capabilities when the model handles long load sequences. Double-channel dilated causal convolution has been adopted to replace the single-channel dilated causal convolution in the DTN. A residual temporal self-attention mechanism (SM) has been proposed to improve the performance of the network and focus on features with significant contributions from the DTN. The DTN and SM jointly constitute a double-channel residual self-attention temporal convolutional network (DSTN). In addition, by evaluating the accuracy of single and stacked DSTNs, an adaptive weight strategy has been proposed to assign corresponding weights to the single and stacked DSTNs, respectively. The experimental results highlight that the developed method has outstanding prediction performance for cloud computing in comparison with some state-of-the-art methods. The proposed method achieved an average improvement of 24.16% and 30.48% on the Container dataset and the Google dataset, respectively.
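The dilated causal convolution that the DTN builds on can be sketched in a few lines of PyTorch; this is a generic TCN-style block, not the authors' double-channel design:

```python
# Generic dilated causal 1-D convolution block (TCN-style), for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        # Left-pad so the output at time t depends only on inputs up to time t.
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                     # x: (batch, channels, time)
        x = F.pad(x, (self.pad, 0))           # pad only on the left (the past)
        return self.conv(x)

x = torch.randn(8, 16, 64)                    # 8 series, 16 channels, 64 time steps
block = CausalConv1d(channels=16, dilation=2)
print(block(x).shape)                         # -> torch.Size([8, 16, 64])
```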
... Karim et al. [21] presented a hybrid deep learning model that forecasts the multi-step CPU utilization of virtual machines by combining the nonlinear pattern learning capabilities of LSTM, GRU, and BLSTM with patterns extracted by One Dimensional CNN. As a consequence of its ability to discover temporal patterns in both past and future data, B-LSTM outperforms ARIMA, LSTM, GRU and BLSTM. ...
Article
Full-text available
Host load prediction is essential in computing to improve resource utilization and to achieve service level agreements. However, due to variations in load and the inefficiency of feature extraction, predicting the load on hosts in fog computing is an immense challenge. A predictive model that handles variable load patterns can better estimate future resource needs, which is crucial for capacity planning, service-level goals and energy efficiency. To improve workload prediction accuracy, the proposed framework introduces a time-series-based multivariate ensemble model that uses anomaly detection techniques. In the proposed work, virtual machines are deployed on different platforms and numerous parameters such as CPU utilization, number of cores, RAM, allocated memory, available memory, disk I/O and network I/O are extracted. Inconsistencies may exist in load prediction due to the enormous volume of data, so various anomaly detection techniques are utilized to reduce redundancy in the data. The performance of the proposed ensemble model is compared with various time series models using Mean Absolute Error, Mean Squared Error, Root Mean Squared Error, Mean Absolute Percentage Error (MAPE) and Accuracy. Moreover, the effectiveness of the proposed ensemble model is demonstrated on the generated dataset and its performance is measured with these evaluation metrics. The ensemble model exhibits higher accuracy in workload prediction than the current state-of-the-art models: it achieves the lowest MAPE and provides an accuracy of about 88%.
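The error metrics used in this comparison are straightforward to compute; a small NumPy sketch with the generic formulas (not tied to the paper's dataset):

```python
# Common point-forecast error metrics used in the comparison above.
import numpy as np

def forecast_metrics(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    mse = np.mean(err ** 2)
    rmse = np.sqrt(mse)
    mape = np.mean(np.abs(err / y_true)) * 100   # assumes no zero targets
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}

print(forecast_metrics([10, 12, 14], [11, 12, 13]))
```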
... Benchmarking on the UNSW-NB15 dataset showcased the FT-GCN's commendable efficiency against leading-edge models. Table 1 presents a detailed comparative analysis of the existing research on intrusion detection that we have reviewed, highlighting the unique advantages and identifying the specific challenges encountered in each referenced study [39]. Bias towards normal data, leading to false alarms. ...
Article
Full-text available
In the fast-evolving landscape of digital networks, the incidence of network intrusions has escalated alarmingly. Simultaneously, the crucial role of time series data in intrusion detection remains largely underappreciated, with most systems failing to capture the time-bound nuances of network traffic. This leads to compromised detection accuracy and overlooked temporal patterns. Addressing this gap, we introduce a novel SSAE-TCN-BiLSTM (STL) model that integrates time series analysis, significantly enhancing detection capabilities. Our approach reduces feature dimensionality with a Stacked Sparse Autoencoder (SSAE) and extracts temporally relevant features through a Temporal Convolutional Network (TCN) and Bidirectional Long Short-term Memory Network (Bi-LSTM). By meticulously adjusting time steps, we underscore the significance of temporal data in bolstering detection accuracy. On the UNSW-NB15 dataset, our model achieved an F1-score of 99.49%, Accuracy of 99.43%, Precision of 99.38%, Recall of 99.60%, and an inference time of 4.24 s. For the CICDS2017 dataset, we recorded an F1-score of 99.53%, Accuracy of 99.62%, Precision of 99.27%, Recall of 99.79%, and an inference time of 5.72 s. These findings not only confirm the STL model’s superior performance but also its operational efficiency, underpinning its significance in real-world cybersecurity scenarios where rapid response is paramount. Our contribution represents a significant advance in cybersecurity, proposing a model that excels in accuracy and adaptability to the dynamic nature of network traffic, setting a new benchmark for intrusion detection systems.
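The dimensionality-reduction stage described above can be approximated with a small sparse autoencoder; the single-layer Keras sketch below simplifies the stacked design, and the layer sizes and L1 penalty are placeholders rather than the STL authors' settings:

```python
# Minimal single-layer sparse autoencoder for feature-dimensionality reduction.
from tensorflow.keras import Model, Input, regularizers
from tensorflow.keras.layers import Dense

n_features, n_latent = 42, 16                 # placeholder dimensions

inputs = Input(shape=(n_features,))
encoded = Dense(n_latent, activation="relu",
                activity_regularizer=regularizers.l1(1e-5))(inputs)  # sparsity penalty
decoded = Dense(n_features, activation="linear")(encoded)

autoencoder = Model(inputs, decoded)          # trained to reconstruct its input
encoder = Model(inputs, encoded)              # reused to feed compressed features onward
autoencoder.compile(optimizer="adam", loss="mse")
```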
... Bi-directional Long Short-Term Memory (BiLSTM) replaces the ordinary RNN units in the traditional Bidirectional Recurrent Neural Network (BiRNN) with LSTM units; it is a combination of a forward LSTM and a backward LSTM [18]. The bi-directional network has the capability to learn sequential patterns by considering inputs from both the start and the end of the sequence through its hidden layers. ...
Article
Full-text available
Temperature is an important indicator of climate change, and accurate temperature prediction has important guidance and application value for agricultural production, energy management and disaster warning. Based on the advantages of the CEEMDAN model in effectively extracting the time–frequency characteristics of nonlinear and non-smooth signals, the BO algorithm in optimizing the objective function within a limited number of iterations, and the BiLSTM model in revealing the connection between the current data, the previous data and the future data, a monthly average temperature prediction model based on CEEMDAN–BO–BiLSTM is established and applied to the prediction of the monthly average temperature in Jinan City, Shandong Province. The results show that the constructed CEEMDAN–BO–BiLSTM monthly mean temperature prediction model is feasible; the model has an average absolute error of 1.17, a root mean square error of 1.43 and an average absolute percentage error of 0.31%, which is better than the CEEMDAN–BiLSTM, EMD–BiLSTM and BiLSTM models in terms of prediction accuracy and shows better adaptability; and Friedman's test, together with comparisons of model run speeds, indicates that the model is not over-built and does not add unnecessary complexity. The model provides insights for effective forecasting of monthly mean temperatures.
... Within the realm of regression models, we select three models: linear regression (LR) [39], eXtreme gradient boosting (XGB) [40] and support vector regression (SVR) [35]. Simultaneously, from the domain of neural network models, we leverage the capabilities of four distinct architectures: gated recurrent unit (GRU), long short-term memory (LSTM) [38], recurrent neural network (RNN) and independently RNN (IndRNN) [23,43,44]. ...
Article
Full-text available
Optimizing energy consumption in heterogeneous GPU clusters is of paramount importance to enhance overall system efficiency and reduce operational costs. However, the diversity of GPU types in heterogeneous GPU clusters poses challenges for energy optimization. In this paper, we propose a utilization-prediction-aware energy optimization approach for heterogeneous GPU clusters. We utilize a feature correlation-based method to select feature vectors and improve the prediction accuracy of our GPU utilization prediction model. By constructing a model for energy consumption and combining it with the predicted results, our approach can avoid wasting too much energy on idle GPUs and reduce the energy consumption of the heterogeneous GPU cluster. To validate our approach, experiments are conducted on real Alibaba data. The results show that our model achieved a good performance in predicting the utilization for four types of GPUs, with RMSE, MAE and $R^{2}$ average values of 7.08, 3.81 and 0.93, respectively. Moreover, we calculate the energy consumption of the GPU cluster before and after the adjustment, and estimate the energy savings, which amount to an average of 35.55%.
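The feature correlation-based selection step mentioned in the abstract can be illustrated with pandas; the feature names, synthetic data and 0.3 threshold below are assumptions, not values from the paper:

```python
# Illustrative correlation-based feature selection for a utilization target.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "sm_util":  rng.uniform(0, 100, 500),     # hypothetical candidate features
    "mem_util": rng.uniform(0, 100, 500),
    "power_w":  rng.uniform(50, 300, 500),
})
df["gpu_util"] = 0.7 * df["sm_util"] + 0.3 * df["mem_util"] + rng.normal(0, 5, 500)

target = "gpu_util"
corr = df.corr(numeric_only=True)[target].drop(target)
selected = corr[corr.abs() >= 0.3].index.tolist()   # keep features well correlated with the target
print(selected)                                     # typically ['sm_util', 'mem_util']
```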
... According to Fig. 6, the neurons in a dense layer are connected to all the neurons in the layer above it. This layer is the backbone of most modern neural networks (Karim et al., 2021). Dense-layer neurons perform matrix-vector multiplication on the input they receive from all of the neurons in the preceding layer. ...
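The matrix-vector view of a dense layer described in this passage, in a few lines of NumPy:

```python
# A dense (fully connected) layer is an affine map followed by a nonlinearity.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(8)          # activations from the preceding layer (8 neurons)
W = rng.standard_normal((4, 8))     # one weight row per neuron in this layer
b = np.zeros(4)

y = np.maximum(W @ x + b, 0.0)      # matrix-vector product, bias, ReLU activation
print(y.shape)                      # -> (4,)
```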
Article
Full-text available
Since the advent of malware, it has taken a heavy toll on a world that exchanges billions of data records daily. Millions of people are its victims, and the numbers are not decreasing as the years go by. Malware comes in various types, of which obfuscated malware is a special kind. Obfuscated malware detection is necessary because such malware is not usually detectable and is prevalent in the real world. Although numerous works have already been done in this field, most of them still fall short in some respects, considering the scope for exploration opened up by recent extensions. In addition, the application of hybrid classification models is yet to be popularized in this field. Thus, in this paper, a novel hybrid classification model named MalHyStack has been proposed for detecting such obfuscated malware within the network. The proposed model is built around a stacked ensemble learning scheme, where conventional machine learning algorithms, namely the Extremely Randomized Trees Classifier (ExtraTrees), the Extreme Gradient Boosting (XgBoost) Classifier, and Random Forest, are used in the first layer, followed by a deep learning layer in the second stage. Before utilizing the classification model for malware detection, an optimum subset of features has been selected using Pearson correlation analysis, which improved the accuracy of the model by more than 2% for multiclass classification. It also reduces time complexity by approximately two and three times for binary and multiclass classification, respectively. For evaluating the performance of the proposed model, a recently published balanced dataset named CIC-MalMem-2022 has been used. On this dataset, the overall experimental results show that the proposed model outperforms the existing classification models.
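The stacked-ensemble idea (tree-based base learners followed by a second-stage learner) can be sketched with scikit-learn; the final estimator here is a plain logistic regression standing in for the paper's deep learning layer, and all hyperparameters are placeholders:

```python
# Illustrative two-stage stacking ensemble (scikit-learn), not the MalHyStack code.
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier

base_learners = [
    ("extra_trees", ExtraTreesClassifier(n_estimators=100, random_state=0)),
    ("xgboost", XGBClassifier(n_estimators=100, eval_metric="logloss")),
    ("random_forest", RandomForestClassifier(n_estimators=100, random_state=0)),
]

stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),  # stand-in for the DL meta-layer
    cv=5,
)
# stack.fit(X_train, y_train); stack.predict(X_test)  # X/y would come from CIC-MalMem-2022 features
```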
... It is equipped with a gating mechanism and is an improved variant of the RNN. It was designed to mitigate the vanishing gradient problem that occurs in standard RNNs (Karim et al. 2021). Compared with the LSTM method, the GRU operates with fewer parameters. ...
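For reference, one common form of the GRU gating equations the passage alludes to (conventions vary slightly between papers):

```latex
z_t = \sigma\!\left(W_z x_t + U_z h_{t-1} + b_z\right), \qquad
r_t = \sigma\!\left(W_r x_t + U_r h_{t-1} + b_r\right), \\
\tilde{h}_t = \tanh\!\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right), \qquad
h_t = \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t
```

Because there are only two gates and no separate cell state, a GRU layer has fewer parameters than an LSTM layer of the same width.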
Article
Full-text available
Nowadays, the eye disease that widely causes visual impairment in humans is Diabetic Retinopathy (DR). The advanced stage of the disorder leads to complete vision loss and creates complex situations for treatment, so it is important to treat prolonged diabetes at an initial stage. The main reason for DR is the uncontrolled growth of blood glucose levels in the eye; if it reaches a severe level, bleeding occurs in the eye. The lesions generated due to DR are treated based on fundus images. A significant factor in DR is the presence of high sugar in the blood, which damages the retina. Therefore, proper screening for DR is essential to prevent it from affecting the blood vessels all over the body; DR also affects the blood vessels of the eye and paves the way for new blood vessels to grow there. Therefore, a novel hybrid oppositional fire-fly modified 1D bidirectional recurrent (HOF-M1DBR) method is proposed to detect DR from fundus images accurately. The Messidor-1 and APTOS-2019 datasets are used, and the fundus images first undergo processing such as denoising, smoothing, cropping, and resizing. The M1DBR is used to validate the accuracy and optimize the weight function using the Opposition-Based Learning-FireFly (OBL-FF) algorithm. Thus, DR is accurately predicted and classified from the fundus images, and the four levels are distinguished from the extracted features. The validation of the proposed method is performed based on accuracy, precision, recall, and F1-score. The experimental results revealed that the proposed HOF-M1DBR attained an accuracy of 98.9% on the Messidor-1 and APTOS-2019 datasets, thus improving the performance, whereas the existing DCNN-PCA-FF, DNN-MSO, SI-GWO, and DCNN-EMF methods achieved lower performance of 79.9%, 87%, 83%, and 89.1%, respectively.
... Whether it is a virtual machine or a container, there will be repetitive patterns in its workload [16], [17]. In comparison to virtual machines, the granularity of container resource allocation is finer, and the startup and destruction costs are smaller [18], resulting in container load fluctuating more randomly and the amount of training data collected being smaller. ...
... The applicability of our method in virtual machine load prediction is as follows. First, both virtual machine load prediction and container load prediction are time series prediction, and virtual machine load has periodic characteristics like container load [16], [17]. Second, the amount of training data for virtual machine load prediction is relatively large. ...
Article
Full-text available
The emergence of containers dramatically simplifies and facilitates the development and deployment of applications. More and more enterprises deploy their applications on the container cloud platform. For cloud service providers, an effective container workload prediction method is a must to achieve efficient utilization of cloud resources. However, the existing methods are either rarely based on container load characteristics or cannot make accurate real-time predictions. In this paper, we propose a Docker container workload proactive prediction method using a hybrid model combining triple exponential smoothing and long short-term memory (LSTM), which not only can capture both short-term and long-term dependencies in container resource time series but also smooth the container resource utilization data. In order to improve the prediction accuracy of the hybrid model, those two single models are combined using the mean absolute percentage error (MAPE) method. Besides, we design a real-time Docker workload prediction system for the hybrid model. Our experiments show that the mean absolute percentage error of the hybrid model is decreased by an average of 3.24%, 12.18%, 13.42%, 43.45%, and 50.69% compared with the LSTM, the triple exponential smoothing, ES-ARIMA, Bayesian Ridge Regression and BiLSTM with an acceptable time and computational cost overhead.
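One plausible reading of the MAPE-based combination described above is to weight each component forecast by the inverse of its validation MAPE; the sketch below uses statsmodels' Holt-Winters implementation for the triple exponential smoothing component and a synthetic stand-in for the LSTM forecast:

```python
# Sketch of blending two forecasts with weights inversely related to their MAPE.
import numpy as np
from statsmodels.tsa.holtwinters import ExponentialSmoothing

def mape(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

history = np.abs(np.random.default_rng(0).normal(50, 5, 200)) + 1  # synthetic CPU series
holdout = history[-24:]                                            # validation window

# Component 1: triple (Holt-Winters) exponential smoothing.
hw = ExponentialSmoothing(history[:-24], trend="add", seasonal="add",
                          seasonal_periods=24).fit()
hw_forecast = hw.forecast(24)

# Component 2: placeholder for the LSTM forecast over the same horizon.
lstm_forecast = holdout * (1 + np.random.default_rng(1).normal(0, 0.05, 24))

# Weight each model by the inverse of its validation MAPE, then blend.
errors = np.array([mape(holdout, hw_forecast), mape(holdout, lstm_forecast)])
weights = (1 / errors) / (1 / errors).sum()
combined = weights[0] * hw_forecast + weights[1] * lstm_forecast
print(weights, combined[:3])
```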