Fig 2 - uploaded by Katarina Grolinger
Random forest structure [16].

Source publication
Article
During building operation, a significant amount of energy is wasted due to equipment and human-related faults. To reduce waste, today's smart buildings monitor energy usage with the aim of identifying abnormal consumption behaviour and notifying the building manager to implement appropriate energy-saving procedures. To this end, this research propo...

Similar publications

Article
Insider trading is a form of criminal behavior in securities markets and has existed since the birth of those markets. As of 2018, the Chinese securities market was less than 30 years old; nonetheless, insider trading occurred frequently. In this study, we mainly explore the features of insider trading behavior by studying r...

Citations

... The instance is represented as a coloring matrix where different intensities of colors are assigned based on the significance of deviation. Araya et al. proposed the use of ensemble learning techniques to identify the anomalies in energy systems (Araya et al. 2017). They proposed a collective contextual anomaly detection using sliding window (CCAD-SW). ...
Article
The cooling systems contribute to 40% of overall building energy consumption. Out of which, 40% is wasted because of faulty parts that cause anomalies in the cooling systems. We propose a three-stage, non-invasive part-level anomaly detection technique to identify anomalies in both cooling systems, a ducted-centralized and a ductless-split. We use COTS sensors to monitor temperature and energy without invading the cooling system. After identifying the anomalies, we find the cause of the anomaly. Based on the anomaly, the solution recommends a fix. If there is a technical fault, our proposed technique informs the technician regarding the faulty part, reducing the cost and time needed to repair it. In the first stage, we propose a domain-inspired time-series statistical technique to identify anomalies in cooling systems. We observe an AUC-ROC score of more than 0.93 in simulation and experimentation. In the second stage, we propose using a rule-based technique to identify the cause of the anomaly. We classify causes of anomalies into three classes. We observe an AUC-ROC score of 1. Based on the anomaly classification, we identify the faulty part of the cooling system in the third stage. We use the Nearest-Neighbour Density-Based Spatial Clustering of Applications with Noise (NN-DBSCAN) algorithm with transfer learning capabilities to train the model only once, where it learns the domain knowledge using the simulated data. The trained model is used in different environmental scenarios with both types of cooling systems. The proposed algorithm shows an accuracy score of 0.82 in simulation deployment and 0.88 in experimentation. In the simulation we used both ducted-centralized and ductless-split cooling systems and in the experimentation we evaluated the solution with ductless-split cooling systems. The overall accuracy of the three-stage technique is 0.82 and 0.86 in simulation and experimentation, respectively. 
We observe energy savings of up to 68% in simulation and 42% during experimentation, with a reduction of ten days in the cooling system’s downtime and up to 75% in repair cost.
... Recent research efforts have extensively focused on various methodologies for energy consumption forecasting to facilitate more effective energy management. Notable studies that explore diverse approaches include [3,4,5], highlighting the breadth of investigative work in this area. ...
Chapter
Nowadays, efficient management of energy consumption is crucial for the sustainability of our cities and, ultimately, of our planet. Approaches investigated so far are mostly complex, often based on deep learning, and carry a significant footprint. This study focuses on the importance of using simpler methods to predict energy consumption in smart buildings, emphasizing a methodological approach that prioritizes simplicity, transparency, and computational efficiency, especially when data is scarce. It emphasizes that even the prediction of energy consumption at the scale of a building, which is sometimes ignored due to computational complexity, is feasible and can make a big difference. By using simple analytical models combined with outlier detection, this research shows how valuable insights can still be gained from limited data. The study therefore provides a practical and scalable way to improve energy efficiency and sustainability in buildings, which is a significant contribution to energy management practices.
... The process of detecting such behavior is known as anomaly detection [12]. Anomalies can be categorized into three types: point anomalies, contextual anomalies, and collective anomalies [13]. Data points that deviate significantly from the rest of the dataset are known as point anomalies. ...
... Many different feature extraction techniques have been used by researchers in the domain of smart houses. For example, these features were extracted by [30] and [31], which are the mean of sensor data values in each window (x), the standard deviation of sensor data values in each window (s), the difference between the last and first elements of a sliding window, the first quartile (Q1), the second quartile (Q2), the third quartile (Q3), and the Interquartile Range (IQR). They also extracted some contextual features, including the day of the year, the season, the month, the day of the week, and the hour. ...
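The window statistics and contextual features listed in this excerpt can be sketched as follows. This is a minimal illustration, not the code used in [30] or [31]: the function names, the window size, the use of the sample standard deviation, the inclusive quartile method, and the Northern Hemisphere meteorological season mapping are all assumptions made for the example.

```python
from datetime import datetime
from statistics import mean, quantiles, stdev


def window_features(window):
    """Summary statistics for one sliding window (needs at least 2 values)."""
    q1, q2, q3 = quantiles(window, n=4, method="inclusive")
    return {
        "mean": mean(window),
        "std": stdev(window),            # sample standard deviation (assumed)
        "diff": window[-1] - window[0],  # last element minus first element
        "q1": q1,
        "q2": q2,                        # second quartile, i.e. the median
        "q3": q3,
        "iqr": q3 - q1,
    }


def sliding_windows(values, size, step=1):
    """Yield consecutive windows of `size` readings, advancing by `step`."""
    for start in range(0, len(values) - size + 1, step):
        yield values[start:start + size]


def context_features(ts):
    """Calendar context for a timestamp; the season mapping is an assumption."""
    return {
        "day_of_year": ts.timetuple().tm_yday,
        "season": (ts.month % 12) // 3,  # 0=winter, 1=spring, 2=summer, 3=autumn
        "month": ts.month,
        "day_of_week": ts.weekday(),     # Monday is 0
        "hour": ts.hour,
    }


# Hypothetical sensor readings, used only to exercise the functions.
readings = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
features = [window_features(w) for w in sliding_windows(readings, size=5)]
context = context_features(datetime(2024, 3, 15, 14))
```

Each window yields one feature row; in practice the window statistics and the contextual features for the window's timestamp would be concatenated into a single vector.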
Article
In recent years, the incorporation of sensing technology into residential buildings has given rise to the concept of "smart buildings", aimed at enhancing resident comfort. These buildings are typically part of interconnected neighborhoods sharing common energy sources, which makes energy consumption a critical consideration in decision-making processes. Consequently, optimizing energy usage in smart buildings has posed significant challenges for both enterprises and governments, prompting numerous studies to address this issue. One such challenge is organizing energy usage within neighborhood networks while ensuring user comfort and without exceeding the total energy capacity. In this paper, we present a novel mechanism that predicts the future behavior of each house based on its historical consumption data, generating a weekly schedule annotated with hourly energy usage levels (high, normal, or low) tailored to individual user needs. Additionally, we introduce an incentive-based program that rewards users with bill discounts for adhering to high energy consumption periods. The scheduling process involves extracting features from data and utilizing a genetic algorithm for construction, coupled with dynamic programming to enhance efficiency by storing house features and schedules. This enables rapid provision of suitable schedules for similar houses. Evaluation results demonstrate that the proposed technique achieves an accuracy of 92% and improves the execution time of the optimization algorithm by 26%.
... The authors showed a promising result to resolve the current issues of supervised learning-based outlier detection. Araya et al. [28] proposed a new framework based on collective contextual anomaly detection using a sliding window (CCAD-SW) to identify unusual energy use in smart buildings, which was further enhanced by the Ensemble Anomaly Detection (EAD) framework combining various classifiers. Using real-world data, the EAD was shown to increase the sensitivity of CCAD-SW by 3.6% and decrease false alarms by 2.7%, proving effective in energy monitoring. ...
Article
Outlier detection plays a critical role in building operation optimization and data quality maintenance. However, existing methods often struggle with the complexity and variability of building energy data, leading to poorly generalized and explainable results. To address the gap, this study introduces a novel Vision-based Outlier Detection (VOD) approach, leveraging computer vision models to spot outliers in the building energy records. The models are trained to identify outliers by analyzing the load shapes in 2D time series plots derived from the energy data. The VOD approach is tested on four years of workday time-series electricity consumption data from 290 commercial buildings in the United States. Two distinct models are developed for different usage purposes, namely a classification model for broad-level outlier detection and an object detection model for the demands of precise pinpointing of outliers. The classification model is also interpreted via Grad-CAM to enhance its usage reliability. The classification model achieves an F1 score of 0.88, and the object detection model achieves an Average Precision (AP) of 0.84. VOD is a very efficient path to identifying energy consumption outliers in building operations, paving the way for the enhancement of building energy data quality, operation efficiency, and energy savings.
... Ensemble learning framework [116]; decision tree and SVM-based data analytics [117]; operation behaviour, usage anomaly, theft detection, load anomaly [111] ... we can postpone the executions of delay-insensitive applications until the local renewable energy supply becomes available. Lajevardi et al. [120] suggest the Power Density Efficiency (PDE) metric to provide further insights into energy efficiency and the effectiveness of thermal management. ...
Article
The energy consumption of Cloud–Edge systems is becoming a critical concern economically, environmentally, and societally; some studies suggest data centers and networks will collectively consume 18% of global electrical power by 2030. New methods are needed to mitigate this consumption, e.g. energy-aware workload scheduling, improved usage of renewable energy sources, etc. These schemes need to understand the interaction between energy considerations and Cloud–Edge components. Model-based approaches are an effective way to do this; however, current theoretical Cloud–Edge models are limited, and few consider energy factors. This paper analyses all relevant models proposed between 2016 and 2023, discovers key omissions, and identifies the major energy considerations that need to be addressed for Green Cloud–Edge systems (including interaction with energy providers). We investigate how these can be integrated into existing and aggregated models, and conclude with the high-level architecture of our proposed solution to integrate energy and Cloud–Edge models together.
... Araya et al. [18] focus on the use of Ensemble Learning for anomaly detection in smart building energy consumption. This research is focused on improving electricity usage efficiency in commercial and industrial environments to reduce the environmental impact and is not specifically aimed at detection of malicious activity, but makes use of the same strategies for anomaly detection as IDS. ...
Article
The swift embrace of Industry 4.0 paradigms has led to the growing convergence of Information Technology (IT) networks and Operational Technology (OT) networks. Traditionally isolated on air-gapped and fully trusted networks, OT networks are now becoming more interconnected with IT networks due to the advancement and applications of IoT. This expanded attack surface has led to vulnerabilities in Cyber–Physical Systems (CPSs), resulting in increasingly frequent compromises with substantial economic and life safety repercussions. The existing methods for the anomaly detection of security threats typically use simple threshold-based strategies or apply Machine Learning (ML) algorithms to historical data for the prediction of future anomalies. However, due to the high levels of heterogeneity across different CPS environments, minimizing the opportunities for transfer learning, and the scarcity of real-world data for training, the existing ML-based anomaly detection techniques suffer from a poor predictive performance. This paper introduces a hybrid anomaly detection approach designed to identify threats to CPSs by combining the signature-based anomaly detection typically utilized in IT networks, the threshold-based anomaly detection typically utilized in OT networks, and behavioural-based anomaly detection using Ensemble Learning (EL), which leverages the strengths of multiple ML algorithms against the same dataset to increase the accuracy. Multiple public research datasets were used to validate the proposed approach, with the hybrid methodology employing a divide-and-conquer strategy to offload the detection of certain cyber threats to computationally inexpensive signature-based and threshold-based methods using domain knowledge to minimize the size of the behavioural-based data needed for ML model training, thus achieving a higher accuracy over a reduced timeframe. 
The experimental results showed accuracy improvements of 4–7% over those of the conventional ML classifiers in performing anomaly detection across multiple datasets, which is particularly important to the operators of CPS environments due to the high financial and life safety costs associated with interruptions to system availability.
... An autoencoder consists of an encoder that transforms the input data into a hidden representation, while the decoder attempts to reconstruct the input data from the same hidden representation (Fan, et al., 2018; Li, et al., 2017) with a minimum amount of distortion and noise (Baldi, 2012). Due to this characteristic, autoencoders have been used for dimensionality reduction applications, signal reconstruction applications, and anomaly detection applications (Fan, et al., 2018; Araya, et al., 2017). ...
Thesis
This research addresses the challenges in Thermal-Energy-Storage-Air-Conditioning (TES-AC) systems by developing a machine learning model for predicting the necessary water volume for chilling. TES-AC technology, utilizing thermal energy storage tanks, offers substantial benefits such as reduced chiller operation, cost savings, and lower carbon emissions. However, determining the optimal chilled water volume poses challenges. The primary objective is to design a machine learning model leveraging Multilayer Perceptron (MLP) for predicting water load, incorporating input variables like weather forecasts, day of the week, and occupancy data. The study validates the impact of weather data on chilled water volume, demonstrating its efficacy in prediction. The MLP-based model is fine-tuned through hyperparameter optimization, achieving a remarkable accuracy of 93.4%. The model provides specific water volume ranges, minimizing errors and aiding facility managers in informed decision-making to minimize disruptions. Technical significance lies in the model's flexibility, allowing retraining for diverse TES-AC plants without requiring specific sensor inputs. This adaptability promotes widespread TES AC adoption, contributing to environmentally friendly practices in building construction. The integration of the model into facility management software enhances predictive capabilities while offering common features, ensuring seamless incorporation into existing systems. The research aligns with Sustainable Development Goals, particularly Goals 11, 12, and 13, emphasizing sustainable cities, responsible consumption, and climate action. By focusing on technical problem-solving, addressing challenges, and emphasizing the social significance through Sustainable Development Goals, this research provides a comprehensive solution to enhance TES-AC efficiency, thereby contributing to greener and more sustainable urban environments.
... In the realm of machine learning, the integration of ensemble learning and deep learning has emerged as a transformative paradigm, surpassing the capabilities of traditional algorithms [1][2][3][4][5][6][7]. Ensemble learning, characterized by the amalgamation of diverse base models into a unified framework, stands out for its ability to yield a more potent model that outperforms its individual components. This paradigmatic shift in machine learning is supported by ensemble methods that leverage multiple learning algorithms to achieve predictive performance superior to that of individual algorithms alone [10,15]. This research introduces a novel contribution to the field: an advanced stacked ensemble framework tailored to empower researchers, machine learning enthusiasts, and students engaged in classification tasks, which may be highly time-consuming and rely heavily on the expertise and experience of the individuals involved, thus affecting their accuracy [5]. ...
Article
In this work, we introduce EnsembleForge, a versatile framework designed to streamline machine learning experimentation and simplify classification tasks. Leveraging the stacking ensemble method, EnsembleForge offers an intuitive platform built upon the Scikit-learn library. This framework facilitates seamless model implementation and evaluation, supporting both Randomized and Grid Search for hyperparameter optimization. Our experiments with publicly available datasets demonstrate the ease of use and effectiveness of EnsembleForge in experimenting with various algorithms. With its adaptability and innovation, EnsembleForge showcases promising potential to serve as an asset for researchers and practitioners seeking to achieve optimal model performance in their machine learning endeavors.
... The training mode of base detectors can be divided into two categories: parallel training and sequence training. Araya et al. [25] proposed a parallel anomaly framework for the detection of anomalous energy consumption behaviors generated during the operation of a building, which implements the ensemble of three common anomaly detection algorithms. The common hyperspectral anomaly detection algorithms usually score anomalies by finding a single approximation kernel, a process that is susceptible to anomalous samples. ...
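The idea of combining three independently run base detectors can be sketched as a simple majority vote. This is only an illustration under stated assumptions: the three detectors below (z-score, IQR fences, and median absolute deviation) and their thresholds are hypothetical stand-ins chosen for brevity, not the algorithms used by Araya et al., and the sample readings are made up.

```python
from statistics import mean, median, pstdev, quantiles


def zscore_detector(values, threshold=2.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]


def iqr_detector(values, k=1.5):
    """Flag values outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v < low or v > high for v in values]


def mad_detector(values, threshold=3.5):
    """Flag values whose modified z-score (based on the MAD) is too large."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


def majority_vote(values, detectors):
    """Each detector scores the data on its own; a point is anomalous
    when more than half of the detectors flag it."""
    votes = [detector(values) for detector in detectors]
    return [sum(flags) * 2 > len(detectors) for flags in zip(*votes)]


# Hypothetical hourly consumption readings with one obvious spike.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
flags = majority_vote(readings, [zscore_detector, iqr_detector, mad_detector])
```

Because the detectors do not depend on one another, they could be run concurrently; the vote only needs their final flags.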
Article
Anomaly detection is a highly important task in the field of data analysis. Traditional anomaly detection approaches often strongly depend on data size, structure and features, while introducing the idea of ensemble into anomaly detection can greatly improve the generalization ability. Ensemble-based anomaly detection methods still face some challenges, however, such as data imbalance, time and space demand and the selection of base detectors. To this end, we propose a selective ensemble method for anomaly detection based on parallel learning (SEAD-PL). First, a differentiated stratified sampling method is designed to alleviate the problem of data imbalance. Then, a distributed parallel training frame is built to address the problem of excessive time and space consumption for base detector training. Finally, a clustering-based ensemble selection strategy is introduced to balance the accuracy and diversity of base detectors. Experiments are performed on six datasets, which demonstrate that the proposed method has obvious advantages over four selected methods.