Conference Paper
Occupancy Forecasting using two ARIMA
Strategies
Tiên Dung CAO, Laurent DELAHOCHE, Bruno MARHIC, Jean-Baptiste MASSON.
Keywords: Multiple Logistic Regression, Occupancy Forecast, Time Series Analysis,
Nonlinear regression model, ARIMA.
Abstract
We present an occupancy forecast method in a smart home context based on the exploitation of
environmental measures such as CO2, sound or relative humidity. This article presents our
machine learning algorithm and prediction strategy. It is based on two levels of data exploitation.
The first level is “supervised learning” to obtain past occupancy from sensor measurements. It is
achieved with a multiple logistic regression algorithm. The second level consists in two main
steps. During the first step ARIMA learns and trains the model, using the past occupancy data
from level 1. During the second step ARIMA predicts the future occupancy. The innovative part
of our paper is that we compare two different ARIMA’s (de-seasonalised). The first is the “day-
sequence-time-series” (a serial ARIMA). The second is the “daily-slice-time-series” (a parallel
ARIMA). We conclude by analyzing the performance of our occupancy prediction paradigm.
1 Introduction
The context of our study is energy efficiency. Energy efficiency has been achieved
in recent years by working on the insulation of the building envelope. This strategy has
achieved optimal levels of energy performance. Additional gains are now to be sought
in optimal thermal regulation. The strategy is to permanently adapt the comfort
situation to the living situation. To do this, it is necessary to automatically characterise
the activity of the occupants in the building. In today’s innovative technological design
for smart buildings, the key problem we are faced with is understanding the consumer’s
behaviour. Our occupancy forecast strategy will in future allow for energy savings in
a smart building context. The control/command strategy of the heater will be presented
in an upcoming paper. In this article, we address the principle of our method of
occupancy prediction.
The method of occupancy forecasting exposed in this paper contains one remarkable
contribution: we compute two original ARIMA strategies for the forecast of occupancy.
The first is a Day Sequence Time Series, which is a common process and the second
is a Daily Time Slice Time Series which is an unusual process. The second ARIMA
consists in forecasting the probability of occupancy of just one time slice (30 minutes).
Then, with a loop, we reconstitute a full day by assembling all the time-slice results.
We present a comparative analysis of our two ARIMA models against several criteria
(error, reliability, temporal consistency, etc.). Finally, we propose conclusions and
perspectives for using our prediction algorithms in an intelligent regulation paradigm
in the context of energy saving.
2 Related works
Characterisation of human activity, and the ability to predict it, is a major issue in
many disciplinary fields. Many methods have already been proposed in
the medical field (such as personal assistance), in the energy efficiency field and in
many others. In [1], a complete monitoring architecture is presented, including home
sensors and cloud-based back-end services. In this article, supervised techniques for
behavioural data analysis are proposed using regression methods and ARIMA. By
means of inductive and deductive reasoning, the authors of the article [2] introduce a
framework to detect occupant activity and potentially wasted energy consumption. This
framework consists of three sub-algorithms for action detection, activity recognition
and waste estimation. Unsupervised clustering models are used to detect the actions
that occurred. In paper [3], a new approach to modeling human behaviour patterns is
suggested. The authors use Markov chains to determine an unsupervised model of
human behaviour and to detect the deviation over time. Deviating behaviour is revealed
through data clustering and analysis of associations between clusters and data vectors
representing adjacent time intervals. The activity recognition is also used in [4], which
proposes learning customized structural models for common user activities in order to
predict the trend of energy consumption. The recognition algorithm is based on
recursive structures of user activities obtained from raw sensor readings. Artificial
neural networks (ANN) are used in [5] [6] [7] to manage resident activity recognition
in Smart Homes. The authors in [5] tackle three ANN algorithms for human activity
recognition, namely: Quick Propagation (QP), Levenberg Marquardt (LM) and Batch
Back Propagation (BBP). In the same way, an unsupervised learning strategy is used in
[8] to improve activity recognition in smart environments. In [9] and [10] the Support
Vector Machines (SVM) are used to address the same problem.
3 Theoretical framework
3.1 The Multiple Variables Logistic Regression (MLR):
Logistic Regression is a statistical learning algorithm developed by David Cox in
1958. Its purpose is to reconstruct a qualitative variable Y as a function of one (simple
regression) or several (multiple regression) explanatory variables $X_1, \dots, X_k$. A
detailed discussion of logistic regression (and variants) can be found in the book by
Hastie et al. [11]. The main idea is to express the log-odds as a linear function of the
$X_i$, using equations similar to classical linear regression.
When Y is binary, it suffices to define $p(x) = P(Y = 1 \mid X = x)$ and to assume
that its log-odds is a linear function of the explanatory variables:

$\log \dfrac{p(x)}{1 - p(x)} = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$ (Eq. 1)

where the coefficients $\beta_0, \dots, \beta_k$ are parameters to be estimated.
The MLR finds estimates $\hat\beta$ for the parameters by maximizing the log-
likelihood function $\ell(\beta)$ with the Newton–Raphson iterative method (the solution
has no closed form): at each step, the estimates are updated by

$\beta^{\text{new}} = \beta^{\text{old}} - \left( \dfrac{\partial^2 \ell(\beta)}{\partial\beta \, \partial\beta^{\mathsf T}} \right)^{-1} \dfrac{\partial \ell(\beta)}{\partial\beta}$ (Eq. 2)
Once we have estimated $\hat\beta$, we obtain an estimated probability function:

$\hat p(x) = \dfrac{\exp(\hat\beta_0 + \hat\beta_1 x_1 + \dots + \hat\beta_k x_k)}{1 + \exp(\hat\beta_0 + \hat\beta_1 x_1 + \dots + \hat\beta_k x_k)}$ (Eq. 3)

This $\hat p(x)$ is calculated at each instant of measurement, giving the posterior
probability of occupation as a function of time.
Its interpretation is as follows: when it is close to 1, the measurements indicate
that the occupant is present; when it is close to 0, the measurements indicate that
the occupant is absent; when it is close to 0.5, the measurements could be
associated with either presence or absence.
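The Newton–Raphson update of Eq. 2 (equivalently, iteratively reweighted least squares) can be sketched as follows. This is a minimal NumPy illustration of the principle, not the authors' implementation; the function names are ours.

```python
import numpy as np

def fit_logistic_newton(X, y, n_iter=25):
    """Multiple logistic regression fitted by Newton-Raphson (cf. Eq. 2).

    X: (n, k) matrix of explanatory variables, y: (n,) binary labels.
    Returns the estimated coefficients, intercept first.
    """
    Xd = np.column_stack([np.ones(len(X)), X])   # add an intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))     # current probabilities (Eq. 3)
        W = p * (1.0 - p)                        # Newton weights p_i (1 - p_i)
        H = Xd.T @ (Xd * W[:, None])             # negative Hessian: X' W X
        # Newton step: beta <- beta + (X' W X)^-1 X' (y - p)
        beta += np.linalg.solve(H, Xd.T @ (y - p))
    return beta

def predict_proba(beta, X):
    """Estimated occupancy probability p-hat(x) of Eq. 3."""
    Xd = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-Xd @ beta))
```

In the paper's setting, the columns of `X` would hold the sensor measurements (e.g. CO2, relative humidity) and `y` the PIR labels used for supervision.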
3.2 Pre-processing with STL
The STL method decomposes a time series into the sum of three components:
seasonal, trend, and residual (or remainder) using Loess (non-linear regression
technique) [15]. An STL decomposition of our data is shown in Figure 1 below. Here,
the seasons correspond to days. We call de-seasonalised data the residual component.
It will be handed to several ARIMA strategies (§ 4.1). Finally, we will add the trend
and the seasonal components back to the ARIMA results to obtain occupancy
probability forecasts: we will call this operation re-seasonalising the data.
FIGURE 1: STL DECOMPOSITION
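The principle of the decomposition can be sketched with a crude additive decomposition: a centred moving average for the trend and per-position averages for the seasonal component. This is only a stand-in for the Loess-based STL of [15], with hypothetical function names.

```python
import numpy as np

def decompose_additive(y, period):
    """Naive additive decomposition y = seasonal + trend + residual.

    Rough stand-in for STL: the trend is a moving average and the
    seasonal component is the mean deviation per position in the cycle
    (real STL estimates both with Loess; see [15]).
    """
    n = len(y)
    kernel = np.ones(period) / period
    trend = np.convolve(y, kernel, mode="same")          # moving-average trend
    detrended = y - trend
    # average each position of the cycle, then centre around zero
    seasonal = np.array([detrended[i::period].mean() for i in range(period)])
    seasonal -= seasonal.mean()
    seasonal_full = np.tile(seasonal, n // period + 1)[:n]
    residual = y - trend - seasonal_full                 # the de-seasonalised part
    return seasonal_full, trend, residual
```

In the paper, `period` would be one day (48 half-hour slices), and the `residual` component is what gets handed to the ARIMA strategies.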
3.3 ARIMA
An ARIMA (Auto Regressive Integrated Moving Average) [12] model is a
statistical model for analyzing and forecasting time series data. Adopting an ARIMA
model for a time series assumes that the underlying process that generated the
observations is an ARIMA process, i.e. stationary after differencing [13], and that the
data will follow the same general trends and patterns as in the past [14]. This may seem
obvious, but it helps to motivate the need to confirm the assumptions of the model in
the raw observations and in the residual errors of forecasts from the model.
FIGURE 2: ARIMA MODEL
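Two building blocks of ARIMA, differencing (the "I") and autoregression (the "AR"), can be illustrated in miniature. The sketch below is ours, with a simple least-squares AR(1) fit standing in for a full ARIMA estimation.

```python
import numpy as np

def difference(y, d=1):
    """Apply d rounds of first-order differencing (the 'I' in ARIMA),
    used to make a non-stationary series stationary."""
    for _ in range(d):
        y = np.diff(y)
    return y

def fit_ar1(y):
    """Least-squares estimate of phi in the AR(1) model
    y_t = phi * y_{t-1} + e_t (the simplest 'AR' component)."""
    prev, curr = y[:-1], y[1:]
    return float(prev @ curr / (prev @ prev))
```

Differencing a linear trend, for instance, yields a constant series, which is why checking stationarity on the differenced data (and on the residuals) is part of validating the model assumptions.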
4 Proposed Method
We propose an occupancy prediction method in a smart home context based on the
exploitation of the measurements of the sensors disseminated in the building. Our
paradigm is based on two consecutive steps integrating the learning process:
- to determine occupancy probability (MLR) based on sensor data
- to forecast occupancy in the near future (STL-ARIMA)
4.1 Forecasting step using ARIMA class model by de-seasonalising data with 2
strategies
Strategy 1 (Serial): “Day Sequence” Time Series Processes Model
This model handles the time series in a classical way: the probabilities of occupancy
form a single sequence treated by ARIMA. Here, we assume that whole days will
follow the same general trends and patterns as in the past. However, we separate the
weekdays from the weekends, and work independently on the two resulting samples (in
one, Fridays are followed by Mondays, and in the other one, Sundays are followed by
Saturdays). We implement our new “weekday” database and the two types of seasonal
variables in the Day Sequence Time Series process (database=[2016,1] (weekday and
30 minute step phases)) and we forecast one day ahead (48 steps of 30 minutes each)
with the STL-ARIMA function. The STL-ARIMA can be written as:

$\hat y_{t+j} = \ell_t + j\,b_t + s_{t+j-m} + \varepsilon_{t+j}$ (Eq. 4)

$m = 48$ (days sub-season: one slice per 30 minutes)
$\varepsilon$: errors
$j$: number of steps ahead forecast
$t$: our benchmark time
$\ell$: level
$b$: trend
$s$: seasonal
Strategy 2 (Parallel): “Daily Time Slice” Time Series Processes Model.
This model handles the time series in an innovative way: we define 48 time slices
per day (each 30 minutes long) and then form a sample for each time slice. We designed
this model to take advantage of the regularity per time slices on multiple days (the
occupant's “habits”). Hence, 48 instances of ARIMA are performed on shorter
sequences than in Strategy 1. For instance, one ARIMA handles only the probabilities
of presence for the time-slice 8:00 to 8:30 am, each data point coming from a different
day. Therefore, we use the same database as in Strategy 1, converted into a probability
matrix [42x48] that corresponds to 42 days and 48 slices of time per day. This strategy
can be seen as 48 “parallel” ARIMAs, whereas Strategy 1 consists of 48 “serial”
ARIMAs. We also forecast one day ahead with the STL-ARIMA function, but here it
just corresponds to one step in time (one day) for each of the 48 slices (30 minutes).
This can be written as:

$\hat y^{(i)}_{t+1} = \ell^{(i)}_t + b^{(i)}_t + s^{(i)}_{t+1-m} + \varepsilon^{(i)}_{t+1}$ (Eq. 5)

$m$ (weekdays season)
$\varepsilon$: errors
$i$: index of the time slice
$t$: our benchmark time
$\ell$: level
$b$: trend
$s$: seasonal
4.2 Implementation algorithm
To determine occupancy (the variable of interest), we use data from Netatmo(c) and
infrared sensors disseminated in the environment: we have relative humidity (Hr%),
CO2 (ppm), and infrared measurements (PIR 0/1) to determine whether or not the
occupant is present. We obtain the probability of occupancy by supervised learning,
fitting PIR as a function of the others with multiple logistic regression (MLR). Then,
we aim to compare and/or combine two forecast algorithms based on ARIMA models,
differing by the strategy for reassembling the time samples: the “Day Sequence” time
series” (48 serial ARIMAs) and the “Daily Time Slice” time series” (48 parallel
ARIMAs).
The prediction data are reorganised (split) in order to feed both ARIMAs
(§ 4.1). At the end of the process, the parallel ARIMA forecasts are merged together in
order to obtain an entire day; the serial ARIMA provides a day forecast directly. Figure
3 illustrates the sequence of computations involved in this method.
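The split-and-merge organisation of the two strategies can be outlined as follows. This is only a structural sketch with hypothetical names: a trivial forecaster (last day for the serial case, per-slice mean for the parallel case) stands in for the STL-ARIMA of § 4.1.

```python
import numpy as np

def serial_forecast(probs, n_slices=48):
    """'Day Sequence' (serial) strategy: a single model over the whole
    sequence forecasts the next 48 half-hour steps. Here a seasonal-naive
    rule (repeat the last observed day) stands in for STL-ARIMA."""
    return np.asarray(probs[-n_slices:], dtype=float)

def parallel_forecast(probs, n_slices=48):
    """'Daily Time Slice' (parallel) strategy: reshape the occupancy
    probabilities into a [days x slices] matrix, forecast each slice
    independently (here: its mean over past days), then merge the 48
    one-step forecasts back into an entire day."""
    days = len(probs) // n_slices
    matrix = np.asarray(probs[: days * n_slices], dtype=float).reshape(days, n_slices)
    return matrix.mean(axis=0)   # one forecast per 30-minute slice
```

With the paper's data, `probs` would be the 42 days of MLR occupancy probabilities, giving a [42 x 48] matrix in the parallel case.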
FIGURE 3: PROPOSED ALGORITHM
5 Results and Discussion
5.1 The raw data in the Learning phase
Our perception system is composed of four sources. For each source, the sampling
rate of the raw data is 5 minutes, and the inputs are almost synchronous. The
sensors’ data include room temperature (°C), CO2 level (ppm), relative hygrometry
(% Hr) and passive infrared (PIR, 0/1), as shown in Figure 4. This dataset covers
the period stretching from 1 January to 28 February 2017. We consider that this time
range is sufficiently long to evaluate the occupancy behaviour.
FIGURE 4: THE RAW DATA FOR 1 DAY
The raw dataset is used to train and test a classification in order to determine
occupancy probability. The PIR data is only used for the MLR (Multiple Variable
Logistic Regression) classification, to supervise (training) and to control the estimation
(testing). The purpose of this classification is to replace all raw data by a new dataset
that represents the occupancy probability as a function of time.
5.2 The time series data (Occupancy probability)
Figure 5 shows the time series data produced by the learning phase (MLR).
FIGURE 5: THE TIME SERIES DATA, MLR RESULTS (OCCUPANCY PROBABILITY)
Because we are using past data to predict future data, we must assume that the
data will follow the same general trends and patterns as in the past; this general
statement holds for most training data and modelling. The rolling mean and standard
deviation appear to change over time, so some de-trending and seasonality removal
is needed. Applying a log transformation and first-order differencing makes the data
more stationary over time, and therefore suitable for our ARIMA models.
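The two checks described above, eyeballing rolling statistics and then stabilising the series, can be sketched as follows. The function names and the epsilon guard are ours; the transformations (log, then first-order differencing) are those named in the text.

```python
import numpy as np

def rolling_stats(y, window):
    """Rolling mean and standard deviation, used to eyeball whether the
    series drifts over time (a symptom of non-stationarity)."""
    means = np.array([y[i:i + window].mean() for i in range(len(y) - window + 1)])
    stds = np.array([y[i:i + window].std() for i in range(len(y) - window + 1)])
    return means, stds

def stationarise(p, eps=1e-6):
    """Log transform followed by first-order differencing, as described
    above. eps guards against log(0) for probabilities at the boundary."""
    return np.diff(np.log(np.clip(p, eps, None)))
```

On an exponentially growing series, for example, this pipeline yields a constant differenced series, which is exactly the stabilising effect sought before fitting ARIMA.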
5.3 Forecasting Probabilities
In Figure 6 below, the dataset covers the period from 01/01/17 to 28/02/17, and we
forecast the next day’s results (the 48-step period, 30 minutes per step) with the 2
strategies described in § 4.1 (grey and orange curves). To assess the accuracy of the
forecast, we use as reference the output of the MLR classification of a known day
(01/03/2017, blue curve). All values are re-seasonalised.
[Data sample underlying Figure 6: MLR occupancy probabilities at 5-minute intervals,
01/01/2017 00:01 to 01:56, ranging from about 0.33 to 0.47.]
FIGURE 6: OCCUPANCY FORECASTING PROBABILITIES, BOTH ARIMAS, 1 DAY
The STL-ARIMA forecasting with the “Day Sequence” process (Serial) tends to
overestimate the occupancy probability and is smoother. The STL-ARIMA forecasting
with the “Daily Time Slice” process (Parallel) is jagged: sometimes it overestimates,
sometimes it underestimates. Both manage to anticipate the rise of the occupancy
probability, but a little too soon (time unit 34 instead of 39, i.e. about 2 h 30 too early
in real time).
The difference in smoothness is not very surprising, since the Serial strategy
corresponds to a single autoregressive model whereas the Parallel strategy corresponds
to 48 intertwined models: in the Serial, the auto-regression equation actually uses
successive data (previous half-hours); in the Parallel, each forecast point is obtained as
a function of points more distant in time (previous days).
In Figure 7 below, we report only the Serial strategy forecast (previous grey curve),
with the associated confidence interval. In Figure 8 below, we report only the Parallel
strategy forecast (previous orange curve), with the associated confidence interval.
FIGURE 7: STL-ARIMA (SERIAL) FORECASTING WITH 95% CONFIDENCE INTERVAL
FIGURE 8: STL-ARIMA (PARALLEL) FORECASTING WITH 95% CONFIDENCE INTERVAL
Globally, all confidence intervals are quite narrow, which means that our horizon
of forecast (1 day) is suitable. Perhaps more surprisingly, the sizes of these intervals are
very similar with both strategies. The Serial strategy learns with more data (2 long
subsamples) but its forecast horizon is farther (48 time units); the Parallel strategy
learns with fewer data (48 short subsamples) but its forecast horizon is nearer (1 time
unit). It seems that the influences of these two factors (sample size, time horizon)
counterbalance each other.
In Figure 9 below, we report several statistical indicators that aim to assess the
performance of our two strategies on the test day (01/03/2017). We also report the
performance of a more naïve ARIMA without the preliminary STL step (Auto-
ARIMA).
FIGURE 9: RESULTS OF STATISTICAL ERRORS
Most indicators are similar among the three methods. Some notable exceptions are
the MAE (mean absolute error), the MPE (mean percentage error) and the MAPE (mean
absolute percentage error). Both our strategies perform better than the naive one
according to the MAE, and our Parallel method performs better than the others
according to the MPE and the MAPE. It seems that our innovative strategy has real
qualities and deserves interest.
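The three indicators on which the strategies differ have standard definitions, sketched below (names are ours; `actual` is the MLR reference of the test day, `forecast` the ARIMA output). Note that MPE, being signed, can hide compensating errors, which is why MAE and MAPE are reported alongside it.

```python
import numpy as np

def mae(actual, forecast):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(actual, float) - np.asarray(forecast, float))))

def mpe(actual, forecast):
    """Mean percentage error (signed: over- and under-estimation cancel)."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean((a - f) / a) * 100.0)

def mape(actual, forecast):
    """Mean absolute percentage error."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs((a - f) / a)) * 100.0)
```

Both percentage metrics divide by `actual`, so they are undefined when the reference probability is exactly zero; in practice the MLR outputs stay strictly between 0 and 1.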
6 Conclusion and Perspective
In this article, we have proposed to deal with occupancy forecasting in a Smart
Building context. Occupancy forecasting allows a smart control of HVAC devices in
order to save energy and optimise comfort. We presented a forecasting strategy of
occupancy mainly based on four steps. First, from direct data measurements (CO2, PIR,
Hr, Temp), we define an occupancy probability based on MLR classification. Then, we
remove the seasonal component of the time series of occupancy through the STL-
method. The third step predicts the temporal signal (occupancy) with two ARIMA
strategies: one is the “Day Sequence” (Serial) and the other is the “Daily Time Slice”
(Parallel). Finally, we add the seasonal component back.
The cautious reader will have noticed that in Figure 7, the forecasts are negative
between 8 and 11 units of time. This cannot represent a valid probability, and is due
to the fact that ARIMA works with unconstrained real values. We plan to solve this
problem by using the ARIMA strategies on the log-odds instead of the probabilities.
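The planned fix amounts to a change of scale around the ARIMA step: forecast on the log-odds, then map back. A minimal sketch (function names and the epsilon clipping are ours):

```python
import numpy as np

def to_log_odds(p, eps=1e-6):
    """Map probabilities to log-odds before running ARIMA. Clipping by
    eps keeps boundary values 0 and 1 finite on the log-odds scale."""
    p = np.clip(p, eps, 1.0 - eps)
    return np.log(p / (1.0 - p))

def from_log_odds(z):
    """Inverse (logistic) transform: any real-valued ARIMA forecast on
    the log-odds scale maps back to a probability strictly in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))
```

Since ARIMA works with unconstrained real values, forecasting `to_log_odds(p)` and applying `from_log_odds` to the result guarantees that no forecast leaves the (0, 1) interval.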
Globally, both ARIMA strategies give suitable results with low uncertainties. The
“Daily Time Slice” forecast is more dynamic than the “Day Sequence” one, but has
similar uncertainties, at least for a 1-day horizon. It is well known that as the forecast
horizon increases, the confidence intervals’ size tends to rise. Our two strategies might
exhibit a difference in the speed of this size increase. This question will be addressed
in future works.
Bibliography

[1] N. Mora, G. Matrella and P. Ciampolini, “Cloud-Based Behavioral Monitoring in Smart Homes,” Sensors, vol. 18, no. 6, 1951, doi:10.3390/s18061951, 15 June 2018.

[2] S. Ahmadi-Karvigh, A. Ghahramani, B. Becerik-Gerber and L. Soibelman, “Real-time activity recognition for energy efficiency in buildings,” Applied Energy, vol. 211, pp. 146-160, 2018.

[3] J. Lundström, E. Järpe and A. Verikas, “Detecting and exploring deviating behaviour of smart home residents,” Expert Systems With Applications, vol. 55, pp. 429-440, 2016.

[4] P. Cottone, S. Gaglio, G. Lo Re and M. Ortolani, “User activity recognition for energy saving in smart homes,” Pervasive and Mobile Computing, vol. 16, pp. 156-170, 2015.

[5] H. Danaei Mehr, H. Polat and A. Cetin, “Resident Activity Recognition in Smart Homes by Using Artificial Neural Networks,” 2016 4th International Istanbul Smart Grid Congress and Fair (ICSG), Istanbul, Turkey, doi:10.1109/SGCF.2016.7492428, 20-21 April 2016.

[6] N. Twomey, T. Diethe, I. Craddock and P. Flach, “Unsupervised learning of sensor topologies for improving activity recognition in smart environments,” Neurocomputing, vol. 234, pp. 93-106, https://doi.org/10.1016/j.neucom.2016.12.049, 2017.

[7] A. Fleury, M. Vacher and N. Noury, “SVM-Based Multimodal Classification of Activities of Daily Living in Health Smart Homes: Sensors, Algorithms, and First Experimental Results,” IEEE Transactions on Information Technology in Biomedicine, vol. 14, no. 2, March 2010.

[8] M. Fahim, I. Fatima, S. Lee and Y.-K. Lee, “Activity Recognition Based on SVM Kernel Fusion in Smart Home,” in Computer Science and its Applications, Lecture Notes in Electrical Engineering, vol. 203, pp. 283-290, October 2012.

[9] J. Vanus, J. Belesova, R. Martinek et al., “Monitoring of the daily living activities in smart home care,” Human-centric Computing and Information Sciences, vol. 7, https://doi.org/10.1186/s13673-017-0113-6, 30 December 2017.

[10] J. Park, K. Jang and S.-B. Yang, “Deep neural networks for activity recognition with multi-sensor data in a smart home,” 2018 IEEE 4th World Forum on Internet of Things (WF-IoT), pp. 155-160, 2018.

[11] T. Hastie, R. Tibshirani and J. Friedman, The Elements of Statistical Learning, 2nd ed., New York: Springer, 2009, pp. 119-135.

[12] J. Brownlee, “A Gentle Introduction to SARIMA for Time Series Forecasting in Python,” 17 August 2018. [Online]. Available: https://machinelearningmastery.com/sarima-for-time-series-forecasting-in-python/. [Accessed June 2019].

[13] R. Nau, “Stationarity and differencing,” Forecasting home page, September 2014. [Online]. Available: https://people.duke.edu/~rnau/411diff.htm. [Accessed June 2019].

[14] T. Srivastava, “A Complete Tutorial on Time Series Modeling in R,” Analytics Vidhya, 16 December 2015. [Online]. Available: https://www.analyticsvidhya.com/blog/2015/12/complete-tutorial-time-series-modeling/. [Accessed June 2019].

[15] R. B. Cleveland, W. S. Cleveland, J. E. McRae and I. J. Terpenning, “STL: A seasonal-trend decomposition procedure based on Loess,” Journal of Official Statistics, vol. 6, no. 1, pp. 3-33, 1990.
... The study's parameters included floor space, building type, and measurement unit, and the Deep Neural Network (DNN) models were trained using training data to ensure efficient performance. To provide a control scheme for the ventilation systems [22] and HVAC systems in smart buildings, the transfer learning model was integrated with a convolutional neural network (CNN) and recurrent neural network (RNN). ...
... To achieve this, it is important to automatically characterize the activities of the building's residents. The significant challenge in today's new technical design for smart buildings is understanding customer behaviors [9]. In the future, our occupancy prediction approach will guarantee energy savings in a smart building environment. ...
Article
Full-text available
In this work, we provide a smart home occupancy prediction technique based on environmental variables such as CO2, noise, and relative temperature via our machine learning method and forecasting strategy. The proposed algorithms enhance the energy management system through the optimal use of the electric heating system. The Long Short-Term Memory (LSTM) neural network is a special deep learning strategy for processing time series prediction that has shown promising prediction results in recent years. To improve the performance of the LSTM algorithm, particularly for autocorrelation prediction, we will focus on optimizing weight updates using various approaches such as Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). The performances of the proposed methods are evaluated using real available datasets. Test results reveal that the GA and the PSO can forecast the parameters with higher prediction fidelity compared to the LSTM networks. Indeed, all experimental predictions reached a range in their correlation coefficients between 99.16% and 99.97%, which proves the efficiency of the proposed approaches.
... From an academic point of view, these methods can be divided into three categories: statistical analysis, machine learning and deep learning. The statistical models include various methods such as Markov chain (MC) [7], exponential smoothing [8] and autoregressive integrated moving average (ARIMA) [9]. The machine learning models consist of three methodologies: Decision Tree (DT), Support Vector Machine (SVM) and Artificial Neural Network (ANN) [10]. ...
Article
Full-text available
With the steep rise in the development of smart grids and the current advancement in developing measuring infrastructure, short term power consumption forecasting has recently gained increasing attention. In fact, the prediction of future power loads turns out to be a key issue to avoid energy wastage and to build effective power management strategies. Furthermore, energy consumption information can be considered historical time series data that are required to extract all meaningful knowledge and then forecast the future consumption. In this work, we aim to model and to compare three different machine learning algorithms in making a time series power forecast. The proposed models are the Long Short-Term Memory (LSTM), the Gated Recurrent Unit (GRU) and the Drop-GRU. We are going to use the power consumption data as our time series dataset and make predictions accordingly. The LSTM neural network has been favored in this work to predict the future load consumption and prevent consumption peaks. To provide a comprehensive evaluation of this method, we have performed several experiments using real data power consumption in some French cities. Experimental results on various time horizons show that the LSTM model produces a better result than the GRU and the Drop-GRU forecasting methods. There are fewer prediction errors and its precision is finer. Therefore, these predictions based on the LSTM method will allow us to make decisions in advance and trigger load shedding in cases where consumption exceeds the authorized threshold. This will have a significant impact on planning the power quality and the maintenance of power equipment.
Chapter
The accurate estimation of heat energy performance in buildings is critical for optimizing energy demand and supply. Non-residential properties have predictable operating patterns in principle, incorporating these patterns into simulations of energy consumption can help estimate building energy use. In this work we develop Long-Short Term Memory (LSTM) Sequence to Sequence and Gated Recurrent Unit (GRU) architectures, which are composed of Dropout, Repeat Vector, Time-distributed and Graph Convolution layers. We have conducted a rigor comparative study on the structures and hyper parameters using the national grid data, then use the learnt models for the energy demand site management undertaken in a laboratory environment.KeywordsLSTMGRUenergy consumption predictiontime series data analysis
Article
Full-text available
Environmental sensors are exploited in smart homes for many purposes. Sensor data inherently carries behavioral information, possibly useful to infer wellness and health-related insights in an indirect fashion. In order to exploit such features, however, powerful analytics are needed to convert raw sensor output into meaningful and accessible knowledge. In this paper, a complete monitoring architecture is presented, including home sensors and cloud-based back-end services. Unsupervised techniques for behavioral data analysis are presented, including: (i) regression and outlier detection models (also used as feature extractors for more complex models); (ii) statistical hypothesis testing frameworks for detecting changes in sensor-detected activities; and (iii) a clustering process, leveraging deep learning techniques, for extracting complex, multivariate patterns from daily sensor data. Such methods are discussed and evaluated on real-life data, collected within several EU-funded projects. Overall, the presented methods may prove very useful to build effective monitoring services, suitable for practical exploitation in caregiving activities, complementing conventional telemedicine techniques.
Article
Full-text available
One of the key requirements for technological systems used to support independent living for seniors in their home environment is the monitoring of activities of daily living (ADL), their classification, and the recognition of seniors' routine daily patterns and habits in Smart Home Care (SHC). This study describes experiments that use temperature, CO2, and humidity sensors together with microphones to monitor daily living activities. The first part of the paper describes the use of CO2 concentration measurements for detecting and monitoring room occupancy in SHC. The second part proposes an implementation of an Artificial Neural Network (ANN) based on the Levenberg–Marquardt algorithm (LMA) for detecting human presence in an SHC room, using predictive calculation of CO2 concentration from measurements of indoor and outdoor temperature (Ti, To) and relative air humidity (rH). Based on long-term (1 month) monitoring of operational and technical functions (unregulated, uncontrolled) in an experimental Smart Home (SH), the LMA network was trained on data from the CO2, temperature, and humidity sensors with the aim of indirectly predicting CO2, thereby eliminating the CO2 sensor from the measurement process. Within this experiment, the input parameters of the neural network and the number of neurons for the LMA were optimised on the basis of the Root Mean Squared Error, the correlation coefficient (R), and the ANN training time. With the trained ANN, a strictly controlled short-term (11 h) experiment was carried out without the CO2 sensor. The experimental results verified high method accuracy (>95%) in both the short-term and long-term experiments for the ANN trained on the 1.6.2015–30.6.2015 data; for the ANN trained on the 1.2.2014–27.2.2014 data, lower accuracy (>60%) was obtained.
The original contribution is the verification of a low-cost method for detecting human presence in the real operating environment of SHC. The third part of the paper describes the practical implementation of voice control of operational and technical functions over the KNX bus in SHC by means of the in-house developed application HESTIA, intended for both the desktop and mobile versions of the Windows 10 operating system. The application can be configured for any building equipped with the KNX bus system. The voice control implementation is an in-house solution; no third-party software is used. The use of the voice communication application in SHC was demonstrated experimentally in combination with CO2 measurement for ADL monitoring.
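The CO2-from-climate regression at the heart of this approach can be sketched with a least-squares fit. The example below uses SciPy's `curve_fit`, whose default solver is the Levenberg–Marquardt algorithm, to fit a linear stand-in for the trained network on synthetic (Ti, To, rH) data; the coefficients and data are invented for illustration and do not reproduce the paper's ANN.

```python
import numpy as np
from scipy.optimize import curve_fit  # default method is Levenberg-Marquardt

def co2_model(X, a, b, c, d):
    """Linear stand-in for the trained network: CO2 predicted from
    indoor/outdoor temperature and relative humidity."""
    ti, to, rh = X
    return a * ti + b * to + c * rh + d

# Synthetic training data with an assumed linear CO2 relationship
rng = np.random.default_rng(0)
ti = rng.uniform(18, 24, 200)   # indoor temperature (deg C)
to = rng.uniform(0, 15, 200)    # outdoor temperature (deg C)
rh = rng.uniform(30, 60, 200)   # relative humidity (%)
co2 = 20.0 * ti + 2.0 * to + 5.0 * rh + 400.0 + rng.normal(0, 5, 200)

params, _ = curve_fit(co2_model, (ti, to, rh), co2)  # LM least squares
predicted = co2_model((ti, to, rh), *params)
rmse = np.sqrt(np.mean((co2 - predicted) ** 2))
```

Once such a model is trained, the measured CO2 sensor can in principle be dropped and the predicted concentration used for presence detection, which is the cost saving the paper verifies.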
There has been significant recent interest in sensing systems and ‘smart environments’, with a number of longitudinal studies in this area. Typically the goal of these studies is to develop methods to predict, at any one moment of time, the activity or activities that the resident(s) of the home are engaged in, which may in turn be used for determining normal or abnormal patterns of behaviour (e.g. in a health-care setting). Classification algorithms, such as Conditional Random Field (CRFs), typically consider sensor activations as features but these are often treated as if they were independent, which in general they are not. Our hypothesis is that learning patterns based on combinations of sensors will be more powerful than single sensors alone. The exhaustive approach – to take all possible combinations of sensors and learn classifier weights for each combination – is clearly computationally prohibitive. We show that through the application of signal processing and information-theoretic techniques we can learn about the sensor topology in the home (i.e. learn an adjacency matrix) which enables us to determine the combinations of sensors that will be useful for classification ahead of time. As a result we can achieve classification performance better than that of the exhaustive approach, whilst only incurring a small cost in terms of computational resources. We demonstrate our results on several datasets, showing that our method is robust in terms of variations in the layout and the number of residents in the house. Furthermore, we have incorporated the adjacency matrix into the CRF learning framework and have shown that it can improve performance over multiple baselines.
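The idea of learning a sensor adjacency matrix from activation data can be sketched with pairwise mutual information, one common information-theoretic choice (the paper's exact estimator may differ). Two sensors are treated as adjacent when their binary activation streams share enough information; the streams and threshold below are synthetic.

```python
import numpy as np

def mutual_information(x, y):
    """Mutual information (in nats) between two binary 0/1 activation streams."""
    x, y = np.asarray(x), np.asarray(y)
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = np.mean((x == a) & (y == b))
            p_a, p_b = np.mean(x == a), np.mean(y == b)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

def sensor_adjacency(streams, threshold=0.1):
    """Adjacency matrix: sensors are 'connected' when their activation
    streams share more than `threshold` nats of mutual information."""
    k = len(streams)
    adj = np.zeros((k, k), dtype=bool)
    for i in range(k):
        for j in range(i + 1, k):
            if mutual_information(streams[i], streams[j]) > threshold:
                adj[i, j] = adj[j, i] = True
    return adj

# Toy example: sensors 0 and 1 fire together (same room); sensor 2 is independent
rng = np.random.default_rng(1)
s0 = rng.integers(0, 2, 1000)
s1 = s0.copy()                    # perfectly coupled with sensor 0
s2 = rng.integers(0, 2, 1000)
adj = sensor_adjacency([s0, s1, s2])
```

In the paper's setting, the learned adjacency matrix restricts which sensor combinations are fed to the CRF, avoiding the exhaustive enumeration of all subsets.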
Recognition and detection of human activity is one of the challenges in smart home technologies. In this paper, three artificial neural network training algorithms, namely Quick Propagation (QP), Levenberg–Marquardt (LM), and Batch Back Propagation (BBP), are applied to human activity recognition and compared on the Massachusetts Institute of Technology (MIT) smart home dataset. The results demonstrate that the Levenberg–Marquardt algorithm achieves better human activity recognition performance (92.81% accuracy) than the Quick Propagation and Batch Back Propagation algorithms.
By 2050, about one third of the French population will be over 65. Our laboratory's current research focuses on monitoring elderly people at home to detect a loss of autonomy as early as possible. Our aim is to quantify criteria such as the international activities of daily living (ADL) or the French Autonomie Gerontologie Groupes Iso-Ressources (AGGIR) scales by automatically classifying the different ADL performed by the subject during the day. A Health Smart Home is used for this purpose. Our Health Smart Home includes, in a real flat, infrared presence sensors (location), door contacts (to monitor the use of some facilities), a temperature and hygrometry sensor in the bathroom, and microphones (sound classification and speech recognition). A wearable kinematic sensor also reports postural transitions (using pattern recognition) and walking periods (frequency analysis). The data collected from these sensors are then used to classify each temporal frame into one of the previously acquired ADL (seven activities: hygiene, toilet use, eating, resting, sleeping, communication, and dressing/undressing). This is done using support vector machines. We performed a 1-h experiment with 13 young and healthy subjects to determine the models of the different activities, and then tested the classification algorithm (cross-validation) on real data.
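The frame-by-frame SVM classification step can be sketched as follows with scikit-learn. The features and the two activities are invented placeholders for the seven-class problem described above.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical per-frame features: [room id, sound level, posture code],
# with two illustrative activities: 0 = "sleeping", 1 = "eating"
rng = np.random.default_rng(2)
sleep = np.column_stack([np.zeros(50), rng.normal(0.1, 0.05, 50), np.zeros(50)])
eat = np.column_stack([np.ones(50), rng.normal(0.6, 0.1, 50), np.ones(50)])
X = np.vstack([sleep, eat])
y = np.array([0] * 50 + [1] * 50)

# RBF-kernel SVM, the classifier family used in the paper
clf = SVC(kernel="rbf").fit(X, y)
acc = clf.score(X, y)
```

The real system would train one model per activity on labelled frames and evaluate with cross-validation rather than on the training set as done here for brevity.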
More than half of the electricity in residential and commercial buildings is consumed by lighting systems and appliances. Consumption by these service systems is directly associated with occupant activities. By recognizing activities and identifying the associated possible energy savings, more effective strategies can be developed to design better buildings and automation systems. In line with this motivation, using inductive and deductive reasoning, we introduce a framework to detect, in real time, occupant activities, potential wasted energy consumption, and peak-hour usage that could be shifted to non-peak hours. Our framework consists of three sub-algorithms for action detection, activity recognition, and waste estimation. As the real-time input, the action detection algorithm receives data from the sensing system, consisting of plug meters and sensors, to detect the actions that occurred (e.g., turning on an appliance) via our unsupervised clustering models. Detected actions are then used by the activity recognition algorithm to recognize the activities (e.g., preparing food) through semantic reasoning on our constructed ontology. Based on the recognized activities, the waste estimation algorithm identifies the potential waste and estimates the potential savings. To evaluate the performance of our framework, an experimental study was carried out in an office with five occupants and in two single-occupancy apartments for two weeks. Following the experiment, the performance of the action detection and activity recognition algorithms was evaluated using ground truth labels for actions and activities. Average accuracy was 97.6% for action detection using a Gaussian Mixture Model with Principal Component Analysis and 96.7% for activity recognition. In addition, 35.5% of the consumption of an appliance or lighting system, on average, was identified as potential savings.
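The unsupervised action-detection step (Gaussian Mixture Model with PCA) can be sketched on synthetic plug-meter data; the wattage values below are invented and the real framework operates on richer sensor streams.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

# Synthetic plug-meter readings [power (W), current (A)]:
# an appliance is either off (~0 W) or on (~1500 W)
rng = np.random.default_rng(3)
off = rng.normal([2.0, 0.1], [0.5, 0.05], size=(100, 2))
on = rng.normal([1500.0, 6.5], [20.0, 0.5], size=(100, 2))
X = np.vstack([off, on])

# PCA for dimensionality reduction, then a 2-component GMM to cluster
# the readings into on/off states (unsupervised action detection)
z = PCA(n_components=1).fit_transform(X)
labels = GaussianMixture(n_components=2, random_state=0).fit_predict(z)
```

A change of cluster label between consecutive readings would then be reported as an action (e.g., "appliance turned on") to the downstream activity recognition stage.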
STL is a filtering procedure for decomposing a time series into trend, seasonal, and remainder components. STL has a simple design that consists of a sequence of applications of the loess smoother; the simplicity allows analysis of the properties of the procedure and allows fast computation, even for very long time series and large amounts of trend and seasonal smoothing. Other features of STL are specification of amounts of seasonal and trend smoothing that range, in a nearly continuous way, from a very small amount of smoothing to a very large amount; robust estimates of the trend and seasonal components that are not distorted by aberrant behavior in the data; specification of the period of the seasonal component to any integer multiple of the time sampling interval greater than one; and the ability to decompose time series with missing values.