Privacy Implications of Room Climate Data

September 2017

DOI:10.1007/978-3-319-66399-9_18

Conference: ESORICS 2017

Authors:

Philipp Morgner

Friedrich-Alexander-University of Erlangen-Nürnberg

Bjoern M Eskofier

Friedrich-Alexander-University of Erlangen-Nürnberg

Privacy Implications of Room Climate Data*

Philipp Morgner¹, Christian Müller², Matthias Ring¹, Björn Eskofier¹, Christian Riess¹, Frederik Armknecht², and Zinaida Benenson¹

¹ Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
{philipp.morgner, matthias.ring, bjoern.eskofier, christian.riess, zinaida.benenson}@fau.de

² University of Mannheim, Germany
{christian.mueller, armknecht}@uni-mannheim.de

Abstract. Smart heating applications promise to increase energy efficiency and comfort by collecting and processing room climate data. While it has been suspected that the sensed data may leak crucial personal information about the occupants, this belief has up until now not been supported by evidence.

In this work, we investigate privacy risks arising from the collection of room climate measurements. We assume that an attacker has access to the most basic measurements only: temperature and relative humidity. We train machine learning classifiers to predict the presence and actions of room occupants. On data that was collected at three different locations, we show that occupancy can be detected with up to 93.5% accuracy. Moreover, the four actions reading, working on a PC, standing, and walking can be discriminated with up to 56.8% accuracy, which is also far better than guessing (25%). Constraining the set of actions yields even higher prediction rates. For example, we discriminate standing and walking occupants with 95.1% accuracy. Our results provide evidence that even the leakage of such 'inconspicuous' data as temperature and relative humidity can seriously violate privacy.

1 Introduction

The vision of the Internet of Things (IoT) is to enhance work processes, energy efficiency, and living comfort by interconnecting actuators, mobile devices, and sensors. These networks of embedded technologies enable applications such as smart heating, home automation, and smart metering, among many others. Sensors are of crucial importance in these applications. Data gathered by sensors is used to represent the current state of the environment; in smart heating, for instance, sensors measure the room climate. Using this information and a user-defined configuration of the targeted state of room climate, the application regulates heating, ventilation, and air conditioning.

While the collection of room climate data is obviously essential to enable smart heating, it may at the same time pose a risk of privacy violations. Consequently, it is commonly believed among security experts that leaking room climate data may result in privacy violations and hence that the data needs to be cryptographically protected [39, 12, 4]. However, these claims have not been supported by scientific evidence so far. Thus, one could question whether in practice additional effort for protecting the data would be justified.

*Authors' version of the paper published in the Proceedings of the 22nd European Symposium on Research in Computer Security (ESORICS 2017). DOI: 10.1007/978-3-319-66399-9_18

The current situation with room climate data is comparable to the area of smart metering [18, 44, 24, 27]. In 1989, Hart [18] was the first to draw attention to the fact that smart metering appliances can be exploited as surveillance devices. Since then, research has shown far-reaching privacy violations through fine-granular power consumption monitoring, ranging from occupancy and everyday activities detection [31] up to recognizing which program a TV was displaying [14].

Various techniques have been proposed over the years to mitigate privacy risks of smart metering [3, 37, 25, 44, 36]. This issue has become such a grave concern that the German Federal Office for Information Security published a protection profile for smart meters in 2014 [2]. By considering privacy implications of smart heating, we hope to initiate consumer protection research and policy debate in this area, analogous to the developments in smart metering described above.

Research Questions. In this work, we are the first to investigate room climate data from the perspective of possible privacy violations. More precisely, we address the following research questions:

– Occupancy detection: Can an attacker determine the presence of a person in a room using only room climate data, i.e., temperature and relative humidity?

– Activity recognition: Can an attacker recognize activities of the occupant in the room using only the temperature and relative humidity data?

Our threat scenario targets buildings with multiple rooms that are similar in size, layout, furnishing, and positions of the sensors. These properties are typical for office buildings, dormitories, cruise ships, and hotels, among others. Assuming that an attacker is able to train a classifier that recognizes pre-defined activities, possible privacy violations are, e.g., tracking presence and working practices of employees in offices, or the disclosure of lifestyle and intimate activities in private spaces. All these situations present intrusions into the privacy of the occupants. In contrast to surveillance cameras and motion sensors, the occupant does not expect to be monitored by room climate sensors. Also, legal restrictions regarding privacy might apply to surveillance cameras and motion sensors but not to room climate sensors.

Experiments. To evaluate these threats, we present experiments that consider occupancy detection and activity recognition based on the analysis of room climate data from a privacy perspective. We measured room climate data in three office-like rooms and distinguished between the activities reading, standing, walking, and working on a laptop. Although we assume that in smart heating applications, only one sensor per room is most likely to be installed, each room was equipped with several sensors in order to evaluate different positions of sensors in the room. These sensors measured temperature and relative humidity at a regular time interval of a few seconds. In our procedure, an occupant performed a pre-defined sequence of tasks in the experimental space. In sum, we collected almost 90 hours of room climate sensor data from a total of 36 participants. The collected room climate data was analyzed using an off-the-shelf machine learning classification algorithm. To reflect realistic settings, we only evaluated data of a single sensor and did not apply sensor fusion.

Results. Evaluating our collected room climate data, the attacker detects presence of a person with detection rates up to 93.5% depending on location and the sensor position, which is significantly higher than guessing (50%). The attacker can distinguish between four activities (reading, standing, walking, and working on a laptop) with detection rates up to 56.8%, which is also significantly better than guessing (25%). We can also distinguish between three activities (sitting, standing, and walking) with detection rates up to 81.0%, as opposed to 33.3% if guessing. Furthermore, we distinguish between standing and walking with detection rates up to 95.1%. Thus, we show that the fears of privacy violation by leaking room climate data are well justified. Furthermore, we analyze the influence of the room size, positions of the sensor, and amount of the measured sensor data on the accuracy. In summary, we provide the first steps in verifying the common belief that room climate data leaks privacy-sensitive information.

Outline. The remainder of this paper is organized as follows. In Section 2, we give an overview of related work. Section 3 presents the threat model considered in this work. In Section 4, we introduce the experimental design and methods. The results of our experiments are presented and discussed in Sections 5 and 6, respectively. We draw conclusions in Section 7. Additional information regarding the experimental procedure can be found in Appendix A.

2 Related Work

Over the last decade, several experiments have been conducted to detect occupancy in sensor-equipped spaces and to recognize people's activities as summarized in Table 1. Activity recognition has been considered for basic activities, such as leaving or arriving at home, or sleeping [29], as well as for more detailed views, including toileting, showering, and eating [41].

Most of the previous research uses types of sensors that are different from temperature and relative humidity. For example, CO2 represents a useful source for occupancy detection and estimation [43]. Additionally, sensors detecting motion based on passive infrared (PIR) [1, 6, 15, 28, 17, 46], sound [11, 15], barometric pressure [30], and door switches [8, 9, 45] are utilized for occupancy estimation. For evaluation, different machine learning techniques are used, e.g., HMM [43], ARHMM [17], ANN [11], and decision trees [15, 45].

In contrast to previous work, our results rely exclusively on temperature and relative humidity. Previously published experimental results involved other or additional types of sensors, such as CO2, acoustics, motion, or lighting (the latter three are referred to as AML in Table 1), door switches or states of appliances (also gathered with the help of switches), such as water taps or WC flushes. For this reason, our detection results are also not directly comparable to these works.

Table 1: Overview of previous experiments on occupancy detection (D), occupancy estimation (E), which aims at determining the number of people in a room, and activity recognition (A) with a focus on selected sensors; AML denotes acoustic, motion, and lighting sensors. Sensor columns: Rel. Humidity, Temperature, CO2, Ventilation, AML, Switches.

Work                             Target
van Kasteren et al., 2008 [41]   A
Lam et al., 2009 [28]            E
Dong et al., 2010 [6]            E
Lu et al., 2010 [29]             A
Hailemariam et al., 2011 [15]    D
Han et al., 2012 [17]            E
Zhang et al., 2012 [46]          E
Ekwevugbe et al., 2013 [11]      E
Ebadat et al., 2013 [8]          E
Ai et al., 2014 [1]              E
Wörner et al., 2014 [43]         D
Yang et al., 2014 [45]           D/E
Masood et al., 2015 [30]         E
Ebadat et al., 2015 [9]          E
This work                        D/A

[Per-sensor marks of Table 1 not recoverable.]

3 Threat Model

The overall goal of our work is to understand the potential privacy implications if room climate data is accessed by an attacker. The goal of the attacker is to gain information about the state of occupancy as well as the activity of the occupants without their consent.

Obviously, the more information an attacker can gather, the more likely she can deduce privacy-harming information from the measurements. Therefore, we base our analysis on the attacker model that considers a room climate system where only one sensor node is used to derive information. This is a realistic scenario since usually one sensor node per room is sufficient to monitor the room climate. Moreover, we assume that this sensor node takes only the two most basic measurements, temperature and relative humidity. These data are the fundamental properties to describe room climate. Note that our restricted data is in contrast to existing work (cf. Table 1 and Section 2) that based their experiments on more types of measurements or used data that is less common to characterize room climate.

We consider a sensor system that measures the climate of a room, denoted as target location. At the target location, a temperature and relative humidity sensor is installed that reports the measured values in regular intervals to a central database. We consider an attacker model where the attacker has access to this database and aims to derive information about the occupants at the target location. Furthermore, we assume that the attacker has access to either the target location itself, or a room similar in size, layout, sensor positions, and furniture. Such situations are given, for example, in office buildings, hotels, cruise ships, and student dormitories. This location, denoted as training location, is used to train the classifier, i.e., a machine learning algorithm that learns from input data labeled with the ground truth. As the attacker has full control over the training location, she can freely choose what actions are taking place during the measurements. For example, she could take measurements while no persons are present at the training location, or while one person is present and executes a predefined activity.

There are various scenarios in which an attacker has incentives to collect and analyze room climate data. For example, the management of a company aims at observing the presence and working practices of employees in the offices. In another case, a provider of private spaces (hotels, dormitories, etc.) wants to disclose lifestyle and intimate activities in these spaces. This information may be utilized for targeted advertising or sold to insurance companies. In any case, the evaluation of room climate data provides the attacker with the possibility to undermine the privacy of the occupants.

The procedure of these attacks is as follows: First, the attacker collects training data at a training location, which might be the target location or another room similar in size, layout, sensor positions, and furniture. The attacker also records the ground truth for all events that shall be distinguished. Examples of events are occupancy and non-occupancy, or different activities such as working, walking, and sleeping. The training data is recorded with a sample rate of a few seconds and split into windows (i.e., a temperature curve and a relative humidity curve) of same time lengths, usually one to three minutes. Using the collected training data, the attacker trains a machine learning classifier. After the classifier is trained, it can be used to classify windows of climate data from the target location to determine the events. The classifier works on previously collected data, thus reconstructing past events, and also on live-recorded data, thus determining current events "on-the-fly" at the target location.

4 Experimental Design and Methods

We conducted a study to investigate the feasibility of detecting occupancy and inferring activities in an office environment from temperature and relative humidity: From March to April 2016, we performed experiments at two locations simultaneously, Location A and Location B, with a distance of approximately 200 km between them. In addition, from January to February 2017, we conducted further experiments at a third location, denoted as Location C, which is located in the same building as Location B.

4.1 Experimental Setup and Tasks

The experimental spaces at the three locations are different in size, layout, and positions of the sensors. Thus, each target location is also the training location in our study. At Location A, the room has a floor area of 16.5 m² and was equipped with room climate sensors at four positions as shown in Figure 1ii. At Location B, the room has a floor area of 30.8 m², i.e., roughly twice as much as at Location A, and had room climate sensors installed at three positions as illustrated in Figure 1i. Location C has a floor area of 13.9 m² and was equipped with room climate sensors at five positions as shown in Figure 1iii. In all locations, the room climate sensors measured temperature and relative humidity. The number of deployed sensors varied due to limitations of hardware availability.

[Figure 1: floor plans showing furniture, windows, doors, and sensor nodes with their heights. Panels: (i) Location B (30.8 m²), (ii) Location A (16.5 m²), (iii) Location C (13.9 m²).]

Fig. 1: Floor plans of the experiment spaces including sensor node locations; h indicates the node's height.

Our goal was to determine to which extent the presence and activities of an occupant influence the room climate data. Therefore, we measured temperature and relative humidity during phases of absence as well as phases of occupants' presence. If an occupant was present, this person had to perform one task or a sequence of tasks. We defined the following experimental tasks (see also Figure 2):

– Read: Sit on an office chair next to a desk and read.

– Stand: Stand in the middle of the room, try to avoid movements.

– Walk: Walk slowly and randomly through the room.

– Work: Sit on an office chair next to a desk and use a laptop, which is located on the desk.

[Figure 2: photographs of the four tasks. Panels: (i) Read, (ii) Stand, (iii) Walk, (iv) Work.]

Fig. 2: The defined tasks performed by participants at Location A.

To eliminate confounding factors, we defined location default settings applying to all locations. Essentially, all windows were required to remain closed and no person was allowed in the room when not in use for the experiment. The rooms have radiators for heating, which were adjusted to a constant level. At Locations A and B, we used shutters fixed in such positions that enough light was provided for reading and working.

4.2 Sensor Data Collection

We used a homogeneous hardware and software setup at all locations for data collection, which is described in the following.

Hardware. At each location, we set up a sensor network consisting of several Moteiv Tmote Sky sensor nodes with an integrated IEEE 802.15.4-compliant radio [32] as well as an integrated temperature and relative humidity sensor. The nodes have the Contiki operating system [7] version 2.7 installed. In addition, we deployed a webcam that took pictures in a 3-second interval at Location A. These were used for verification during the data collection phase only, and were not given to the classification algorithms.

Software. For sensor data collection, we customized the Collect-View application included in Contiki 2.7, which provides a graphical user interface to manage the sensor network. For our purposes, we implemented an additional control panel offering a customized logging system. The measurement settings of the Collect-View application were set to a report interval of 4 seconds with a variance of 1 second, i.e., each sensor node reported its current values in a time interval of 4 ± 1 seconds. The variance is a feature provided by Collect-View to decrease the risk of packet collisions during over-the-air transmissions.
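To illustrate the data rate implied by this setting, the following sketch simulates the report timestamps of a single node. It assumes that every gap between two reports is drawn uniformly from [3, 5] seconds; the uniform distribution and the function name `simulate_report_times` are assumptions made for illustration and do not describe how Collect-View implements the variance.

```python
import random

def simulate_report_times(duration_s, interval_s=4.0, jitter_s=1.0, seed=0):
    """Return report timestamps (in seconds) of one hypothetical sensor node.

    Assumption: each gap between consecutive reports is uniform in
    [interval_s - jitter_s, interval_s + jitter_s].
    """
    rng = random.Random(seed)
    times = []
    t = 0.0
    while True:
        t += rng.uniform(interval_s - jitter_s, interval_s + jitter_s)
        if t > duration_s:
            break
        times.append(t)
    return times

# Number of reports that fall into one 180-second window.
reports = simulate_report_times(180)
gaps = [b - a for a, b in zip(reports, reports[1:])]
print(len(reports), min(gaps), max(gaps))
```

Under this assumption, a 180-second window contains between 36 and 60 reports per sensor, i.e., roughly 45 on average.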

Collected Data. We structured data collection in units and aimed for a good balance between presence and absence as well as the different tasks among all units, as this is needed for the later analysis using machine learning. Each unit has a fixed time duration $t$ (in minutes, $t \in \{10, 30, 60\}$) during which exactly one person was present and executed predefined activities. If the presence time was $t$ minutes, then the absence time before and after it, respectively, was set to $t/2 + 5$ minutes, where 5 minutes served as buffer. This accounts both for the equal distribution of presence time and absence time and for the fact that temperature and humidity settle within a 15-minute period after the 60-minute presence of one person. For a detailed description of the experimental procedure, we refer to Appendix A.1.
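The timing of a unit can be made concrete with a short calculation. The helper `unit_schedule` below is a hypothetical name introduced only for this illustration; it applies the rule of t/2 + 5 minutes of absence before and after a presence phase of t minutes.

```python
def unit_schedule(presence_min):
    """Return (absence_before, presence, absence_after) in minutes for one unit."""
    absence = presence_min / 2 + 5  # t/2 plus the 5-minute buffer
    return absence, presence_min, absence

for t in (10, 30, 60):
    before, present, after = unit_schedule(t)
    total_absence = before + after
    print(f"t={t:2d} min: absence {before:g} + {after:g} = {total_absence:g} min, "
          f"unit length {total_absence + present:g} min")
```

Per unit, the total absence time is thus t + 10 minutes against t minutes of presence, so the two classes are roughly balanced, with the two 5-minute buffers accounting for the surplus of absence time.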

Overall, we collected almost 90 hours of sensor data, 40 hours of which with a person being present. A more extensive overview of the amount of measured sensor data is shown in Table 2. To encourage replication and further investigations, all collected sensor data is available as open data sets on GitHub.³

Table 2: Measured sensor data of all locations (in hours).

                    Recorded Time [h:mm:ss]
Variable    Value   Location A   Location B   Location C
Occupancy   no      20:38:26     15:21:00     13:21:42
            yes     14:41:56     11:33:06     13:44:29
Task        Read     4:46:13      2:56:44      3:19:47
            Stand    2:45:27      2:34:20      3:28:27
            Walk     2:43:53      2:37:12      3:20:05
            Work     4:03:33      3:00:20      3:20:52

4.3 Participants and Ethical Principles

For participating in the experiment, 14 subjects volunteered at Location A, 12 subjects at Location B, and 10 subjects at Location C as shown in Table 3. Demographic data of participants was collected in order to facilitate replication and future experiments. All subjects provided written informed consent after the study protocol was approved by the data protection office.⁴ We assigned each participant a random ID. All collected sensor data as well as the demographic data is only linked to this ID.

4.4 Classiﬁer Design

We used classification to predict occupancy and activities in the rooms. We adopt an approach that has successfully been used in several applications of biosignal processing, namely extraction of a number of statistical descriptors with subsequent feature selection [26, 21].

The features use measurements from short time windows. We experimented with windows of different lengths, namely 60 s, 90 s, 120 s, 150 s, and 180 s. The offset between two consecutive windows was set to 30 s. We excluded all windows where only a part of the measurements belongs to the same activity.
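The windowing step can be sketched as follows. The tuple layout of the samples and the function name `sliding_windows` are assumptions made for this illustration; the sketch slides a window of fixed length over time-ordered samples in steps of 30 s and drops every window that mixes more than one label.

```python
def sliding_windows(samples, length_s=180, offset_s=30):
    """Split time-ordered samples into overlapping single-activity windows.

    samples: list of (time_s, temperature, humidity, label) tuples.
    Returns a list of (label, window_samples) pairs.
    """
    if not samples:
        return []
    windows = []
    start = samples[0][0]
    end_of_data = samples[-1][0]
    while start + length_s <= end_of_data:
        window = [s for s in samples if start <= s[0] < start + length_s]
        labels = {s[3] for s in window}
        # Keep the window only if all of its samples carry the same label.
        if window and len(labels) == 1:
            windows.append((labels.pop(), window))
        start += offset_s
    return windows

# Synthetic example: one sample every 4 s, 5 minutes of Read, then Walk.
samples = [(t, 21.0, 40.0, "Read" if t < 300 else "Walk") for t in range(0, 600, 4)]
windows = sliding_windows(samples)
print(len(windows), [label for label, _ in windows])
```

In this synthetic example, the windows that straddle the change from Read to Walk are discarded, which is the exclusion rule described above.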

³ https://github.com/IoTsec/Room-Climate-Datasets

⁴ Ethical review boards at both locations only consider medical experiments.

Table 3: Demographic data of participants; µ denotes the average, σ denotes the standard deviation.

Characteristic         Location A   Location B   Location C
Gender        f:       3            2            5
              m:       11           10           5
Weight [kg]   µ:       74.9         81.7         63.1
              σ:       8.0          12.1         10.0
Height [cm]   µ:       175.9        178.4        170.7
              σ:       9.2          5.3          9.3
Age           µ:       33.7         30.3         25.6
              σ:       8.2          4.8          2.8

The feature set was composed from a number of statistical descriptors that were computed on temperature and humidity measurements within these windows. These are mean value, variance, skewness, kurtosis, number of nonzero values, entropy, difference between maximum and minimum value of the window (i.e., value range), correlation between temperature and humidity, and mean and slope of the regression line for the measurement window before the current window. Additionally, we subtracted from the measurements their least-square linear regression line, and computed all of the listed statistics on the subtraction residuals. Feature selection was performed using a sequential forward search [42, Ch. 7.1 & 11.8], with an inner leave-one-subject-out cross-validation [19, Ch. 7] to determine the performance of each feature set. For classification, we used the Naïve Bayes classifier. To avoid a bias in the results, we randomly selected identical numbers of windows per class for training, validation, and testing. For implementation, we used the ECST software [38], which wraps the WEKA library [16].
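A minimal sketch of how most of the listed descriptors could be computed for one window is given below. It is written in plain Python rather than with ECST or WEKA, all function and feature names are hypothetical, and the entropy descriptor is omitted because its computation (e.g., the binning) is not specified above.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def central_moment(xs, k):
    m = mean(xs)
    return sum((x - m) ** k for x in xs) / len(xs)

def standardized_moment(xs, k):
    """Skewness for k=3, kurtosis for k=4; 0.0 for (near-)constant input."""
    denom = central_moment(xs, 2) ** (k / 2)
    return central_moment(xs, k) / denom if denom > 0 else 0.0

def regression_line(ts, xs):
    """Least-squares line x = slope * t + intercept."""
    mt, mx = mean(ts), mean(xs)
    stt = sum((t - mt) ** 2 for t in ts)
    slope = sum((t - mt) * (x - mx) for t, x in zip(ts, xs)) / stt if stt > 0 else 0.0
    return slope, mx - slope * mt

def correlation(xs, ys):
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / math.sqrt(sxx * syy)

def describe(xs, prefix):
    return {
        f"{prefix}_mean": mean(xs),
        f"{prefix}_var": central_moment(xs, 2),
        f"{prefix}_skew": standardized_moment(xs, 3),
        f"{prefix}_kurt": standardized_moment(xs, 4),
        f"{prefix}_nonzero": sum(1 for x in xs if x != 0),
        f"{prefix}_range": max(xs) - min(xs),
    }

def window_features(ts, temp, hum, prev=None):
    """Features of one window; prev is the (ts, temp, hum) of the preceding window."""
    feats = {"temp_hum_corr": correlation(temp, hum)}
    for name, xs in (("temp", temp), ("hum", hum)):
        feats.update(describe(xs, name))
        # Same descriptors on the residuals after removing the linear trend.
        slope, intercept = regression_line(ts, xs)
        residuals = [x - (slope * t + intercept) for t, x in zip(ts, xs)]
        feats.update(describe(residuals, f"{name}_res"))
    if prev is not None:
        p_ts, p_temp, p_hum = prev
        for name, xs in (("temp", p_temp), ("hum", p_hum)):
            feats[f"{name}_prev_mean"] = mean(xs)
            feats[f"{name}_prev_slope"] = regression_line(p_ts, xs)[0]
    return feats

# Synthetic window: temperature rises linearly while humidity falls.
ts = list(range(0, 180, 4))
temp = [20.0 + 0.001 * t for t in ts]
hum = [40.0 - 0.002 * t for t in ts]
features = window_features(ts, temp, hum, prev=(ts, temp, hum))
print(sorted(features))
```

The resulting dictionary would be the input to feature selection; the sequential forward search itself is not shown here.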

As performance measures, we use accuracy (i.e., the number of correctly classified windows divided by the number of all windows), and per-class sensitivity (i.e., the number of correctly classified windows for a specific class divided by the number of all windows of this class). Classification accuracy was deemed statistically significant if it was significantly higher than random guessing, which is the best choice if the classifier could not learn any useful information during training. For each experiment, a binomial test with significance level p < 0.01 was carried out using the R software [34].
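The two performance measures and the significance test can be sketched in a few lines of Python. The one-sided form of the binomial test is an assumption, since the text does not state which alternative was used in R, and the window counts in the example are hypothetical values chosen for illustration.

```python
from math import comb

def accuracy(y_true, y_pred):
    """Fraction of correctly classified windows."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def sensitivity(y_true, y_pred, cls):
    """Fraction of windows of class `cls` that were classified correctly."""
    relevant = [(t, p) for t, p in zip(y_true, y_pred) if t == cls]
    return sum(t == p for t, p in relevant) / len(relevant)

def binomial_p_value(n_correct, n_total, p_guess):
    """One-sided P(X >= n_correct) for X ~ Binomial(n_total, p_guess)."""
    return sum(comb(n_total, k) * p_guess ** k * (1 - p_guess) ** (n_total - k)
               for k in range(n_correct, n_total + 1))

y_true = ["Read", "Read", "Walk", "Walk"]
y_pred = ["Read", "Walk", "Walk", "Walk"]
print(accuracy(y_true, y_pred), sensitivity(y_true, y_pred, "Read"))

# Hypothetical counts: 114 of 200 windows correct in a four-class task.
p_value = binomial_p_value(114, 200, 0.25)
print(p_value < 0.01)
```

With these hypothetical counts the result would be significant at the 0.01 level, whereas 52 correct windows out of 200 would not be.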

Note that neither the features nor the rather simple Naïve Bayes classifier are particularly tailored to predicting privacy leaks. However, we show that even such an unoptimized system is able to correctly predict occupancy and action types and hence produce privacy leaks. Higher detection rates can be expected if more advanced classifiers are applied to this task.
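To show how little machinery such a classifier needs, the following sketch implements a Naïve Bayes classifier with Gaussian likelihoods and evaluates it with subject-wise splits. It is not the WEKA implementation used in this work; the Gaussian likelihood, the class and function names, and the synthetic two-feature data are assumptions made for illustration.

```python
import math
from collections import defaultdict

class GaussianNaiveBayes:
    def fit(self, X, y):
        by_class = defaultdict(list)
        for row, label in zip(X, y):
            by_class[label].append(row)
        self.params = {}
        for label, rows in by_class.items():
            stats = []
            for col in zip(*rows):
                mu = sum(col) / len(col)
                # Small constant avoids a zero variance for constant features.
                var = sum((v - mu) ** 2 for v in col) / len(col) + 1e-9
                stats.append((mu, var))
            self.params[label] = (math.log(len(rows) / len(X)), stats)
        return self

    def _log_posterior(self, label, row):
        log_prior, stats = self.params[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
            for v, (mu, var) in zip(row, stats))

    def predict(self, X):
        return [max(self.params, key=lambda c: self._log_posterior(c, row))
                for row in X]

def leave_one_subject_out(X, y, subjects):
    """Accuracy pooled over all folds; each fold holds out one subject."""
    correct = 0
    for held_out in sorted(set(subjects)):
        train = [i for i, s in enumerate(subjects) if s != held_out]
        test = [i for i, s in enumerate(subjects) if s == held_out]
        model = GaussianNaiveBayes().fit([X[i] for i in train],
                                         [y[i] for i in train])
        predictions = model.predict([X[i] for i in test])
        correct += sum(p == y[i] for p, i in zip(predictions, test))
    return correct / len(y)

# Synthetic features (temperature range, humidity range) for three subjects.
X, y, subjects = [], [], []
for s in range(3):
    for j in range(5):
        X.append([0.05 + 0.01 * j + 0.005 * s, 0.2 + 0.02 * j])
        y.append("absent")
        subjects.append(s)
        X.append([0.30 + 0.01 * j + 0.005 * s, 1.0 + 0.02 * j])
        y.append("present")
        subjects.append(s)

print(leave_one_subject_out(X, y, subjects))
```

Holding out all windows of one subject at a time prevents windows of the same person from appearing in both the training and the test data.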

5 Results

In this section, we present the experimental results. First, a visual inspection of the collected data is presented, followed by the machine learning-aided occupancy detection and activity recognition.

[Figure 3: temperature (°C) and relative humidity (%) plotted over time (s), annotated with the performed tasks. Panels: (i) Occupant is present for 60 minutes at Location A (Sensor A1; tasks Stand, Read, Walk, Read, Walk). (ii) Occupant is present for 60 minutes at Location B (Sensor B2; tasks Stand, Work, Walk).]

Fig. 3: Visualization of two examples of room climate measurements. The grey background indicates the presence of the occupant in the experimental space.

5.1 Visual Inspection

We started our evaluation by analyzing the raw sensor data. To this end, we implemented a visualization script in MATLAB, which plots this data. The visualizations of two example measurements are depicted in Figure 3.

The visualizations show an immediate rise of the temperature and humidity as soon as an occupant enters the room. Furthermore, variations in temperature and humidity increase rapidly and can be clearly seen. Thus, one can visually distinguish between phases of occupancy and non-occupancy. One can also notice different patterns during the performance of the tasks. As Figure 3i shows, an occupant walking in the experimental space causes a constant increase of temperature and humidity with only small variations. In contrast, an occupant standing in the room causes the largest variations of humidity compared to the other defined tasks (cf. Figure 3ii). The effects of the tasks reading and working on temperature and humidity in the depicted figures are very similar: both variables tend to increase showing medium variations. For further analysis of the data, we used machine learning as outlined in Section 4.4.

5.2 Occupancy Detection

Occupancy detection describes the binary detection of occupants in the experimental space based on features from windows with length of 180 seconds (cf. Section 4.4). This is a two-class task, namely to distinguish whether an occupant is present (true) or not (false). We only considered training and testing data within the same room (but separated training and testing both by the days and participants of the acquisition). We randomly selected the same number of positive and negative cases from the data. Thus, simply guessing the state has a success probability of 50%. However, our classification results are considerably higher than that. Table 4 shows that the highest accuracies per location were 93.5% (Location A), 88.5% (Location B), and 91.0% (Location C). Considering all sensors of all three locations, detection accuracy ranges between 66.8% (Sensor B3) and 93.5% (Sensor A1) as shown in Figure 4i. All classification accuracies were statistically significantly different from random guessing. This indicates that an attacker can reveal the presence of occupants in a target location with a high probability.

Table 4: Classification accuracy for occupancy detection. Notations: 'Occup.', sensitivity for class occupancy. 'No Occup.', sensitivity for class no occupancy. 'Guess', probability of correct guessing. 'Acc.', classification accuracy.

                     Sensitivity [%]        Guess   Acc.
Scenario    Sensor   Occup.   No Occup.     [%]     [%]
Occupancy   A1       94.1     93.0          50.0    93.5
            A2       94.5     85.0          50.0    89.7
            A3       92.0     76.4          50.0    84.2
            A4       77.8     79.1          50.0    78.4
            B1       91.9     85.1          50.0    88.5
            B2       85.3     77.2          50.0    81.3
            B3       69.7     63.9          50.0    66.8
            C1       92.9     89.2          50.0    91.0
            C2       89.9     87.4          50.0    88.6
            C3       90.0     82.0          50.0    86.0
            C4       89.8     87.6          50.0    88.7
            C5       92.5     88.8          50.0    90.7
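Because the same number of windows was selected for both classes, the accuracy of each sensor should equal the mean of its two sensitivities. The following check, which uses only the values of Table 4, confirms this up to rounding; the variable names are chosen for illustration.

```python
# (sensitivity occupancy, sensitivity no occupancy, accuracy), all in percent.
table4 = {
    "A1": (94.1, 93.0, 93.5), "A2": (94.5, 85.0, 89.7),
    "A3": (92.0, 76.4, 84.2), "A4": (77.8, 79.1, 78.4),
    "B1": (91.9, 85.1, 88.5), "B2": (85.3, 77.2, 81.3),
    "B3": (69.7, 63.9, 66.8), "C1": (92.9, 89.2, 91.0),
    "C2": (89.9, 87.4, 88.6), "C3": (90.0, 82.0, 86.0),
    "C4": (89.8, 87.6, 88.7), "C5": (92.5, 88.8, 90.7),
}

deviations = {}
for sensor, (occ, no_occ, acc) in table4.items():
    mean_sensitivity = (occ + no_occ) / 2
    deviations[sensor] = abs(mean_sensitivity - acc)
    print(f"{sensor}: mean sensitivity {mean_sensitivity:.2f} %, "
          f"reported accuracy {acc:.1f} %")
```

For every sensor, the two numbers differ by no more than the rounding to one decimal place.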

5.3 Activity Recognition

Activity recognition reports the current activity of an occupant in the experimental space. The four activity tasks are described in Section 4.1. The recognition results for these tasks are shown in Figure 4.

Activity4 classifies between the activities Read, Stand, Walk, and Work. As shown in Figure 4ii, the accuracy of recognizing activities achieved by the machine learning pipeline ranged from 23.9% (Sensor C1) to 56.8% (Sensor A1). Overall, the accuracy of Activity4 was statistically significantly better than the probability of guessing the correct task (25%) for 8 out of 12 sensors. Thus, the distinction between multiple activities is possible, but depends on the target location and the position of the sensor.

[Figure 4: accuracy [%] per location (A, B, C), one symbol per sensor. Panels: (i) Occupancy, (ii) Activity4 (read, stand, walk, work), (iii) Activity3 (sit, stand, walk), (iv) Activity2 (sit, upright), (v) Activity2a (read, work), (vi) Activity2b (stand, walk).]

Fig. 4: Classification accuracy for occupancy detection and activity recognition. In each diagram, the guessing probability is plotted as a line. Each symbol represents the accuracy that we achieved with a single sensor. A blue dot marks a statistically significant result, while a red 'x' represents a statistically insignificant result.

In the next step, we investigated whether an attacker can increase the recognition accuracies by distinguishing between a smaller set of activities. To this end, we combined two tasks to a meta task, e.g., the tasks Read and Work became Sit. The model Activity3 classifies between the tasks Sit, Stand, and Walk. The probability of correct guessing is thus 33.3%. This model is typical to represent activities of an occupant in a private space or an office room. For Activity3, the achieved accuracy ranged from 31.8% (Sensor C1) to 81.0% (Sensor A1). Our results were statistically significant for 10 out of the 12 sensors deployed in the three locations. Assuming a known layout of the target location, the attacker might be able to determine the position of the occupant in the space and infer activities such as watching TV, exercising, cooking, or eating.

The model Activity2 classifies between the tasks Sit and Upright, whereby Sit, as before, combines Read and Work, and Upright combines Stand and Walk. In this classification, the attacker distinguishes which posture the occupant is in. The model Activity2a classifies between the tasks Read and Work, and the model Activity2b classifies between the tasks Stand and Walk. Activity2a indicates that an attacker can even distinguish between sedentary activities, such as reading a book or working on a laptop. In contrast, Activity2b shows that an attacker can differentiate between standing and moving activities. Thus, an attacker can detect movements at the target location. For Activity2, Activity2a, and Activity2b, the probability of guessing the correct class is 50%. Using these models, the attacker can infer various work and life habits.
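The label groupings behind the five activity models can be written down as simple lookup tables. The sketch below is only an illustration of the groupings described above; the dictionary and function names are hypothetical, and the guessing probability is computed under the assumption that all classes of a model are equally likely.

```python
# Mapping from recorded task to class label for each model (names are illustrative).
MODELS = {
    "Activity4": {"Read": "Read", "Work": "Work", "Stand": "Stand", "Walk": "Walk"},
    "Activity3": {"Read": "Sit", "Work": "Sit", "Stand": "Stand", "Walk": "Walk"},
    "Activity2": {"Read": "Sit", "Work": "Sit", "Stand": "Upright", "Walk": "Upright"},
    "Activity2a": {"Read": "Read", "Work": "Work"},
    "Activity2b": {"Stand": "Stand", "Walk": "Walk"},
}

def relabel(tasks, model):
    """Map recorded tasks to the classes of `model`, dropping tasks the model ignores."""
    mapping = MODELS[model]
    return [mapping[task] for task in tasks if task in mapping]

def guessing_probability(model):
    """Chance level when every class of the model is guessed with equal probability."""
    return 1.0 / len(set(MODELS[model].values()))

recorded = ["Read", "Stand", "Walk", "Work"]
for name in MODELS:
    print(f"{name:<10} classes={relabel(recorded, name)}  chance={guessing_probability(name):.1%}")
```

Merging tasks raises the chance level (25%, 33.3%, 50%), so accuracies of different models should always be read relative to their own guessing probability.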

For Activity2, our accuracy varied between 54.6% (Sensor C2) and 82.1% (Sensor A1), and all accuracies were statistically significant. For Activity2a, the lowest and highest accuracies were 54.2% (Sensor B3) and 76.6% (Sensor C2), respectively, with statistically significant results for 11 out of 12 sensors. For Activity2b, the achieved accuracy ranged from 53.3% (Sensor C4) to 95.1% (Sensor A1), and the results for 10 out of 12 sensors were statistically significant.

5.4 Further Observations

Length of Measurement Windows. The length of the measurement windows influences the accuracy of detection. We evaluated window sizes in the range between 60 and 180 seconds. As an example, we analyzed the average accuracy of occupancy detection depending on the window size for all three locations. As shown in Figure 5, the accuracy increases with longer window sizes. We achieved the best results with the longest window size of 180 seconds.

[Figure 5: line plot of average accuracy [%] against window size [s] (60, 90, 120, 150, 180) for Location A, Location B, and Location C.]

Fig. 5: Average accuracy over all sensors from each location for occupancy detection depending on the window size

This indicates that the highest accuracies are possible if longer time periods are

considered. From a practical perspective, it is not advisable to extend the window size to

a much larger duration than a few minutes since we assume that the performed activity

is consistent for the whole duration of the window.
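The trade-off behind the window size can be illustrated with a short sketch that cuts a measurement series into fixed-length windows. The sampling period of 30 seconds, the use of non-overlapping windows, and the sample values are assumptions made for this example only.

```python
def split_into_windows(samples, sample_period_s, window_s):
    """Cut a series into consecutive, non-overlapping windows of `window_s` seconds."""
    per_window = window_s // sample_period_s
    if per_window < 1:
        raise ValueError("window is shorter than one sample period")
    last_start = len(samples) - per_window
    return [samples[i:i + per_window] for i in range(0, last_start + 1, per_window)]

# Hypothetical 10-minute temperature trace sampled every 30 s (invented values).
trace = [21.0 + 0.01 * i for i in range(20)]

for window_s in (60, 90, 120, 150, 180):
    windows = split_into_windows(trace, sample_period_s=30, window_s=window_s)
    print(f"{window_s:>3} s window -> {len(windows)} windows of {len(windows[0])} samples")
```

Longer windows give the classifier more samples per decision but fewer decisions per recording, and each window must still fall inside a single activity for its label to be meaningful.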

Selected Features. To assess whether an attacker with access to only temperature data or only relative humidity data would be feasible, we evaluated whether collecting a single type of room climate data might suffice. In the classification process, an attacker derives a set of features from temperature and relative humidity data and automatically selects the best-performing features for each sensor and classification goal (cf. Section 4.4). Our analysis shows that features computed from temperature and relative humidity are of similar importance. In our evaluation, 57.9% of the selected features are derived from temperature measurements, and 52.3% from relative humidity measurements.⁵
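The two shares add up to more than 100% because some features draw on both signals. As a rough worked example, and assuming that every selected feature uses at least one of the two signals, inclusion-exclusion gives the share of features that use both:

```python
def share_using_both(share_temperature, share_humidity):
    """Inclusion-exclusion, assuming each feature uses temperature, humidity, or both."""
    return share_temperature + share_humidity - 100.0

TEMPERATURE_SHARE = 57.9  # percent of selected features, from the text
HUMIDITY_SHARE = 52.3     # percent of selected features, from the text

both = share_using_both(TEMPERATURE_SHARE, HUMIDITY_SHARE)
print(f"both signals:     {both:.1f}%")
print(f"temperature only: {TEMPERATURE_SHARE - both:.1f}%")
print(f"humidity only:    {HUMIDITY_SHARE - both:.1f}%")
```

The breakdown is a derived estimate under the stated assumption, not a figure reported in the paper.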

We also compared the features in terms of differences between the three locations as well as differences between occupancy detection and activity recognition. In all these cases, there are no significant differences between the importance of temperature and relative humidity. An attacker restricted to either temperature or relative humidity data will therefore perform worse than an attacker with access to both.

Size and Layout of Rooms. All our locations are office-like rooms, which have a similar layout (rectangular) but differ in size and furnishing. In our evaluation, the accuracy correlates negatively with the size of the target location. As shown in Figure 5, we had the highest average accuracy in occupancy detection with Location C, which also has the smallest ground area of 13.9 m². Location A has a ground area of 16.5 m² and a slightly lower average accuracy. Location B is almost twice as large (30.8 m²) and shows the worst average accuracy compared to the other locations. Thus, our experiment indicates that an increasing room size leads to decreasing accuracy on average. An attacker achieves higher accuracies by monitoring small target locations than by monitoring larger ones.

Position of Sensors. According to our threat model in Section 3, the attacker controls the layout of the target location. Thus, we assume an attacker who can decide at which position in the target location a room climate sensor is installed. We consider how the position of a room climate sensor influences the accuracy of derived information. For occupancy detection, we had the best accuracy with a sensor node that is located at the center of the ceiling of the target location (Sensors A1, B1, C1). In this position, the sensor has the largest gathering area to measure the climate of the room. Sensors mounted to the walls or on shelves perform differently in our experiments. For activity recognition, the central sensor nodes performed best at Locations A and B, but not at Location C.

From the attacker perspective, the best position to deploy a room climate sensor is

at the ceiling in the center of the target location. In large rooms, multiple sensors at the

ceiling could be installed, each covering a subsection of the room.

⁵ Note that some features are based on both temperature and relative humidity, which is why the sum of both numbers exceeds 100%.

6 Discussion

As our experiments reveal, knowing the temperature and relative humidity of a room makes it possible to detect the presence of people and to recognize certain activities with a significantly higher probability than guessing. By evaluating temperature and relative humidity curves with a length of 180 seconds, we were able to detect the presence of an occupant in one of our experimental spaces with an accuracy of 93.5% using a single sensor. In terms of activity recognition, we distinguished between four activities with an accuracy of up to 56.8%, between three activities with up to 81.0%, and between two activities with up to 95.1%. Thus, an attacker focusing on the detection of a specific activity is more successful than an attacker who aims to classify a broader variety of activities. In the following, we discuss implications and limitations of our results.

Privacy Implications. We show that an attacker might be able to infer life and work habits of the occupants from the room climate data. In particular, the attacker is able to distinguish between sitting, standing, and moving, which might already reveal the position and activities of the occupant in the room. Moreover, the attacker can distinguish between upright and sedentary activities, between moving and standing, and between working on a laptop and reading a book.

Given the limited amount of recorded sensor data, the achieved accuracies in occu-

pancy detection and activity recognition give a clear indication that occupants are sub-

ject to privacy violations according to the threat model described in Section 3. However,

activity recognition is not straightforward since the achieved accuracies diﬀer between

the diﬀerent sensor positions and locations.

Further experiments are required for a better assessment of the privacy risks induced by room climate data. Our work provides promising directions for these assessments. For example, we demonstrated the existence of the information leak with the Naïve Bayes classifier. Naïve Bayes is arguably one of the simplest machine learning classifiers. In future work, it would be interesting to explore upper bounds for the detection of presence/absence and of different activities by using more advanced classifiers such as the recently popular deep learning algorithms.
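To illustrate how simple such a classifier is, the following sketch implements a Gaussian Naïve Bayes classifier in a few lines. The Gaussian variant, the two features (temperature and relative humidity change per window), and all numeric values are assumptions invented for illustration; they are not the features or measurements used in our experiments.

```python
import math
from collections import defaultdict

class GaussianNaiveBayes:
    """Naive Bayes with one independent Gaussian per feature and class."""

    def fit(self, X, y):
        grouped = defaultdict(list)
        for row, label in zip(X, y):
            grouped[label].append(row)
        self.priors, self.stats = {}, {}
        for label, rows in grouped.items():
            self.priors[label] = len(rows) / len(X)
            self.stats[label] = []
            for column in zip(*rows):
                mu = sum(column) / len(column)
                # Small constant keeps the variance strictly positive.
                var = sum((v - mu) ** 2 for v in column) / len(column) + 1e-9
                self.stats[label].append((mu, var))
        return self

    def predict(self, row):
        def log_posterior(label):
            score = math.log(self.priors[label])
            for value, (mu, var) in zip(row, self.stats[label]):
                score += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
            return score
        return max(self.priors, key=log_posterior)

# Toy training windows: (temperature change in °C, humidity change in %RH). Invented values.
X_train = [(0.02, 0.10), (0.03, 0.20), (0.04, 0.15), (-0.01, -0.05), (0.00, 0.00), (-0.02, -0.10)]
y_train = ["occupied"] * 3 + ["empty"] * 3

model = GaussianNaiveBayes().fit(X_train, y_train)
print(model.predict((0.03, 0.12)))
print(model.predict((-0.01, -0.02)))
```

The classifier stores only a prior, a mean, and a variance per class and feature, which is why it serves as a lower bound on what more advanced classifiers might extract from the same data.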

Location-Independent Classification. An important question is whether it is possible to perform location-independent classification, i.e., to train the classifier with sensor data of one location and then use it to classify sensor data at a target location that is not similar to the training location in size, layout, and sensor positions. If this were possible, the service providers of smart heating applications would be able to detect occupancy and to recognize activities without having access to the target locations.

According to their privacy statements, popular smart thermostats from Nest [33], Ecobee [10], and Honeywell [20] send measured climate data to the service providers’ databases. To evaluate these privacy threats, we used the room climate data of the best-performing sensor of a location as the training data set for other locations. For example, to classify events of an arbitrary sensor of Location A, we trained the classifier with room climate data collected by Sensor B1 or Sensor C1. We obtained statistically significant results for a few combinations in occupancy detection, but the majority of our occupancy detection results were not significant. For activity recognition, we were not able to obtain statistically significant results.
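The cross-location protocol can be sketched as an enumeration of training and test sensors. The sketch assumes four sensors per location (12 in total) and takes sensor 1 of each location as the best-performing one, in line with the example of Sensors B1 and C1 above; both points are assumptions for illustration.

```python
# Assumed naming: four sensors per location; sensor 1 is treated as best-performing.
SENSORS = {loc: [f"{loc}{i}" for i in range(1, 5)] for loc in "ABC"}
BEST_SENSOR = {loc: f"{loc}1" for loc in "ABC"}

def cross_location_pairs():
    """Yield (training sensor, test sensor) pairs whose locations differ."""
    for train_loc, train_sensor in BEST_SENSOR.items():
        for test_loc, test_sensors in SENSORS.items():
            if test_loc != train_loc:
                for test_sensor in test_sensors:
                    yield train_sensor, test_sensor

pairs = list(cross_location_pairs())
print(len(pairs), "train/test combinations, e.g.", pairs[:3])
```

Each pair corresponds to one classifier that is trained on data from one room and evaluated on data from a different room.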

However, the possibility of location-independent attackers cannot be excluded. Ab-

sence of signiﬁcant results in our experiments may be merely due to the limited amount

of data. Future studies should be conducted to gather data from various rooms up to a

point where the combined results hold for arbitrary locations. Having more data from a multitude of rooms available would help the machine learning classifiers to recognize and ignore data characteristics that are specific to any one of the experimental rooms.

Consequently, the algorithms could better identify the distinct data characteristics of

the diﬀerent classes in occupancy detection and activity recognition. This would enable

location-independent classiﬁcation of room climate data, in which the training location

is not similar to the target location regarding size, layout, furnishing, and positions of

the sensors.

In a representative smart home survey of German consumers from 2015, 34% of the participants stated that they are interested in technologies for intelligent heating or are planning to acquire such a system [5]. Another survey with 1,000 US and 600 Canadian consumers found that for 72% of them, the most desired smart home device would be a self-adjusting thermostat, and 37% reported that they were likely to purchase one in the next 12 months [22]. Sharing smart home data with providers and third parties is both a popular idea and a controversial issue for consumers. For example, in a recent representative survey with 461 American adults by Pew Research [35], the participants were presented with a scenario of installing a smart thermostat “in return for sharing data about some of the basic activities that take place in your house like when people are there and when they move from room to room”. Of all respondents, 55% said that this scenario was not acceptable for them, 27% said that it was acceptable, and the remaining 17% answered “it depends”. Furthermore, in a worldwide survey with 9,000 respondents from nine countries (Australia, Brazil, Canada, France, Germany, India, Mexico, the UK, and the US), 54% of respondents said that “they might be willing to share their personal data collected from their smart home with companies in exchange for money” [23].⁶

We think that the idea of sharing the smart home data for various beneﬁts will

continue to be intensively discussed in the future, and therefore, consumers and policy

makers should be made aware of the level of detail inferable from smart home data.

Which rewards are actually beneficial for consumers? Moreover, which kind of data sharing is ethically permissible? Only by answering these questions will it be possible to design fair policies and establish beneficial personal data markets [40]. In this work, we take the first step towards informing policy for the smart heating scenario.

⁶ Methodological details, such as representativeness, breakdown by country, and the exact formulation of the questions, are not known about this survey.

7 Conclusions

We investigated the common belief that the data collected by room climate sensors divulge private information about the occupants. To this end, we conducted experiments that reflect realistic conditions, i.e., considering an attacker who has access to typical room climate data (temperature and relative humidity) only. Our experiments revealed that knowing a sequence of temperature and relative humidity measurements already makes it possible to detect the presence of people and to recognize certain activities with high accuracy. Our results confirm that the assumption that room climate data needs protection is justified: the leakage of such ‘inconspicuous’ sensor data as temperature and relative humidity can seriously violate privacy in smart spaces. Future work is required to determine the level of privacy invasion in more depth and to develop appropriate countermeasures.

Acknowledgement

The work is supported by the German Research Foundation (DFG) under Grant AR

671/3-1: WSNSec – Developing and Applying a Comprehensive Security Framework

for Sensor Networks.

References

1. B. Ai, Z. Fan, and R. X. Gao. Occupancy estimation for smart buildings by an auto-regressive

hidden Markov model. In American Control Conference, ACC 2014, Portland, OR, USA,

June 4-6, 2014, pages 2234–2239. IEEE, 2014.

2. BSI. Protection Proﬁle for the Gateway of a Smart Metering System (Smart Meter Gateway

PP). https://www.commoncriteriaportal.org/files/ppfiles/pp0073b pdf.pdf, Mar. 2014.

3. A. Cavoukian, J. Polonetsky, and C. Wolf. SmartPrivacy for the Smart Grid: embedding pri-

vacy into the design of electricity conservation. Identity in the Information Society, 3(2):275–

294, 2010.

4. Chaos Computer Club: Guidelines for Smart Home Solutions.

https://www.ccc.de/en/updates/2016/smarthome. Feb. 2016 (in German).

5. Deloitte. Ready for Takeoﬀ? Consumer Survey, July 2015.

6. B. Dong, B. Andrews, K. P. Lam, M. Höynck, R. Zhang, Y.-S. Chiou, and D. Benitez.

An information technology enabled sustainability test-bed (ITEST) for occupancy detection

through an environmental sensing network. Energy and Buildings, 42(7):1038 – 1046, 2010.

7. A. Dunkels, B. Grönvall, and T. Voigt. Contiki – a lightweight and flexible operating system for tiny networked sensors. In 29th Annual IEEE International Conference on Local

Computer Networks, 2004, pages 455–462. IEEE, 2004.

8. A. Ebadat, G. Bottegal, D. Varagnolo, B. Wahlberg, and K. H. Johansson. Estimation of

building occupancy levels through environmental signals deconvolution. In BuildSys 2013,

Proceedings of the 5th ACM Workshop On Embedded Systems For Energy-Eﬃcient Build-

ings, Roma, Italy, November 13-14, 2013, pages 8:1–8:8, 2013.

9. A. Ebadat, G. Bottegal, D. Varagnolo, B. Wahlberg, and K. H. Johansson. Regularized

deconvolution-based approaches for estimating room occupancies. IEEE Trans. Automation

Science and Engineering, 12(4):1157–1168, 2015.

10. Ecobee. Privacy policy & terms of use, April 2015.

11. T. Ekwevugbe, N. Brown, V. Pakka, and D. Fan. Real-time building occupancy sensing using

neural-network based sensor network. In 7th IEEE International Conference on Digital

Ecosystems and Technologies (DEST), 2013, pages 114–119, July 2013.

12. European Union Agency For Network And Information Security. Security and

Resilience of Smart Home Environments – Good practices and recommendations.

https://www.enisa.europa.eu. Dec. 2015.

13. S. Fischer-Hübner and N. Hopper, editors. Privacy Enhancing Technologies - 11th International Symposium, PETS 2011, Waterloo, ON, Canada, July 27-29, 2011. Proceedings,

volume 6794 of Lecture Notes in Computer Science. Springer, 2011.

14. U. Greveler, P. Glösekötter, B. Justus, and D. Loehr. Multimedia content identification

through smart meter power usage proﬁles. In Proceedings of the International Conference

on Information and Knowledge Engineering (IKE), 2012.

15. E. Hailemariam, R. Goldstein, R. Attar, and A. Khan. Real-time occupancy detection us-

ing decision trees with multiple sensor types. In 2011 Spring Simulation Multi-conference,

SpringSim ’11, Boston, MA, USA, April 03-07, 2011., pages 141–148, 2011.

16. M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten. The WEKA

Data Mining Software: An Update. SIGKDD Explor. Newsl., 11(1):10–18, Nov. 2009.

17. Z. Han, R. X. Gao, and Z. Fan. Occupancy and indoor environment quality sensing for

smart buildings. In 2012 IEEE International Instrumentation and Measurement Technology

Conference (I2MTC), pages 882–887, May 2012.

18. G. W. Hart. Residential energy monitoring and computerized surveillance via utility power

ﬂows. Technology and Society Magazine, IEEE, 8(2):12–16, 1989.

19. T. Hastie, R. Tibshirani, and J. H. Friedman. The Elements of Statistical Learning. Springer,

New York, NY, USA, 2nd edition, 2009.

20. Honeywell. Honeywell connected home privacy statement, December 2015.

21. V. Huppert, J. Paulus, U. Paulsen, M. Burkart, B. Wullich, and B. Eskoﬁer. Quantiﬁcation of

Nighttime Micturition With an Ambulatory Sensor-Based System. IEEE Journal of Biomed-

ical and Health Informatics, 20(3):865–872, May 2016.

22. icontrol Networks: 2015 State of the Smart Home Report.

https://www.icontrol.com/blog/2015-state-of-the-smart-home-report.

23. Intel Security: Intel Security’s International Internet of Things Smart Home Survey Shows

Many Respondents Sharing Personal Data for Money. https://newsroom.intel.com/news-

releases/intel-securitys-international-internet-of-things-smart-home-survey. Mar. 2016.

24. M. Jawurek, M. Johns, and F. Kerschbaum. Plug-in privacy for smart metering billing. In

Fischer-Hübner and Hopper [13], pages 192–210.

25. M. Jawurek, F. Kerschbaum, and G. Danezis. SoK: Privacy technologies for smart grids – a

survey of options. Microsoft Res., Cambridge, UK, 2012.

26. U. Jensen, P. Blank, P. Kugler, and B. Eskoﬁer. Unobtrusive and Energy-Eﬃcient Swimming

Exercise Tracking Using On-Node Processing. IEEE Sensors Journal, 16(10):3972–3980,

May 2016.

27. K. Kursawe, G. Danezis, and M. Kohlweiss. Privacy-friendly aggregation for the smart-grid.

In Fischer-Hübner and Hopper [13], pages 175–191.

28. K. P. Lam, M. Höynck, B. Dong, B. Andrews, Y.-S. Chiou, D. Benitez, and J. Choi.

Occupancy detection through an extensive environmental sensor network in an open-plan

oﬃce building. In Proc. of Building Simulation 09, an IBPSA Conference, 2009.

29. J. Lu, T. Sookoor, V. Srinivasan, G. Gao, B. Holben, J. Stankovic, E. Field, and K. White-

house. The smart thermostat: using occupancy sensors to save energy in homes. In Proceed-

ings of the 8th ACM Conference on Embedded Networked Sensor Systems, pages 211–224.

ACM, 2010.

30. M. K. Masood, Y. C. Soh, and V. W. Chang. Real-time occupancy estimation using envi-

ronmental parameters. In 2015 International Joint Conference on Neural Networks, IJCNN

2015, Killarney, Ireland, July 12-17, 2015, pages 1–8. IEEE, 2015.

31. A. Molina-Markham, P. Shenoy, K. Fu, E. Cecchet, and D. Irwin. Private memoirs of a smart

meter. In Proceedings of the 2nd ACM Workshop on Embedded Sensing Systems for Energy-Efficiency in Building, BuildSys ’10, pages 61–66, New York, NY, USA, 2010. ACM.

32. Moteiv Corporation. Tmote Sky Datasheet, 2006.

33. Nest. Privacy statement for nest products and services, March 2016.

34. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for

Statistical Computing, Vienna, Austria, 2014.

35. L. Rainie and M. Duggan. Pew Research: Privacy and Information Sharing.

http://www.pewinternet.org/2016/01/14/privacy-and-information-sharing. Jan. 2016.

36. A. Reinhardt, F. Englert, and D. Christin. Averting the privacy risks of smart metering by

local data preprocessing. Pervasive and Mobile Computing, 16:171–183, 2015.

37. A. Rial and G. Danezis. Privacy-preserving smart metering. In Proceedings of the 10th

Annual ACM Workshop on Privacy in the Electronic Society, WPES ’11, pages 49–60, New

York, NY, USA, 2011. ACM.

38. M. Ring, U. Jensen, P. Kugler, and B. Eskoﬁer. Software-based performance and complexity

analysis for the design of embedded classiﬁcation systems. In Proceedings of the 21st Inter-

national Conference on Pattern Recognition, ICPR 2012, Tsukuba, Japan, November 11-15,

2012, pages 2266–2269. IEEE Computer Society, 2012.

39. M. Selinger. Test: Smart Home Kits Leave the Door Wide Open – for Ev-

eryone. https://www.av-test.org/en/news/news-single-view/test-smart-home-kits-leave-the-

door-wide-open-for-everyone/. Apr. 2014.

40. S. Spiekermann, A. Acquisti, R. Böhme, and K.-L. Hui. The challenges of personal data

markets and privacy. Electronic Markets, 25(2):161–167, 2015.

41. T. van Kasteren, A. Noulas, G. Englebienne, and B. Kröse. Accurate activity recognition in

a home setting. In Proceedings of the 10th International Conference on Ubiquitous Comput-

ing. ACM, 2008.

42. I. H. Witten, E. Frank, and M. A. Hall. Data Mining: Practical Machine Learning Tools and

Techniques. Morgan Kaufmann, Burlington, MA, USA, 3rd edition, 2011.

43. D. Wörner, T. von Bomhard, M. Roeschlin, and F. Wortmann. Look twice: Uncover hidden

information in room climate sensor data. In 4th International Conference on the Internet of

Things, IoT 2014, Cambridge, MA, USA, October 6-8, 2014, pages 25–30. IEEE, 2014.

44. W. Yang, N. Li, Y. Qi, W. Qardaji, S. McLaughlin, and P. McDaniel. Minimizing private data

disclosures in the smart grid. In Proceedings of the 2012 ACM Conference on Computer and

Communications Security, pages 415–427. ACM, 2012.

45. Z. Yang, N. Li, B. Becerik-Gerber, and M. D. Orosz. A systematic approach to occupancy

modeling in ambient sensor-rich buildings. Simulation, 90(8):960–977, 2014.

46. R. Zhang, K. P. Lam, Y.-S. Chiou, and B. Dong. Information-theoretic environment features

selection for occupancy detection in open oﬃce spaces. Building Simulation, 5(2):179–188,

2012.

A Additional Material

A.1 Experimental Procedure

The participants were assigned to at least one experimental unit with fixed presence times and tasks, and provided with a script for their actions (that is, for how long and in which order the tasks should be performed). Every participant performed each unit twice, with the same tasks, but possibly on different days and in a permuted chronological order. Tasks were performed in blocks of 10, 20, or 30 minutes. Thus, 10-minute units contained only one task of 10 minutes; 30-minute units consisted of either three tasks of 10 minutes or one task of 10 plus one of 20 minutes; 60-minute units were composed of either two tasks of 20 plus two of 10 minutes, or one task each of 10, 20, and 30 minutes.

At the beginning of the presence time for each unit, i.e., the time period where a

person had to be present, the experimental supervisor unlocked the room door to let the

participant in. The participant started with the ﬁrst task and was instructed by phone

(at Locations A and C) or through the glass pane (at Location B) when it was time to

change activities or to leave the room.

Overall, we defined 22 units per location, consisting of six 60-minute, eight 30-minute, and eight 10-minute units. Furthermore, the distribution of units and tasks was identical for all locations. Read and Work account for 180 minutes each, whereas Stand and Walk account for 160 minutes each. A comprehensive overview of the distribution of tasks, number of tasks (per unit and block), and aggregated values is provided by Table 5.

Table 5: Overview of the number and distribution of tasks and units at one location. n x t denotes the number n of recorded t-units (i.e., the time of presence in minutes, t ∈ {10, 30, 60}); t_task denotes the defined task block lengths per unit. For instance, in a total of six 60-minute units, Read and Work account for two 30-minute blocks, whereas in a total of eight 10-minute units, all tasks account for two blocks of 10 minutes each.

Units (n x t)     t_task [min]   Read   Stand   Walk   Work
6 x 60            30             2      –       –      2
                  20             2      –       4      2
                  10             –      8       –      –
8 x 30            20             2      –       2      2
                  10             2      6       2      2
8 x 10            10             2      2       2      2
Total time [min]                 180    160     160    180
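The totals in Table 5 can be cross-checked with a few lines of code. The sketch below encodes the rows as read from the table (the tuple layout and function names are illustrative) and recomputes the minutes per task as well as the presence time per unit type.

```python
TASKS = ("Read", "Stand", "Walk", "Work")

# Rows as read from Table 5:
# (unit length [min], number of units, block length [min], blocks per task).
ROWS = [
    (60, 6, 30, (2, 0, 0, 2)),
    (60, 6, 20, (2, 0, 4, 2)),
    (60, 6, 10, (0, 8, 0, 0)),
    (30, 8, 20, (2, 0, 2, 2)),
    (30, 8, 10, (2, 6, 2, 2)),
    (10, 8, 10, (2, 2, 2, 2)),
]

def total_minutes_per_task(rows):
    """Sum block length times number of blocks for every task."""
    totals = dict.fromkeys(TASKS, 0)
    for _unit_len, _n_units, block_len, counts in rows:
        for task, count in zip(TASKS, counts):
            totals[task] += block_len * count
    return totals

def presence_minutes_per_unit_type(rows):
    """Total scheduled minutes for each unit length."""
    presence = {}
    for unit_len, _n_units, block_len, counts in rows:
        presence[unit_len] = presence.get(unit_len, 0) + block_len * sum(counts)
    return presence

print(total_minutes_per_task(ROWS))
print(presence_minutes_per_unit_type(ROWS))
```

The per-task sums match the last row of the table, and the minutes per unit type equal the number of units times the unit length (6 x 60, 8 x 30, 8 x 10).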

Distributed embodied evolution over networks

Article

Full-text available

Mar 2021
APPL SOFT COMPUT

In several network problems the optimal behavior of the agents (i.e., the nodes of the network) is not known before deployment. Furthermore, the agents might be required to adapt, i.e. change their behavior based on the environment conditions. In these scenarios, offline optimization is usually costly and inefficient, while online methods might be more suitable. In this work, we use a distributed Embodied Evolution approach to optimize spatially distributed, locally interacting agents by allowing them to exchange their behavior parameters and learn from each other to adapt to a certain task within a given environment. Our results on several test scenarios show that the local exchange of information, performed by means of crossover of behavior parameters with neighbors, allows the network to conduct the optimization process more efficiently than the cases where local interactions are not allowed, even when there are large differences on the optimal behavior parameters within each agent’s neighborhood.

Indoor Climate Prediction Using Attention-Based Sequence-to-Sequence Neural Network

Article

Full-text available

May 2023

The Solar Dryer Dome (SDD), a solar-powered agronomic facility for drying, retaining, and processing comestible commodities, needs smart systems for optimizing its energy consumption. Therefore, indoor condition variables such as temperature and relative humidity need to be forecasted so that actuators can be scheduled, as the largest energy usage originates from actuator activities such as heaters for increasing indoor temperature and dehumidifiers for maintaining optimal indoor humidity. To build such forecasting systems, prediction models based on deep learning for sequence-to-sequence cases were developed in this research, which may bring future benefits for assisting the SDDs and greenhouses in reducing energy consumption. This research experimented with the complex publicly available indoor climate dataset, the Room Climate dataset, which can be represented as environmental conditions inside an SDD. The main contribution of this research was the implementation of the Luong attention mechanism, which is commonly applied in Natural Language Processing (NLP) research, in time series prediction research by proposing two models with the Luong attention-based sequence-to-sequence (seq2seq) architecture with GRU and LSTM as encoder and decoder layers. The proposed models outperformed the adapted LSTM and GRU baseline models. The implementation of Luong attention had been proven capable of increasing the accuracy of the seq2seq LSTM model by reducing its test MAE by 0.00847 and RMSE by 0.00962 on average for predicting indoor temperature, as well as decreasing 0.068046 MAE and 0.095535 RMSE for predicting indoor humidity. The application of Luong's attention also improved the accuracy of the seq2seq GRU model by reducing the error by 0.01163 in MAE and 0.021996 in RMSE for indoor humidity. 
However, the implementation of Luong attention in seq2seq GRU for predicting indoor temperature showed inconsistent results by reducing approximately 0.003193 MAE and increasing roughly 0.01049 RMSE. Doi: 10.28991/CEJ-2023-09-05-06 Full Text: PDF

Deep Learning with Greedy Layer-Wise Compound Scaling for Temperature and Humidity Prediction in Solar Dryer Dome

Article

Full-text available

Jan 2022

Wicked Implications for Human Interaction with IoT Sensor Data

Preprint

Jan 2022

Human data interaction with sensor data from smart homes can cause some implications when it comes to human sensemaking of this data. With our data-driven method Guess the Data for individual and collective data work we revealed in previous work a number of potential pitfalls when interacting with this type of data. We introduce some of the identified, often wicked implications for further discussion.

Mitigating Privacy Leakage in Anomalous Building Data Streams

Conference Paper

Nov 2023

Data-I – Interactive Experience of IoT Data: A Practical Tool for IoT Sensor Data in Interdisciplinary Human-Computer Interaction Education

Conference Paper

Apr 2023

Systematic Literature Review On Machine Learning Predictive Models For Indoor Climate In Smart Solar Dryer Dome

Conference Paper

Oct 2022

Precise predictions of indoor climate conditions are required in the implementation of Smart Solar Dryer Dome (SDD). Trend development of prediction models is discussed in this review from 15 selected research papers (2018-2022) on indoor climate prediction which was obtained from research paper databases The output shows that the most used model for predicting indoor climate is Artificial Neural Network (ANN), especially Recurrent Neural Network (RNN) such as LSTM and GRU. However, there are some potential methods such as Transformer, Combined Support Vector Machines (SVM)-Deep Learning, and sequence-to-sequence which could outperform other commonly used models. Based on findings various opportunities exist to improve the precision of indoor climate prediction, which can bring power consumption efficiency and others benefit to Smart SDD users. Such studies may further be explored to produce more accurate machine learning models.

Sequence to Sequence Deep Learning Architecture for Forecasting Temperature and Humidity inside Closed Space

Conference Paper

Sep 2022

Solar Dryer Dome (SDD), an agricultural facility for drying and preserving agricultural products, needs a smart ability to predict the future indoor climate accurately, including indoor temperature and indoor humidity, in order to optimize electricity usage. To overcome these challenges, deep learning has been a widely adopted method. This research aims to forecast the future indoor climate using time series data by implementing a sequence-to-sequence (seq2seq) architecture, which is mostly used in Natural Language Processing (NLP) tasks. The two proposed seq2seq models, Long Short-Term Memory (LSTM) seq2seq and Gated Recurrent Unit (GRU) seq2seq, have proven to be superior to the adapted LSTM and GRU. The results show that the seq2seq GRU model outperforms the adapted GRU baseline model by an average difference of 0.03013 in MAE and the seq2seq LSTM model outperforms the adapted LSTM baseline model by an average difference of 0.00941 in MAE. To the best of our knowledge, this is the first implementation of seq2seq models for indoor climate forecasting on the Room Climate dataset.

User Privacy Concerns in Commercial Smart Buildings1

Article

Jun 2022

Smart buildings are socio-technical systems that bring together building systems, IoT technology and occupants. A multitude of embedded sensors continually collect and share building data on a large scale which is used to understand and streamline daily operations. Much of this data is highly influenced by the presence of building occupants and could be used to monitor and track their location and activities. The combination of open accessibility to smart building data and the rapid development and enforcement of data protection legislation such as the GDPR and CCPA make the privacy of smart building occupants a concern. Until now, little if any research exists on occupant privacy in work-based or commercial smart buildings. This paper addresses this gap by conducting two user studies ( N = 81 and N = 40) on privacy concerns and preferences about smart buildings. The first study explores the perception of the occupants of a state-of-the-art commercial smart building, and the latter reflects on the concerns and preferences of a more general user group who do not use this building. Our results show that the majority of the participants are not familiar with the types of data being collected, that it is subtly related to them (only 19.75% of smart building residents (occupants) and 7.5% non-residents), nor the privacy risks associated with it. After being informed more about smart buildings and the data they collect, over half of our participants said that they would be concerned with how occupancy data is used. These findings show that despite the more public environment, there are similar levels of privacy concerns for some sensors to those living in smart homes. The participants called for more transparency in the data collection process and beyond, which means that better policies and regulations should be in place for smart building data.

Automatic Ontology-Based Model Evolution for Learning Changes in Dynamic Environments

Article

Full-text available

Nov 2021

Knowledge engineering relies on ontologies, since they provide formal descriptions of real-world knowledge. However, ontology development is still a nontrivial task. From the view of knowledge engineering, ontology learning is helpful in generating ontologies semi-automatically or automatically from scratch. It not only improves the efficiency of the ontology development process but also has been recognized as an interesting approach for extending preexisting ontologies with new knowledge discovered from heterogeneous forms of input data. Driven by the great potential of ontology learning, we present an automatic ontology-based model evolution approach to account for highly dynamic environments at runtime. This approach can extend initial models expressed as ontologies to cope with rapid changes encountered in surrounding dynamic environments at runtime. The main contribution of our presented approach is that it analyzes heterogeneous semi-structured input data for learning an ontology, and it makes use of the learned ontology to extend an initial ontology-based model. Within this approach, we aim to automatically evolve an initial ontology-based model through the ontology learning approach. The approach is illustrated using a proof-of-concept implementation that demonstrates the ontology-based model evolution at runtime. Finally, a threefold evaluation process of this approach is carried out to assess the quality of the evolved ontology-based models. First, we consider a feature-based evaluation for evaluating the structure and schema of the evolved models. Second, we adopt a criteria-based evaluation to assess the content of the evolved models. Third, we perform an expert-based evaluation to assess the coverage of the initial and evolved models from an expert’s point of view. The experimental results reveal that the quality of the evolved models is relevant in considering the changes observed in the surrounding dynamic environments at runtime.

Regularized Deconvolution-Based Approaches for Estimating Room Occupancies

Article

Full-text available

Oct 2015

We address the problem of estimating the number of people in a room using information available in standard HVAC systems. We propose an estimation scheme based on two phases. In the first phase, we assume the availability of pilot data and identify a model for the dynamic relations occurring between occupancy levels, $\mathrm{CO}_{2}$ concentration and room temperature. In the second phase, we make use of the identified model to formulate the occupancy estimation task as a deconvolution problem. In particular, we aim at obtaining an estimated occupancy pattern by trading off between adherence to the current measurements and regularity of the pattern. To achieve this goal, we employ a special instance of the so-called fused lasso estimator, which promotes piecewise constant estimates by including an $\ell_{1}$ norm-dependent term in the associated cost function. We extend the proposed estimator to include different sources of information, such as actuation of the ventilation system and door opening/closing events. We also provide conditions under which the occupancy estimator provides correct estimates within a guaranteed probability. We test the estimator running experiments on a real testbed, in order to compare it with other occupancy estimation techniques and assess the value of having additional information sources.
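
As a rough illustration of the kind of estimator this abstract describes, a generic fused-lasso deconvolution problem can be written as below. The symbols are assumptions made for this sketch and are not taken from the cited article: $y$ stacks the measurements, $H$ is the identified linear model mapping occupancy to measurements, $x_t$ is the occupancy at time step $t$, and $\lambda \ge 0$ is a regularization weight.

\[
\hat{x} \;=\; \arg\min_{x \in \mathbb{R}^{T}} \; \lVert y - Hx \rVert_2^2 \;+\; \lambda \sum_{t=2}^{T} \lvert x_t - x_{t-1} \rvert
\]

The first term rewards adherence to the measurements. The second term is the $\ell_{1}$ norm of successive differences, which favors piecewise constant occupancy patterns; a larger $\lambda$ yields fewer changes in the estimate.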

Look twice: Uncover hidden information in room climate sensor data

Article

Full-text available

Feb 2015

Connected sensors are on the way to becoming pervasive. While they are often deployed for a single purpose, it is worth taking a second look. In this study, we show that the widespread Netatmo weather station, which is intended to monitor and improve indoor climate, can be used to estimate binary occupancy of individual rooms. We collected data from 11 rooms in 3 apartments, including binary occupancy, for several days. We show that CO2 measurements and derivatives thereof qualify as observables to be used in Hidden Markov Models and achieve accuracies well above 75% in most cases. However, we see that the accuracy metric is often misleading for such time series data, and we therefore consider additional performance metrics, which show varying results depending on the respective occupancy patterns of a room.
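
To make the idea of decoding binary occupancy from CO2 changes with a Hidden Markov Model more concrete, here is a minimal sketch. It is not the cited study's model: the two states, the three observation symbols, the 10 ppm threshold, and every probability below are hypothetical values chosen only for illustration, and the decoder is a plain Viterbi implementation.

```python
import math

# Hypothetical two-state model; all numbers are illustrative assumptions.
STATES = ("empty", "occupied")
START_P = {"empty": 0.5, "occupied": 0.5}
TRANS_P = {
    "empty": {"empty": 0.9, "occupied": 0.1},
    "occupied": {"empty": 0.1, "occupied": 0.9},
}
EMIT_P = {
    "empty": {"rising": 0.1, "flat": 0.3, "falling": 0.6},
    "occupied": {"rising": 0.6, "flat": 0.3, "falling": 0.1},
}


def discretize(co2_ppm, threshold=10.0):
    """Map consecutive CO2 differences to 'rising', 'flat', or 'falling'."""
    symbols = []
    for previous, current in zip(co2_ppm, co2_ppm[1:]):
        delta = current - previous
        if delta > threshold:
            symbols.append("rising")
        elif delta < -threshold:
            symbols.append("falling")
        else:
            symbols.append("flat")
    return symbols


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for a discrete HMM."""
    # Log probabilities avoid numerical underflow on long sequences.
    score = {
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in states
    }
    backpointers = []
    for obs in observations[1:]:
        previous = score
        score, pointers = {}, {}
        for s in states:
            # Best predecessor state for arriving in state s.
            best = max(states, key=lambda r: previous[r] + math.log(trans_p[r][s]))
            score[s] = (
                previous[best]
                + math.log(trans_p[best][s])
                + math.log(emit_p[s][obs])
            )
            pointers[s] = best
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    readings = [400, 450, 500, 550, 520, 480, 440]  # made-up CO2 values in ppm
    symbols = discretize(readings)
    print(viterbi(symbols, STATES, START_P, TRANS_P, EMIT_P))
```

The high self-transition probabilities encode the assumption that occupancy changes rarely between samples, so the decoder smooths over isolated noisy readings instead of flipping state at every fluctuation.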

The challenges of personal data markets and privacy

Article

Full-text available

Jun 2015

Personal data is increasingly conceived as a tradable asset. Markets for personal information are emerging and new ways of valuating individuals’ data are being proposed. At the same time, legal obligations over protection of personal data and individuals’ concerns over its privacy persist. This article outlines some of the economic, technical, social, and ethical issues associated with personal data markets, focusing on the privacy challenges they raise. © 2015, Institute of Information Management, University of St. Gallen.

Occupancy detection through an extensive environmental sensor network in an open-plan office building.

Conference Paper

Jul 2009

Privacy Enhancing Technologies: 11th International Symposium, PETS 2011, Waterloo, ON, Canada, July 27-29, 2011. Proceedings

Book

Jan 2011
Lect Notes Comput Sci

This book constitutes the refereed proceedings of the 11th International Symposium, PETS 2011, held in Waterloo, Canada, in July 2011. The 15 revised full papers were carefully reviewed and selected from 61 submissions. The papers address design and realization of privacy services for the Internet, other data systems and communication networks. Presenting novel research on all theoretical and practical aspects of privacy technologies, as well as experimental studies of fielded systems, the volume also features novel technical contributions from other communities such as law, business, and data protection authorities, that present their perspectives on technological issues.

R: A Language and Environment for Statistical Computing

Book

Jan 2015

R Core Team

Data Mining: Practical Machine Learning Tools and Techniques

Book

Nov 2016

Data Mining: Practical Machine Learning Tools and Techniques, Fourth Edition, offers a thorough grounding in machine learning concepts, along with practical advice on applying these tools and techniques in real-world data mining situations. This highly anticipated fourth edition of the most acclaimed work on data mining and machine learning teaches readers everything they need to know to get going, from preparing inputs, interpreting outputs, evaluating results, to the algorithmic methods at the heart of successful data mining approaches. Extensive updates reflect the technical changes and modernizations that have taken place in the field since the last edition, including substantial new chapters on probabilistic methods and on deep learning. Accompanying the book is a new version of the popular WEKA machine learning software from the University of Waikato. Authors Witten, Frank, Hall, and Pal include today's techniques coupled with the methods at the leading edge of contemporary research. Please visit the book companion website at http://www.cs.waikato.ac.nz/ml/weka/book.html. It contains PowerPoint slides for Chapters 1-12, a very comprehensive teaching resource with many slides covering each chapter of the book; an online appendix on the Weka workbench, again a very comprehensive learning aid for the open source software that goes with the book; and a table of contents highlighting the many new sections in the 4th edition, along with reviews of the 1st edition, errata, etc.
The book provides a thorough grounding in machine learning concepts, as well as practical advice on applying the tools and techniques to data mining projects; presents concrete tips and techniques for performance improvement that work by transforming the input or output in machine learning methods; includes a downloadable WEKA software toolkit, a comprehensive collection of machine learning algorithms for data mining tasks in an easy-to-use interactive interface; and includes open-access online courses that introduce practical applications of the material in the book.

Real-time occupancy estimation using environmental parameters

Conference Paper

Jul 2015

Unobtrusive and Energy-Efficient Swimming Exercise Tracking Using On-Node Processing

Article

May 2016

Body-worn sensors for movement analysis in swimming have to be unobtrusive and energy-efficient. We present a swimming exercise tracker for the unobtrusive positioning at the back of the head and an energy-efficient analysis using an on-node implementation. To develop the system, we collected head kinematics from 11 subjects in two 200-m medley races comprising breaks, turns, and four swimming styles. Each subject was equipped with a 6-D inertial measurement unit and completed one session in rested and fatigued state. Data were analyzed with a classification system, whereby different classifiers, window sizes, and feature sets were evaluated. Algorithm selection for on-node processing was performed on the basis of classifier accuracy and computational cost. The algorithm with the best tradeoff in accuracy and computational cost was selected and had a classification rate of 85.4%. Energy consumption of both on-node processing and Bluetooth streaming was evaluated on the Shimmer sensor platform. The results revealed energy savings of over 60% when data were processed on the sensor node. The presented analysis approach can be easily applied to other data analysis tasks, and the presented toolchain can support the rapid development of wearable systems in sports and healthcare.

Quantification of Nighttime Micturition With an Ambulatory Sensor-Based System

Article

Apr 2015

Among elderly males, benign prostate syndrome (BPS) is the most common urinary disorder. Nocturia is one of the major symptoms of BPS and has a considerable influence on quality of life (QoL). For assessment of BPS (including nocturia), the International Prostate Symptom Score (IPSS) is widely used, but questionnaires are prone to bias. To date, there is no objective measurement system available for nocturia. In this study, we present an unobtrusive and non-stigmatizing device for objective measurement of nighttime micturition. In a preliminary study of 6 males diagnosed with BPS and nighttime micturition ≥ 2 times, we showed that the device is accurate, with an average misdetection rate of 0.32 events and a mean absolute deviation of 3.8% when comparing the average number of nighttime micturition occurrences. In this extended study, an additional 9 males were recorded and data from an occupancy sensor were also included. The results of the preliminary study were confirmed with an average misdetection rate of 0.33 events and a mean absolute deviation of 9.1%. The system can therefore be used to objectively measure nighttime micturition, and thereby provide the basis for treatment, e.g., medication efficacy assessment.

