
Analyze building performance data for energy-efficient building operation

A. Ahmed, J. Ploennigs, Y. Gao & K. Menzel
IRUSE, University College Cork, Ireland

ABSTRACT: Modern buildings contain several sensors and meters to monitor the building performance. This data allows the building performance to be analysed to increase energy efficiency along with user comfort. This paper presents two approaches to analyse building performance data. One solution uses data warehouse techniques to create sophisticated energy consumption aggregations. A second approach implements data mining techniques to estimate the thermal comfort of occupants with a reduced number of sensors. This paper interprets the knowledge gained using, as an example, University College Cork's Environmental Research Institute building to demonstrate the feasibility of this approach.
1 INTRODUCTION
There is great interest in improving energy management in buildings, considering the increasing price of fuel and the global goal of reducing CO2 emissions.
Building Energy Management (BEM) aims at the ef-
fective and efficient usage of energy to maintain
high building performance operation (Capehart et al.
2008, p. 1). One of the current challenges in this
domain is to optimise energy consumption, while
considering occupant comfort (Metz 2007, p. 394).
Building performance analysis emphasizes the
measurement and assessment of various perform-
ance indicators covering the interests of owners, op-
erators, and occupants in aspects like energy, light-
ing, thermal comfort, and maintenance (Augenbroe
& Park 2005).
The continuous development of wired building automation systems and the current emergence of easy-to-integrate wireless solutions have increased the amount of available building performance data (Menzel et al. 2008) to evaluate these indicators.
Traditional database management systems (DBMS)
are nowadays used to store the building monitoring
data. These DBMS lack the ability to create data ag-
gregations and do not support the analysis of build-
ing performance data to deliver reports and action-
able information (Lane 2007, p. 29).
Modern approaches from computer science may
simplify the building performance analysis. Data Warehouses (DW) add data aggregation capabilities to databases to prepare and deliver reports for large data sets (Stackowiak et al. 2007). They also facilitate the use of modern analysis approaches such as Knowledge Discovery in Databases (KDD) and Data Mining (Han & Kamber 2006, p. 35) to discover
previously unknown characteristics, relationships,
dependencies, or trends in data (Rob et al. 2008, p.
744).
The paper introduces a system that incorporates
these two technologies to simplify the building per-
formance analysis. Data Warehouse technologies are used to aggregate building performance data and provide users with a fast and easy way to analyse it manually. This approach is demonstrated in Section 2 for the energy consumption of a real building.
Data mining approaches can be used to analyse
patterns in building performance data, but also to
train models (Section 3). This is demonstrated in Section 4 for the evaluation of thermal comfort, identifying rooms with low comfort using only room temperature sensors. The data mining process is introduced from the building data sources, through data preparation and transformation, to model building, testing, and scoring.
The paper uses real data from the Environmental
Research Institute (ERI 2002). The ERI is an en-
ergy-efficient building with many sustainable energy
features such as solar panels, geothermal heat pumps
and heat recovery systems. The ERI building is used
by multiple research groups from biology, chemis-
try, as well as engineering. It also serves as a
"Living Laboratory" to demonstrate smart building
concepts. The mixed usage with office and labora-
tory spaces and the modern sustainable energy fea-
tures define a wide set of requirements for the build-
ing operator to optimize energy usage while
maintaining steady occupant comfort.
2 DATA WAREHOUSE FOR ENERGY-
EFFICIENT BUILDING OPERATION
Data Warehouses (DW) structure data in pre-
specified materialised views that are defined by di-
mensions and stored in cubes to support data aggre-
gation.
For example, an operator wants to analyse the en-
ergy consumption of a building and needs to know
when the most energy is used (time), where it is used
(location), and by which tenant (organization). This
use case specifies the dimensions of the data warehouse: Time, Location, and Organization. These dimensions are used to structure and access the data in queries, for example: give me the aggregated energy consumption of "last year" (time) for the tenant "IRUSE" (organization) in the "ERI" (location). Such aggregation queries are predefined
in cubes that are spanned by dimensions and the re-
sults are pre-computed in the data warehouse, thus
allowing very fast access to such results. The multi-
dimensional data analysis concept and DW tech-
niques for building performance are further detailed
in Ahmed et al. (2009).
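To make the cube concept concrete, such a dimensional aggregation corresponds to a query over a star schema. The following is a minimal Java/JDBC sketch under assumed, hypothetical table and column names (fact_energy, dim_time, dim_location, dim_organization and their columns are illustrations, not the schema of the ERI data warehouse):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

/** Sketch of the example aggregation query (hypothetical star schema, not the ERI warehouse). */
public class EnergyCubeQuery {

    public static void main(String[] args) throws Exception {
        // Aggregated energy consumption of "last year" for tenant "IRUSE" in the "ERI".
        String sql =
            "SELECT t.year, SUM(f.energy_kwh) AS total_kwh "
          + "FROM fact_energy f "
          + "JOIN dim_time t ON f.time_id = t.time_id "
          + "JOIN dim_location l ON f.location_id = l.location_id "
          + "JOIN dim_organization o ON f.org_id = o.org_id "
          + "WHERE t.year = 2008 AND l.building = 'ERI' AND o.tenant = 'IRUSE' " // 2008 as a placeholder for "last year"
          + "GROUP BY t.year";

        try (Connection con = DriverManager.getConnection("jdbc:oracle:thin:@//host:1521/db", "user", "pwd");
             PreparedStatement stmt = con.prepareStatement(sql);
             ResultSet rs = stmt.executeQuery()) {
            while (rs.next()) {
                System.out.println(rs.getInt("year") + ": " + rs.getDouble("total_kwh") + " kWh");
            }
        }
    }
}

In the data warehouse itself such aggregates are pre-computed in cubes, so the operator's query does not have to scan the raw monitoring records at run time.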
Figure 1. GUI for the building operator
Figure 1 shows the GUI implemented for the ERI
DW. The three energy consumption data categories
that affect the operational costs are electricity (main
power board meter), natural gas (boiler and labora-
tory meters) and water (mains water meter). They
are selectable at the bottom of the GUI. This will be
extended to support the ERI's sustainable energy
systems to allow a comparison of the energy intake.
The operator uses the dimension categories to
specify the data shown in the graph on the top right.
The operator can select the energy consumption for
a whole building, a specific zone (rooms), a tenant
organization, or equipment. The calendar allows specifying the time dimension from years, to months, to single days. These dimensions enable the operator to easily analyse the building's energy consumption
from top level (several years per building), down to
the most detailed level (hourly per room). Due to the
pre-computed queries defined by cubes, the data
warehouse quickly responds with results if the op-
erator modifies a relevant query.
3 DATA MINING CONCEPTS AND APPLICA-
TIONS
Knowledge Discovery in Databases (KDD) and
Data Mining (DM) involve processes to extract or
mine knowledge from large amounts of data (Han &
Kamber 2006, p. 5), providing implicit useful
knowledge (Wang & Huang 2006) to address spe-
cific business problems.
Data Mining approaches can usually be catego-
rised into descriptive and predictive algorithms. De-
scriptive algorithms on the one hand are used for
exploratory data analysis to discover individual pat-
terns, such as associations or clusters. Predictive al-
gorithms on the other hand focus on the creation of
models that allow predicting observations from input
data like classifications, regression models or neural
networks.
Data mining has been used extensively in the
medical field to solve many problems, such as the
association of genes to genetically inherited diseases
(Perez-Iratxeta et al. 2002). In direct marketing, data mining is used to identify likely buyers and to advertise and promote products (Ling & Li 1998), as well as for product placement in shopping centres, where it identifies items that are likely to be purchased together. Data mining has proved successful in reducing the cost of doing business, improving profits, and increasing service quality (Apte et al. 2002). In addition, data mining supports the construction of customers' personal profiles from customer transactional data (Adomavicius & Tuzhilin 2002) by means of knowledge discovery in databases.
In buildings and energy fields, data mining ap-
proaches, like neural networks, are used in modern
building automation to identify usage scenarios
(Lang et al. 2007), or to estimate the energy con-
sumption in residential buildings (Mihalakakou et al.
2002), and in tropical regions (Dong et al. 2005). Data mining has also been used to characterise electric energy consumers (Figueiredo et al. 2005) and to analyse data collected from simulations (Morbitzer et al. 2004) or wireless sensor networks (Wu & Clements-Croome 2007).
Most of these studies focus on the energy con-
sumption of buildings, but few evaluate occupant re-
lated aspects of building performance like the ther-
mal comfort of occupants. One reason may be that thermal comfort is a complex measurement itself, depending, in the case of the Predicted Mean Vote (PMV), on the temperature, humidity, air velocity, occupants' clothing, etc. This requires complex sensor equipment for data gathering, which is not reasonable in all rooms. Data Mining can help to overcome such limitations with its predictive algorithms, as this paper demonstrates.
The objective is to analyse building performance
data and room thermal comfort to evaluate heating
and cooling systems' efficiency. The data used in this
research is the historical sensed data of the ERI. The
ERI has air temperature sensors in each of its 70
rooms, but possesses additional radiant temperature,
humidity, and CO2 sensors in only four rooms. To
evaluate the thermal comfort for all rooms the pre-
dictive models of data mining should be used as dis-
cussed in the next sections.
4 MINING THE BUILDING PERFORMANCE
DATA
Figure 2 shows the mining process of the sensor data
in the ERI building. This includes data acquisition
(gathering) and preparation (data access, data sam-
pling, and data transformation), model building and
evaluation (create model, test model, evaluate and
interpret model), and knowledge deployment (model apply) (Haberstroh 2008, pp. 9-12). All logical definitions and their physical implementation presented in this paper comply with the Oracle Corporation specifications for Oracle Data Miner (ODM)
11g version 1 (Oracle 2008).
Figure 2. The process of mining the ERI sensed data stream.
4.1 Problem definition in terms of Data Mining and
Energy Management
This section defines the problem from the energy
management perspective, then converts this knowl-
edge into a data mining problem definition and
shows the preliminary plan designed to solve it.
As mentioned in Section 1, energy management
is required to provide steady user comfort while re-
ducing energy consumption. Relevant stakeholders
need to evaluate HVAC system efficiency and user
comfort in order to accomplish this task, while keep-
ing the cost of this evaluation as low as possible.
We approach this problem by classifying rooms
based on their thermal comfort into hot, warm,
slightly warm, neutral, slightly cool, cool, and cold.
The classification is based on the Predicted Mean
Vote (PMV) as standardized in the ISO 7730 (2005).
A classification model is created based on the 4 rooms that have the necessary sensors available, as detailed in Section 4.2.3. This model is then applied to all 70
rooms using only air temperature sensors to predict
the comfort class.
4.2 Data acquisition and preparations
4.2.1 Data sources and volumes
Data processing includes cleansing, integration, and
transformation of the sensed data to assure high
quality (Atzmüller 2007, p. 174).
The data source for this research is a collection of data stores of the ERI building performance data, as mentioned in Section 2. The ERI building is a 4500 m² "Living Laboratory" located on the campus of University College Cork, Ireland. It is equipped with multiple types of solar panels, geothermal heat pumps, and an underfloor heating system. Building
Performance Data is provided by 180 wired sensors
of the Building Management System. Additionally, a
test bed for wireless sensors and actuators has been
installed since April 2008 in three phases. Demon-
strator 0 has been operational since June 2008. Table 1 shows the expected sensor data stream volume for the ERI building per year.
Table 1. Expected data volumes in the ERI.
Sensors        Sampling Period    Total records
180 wired      15 minutes         6,307,200
80 wireless    1 minute           42,048,000
Total volume                      48,355,200
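As a plausibility check, the record counts in Table 1 follow directly from the sensor counts and sampling periods; a minimal sketch of this arithmetic:

/** Sketch: expected records per year from sensor count and sampling period. */
public class DataVolume {

    static long recordsPerYear(int sensors, int samplingMinutes) {
        long samplesPerYear = (60L / samplingMinutes) * 24 * 365;
        return sensors * samplesPerYear;
    }

    public static void main(String[] args) {
        long wired = recordsPerYear(180, 15);   // 6,307,200
        long wireless = recordsPerYear(80, 1);  // 42,048,000
        System.out.println(wired + wireless);   // 48,355,200
    }
}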
Currently, there are 190 sensors installed and
working in the ERI building, with 13 different types
of measurements, including indoor environment and
outdoor weather conditions. These sensors are in-
stalled in 109 points in 94 rooms and spaces such as
stairs way, and corridors.
4.2.2 Data collection
Data extracted and retrieved from the building's
monitoring data sources is stored in a table with the
attributes as listed in Table 2. These attributes are
the predictors or the influences that are used to de-
tect the room comfort class.
The data for building and testing the model used
in Section 4.3 was collected for the period of
08/02/2007 to 24/04/2009 and contains 933,235 re-
cords for four rooms in the ERI building.
The data for scoring the model in Section 4.4
represents the period of 13/10/2008 to 01/02/2009
and contains 890,921 records for the air temperature
and outdoor conditions for all rooms in the building.
Table 2. The predictors.
#   Attribute Name           Description
1   MEASURE_ID               A unique id to identify a sensor measure
2   ROOM_ID                  A unique id to identify a room in the ERI
3   ROOM_NAME                A name to identify a room
4   ROOM_SIZE                The volume of a room
5   ROOM_FLOOR               Storey in which a room is located
6   TIME_ID                  The time stamp of a sensor reading
7   COMFORT_CLASS            The predicted comfort class of a room
8   ROOM_TEMPERATURE         Temperature measured in a room
9   OUT_TEMPERATURE          Outside temperature
10  OUT_HUMIDITY             Outside humidity
11  OUT_LIGHT                Outside light
12  OUT_TOTAL_RADIATION      Outside total solar radiation
13  OUT_DIFFUSE_RADIATION    Outside diffuse solar radiation
14  OUT_WIND_DIRECTION       Wind direction
15  OUT_WIND_SPEED           Wind speed
16  ROOM_RAD_TEMP*           Radiant temperature
17  ROOM_HUMIDITY*           Relative humidity
18  ROOM_CO2*                CO2 concentration
*Available for 4 rooms and used only for computing the comfort class
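For illustration, a single record with the predictors of Table 2 can be represented as a simple value class; in the sketch below the field types are assumptions, as the paper lists only attribute names and descriptions:

/** Sketch of one sensor record with the predictors of Table 2 (field types are assumptions). */
public class SensorRecord {
    long measureId;               // MEASURE_ID
    long roomId;                  // ROOM_ID
    String roomName;              // ROOM_NAME
    double roomSize;              // ROOM_SIZE (room volume)
    int roomFloor;                // ROOM_FLOOR
    java.sql.Timestamp timeId;    // TIME_ID
    String comfortClass;          // COMFORT_CLASS (prediction target)
    double roomTemperature;       // ROOM_TEMPERATURE
    double outTemperature;        // OUT_TEMPERATURE
    double outHumidity;           // OUT_HUMIDITY
    double outLight;              // OUT_LIGHT
    double outTotalRadiation;     // OUT_TOTAL_RADIATION
    double outDiffuseRadiation;   // OUT_DIFFUSE_RADIATION
    double outWindDirection;      // OUT_WIND_DIRECTION
    double outWindSpeed;          // OUT_WIND_SPEED
    Double roomRadTemp;           // ROOM_RAD_TEMP (4 rooms only)
    Double roomHumidity;          // ROOM_HUMIDITY (4 rooms only)
    Double roomCo2;               // ROOM_CO2 (4 rooms only)
}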
4.2.3 Data preparations and transformation
This section shows the activity of modifying the
values of some attributes and adding other values as
required to present the appropriate data set for min-
ing. There is no agreed-upon methodology for preparing data for mining, but preparation usually tries to identify and remove outliers, fill null values, and remove noise in the data to improve model quality.
First, outliers are detected and removed. It has been found that the air temperature sensor in one room in the scoring data is broken and delivers readings between -300°C and -200°C. Second, when the Building Management System is reset, it sets all measurements to zero by default. Both outlier sources were removed from the data, leaving 933,235
records for model building and 890,921 records for
scoring.
However, the biggest issue concerns approximately 90% of the records per measurement (rows 8-18 in Table 2), which are NULL in the database. The reason for this is that the timestamps of the sensors are not synchronized and each sensor fills only its own column. Thus, when the air temperature sensor adds a value to the ROOM_TEMPERATURE column, the other measurement columns (rows 9-18) are left empty. For data mining they need to be filled to allow the analysis of correlations.
This is done by linearly interpolating each col-
umn over the timestamp for each room. Let us as-
sume for example the air temperature sensor in room
G01 reads 20.0°C at 4:00pm and 15 minutes later
21.5°C. The relative humidity sensor adds its value
at 4:05pm to the database. For this timestamp the
temperature in G01 can be linearly interpolated to
20.5°C. This linear interpolation is implemented in Java for all continuous measurements in Table 2 for building and scoring the model.
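The interpolation step can be sketched as follows; this is a minimal illustration of the idea, not the authors' Java implementation, assuming readings are given as timestamp/value pairs:

/** Sketch of the linear interpolation used to fill empty measurement columns. */
public class LinearInterpolation {

    // Interpolate a value at time t between two readings (t0, v0) and (t1, v1).
    static double interpolate(long t, long t0, double v0, long t1, double v1) {
        if (t1 == t0) {
            return v0; // identical timestamps: avoid division by zero
        }
        return v0 + (v1 - v0) * (double) (t - t0) / (double) (t1 - t0);
    }

    public static void main(String[] args) {
        // Example from the text: 20.0°C at 4:00pm and 21.5°C at 4:15pm,
        // interpolated at 4:05pm (timestamps given in minutes for simplicity).
        System.out.println(interpolate(5, 0, 20.0, 15, 21.5)); // 20.5
    }
}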
As a last preparation step, the thermal comfort
class needs to be computed for the data used for
model building. The classification is based on the
PMV, which is defined in ISO 7730 and was implemented in Java. The PMV is not an undisputed thermal comfort measure (Nicol & Parsons 2002, Pfafferott et al. 2007), and other approaches try to create more general models (Yao et al. 2009). Nevertheless, the PMV was selected for this example as it shows the complexity of thermal comfort evaluation and is well established. Other thermal
comfort measures can be analyzed in the same way.
The PMV depends on the air temperature, radiant
temperature, relative humidity, air velocity, as well
as the occupant's clothing and activity level. Readings for the air temperature, radiant temperature, and relative humidity are available for four rooms in the database. To compute the PMV, we assume a constant air velocity of 0.1 m/s, which is a representative mean value for naturally ventilated offices (Moujalled et al. 2008). For the activity level we assume office work with 1.2 met. The clothing value is interpolated depending on the outside temperature between 1.0 m²K/W (indoor winter clothing at 0°C) and 0.5 m²K/W (summer clothing at 30°C).
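These assumptions can be collected in a small helper; the sketch below is illustrative, and the clamping of the clothing value outside the 0-30°C range is an assumption, as the paper does not state how outside temperatures beyond this interval are handled:

/** Sketch of the assumed PMV inputs (illustrative, not the authors' implementation). */
public class PmvAssumptions {

    static final double AIR_VELOCITY = 0.1; // m/s, naturally ventilated offices
    static final double ACTIVITY = 1.2;     // met, office work

    // Clothing insulation interpolated between 1.0 (winter clothing at 0°C outside)
    // and 0.5 (summer clothing at 30°C outside); clamped outside this range (assumption).
    static double clothing(double outsideTemperature) {
        double t = Math.max(0.0, Math.min(30.0, outsideTemperature));
        return 1.0 - 0.5 * (t / 30.0);
    }

    public static void main(String[] args) {
        System.out.println(clothing(0.0));  // 1.0
        System.out.println(clothing(15.0)); // 0.75
        System.out.println(clothing(30.0)); // 0.5
    }
}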
Table 3. Comfort classes based on the PMV.
Comfort Class    Classification        No. in Data   Percentage in Data
Hot              3.5 > PMV ≥ 2.5       0             0.0%
Warm             2.5 > PMV ≥ 1.5       4             0.0%
Slightly Warm    1.5 > PMV ≥ 0.5       8,948         1.0%
Neutral          0.5 > PMV ≥ -0.5      772,072       82.7%
Slightly Cool    -0.5 > PMV ≥ -1.5     150,227       16.1%
Cool             -1.5 > PMV ≥ -2.5     1,984         0.2%
Cold             -2.5 > PMV ≥ -3.5     0             0.0%
OutOfRange       otherwise             0             0.0%
The comfort class is assigned from the PMV value according to the classification in Table 3. The table also lists the resulting number of entries in each class. The distributions of the PMV and the room measurements are displayed in Figure 3 for comparison. The distributions of the PMV values are about the same for all four rooms.
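The mapping from PMV to comfort class in Table 3 reduces to a few threshold checks; a minimal sketch using the class names of Table 3:

/** Sketch of the PMV-to-comfort-class mapping defined in Table 3. */
public class ComfortClass {

    static String classify(double pmv) {
        if (pmv >= 3.5 || pmv < -3.5) return "OutOfRange";
        if (pmv >= 2.5)  return "Hot";
        if (pmv >= 1.5)  return "Warm";
        if (pmv >= 0.5)  return "Slightly Warm";
        if (pmv >= -0.5) return "Neutral";
        if (pmv >= -1.5) return "Slightly Cool";
        if (pmv >= -2.5) return "Cool";
        return "Cold"; // -3.5 <= PMV < -2.5
    }

    public static void main(String[] args) {
        System.out.println(classify(0.1));  // Neutral
        System.out.println(classify(-0.8)); // Slightly Cool
    }
}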
Figure 4 shows the results of the attribute impor-
tance analysis of the Oracle Data Miner run on the
computed comfort classes for the model building
data. Attribute Importance identifies the subset of at-
tributes relevant for classification using a Minimum
Description Length Algorithm (Oracle 2008). It is
obvious that the PMV and the related Percentage of
Persons Dissatisfied (PPD) have the biggest influ-
ence on the comfort class. The air and radiant tem-
peratures are next in rank of importance. Other val-
ues are less important for the comfort classification.
Figure 3. Histograms of various measures from the 4 rooms: a) room air temperature, b) room radiant temperature, c) room relative humidity, d) room PMV (μ: mean value; σ: standard deviation; c95: 95% confidence interval).
Figure 4. Influences of the indoor measures in room comfort.
This is relevant for the model building in the next step, as the PMV, PPD, radiant temperature, relative humidity, and CO2 are removed, since they are not available in the other rooms to which the model will be applied. We assume that this is feasible, as the room radiant temperature is strongly correlated to the room air temperature (compare Figures 3a and 3b), the room humidity is correlated to the outside humidity, and the clothing level was related to the outside temperature during the PMV computation. Several tests in the next section will show whether this assumption is correct.
4.3 Building and evaluating the comfort model
Building a data mining model is the process of find-
ing the best algorithm or technique by which the building's sensed data is analysed and represented as
patterns and rules (Harinath & Quinn 2006, p. 485).
The following shows how to classify room com-
fort. This is an overview of building, testing, and
scoring a classification model.
In classification, a model or classifier is constructed to predict the categorical label of a room in a building (Han & Kamber 2006, p. 286). These classes are defined in Section 4.2.3. The classification mining function uses different algorithms, such as decision trees, Naïve Bayes, and support vector machines.
As the attributes in Table 2 can be treated as independent given the comfort class, Naïve Bayes is a suitable algorithm (Fielding 2007, p. 99) to detect room comfort in buildings in this case. Naïve Bayes is a probabilistic classifier based on Bayes' theorem. It simplifies the learning by assuming that the attributes in Table 2 are independent (Abellan et al. 2007) given the room comfort class as the variable to classify.
and support vector machines resulted in poor mod-
els.
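For reference, the Naïve Bayes decision rule that follows from this independence assumption can be written as (a standard formulation, where c is the comfort class and x_1, ..., x_n are the predictors of Table 2):

\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)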
In the settings phase for building the model, the 'cool' label was used as the preferred target value. The data was split into two subsets of 60% and 40% for training and testing the models; the 40% subset is called a holdout sample or test dataset. The sampling process was disabled, as the model building time was acceptable for our data size. The model was tuned towards a maximum average accuracy, which creates a model that is good at predicting all labels (Huang et al. 2008).
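The 60/40 split itself is a simple random partition of the records; the sketch below is an illustration only, since in this work the split is performed inside Oracle Data Miner:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Sketch of a 60/40 train/test split (illustrative; ODM performs this internally). */
public class TrainTestSplit {

    static <T> List<List<T>> split(List<T> records, double trainFraction, long seed) {
        List<T> shuffled = new ArrayList<>(records);
        Collections.shuffle(shuffled, new Random(seed)); // reproducible random order
        int cut = (int) Math.round(shuffled.size() * trainFraction);
        List<List<T>> parts = new ArrayList<>();
        parts.add(shuffled.subList(0, cut));                // e.g. 60% training set
        parts.add(shuffled.subList(cut, shuffled.size()));  // e.g. 40% holdout/test set
        return parts;
    }
}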
During the building process the model learns
from the sensed data how to distinguish between
comfort classes in order to predict the same classes
when the model is applied to other rooms. The test
metrics of ODM, which are detailed in the following
sections, allow evaluation of the model's quality
(Maimon & Rokach 2005, p. 1241).
4.3.1 Predictive confidence
Predictive confidence is a visual indication of the effectiveness of this model compared to a random guess of the rooms' comfort class. It is a validation of the ability of the model to generalize what it has learned to a different data set (Fernández 2003, p. 152). If the needle in Figure 5 points to the lowest point on the left of the dial, then the model is no better than a random guess (Haberstroh 2008, p. 85). The comfort detection model developed in this study shows an 85.28% predictive improvement over a random guess in predicting the rooms' comfort class. In comparison, a classification model that also takes the rooms' humidity, CO2, and radiant temperature into account reaches a predictive confidence of 89.44%. If only the rooms' air temperature is used for classification, the predictive confidence reduces to 74.75%. This shows, on the one hand, the high importance of the rooms' air temperature for the comfort class. On the other hand, it also demonstrates that the other values considered in this study, like outside measurements and room size, improve the model significantly.
Figure 5. The predictive confidence of the model.
4.3.2 Model accuracy
Model accuracy provides several interpretations of the model's ability to predict the correct class when applied to the test data.
Figure 6. Model accuracy and the confusion matrix.
Figure 6 shows the model accuracy for the comfort classification. The table at the top shows the percentage of values correctly predicted per class. For example, there are 308,623 cases with the comfort class 'neutral' and the model predicts 71.8% of them correctly. The cost is an indication of the damage done by incorrect predictions (Berry & Linoff 2004, p. 79) and is a valuable metric for model comparisons. The displayed model was the best model we could develop, with the lowest cost of predicting the rooms' comfort classes.
The type of errors expected from this model is shown in the confusion matrix in the lower table of Figure 6. Actual (correct) values of the classes are represented by rows and compared against the predictions made by the model in columns. The numbers tell how many cases were correctly predicted or misinterpreted as another class. For example, the first row in Figure 6 indicates that, of the samples with the actual comfort class 'cool', 709 cases were correctly predicted and 55 cases were predicted incorrectly as 'slightly cool'.
Interpreting the confusion matrix shows that incorrect predictions usually fall into classes adjacent to the correct one, i.e. the 'neutral' class is predicted incorrectly as either 'slightly cool' or 'slightly warm'. The rare classes 'warm' and 'cool' have a high percentage of correct predictions. The reason is probably that they are characterized by extreme air temperatures. However, the low number of samples does not allow the generalisation that these classes will also be detected correctly in other data. As the building data contained no 'hot' and 'cold' cases, the model will not be able to classify these classes.
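The per-class accuracies reported in Figure 6 are simply the diagonal entries of the confusion matrix divided by the corresponding row totals; a minimal sketch with hypothetical counts (only the values 709 and 55 of the 'cool' row are taken from the text):

/** Sketch: per-class accuracy (recall) from a confusion matrix (rows = actual classes). */
public class ConfusionMatrix {

    static double[] perClassAccuracy(long[][] matrix) {
        double[] accuracy = new double[matrix.length];
        for (int i = 0; i < matrix.length; i++) {
            long rowTotal = 0;
            for (long count : matrix[i]) {
                rowTotal += count;
            }
            accuracy[i] = rowTotal == 0 ? 0.0 : (double) matrix[i][i] / rowTotal;
        }
        return accuracy;
    }

    public static void main(String[] args) {
        // Classes: 'cool', 'slightly cool' (2x2 excerpt; the second row is a placeholder).
        long[][] matrix = { { 709, 55 }, { 120, 880 } };
        double[] acc = perClassAccuracy(matrix);
        System.out.printf("cool: %.1f%%%n", 100 * acc[0]);          // about 92.8%
        System.out.printf("slightly cool: %.1f%%%n", 100 * acc[1]); // 88.0%
    }
}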
4.4 Knowledge deployment
The created model can be applied to any building performance data that has the same structure and format to predict the comfort class. This applying activity is sometimes referred to as scoring the model (Giovinazzo 2002, p. 168), i.e. using the model on a different data set to predict the classification. This is done for all 70 rooms, excluding the room with the broken temperature sensor, which was cleaned out as explained in Section 4.2.3. The new model allows predicting the thermal comfort class based only on the rooms' air temperature and the building's outside conditions.
Figure 7. Histograms of measures from all rooms: a) room air temperature, b) room PMV.
The distributions of the air temperature of these rooms and the predicted PMV values are shown in Figure 7. The mean air temperature (μ) is 19.9°C, slightly lower than the 21.7°C for the four rooms' data used for model building (see Figure 3a). The standard deviation increases from 1.6°C to 2.2°C as the added rooms increase the variance. This results in a broader PMV distribution in Figure 7b in comparison to Figure 3d, with significantly more 'slightly cool' and 'cool' values.
Figure 8. Sample of output table for applying the model.
A sample of the output table from applying the model is displayed in Figure 8. Each row shows the identifier, the prediction of the most likely class, the probability that this is the right guess, the cost of an incorrect prediction, and the rank used to categorize predictions. The room name was added to ease readability.
The model estimates that it makes correct predictions with a mean probability of 78%. The 'neutral' label is usually predicted with 97% mean probability, 'slightly cool' with 76%, 'slightly warm' with 21%, and 'cool' and 'warm' with 15% mean probability. The reason for this distribution is that the data used for building the model contained mostly cases for 'neutral', which increases the model quality for this case, but the lack of data for the other cases reduces their model quality.
4.5 Knowledge gained and interpretation
As a last step, the PMV distribution for each room was analysed to identify the rooms whose comfort level is predominantly not 'neutral'. See Figure 9 for the location of the rooms. 40 rooms out of 70 were identified as having a mainly 'slightly cool' comfort level, and 5 rooms had a 'cool' comfort level for more than 30% of the cases. Four of these five rooms are located at the south facade on the ground level and three have exterior doors. The ground floor has the highest number of rooms with 'neutral' comfort. One room in the middle of the floor shows abnormal behaviour that should be investigated, as the room has a more than 30% 'cool' comfort level in contrast to its neighbouring rooms.
Figure 9. Comfort levels of the rooms.
In general, the thermal comfort for the scored winter period was 'slightly cool'. The set point temperature for all rooms was 20°C, which corresponds to the mean temperature value shown in Figure 7a. However, to provide better thermal comfort, the set point should be higher. Office hours were not considered during the analysis. The mean air temperature varies in the scoring data by 1.5°C, reaching a minimum of 19.0°C at 2 am and a maximum of 20.5°C at 4 pm.
5 CONCLUSION
Two approaches were introduced to analyse building
performance data for energy-efficient buildings.
The data warehouse solution provides a single re-
pository for building performance data, creates so-
phisticated energy aggregations, and provides
user-friendly interfaces.
The data mining model automates and eases the evaluation of building thermal comfort, while reducing the cost of monitoring equipment. The process from data acquisition and preparation, through model building, to knowledge deployment was examined using real data from the ERI. The results show that the approach is feasible, but more data is needed to train the model for less frequent classes like 'hot' and 'cold'.
Applying data mining techniques to building sensor data will help in stabilising room comfort preferences while optimising energy usage. Therefore, the
correlations between the building energy usage and
thermal comfort will be further examined with a
special focus on the sustainable energy sources of
the ERI. Another future research topic will be the
development of mining models for fault detection
and diagnosis as well as mining models that consider
human comfort feedback along with other influences
in room states, such as the structural properties of
the building and its geometrical specifications. The
extensions of the ERI with a further 80 wireless sen-
sors will increase the data set for analysis and will
also provide more validation data for this model.
These solutions are used by the ITOBO (2007) pro-
ject to increase the value of energy-efficient smart
buildings.
6 ACKNOWLEDGEMENT
Work in the Strategic Research Cluster 'ITOBO' is funded by Science Foundation Ireland with additional contributions from 5 industry partners. Joern Ploennigs is a Feodor Lynen Fellow in Cork and wants to thank the Humboldt Foundation and the German BMBF for their support.
The authors thank Paul Stack, Luke Allan, Brian
Cahill, Civil Engineering UCC; Anika Schumann,
Cork Constraint Computation Centre; and Haithum
Elhadi, U.S. Telecom and Illinois Institute of Tech-
nology for their contribution to this research.
7 REFERENCES
Abellan, J., Cano, A., Masegosa, A. R., & Moral, S. 2007. A
Semi-Naive Bayes Classifier with Grouping of Cases. In K.
Mellouli (Ed.), 9th European Conference, ECSQARU (pp.
477-488). Hammamet, Tunisia: Springer.
Adomavicius, G., & Tuzhilin, A. 2002. Using data mining
methods to build customer profiles. IEEE Computer 34 (2):
74-82.
Ahmed, A., Menzel, K., Ploennigs, J., & Cahill, B. 2009. As-
pects of Multi-dimensional Data Analysis of Building Per-
formance Data Management. 16th European Group for In-
telligent Computing in Engineering International
Workshop. Berlin, Germany, accepted.
Apte, C., Liu, B., Pednault, E. P., & Smyth, P. 2002. Business
Application of Data Mining. Communications of the ACM
45 (8): 49-53.
Atzmüller, M. 2007. Knowledge-intensive Subgroup Mining:
Techniques for Automatic and Interactive Discovery. IOS
Press.
Augenbroe, G., Park, C. S. 2005. Quantification methods of
technical building performance. Building Research and In-
formation 33 (2): 159-72.
Berry, M. J., & Linoff, G. 2004. Data mining techniques: for
marketing, sales, and customer relationship management.
John Wiley and Sons.
Capehart, B. L., Turner, W. C., & Kennedy, W. J. 2008. Guide
to Energy Management. The Fairmont Press.
Crawley, D. B., Hand, J. W., Kummert, M., & Griffith, B. T.
2008. Contrasting the capabilities of building energy per-
formance simulation programs. Building and Environment
43 (4): 661-673 .
Dong, B., Cao, C., & Lee, S. E. 2005. Applying support vector
machines to predict building energy consumption in tropi-
cal region. Energy and Buildings 37 (5): 545-553.
ERI 2002. Environmental Research Institute. Cork, Ireland:
University College Cork, http://eri.ucc.ie.
Fernández, G. 2003. Data mining using SAS applications. CRC
Press.
Fielding, A. 2007. Cluster and classification techniques for the
biosciences. Cambridge University Press.
Figueiredo, V., Rodrigues, F., Vale, Z., & Gouveia, J. B. 2005. An electric energy consumer characterization framework based on data mining techniques. IEEE Transactions on Power Systems 20 (2): 596-602.
Giovinazzo, W. A. 2002. Internet-enabled business intelli-
gence. Prentice Hall PTR.
Haberstroh, R. 2008. Oracle Data Mining Tutorial for Oracle
Data Mining 11g Release 1. Oracle.
Han, J., & Kamber, M. 2006. Data mining: concepts and tech-
niques (2 ed.). Morgan Kaufmann.
Harinath, S., & Quinn, S. R. 2006. Professional SQL server
analysis services 2005 with MDX. John Wiley and Sons.
Huang, B., Cai, Z., Gu, Q., & Chen, C. 2008. Using Support
Vector Regression for Classification. 4th International Con-
ference on Advanced Data Mining and Applications (pp.
581-588). Chengdu, China: Springer.
ISO 7730:2005. Ergonomics of the thermal environment - Ana-
lytical determination and interpretation of thermal comfort
using calculation of the PMV and PPD indices and local
thermal comfort criteria.
ITOBO 2007. Information & Communication Technology for
Sustainable and Optimised Building Operation. Cork, Ire-
land: http://zuse.ucc.ie/itobo/.
Lane, P. 2007. Data Warehousing Guide, 11g Release 1 (11.1),
Oracle Data Base, Oracle.
Lang, R., Bruckner, D., Pratl, G., Velik, R., & Deutsch, T.
2007. Scenario recognition in modern building automation.
7th IFAC International Conference on Fieldbuses & Net-
works in Industrial & Embedded Systems, (pp. 305-312).
Ling, C. X., & Li, C. 1998. Data Mining for Direct Marketing:
Problems and Solutions. 4th International Conference on
Knowledge Discovery and Data Mining, (pp. 73-79).
Maimon, O. Z., & Rokach, L. 2005. Data mining and knowl-
edge discovery handbook. Springer Science & Business.
McCue, C. 2006. Data mining and predictive analysis: intelli-
gence gathering and crime analysis. Butterworth-
Heinemann.
Menzel, K., Pesch, D., O'Flynn, B., Keane, M., & O'Mathuna,
C. 2008. Towards a Wireless Sensor Platform for Energy
Efficient Building Operation. 12th International conference
on Computing in Civil and Building Engineering (pp. 381-
386). Beijing, China : Elsevier B.V.
Metz, B. 2007. IPCC Fourth Assessment Report on the mitiga-
tion of climate change for researchers, students, and poli-
cymakers. University Press.
Mihalakakou, G., Santamouris, M., & Tsangrassoulis, A. 2002.
On the energy consumption in residential buildings. Energy
and Buildings 34 (7): 727-736.
Morbitzer, C., Strachan, P., & Simpson, C. 2004. Data mining analysis of building simulation performance data. Building Services Engineering Research and Technology 35 (3): 253-267.
Moujalled, B., Cantin, R., & Guarracino, G. 2008. Comparison of thermal comfort algorithms in naturally ventilated office buildings. Energy and Buildings 40 (12): 2215-2223.
Nicol, F., Parsons, K. 2002, Special issue on thermal comfort
standards, Energy and Buildings 34 (6): 529-685.
Oracle. 2008. Oracle Data Mining Concepts. Oracle.
Perez-Iratxeta, C., Bork, P., & Andrade, M. A. 2002. Associa-
tion of genes to genetically inherited diseases using data
mining. Nature Genetics 31: 316-319.
Pfafferott, J. U., Herkel, S., Kalz, D. E., Zeuschner, A. 2007
Comparison of low-energy office buildings in summer us-
ing different thermal comfort criteria, Energy and Buildings
39 (7): 750-757.
Rob, P., Coronel, C., & Crockett, K. 2008. Database Systems: Design, Implementation and Management. Cengage Learning EMEA.
Stackowiak, R., Rayman, J., & Greenwald, R. 2007. Oracle
data warehousing and business intelligence solutions. John
Wiley and Sons.
Wang, X., & Huang, J. Z. 2006. A Cased-Based Data Mining
Platform. In G. J. Williams, & S. J. Simoff, A State of the
Art Survey, Data mining: theory, methodology, techniques,
and applications (pp. 28-39). Springer Science & Business.
Witten, I. H., & Frank, E. 2005. Data mining: practical ma-
chine learning tools and techniques (2 ed.). Morgan Kauf-
mann.
Wu, S., & Clements-Croome, D. 2007. Understanding the indoor environment through mining sensory data - A case study. Energy and Buildings 39 (11): 1183-1191.
Yao, R., Li, B., Liu, J. 2009. A theoretical adaptive model of
thermal comfort - Adaptive Predicted Mean Vote (aPMV),
Building and Environment 44 (10): 2089-2096.
In this work, we present a semi-naive Bayes classifier that searches for dependent attributes using different filter approaches. In order to avoid that the number of cases of the compound attributes be too high, a grouping procedure is applied each time after two variables are merged. This method tries to group two or more cases of the new variable into an unique value. In an emperical study, we show as this approach outperforms the naive Bayes classifier in a very robust way and reaches the performance of the Pazzani’s semi-naive Bayes [1] without the high cost of a wrapper search.