Available online at www.sciencedirect.com
ScienceDirect
Procedia Computer Science 177 (2020) 24–31
www.elsevier.com/locate/procedia
1877-0509 © 2020 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0)
Peer-review under responsibility of the Conference Program Chairs.
DOI: 10.1016/j.procs.2020.10.007
The 11th International Conference on Emerging Ubiquitous Systems and Pervasive Networks
(EUSPN 2020)
November 2-5, 2020, Madeira, Portugal
Using Smartphone Accelerometer for Human Physical Activity and
Context Recognition in-the-Wild
Muhammad Ehatisham-ul-Haqa, Muhammad Awais Azamb, Yusra Asima, Yasar Amina,
Usman Naeemc,*, and Asra Khalidd
aFaculty of Telecommunication and Information Engineering, University of Engineering and Technology, Taxila, Punjab, 47050, Pakistan
bWhitecliffe Technology, Wellington, New Zealand
cSchool of Electronic Engineering and Computer Science, Queen Mary University of London, United Kingdom
dSchool of Engineering & Computer Science, Victoria University of Wellington, New Zealand
Abstract
The adoption of smart devices is rising rapidly, and a new generation of smartphones has grown into an emerging platform
for personal computing, monitoring, and private data processing. Smartphone sensing allows collecting data from the user's
immediate environment and surroundings to recognize daily living activities and behavioral contexts. Although smartphone-based
activity recognition is now widespread, there is still a need for the simultaneous recognition of in-the-wild human physical
activities and their associated contexts. This research work proposes a two-level scheme for the in-the-wild recognition of human
physical activities and the corresponding contexts based on smartphone accelerometer data. Different classifiers are evaluated
experimentally, and the achieved results validate the effectiveness of the proposed scheme.
Keywords: activity recognition; context-aware sensing; context recognition; smart sensing; ubiquitous computing;
* Corresponding author: Usman Naeem. Tel.: +44-7882-6171
E-mail address: u.naeem@qmul.ac.uk
1. Introduction
Smartphones are ubiquitous, context-aware devices with ever-growing computing and sensing capabilities.
In recent years, with advances in smartphone manufacturing technologies, these devices have become a key
attraction for researchers [1]. They provide a gateway for human-centric computing and allow pervasive
monitoring of people's physical activities and daily routines [2]. As a result, sensor-based physical activity
recognition (PAR) has become indispensable in many application areas, such as context-awareness,
human-computer interaction, smart homes, healthcare, and security control. Many successful research studies
have been conducted in the area of PAR. However, in-the-wild PAR poses a particularly difficult challenge
because people's behavior is chaotic and changes across environments and settings. Moreover, human postures
and activity patterns vary with the environment. For example, a person may sit and behave formally in a
meeting, while his or her behavior and sitting posture may be more relaxed at home. Likewise, activity
patterns related to sitting, walking, eating, and talking may vary across contexts, e.g., at home or in a
workplace. Therefore, there is a need to model and recognize in-the-wild human physical activities with
respect to their contexts. The simultaneous recognition of human activities and their contexts can help avoid
an erroneous picture of the real world. Further, it can provide crucial improvements in context-aware
decision-making applications and recommender systems.
In existing research, many authors have worked on recognizing physical activities of daily living, such as
sitting, standing, walking, and talking, or behavioral contexts such as indoor and outdoor [3–5]. Vaizman et
al. [6] used multi-modal sensing to recognize human behavioral contexts (including the user's location, body
posture, and activity) based on a single-label-per-classifier approach. The authors in [7] recognized actions
and objects from videos and combined Hidden Markov Models (HMMs) with body context to categorize hand
actions. However, there is no established way to recognize human physical activities together with their
behavioral contexts so as to produce fine-grained information. Existing studies face many limitations
regarding the in-the-wild recognition of human activities and their behavioral contexts, as targeting a wide
range of human contexts for recognition is very challenging. Moreover, in-the-wild human behavior entails
significant unpredictability across settings and environmental contexts. For example, the activity of lying
down may occur in varying contexts, such as sleeping, surfing the internet, or watching TV, each of which may
affect its pattern unpredictably. For real-time PAR applications, it is necessary to address this variability
of activity patterns across contexts. It is also necessary to infer the user's context along with PAR to
better distinguish the human activities being conducted in various contexts. Existing research addresses a
few aspects of context recognition; however, there is a need to recognize detailed human context along with
physical activity.
In this study, we propose a two-level PAR model to recognize human physical activities along with the
associated behavioral contexts. The first level of the proposed model classifies four primary physical
activities: lying down, sitting, standing, and walking. The second level performs human context recognition
based on the activity recognized at the first level. By combining the outputs of both levels, the proposed
PAR scheme provides simultaneous recognition of human physical activities and their contexts. The goal of the
proposed model is to provide a cost-effective solution for effectively recognizing and classifying physical
activities and their associated contexts using only smartphone accelerometer data.
This research work offers the following significant contributions:
• Recognition of four physical activities (i.e., lying down, sitting, standing, and walking) in-the-wild based on
smartphone accelerometer data
• Recognition of activity-related human contexts based on physical activity pattern recognition
• Investigation and exploration of different machine learning classifiers for physical activity and context
recognition in-the-wild
The remainder of the paper is structured as follows. Section 2 reviews the related work for the proposed
scheme. Section 3 elaborates the steps involved in the proposed two-level model for recognizing human physical
activities and contexts. Section 4 provides the experimental results and discussion regarding the performance of
different machine learning classifiers for the proposed scheme. Finally, Section 5 summarizes this research work
and outlines directions for future work.
2. Related Works
The sensing and computing power of smartphones has attracted many researchers, who have made use of these
capabilities in a range of applications. In previous work, smartphones have most commonly been utilized for PAR
[8,9], and many researchers have also utilized these mobile devices for crowdsourcing [10], user authentication
[11], and context-awareness [12,13]. Xing et al. [8] presented a comprehensive survey of PAR using smartphone
sensors. L. Xu et al. [14] presented a PAR method based on wearable sensors, using a Random Forest classifier
for training and testing the system to achieve efficient recognition results. Recognition of people's daily
living activities also serves several purposes, such as understanding human behavior and monitoring health [4].
M. Mehedi et al. [15] proposed a deep learning-based method for recognizing twelve human activities using
smartphone sensors. Likewise, Sourav et al. [16] utilized a deep learning model, Restricted Boltzmann
Machines (RBM), for robust PAR.
With the improvement of ubiquitous sensing technologies, smartphones have also been used for context-awareness
in an extensive range of applications. Context-awareness has progressively developed as awareness of a person's
environment and background, which is very useful for a broad range of ubiquitous applications. A few
researchers have studied different aspects of human behavioral contexts, for example, outdoor or indoor, in a
meeting, or in a car [4,17]. Chaoyan et al. [18] proposed discriminative context models and extracted human
contextual information for recognizing mutual activities based on inter-class and intra-class labels. Their
experimental results demonstrate that their scheme is beneficial for robust PAR. Smartphone sensors
collectively enable observing the surroundings from different perspectives, which increases the validity of
activity and context inference tasks. Moreover, PAR with context modeling helps to infer valuable information
about an individual's lifestyle. As people habitually use smartphones, they also prefer to store confidential
information on these hand-held devices, and the risk of unauthorized access to private and confidential data
grows with increasing smartphone use. In this respect, to preserve mobile device security, smartphone-embedded
sensors have also been employed for phone authentication [11,19,20]. The main objective of these studies is to
recognize the phone user's context (such as the activity being performed) in order to identify and validate a
person.
Although the fusion of various sensing modalities can be useful for robust activity and context recognition,
in-the-wild PAR along with human context recognition remains a challenging task. Moreover, the use of
multi-modal heterogeneous sensors increases the computational cost of a system, which may make it unsuitable
for real-time applications. The rapid development and adoption of smartphones have drawn tremendous attention
to context-aware mobile applications. For instance, the accelerometer in a smartphone is used to sense the
device orientation and accordingly rotate the screen between portrait and landscape modes. Similarly, location
tracking using smartphone sensors has become very common in a wide range of mobile applications. Hence, there
is also a need to recognize human physical activities and their contexts for effective decision making in
real-time applications that can respond to changes in the user's activity and context. The proposed scheme
therefore aims to provide a viable solution for recognizing in-the-wild human physical activities and contexts
based on smartphone accelerometer data.
3. Proposed Method
The proposed scheme for recognizing human physical activities and the related contexts is based on a two-level
approach. At the first level, four primary physical activities are recognized; at the second level, the
corresponding contexts are recognized for each primary activity. The proposed method operates on accelerometer
data and consists of four steps: data acquisition and pre-processing, feature extraction, activity recognition,
and context recognition, as presented in Fig. 1.
Fig. 1. Steps involved in the proposed PAR model
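The two-level structure can be sketched as a minimal Python class, with scikit-learn's Random Forest as an illustrative stand-in for whichever classifier is used at each level. The class name, classifier choice, and data layout here are assumptions for illustration, not the authors' implementation: one classifier predicts the primary activity, and a separate per-activity classifier, trained only on windows of that activity, predicts the context.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

class TwoLevelPAR:
    """Sketch of a two-level activity/context recognizer.
    Level 1 predicts the primary activity; level 2 keeps one
    context classifier per activity, trained only on samples
    belonging to that activity."""

    def __init__(self, activities):
        self.level1 = RandomForestClassifier(n_estimators=50)
        self.level2 = {a: RandomForestClassifier(n_estimators=50)
                       for a in activities}

    def fit(self, X, y_activity, y_context):
        y_activity = np.asarray(y_activity)
        y_context = np.asarray(y_context)
        self.level1.fit(X, y_activity)           # level 1: activity
        for a, clf in self.level2.items():       # level 2: context
            mask = y_activity == a
            clf.fit(X[mask], y_context[mask])
        return self

    def predict(self, X):
        """Return (activity, context) pairs for each feature vector."""
        acts = self.level1.predict(X)
        return [(a, self.level2[a].predict(x.reshape(1, -1))[0])
                for a, x in zip(acts, np.asarray(X))]
```

Combining the two predictions per window yields the joint activity-and-context output described above.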
3.1. Data Acquisition
To conduct the experiments for recognizing physical activities and their corresponding contexts with the
proposed scheme, the publicly available "ExtraSensory" dataset is used [21]. This dataset was collected
in-the-wild from 60 participants and includes data for six primary activities of daily living: lying down,
sitting, walking, standing, running, and bicycling. Heterogeneous sensors were used for data collection;
however, in this study we use only the smartphone accelerometer data (at a 40 Hz sampling rate) for
recognition. In addition to primary activity labels, the dataset also contains various secondary context
labels (such as indoor, outdoor, at the workplace, in a meeting, talking, and shopping) for these primary
activities. In the proposed scheme, we focus only on the data related to four physical activities (i.e.,
lying down, sitting, standing, and walking). The bicycling and running activities are excluded, as they do
not carry secondary context labels usable by the proposed scheme. The raw accelerometer data may incorporate
different types of noise, so motion signals are typically pre-processed before classification. In this
regard, an average smoothing filter of order three is applied to each axis of the signal to remove noise. A
non-overlapping window of 20 seconds is then used for data segmentation.
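The pre-processing above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming "order three" means a 3-point moving average; at 40 Hz, a 20-second non-overlapping window corresponds to 800 samples.

```python
import numpy as np

def smooth(axis_signal, order=3):
    """Moving-average smoothing filter of the given order,
    applied to one axis of the accelerometer signal."""
    kernel = np.ones(order) / order
    return np.convolve(axis_signal, kernel, mode="same")

def segment(data, fs=40, window_s=20):
    """Split a (n_samples, 3) recording into non-overlapping
    windows of window_s seconds; trailing samples are dropped."""
    win = fs * window_s                       # 40 Hz * 20 s = 800 samples
    n = data.shape[0] // win
    return data[: n * win].reshape(n, win, data.shape[1])

# Example on 60 s of synthetic 3-axis data at 40 Hz
acc = np.random.randn(2400, 3)
acc = np.column_stack([smooth(acc[:, i]) for i in range(3)])
windows = segment(acc)                        # shape (3, 800, 3)
```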
3.2. Feature Extraction
Feature extraction is typically performed on the data segments obtained from a sliding window. In this study,
twenty time-domain features are extracted from the pre-processed data segments to recognize the physical
activities and the corresponding contexts. These features include dominant tendency measures (such as signal
entropy and energy, arithmetic mean, maximum/minimum signal latency, maximum/minimum signal amplitude,
peak-to-peak amplitude/time/slope, and latency-amplitude ratio), dispersion measures (i.e., standard deviation,
kurtosis, variance, and skewness), signal percentiles (i.e., 25th, 50th, and 75th), and signal differences
(first and second difference). Existing studies on feature extraction for sensor-based PAR demonstrate that
these features improve recognition performance. Hence, these features are extracted from each axis of the 3D
accelerometer data, producing a final feature vector of size 1 × 60.
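As an illustration, a subset of the listed features can be computed per axis and concatenated across the three axes. The sketch below (NumPy/SciPy assumed) implements 13 of the 20 features; with the paper's full set the concatenated vector would have length 60.

```python
import numpy as np
from scipy.stats import kurtosis, skew

def axis_features(x):
    """A 13-feature subset of the paper's 20 time-domain
    features, computed for one accelerometer axis."""
    d1 = np.diff(x)            # first difference
    d2 = np.diff(x, n=2)       # second difference
    return np.array([
        np.mean(x), np.std(x), np.var(x),       # tendency / dispersion
        kurtosis(x), skew(x),
        np.max(x), np.min(x),                   # max/min amplitude
        np.ptp(x),                              # peak-to-peak amplitude
        np.percentile(x, 25),                   # signal percentiles
        np.percentile(x, 50),
        np.percentile(x, 75),
        np.mean(np.abs(d1)),                    # signal differences
        np.mean(np.abs(d2)),
    ])

def window_features(window):
    """Concatenate per-axis features of a (n_samples, 3) window."""
    return np.concatenate([axis_features(window[:, i]) for i in range(3)])

vec = window_features(np.random.randn(800, 3))
# vec has shape (39,) for this 13-feature subset;
# the paper's full 20-feature set yields (60,).
```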
3.3. Activity Recognition
PAR is typically treated as a multiclass machine learning problem. In existing research, different classifiers
have been used to accurately determine a specific set of physical activities. In line with state-of-the-art
studies, we evaluate the proposed PAR scheme with a series of classifiers: Random Forest (RF), K-Nearest
Neighbors (KNN), Bagging (BAG), and Decision Tree (J48). The first level of the proposed two-level model
performs PAR with the selected classifiers. For this purpose, each classifier is trained individually to learn
and recognize the four selected physical activities (lying down, sitting, standing, and walking) using the
extracted feature vectors.
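A comparison of the four classifier families named above could be sketched with scikit-learn as follows. This is a hedged illustration, not the authors' setup: J48 is approximated by scikit-learn's CART-style DecisionTreeClassifier, and the random arrays stand in for the real 1 × 60 feature vectors and activity labels from the ExtraSensory dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, BaggingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

classifiers = {
    "RF":  RandomForestClassifier(n_estimators=100),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "BAG": BaggingClassifier(),
    "J48": DecisionTreeClassifier(),   # scikit-learn stand-in for J48
}

# Placeholder data: (n_windows, 60) feature vectors and activity labels
X = np.random.randn(400, 60)
y = np.array(["lying down", "sitting", "standing", "walking"] * 100)

# 5-fold cross-validated accuracy for each classifier
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```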
Muhammad Ehatisham-ul-Haq et al. / Procedia Computer Science 177 (2020) 24–31 27
Muhammad Ehatisham-ul-Haq et. al / Procedia Computer Science 00 (2018) 000–000 3
different machine learning classifiers for the proposed scheme. Finally, Section 5 summarizes this research work
and provides some highlights for future work.
2. Related Works
Smartphone sensing and computing powers have attracted various researchers, who have made use of these
capabilities for a range of applications. In previous works, smartphones have been most commonly utilized for PAR
[8,9], and many researchers utilized these mobile devices for crowdsourcing [10], user authentication [11], and
context-awareness [12,13]. Xing et al. [8] proposed a comprehensive survey report on PAR using smartphone
sensors. L. Xu et al. [14] presented a PAR method based on wearable sensors, where they used Random Forest
classifier for training and testing of the system to achieve efficient recognition results. People daily living activity
recognition is also used for several purposes, such as realizing human behavior and monitoring health [4]. M.
Mehedi et al. [15] proposed a deep learning-based method for recognizing twelve (12) human activities using
smartphone sensors. Likewise, Sourav et al. [16] utilized the deep learning model, i.e., Restricted Boltzmann
Machines (RBM), for robust PAR.
With the improvement in ubiquitous sensing technologies, smartphones have also been used for context-
awareness in an extensive range of applications. Context-awareness has developed progressively as awareness about
the person's environment/background, which is very useful for a broad range of ubiquitous applications. A few
researchers have studied different aspects of human behavioral contexts, for example, outside or indoor, in a
meeting or in a car, etc. [4,17]. Chaoyan et al. [18] proposed discriminative context models and extracted human
contextual information for recognition of mutual activities based on inter-class and intra-class labels. Their
experimental results demonstrate that their proposed scheme is beneficial for robust PAR. Smartphone sensors
collectively enable observing the surroundings from different perspectives, which serves to increase the validity of
activity and context inference tasks. Moreover, PAR with context modeling helps to infer valuable information
about the individuals' lifestyle. As people are habitual to use smartphones, hence they also prefer to store their
confidential information on these hand-held devices. The risk of unauthorized access to private and confidential data
is getting higher with the increasing use of smartphones. In this aspect, for preserving mobile device security, the
smartphone-embedded sensors are also employed for phone authentication [11,19,20]. The main objective of these
studies is to recognize the phone user's context (such as activity being performed) for identifying and validating a
person.
Although the fusion of various sensing modalities can be useful for robust activity and context recognition,
in-the-wild PAR, along with human context recognition, remains a challenging task. Moreover, the use of multi-modal
heterogeneous sensors increases the computational cost of a system, which may make it unsuitable for real-time
applications. The rapid development and adoption of smartphones have drawn tremendous attention to
context-aware mobile applications. For instance, the accelerometer in a smartphone senses the device orientation
and rotates the screen between portrait and landscape modes accordingly. Similarly, location tracking using
smartphone sensors has become common across a wide range of mobile applications. Hence, there is also a need to
recognize human physical activities and their contexts for effective decision making in real-time applications that
can respond to changes in the user's activity and context. Therefore,
the proposed scheme aims to provide a viable solution for recognizing in-the-wild human physical activities and
contexts based on the smartphone accelerometer data.
3. Proposed Method
The proposed scheme for recognizing human physical activities and the related contexts is based on a two-level
approach. At the first level, four primary physical activities are recognized, whereas, at the second level, the
corresponding contexts are recognized for each primary activity. The proposed method is based on the
accelerometer data and consists of four steps, i.e., data acquisition and pre-processing, feature extraction, activity
recognition, and context recognition, as presented in Fig. 1.
4 Muhammad Ehatisham-ul-Haq et. al / Procedia Computer Science 00 (2020) 000–000
Fig. 1. Steps involved in the proposed PAR model
3.1. Data Acquisition
To conduct the trials for recognizing physical activities and their corresponding contexts using the proposed
scheme, the publicly available "ExtraSensory" dataset is used [21]. This dataset was collected in-the-wild from 60
participants and includes data corresponding to six (06) primary activities of daily living, i.e., lying down, sitting,
walking, standing, running, and bicycling. Heterogeneous sensors were used for the data collection experiments.
However, in our proposed study, we only used the smartphone accelerometer data (sampled at 40 Hz) for
recognition purposes. In addition to primary activity labels, the dataset also contains various secondary context
labels (such as indoor, outdoor, at the workplace, in a meeting, talking, and shopping) for these primary
activities. In our proposed scheme, we only focused on the data related to four physical activities (i.e., lying down,
sitting, standing, and walking). The activities of bicycling and running are excluded, as they do not carry any
secondary context labels that could be used in the proposed scheme. The raw data from the accelerometer may
incorporate different types of noise, so motion signals are typically pre-processed before classification. In this
regard, a third-order moving-average smoothing filter is used to remove noise from each axis of the signal data. A
non-overlapping window of 20 seconds is used for data segmentation.
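The pre-processing steps above can be sketched as follows. This is a minimal illustration in Python/NumPy: the filter order, sampling rate, and window length follow the text, while the function and variable names are our own and not part of the original implementation.

```python
import numpy as np

def smooth(signal, order=3):
    """Apply a third-order moving-average filter to one signal axis."""
    kernel = np.ones(order) / order
    return np.convolve(signal, kernel, mode="same")

def segment(data, fs=40, window_s=20):
    """Split 3-axis data (N x 3) into non-overlapping 20-second windows."""
    win = fs * window_s                      # 800 samples per window at 40 Hz
    n_windows = data.shape[0] // win         # drop the incomplete tail
    return data[: n_windows * win].reshape(n_windows, win, data.shape[1])

# Example: 60 s of synthetic 3-axis accelerometer data at 40 Hz
raw = np.random.randn(60 * 40, 3)
filtered = np.column_stack([smooth(raw[:, i]) for i in range(3)])
windows = segment(filtered)                  # shape: (3, 800, 3)
```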
3.2. Feature Extraction
Feature extraction is performed on the data segments obtained through windowing. In this study, twenty
(20) time-domain features are extracted from the pre-processed data segments to recognize the physical activities
and the corresponding contexts. These features include dominant tendency measures (such as signal entropy and
energy, arithmetic mean, maximum/minimum signal latency, maximum/minimum signal amplitude, peak-to-peak
amplitude/time/slope, and latency-amplitude ratio), dispersion measures (i.e., standard deviation, kurtosis, variance,
and skewness), signal percentiles (i.e., 25th, 50th, and 75th), and signal differences (first and second difference).
Existing studies on feature extraction for sensor-based PAR demonstrate that these features improve recognition
performance. Hence, these features are extracted from each axis of the 3D accelerometer data, producing a final
feature vector of size 1 × 60.
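As an illustration, a subset of these per-axis features can be computed as below (NumPy/SciPy). Only ten of the twenty features are shown per axis, so the concatenated vector here is 1 × 30; with all twenty features it would be the 1 × 60 vector described above. The function names are ours.

```python
import numpy as np
from scipy import stats

def axis_features(x):
    """A subset of the time-domain features listed in the text, for one axis."""
    return np.array([
        x.mean(),                       # arithmetic mean
        x.std(),                        # standard deviation
        x.var(),                        # variance
        stats.skew(x),                  # skewness
        stats.kurtosis(x),              # kurtosis
        x.max() - x.min(),              # peak-to-peak amplitude
        np.percentile(x, 25),           # 25th percentile
        np.percentile(x, 50),           # 50th percentile
        np.percentile(x, 75),           # 75th percentile
        np.abs(np.diff(x)).mean(),      # mean absolute first difference
    ])

def window_features(window):
    """Concatenate per-axis features of one (samples x 3) window."""
    return np.concatenate([axis_features(window[:, i]) for i in range(3)])

vec = window_features(np.random.randn(800, 3))   # 10 features x 3 axes
```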
3.3. Activity Recognition
PAR is typically treated as a multiclass machine learning problem. In existing research, different classifiers
have been used to accurately recognize a specific set of physical activities. In line with state-of-the-art studies,
we used a series of different classifiers to evaluate the proposed scheme for PAR, including Random Forest
(RF), K-Nearest Neighbors (KNN), Bagging (BAG), and Decision Tree (J48). The first step of the proposed two-
level model performs PAR based on the selected classifiers. For this purpose, these classifiers are trained
individually to learn and recognize the four selected physical activities using the set of twenty extracted features.
The recognized activities include lying down, sitting, standing, and walking.
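The first-level training can be sketched with scikit-learn as a stand-in for these classifiers. The data, shapes, and labels below are hypothetical, and DecisionTreeClassifier only approximates Weka's J48 (C4.5); this is not the paper's implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, BaggingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Hypothetical feature matrix (windows x 60 features) and activity labels
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 60))
y = rng.integers(0, 4, size=200)   # 0: lying down, 1: sitting, 2: standing, 3: walking

classifiers = {
    "RF": RandomForestClassifier(),
    "KNN": KNeighborsClassifier(),
    "BAG": BaggingClassifier(),
    "J48": DecisionTreeClassifier(),  # scikit-learn analogue of Weka's J48 (C4.5)
}
for name, clf in classifiers.items():
    clf.fit(X, y)                     # train each classifier individually
    preds = clf.predict(X)
```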
3.4. Context Recognition
After PAR, the last step of the proposed model is to recognize the activity context. Context recognition in-the-
wild is generally more challenging than PAR because of the diversity in the sensor data, even in the same
settings. The "ExtraSensory" dataset provides various context labels as supplementary data for each primary
activity selected for recognition in this study. The second level of the proposed model recognizes the activity
context based on the set of extracted features and the activity label inferred at the first level. For example, the
indoor or outdoor context is recognized at the second level once the activity of walking has been recognized at the
first level. The list of primary activities and their associated contexts recognized in this study is presented in Table
1. These specific contexts are paired with the selected physical activities because of their high frequencies of co-
occurrence. For context recognition, all the selected classifiers are trained and tested separately on the data of each
primary physical activity.
Table 1: List of primary activities and corresponding behavioral contexts used for recognition purposes

Primary Activity   Contexts
Sitting            In a meeting, In a car, Watching TV, Surfing the internet
Standing           Indoor, Outdoor
Lying Down         Surfing the internet, Sleeping, Watching TV
Walking            Indoor, Outdoor, Shopping, Talking
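The two-level scheme, i.e., one activity classifier followed by a per-activity context classifier, can be sketched as follows. The data and label encodings are hypothetical, and Random Forest is used for both levels purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical data: 60-dim feature vectors, activity labels, context labels
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 60))
activity = rng.integers(0, 4, size=400)   # first-level labels (4 activities)
context = rng.integers(0, 2, size=400)    # e.g., 0: indoor, 1: outdoor

# Level 1: one classifier over all activities
level1 = RandomForestClassifier().fit(X, activity)

# Level 2: one context classifier per activity, trained only on that activity's data
level2 = {}
for a in range(4):
    mask = activity == a
    level2[a] = RandomForestClassifier().fit(X[mask], context[mask])

def predict(x):
    """Infer the activity first, then the context conditioned on it."""
    a = level1.predict(x.reshape(1, -1))[0]
    c = level2[a].predict(x.reshape(1, -1))[0]
    return a, c

a, c = predict(X[0])
```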
4. Experimental Results
The principal objective of this research work is to recognize the activity context along with the physical activity,
to better analyze the changing physical activity patterns in different contexts. To this end, we trained our proposed
model using a 10-fold cross-validation scheme. The metrics used for estimating the performance of our proposed
model are accuracy, precision, recall, and F-measure. The results of the first stage, i.e., the recognition of the four
physical activities (sitting, standing, lying down, and walking), are presented in Table 2. Based on the average
values of the performance measures in Table 2, the RF classifier performs better than the other selected classifiers.
The maximum average accuracy of 87.75% is achieved for PAR using the RF classifier, which demonstrates its
superiority over the other classifiers.
Table 2: Average performance measure values of the selected classifiers for physical activity recognition (PAR)

Classifier   Accuracy   Precision   Recall   F-Measure
RF           0.877      0.877       0.878    0.876
BAG          0.838      0.836       0.839    0.836
J48          0.808      0.806       0.808    0.807
KNN          0.610      0.603       0.610    0.607
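The 10-fold cross-validation protocol with these four metrics can be reproduced in spirit with scikit-learn's `cross_validate` (shown here on hypothetical data, not the ExtraSensory features):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 60))        # hypothetical 1 x 60 feature vectors
y = rng.integers(0, 4, size=300)      # four activity classes

# Accuracy, precision, recall, and F-measure, macro-averaged over classes
scoring = ["accuracy", "precision_macro", "recall_macro", "f1_macro"]
scores = cross_validate(RandomForestClassifier(), X, y, cv=10, scoring=scoring)
accuracy = scores["test_accuracy"].mean()   # average over the 10 folds
```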
The results for second-level context recognition are presented in Table 3, where the four classifiers are compared
based on the average values of the selected performance metrics. Fig. 2 provides a graphical comparison of the
accuracy achieved for context recognition for the four physical activities using the RF, BAG, J48, and KNN
classifiers. The results reported in Table 3 and Fig. 2 show that the RF classifier provides the best performance for
context recognition for each activity. The best average context recognition accuracy achieved by the proposed
scheme for the lying down, sitting, standing, and walking activities is 97.76%, 86.90%, 95.28%, and 71.40%,
respectively, using the RF classifier. Based on the results in Table 3, it can be concluded that recognizing the
activity contexts of walking and sitting is much harder than those of lying down and standing. This is because some
behavioral contexts strongly affect physical activity patterns. For example, walking patterns are more affected by
varying contexts such as indoor, outdoor, shopping, and talking. Hence, the context recognition accuracy in this
case is lower than for the other activities and their corresponding contexts. In contrast, the standing activity
pattern/posture is not much influenced by changes in the specified contexts, as standing is a static activity.
Moreover, the RF classifier provides the best results for the proposed model, whereas the KNN classifier achieves
the worst results.
Table 3: Performance measures of the selected classifiers for context recognition based on four selected physical activities

Contexts              Classifier   Accuracy   Precision   Recall   F-Measure
Lying Down Contexts   RF           0.977      0.977       0.978    0.975
                      BAG          0.971      0.970       0.972    0.968
                      J48          0.965      0.964       0.966    0.965
                      KNN          0.930      0.926       0.931    0.928
Sitting Contexts      RF           0.869      0.870       0.869    0.869
                      BAG          0.836      0.837       0.836    0.836
                      J48          0.797      0.798       0.798    0.798
                      KNN          0.530      0.533       0.532    0.531
Standing Contexts     RF           0.952      0.951       0.953    0.942
                      BAG          0.945      0.940       0.946    0.931
                      J48          0.935      0.926       0.935    0.930
                      KNN          0.906      0.903       0.907    0.905
Walking Contexts      RF           0.714      0.715       0.714    0.705
                      BAG          0.663      0.659       0.663    0.653
                      J48          0.604      0.603       0.604    0.604
                      KNN          0.518      0.516       0.519    0.517
Fig. 2: Comparison of the average accuracy achieved for context recognition based on four different physical activities using the RF, BAG, J48,
and KNN classifiers.
Fig. 3 presents the confusion matrices for the individual behavioral context recognition results pertaining to the
four selected physical activities. These confusion matrices are obtained using the RF classifier. The confusion
matrices in Fig. 3(a)-(d) show that some contexts are misclassified as each other, even though they relate to the
same activity class. However, for each context, the number of correctly classified instances is still very high.
Hence, the proposed scheme can effectively recognize the physical activities and their associated behavioral contexts.
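Per-context confusion matrices like those in Fig. 3 can be produced with scikit-learn; the sketch below uses hypothetical predictions rather than the paper's data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical true vs. predicted context labels for the walking activity
# (C1: indoor, C2: outdoor, C3: shopping, C4: talking)
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3])
y_pred = np.array([0, 1, 1, 1, 2, 3, 3, 3])
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2, 3])
# Rows are true contexts, columns are predicted contexts;
# the diagonal holds the correctly classified instances.
```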
Fig. 3. Confusion matrices for context recognition results (obtained using Random Forest classifier) based on (a) lying down, where the labels
C1-C3 represent the contexts surfing the internet, sleeping, and watching TV, respectively; (b) sitting, where the labels C1-C4 represent the
contexts in a meeting, in a car, watching TV, and surfing the internet, respectively; (c) standing, where the labels C1 and C2 represent the
contexts indoor and outdoor, respectively; (d) walking, where the labels C1-C4 represent the contexts indoor, outdoor, shopping, and talking,
respectively.
5. Conclusions
In this study, we concentrate on recognizing physical activities along with their corresponding behavioral
contexts in-the-wild using the accelerometer sensor of a smartphone. Twenty time-domain features are extracted
from the raw accelerometer data. Four classifiers (Random Forest, Decision Tree, Bagging, and K-Nearest
Neighbors) are trained on the extracted features to recognize four physical activities and nine (09) associated
contexts overall. The best performance for the proposed scheme is achieved using the Random Forest classifier.
Moreover, based on the achieved recognition results, it is concluded that the activity of walking and its associated
contexts are hard to recognize in-the-wild, whereas the standing activity is easier to recognize. This study can be
extended to incorporate more behavioral contexts and physical activities in-the-wild. For this purpose, multiple
sensing modalities can be used to improve the recognition accuracy of the system. Moreover, people's behavior can
also be investigated in varying contexts, which can lead to abnormal behavior detection and recognition as well.
References
1. Stisen A, Blunck H, Bhattacharya S, Prentow TS, Kjærgaard MB, Dey A, et al. Smart Devices are Different: Assessing and
Mitigating Mobile Sensing Heterogeneities for Activity Recognition with Real-World HAR Dataset. Proc 13th ACM Conf Embed
Networked Sens Syst - SenSys ’15 [Internet]. 2015;127–40. Available from: http://dl.acm.org/citation.cfm?doid=2809695.2809718
2. Morales J, Akopian D. Physical activity recognition by smartphones, a survey. Biocybern Biomed Eng. 2017;37(3):388–400.
3. Anguita D, Ghio A, Oneto L, Parra X, Reyes-Ortiz JL. Human activity recognition on smartphones using a multiclass hardware-friendly
support vector machine. Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics). 2012;7657
LNCS:216–23.
4. Wannenburg J, Malekian R. Physical Activity Recognition from Smartphone Accelerometer Data for User Context Awareness Sensing.
IEEE Trans Syst Man, Cybern Syst. 2017;47(12):3143–9.
5. Vaizman Y, Ellis K, Lanckriet G. Recognizing detailed human context in the wild from smartphones and smartwatches. IEEE Pervasive
Comput. 2017;16(4):62–74.
6. Vaizman Y, Weibel N. Context Recognition In-the-Wild: Unified Model for Multi-Modal Sensors and Multi-Label Classification. Proc
ACM Interact Mob Wearable Ubiquitous Technol 1, 4, Artic [Internet]. 2017;168:22. Available from: https://doi.org/10.1145/3161192
7. Moore DJ, Essa IA, Hayes MH. Exploiting human actions and object context for recognition tasks. In 2008. p. 80–6 vol.1.
8. Su X, Tong H, Ji P. Activity recognition with smartphone sensors. Tsinghua Sci Technol. 2014;19(3):235–49.
9. Shoaib M, Bosch S, Incel O, Scholten H, Havinga P. A Survey of Online Activity Recognition Using Mobile Phones. Sensors [Internet].
2015;15(1):2059–85. Available from: http://www.mdpi.com/1424-8220/15/1/2059/
10. Consolvo S, McDonald DW, Toscos T, Chen MY, Froehlich J, Harrison B, et al. Activity Sensing in the Wild: A Field Trial of UbiFit
Garden. In: Chi 2008: 26th Annual Chi Conference on Human Factors in Computing Systems Vols 1 and 2, Conference Proceedings. 2008.
p. 1797–806.
11. Ehatisham-ul-haq M, Awais M, Naeem U, Amin Y, Loo J. Continuous authentication of smartphone users based on activity pattern
recognition using passive mobile sensing. J Netw Comput Appl [Internet]. 2018;109(March):24–35. Available from:
https://doi.org/10.1016/j.jnca.2018.02.020
12. Wang H, Sen S, Elgohary A. No need to war-drive: Unsupervised indoor localization. ACM MobiSys. 2012;197–210.
13. Matellan V. Context Awareness in Shared Human-Robot Environments: Benefits of Environment Acoustic Recognition for User Activity
Classification. IET Conf Proc. 2017.
14. Xu L, Yang W, Cao Y, Li Q. Human activity recognition based on random forests. ICNC-FSKD 2017 - 13th Int Conf Nat Comput Fuzzy
Syst Knowl Discov. 2018;548–53.
15. Hassan MM, Uddin MZ, Mohamed A, Almogren A. A robust human activity recognition system using smartphone sensors and deep
learning. Futur Gener Comput Syst. 2018;81:307–13.
16. Bhattacharya S, Lane ND. From smart to deep: Robust activity recognition on smartwatches using deep learning. 2016 IEEE Int Conf
Pervasive Comput Commun Work PerCom Work 2016. 2016;
17. Saeedi S, Moussa A, El-Sheimy N. Context-aware personal navigation using embedded sensor fusion in smartphones. Sensors
(Switzerland). 2014;14(4):5742–67.
18. Zhao C, Fu W, Wang J, Bai X, Liu Q, Lu H. Discriminative context models for collective activity recognition. In: Proceedings -
International Conference on Pattern Recognition. 2014. p. 648–53.
19. Ehatisham-ul-Haq M, Azam MA, Naeem U, Rehman SU, Khalid A. Identifying Smartphone Users based on their Activity Patterns via
Mobile Sensing. In: Procedia Computer Science. 2017. p. 202–9.
20. Alqarni MA, Chauhdary SH, Malik MN, Ehatisham-ul-Haq M, Azam MA. Identifying smartphone users based on how they interact with
their phones. Human-centric Comput Inf Sci. 2020;10(1).
21. Vaizman Y, Ellis K, Lanckriet G, Weibel N. ExtraSensory App: Data Collection In-the-Wild with Rich User Interface to Self-Report
Behavior. Proc CHI. 2018;1–12.