Fog Computing in Medical Internet-of-Things: Architecture, Implementation, and Applications

June 2017

In book: Handbook of Large-Scale Distributed Computing in Smart Healthcare (pp.11-25)
Publisher: Springer

Authors:

Harishchandra Dubey

Microsoft

Nicholas P Constant

EchoWear

Mohammadreza Abtahi

University of Rhode Island

Abstract. In an era when the "Internet of Things (IoT)" market segment tops the charts in various business reports, the field of medicine is expected to gain a large benefit from the explosion of wearable and internet-connected sensors that surround us to acquire and communicate unprecedented data on symptoms, medication, food intake, and daily life activities impacting one's health and wellbeing. However, IoT-driven healthcare will have to overcome many barriers: 1) There is an increasing demand for data storage on cloud servers, where the analysis of medical big data becomes increasingly complex. 2) The data, when communicated, are vulnerable to security and privacy issues. 3) The communication of continuously collected data is not only costly but also energy hungry. 4) Operating and maintaining the sensors directly from the cloud servers are non-trivial tasks. This book chapter defines Fog Computing (FC) in the context of medical IoT. Conceptually, FC is a service-oriented intermediate layer in IoT, providing the interfaces between the sensors and cloud servers for facilitating connectivity, data transfer, and a queryable local database. The centerpiece of FC is a low-power, intelligent, wireless, embedded computing node that carries out signal conditioning and data analytics on raw data collected from wearables or other medical sensors and offers efficient means to serve telehealth interventions. We implemented and tested an FC system using the Intel Edison and Raspberry Pi that allows acquisition, computing, storage, and communication of various medical data such as the Phonocardiogram (PCG) signal for heart rate estimation and motion sensing data from Smart Gloves.
The book chapter ends with experiments and results showing how FC could lessen the obstacles of existing cloud-driven medical IoT solutions and enhance the overall performance of the system in terms of computing intelligence, transmission, storage, configurability, and security. The case studies show that the proposed Fog architecture could be used for enhancement, processing and analysis of various types of bio-signals. Keywords: Big Data, Body Area Network, Body Sensor Network, Edge Computing, Fog Computing, Medical Cyberphysical Systems, Medical Internet-of-Things, Telecare, Tele-treatment, Wearable Devices.

Fog Computing in Medical Internet-of-Things: Architecture, Implementation, and Applications

Harishchandra Dubey (1,3,5,*), Admir Monteiro (1,3), Nicholas Constant (1,3), Mohammadreza Abtahi (1,3), Debanjan Borthakur (1,3), Leslie Mahler (2), Yan Sun (1), Qing Yang (1), Umer Akbar (4), and Kunal Mankodiya (1,3)

(1) Department of Electrical, Computer, and Biomedical Engineering, University of Rhode Island, RI-02881, USA

(2) Department of Communicative Disorders, University of Rhode Island, RI-02881, USA

(3) Wearable Biosensing Lab, University of Rhode Island, RI-02881, USA

(4) Movement Disorders Program, Rhode Island Hospital, RI-02903, USA

(5) Center for Robust Speech Systems, University of Texas at Dallas, TX-75080, USA

kunalm@uri.edu

Abstract. In an era when the Internet of Things (IoT) market segment tops the charts in various business reports, the field of medicine is expected to gain a large benefit from the explosion of wearables and internet-connected sensors that surround us to acquire and communicate unprecedented data on symptoms, medication, food intake, and daily-life activities impacting one's health and wellness. However, IoT-driven healthcare will have to overcome many barriers, such as: 1) there is an increasing demand for data storage on cloud servers, where the analysis of medical big data becomes increasingly complex; 2) the data, when communicated, are vulnerable to security and privacy issues; 3) the communication of continuously collected data is not only costly but also energy hungry; 4) operating and maintaining the sensors directly from the cloud servers are non-trivial tasks.

This book chapter defines Fog Computing in the context of medical IoT. Conceptually, Fog Computing is a service-oriented intermediate layer in IoT, providing the interfaces between the sensors and cloud servers for facilitating connectivity, data transfer, and a queryable local database. The centerpiece of Fog computing is a low-power, intelligent, wireless, embedded computing node that carries out signal conditioning and data analytics on raw data collected from wearables or other medical sensors and offers efficient means to serve telehealth interventions. We implemented and tested a fog computing system using the Intel Edison and Raspberry Pi that allows acquisition, computing, storage, and communication of various medical data such as pathological speech data of individuals with speech disorders, the Phonocardiogram (PCG) signal for heart rate estimation, and Electrocardiogram (ECG)-based Q, R, S detection. The book chapter ends with experiments and results showing how fog computing could lessen the obstacles of existing cloud-driven medical IoT solutions and enhance the overall performance of the system in terms of computing intelligence, transmission, storage, configurability, and security. The case studies on various types of physiological data show that the proposed Fog architecture could be used for signal enhancement, processing, and analysis of various types of bio-signals.

Keywords: Big Data, Body Area Network, Body Sensor Network, Edge Computing, Fog Computing, Medical Cyber-physical Systems, Medical Internet-of-Things, Telecare, Tele-treatment, Wearable Devices.

1 Introduction

The recent advances in the Internet of Things (IoT) and the growing use of wearables for the collection of physiological data and bio-signals have led to the emergence of new distributed computing paradigms that combine wearable devices with the medical Internet of Things for scalable remote tele-treatment and telecare [15,38,18]. Such systems are useful for wellness and fitness monitoring, preliminary diagnosis, and long-term tracking of patients with acute disorders. The use of Fog computing reduces the logistics requirements and cuts down the associated medicine and treatment costs (see Figure 1). Fog computing has also found emerging applications in other domains, such as geo-spatial data associated with various healthcare issues [8].

This book chapter highlights the recent advancements and associated challenges in employing the wearable internet of things (wIoT) and body sensor networks (BSNs) for healthcare applications. We present the research conducted in the Wearable Biosensing Lab and other research groups at the University of Rhode Island. We developed prototypes using Raspberry Pi and Intel Edison embedded boards and conducted case studies on three healthcare scenarios: (1) speech tele-treatment of patients with Parkinson's disease; (2) Electrocardiogram (ECG) monitoring; (3) Phonocardiography (PCG) for heart rate estimation. This book chapter extends the methods and systems published in our earlier conference papers by adding novel system changes and algorithms for robust estimation of clinical features.

Fig. 1. Fog computing as an intermediate computing layer between edge devices (wearables) and cloud (backend). The Fog computer enhances the overall efficiency by providing computing near the edge devices. Such frameworks are useful for wearables (employed for healthcare, fitness, and wellness tracking), smart grids, smart cities, ambient-assisted living, etc.

* This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by the authors or by the respective copyright holders. The original citation of this book chapter is: H. Dubey, N. Constant, M. Abtahi, A. Monteiro, D. Borthakur, L. Mahler, Y. Sun, Q. Yang, U. Akbar, K. Mankodiya, "Fog Computing in Medical Internet-of-Things: Architecture, Implementation, and Applications", Chapter in Handbook of Large-Scale Distributed Computing in Smart Healthcare (2017), Springer International Publishing AG, S.U. Khan et al. (eds.), Scalable Computing and Communications, DOI 10.1007/978-3-319-58280-1_11.

This chapter makes the following contributions to the area of Fog Computing for the Medical Internet-of-Things:

– Fog Hardware: Intel Edison and Raspberry Pi were leveraged to formulate two prototype architectures. Both architectures can be used for each of the three case studies mentioned above.

– Edge Computing of Clinical Features: The Fog devices executed a variety of algorithms to extract clinical features and performed primary diagnosis using data collected from wearable sensors.

– Interoperability: We designed frontend apps for the body sensor network, such as an Android app for a smartwatch [26] and a PPG wrist-band, and a backend cloud infrastructure for long-term storage. In addition, procedures for the transfer, communication, authentication, storage, and execution of data were implemented in the Fog computer.

– Security: To ensure security and data privacy, we built an encrypted server that handles user authentication and the associated privileges. The rule-based authentication scheme is also a novel contribution of this chapter: only individuals with privileges (such as clinicians) can access the associated data from the patients.

– Case Study on Fog Computing for Medical IoT-based Tele-treatment and Monitoring: We conducted three case studies: (1) speech tele-treatment of patients with Parkinson's disease; (2) Electrocardiogram (ECG) monitoring; (3) Phonocardiography (PCG) for heart rate estimation. Although we conducted validation experiments on only three types of healthcare data, the proposed Fog architecture could be used for the analysis of other bio-signals and healthcare data as well.

– Android API for Wearable (Smartwatch-based) Audio Data Collection: The EchoWear app that was introduced in [16] is used in the proposed architecture for collecting audio data from wearables. We have released the library to the public at: https://github.com/harishdubey123/wbl-echowear

Fig. 2. Conceptual overview of the proposed Fog architecture that assists the Medical Internet of Things framework in tele-treatment scenarios.

2 Related & Background Works

In this section, we review the recent emergence of wearables and fog computing for the enhancement and processing of physiological data in healthcare applications.

2.1 Wearable Big Data in Healthcare

Medical data are collected by intelligent edge devices such as wearables, wrist-bands, smartwatches, and smart textiles. Here, intelligence refers to knowledge of analytics, devices, clinical applications, and consumer behavior. Such smart data are structured, homogeneous, and meaningful, with negligible amounts of noise and meta-data [1]. The big data trend and, quite recently, the smart data trend have revolutionized the biomedical and healthcare domain. With the increasing use of wireless and wearable body sensor networks (BSNs), the amount of data aggregated by edge devices and synced to the cloud is growing at an enormous rate [32]. Pharmaceutical companies are leveraging deep learning and data analytics on their huge medical databases, which result from the digitization of patients' medical records. The data obtained from patients' health records, clinical trials, and insurance programs provide an opportunity for data mining. Such databases are heterogeneous, unstructured, and scalable, and they contain significant amounts of noise and meta-data that carry little or no useful information. Cleaning and structuring real-world data is another challenge in processing medical big data. In recent years, the big data trend has transformed the healthcare, wellness, and fitness industry. Adding value and innovation to the data processing chain could help patients and healthcare stakeholders accomplish treatment goals at lower cost and with reduced logistic requirements [32]. The authors of [46] presented smart data as the result of applying the semantic web and data analytics to a structured collection of big data. Smart data attempts to provide a superior avenue for better decisions and inexpensive processing in person-centered medicine and healthcare. Medical data such as diagnostic images, genetic test results, and biometric information are being generated at a large scale. Such data have not only high volume but also wide variety and differing velocity, which necessitates novel ways of storing, managing, retrieving, and processing them. Smart medical data demand the development of novel scalable big data architectures and applicable algorithms for intelligent data analytics. The authors also underlined the challenges in semantic-rich data processing for intelligent inference on practical use cases [46].

2.2 Speech Treatments of Patients with Parkinson’s Disease

Patients with Parkinson's disease (PD) have their own unique set of speech deficits. We developed EchoWear [16] as a technology front-end for monitoring the speech of PD patients using a smartwatch. Speech-language pathologists (SLPs) had access to such a system for remote monitoring of their patients. The rising cost of healthcare, the increase in the elderly population, and the prevalence of chronic diseases around the world urgently demand the transformation of healthcare from a hospital-centered system to a person-centered environment, with a focus on patients' disease management as well as their wellbeing.

Table 1. A comparison between Fog computing and cloud computing (adopted from [11]).

Criterion | Fog nodes close to IoT devices | Fog aggregation nodes | Cloud computing
Response time | Milliseconds to sub-second | Seconds to minutes | Minutes, days, weeks
Application examples | Telemedicine and training | Visualization, simple analytics | Big data analytics, graphical dashboards
How long IoT data is stored | Transient | Short duration: perhaps hours, days, or weeks | Months or years
Geographic coverage | Very local: for example, one city block | Regional | Global

Fig. 3. The flow of information and control between the three main components of the medical IoT system for smartwatch-based speech treatment [16]. The smartwatch is triggered by the patients with Parkinson's disease. At fixed timings set by patients, caregivers, or their speech-language pathologists (SLPs), the tablet triggers the recording of speech data. The smartwatch interacts with the tablet via Bluetooth. Once the tablet gets the data from the smartwatch, it sends them to the Fog devices, which process the clinical speech. Finally, the features are sent to the cloud, from where they can be queried by clinicians for long-term comparative study. SLPs use the final features for designing customized speech exercises and treatment regimes in accordance with the patient's communication deficits.

Fig. 4. The proposed Fog architecture that acquires data from body sensor networks (BSNs) through smartphone/tablet gateways. It has two choices of fog computer: Intel Edison and Raspberry Pi. The extraction of clinical features is done locally on the fog device, which is kept in the patient's home (or near the patient in care homes). Finally, the information extracted from the bio-signals is uploaded to the secured cloud backend, from where it can be accessed by clinicians. The proposed Fog architecture consists of four modules, namely BSNs (e.g., smartwatch), gateways (e.g., smartphones/tablets), fog devices (Intel Edison/Raspberry Pi), and the cloud backend.

Speech disorders affect approximately 7.5 million people in the US [2]. Dysarthria (caused by Parkinson's disease or other speech disorders) refers to a motor speech disorder resulting from impairments in the human speech production system. The speech production system consists of the lips, tongue, vocal folds, and/or diaphragm. Depending on the part of the nervous system that is affected, there are various types of dysarthria. Patients with dysarthria possess specific speech characteristics such as speech that is difficult to understand, limited movement of the lips, tongue, and jaw, and abnormal pitch and rhythm. These characteristics also include poor voice quality, for instance a hoarse, breathy, or nasal voice. Dysarthria results from neural dysfunction. It might be present at birth (cerebral palsy) or develop later in a person's life. It can be due to a variety of ailments of the nervous system, such as motor neuron diseases, Alzheimer's disease, cerebral palsy (CP), Huntington's disease, multiple sclerosis, Parkinson's disease (PD), traumatic brain injury (TBI), mental health issues, stroke, progressive neurological conditions, and cancer of the head, neck, and throat (including laryngectomy). Patients with dysarthria are subjectively evaluated by a speech-language pathologist (SLP), who identifies the speech difficulties and decides the type and severity of the communication deficit [20].

The authors of [48] compared the perceived loudness of speech and the self-perception of speech in patients with idiopathic Parkinson's disease (PD) and in healthy controls. Thirty patients with PD and fourteen healthy controls participated in the research survey. Various speech tasks were performed, and nine speech and voice characteristics were used for evaluation. The results showed that the patients with PD had a significant reduction in loudness compared to healthy controls during various speech tasks. These results furnished additional information on the speech characteristics of patients with PD that might be useful for effective speech treatment of this population [48]. The authors of [28] studied the acoustic characteristics of voice in patients with PD. Thirty patients with early-stage PD and thirty patients with later-stage PD were compared with thirty healthy controls. The speech tasks included a sustained /a/ and a one-minute monologue. The voices of patients with early-stage as well as later-stage PD were found to have reduced loudness, limited loudness and pitch variability, breathiness, and harshness. In general, the voices of patients with PD had lower mean intensity levels and a reduced maximum phonational frequency range compared to healthy controls [28].

The authors of [50] evaluated the voice and speech quality of patients with and without deep brain stimulation of the subthalamic nucleus (STN-DBS) before and after LSVT LOUD therapy. The goal of the study was to compare the improvement in surgical patients with that in non-surgical patients. The results showed that LSVT LOUD is recommended for voice and speech treatment of patients with PD following STN-DBS surgery. The authors of [22] performed an acoustic analysis of the voices of 41 patients with PD and of healthy controls. The speech exercises included in the study were a sustained /a/ for two seconds and reading sentences. The acoustic measures for quantifying speech quality were fundamental frequency, perturbation in fundamental frequency, shimmer, and harmonics-to-noise ratio of the sustained /a/, as well as phonation range, dynamic range, and maximum phonation time. The authors concluded that patients with PD had higher jitter, lower harmonics-to-noise ratio, lower frequency and intensity variability, lower phonation range, low voice intensity, mono pitch, voice arrests, and struggle, irrespective of the severity of the PD symptoms.

People suffering from Parkinson's disease experience speech production difficulties associated with dysarthria. Dysarthria is characterized by monotony of pitch, reduced loudness, an irregular rate of speech, imprecise consonants, and changes in voice quality [36]. Speech-language pathologists evaluate, diagnose, and treat communication disorders. The literature suggests that Lee Silverman Voice Treatment (LSVT) has been the most efficient behavioral treatment for voice and speech disorders in Parkinson's disease. Telehealth monitoring is very effective for speech-language pathology, and smart devices like EchoWear [16] can be of much use in such situations. Several cues indicate the relationship between dysarthria and acoustic features. Some of them are:

1. Shallower F2 trajectories were observed in male speakers with dysarthria [33].

2. Vowel space area was found to be reduced relative to healthy controls for male speakers with amyotrophic lateral sclerosis [33].

3. Shimmer, described in [17] as a measure of variation in the amplitude of speech, is an important speech quality metric for people with speech disorders.

4. Like shimmer, jitter (pitch variation) and the loudness and sharpness of the speech signal can be used as cues for speech disorders [17].

5. In ataxic dysarthria, patients can produce distorted vowels and excess variation in loudness, so speech prosody and acoustic analysis are of much use.

6. Multi-dimensional voice analysis, as stated in [33], plays an important role in the diagnosis and analysis of motor speech disorders. Parameters that can be used effectively are relative average perturbation (RAP), pitch perturbation quotient (PPQ), fundamental frequency variation (vF0), shimmer in dB (ShdB), shimmer percent (Shim), peak amplitude variation (vAm), and amplitude tremor intensity index (ATRI).

7. Shrinking of the F0 range as well as of the vowel space is observed in dysarthric speech. Moreover, a comparison of the F0 range and the vowel formant frequencies suggests that the speech effort needed to produce a wider F0 range can influence vowel quality as well.
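
The jitter and shimmer cues listed above can be illustrated with a minimal Python sketch. It assumes the commonly used "local" definitions (mean absolute difference between consecutive cycles divided by the mean value); the chapter does not state which variant was used, and the function names and sample values below are hypothetical.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference of consecutive values, divided by their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def jitter_local(periods_s):
    """Cycle-to-cycle variation of the pitch period (dimensionless ratio)."""
    return local_perturbation(periods_s)


def shimmer_local(peak_amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (dimensionless ratio)."""
    return local_perturbation(peak_amplitudes)


# Made-up glottal cycle lengths (seconds) and peak amplitudes, for illustration only.
periods = [0.0050, 0.0052, 0.0049, 0.0051]
amplitudes = [0.80, 0.76, 0.82, 0.78]
print(f"jitter  = {100 * jitter_local(periods):.2f} %")
print(f"shimmer = {100 * shimmer_local(amplitudes):.2f} %")
```

Both measures are zero for a perfectly periodic voice and grow as consecutive cycles become less alike, which is why they serve as voice quality cues. Extracting the cycle lengths and peak amplitudes from a recording is a separate step that this sketch does not cover.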

EchoWear [16] is a smartwatch technology for voice and speech treatment of patients with Parkinson's disease. Considering the difficulties patients have in following prescribed exercise regimes outside the clinic, this device remotely monitors the speech and voice exercises prescribed by speech-language pathologists. The speech quality metrics presently used in EchoWear, as stated in [16], are average loudness level and average fundamental frequency (F0). The features are derived from the short-term spectrum of the speech signal. To find the fundamental frequency, EchoWear uses the SWIPE pitch estimator; other methods, such as cepstral analysis and autocorrelation, are also extensively used for pitch estimation. The software Praat is designed for visualizing the spectrum of a speech signal for analysis. Fundamental frequency (F0) variability is associated with PD speech: PD speech shows decreased variation in pitch, i.e., in fundamental frequency.
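
To make the two metrics concrete, the following Python sketch estimates F0 and a level in dB for a single frame. It uses the autocorrelation method, which the text names as a common alternative, and not the SWIPE estimator that EchoWear uses. The RMS level is only a rough stand-in for loudness, and the function names, the 75-400 Hz search range, and the synthetic test tone are assumptions made for illustration.

```python
import math


def rms_level_db(frame):
    """Root-mean-square level of a frame in dB relative to full scale (1.0)."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def estimate_f0_autocorr(frame, sample_rate, f0_min=75.0, f0_max=400.0):
    """Pick the lag with the largest autocorrelation inside the F0 search range."""
    min_lag = int(sample_rate / f0_max)
    max_lag = min(int(sample_rate / f0_min), len(frame) - 1)
    best_lag, best_value = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        value = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    # None means no positive autocorrelation peak was found (e.g. silence).
    return sample_rate / best_lag if best_lag else None


# Synthetic 200 Hz tone: 25 ms at 8000 Hz, i.e. 200 samples.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(200)]
print("F0 estimate (Hz):", estimate_f0_autocorr(frame, SAMPLE_RATE))
print("RMS level (dB):", round(rms_level_db(frame), 2))
```

Averaging the per-frame values over a whole recording gives session-level figures comparable in spirit to the average loudness and average F0 described above. Real speech would additionally need voiced/unvoiced detection, which is omitted here.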

Fig. 5. Overall view of the proposed Fog architecture in the context of frontend and backend services. It shows the information flow from patients to clinicians through the modular architecture.

3 Proposed Fog Architecture

In this section, we describe the implementation of the proposed Fog architecture. Figure 5 shows the overall architecture of the proposed system in the context of frontend and backend services. It shows the information flow from the patients to the SLPs through the communication and processing interfaces. Instead of layers, we describe the implementation using three modules, namely 1) Fog device; 2) Backend Cloud Database; and 3) Frontend App Services. These three modules give a convenient representation for describing the multi-user model of the proposed Fog architecture.

Table 2. List of speech exercises performed by the patients with Parkinson's disease.

Task | Exercise Name | Description
t1 | Vowel Prolongation | Sustain the vowel /a/ for as long as possible for three repetitions.
t2 | High Pitch | Start saying /a/ at talking pitch and then go up and hold for 5 seconds (three repetitions).
t3 | Low Pitch | Start saying /a/ at talking pitch and then go down and hold for 5 seconds (three repetitions).
t4 | Read Sentence | Read 'The boot on top is packed to keep'
t5 | Read Passage | Read the 'farm' passage.
t6 | Functional Speech Task | Read a set of customized sentences.
t7 | Monologue | Explain happiest day of your life.

Fig. 6. The interface view of the IoT PD Android app for frontend users such as clinicians, caregivers and patients. Different categories

of users have different privileges. For example, a patient can register with the app only upon receiving the clinician’s approval.

3.1 Fog Computing Device

To transfer an audio file or other data file from a patient, we used socket streaming over TCP wrapped in Secure Sockets Layer/Transport Layer Security (SSL/TLS) sockets to ensure secure transmission. Sockets provide a communication framework for devices using different protocols, such as TCP and UDP, that can then be wrapped in secured sockets. Next, we describe these protocols and their usage in the proposed architecture.

– Transmission Control Protocol (TCP) is a networking protocol that allows guaranteed and reliable delivery of files. It is a connection-oriented and bi-directional protocol; in other words, both devices can send and receive files using this protocol. Each endpoint of the connection is identified by an Internet Protocol (IP) address and a port number, so that the connection can be made with a specific device. Furthermore, we wrapped the TCP sockets in SSL sockets to ensure the security and privacy of the data collected from the users/patients.

– Secure Sockets Layer (SSL) is a network communication protocol that allows encrypted, authenticated communication over network sockets between the server and client sides. To implement it in the proposed Fog architecture, we used two Python modules, namely ssl and socket. To create the certificates for the server and client, we also used the command-line program OpenSSL [42]. OpenSSL is an open-source project that provides a robust, commercial-grade, and full-featured toolkit for the Transport Layer Security (TLS) and Secure Sockets Layer (SSL) protocols.

Once all the SSL certificates and keys were built for the client (the Android gateway devices/wearables) and the server (the Fog computer, e.g., Intel Edison or Raspberry Pi), we ran the secure sockets on the server and continuously listened for a connection for file transfer. We renamed the file with date and time stamps before it was used for further processing. As soon as the audio file was completely transferred, the connection was closed and the processing began. We used the Python-based Praat and Christian's Library for the processing and analysis of audio data. For other healthcare data, such as Phonocardiography (PCG) and Electrocardiogram (ECG) data, we implemented the associated methods using Python, C, and GNU Octave.
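
The server-side flow described in this section (listen, accept a TLS-wrapped TCP connection, store the incoming file under a timestamped name, close, then process) can be sketched in Python with the standard ssl and socket modules. This is an illustrative sketch and not the authors' code: the function names, the file-name pattern, the certificate paths, and the choice to require a client certificate are all assumptions.

```python
import socket
import ssl
from datetime import datetime


def make_server_context(certfile, keyfile, client_ca):
    """TLS server context that also asks the client for a certificate."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    context.load_verify_locations(cafile=client_ca)
    context.verify_mode = ssl.CERT_REQUIRED
    return context


def timestamped_name(now, suffix=".wav"):
    """File name built from a date and time stamp (hypothetical pattern)."""
    return now.strftime("%Y%m%d_%H%M%S") + suffix


def receive_file(conn, path, chunk_size=4096):
    """Write everything the peer sends to `path` until it closes the connection."""
    total = 0
    with open(path, "wb") as out:
        while True:
            chunk = conn.recv(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            total += len(chunk)
    return total


def serve_forever(host, port, context):
    """Accept one upload per connection; feature extraction would follow each one."""
    with socket.create_server((host, port)) as listener:
        while True:
            raw_conn, _addr = listener.accept()
            try:
                with context.wrap_socket(raw_conn, server_side=True) as conn:
                    receive_file(conn, timestamped_name(datetime.now()))
            except (ssl.SSLError, OSError):
                continue  # failed handshake or broken transfer: wait for the next client
```

A caller would build the context from certificate files generated with OpenSSL and then call `serve_forever("0.0.0.0", port, context)` with a port of its choosing. The sketch handles one client at a time and treats the end of the stream as the end of the file, which matches the one-file-per-connection behaviour described above.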

3.2 Frontend App Services

For the frontend users, including patients and clinicians (SLPs), we designed Android applications and web applications that can be used to log into the system and access clinical features. In addition, frontend apps running on wearable devices facilitated the data collection. Our app, IoT PD, took advantage of REST, which we chose for its simplicity of implementation. For every REST request for data, we returned a JSON (JavaScript Object Notation) response; JSON is a data-interchange format between programs [29]. The IoT PD app is based on the software engine Hermes. We open-sourced the URI library for audio data collection from wearable devices.

The app allowed access to two categories of users, as shown in Figure 6. Both patients and healthcare providers could log in and view their profiles; however, the profiles differed, and only clinicians could give their patients permission to register with the app. Further, the physician could set up personalized notifications for their patients. For example, the physician could schedule a personalized exercise regime for a given patient so that the patient's speech functions could be enhanced. Patients, on the other hand, could only view their own information and visual data.
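
The privilege rules and the JSON responses described in this section can be sketched as follows. The user records, field names, status codes, and feature values are hypothetical placeholders chosen for illustration; the actual IoT PD request and response formats are not given in the chapter.

```python
import json

# Hypothetical in-memory stand-ins for the backend tables.
USERS = {
    "slp01": {"role": "clinician", "patients": ["pd001", "pd002"]},
    "pd001": {"role": "patient"},
    "pd002": {"role": "patient"},
}
# Placeholder feature values, not measured data.
FEATURES = {
    "pd001": [{"task": "t1", "loudness_db": 68.0, "f0_hz": 140.0}],
    "pd002": [{"task": "t1", "loudness_db": 71.0, "f0_hz": 185.0}],
}


def can_view(user_id, patient_id):
    """Clinicians may view their own patients; patients may view only themselves."""
    user = USERS.get(user_id)
    if user is None:
        return False
    if user["role"] == "clinician":
        return patient_id in user["patients"]
    return user_id == patient_id


def get_features(user_id, patient_id):
    """Return an HTTP-like status code and a JSON body for a feature request."""
    if not can_view(user_id, patient_id):
        return 403, json.dumps({"error": "forbidden"})
    body = {"patient": patient_id, "features": FEATURES.get(patient_id, [])}
    return 200, json.dumps(body)


print(get_features("slp01", "pd001"))
print(get_features("pd001", "pd002"))
```

Keeping the permission check in one function makes it easy to apply the same rule to every endpoint, which is one straightforward way to realize the rule-based access described above.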

3.3 Backend Cloud Database

To support the centralized storage of clinical features and analytics, we implemented a backend cloud database using PHP and MySQL. First, we set up a Linux, Apache, MySQL, PHP (LAMP) server, an open-source web platform for development on Linux systems that uses Apache for web serving, MySQL as the database system for management and storage, and PHP as the language for server interaction with applications [49]. The main component of the backend was the relational database. We designed the database around the users and the Fog computers so that both could easily engage with it. We created three tables for the users (patients and healthcare providers such as clinicians). A fourth table was created for the information extracted from the patients' data. The extracted features obtained from the Fog computer were entered in this data table.
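
A four-table layout of this kind can be sketched as below. The sketch uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server, and every table and column name is an assumption, since the chapter does not list the actual schema.

```python
import sqlite3

# Three user-related tables plus one table for extracted features (names are hypothetical).
SCHEMA = """
CREATE TABLE users (
    user_id  INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    role     TEXT NOT NULL CHECK (role IN ('patient', 'clinician'))
);
CREATE TABLE clinicians (
    user_id INTEGER PRIMARY KEY REFERENCES users(user_id)
);
CREATE TABLE patients (
    user_id      INTEGER PRIMARY KEY REFERENCES users(user_id),
    clinician_id INTEGER NOT NULL REFERENCES clinicians(user_id)
);
CREATE TABLE features (
    feature_id  INTEGER PRIMARY KEY,
    patient_id  INTEGER NOT NULL REFERENCES patients(user_id),
    recorded_at TEXT NOT NULL,
    name        TEXT NOT NULL,
    value       REAL NOT NULL
);
"""


def open_database(path=":memory:"):
    """Create the tables in a fresh database and return the connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_feature(conn, patient_id, recorded_at, name, value):
    """Store one feature value uploaded by a Fog computer."""
    with conn:
        conn.execute(
            "INSERT INTO features (patient_id, recorded_at, name, value) "
            "VALUES (?, ?, ?, ?)",
            (patient_id, recorded_at, name, value),
        )


def features_for(conn, patient_id):
    """Return (recorded_at, name, value) rows for one patient, oldest first."""
    rows = conn.execute(
        "SELECT recorded_at, name, value FROM features "
        "WHERE patient_id = ? ORDER BY recorded_at, name",
        (patient_id,),
    )
    return rows.fetchall()
```

Storing one row per feature value keeps the table independent of which features a given Fog computer extracts, so new bio-signals can be added without changing the schema.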

3.4 Pathological Speech Data Collection

Earlier, we described our implementation of EchoWear, which was used in an in-clinic validation study on six patients with Parkinson's disease (PD). We received approval (no. 682871-2) from the University of Rhode Island's Institutional Review Board to conduct human studies involving the presented technologies, including IoT PD and the proposed Fog architecture. First, the six patients were given intensive voice training in the clinic by Leslie Mahler, a speech-language pathologist, who also prescribed home speech tasks for each patient. Patients were given a home kit consisting of a smartwatch, a companion tablet, and charging accessories. Patients were advised to wear the smartwatch during the day and chose their preferred timings for the speech exercises. A tactile vibration of the smartwatch was used as a notification to remind the patients to perform their speech exercises. The IoT PD app used the chosen timings to set the notifications accordingly. The home exercise regime had six speech tasks. The six speech tasks assigned to patients with PD are given in Table 2. Speech-language pathologists (SLPs) use an extensive number of speech parameters in their diagnosis. We skip the clinical details of the prescription, as they are outside the scope of this book chapter.

3.5 Dynamic Time Warping

Dynamic time warping (DTW) is an algorithm for finding similar patterns in time-series data. DTW has been used for time-series mining in a variety of applications such as business, finance, single-word recognition, walking pattern detection, and analysis of ECG signals. Usually, we use Euclidean distance to measure the distance between two points. For example, for two vectors $x = [x_1, x_2, \ldots, x_n]$ and $y = [y_1, y_2, \ldots, y_n]$,

$$d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_n - y_n)^2} \qquad (1)$$

Euclidean distance works well in many areas. However, in the special case where two similar but out-of-phase series are to be compared, Euclidean distance fails to detect the similarity. For example, consider two time series A = [1,1,1,2,8,1] and B = [1,1,1,8,2,1]; the Euclidean distance between them is √72. The two series are similar, but this similarity cannot be inferred from the Euclidean distance metric, whereas DTW detects it easily. Thus, DTW is an effective algorithm that can detect the similarity between two series regardless of differences in length and/or phase. DTW is based on the idea of dynamic programming (DP). It builds an adjacency matrix and then finds the shortest path across it. DTW is more effective than Euclidean distance for many applications [14] such as gesture recognition [23], fingerprint verification [34], and speech processing [41].

Fig. 7. (a) Spectrogram of acquired speech signal. The frequency sampling rate is 8000 Hz. Time-windows of 25 ms with 10 ms skip-rate were used. (b) Spectrogram of enhanced speech signal.

Fig. 8. Bar chart depicting the data reduction achieved by using dynamic time warping (DTW), clinical speech processing (CLIP) and GNU zip compression on ten sample speech files collected from in-home trials of patients with Parkinson's disease (PD).
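As a minimal sketch of the comparison between Euclidean distance and DTW for the series A and B above, the following Python code computes both measures. The function names `euclidean` and `dtw_distance`, the absolute-difference local cost, and the unconstrained warping path are illustrative assumptions, not the implementation used on the Fog device.

```python
import math

def euclidean(a, b):
    """Point-wise Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(a, b):
    """Accumulated DTW cost using dynamic programming.

    cost[i][j] holds the cheapest accumulated cost of aligning the
    first i samples of a with the first j samples of b.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(cost[i - 1][j],      # step in a only
                                     cost[i][j - 1],      # step in b only
                                     cost[i - 1][j - 1])  # step in both
    return cost[n][m]

A = [1, 1, 1, 2, 8, 1]
B = [1, 1, 1, 8, 2, 1]
print(euclidean(A, B))     # sqrt(72), about 8.49
print(dtw_distance(A, B))  # 2.0 with this local cost
```

For these two series the Euclidean distance is √72, while this DTW variant yields an accumulated cost of 2, which reflects how similar the two shapes are once the phase shift is absorbed by the warping path.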

4 Case Studies using Proposed Fog Architecture

4.1 Case Study I: Speech Tele-treatment of Patients with Parkinson’s Disease

A variety of acoustic and spectral features were derived from the speech content of audio files acquired by wearables. In the proposed Fog architecture, noise reduction, automated trimming, and feature extraction were done on the Fog device. In our earlier studies [16,19,13,40], trimming was done manually by a human annotator and feature extraction was done in the cloud. In addition, no noise reduction was done in previous studies [16,19]. The Fog computer syncs the extracted features and preliminary diagnosis back to the secured cloud backend. Fog was employed for in-home speech treatment of patients with Parkinson's disease. The pathological features were later extracted from the audio signal. Figure 14 shows the block diagram of the pathological speech processing module. In our earlier studies, we computed features from the controlled clinical environment and performed Fog device trials in lab scenarios [19]. This chapter explores the in-home field application. In-clinic speech data was obtained in quiet scenarios with negligible background noise. In contrast, data from in-home trials had large amounts of time-varying, non-stationary noise, which necessitated the use of robust algorithms for noise reduction before extracting the pathological features. In addition to previously studied features such as loudness and fundamental frequency, we developed more features for accurate quantification of abnormalities in the patient's vocalization. The new features are jitter, frequency modulation, speech rate and sensory pleasantness. In our previous studies, we used just three speech exercises (tasks t1, t2 and t3) for analysis of algorithms. In this chapter, we incorporated all six speech exercises. The execution was done in real time in the patient's home, unlike the pilot data used in our previous studies [19,40]. Thus, the Fog speech processing module is an advancement over the earlier studies in [16,19,40].

The audio data was acquired and stored in wav format. Using perceptual audio coding such as mp3 would have saved transmission power, storage and execution time, as mp3-coded speech data is smaller than the corresponding wav data. We did not use mp3 or other advanced audio codecs in order to avoid loss of information. Perceptual audio codecs such as mp3 are lossy compression schemes that remove frequency bands that are not perceptually important. Such codecs have worked well for music and audio streaming. However, in pathological speech analysis, patients have very acute vocalizations such as nasal voice, hypernasal voice, mildly slurred speech, monotone voice, etc. Clinicians do not recommend lossy coding for speech data as it can cause confusion in diagnosis, monitoring, and evaluation of pathological voice. Since we use unicast transmission from BSNs to Fog computers, we employed the Transmission Control Protocol (TCP). The data have to be received in the same order as sent by the BSNs. We did not use the User Datagram Protocol (UDP), which is more popular for audio/video streaming, because UDP does not guarantee receipt of packets. For video/audio that is perceptually encoded and decoded, small losses lead only to temporary degradation in the received stream. We do not have that luxury with pathological speech or PCG data, whose delivery has to be guaranteed even if packets are delayed and/or have to be re-transmitted. The pathological data was saved as mono-channel audio sampled at 44.1 kHz with 16-bit precision in .wav format.
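The storage format described above (mono, 44.1 kHz, 16-bit, uncompressed) can be illustrated with Python's standard `wave` module. This is only a sketch: the helper name `write_mono_wav` and the synthetic 220 Hz test tone are assumptions made for illustration and are not part of the deployed system.

```python
import io
import math
import struct
import wave

SAMPLE_RATE_HZ = 44100   # 44.1 kHz, as stated in the text
SAMPLE_WIDTH_BYTES = 2   # 16-bit precision
N_CHANNELS = 1           # mono

def write_mono_wav(samples):
    """Pack floats in [-1, 1] as 16-bit mono PCM and return the WAV bytes."""
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(N_CHANNELS)
        w.setsampwidth(SAMPLE_WIDTH_BYTES)
        w.setframerate(SAMPLE_RATE_HZ)
        w.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
    return buf.getvalue()

# One second of a hypothetical test tone standing in for recorded speech.
tone = [0.5 * math.sin(2 * math.pi * 220.0 * n / SAMPLE_RATE_HZ)
        for n in range(SAMPLE_RATE_HZ)]
wav_bytes = write_mono_wav(tone)
```

One second of audio in this format carries 44100 × 2 = 88200 bytes of sample data, which is the uncompressed data rate that the lossless choice implies for transmission and storage.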

Background Noise Reduction The audio signals from in-home speech exercises are highly contaminated with time-varying background noise. We used the method developed in [12] for reducing non-stationary background noise in speech. It optimizes the log-spectral amplitude of the speech using noise estimates obtained from minima-controlled recursive averaging. The authors of [12] used two functions to account for the probability of speech presence in various sub-bands. One of these functions was based on the time-frequency distribution of the a priori signal-to-noise ratio (SNR) and was used for estimation of the speech spectrum. The second function was determined by the ratio of the energy of the noisy speech segment to its minimum value within that time window. Objective as well as subjective evaluation showed that this algorithm could preserve weak speech segments contaminated with a high amount of noise [12]. We also performed a subjective evaluation to validate the suitability of this algorithm for our data. The enhanced speech was later used for extracting perceptual speech features such as loudness, fundamental frequency, jitter, frequency modulation, speech rate and sensory pleasantness (sharpness). Figure 7 shows the spectrogram of an acquired speech signal from in-home trials and the spectrogram of the corresponding enhanced speech signal. Speech enhancement is clearly visible in the darker regions (corresponding to speech) and noise reduction in the lighter regions (corresponding to silences/pauses).

Fig. 9. Top sub-figure shows the time-domain enhanced speech signal. The middle sub-figure depicts the corresponding fundamental frequency contour. The bottom sub-figure shows the speech activity labels where '1' stands for speech and '0' for silence/pauses. We used the speech activity detection proposed in [52], which is effective and has low computational expense.

Automated Trimming of the Speech Signal We used the method developed in [52] for automated trimming of audio files by removing the non-speech segments. This method was validated to be accurate even at the low SNRs that are typical of in-home audio data. The low computational complexity of this algorithm qualifies it for implementation on a Fog device with limited resources. After applying the noise reduction method to the acquired speech signal, we used a voice activity detection (VAD) algorithm for removing the silences. The authors of [52] proposed a simple technique for VAD based on an effective selection of speech frames. Short time-windows of a speech signal (25-40 ms) are stationary. However, over an extended time duration (more than 40 ms), the statistics of the speech signal change significantly, rendering the speech frames unequally relevant. This necessitates the selection of effective frames on the basis of the a posteriori signal-to-noise ratio (SNR). The authors used energy distance as a substitute for the standard cepstral distance for measuring the relevance of speech frames, which reduced the computational complexity of the algorithm. Figure 9 illustrates automated trimming of a speech signal for removing the pauses present in the audio files. We used time-windows of size 25 ms with a 10 ms skip-rate between successive windows.
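The framing used throughout this section (25 ms windows with a 10 ms skip-rate) and the idea of labelling frames as speech ('1') or silence ('0') can be sketched as follows. Note that the relative energy threshold used here is a deliberately simple stand-in and is not the frame-selection method of [52]; the names `frame_signal`, `energy_vad` and the parameter `rel_threshold` are hypothetical.

```python
import math

def frame_signal(x, fs, win_s=0.025, hop_s=0.010):
    """Split x into overlapping frames (25 ms window, 10 ms skip-rate)."""
    win = int(round(win_s * fs))
    hop = int(round(hop_s * fs))
    return [x[i:i + win] for i in range(0, len(x) - win + 1, hop)]

def energy_vad(x, fs, rel_threshold=0.1):
    """Label each frame 1 (speech) or 0 (silence) by its mean energy.

    Assumption: a frame counts as speech when its energy is at least
    rel_threshold times the largest frame energy in the recording.
    """
    frames = frame_signal(x, fs)
    energies = [sum(s * s for s in f) / len(f) for f in frames]
    peak = max(energies) if energies else 0.0
    return [1 if peak > 0 and e >= rel_threshold * peak else 0
            for e in energies]

# Synthetic example: 0.5 s silence, 0.5 s tone, 0.5 s silence at 8 kHz.
fs = 8000
sig = ([0.0] * 4000
       + [math.sin(2 * math.pi * 200 * n / fs) for n in range(4000)]
       + [0.0] * 4000)
labels = energy_vad(sig, fs)
```

Frames labelled 0 can then be dropped before feature extraction, which is what reduces the computation on the Fog device.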

Fundamental Frequency Estimation We used the method proposed in [27] for estimation of the fundamental frequency. It was found to be effective even at very low SNRs. It is a frequency-domain method referred to as the Pitch Estimation Filter with Amplitude Compression (PEFAC). We used 25 ms time-windows with a 10 ms skip-rate for estimation of the fundamental frequency. In the first step, noise components were suppressed by compressing the speech amplitude. In the second step, the speech was filtered such that the energy of the harmonics was summed. This involved filtering the power spectral density (PSD) followed by peak picking to estimate the fundamental frequency (in Hz). Figure 9 shows the time-domain speech signal along with the automatic trimming decision and pitch estimates for each overlapping window.

Another method we implemented for fundamental frequency estimation is based on harmonic models [6]. Voiced speech is not just periodic but also rich in harmonics, so voiced segments are modeled using harmonic models.

Perceptual Loudness Speech-language pathologists (SLPs) use loudness as an important speech feature for quantifying the perceptual quality of clinical speech. It is a mathematical quantity computed using models of the human auditory system. Different models are available for loudness computation, each valid for specific sound types. We used the Zwicker model, which is valid for time-varying signals [56]. Loudness is the perceived intensity of a sound. Human ears are more sensitive to some frequencies than to others. This frequency selectivity is quantified by the Bark scale, which defines the critical bands that play an important role in intensity sensation by the human ear. The specific loudness of a frequency band is denoted as $L'$ and measured in units of Phon/Bark. The loudness, $L$ (in Phon), is computed by integrating the specific loudness, $L'$, over all the critical-band rates (on the Bark scale). Mathematically, we have

$$L = \sum_{0}^{24\,\mathrm{Bark}} L' \cdot dz \qquad (2)$$

Typically, the step-size, $dz$, is fixed at 0.1 [56]. We used Phon (in dB) as the unit of loudness level. Figure 10 shows a time-domain speech signal and the corresponding instantaneous loudness in dB Phon. It depicts the dependence of loudness on speech amplitude.

Fig. 10. The time-domain speech signal and corresponding instantaneous loudness curve. Loudness was computed over short windows of 25 ms with 10 ms skip-rate.

Jitter Jitter ($J_1$) quantifies changes in the vocal period from one cycle to another. The instantaneous fundamental frequency was used for computing the jitter [53]. $J_1$ was defined as the average absolute difference between consecutive time-periods. Mathematically, it is given as:

$$J_1 = \frac{1}{M-1} \sum_{j=1}^{M-1} |F_j - F_{j+1}| \qquad (3)$$

where $F_j$ is the $j$-th extracted vocal period and $M$ is the number of extracted vocal periods.
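The jitter definition above, the average absolute difference between consecutive vocal periods, translates directly into a few lines of Python. The function name `jitter` and the example period values are hypothetical and serve only to illustrate the arithmetic.

```python
def jitter(periods):
    """Average absolute difference between consecutive vocal periods.

    periods: sequence of M extracted vocal periods (e.g. in ms).
    """
    m = len(periods)
    if m < 2:
        raise ValueError("at least two vocal periods are required")
    diffs = [abs(periods[j] - periods[j + 1]) for j in range(m - 1)]
    return sum(diffs) / (m - 1)

# Hypothetical vocal periods in milliseconds.
example_periods = [5.0, 5.5, 5.0, 6.0]
print(jitter(example_periods))  # (0.5 + 0.5 + 1.0) / 3
```

A perfectly regular voice would give a jitter of zero, since every pair of consecutive periods would be identical.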

Figure 11 shows the comparison of jitter for six patients with PD from home trials. Three patients used the Fog for the first and third weeks of the trial month. The other three patients used the Fog for the second and fourth weeks. This swapping was done to observe the effect of the Fog architecture. In the absence of the Fog device, data was stored on the Android tablet (gateway) device and was later processed in offline mode. In the presence of Fog, the data was processed online. Since the same program produced these results, we can compare them. Figure 11 shows the jitter (in ms) for all cases. We can see that the change in jitter from the first/second to the third/fourth week is complicated: in some cases it increases, while in others it decreases. Only specialized clinicians can interpret such variations. The Fog architecture facilitates the computation of jitter and syncs it to the cloud backend. Speech-language pathologists (SLPs) can later access these charts and correlate them with the corresponding patient's treatment regime.

Fig. 11. The average jitter, $J_1$ (ms), computed using speech samples from six patients with Parkinson's disease who participated in a field trial that lasted four weeks. Three patients used Fog for the first and third weeks while the other three patients used it for the second and fourth weeks. We are comparing weeks where Fog was used.

Frequency Modulation It quantifies the presence of sub-harmonics in the speech signal. Usually, speech signals with many sub-harmonics lead to a more complicated interplay between various harmonic components, making this feature relevant for perceptual analysis. Mathematically, it is given as [51]:

$$F_{mod} = \frac{\max(F_j)_{j=1}^{M} - \min(F_j)_{j=1}^{M}}{\max(F_j)_{j=1}^{M} + \min(F_j)_{j=1}^{M}} \qquad (4)$$

where $F_{mod}$ is the frequency modulation, and $F_j$ is the fundamental frequency of the $j$-th speech frame.

Frequency Range The range of frequencies is an important feature of the speech signal that quantifies its quality [7]. We computed the frequency range as the difference between the 95th and 5th percentiles. Mathematically, it becomes:

$$F_{range} = F_{95\%} - F_{5\%} \qquad (5)$$

Taking the 5th and 95th percentiles helps in eliminating the influence of outliers in estimates of the fundamental frequency that could be caused by impulsive noise and other interfering sounds.
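Both measures above operate on the per-frame fundamental frequency contour, so they can be sketched together. The helper `percentile` uses linear interpolation between order statistics, which is one common convention and an assumption here; the function names and the example contour are likewise hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation (assumed convention)."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def frequency_modulation(f0):
    """(max - min) / (max + min) of the fundamental frequency contour."""
    return (max(f0) - min(f0)) / (max(f0) + min(f0))

def frequency_range(f0):
    """Difference between the 95th and 5th percentiles of the contour."""
    return percentile(f0, 95) - percentile(f0, 5)

# Hypothetical per-frame fundamental frequencies in Hz.
f0_contour = [100.0, 110.0, 120.0, 130.0, 140.0]
fmod = frequency_modulation(f0_contour)   # 40 / 240
frange = frequency_range(f0_contour)      # 138 - 102 = 36 Hz
```

Because the range uses percentiles rather than the extreme values, a single spurious pitch estimate in a long contour has little effect on it, whereas it would directly change the frequency modulation value.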

Harmonics to Noise Ratio Harmonics to Noise Ratio (HNR) quantifies the noise present in the speech signal that results from incomplete closure of the vocal folds during the speech production process [53]. We used the method proposed in [9] for HNR estimation. The average and standard deviation of the segmental HNR values are useful for perceptual analysis by speech-language pathologists. Let us assume that $R_{xx}$ is the normalized autocorrelation and $l_{max}$ is the lag (in samples) at which it is maximum, excluding the zero lag. Then, HNR is mathematically given by [9]:

$$HNR_{dB} = 10 \log_{10} \frac{R_{xx}(l_{max})}{1 - R_{xx}(l_{max})} \qquad (6)$$

Spectral Centroid It is the center of mass of the spectrum. It measures the brightness of an audio signal. The spectral centroid of a spectrum segment is given by the sum of the frequencies weighted by their amplitudes, divided by the sum of the amplitudes [44]. Mathematically, we have

$$SC = \frac{\sum_{k=1}^{N} k\,F[k]}{\sum_{k=1}^{N} F[k]} \qquad (7)$$

where $SC$ is the spectral centroid, $F[k]$ is the amplitude of the $k$-th frequency bin of the discrete Fourier transform of the speech signal, and $N$ is the number of frequency bins.

Fig. 12. The weights that were used for computing sharpness based on [56]. Sharpness quantifies the perceptual pleasantness of the speech signal. We can see that the higher critical band rates use lower weights for computing the sharpness.

Spectral Flux It quantifies the rate of change in the power spectrum of the speech signal. It is calculated by comparing the normalized power spectrum of a speech frame with that of other frames. It determines the timbre of the speech signal [55].

Spectral Entropy We adopted it for speech-language pathology in this chapter. It is given by:

$$SE = \frac{-\sum_{j=1}^{M} P_j \log(P_j)}{\log(M)} \qquad (8)$$

where $SE$ is the spectral entropy, $P_j$ is the power of the $j$-th frequency bin and $M$ is the number of frequency bins. Here, $\sum_{j} P_j = 1$ as the spectrum is normalized before computing the spectral entropy.

Spectral Flatness It measures the flatness of the speech power spectrum. It quantifies how similar the spectrum is to that of a noise-like signal or a tonal signal. The Spectral Flatness (SF) of white noise is 1 as it has a constant power spectral density (PSD). A pure sinusoidal tone has SF close to zero, showing the high concentration of power at a fixed frequency. Mathematically, SF is the ratio of the geometric mean of the power spectrum to its arithmetic mean [30].
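The spectral centroid, spectral entropy and spectral flatness defined above can be sketched for a single frame, given its magnitude or power spectrum as a list of bin values. The function names are hypothetical, and the two toy spectra (one flat, one with a single dominant bin) are chosen only to show the limiting behaviour described in the text.

```python
import math

def spectral_centroid(amplitudes):
    """Amplitude-weighted mean bin index, with bins numbered from 1."""
    total = sum(amplitudes)
    return sum(k * a for k, a in enumerate(amplitudes, start=1)) / total

def spectral_entropy(power):
    """Entropy of the normalized power spectrum, scaled to [0, 1]."""
    total = sum(power)
    p = [v / total for v in power]  # normalize so the bins sum to 1
    return -sum(v * math.log(v) for v in p if v > 0) / math.log(len(p))

def spectral_flatness(power):
    """Geometric mean of the power spectrum divided by its arithmetic mean."""
    n = len(power)
    if any(v <= 0 for v in power):
        return 0.0
    geometric = math.exp(sum(math.log(v) for v in power) / n)
    return geometric / (sum(power) / n)

flat_spectrum = [1.0] * 8             # noise-like: equal power in all bins
tonal_spectrum = [1e-9] * 8
tonal_spectrum[2] = 1.0               # tone-like: power concentrated in bin 3
```

The flat spectrum gives an entropy and a flatness of 1 and a centroid in the middle of the band, while the tonal spectrum gives values close to zero for both entropy and flatness and a centroid at the dominant bin.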

Sharpness Sharpness is a mathematical function that quantifies the sensory pleasantness of the speech signal. High sharpness implies low pleasantness. Its value depends on the spectral envelope of the signal, its amplitude level and its bandwidth. The unit of sharpness is acum (Latin expression). The reference sound producing 1 acum is a narrowband noise, one critical band wide, with 1 kHz center frequency at 60 dB intensity level [21]. Sharpness, $S$, is mathematically defined as

$$S = 0.11 \, \frac{\sum_{0}^{24\,\mathrm{Bark}} L' \cdot g(z) \cdot z \cdot dz}{\sum_{0}^{24\,\mathrm{Bark}} L' \cdot dz} \ \mathrm{acum} \qquad (9)$$

Its numerator is a weighted sum of the specific loudness ($L'$) over the critical-band rates, and its denominator is the total loudness. The weighting function, $g(z)$, depends on the critical-band rate. The $g(z)$ could be interpreted as the mathematical model for the sensation of sharpness shown in Figure 16.
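The loudness and sharpness sums over the Bark scale can be sketched numerically with the step size dz = 0.1 mentioned earlier. In this sketch the specific-loudness profile is a made-up flat profile and the weighting function g(z) is passed in by the caller; the constant g(z) = 1 used in the example is a placeholder assumption, not the weighting curve of [56]. The function names are hypothetical.

```python
def total_loudness(specific_loudness, dz=0.1):
    """Sum the specific loudness over the critical-band rate."""
    return sum(l * dz for l in specific_loudness)

def sharpness(specific_loudness, g, dz=0.1):
    """0.11 * sum(L' g(z) z dz) / sum(L' dz), in acum."""
    numerator = 0.0
    for i, l in enumerate(specific_loudness):
        z = (i + 0.5) * dz  # centre of the i-th step on the Bark axis
        numerator += l * g(z) * z * dz
    return 0.11 * numerator / total_loudness(specific_loudness, dz)

# Hypothetical flat specific loudness over 0-24 Bark (240 steps of 0.1 Bark).
profile = [1.0] * 240
loud = total_loudness(profile)                # 24.0
sharp = sharpness(profile, lambda z: 1.0)     # 0.11 * 12 = 1.32
```

With a flat profile and a constant weight the ratio reduces to the mean critical-band rate (12 Bark), so any profile that concentrates loudness in higher bands increases the value, which matches the interpretation of sharpness as a loudness-weighted position on the Bark scale.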

Fig. 13. The left sub-figure (a) shows the articulation rate (nsyll/phonation time) for the patients with PD and healthy controls. It shows that the healthy controls exhibit a significantly higher articulation rate than the patients with PD, in accordance with the findings in [39]. The right sub-figure depicts the speech rate (number of syllables/duration) for the same case. The healthy control showed a higher speech rate than the patient with Parkinson's disease: 3.74 for the healthy control and 2.86 for the PD subject. The analysis was done using Praat [3] and the bar graph plots were generated using R statistical analysis software.

Fig. 14. Block diagram of pathological speech processing module in proposed Fog architecture. The speech signal is ﬁrst enhanced to

reduce the non-stationary background noise. Next, speech activity is detected to identify the speech regions and discard pauses/silences.

Speech activity detection reduces the computation by ignoring non-speech frames. Finally, the speech is used for computing clinically

relevant features using mathematical models of auditory perception.

Speech Rate and Articulation Rate Praat scripting is extensively used in speech analysis, and some of our analyses were done using the Praat scripting language. Slurred speech, breathy and hoarse speech, and difficulty in fast-paced conversations are some of the symptoms of Parkinson's disease. A progressive decrease in vocal sonority and intensity at the end of phonation is also observed in patients with PD [39]. The literature suggests that speech and articulation rates decrease in PD, and that there is a causal link between the duration and severity of PD and this decrease in articulation rate [39]. Articulation rate is a prosodic feature and is defined as a measure of the rate of speaking excluding pauses. Speech rate is usually defined as the number of sounds a person can produce in a unit of time [39]. As illustrated in [3], speech rate is calculated by detecting syllable nuclei. We used Wempe's algorithm for estimating the speech rate [3]. Praat scripts were used for the analysis of speech rate and articulation rate. Two sound samples were chosen for comparative analysis. The samples comprised a healthy control and patients with PD. Figure 13 shows the bar chart for articulation rate and speech rate.

Fig. 15. Variations in frequency in sustained /a/ (task t1), HIGHS (task t2), and LOWS (task t3) for several speech samples.

Fig. 16. The comparison of average sharpness of the speech signal obtained from in-home trials of a patient. The six days of two weeks are compared with respect to average sharpness (in acum). These two weeks are separated by one week. Low sharpness shows high sensory pleasantness in a speech signal. We can see that the evolution of sharpness on different days is very complicated even during the same week. This is because the speech disorders are unique for each patient with PD.

4.2 Case Study II: Phonocardiography (PCG)-based Heart Rate Monitoring

Phonocardiography refers to the acquisition of heart sounds, which contain signatures of abnormalities in the cardiac cycle. There are two major sounds, S1 and S2, associated with the cycle of cardiac rhythm. Traditionally, specialized clinicians listen to heart sounds using devices such as stethoscopes for cardiac diagnosis. Such examination needs specialized training [24]. The authors of [47] developed a computationally inexpensive method for preliminary diagnosis of heart sounds. Segmentation of PCG signals and estimation of heart rate from them has been done primarily using two approaches. The first approach uses ECG as a reference for synchronization of cardiac cycles. The second approach relies solely on the PCG signal and is appropriate for wearable devices that rely on a smaller number of sensors.

Fig. 17. Proposed method for estimation of heart rate from the PCG signal. We first apply low-pass filtering to reduce the high-frequency noise. It is followed by downsampling to reduce the computational complexity. Next, the Hilbert envelope is extracted and the envelope is processed with the Teager energy operator (TEO). The output of the TEO is smoothed by Savitzky-Golay filtering. We performed moving averaging for further enhancement of the peaks corresponding to heart sound S1. Dividing 60 by the time-period of heart sound S1 (in seconds) gives the heart rate in beats per minute (BPM). The normal heart rate lies in the range 70-200 BPM. Significant deviation from this range shows abnormality in the cardiac cycle. This method was implemented in Python and executed on the Fog computer.

Fig. 18. Time-domain PCG signal for four conditions, namely normal, asd, pda, and diastolic. The variations in these signals reflect the corresponding cardiac functions.

In this chapter, we integrated the analysis method into the Fog framework to provide local computing on the Fog device. With the growing use of wearables [10] for acquiring PCG data, there is a need to process such data for preliminary diagnosis. Such preliminary diagnosis refers to segmentation of the PCG signal into heart sounds S1 and S2 and extraction of the heart rate. Figure 17 shows the proposed scheme for analysis of PCG data for extracting the heart rate. We detect the time-points of heart sounds S1 and S2. Later, these are used for extracting the heart rate. The development and execution of a robust algorithm on the Fog device is a novel contribution of this chapter.

PCG Data Acquisition PCG signals were acquired using wearable microphones kept close to the chest. Such wearable devices can send data to a nearby Fog device through a smartphone/tablet (gateway). Fog saves the PCG data in .wav format, sampled at 800 Hz with 16-bit resolution. The Microsoft wav format is a lossless format and is widely used for healthcare sound data. We do not discuss the hardware details as our primary goal is computing signal features on the Fog device. The segmentation step (see block diagram in Figure 17) separated the heart sounds S1 and S2 from the denoised PCG signal. The heart sounds S1 and S2 capture the acoustic cues from the cardiac cycle. The peak-to-peak time-distance between two successive S1 sounds makes one cardiac cycle. Thus, the time-distance between two S1 sounds determines the heart rate.

Fig. 19. Envelope of the PCG signal obtained using the procedure shown in the block diagram (see Figure 17) for four conditions, namely normal, asd, pda, and diastolic. The envelope shows clear transitions in the PCG signal that can be further processed for localizing the fundamental sound S1 and hence estimating the heart rate in beats per minute (BPM).

We used data from four scenarios of cardiac cycles, namely normal, asd, pda, and diastolic. The label 'normal' refers to a normal heartbeat from a healthy person. The label 'asd' refers to PCG data induced by an atrial septal defect (a hole in the wall separating the atria). The label 'pda' refers to a PCG signal induced by patent ductus arteriosus (a condition wherein a duct between the aorta and pulmonary artery fails to close after birth). The last one, 'diastolic', refers to a PCG signal corresponding to a diastolic murmur (leakage in the atrioventricular or semilunar valves). Figure 18 shows the time-domain PCG signals corresponding to these scenarios. We can see that the PCG signals contain signatures of cardiac functioning and that the conditions are clearly distinguishable in these time-domain signals. Figure 19 shows the envelopes of these signals (see Figure 17). We can see that the envelope tracks the time-domain variations better.

Noise Reduction in PCG data The PCG signal was acquired at 800 Hz to capture high-fidelity data. Some noise is inherently present in data collected using wearable PCG sensors. We applied low-pass filtering using a sixth-order Butterworth filter with a cutoff frequency of 100 Hz. It reduces the noise while retaining the spectral components of the cardiac cycle. We then downsampled the low-pass filtered signal to reduce the computational complexity.

Fig. 20. Detecting heart sound S1 and using it for heart rate estimation in units of beats per minute (BPM). The top sub-figure shows the low-pass filtered and downsampled PCG signal. The middle sub-figure shows the envelope obtained by the procedure shown in the block diagram of Figure 17. The bottom sub-figure shows the final post-processed envelope. It is clear that by choosing a suitable threshold, we can detect the S1 sound from the PCG signal. Since the cardiac cycle length (in seconds) is the same as the time-difference between two S1 sounds (in seconds), we can estimate the heart rate by dividing 60 by it.

Teager Energy Operator for Envelope Extraction The Teager Energy Operator (TEO) is a nonlinear energy function [31]. TEO captures the signal energy based on physical and mechanical aspects of signal production. It has been used successfully in various applications [37,35]. For a discrete signal $x[n]$, it is given by

$$\Psi(x[n]) = x[n] \cdot x[n] - x[n+1] \cdot x[n-1] \qquad (10)$$

where $\Psi(x[n])$ is the TEO corresponding to the sample $x[n]$. We applied TEO to the downsampled signal (see Figure 17) to extract the envelope. The TEO output is further smoothed using Savitzky-Golay filtering. Savitzky-Golay filters are polynomial filters that achieve least-squares smoothing. These filters performed better than standard finite impulse response (FIR) smoothing filters [43]. We used fifth-order Savitzky-Golay smoothing filters with a frame length of 11. Next, we perform moving-average filtering on the smoothed TEO envelope, using a window length of 11. In the next step, the output of the moving-average filter is mean and variance normalized to suppress the channel variations.
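The envelope-extraction chain described above can be partly sketched in plain Python. The sketch covers the Teager energy operator, the moving average with a window length of 11, and the mean and variance normalization; the Savitzky-Golay smoothing stage is omitted here for brevity. The function names and the edge handling of the moving average (shorter windows at the signal boundaries) are assumptions.

```python
import math

def teager_energy(x):
    """Psi(x[n]) = x[n]*x[n] - x[n+1]*x[n-1] for the interior samples."""
    return [x[n] * x[n] - x[n + 1] * x[n - 1] for n in range(1, len(x) - 1)]

def moving_average(x, window=11):
    """Centred moving average; windows are truncated at the edges."""
    half = window // 2
    out = []
    for i in range(len(x)):
        segment = x[max(0, i - half):i + half + 1]
        out.append(sum(segment) / len(segment))
    return out

def mean_variance_normalize(x):
    """Shift to zero mean and scale to unit variance."""
    mu = sum(x) / len(x)
    sd = math.sqrt(sum((v - mu) ** 2 for v in x) / len(x))
    return [(v - mu) / sd for v in x] if sd > 0 else [0.0] * len(x)

# For a pure tone A*cos(W*n), the TEO output is the constant A^2 * sin(W)^2.
tone = [2.0 * math.cos(0.3 * n) for n in range(200)]
psi = teager_energy(tone)
```

The constant output for a pure tone illustrates why the operator is useful as an envelope tracker: it responds to the product of amplitude and frequency rather than to the instantaneous sample value, so bursts such as heart sounds stand out from the background.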

Heart Rate Estimation by Segmentation of Heart Sounds S1 The heart sound S1 marks the start of systole. It is generated by the closure of the mitral and tricuspid valves, which govern blood flow from the atria to the ventricles. It happens when blood has returned from the body and lungs. The heart sound S2 marks the end of systole and the beginning of diastole. It is generated upon closure of the aortic and pulmonary valves, following which the blood moves from the heart to the body and lungs. Under still conditions, the average heart-sound durations are 70-150 ms for S1 and 60-120 ms for S2. The cardiac cycle lasts for 800 ms, where the systolic period is around 300 ms and the diastolic period is 500 ms [54].

The mean and variance normalized envelope is used for detecting the fundamental heart sound (S1). Since S1 marks the span of the cardiac cycle, we compute the time-distance between two S1 locations, which gives the length of the cardiac cycle (in seconds). Dividing 60 by this length (see Figure 17) gives the heart rate in beats per minute (BPM); for example, a cardiac cycle of 800 ms corresponds to 60/0.8 = 75 BPM. Under normal cardiac functioning, the heart rate lies in the range 70-200 BPM. If the estimated heart rate is significantly larger than this range over a long duration of time, it indicates some abnormality in health. It is worth noting that intense exercise such as running on a treadmill or cycling can also cause an increase in heart rate. The Fog computer receives the PCG signals from wearable sensors and extracts the heart rate in BPM for each frame. We chose time-windows of size two seconds with 70% overlap between successive windows.
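The final step, going from detected S1 locations to a heart rate, can be sketched as below. The peak picker is a simple threshold detector with a minimum spacing between peaks; the threshold value, the 0.3 s minimum gap and the function names are illustrative assumptions rather than the settings used on the Fog computer. The heart rate is obtained as 60 divided by the mean S1-to-S1 interval in seconds.

```python
def detect_peaks(envelope, fs, threshold, min_gap_s=0.3):
    """Indices of local maxima above threshold, at least min_gap_s apart."""
    min_gap = int(min_gap_s * fs)
    peaks = []
    for n in range(1, len(envelope) - 1):
        is_local_max = envelope[n] > envelope[n - 1] and envelope[n] >= envelope[n + 1]
        if envelope[n] >= threshold and is_local_max:
            if not peaks or n - peaks[-1] >= min_gap:
                peaks.append(n)
    return peaks

def heart_rate_bpm(peak_indices, fs):
    """60 divided by the mean interval (in seconds) between successive peaks."""
    intervals = [(b - a) / fs for a, b in zip(peak_indices, peak_indices[1:])]
    if not intervals:
        raise ValueError("at least two S1 peaks are required")
    return 60.0 / (sum(intervals) / len(intervals))

# Synthetic envelope at 100 Hz with an S1-like spike every 0.8 s.
fs = 100
envelope = [0.0] * 400
for idx in (50, 130, 210, 290):
    envelope[idx] = 1.0
s1_peaks = detect_peaks(envelope, fs, threshold=0.5)
bpm = heart_rate_bpm(s1_peaks, fs)  # about 75 BPM for a 0.8 s cycle
```

On real envelopes the S2 sound also produces a peak within each cycle, so the threshold and minimum gap would need to be tuned so that only S1 is retained.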

Fig. 21. An example typical time-domain ECG waveform showing phases P, QRS complex and T.

4.3 Case Study III: Electrocardiogram (ECG) Monitoring

Heart diseases are among the major chronic illnesses, with a dramatic impact on the productivity of affected individuals and on related healthcare expenses. As ECG sub-systems are considered for more out-of-hospital applications, manufacturers face continued pressure to reduce system cost and development time while maintaining or increasing performance levels. The electrocardiogram (ECG) is a diagnostic tool to assess the electrical and muscular functions of the heart. The ECG signal consists of components such as the P wave, PR interval, RR interval, QRS complex, pulse train, ST segment, T wave, QT interval and, infrequently, a U wave. The presence of arrhythmias changes the QRS complex, RR interval and pulse train. For instance, a narrow QRS complex (<120 milliseconds) indicates rapid activation of the ventricles, which in turn suggests that the arrhythmia originates above or within the His bundle (supraventricular tachycardia), whereas a wide QRS complex (greater than 120 milliseconds) occurs when ventricular activation is abnormally slow. The most common reason for a wide QRS complex is an arrhythmia of the ventricular myocardium (e.g., ventricular tachycardia) [5]. Figure 21 shows an ECG time series with the P wave, T wave and QRS complex. These three patterns are searched for using DTW in a large number of ECG data sets. The last section of this case study will discuss the data reduction using DTW and GNU zip compression on ECG data. The goal of our experiment is to detect arrhythmic ECG beats or QRS changes using the QRS complex and the RR interval measurements. The ECG data is fed to the Fog computer from an Internet-based database. The Fog computer extracts the QRS complex from ECG signals using real-time signal processing implemented in Python on Intel Edison. The Pan-Tompkins algorithm is used for detection of the QRS complex [45]. The Pan-Tompkins algorithm consists of five steps:

Band Pass Filtering The energy contained in the QRS complex lies approximately in the 5-15 Hz range [5]. We apply a band-pass filter to extract the 5-15 Hz content of ECG signals. The band-pass filter reduces muscle noise, 60 Hz power-line interference, baseline wandering and T wave interference. This filter achieves a 3 dB pass-band from about 5-12 Hz. The high-pass filter is designed by subtracting the output of a first-order low-pass filter from an all-pass filter with a delay of 16 samples (80 ms) [45].

Derivation The output of band-pass ﬁlter is differentiated to get the slope. It uses a ﬁve-point derivative. After

differentiation, the output signal is squared to get only positive values. It performs non-linear ampliﬁcation of

the output suppressing the values lower than 1. A moving-window integration is applied on output of last step.

It smoothens the output resulting in multiple peaks within duration of QRS complex. It adapts to changes in the

ECG signal by estimating the signal and noise peaks for ﬁnding the R-peaks (Figure 22). The Pan-Tompkins

based QRS detection is implemented on ECG signals obtained from the MIT-BIH Arrhythmia Database [4]. Figure 22

illustrates the QRS detection using the Pan-Tompkins algorithm on the Intel Edison using MIT-BIH Arrhythmia

data. An ECG signal containing 2160 samples takes 1 second of processing time on the Intel Edison Fog computer.

This shows that the proposed Fog architecture is well suited for real-time ECG monitoring. We used DTW based

pattern mining for P wave, T wave and QRS complex in ECG data. The DTW indices showing the location


Fig. 22. Illustration of QRS detection using Pan-Tompkins algorithm; (a) Raw ECG data; (b) ECG signal after band-pass ﬁltering and

derivation; (c) Squaring the data; (d) Integration and thresholding to detect QRS; (e) Pulse train of ECG signal.

of these patterns in the ECG time series are sent to the cloud. Similarly, we use the GNU zip program to compress the

original ECG time series. The compressed ECG data files are then sent to the cloud. Figure 23 shows the data

reduction resulting from DTW based pattern mining with compression. Similar to speech data, DTW reduces

ECG data by more than 98% in most of the cases while compression reduces around 91%. Figure 24 shows the

execution time (in seconds) for Pan-Tompkins based QRS detection implemented in Python on Intel Edison

Fog computer. The data sets from the MIT-BIH Arrhythmia Database are used. The size of the data sets ranges from

16.24 kB to 36.45 kB. The execution time increases with file size. The time taken is always less

than 15 seconds. This validates the efﬁcacy of Fog Data architecture for real-time ECG monitoring.
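The five Pan-Tompkins stages described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the implementation that ran on the Intel Edison: it assumes the 200 Hz sampling rate of the original filter design [45], writes the recursive filters as equivalent FIR kernels, and replaces the adaptive signal/noise thresholds with a single fixed fraction of the maximum plus a 200 ms refractory period. The names `fir`, `detect_qrs` and `synthetic_ecg`, and the synthetic test signal itself, are hypothetical.

```python
import math

FS = 200  # Hz; sampling rate assumed by the original Pan-Tompkins filter design


def fir(x, h):
    """Causal FIR filter y[n] = sum_k h[k] * x[n-k], with zero initial history."""
    return [sum(h[k] * x[n - k] for k in range(min(len(h), n + 1)))
            for n in range(len(x))]


def band_pass(x):
    # Low-pass (1 - z^-6)^2 / (1 - z^-1)^2 expanded into its triangular FIR kernel.
    low = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]
    # High-pass: 16-sample delay minus a 32-point moving average.
    high = [-1 / 32] * 32
    high[16] += 1.0
    return fir(fir(x, low), high)


def derivative(x):
    # Causal form of the five-point derivative.
    return fir(x, [2 / 8, 1 / 8, 0, -1 / 8, -2 / 8])


def moving_window_integral(x, width=30):
    # 30 samples correspond to a 150 ms window at 200 Hz.
    return fir(x, [1 / width] * width)


def detect_qrs(ecg, threshold_fraction=0.3, refractory=40):
    """Return sample indices where the integrated signal rises above the threshold.

    Simplification (assumption): a fixed threshold replaces the adaptive
    signal/noise peak estimates of the original algorithm.
    """
    squared = [v * v for v in derivative(band_pass(ecg))]
    integrated = moving_window_integral(squared)
    threshold = threshold_fraction * max(integrated)
    beats, last = [], -refractory
    for n in range(1, len(integrated)):
        rising = integrated[n] > threshold >= integrated[n - 1]
        if rising and n - last >= refractory:
            beats.append(n)
            last = n
    return beats


def synthetic_ecg(n_samples=2000, period=200, first=100):
    """Slow baseline wander plus one triangular 'QRS' spike per second."""
    x = [0.1 * math.sin(2 * math.pi * 0.3 * n / FS) for n in range(n_samples)]
    for centre in range(first, n_samples, period):
        for k in range(-8, 9):
            x[centre + k] += 1.0 - abs(k) / 8
    return x


beats = detect_qrs(synthetic_ecg())
rr_intervals_s = [(b - a) / FS for a, b in zip(beats, beats[1:])]
```

The detected indices lag the true R-peaks by the combined filter delay, so a practical implementation would subtract that delay before reporting RR intervals.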


Fig. 23. Comparison of data reduction resulting from DTW based pattern mining and GNU zip based compression for ECG data

obtained from MIT-BIH Arrhythmia Database [4].
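To illustrate how DTW-based pattern mining reduces the data, the sketch below locates a short template (for example a QRS-like shape) in a longer series and returns only its index. It is a simplified stand-in, assuming a plain absolute-difference cost and an exhaustive sliding window; the function names `dtw_distance` and `find_pattern` and the toy series are hypothetical, not the chapter's code.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            # Extend the cheapest of the three admissible warping steps.
            curr[j] = abs(x - y) + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return prev[-1]


def find_pattern(series, template):
    """Slide a window over the series and return (best start index, DTW distance)."""
    width = len(template)
    scores = [dtw_distance(template, series[s:s + width])
              for s in range(len(series) - width + 1)]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


template = [0.0, 0.5, 1.0, 0.5, 0.0]
series = [0.0] * 30 + template + [0.0] * 30
index, distance = find_pattern(series, template)
```

Only `index` (one integer per detected pattern) would need to be sent to the cloud instead of the full series, which is where the large reduction reported in Figure 23 comes from.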

Fig. 24. Execution time (in seconds) for Pan-Tompkins based QRS detection on Intel Edison Fog computer for ECG data from the MIT-BIH Arrhythmia Database [4].

Fig. 25. Comparing loudness computed from speech signal recorded by smartwatch at sampling rate of 44.1 kHz and half of it. We can

see the variations are low. The mean change with respect to 44.1 kHz is 2.86% with a standard deviation of 1.26%.


Fig. 26. Comparing fundamental frequency (in Hz) computed from speech signal recorded by smartwatch at sampling rate of 44.1 kHz

and half of it. We can see the variations are low. The mean change with respect to 44.1 kHz is 0.0818 % with a standard deviation of

0.1786 %.

Fig. 27. (a) The data collected from one of the patients for 8 days. The speech data collected was well structured with date and time stamps. (b) The pie chart shows the user acceptability of the proposed system during in-home trials. By non-positive, we mean neither a positive nor a negative inclination towards the proposed system. One participant with severe motor disorders needed some effort to use the smartwatch and hence had neither a positive nor a negative inclination.

Table 3. Latency measurements of Fog for computing the clinical speech features, namely zero crossing rate (ZCR), spectral centroid (SC), and short-time energy (STE).

Speech task   Processing time (s)   File duration (s)   Size (kB)
Task1         2.34                  6.24                551
Task2         2.33                  6.18                545
Task3         2.12                  5.62                496
Task4         2.28                  6.08                537
Task5         1.86                  4.96                438
Total         10.94                 29.08               2567
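One way to read Table 3 is as a real-time factor, that is, processing time divided by audio duration. The short sketch below computes it per task from the values in the table; the name `real_time_factor` is introduced here only for illustration.

```python
# Values copied from Table 3: task -> (processing time in s, file duration in s).
TABLE_3 = {
    "Task1": (2.34, 6.24),
    "Task2": (2.33, 6.18),
    "Task3": (2.12, 5.62),
    "Task4": (2.28, 6.08),
    "Task5": (1.86, 4.96),
}


def real_time_factor(processing_s, duration_s):
    """Processing time per second of audio; below 1.0 means faster than real time."""
    return processing_s / duration_s


factors = {task: real_time_factor(p, d) for task, (p, d) in TABLE_3.items()}
for task, factor in factors.items():
    print(f"{task}: {factor:.3f}")
```

Every task comes out well below 1.0, meaning the fog node computes the three features in less time than the recording lasts.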

5 Experiments & Results

5.1 Intel Edison Description

The Intel Edison platform used in this application was designed with a core system consisting of dual-core,

dual-threaded Intel Atom CPU at 500MHz and a 32-bit Intel Quark microcontroller at 100MHz, along with

connectivity interfaces capable of Bluetooth 4.0 and dual-band IEEE 802.11a/b/g/n via an on-board chip antenna.

This platform came with a Linux environment called Yocto, which is not itself an embedded Linux distribution;

its purpose is to provide an environment for developing a custom Linux distribution. We did not

create a Linux distribution; instead, we deployed a prebuilt distribution of Debian/Jessie for 32-bit systems.

This decision was made so that we could deploy the same environment on both the Intel Edison and the

Raspberry Pi.


Fig. 28. The average loudness for two days on task "Highs" (task t2) and "Lows" (task t3) for six patients doing speech exercises at home. The data was processed with Fog in real-time. It illustrates the Fog functionality to compute these features.

Fig. 29. Showing average loudness and pitch for each day for in-home trials for six patients. The patients used Fog for alternate weeks.

We can see that each patient has a different trend for change in loudness and pitch. Interpretation of these variations is done by trained

clinicians such as speech-language pathologists (SLPs). Fog computes these features and syncs them to the secured cloud backend, from

where they can be accessed by SLPs and caregivers.

5.2 Raspberry Pi Description

The Raspberry Pi Model B platform used in this application was designed with a core system consisting of

a 900MHz 32-bit quad-core ARM Cortex-A7 CPU and 1GB RAM. Since the Raspberry Pi did not

have built-in Wi-Fi connectivity, a Wi-Fi dongle based on the Realtek RTL8188CUS chipset was installed. This

platform came with a custom Linux distribution called Raspbian. Since Raspbian would provide a slightly

different environment, it was replaced with the Debian/Jessie distribution used on the Intel Edison.

5.3 Fog Computing: Feature Extraction on Fog devices

The fog devices, the Intel Edison and Raspberry Pi, were both configured to run the same Debian/Jessie i386

distribution. Once the distribution was set up, the same version of Octave (3.8.2-4) was installed on both devices, along

with the additional packages required to perform the processing required by our algorithms. We also ensured

that both gateway devices tracked system performance using the same tools. The tools we used included the

Linux program top and the Octave function Profiler. The top program provided real-time insights into CPU

load, memory usage, and run-times for processes or threads being managed by the Linux kernel. This was

used later to provide us with benchmarking for the system overall. The Octave function Profiler provided

insights into the run-times for each section of the algorithm. This was used later to determine which parts of

the algorithm required more time to complete.


5.4 Benchmarking and Program Setup

The gateway devices were logged into remotely via the SSH protocol. From there, we ran the same benchmarking scripts on both devices. The scripts would start Octave and load it with the data and the use-case based algorithm, while top was started in parallel. The script searched the top output for the process ID (PID) of this new instance of Octave. Once the PID was determined, the script extracted all the information top provided about the system's performance and the load imposed on the system by this instance of Octave. The extracted information was logged into a CSV file and saved for analysis after the algorithm ran its course. Once the instance of Octave was ready to run the algorithm, it started the Profiler function in the background. At the conclusion of the algorithm, the Profiler's data set was stored in a .mat file for later analysis.
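The logging step of such a benchmarking script might look like the sketch below. It is an assumption-laden illustration, not the original script: it presumes the default procps `top` column order in batch mode (PID, USER, PR, NI, VIRT, RES, SHR, S, %CPU, %MEM, TIME+, COMMAND), and the sample lines, the PID 2417 and the helper names `parse_top_line` and `log_process_rows` are hypothetical.

```python
import csv
import io

# Assumed default column order of procps top in batch mode.
TOP_FIELDS = ["pid", "user", "pr", "ni", "virt", "res", "shr",
              "state", "cpu_pct", "mem_pct", "time", "command"]


def parse_top_line(line):
    """Split one process line of top output into a dict keyed by TOP_FIELDS."""
    parts = line.strip().split(None, len(TOP_FIELDS) - 1)
    if len(parts) != len(TOP_FIELDS):
        raise ValueError(f"unexpected top line: {line!r}")
    row = dict(zip(TOP_FIELDS, parts))
    row["pid"] = int(row["pid"])
    row["cpu_pct"] = float(row["cpu_pct"])
    row["mem_pct"] = float(row["mem_pct"])
    return row


def log_process_rows(lines, pid, out):
    """Write every top sample belonging to `pid` to `out` as CSV; return the count."""
    writer = csv.DictWriter(out, fieldnames=TOP_FIELDS)
    writer.writeheader()
    count = 0
    for line in lines:
        fields = line.split()
        if fields and fields[0] == str(pid):
            writer.writerow(parse_top_line(line))
            count += 1
    return count


# Hypothetical captured output; a real script would read it from `top -b`.
sample = [
    "  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND",
    " 2417 root      20   0  152340  48212   9876 R  87.5  4.9   0:12.34 octave",
    "  901 root      20   0    5120   2100   1800 S   0.3  0.2   0:00.51 sshd",
    " 2417 root      20   0  153002  49100   9876 R  91.2  5.0   0:13.40 octave",
]
buffer = io.StringIO()
n_rows = log_process_rows(sample, 2417, buffer)
logged = list(csv.DictReader(io.StringIO(buffer.getvalue())))
```

Filtering on the PID rather than on the command name keeps the log limited to the Octave instance under test even if other Octave processes are running.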

5.5 Bandwidth & Data Reduction

We conducted an experiment to measure the percentage by which Fog could reduce the data by processing the audio files using the proposed Fog architecture. In our previous studies [16], we developed a clinical speech processing chain (CLIP), a series of filtering operations applied to the speech data for computing clinical features such as loudness and fundamental frequency. In the present chapter, we incorporated several new features in addition to the loudness and fundamental frequency used in [16]. We took 20 audio files and processed them with two methods:

1. The conventional method of compressing the files using GNU zip [25] and sending them to the cloud server for further processing;

2. Extracting the clinical features on the fog computer (proposed Fog architecture).

Table 3 lists the performance of the Fog computer with respect to computation of clinical features. Figure 8 shows the percentage reduction in data size achieved by clinical speech processing and GNU zip compression. There are large gains from processing data on the Fog computer and sending only the features to the cloud, compared to sending the original files to the cloud.
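The two methods can be compared on a toy signal as follows. This is only a sketch under stated assumptions: the audio is a synthetic noisy tone rather than patient speech, the features are limited to zero crossing rate and short-time energy per 20 ms frame, and the feature payload is serialised as JSON. The resulting percentages are illustrative and are not the figures reported in this chapter.

```python
import gzip
import json
import math
import random
import struct

FS = 8000          # Hz; assumed sampling rate for the toy signal
FRAME_LEN = 160    # 20 ms frames at 8 kHz


def frame_features(samples, frame_len):
    """Zero crossing rate and short-time energy for each non-overlapping frame."""
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        zcr = crossings / (frame_len - 1)
        ste = sum(v * v for v in frame) / frame_len
        features.append([round(zcr, 4), round(ste, 6)])
    return features


def percent_reduction(original_bytes, reduced_bytes):
    return 100.0 * (1 - reduced_bytes / original_bytes)


# Synthetic two-second "recording": a 220 Hz tone plus uniform noise.
rng = random.Random(0)
samples = [0.5 * math.sin(2 * math.pi * 220 * n / FS) + rng.uniform(-0.1, 0.1)
           for n in range(2 * FS)]
pcm = struct.pack(f"<{len(samples)}h", *[int(v * 32767) for v in samples])

# Method 1: compress the raw file with gzip and send it.
gzip_bytes = len(gzip.compress(pcm))
# Method 2: extract features on the fog node and send only those.
feature_bytes = len(json.dumps(frame_features(samples, FRAME_LEN)).encode())

gzip_reduction = percent_reduction(len(pcm), gzip_bytes)
feature_reduction = percent_reduction(len(pcm), feature_bytes)
```

Lossless compression has to preserve every sample, whereas feature extraction keeps only two numbers per frame, which is why the second method shrinks the payload far more.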

5.6 Engineering Perspectives

Wearables such as smartwatches and gateways such as smartphones and tablets had to be charged at least once a day. If patients want to exercise while away from home, they need to carry the tablet with them. Patients were asked to exercise in a quiet place where noise is very low or negligible. The patients could wear the smartwatch all the time. The tablet and the smartwatch need to be within a range of 50 meters of each other. The speech recordings were saved with date and time stamps, which helped in sorting and querying them in the cloud database. The participants can choose to switch on the recording system using the smartwatch when they want to perform their vocal and other exercises. Similar procedures apply to other wearables.

5.7 Medical Data Analytics and Visualization

Part (a) of Figure 27 shows the size of the speech data collected from one of the patients over eight days. The least amount of data (24 MB) was collected on the first day. On later days, the data size increased. Part (b) of Figure 27 shows the patients' feedback on using the IoT PD technology for facilitating the remote monitoring of their vocal and speech exercises. Five out of six participants in the in-home trials reported a pleasant experience in using it. One participant had problems using it during the first week. This patient had severe movement disorders in addition to a speech disorder, which made it difficult to switch the smartwatch ON/OFF. One week later, we made a software update that provided an easier mechanism for switching ON/OFF. After using the updated IoT PD, the patient reported that it was easy to use. Counting this one response out of six as neutral, we depict the user experience as the pie chart shown in Figure 27 (b).

6 Practical Insights

6.1 Data Vs. Fog Data for Cloud Storage

Table 4 shows an approximation of the cloud storage requirements when we compare the conventional model

of raw data transfer with the presented Fog Data. It is clear that for the long-term continuous data including


Fig. 30. Showing effect of downsampling on loudness. We can see that by capturing pathological speech at lower sampling rate, we

are still approximately at same loudness level. The lower sampling rate would lead to lower power consumption in battery-operated

wearable devices.

Table 4. Cloud storage requirement for 100 patients undergoing speech tele-therapy at home.

Time      Raw Data   Fog Data
1 Day     12 GB      0.12 GB
1 Week    84 GB      0.84 GB
1 Month   360 GB     3.6 GB
1 Year    4079 GB    43.8 GB

speech and ECG, Fog Data architecture reduces the storage requirements tremendously and ultimately cuts the

storage and maintenance cost as well as power demand on the cloud. Moreover, the reduced storage reduces

the complexity of Big Data Analytics.

Figure 30 shows the loudness computed by capturing pathological speech data at 44.1 kHz and downsampling it to half that rate. Downsampling degrades the perceptual quality of pathological speech but lowers power consumption. The graph shows that, if needed, a lower sampling rate can still be useful in situations where power consumption on wearable devices is an issue.
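The effect of halving the sampling rate on a level estimate can be illustrated with the sketch below. It makes two loud assumptions: the RMS level in dB stands in for the psychoacoustic loudness used in this chapter, and the input is a synthetic low-frequency tone rather than recorded speech. The helper names are hypothetical.

```python
import math

FS = 44100         # Hz; original sampling rate
DURATION_S = 0.2


def rms_level_db(samples):
    """RMS level in dB relative to full scale (a crude proxy for loudness)."""
    rms = math.sqrt(sum(v * v for v in samples) / len(samples))
    return 20 * math.log10(rms)


def downsample_by_two(samples):
    """Average adjacent pairs (a simple anti-alias filter) and keep one value per pair."""
    return [(samples[i] + samples[i + 1]) / 2 for i in range(0, len(samples) - 1, 2)]


tone = [0.5 * math.sin(2 * math.pi * 200 * n / FS)
        + 0.25 * math.sin(2 * math.pi * 400 * n / FS)
        for n in range(int(FS * DURATION_S))]
half_rate = downsample_by_two(tone)

full_db = rms_level_db(tone)
half_db = rms_level_db(half_rate)
change_db = abs(full_db - half_db)
```

For content concentrated well below the new Nyquist frequency, the level barely changes, which is consistent with the small loudness variation reported for the half-rate recordings.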

System Complexity: Our experiences with Fog provide evidence that establishing an intelligent computing

resource in remote settings where the patients were located was not only challenging in terms of hardware

development and programming, but also required the interdependence of many tools and libraries to build

automated exchange of information among various elements of telemedicine. For example, Table 5 shows the

various tools needed to bring autonomy, configurability, security, and smart computing to the Intel Edison. We

spent months on a systematic survey of what was available and what was useful. Surveying the useful

tools was time-consuming yet rewarding.

Table 5. Various resources required to develop Fog architecture.

Languages       Tools                      Usage
MySQL           SequelPro, DataGrip        Run queries and check database tables for testing purposes
PHP             PhpStorm, Postman          Code the server to return the data and information to the mobile application
Python          PyCharm                    Code the data processing, storage, transmission, and interfacing with database
Android         IntelliJ, Android Studio   Code the client transfer code and the complete mobile application
All languages   Atom, Sublime Text 2       Editors that can be used with all languages


6.2 Compatibility Issues

There were countless instances when we had to find unconventional ways to establish intelligence in Fog. For

example, installing the Praat Python library on the Intel Edison was extremely difficult.

6.3 Security & Privacy

In this work, we presented how the fog computer could be conﬁgured for computation and database access. We

also touched upon Fog security from the authenticated access point-of-view. However, we believe that security

needs can be addressed more rigorously since Fog allows us to conﬁgure the fog computing node remotely and

inject algorithms that could make the communication and storage more secure.

6.4 Challenges in using Fog computing for Telemedicine

No system is perfect, and fog computing is no exception. There are difficulties in deploying the Fog architecture for

telemedicine applications. Although fog computing performs data computation on the edge, reducing

the data significantly, the reduction is irreversible when only analytics are communicated to the cloud. The

fog has limited storage space, such that it can only store data for days or weeks, depending on the type of data.

In our case, the data were audio ﬁles that could easily exceed the storage limit on the fog within a few days.

An alternative is to create a query mechanism to access the data on fog when the clinicians want to listen to the

audio ﬁles. Furthermore, since the raw data was not communicated to the cloud, there was no way to perform

additional analysis in the cloud. In other words, it is necessary to ensure the reliability of the computational

models used for analysis of the data before they are injected into the fog computing resources.

7 Conclusions

We presented a multi-layer telemedicine architecture of the fog-assisted Medical Internet-of-things that was

implemented on Intel Edison with layers for hardware, middleware (communication and software), and application

(with security services). The Fog framework achieves intelligent gateway functions by processing audio

files with signal processing algorithms, such as psychoacoustic analysis, to extract the clinical features, and by storing

raw data and features that are queryable on demand by the cloud as well as the Fog interface. We also implemented

Android apps for stakeholders such as patients, healthcare providers and administrators who require

access to the backend database. This enabled speech-language pathologists (SLPs) to query the data showing

daily progress of their patients. Our case study demonstrated that managing computations on Intel Edison (fog

computer) reduces the data by 99%; though less data reduction would occur if more features were analyzed.

Our study also showed that it is possible to perform high-ﬁdelity signal processing on the fog device to extract

pathological speech features and communicate them to the cloud database.

Moreover, this chapter not only provides a high-level understanding of the fog-based IoT system, but also

provides details of how each layer was implemented, including the tools and libraries used in the development.

If implemented appropriately, Fog has great potential to provide more autonomy and reliability in

telemedicine applications driven by IoT. In the future, we plan to deploy Fog in patients' homes. This will help us

face operational challenges when the fog computer is located remotely in a different network.

Acknowledgments

Authors would like to thank the patients with Parkinson’s disease for their co-operation during validation stud-

ies reported in this chapter. This work was supported by a grant (No: 20144261) from Rhode Island Foundation

Medical Research and NSF grants CCF-1421823, CCF-1439011 and NSF CAREER CPS 1652538. Any opin-

ions, ﬁndings, and conclusions or recommendations expressed in this material are those of the author(s) and

do not necessarily reﬂect the views of the National Science Foundation or Rhode Island Foundation Medical

Research. Authors would like to thank Alyssa Zisk for proofreading the manuscript. Authors would like to

thank Manob Saikia, and Dr. Amir Mohammad Amiri for helpful discussions and suggestions for preparation

of this chapter.


References

1. http://www.siemens.com/innovation/en/home/pictures-of-the-future/digitalization-and-software/from-big-data-to-smart-data-infographic.html. accessed: 2015-10-21

2. National institute of deafness and other communication disorders, https://www.nidcd.nih.gov/health/statistics/statistics-voice-speech-and-language (2015)

3. Python script for PRAAT, https://github.com/JoshData/praat-py (2015)

4. http://www.physionet.org/physiobank/database/mitdb. online (2016), accessed

5. Alzand, B.S., Crijns, H.J.: Diagnostic criteria of broad QRS complex tachycardia: decades of evolution. Europace 13(4), 465–472

(2011)

6. Asgari, M., Shafran, I., Bayestehtashk, A.: Robust detection of voiced segments in samples of everyday conversations using

unsupervised hmms. In: IEEE Spoken Language Technology Workshop (2012)

7. Banse, R., Scherer, K.R.: Acoustic proﬁles in vocal emotion expression. Journal of personality and social psychology 70(3), 614

(1996)

8. Barik, R.K., Dubey, H., Samaddar, A.B., Gupta, R.D., Ray, P.K.: FogGIS: Fog Computing for Geospatial Big Data Analytics. In:

3rd IEEE Uttar Pradesh Section International Conference on Electrical, Computer and Electronics Engineering, India (2016)

9. Boersma, P.: Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound. In:

Proceedings of the institute of phonetic sciences. vol. 17, pp. 97–110. Amsterdam (1993)

10. Brusco, M., Nazeran, H.: Development of an intelligent pda-based wearable digital phonocardiograph. In: Proceedings of the 27th

IEEE Annual Conference on Engineering in Medicine and Biology. vol. 4, pp. 3506–3509 (2005)

11. Cisco: White paper published by cisco. fog computing and the internet of things: Extend the cloud to where the things are. (2015)

12. Cohen, I.: Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging. IEEE Transac-

tions on audio, speech and language processing 11(5), 466–475 (2003)

13. Constant, N., Borthakur, D., Abtahi, M., Dubey, H., Mankodiya, K.: Fog-Assisted wIoT: A Smart Fog Gateway for End-to-End

Analytics in Wearable Internet of Things. In: The 23rd IEEE Symposium on High Performance Computer Architecture HPCA,

Austin, Texas, USA (2017)

14. Ding, H., Trajcevski, G., Scheuermann, P., Wang, X., Keogh, E.: Querying and mining of time series data: experimental compari-

son of representations and distance measures. Proceedings of the VLDB Endowment 1(2), 1542–1552 (2008)

15. Dubey, H., Mehl, M.R., Mankodiya, K.: BigEAR: Inferring the Ambient and Emotional Correlates from Smartphone-Based

Acoustic Big Data. In: IEEE First International Conference on Connected Health: Applications, Systems and Engineering Tech-

nologies (CHASE), Washington DC, USA (June 2016)

16. Dubey, H., Goldberg, C., Abtahi, M., Mahler, L., Mankodiya, K.: EchoWear: Smartwatch Technology for Voice and Speech Treatments of Patients with Parkinson’s Disease. In: Proceedings of the Wireless Health 2015, National Institutes of Health, Baltimore, MD, USA. ACM (2015)

17. Dubey, H., Goldberg, J.C., Mankodiya, K., Mahler, L.: A multi-smartwatch system for assessing speech characteristics of people

with dysarthria in group settings. In: Proceedings IEEE 17th International Conference on e-Health Networking, Applications and

Services (Healthcom), Boston, USA (2015)

18. Dubey, H., Kumaresan, R., Mankodiya, K.: Harmonic sum-based method for heart rate estimation using PPG signals affected with motion artifacts. Journal of Ambient Intelligence and Humanized Computing pp. 1–14 (2016), http://dx.doi.org/10.1007/s12652-016-0422-z

19. Dubey, H., Yang, J., Constant, N., Amiri, A., Yang, Q., Mankodiya, K.: Fog Data: Enhancing Telehealth Big Data Through Fog

Computing. In: Proceedings of The Fifth ASE International Conference on BigData, Kaohsiung, Taiwan. ACM (2015)

20. Dysarthria: http://www.asha.org/public/speech/disorders/dysarthria/. accessed: 2015-10-21

21. Fastl, H., Zwicker, E.: Psychoacoustics: Facts and models, vol. 22. Springer Science & Business Media (2007)

22. Gamboa, J., Jiménez-Jiménez, F.J., Nieto, A., Montojo, J., Ortí-Pareja, M., Molina, J.A., García-Albea, E., Cobeta, I.: Acoustic voice analysis in patients with parkinson’s disease treated with dopaminergic drugs. Journal of Voice 11(3), 314–320 (1997)

23. Gavrila, D., Davis, L., et al.: Towards 3-d model-based tracking and recognition of human movement: a multi-view approach. In:

International workshop on automatic face-and gesture-recognition. pp. 272–277 (1995)

24. Geddes, L.: Birth of the stethoscope. IEEE Engineering in Medicine and Biology Magazine 24(1), 84–86 (2005)

25. GNU compression and decompression methods: https://www.gnu.org/software/gzip/gzip.html (2015)

26. Goldberg, J.C., Dubey, H., Mankodiya, K.: https://github.com/harishdubey123/wbl-echowear. online (2016), API for Hermes

27. Gonzalez, S., Brookes, M.: Pefac-a pitch estimation algorithm robust to high levels of noise. IEEE Transactions on Audio, Speech,

and Language Processing 22(2), 518–530 (2014)

28. J Holmes, R., M Oates, J., J Phyland, D., J Hughes, A.: Voice characteristics in the progression of parkinson’s disease. International

Journal of Language & Communication Disorders 35(3), 407–418 (2000)

29. JavaScript Object Notation: http://www.json.org/ (2015)

30. Johnston, J.D.: Transform coding of audio signals using perceptual noise criteria. IEEE Journal on Selected Areas in Communi-

cations 6(2), 314–323 (1988)

31. Kaiser, J.F.: Some useful properties of teager’s energy operators. In: IEEE International Conference on Acoustics, Speech, and

Signal Processing (ICASSP) (1993)

32. Kayyali, B., Knott, D., Van Kuiken, S.: The big-data revolution in us health care: Accelerating value and innovation. Mc Kinsey

& Company (2013)

33. Kent, R.D., Weismer, G., Kent, J.F., Vorperian, H.K., Duffy, J.R.: Acoustic studies of dysarthric speech: Methods, progress, and

potential. Journal of communication disorders 32(3), 141–186 (1999)

34. Kovacs-Vajna, Z.M.: A ﬁngerprint veriﬁcation system based on triangular matching and dynamic time warping. IEEE Transactions

on Pattern Analysis and Machine Intelligence 22(11), 1266–1276 (2000)


35. Kvedalen, E.: Signal processing using the teager energy operator and other nonlinear operators. Master, University of Oslo De-

partment of Informatics 21 (2003)

36. Lansford, K.L., Liss, J.M.: Vowel acoustics in dysarthria: Speech disorder diagnosis and classiﬁcation. Journal of Speech, Lan-

guage, and Hearing Research 57(1), 57–67 (2014)

37. Li, F., Gao, Y., Cao, Y., Iravani, R.: Improved teager energy operator and improved chirp-z transform for parameter estimation of

voltage ﬂicker. IEEE Transactions on Power Delivery 31(1), 245–253 (2016)

38. Mahler, L., Dubey, H., Goldberg, C., Mankodiya, K.: Use of smartwatch technology for people with dysarthria. In: Motor Speech

Conference at. Madonna Rehabilitation Hospital, Newport Beach, CA, USA. (2016)

39. Martínez-Sánchez, F., Meilán, J., Carro, J., Gómez, Í.C., Millian-Morell, L., Pujante, V.I., López-Alburquerque, T., López, D.: Speech rate in parkinson’s disease: A controlled study. Neurologia (Barcelona, Spain) (2015)

40. Monteiro, A., Dubey, H., Mahler, L., Yang, Q., Mankodiya, K.: FIT: A Fog Computing Device for Speech TeleTreatments. 2nd

IEEE International Conference on Smart Computing (SMARTCOMP), Missouri, USA (2016)

41. Myers, C., Rabiner, L., Rosenberg, A.: Performance tradeoffs in dynamic time warping algorithms for isolated word recognition.

IEEE Transactions on Acoustics, Speech, and Signal Processing 28(6), 623–635 (1980)

42. OpenSSL: https://www.openssl.org/ (2015)

43. Orfanidis, S.J.: Introduction to signal processing. Prentice-Hall, Inc. (1995)

44. Paliwal, K.K.: Spectral subband centroid features for speech recognition. In: Proceedings of IEEE International Conference on

Acoustics, Speech and Signal Processing (1998)

45. Pan, J., Tompkins, W.J.: A real-time QRS detection algorithm. IEEE transactions on biomedical engineering (3), 230–236 (1985)

46. Panahiazar, M., Taslimitehrani, V., Jadhav, A., Pathak, J.: Empowering personalized medicine with big data and semantic web

technology: Promises, challenges, and use cases. In: IEEE International Conference on Big Data. pp. 790–795 (2014)

47. Reed, T.R., Reed, N.E., Fritzson, P.: Heart sound analysis for symptom detection and computer-aided diagnosis. Simulation Mod-

elling Practice and Theory 12(2), 129–146 (2004)

48. Sapir, S., Ramig, L., Fox, C.: Speech and swallowing disorders in parkinson disease. Current opinion in otolaryngology & head

and neck surgery 16(3), 205–210 (2008)

49. Sobell, M.G.: A Practical Guide to Fedora and Red Hat Enterprise Linux. Pearson Education (2013)

50. Spielman, J., Mahler, L., Halpern, A., Gilley, P., Klepitskaya, O., Ramig, L.: Intensive voice treatment (LSVT® LOUD) for parkinson’s disease following deep brain stimulation of the subthalamic nucleus. Journal of communication disorders 44(6), 688–700 (2011)

51. Sun, X.: A pitch determination algorithm based on subharmonic-to-harmonic ratio (2000)

52. Tan, Z.H., Lindberg, B.: Low-complexity variable frame rate analysis for speech recognition and voice activity detection. IEEE

Journal of Selected Topics in Signal Processing 4(5), 798–807 (2010)

53. Tsanas, A.: Accurate telemonitoring of Parkinson’s disease symptom severity using nonlinear speech signal processing and statis-

tical machine learning. Ph.D. thesis, University of Oxford (2012)

54. Varghees, V.N., Ramachandran, K.: A novel heart sound activity detection framework for automated heart sound analysis. Biomed-

ical Signal Processing and Control 13, 174–188 (2014)

55. Yang, Y.H., Lin, Y.C., Su, Y.F., Chen, H.H.: A regression approach to music emotion recognition. IEEE Transactions on Audio,

Speech, and Language Processing 16(2), 448–457 (2008)

56. Zwicker, E., Fastl, H.: Psychoacoustics: Facts and models, vol. 22. Springer Science & Business Media (2013)

ECG permanence evaluation with QRS-complex correlation graphs

Thesis

Full-text available

Jun 2018

Audun Kjeldaas

We see that ECG is becoming more of a viable option for biometric authentication and in some cases biometric identification and there are only a few studies related to the permanence of the ECG-signature of a person. Most of the studies we found regarding change over time were done by the medical community and looked at changes during different types of heart conditions, to be able to predict illness. We saw the need for more work performed on younger, healthy subjects, to discover if there is a change of the ECG-signature over time, that would warrant frequent reenrolments. We looked for available, reasonably priced, non-medical equipment, that would enable us to collect the biometric samples we needed. We followed several different subjects over a period of two to three months, limited by the time constraints of a thesis like this. We extracted the QRS-complex from the biometric samples we collected in each session. Then we compared the different wavelets of a subject with the other from the same subject, to discover if there were any recognisable trend in the differences between them. To do this we calculated the correlation coefficient for each comparison set and looked for whether the coefficient stayed the same or slowly got smaller over the time of the study. Our measure for permanence is the degree with which the curve slopes downward. That is horizontal curve equals perfect permanence over the time interval measured and the faster the curve sinks, the lower the permanence over the time interval measured. This is a quick visual way of inspecting whether the permanence is good enough for a given time frame and con be utilized in a company's R&D as a quick preliminary investigation, before investing heavily in a project. We discovered no sinking trend, so we concluded that based on the limited timespan, the permanence of ECG as a biometric modality is good. We recommend a similar study over a much longer period of time, maybe several years. 
We also concluded that a system with more or less continuous enrolment during the use of the system, will eliminate the need for long time permanence, but that it would be interesting for systems where a person might not authenticate for a long period of time, like for bank deposit boxes or voting.

Fog computing security: A review

Article

Full-text available

Mar 2023

Fog computing, also known as edge computing, is a decentralized computing architecture that brings computing and data storage closer to the users and devices that need it. It offers several advantages over traditional cloud computing , such as lower latency, improved reliability, and enhanced security. As the Internet of Things continues to grow, the demand for fog computing is also increasing, making it an important topic for research and development. However , the deployment of fog computing also brings new technical challenges and security risks. For example, fog nodes are often deployed in resource-constrained environments and are exposed to potential security threats, such as malware and attacks on devices connected to the network. In addition, the decentralized nature of fog computing creates new challenges in terms of privacy, security, and data management. This survey aims to address these technical challenges and research gaps in the field of fog computing security. It provides an overview of the current state of fog computing and its security challenges, and identifies key areas for future research. The survey also highlights the importance of fog computing security and the need for continued investment in this area in order to fully realize the potential of this promising technology.

Sensor Data Fusion and Big Mobility Data Analytics for Activity Recognition

Conference Paper

Oct 2019

Evaluation and Analysis of Bio-Inspired Techniques for Resource Management and Load Balancing of Fog Computing

Thesis

Full-text available

Jul 2019

With the evolution of fog computing, processing takes place locally in a virtual platform rather than in a centralized cloud server. Fog computing combined with cloud computing is more efficient, as fog computing alone does not serve the purpose. Inefficient resource management and load balancing lead to degradation in quality of service as well as energy losses. Traffic overhead is increased because all the requests are sent to the main server, causing delays which cannot be tolerated in health-care scenarios. To overcome this problem, the authors are consolidating fog computing resources so that requests are handled by cloudlets and only critical requests are sent to the cloud for processing. Servers are placed locally in each city to handle the nearby requests in order to utilize the resources efficiently along with load balancing among all the servers, which leads to reduced latency and traffic overhead with improved quality of service. Due to the limited data storage capacity available to Internet service providers and large-scale enterprises, the concept of resource sharing arises. The services can be given on lease to enterprises through Service Level Agreements (SLAs). As an extension of cloud computing, the fog computing architecture brings the resources near end users. In order to get the services on lease, the enterprises are supposed to pay for the resources or services which they use. For this, four nature-inspired algorithms are analyzed in order to determine the efficient management of services or resources so that the cost of resources can be reduced and the billing can be attained through calculation of the utilized resources. Pigeon Inspired Optimization (PIO), Enhanced Differential Evolution (EDE), Binary Bat Algorithm (BBA) and Simple Human Learning Optimization (SHLO) are used to evaluate the energy consumed by the edge nodes or cloudlets, which in turn can be used for estimating the bill through the Time of Use pricing variable.
We evaluate the aforementioned techniques to analyze their performance regarding the bill calculation on the basis of fog server usage. Simulation results demonstrate that the BAT algorithm gives significantly better results than the other three algorithms in terms of resource utilization and bill reduction.

Analysis of multi-dimensional Industrial IoT (IIoT) data in Edge-Fog-Cloud based architectural frameworks : A survey on current state and research challenges

Article

Jul 2023

Internet of Things in the Field of Health Care: Trends and Challenges

Article

Full-text available

Jun 2022

The internet of things has maintained continuous growth in recent years. Its potential uses in different fields have been widely documented. Its effective use in the field of health can improve the efficiency of medical treatments, prevent risky situations, help raise the quality of service, and provide support for decision-making. The present review explores core aspects of its use in order to analyze trends, challenges and strengths. Document analysis was used to show the main characteristics of these systems, as well as their architecture, the tools used for the management of the captured data, and security mechanisms. The use of the internet of things in the health field has a great impact, improving the lives of millions of people around the world and providing great opportunities for the development of intelligent health systems.

An AI-based Prediction-as-a-Service Model for Estimating Machine Bearing Health Status in Industry 4.0 5G Applications

Conference Paper

Full-text available

Jul 2021

OPTIMIZING TELEMEDICINE FRAMEWORK USING FOG COMPUTING FOR SMART HEALTHCARE SYSTEMS

Preprint

Full-text available

Sep 2021

Michael Enbibel

This research optimizes a telemedicine framework by using fog computing (fogging) for smart healthcare systems. Fog computing is used to solve the issues that arise in the telemedicine framework of a smart healthcare system, such as infrastructure, implementation, acceptance, data management, security, bottlenecks in system organization, and network latency. We mainly used the Distributed Data Flow (DDF) method with fog computing in order to fully solve the listed issues.

Binarized Multi-Factor Cognitive Detection of Bio-Modality Spoofing in Fog Based Medical Cyber-Physical System

Conference Paper

Jan 2019

Utilization and load balancing in fog servers for health applications

Article

Full-text available

Apr 2019
EURASIP J WIREL COMM

With the evolution of fog computing, processing takes place locally in a virtual platform rather than in a centralized cloud server. Fog computing combined with cloud computing is more efficient, as fog computing alone does not serve the purpose. Inefficient resource management and load balancing lead to degradation in quality of service as well as energy losses. Traffic overhead is increased because all the requests are sent to the main server, causing delays which cannot be tolerated in healthcare scenarios. To overcome this problem, the authors are consolidating fog computing resources so that requests are handled by foglets and only critical requests are sent to the cloud for processing. Servers are placed locally in each city to handle the nearby requests in order to utilize the resources efficiently along with load balancing among all the servers, which leads to reduced latency and traffic overhead with improved quality of service.

Harmonic Sum-based Method for Heart Rate Estimation using PPG Signals Affected with Motion Artifacts

Article

Full-text available

Feb 2018

Wearable photoplethysmography (PPG) has recently become a common technology in heart rate (HR) monitoring. The general observation is that motion artifacts change the statistics of the acquired PPG signal. Consequently, estimation of HR from such a corrupted PPG signal is challenging. However, if an accelerometer is also used to acquire the acceleration signal simultaneously, it can provide helpful information that can be used to reduce the motion artifacts in the PPG signal. Owing to the repetitive movements of the subject's hands while running, the accelerometer signal is found to be quasi-periodic. Over short time intervals, it can be modeled by a finite harmonic sum (HSUM). Using the HSUM model, we obtain an estimate of the instantaneous fundamental frequency of the accelerometer signal. Since the PPG signal is a composite of the heart rate information (that is also quasi-periodic) and the motion artifact, we fit a joint HSUM model to the PPG signal. One of the harmonic sums corresponds to the heart-beat component in PPG and the other models the motion artifact. However, the fundamental frequency of the motion artifact has already been determined from the accelerometer signal. Subsequently, the HR is estimated from the joint HSUM model. The mean absolute error in HR estimates was 0.7359 beats per minute (BPM) with a standard deviation of 0.8328 BPM for the 2015 IEEE Signal Processing Cup data. The ground-truth HR was obtained from the simultaneously acquired ECG for validating the accuracy of the proposed method. The proposed method is compared with four methods that were recently developed and evaluated on the same dataset.
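
To illustrate the idea of estimating a fundamental frequency from a quasi-periodic signal modeled as a finite harmonic sum, the sketch below uses harmonic summation: for each candidate fundamental, it adds up the spectral power at that frequency and its harmonics and keeps the candidate with the largest total. This is a simplified stand-in and not the paper's joint HSUM fit; the function names, the candidate grid, and the default of two harmonics are assumptions made for illustration.

```python
import cmath
import math


def harmonic_power(x, fs, f0, n_harmonics):
    """Sum of spectral power of x at f0, 2*f0, ..., n_harmonics*f0 (in Hz)."""
    total = 0.0
    for h in range(1, n_harmonics + 1):
        w = 2 * math.pi * h * f0 / fs  # radians per sample for harmonic h
        acc = sum(v * cmath.exp(-1j * w * n) for n, v in enumerate(x))
        total += abs(acc) ** 2
    return total


def estimate_f0(x, fs, candidates, n_harmonics=2):
    """Pick the candidate fundamental (Hz) with the largest harmonic power.

    The candidate range should be restricted to plausible values (for heart
    rate, roughly 0.8 to 3 Hz) to limit sub-harmonic ambiguity.
    """
    return max(candidates, key=lambda f: harmonic_power(x, fs, f, n_harmonics))
```

Multiplying the returned frequency by 60 gives beats per minute. In the setting of the abstract, the same estimator could first be applied to the accelerometer signal to obtain the motion fundamental; separating the heartbeat component from the motion artifact in the PPG signal requires the joint model, which this sketch does not implement.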

BigEAR: Inferring the Ambient and Emotional Correlates from Smartphone-based Acoustic Big Data

Conference Paper

Full-text available

Jun 2016

This paper presents a novel BigEAR big data framework that employs psychological audio processing chain (PAPC) to process smartphone-based acoustic big data collected when the user performs social conversations in naturalistic scenarios. The overarching goal of BigEAR is to identify moods of the wearer from various activities such as laughing, singing, crying, arguing, and sighing. These annotations are based on ground truth relevant for psychologists who intend to monitor/infer the social context of individuals coping with breast cancer. We pursued a case study on couples coping with breast cancer to know how the conversations affect emotional and social well being. In the state-of-the-art methods, psychologists and their team have to hear the audio recordings for making these inferences by subjective evaluations that not only are time-consuming and costly, but also demand manual data coding for thousands of audio files. The BigEAR framework automates the audio analysis. We computed the accuracy of BigEAR with respect to the ground-truth obtained from a human rater. Our approach yielded overall average accuracy of 88.76% on real-world data from couples coping with breast cancer.

FIT: A Fog Computing Device for Speech TeleTreatments

Conference Paper

Full-text available

May 2016

Harishchandra Dubey

There is an increasing demand for smart fog computing gateways as the size of cloud data is growing. This paper presents a Fog computing interface (FIT) for processing clinical speech data. FIT builds upon our previous work on EchoWear, a wearable technology that validated the use of smartwatches for collecting clinical speech data from patients with Parkinson’s disease (PD). The fog interface is a low-power embedded system that acts as a smart interface between the smartwatch and the cloud. It collects, stores, and processes the speech data before sending speech features to secure cloud storage. We developed and validated a working prototype of FIT that enabled remote processing of clinical speech data to obtain clinical speech features such as loudness, short-time energy, zero-crossing rate, and spectral centroid. We used speech data from six patients with PD in their homes for validating FIT. Our results showed the efficacy of FIT as a Fog interface to translate the clinical speech processing chain (CLIP) from a cloud-based backend to a fog-based smart gateway.
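
Three of the speech features named in this abstract can be computed per frame with a few lines of code. The sketch below uses common textbook definitions, which are assumed here because the abstract does not state the exact formulas used in FIT; the function names and the naive DFT are illustrative choices, and loudness is omitted.

```python
import cmath
import math


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(v * v for v in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def spectral_centroid(frame, fs):
    """Magnitude-weighted mean frequency (Hz) over bins 0..N/2.

    Uses a naive O(N^2) DFT to stay dependency-free; an FFT would be the
    practical choice on an embedded gateway.
    """
    n = len(frame)
    num = 0.0
    den = 0.0
    for k in range(n // 2 + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * i / n)
            for i, v in enumerate(frame)
        )
        mag = abs(coeff)
        num += (k * fs / n) * mag
        den += mag
    return num / den if den else 0.0
```

Sending only these few numbers per frame, rather than the raw audio, is the kind of data reduction a fog gateway provides before anything reaches the cloud.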

Use of Smartwatch Technology for People with Dysarthria

Conference Paper

Full-text available

Mar 2016

Purpose: Dysarthria is caused by a variety of neurological diagnoses resulting in decreased communicative effectiveness. Treatment of dysarthria could be improved if speech-language pathologists (SLPs) had the ability to obtain speech data during exercises and typical daily activities outside the clinic. This study evaluated the feasibility of smartwatch technology to collect reliable speech data in ecologically valid environments outside of the clinical environment. Methods: Six people with hypokinetic dysarthria secondary to PD were recruited for this study, three men and three women. The length of the study was four weeks. Participants were randomized to use the smartwatch in weeks 1 and 3 or weeks 2 and 4 to allow comparison of exercises with and without the smartwatch technology. Participants completed voice and speech exercises twice each day consisting of sustained “ah”, high and low pitch exercises, reading sentences aloud, and one functional speech task. Participants also completed questionnaires to assess their experience using the smartwatch. Results: Vocal intensity and pitch data were successfully obtained from the smartwatch during the field trial. Five of the six participants reported they completed their exercises more frequently during the trial with the smartwatch. Three participants indicated they would use the system regularly if it was available and three reported they would use it periodically. Five of the six participants found the smartwatch technology very easy or easy to use.

Fog Data: Enhancing Telehealth Big Data Through Fog Computing

Conference Paper

Full-text available

Nov 2015

The size of multi-modal, heterogeneous data collected through various sensors is growing exponentially. It demands intelligent data reduction, data mining and analytics at edge devices. Data compression can reduce the network bandwidth and transmission power consumed by edge devices. This paper proposes, validates and evaluates Fog Data, a service-oriented architecture for Fog computing. The centerpiece of the proposed architecture is a low-power embedded computer that carries out data mining and data analytics on raw data collected from various wearable sensors used for telehealth applications. The embedded computer collects the sensed data as time series, analyzes it, and finds similar patterns present. Patterns are stored, and unique patterns are transmitted. Also, the embedded computer extracts clinically relevant information that is sent to the cloud. A working prototype of the proposed architecture was built and used to carry out case studies on telehealth big data applications. Specifically, our case studies used the data from the sensors worn by patients with either speech motor disorders or cardiovascular problems. We implemented and evaluated both generic and application-specific data mining techniques to show orders-of-magnitude data reduction and hence transmission power savings. Quantitative evaluations were conducted for comparing various data mining techniques and standard data compression techniques. The obtained results showed substantial improvement in system efficiency using the Fog Data architecture.

EchoWear: Smartwatch Technology for Voice and Speech Treatments of Patients with Parkinson’s Disease

Conference Paper

Full-text available

Oct 2015

About 90 percent of people with Parkinson’s disease (PD) experience decreased functional communication due to the presence of voice and speech disorders associated with dysarthria that can be characterized by monotony of pitch (or fundamental frequency), reduced loudness, irregular rate of speech, imprecise consonants, and changes in voice quality. Speech-language pathologists (SLPs) work with patients with PD to improve speech intelligibility using various intensive in-clinic speech treatments. SLPs also prescribe home exercises to enhance generalization of speech strategies outside of the treatment room. Even though speech therapies are found to be highly effective in improving vocal loudness and speech quality, patients with PD find it difficult to follow the prescribed exercise regimes outside the clinic and to continue exercises once the treatment is completed. SLPs need techniques to monitor compliance and accuracy of their patients’ exercises at home and in ecologically valid communication situations. We have designed EchoWear, a smartwatch-based system, to remotely monitor speech and voice exercises as prescribed by SLPs. We conducted a study of six individuals, three with PD and three healthy controls, to assess the performance of EchoWear technology compared with that of high-quality audio equipment in a speech laboratory. Our preliminary analysis shows promising outcomes for using EchoWear in speech therapies for people with PD.

Speech rate in Parkinson's disease: A controlled study

Article

Full-text available

Feb 2015
NEUROLOGIA

Speech disturbances will affect most patients with Parkinson's disease (PD) over the course of the disease. The origin and severity of these symptoms are of clinical and diagnostic interest. This study aimed to evaluate the clinical pattern of speech impairment in PD patients and to identify significant differences in speech rate and articulation compared to control subjects. Speech rate and articulation in a reading task were measured using an automatic analytical method. A total of 39 PD patients in the 'on' state and 45 age- and sex-matched asymptomatic controls participated in the study. None of the patients experienced dyskinesias or motor fluctuations during the test. The patients with PD displayed a significant reduction in speech and articulation rates; there were no significant correlations between the studied speech parameters and patient characteristics such as L-dopa dose, duration of the disorder, age, UPDRS III scores, and Hoehn & Yahr scale stage. Patients with PD show a characteristic pattern of declining speech rate. These results suggest that in PD, disfluencies are the result of the movement disorder affecting the physiology of speech production systems.

A novel heart sound activity detection framework for automated heart sound analysis

Article

Full-text available

Sep 2014
BIOMED SIGNAL PROCES

In automated heart sound analysis and diagnosis, a set of clinically valued parameters including sound intensity, frequency content, timing, duration, shape, systolic and diastolic intervals, the ratio of the first heart sound amplitude to the second heart sound amplitude (S1/S2), and the ratio of diastolic to systolic duration (D/S) is measured from the PCG signal. The quality of these clinical feature parameters relies heavily on accurate determination of the boundaries of the acoustic events (heart sounds S1, S2, S3, S4 and murmurs) and the systolic/diastolic pause period in the PCG signal. Therefore, in this paper, we propose a new automated robust heart sound activity detection (HSAD) method based on total variation filtering, Shannon entropy envelope computation, instantaneous phase based boundary determination, and boundary location adjustment. The proposed HSAD method is validated using different clean and noisy pathological and non-pathological PCG signals. Experiments on a large PCG database show that the HSAD method achieves an average sensitivity (Se) of 99.43% and positive predictivity (+P) of 93.56%. The HSAD method accurately determines boundaries of major acoustic events of the PCG signal with a signal-to-noise ratio of 5 dB. Unlike other existing methods, the proposed HSAD method does not use any search-back algorithms. The proposed HSAD method is quite straightforward and thus suitable for real-time wireless cardiac health monitoring and electronic stethoscope devices.

Improved Teager Energy Operator and Improved Chirp-Z Transform for Parameter Estimation of Voltage Flicker

Article

Jan 2015

Effective estimation of voltage flicker components plays an important role in distribution systems for either flicker meters or flicker compensators. A novel approach has been presented in this paper to accurately estimate voltage flicker components by using the improved Teager energy operator (ITEO) and the improved chirp-Z transform (ICZT). The error correction factor K of the Teager energy operator is presented and ITEO is established to reduce the extraction errors of voltage flicker waveform. ICZT is used to extract the frequency and magnitude of the voltage envelope which is corrected by the K factor of ITEO. The effects of signal sampling rate, sampling number, spectrum subdivision points of ICZT, voltage harmonics and interharmonics, frequency fluctuation, and white noise are investigated. The implementation of the proposed approach in the digital-signal-processor platform is also introduced. Multiple simulation and experimental test application results validate the accuracy and efficiency of the proposed approach.
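
For orientation, the classical discrete Teager energy operator on which the abstract's improved variant builds can be written in a few lines. The sketch below shows only the standard operator, psi[n] = x[n]^2 - x[n-1] * x[n+1]; the improved operator (ITEO) and its error correction factor K are not reproduced, since the abstract does not give their definitions. The function name is an illustrative choice.

```python
import math


def teager_energy(x):
    """Classical discrete Teager energy operator.

    Returns psi[n] = x[n]**2 - x[n-1] * x[n+1] for the interior samples,
    so the output is two samples shorter than the input.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]
```

For a pure sinusoid x[n] = A cos(Ωn + φ), the operator returns the constant A² sin²(Ω) regardless of phase, which is why it is useful for tracking the envelope of an amplitude-modulated signal.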

The "big data" revolution in healthcare. Accelerating value and innovation

Book

Jan 2013
