Design and Development of Interactive Intelligent
Medical Agent
Shu-Chiao Tasi1, Hooman Samani1, Yu-Wei Kao1, Kening Zhu2, Brian Jalaian3
National Taipei University, Taiwan1
School of Creative Media, City University of Hong Kong2
U.S. Army Research Laboratory, Adelphi, Maryland, USA3
Abstract—In this age of high technology development, a
greater number of people are facing modern civilization diseases
due to lifestyle changes.
Fast-paced and busy lifestyles, noisy
nightlife, and other various social parameters have resulted in
increased disturbance during sleeping. In addition to external
causes, a stressed-out mind is a source of anxiety or excitement
that can lead to sleep disturbances. The aim of this research is to
propose an interactive robotic system companion which could be
used in the treatment of insomnia by providing various interactive
services to the user. In addition to direct communication, the robot
employs a variety of environmental and physiological sensors to
receive feedback from a user and observe performance. One of the
key modules of this system is the use of electroencephalography for
monitoring sleep quality. We use an interaction design approach
based on simplicity and approachability. Speech Recognition and
Human-Computer Interaction are the two major parts of this
research. The robot’s audio channel uses Speech Recognition to
communicate with the user, mostly via conversation because other
forms of direct communication are not convenient for this
scenario. The Human-Computer Interaction aspect includes
playing suitable music (including white noise) and spraying
essential oils according to different instructions. In addition to
establishing multimodal interactive connection with the user, the
system also provides entertainment services. The aim is for this
proposed robot to act as a personal companion for people with
insomnia and improve their quality of sleeping.
Keywords— Social Robot, insomnia treatment, human-computer
interaction, EEG
I. INTRODUCTION
Sleep is the most important part of human rest. Its main
function is to restore and supplement the physical energy lost
while performing daytime activities. Sleep also regulates and
reconstructs personal emotional behavior and learning cognitive
memory. People are sleeping less nowadays, probably because
of their busy, high-pressure lifestyles, which include increased
time spent at work, engaging in social networking, and
interacting in virtual Internet worlds. Serious complications
from long-term insufficient sleep include health problems and
increased risk of disease; mental, autonomic nervous, and
behavior disorders; memory decline; and negative emotions.
The most serious consequence of sleep deprivation is death
from overwork. Sleep quality is more important than sleep
quantity, and quality does not depend on the length of sleep.
This research explores sleep disorders to help solve the
insomnia problem. A wearable wireless physiological monitor,
an electroencephalogram (EEG) sensor, is used to help the user
get to sleep and to monitor their sleep quality. Although many
people are concerned about their sleep quality and search for
ways to improve it, they ignore the effects that their mobile
phones and computers are having. Artificial light, especially
blue light, affects the circadian rhythm of the human body. It
suppresses the pineal gland's secretion of melatonin, which
interferes with sleep. An interactive robot is a good sleep partner
because through a simple dialogue, it can provide advice and
services for relaxation and sleep, and can use the EEG to monitor
a user’s sleep quality.
Teams at home and abroad have studied the use of artificial
intelligence to help sleep. Students at Technische Universiteit
Delft in the Netherlands have developed a soft, peanut-shaped
pillow called the Somnox robot. When held, this device
measures the user’s respiration rate and creates a simulated
breathing rhythm. The user subconsciously adjusts their
breathing to the Somnox’s consistent rhythm, which helps them
to relax and fall asleep, relieving insomnia. At
present, the Somnox robot is in the prototype stage [1].
II. METHOD
A. System Design
This research is divided into two major parts: Speech
Recognition and Human-Computer Interaction. The robot
communicates with the user via speech recognition.
Human-computer interaction includes input and output
components, with output based on the input conditions.
The body sensor module of the input component is divided
into wearable and non-wearable types. The wearable EEG device
takes the form of a headset. The output component includes the
digital fragrance module, which applies a scent according to
input settings. The design component includes the external
appearance, internal structure, and LED modules. The
architecture of the robot is shown in Figure 1.
2018 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)
978-1-5386-9269-1/18/$31.00 ©2018 IEEE
DOI 10.1109/AIVR.2018.00049
Fig 1. System architecture
B. Mechanism design
Because the robot performs many functions, it is necessary
to position the environmental and body surface sensors in the
corresponding part of the robot. For example, the microphone,
speaker, and photoreceptor will be placed in the robot's ear,
mouth, and eye, respectively. The digital fragrance module
should include an override function to prevent over-dispensing
and an internal fan to control volatilization of the essential oil.
The clock component performs two functions. The first is to
set a wake-up time to prevent over-sleeping; the second is to
remind the user to prepare for bedtime. This reminder can be
set to a fixed time every day, to help the user develop good sleep
habits.
C. Input
Individual sleep requirements and physical fitness levels
differ, but in general people sleep best at a room temperature
of 20−23 °C. Lowering the room temperature or humidity is
counterproductive because when the body is too cold, the blood
vessels narrow, which increases the burden on the body [2].
The temperature of the bed cover, though often ignored, also
affects the quality of sleep. According to studies, people fall
asleep faster when the bedding temperature is 32−34 °C and the
temperature of the bed cover is low. Subjecting the human body
surface to a cold stimulus for a period of time will stimulate the
cerebral cortex, thus postponing the sleep time or negatively
affecting deep sleep [3].
Another factor that may affect sleep comfort is humidity.
Humidity does not directly affect sleep, but it can create an
environment that interferes with overall sleep quality.
Individual preferences vary, but 50%−65% humidity is generally
recommended as ideal. Modern conveniences such as air
conditioners, automatic dehumidifiers, heaters, and coolers can
be used to adjust the humidity in the room to avoid conditions
that are too dry or too wet [4].
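The temperature and humidity ranges quoted above can be turned into a simple comfort check. The following is a minimal sketch, not the authors' implementation; the function names and the advice strings are illustrative assumptions, and only the numeric ranges come from the text.

```python
# Comfort ranges quoted in the text (20-23 degrees C, 50%-65% relative humidity).
TEMP_RANGE_C = (20.0, 23.0)
HUMIDITY_RANGE_PCT = (50.0, 65.0)


def check_range(name, value, low, high):
    """Return a short warning if value lies outside [low, high], else None."""
    if value < low:
        return f"{name} too low"
    if value > high:
        return f"{name} too high"
    return None


def environment_advice(temp_c, humidity_pct):
    """Collect warnings for the room temperature and humidity readings."""
    checks = [
        check_range("temperature", temp_c, *TEMP_RANGE_C),
        check_range("humidity", humidity_pct, *HUMIDITY_RANGE_PCT),
    ]
    return [warning for warning in checks if warning is not None]
```

An empty list means both readings fall inside the recommended ranges.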
A sound louder than 60 decibels will wake a
person, and even a level of about 32 decibels (which is not
considered high) may interrupt their sleep. A noise sensor is
used to obtain the maximum sound intensity over a given
period of time, and the audio signal is simplified and reduced.
Meanwhile, a potentiometer adjusts the output signal gain.
The sound level measured during sleep is then compared
with the established database and classified, and the result is
communicated through the
user interface [5].
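The noise check described above (take the maximum level over a window, then classify it) can be sketched as follows. This is an illustration under stated assumptions: the calibration `reference` that maps raw samples to decibels and the category labels are hypothetical, while the 32 dB and 60 dB thresholds are the ones given in the text.

```python
import math

WAKE_DB = 60.0     # text: louder sounds wake a sleeper
DISTURB_DB = 32.0  # text: sleep may already be interrupted near this level


def peak_level_db(samples, reference=1.0):
    """Peak level of a window in dB relative to an assumed calibration reference."""
    peak = max(abs(s) for s in samples)
    if peak <= 0:
        return float("-inf")
    return 20.0 * math.log10(peak / reference)


def classify_noise(level_db):
    """Map a sound level to one of three illustrative categories."""
    if level_db > WAKE_DB:
        return "wakes sleeper"
    if level_db >= DISTURB_DB:
        return "may interrupt sleep"
    return "quiet"
```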
Environmental photoreceptor measurements are taken in
and around the sleep environment, taking into account the
sensitivity of the human eye. The sensor measures continuously
and provides a consistent output regardless of
the lighting conditions. The spectral
characteristics of a light source influence light sensors much
more than is commonly assumed.
Fig 2. Electromagnetic spectrum [6]
D. Wearable sensor (EEG sensor)
Brainwaves can be divided into five main categories that
vary in frequency according to a person’s activities and
emotions: Delta Wave, Theta Wave, Alpha Wave, Beta Wave,
and Gamma Wave. Combinations of these waves reflect the state
of consciousness and shape a person's behavior, emotions, and
learning performance [7].
The delta wave is a slow, high-amplitude brainwave with a
frequency of 0.5−4 Hz that is related to the third and fourth
phases of slow wave sleep. It is generally regarded as the
brainwave that occurs in the deep sleep state. When the delta
wave is the dominant brainwave, the human body is in the
dreamless, unconscious state of deep sleep, which has a direct
effect on the quality of sleep. Its wave pattern is shown in
Figure 3 [8].
Fig 3. Delta wave
The theta wave is a brainwave with a frequency of 4−8 Hz
that is related to the second phase of slow wave sleep. This
brainwave affects a person's involuntary attitudes, expectations,
beliefs, and behaviors and occurs under hypnosis and in deep
meditation. When the theta wave is dominant, consciousness is
suspended and highly receptive to information
stimuli from the outside world. The theta wave is quite helpful
for long-term memory, so it is also called “The Gateway to
Learning and Memory.” Its wave pattern is shown in Figure 4
[9].
Fig 4. Theta wave
The alpha wave is a slow wave with a frequency of 8−12 Hz.
When a person is conscious and their body is relaxed,
the alpha wave is dominant, but it decreases when they are
alert, anxious, or exercising. When the alpha wave is
dominant, the energy consumption of the body is low and the
energy consumption of the brain is relatively high, so data
processing and creativity are more fluent. The alpha state is
therefore widely advocated as the best state for learning. Its wave
pattern is shown in Figure 5 [10].
Fig 5. Alpha wave
The beta wave is a fast, low-amplitude wave with a
frequency of 12−35 Hz. This brainwave occurs in wakefulness
and is related to thinking, anxiety, operation, and attention.
High-frequency beta waves cause emotional excitement
and anxiety. Many tranquilizers (such as barbiturates) achieve
results by reducing the beta wave stabilization effect. When the
beta wave is too strong, the body tires easily. Without sufficient
rest, pressure accumulates. The beta wave pattern is shown in
Figure 6 [11].
Fig 6. Beta wave
The gamma wave is the fastest brainwave, with a frequency
above 40 Hz, and is related to Rapid Eye Movement (REM)
sleep. The high frequency of the gamma state makes people feel
profoundly insightful. The wave pattern is shown in Figure 7
[12].
Fig 7. Gamma wave
The EEG sensor is used to collect brainwave information,
which is arranged in ascending frequency (i.e., delta, theta,
alpha, beta, and gamma) so that the corresponding state of
consciousness is clear (i.e., unconscious, subconscious, sense,
clear, and warning). In addition, the best learning should occur
between alpha and beta waves, and the best sleep should occur
between theta and alpha waves [13].
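The band boundaries listed above suggest a simple way to decide which brainwave dominates a short EEG window: sum the spectral power inside each band and pick the largest. The sketch below uses a plain discrete Fourier transform so that it needs no external library. The 100 Hz upper limit for gamma, the sampling rate, and the synthetic test signal are assumptions made for illustration; the paper does not describe how band power is computed.

```python
import math

# Band edges in Hz, as given in the text. The gamma upper bound is an assumption.
BANDS_HZ = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 12.0),
    "beta": (12.0, 35.0),
    "gamma": (40.0, 100.0),
}


def band_powers(samples, sample_rate_hz):
    """Sum DFT power per band for one window of EEG samples."""
    n = len(samples)
    powers = {name: 0.0 for name in BANDS_HZ}
    for k in range(1, n // 2):
        freq = k * sample_rate_hz / n
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        for name, (low, high) in BANDS_HZ.items():
            if low <= freq < high:
                powers[name] += re * re + im * im
    return powers


def dominant_band(samples, sample_rate_hz):
    """Name of the band with the largest summed power."""
    powers = band_powers(samples, sample_rate_hz)
    return max(powers, key=powers.get)


# Usage: a synthetic 10 Hz sine sampled at an assumed 128 Hz for two seconds.
alpha_like = [math.sin(2 * math.pi * 10 * i / 128) for i in range(256)]
```

Frequencies between 35 Hz and 40 Hz fall into no band here, because the text leaves that range unassigned.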
This study used Sheng Wang Technology’s BrainLink 60Hz
brainwave sensor, which can accurately detect the brain’s state.
Its main features include medical grade materials, Bluetooth
connectivity, LED indicators, high-integration PCT patented
interface, precision testing, a low battery reminder,
UL-certified battery safety, and 8 hours of use after 1 hour of
charging. The BrainLink is a portable EEG device, suitable for
experimental activities such as sleep detection, that includes a
TGAM EEG module, a Bluetooth module, and a USB rechargeable
lithium battery. It communicates over Bluetooth at a baud rate of
about 57600 bps and fits adults and children with a head
circumference of 47–71 cm [14].
Fig 8. BrainLink 60Hz brainwave sensor
E. Non-wearable sensors
In general, body temperature begins to rise in the morning
and continues to rise until the evening, when it reaches its
highest point and the body is most active. After that, body
temperature begins to drop, reaching its lowest point at 4:00
a.m. Abnormalities in the body temperature rhythm can make
sleeping well difficult [15].
This study used the SparkFun MAX30105 Particle Sensor
Breakout, a flexible and powerful
sensor that enables sensing of distance, particle detection, heart
rate, and even the blink of an eye. It is equipped with three
LEDs and a very sensitive photon detector, and can detect
different types of particles or material (such as oxygenated
blood or smoke from a fire). The MAX30105 uses red, green, and
infrared LEDs for presence sensing, heart beat plotting, heart
rate monitoring, and more [16].
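Heart rate monitoring with an optical sensor of this kind amounts to finding the pulse peaks in the reflected-light signal and timing the gaps between them. The following sketch works on a list of samples and does not talk to the MAX30105 itself; the peak rule (a local maximum above the window mean), the sampling rate, and the synthetic pulse helper are all illustrative assumptions.

```python
import math


def detect_peaks(signal, threshold):
    """Indices of local maxima that rise above the threshold."""
    return [
        i
        for i in range(1, len(signal) - 1)
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]
    ]


def heart_rate_bpm(signal, sample_rate_hz):
    """Estimate beats per minute from the mean interval between pulse peaks."""
    mean = sum(signal) / len(signal)
    peaks = detect_peaks(signal, mean)
    if len(peaks) < 2:
        return None  # not enough beats in the window to estimate a rate
    intervals = [(b - a) / sample_rate_hz for a, b in zip(peaks, peaks[1:])]
    return 60.0 / (sum(intervals) / len(intervals))


def synthetic_pulse(bpm, sample_rate_hz, seconds):
    """Hypothetical test signal: a clean sine at the given heart rate."""
    count = int(sample_rate_hz * seconds)
    return [math.sin(2 * math.pi * (bpm / 60.0) * i / sample_rate_hz) for i in range(count)]
```

A real photoplethysmography signal is noisier than this sine and would normally be band-pass filtered before peak detection.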
F. Output
Music is one method that can be used to help with sleep
difficulties. Teams of doctors and music therapists have done
many experiments to determine if subjects with sleep issues are
helped by listening to soothing music. Data for these
experiments were collected using multiple physiological tests,
which showed that listening to relaxing music can reduce stress
responses, lower heart rate, and reduce subjective anxiety. The
music features that show positive sleep effects are a slow tempo
(about 60 bpm) and a smooth, continuous melody.
Though music may not shorten the time it takes to fall asleep, it
can increase sleep quality [17].
According to studies, white noise can regulate dopamine in
the brain. White noise has been proven to help control attention
deficit hyperactivity disorder (ADHD), effectively enhance
connectivity in different brain regions, and enhance memory.
This study uses white noise to help sleep, generating it through
internal fan operation and stomatal diversion. A
single-frequency sound can cover noise in the environment and
can be raised to different volumes according to the loudness
of the environmental noise. Increasing the volume of white noise
allows users to get used to it. The white noise trains
the user to ignore other sounds, which helps them
sleep better [18].
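The idea of raising the masking volume with the ambient noise level can be sketched in software, although the robot described here produces its white noise mechanically with a fan. In this minimal example the linear mapping, the amplitude limits, and the function names are assumptions; the 32 dB and 60 dB anchor points reuse the levels quoted in the Input section.

```python
import random


def white_noise(num_samples, amplitude, seed=None):
    """Uniformly distributed, uncorrelated samples in [-amplitude, amplitude]."""
    rng = random.Random(seed)
    return [rng.uniform(-amplitude, amplitude) for _ in range(num_samples)]


def masking_amplitude(ambient_db, quiet_db=32.0, loud_db=60.0, min_amp=0.1, max_amp=1.0):
    """Scale playback amplitude linearly with ambient level, clamped at both ends."""
    fraction = (ambient_db - quiet_db) / (loud_db - quiet_db)
    fraction = min(1.0, max(0.0, fraction))
    return min_amp + fraction * (max_amp - min_amp)
```

For example, `white_noise(1000, masking_amplitude(46.0))` produces samples at roughly half of the available amplitude range.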
Some cultures use counting sheep as a mentally
suggestive activity that helps people to relax and fall asleep. It
is speculated that this method induces sleep through simple,
boring repetitions. According to a report in Hong Kong’s “Apple”,
Jia-hui Yu, a hypnosis training instructor with many years of
experience, said: “The most accurate way to count sheep is to think
of a fence first. There are about twenty sheep inside. Then
imagine an empty fence next to it. The flock must jump from
the old fence one by one to the new fence.” This process
requires a high degree of concentration, and most people fall
asleep before they have finished counting [19].
G. Digital scent
The key to olfactory technology is to receive and digitize
actual particle data and use it to reproduce scents. The
information technology industry has produced some results,
including iSmell, which was exhibited at COMDEX, the largest
American computer expo trade show. The concept behind
iSmell was to recreate smells and perfumes through a device
connected to a PC [20].
At the end of the 1950s, Hans Laube invented the
Smell-O-Vision system, which used the air conditioning system to
release odors over the course of a film, so that the audience
could smell what was happening in the movie.
In our study, we placed a number of bottles into the robot,
each containing a different essential oil. The robot released the
essential oil according to the user’s emotional state [21].
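Releasing an oil according to the user's emotional state, combined with the over-dispensing guard mentioned in the mechanism design, could be organized as in the sketch below. The emotion labels, the oil names, and the limit of a fixed number of sprays per time window are hypothetical choices made for illustration; the paper does not specify them.

```python
class ScentDispenser:
    """Chooses an oil for an emotion and refuses to spray too often."""

    def __init__(self, oils_by_emotion, max_sprays, window_s):
        self.oils_by_emotion = dict(oils_by_emotion)  # hypothetical emotion-to-oil table
        self.max_sprays = max_sprays                  # assumed limit per time window
        self.window_s = window_s
        self._history = []                            # timestamps of recent sprays

    def request(self, emotion, now_s):
        """Return the oil to spray, or None if the emotion is unknown or the limit is hit."""
        # Forget sprays that have left the sliding time window.
        self._history = [t for t in self._history if now_s - t < self.window_s]
        oil = self.oils_by_emotion.get(emotion)
        if oil is None or len(self._history) >= self.max_sprays:
            return None
        self._history.append(now_s)
        return oil
```

Passing the current time in as `now_s` keeps the class easy to test without waiting for real time to elapse.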
III. USER INTERFACE
A. Speech recognition
Speech recognition systems, which are primarily based on
statistical pattern recognition technology, consist of the
following five modules:
1. Signal processing and feature extraction module
2. Acoustic model (typical systems are modeled by
Hidden Markov Model [HMM])
3. Pronunciation dictionary
4. Language model
5. Decoder
Fig 9. Hidden Markov Model
A speech recognition flow chart is shown in Figure 10.
Fig 10. Speech recognition flow chart
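In an HMM-based recognizer the decoder searches for the most likely hidden state sequence given the observed features, which is commonly done with the Viterbi algorithm. The sketch below shows that search on a toy two-state model; the state names, observation symbols, and probabilities are invented for illustration, and the paper does not state which decoding algorithm its system uses.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most likely state path for the observations, computed in the log domain."""
    # Each column maps a state to (best log probability, previous state on that path).
    best = [
        {s: (math.log(start_p[s]) + math.log(emit_p[s][observations[0]]), None) for s in states}
    ]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p][0] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            column[s] = (score + math.log(emit_p[s][obs]), prev)
        best.append(column)
    # Backtrack from the best final state to the first observation.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


# Toy model with hypothetical states and symbols (all probabilities are non-zero).
STATES = ["s1", "s2"]
START = {"s1": 0.6, "s2": 0.4}
TRANS = {"s1": {"s1": 0.7, "s2": 0.3}, "s2": {"s1": 0.4, "s2": 0.6}}
EMIT = {"s1": {"a": 0.5, "b": 0.4, "c": 0.1}, "s2": {"a": 0.1, "b": 0.3, "c": 0.6}}
```

Working with log probabilities avoids numerical underflow on long observation sequences.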
B. Speech synthesis
The text-to-speech (TTS) system converts text into speech.
To synthesize a more natural voice and to simulate the
emotional state of the person speaking, the scheme uses
prosodic feature parameters for emotional speech
synthesis. This technique first analyzes the prosodic
parameters of four emotions (happiness, anger, sadness,
and boredom). It then establishes an emotional template and
tone model, using waveform-splicing speech synthesis
and the time domain pitch synchronous
overlap add (PSOLA) algorithm to synthesize the emotional
voice.
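The research will use the open Berlin
Emotion Speech database (BES). Using the autocorrelation
function (ACF) and the average magnitude difference function
(AMDF), the pitch period of the speech signal is calculated and
the average value is obtained. The calculation formula of the
autocorrelation method is as follows:
\[ R(k) = \sum_{n=0}^{N-1-k} x(n)\, x(n+k) \]
where x(n) is the speech signal, N is the frame length, and k is the lag.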
The formula for calculating the mean amplitude difference
method is as follows:
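\[ D(k) = \sum_{n=0}^{N-1-k} \left| x(n+k) - x(n) \right| \]
Here x(n) is the speech signal, N the frame length, and k the lag.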
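Both pitch estimators can be illustrated on a synthetic signal: the autocorrelation method looks for the lag with the largest sum of products, and the amplitude difference method looks for the lag with the smallest sum of absolute differences. The sketch below is a minimal illustration; the search range of lags and the test signal are assumptions, and averaging the two estimates follows the text's remark that an average value is obtained.

```python
import math


def acf(x, k):
    """Autocorrelation of x at lag k (unnormalized)."""
    return sum(x[n] * x[n + k] for n in range(len(x) - k))


def amdf(x, k):
    """Amplitude difference function of x at lag k (unnormalized)."""
    return sum(abs(x[n + k] - x[n]) for n in range(len(x) - k))


def pitch_period(x, min_lag, max_lag):
    """Return the pitch period in samples from (ACF maximum, AMDF minimum)."""
    lags = range(min_lag, max_lag + 1)
    acf_lag = max(lags, key=lambda k: acf(x, k))
    amdf_lag = min(lags, key=lambda k: amdf(x, k))
    return acf_lag, amdf_lag


# Usage: a synthetic sine with a period of 40 samples.
signal = [math.sin(2 * math.pi * n / 40) for n in range(400)]
acf_lag, amdf_lag = pitch_period(signal, 20, 60)
average_period = (acf_lag + amdf_lag) / 2
```

The lag range should bracket the expected pitch period, since both functions also respond at multiples of it.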
In this study, speech signal endpoint detection uses
the short-time energy to identify voiced segments and
the short-time zero-crossing rate to identify
unvoiced segments, so that the effective speech segment,
including its starting and termination points, is detected
from the continuous speech. The following formula is then used to
calculate the amplitude energy of each frame and obtain
the energy spectrum:
\[ E(i) = \log\left( \sum_{n=0}^{N-1} x_i^2(n) \right) \]
where x_i(n) is the n-th sample of frame i and N is the frame length.
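The endpoint detection step can be sketched as a frame-by-frame labeling: frames with high short-time energy are treated as voiced, low-energy frames with a high zero-crossing rate as unvoiced, and the rest as silence. The thresholds, frame sizes, and label names below are illustrative assumptions and would need tuning on real speech.

```python
def frames(x, size, hop):
    """Split x into frames of `size` samples taken every `hop` samples."""
    return [x[i:i + size] for i in range(0, len(x) - size + 1, hop)]


def short_time_energy(frame):
    """Sum of squared samples in the frame."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def label_frames(x, size, hop, energy_thresh, zcr_thresh):
    """Label each frame as voiced, unvoiced, or silence."""
    labels = []
    for frame in frames(x, size, hop):
        if short_time_energy(frame) > energy_thresh:
            labels.append("voiced")
        elif zero_crossing_rate(frame) > zcr_thresh:
            labels.append("unvoiced")
        else:
            labels.append("silence")
    return labels


def endpoints(labels):
    """Indices of the first and last non-silent frames, or None if all are silent."""
    active = [i for i, label in enumerate(labels) if label != "silence"]
    return (active[0], active[-1]) if active else None
```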
Finally, time domain pitch synchronous overlap add
(TD-PSOLA) is used to synthesize the speech containing
emotion. TD-PSOLA is a waveform-editing technique that
modifies the prosodic parameters of the speech and thereby
changes how the synthetic speech sounds. The prosodic
parameters include the sound length, the sound intensity, and
the pitch. The steps of this technique are pitch synchronization
analysis, pitch synchronization modification, and pitch
synchronization synthesis.
Pitch synchronization analysis is the core of the algorithm.
To keep the synthesized waveform and spectrum
continuous, it is necessary to accurately mark out each
complete pitch period and to modify the prosody within a complete
pitch period. However, unvoiced sound has no pitch
period and voiced sound is only a quasi-periodic signal,
so the pitch period of unvoiced segments is replaced
by a constant. To keep the speech
waveform continuous near the voiced segments,
the following pitch labeling steps
are carried out: endpoint detection of the speech
signal, the voiced/unvoiced decision, location of the
pitch-period sequence, estimation of the pitch period
and position marking, and finally speech synthesis.
Windowing is used to obtain the short-time
signals x_m(n) as follows:
\[ x_m(n) = h_m(t_m - n)\, x(n) \]
where t_m is the m-th pitch synchronization mark and h_m is the window centered on it.
Pitch synchronization modification adjusts the
synchronization marks obtained in the previous step according to
certain rules and produces a new set of pitch synchronization
marks, covering the sound length, pitch frequency, and sound
intensity. The sound length is modified, to lengthen or
shorten the synthetic speech, by inserting or deleting
several pitch-period waveforms of the original
voice. The pitch is modified by changing the distance between
the original pitch synchronization marks, which changes the
pitch frequency of the synthetic speech, and the sound intensity
is modified by scaling the original speech waveform.
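Pitch synchronization synthesis superimposes
the short-time speech signal sequence acquired in the
previous step. The least squares overlap-add method is applied
as follows:
\[ \bar{x}(n) = \frac{\sum_q \bar{x}_q(n)\, h_q(\bar{t}_q - n)}{\sum_q h_q^2(\bar{t}_q - n)} \]
where \bar{t}_q are the new pitch synchronization marks and \bar{x}_q(n) are the short-time signals aligned to them.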
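The three TD-PSOLA steps can be sketched compactly: window the signal around each analysis pitch mark, move each windowed segment to a new mark, and combine the overlapping segments with the least squares normalization. This is a simplified illustration rather than the authors' implementation. It assumes a Hann window, a fixed window half-width, pitch marks supplied by the caller, and an output of the same length as the input.

```python
import math


def hann(offset, half_width):
    """Hann window value at a signed sample offset from the window center."""
    if abs(offset) >= half_width:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * offset / half_width))


def td_psola(x, marks, new_marks, half_width):
    """Overlap-add windowed pitch periods of x at new_marks (least squares form)."""
    numerator = [0.0] * len(x)
    denominator = [0.0] * len(x)
    for t_new in new_marks:
        # Reuse the analysis segment whose pitch mark is closest to the new mark.
        t_old = min(marks, key=lambda t: abs(t - t_new))
        for d in range(-half_width + 1, half_width):
            src, dst = t_old + d, t_new + d
            if 0 <= src < len(x) and 0 <= dst < len(x):
                w = hann(d, half_width)
                segment = w * x[src]  # windowed short-time signal sample
                numerator[dst] += segment * w
                denominator[dst] += w * w
    return [num / den if den > 1e-12 else 0.0 for num, den in zip(numerator, denominator)]
```

When the new marks equal the original marks the normalization cancels the window exactly and the input is reproduced; placing the new marks closer together raises the pitch, and placing them further apart lowers it.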
C. Action command
After the speech recognition system interprets the user’s
instructions, an action plan is implemented according to the
following steps: (1) the user sets the sleep and wake time in the
robot and the robot reminds the user when it is time for rest; (2)
the user calls the robot if they cannot sleep and the robot asks
the user if they know the factors causing their insomnia
(physiological or psychological); (3) if the user responds with
“physiological”, the robot provides suggestions; if the user
responds with “psychological”, the robot will play music or
spray fragrance to relieve the stress; (4) if the user responds with
“physiological and psychological” or does not know which
factors are causing their insomnia, the robot will first check the
sleep environment (including noise, light, temperature, and
humidity); (5) the robot then checks the user's temperature and
pulse rate; (6) the robot compares the measurement data with a
database to give suggestions; (7) the robot starts the alarm
clock function, so that
the user is awakened with joyful music.
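The branching in the steps above can be written as a small decision function. This is a sketch of the described flow only; the function name, the cause strings, and the action labels are illustrative assumptions, and the reminder and alarm steps are left out because they do not depend on the user's answer.

```python
def plan_actions(cause):
    """Return the robot's actions for the user's stated cause of insomnia."""
    if cause == "physiological":
        return ["give suggestions"]
    if cause == "psychological":
        return ["play music or spray fragrance"]
    # Both causes, or the user does not know: measure first, then advise.
    return [
        "check sleep environment (noise, light, temperature, humidity)",
        "check body temperature and pulse rate",
        "compare measurements with database",
        "give suggestions",
    ]
```

Any answer other than the two single causes falls through to the full measurement sequence, which matches the handling of "physiological and psychological" and of an unknown cause.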
The music and digital fragrance functions are programmed
into the robot by the user. The most important feature of this
study is the use of EEG sensors which, if worn by the user,
allow the robot to gauge the user's sleep quality. The sheep-counting
mode is another interesting special feature. The user interface
instruction action flow chart is shown in Figure 11.
Fig 11. User interface chart
IV. CONCLUSION
In this paper we have proposed a robot to serve as a personal
companion to help people with insomnia improve the quality of
their sleep. This robot would be positioned next to the sleeper,
interacting with them through various means. The system uses
a microphone and temperature, heartbeat, and EEG sensors to
observe the user’s state (i.e., input) and produces white noise
and smell in response (i.e., output). In the future, we aim to
improve this robot by adding mobility capabilities and
improved signal processing power, as well as improved design
and functionality features.
Fig 12. Scenario of a user interacting with the system
ACKNOWLEDGMENT:
This work was partially supported by grants from the Research
Grants Council of the Hong Kong Special Administrative
Region, China (Project No. CityU 21200216), City University
of Hong Kong (Project No. 7005021, 7005021, 6000623, &
6000623), and ACIM-SCM.
REFERENCES:
[1] Samani, H. (2015). Cognitive robotics. CRC Press.
[2] Saadatian, E., Samani, H., Toudeshki, A., & Nakatsu, R. (2013, October). Technologically mediated intimate communication: An overview and future directions. In International Conference on Entertainment Computing (pp. 93-104). Springer, Berlin, Heidelberg.
[3] Zhu, K., Zhu, R., Nii, H., Samani, H., & Jalaeian, B. B. (2014). PaperIO: a 3D interface towards the internet of embedded paper-craft. IEICE TRANSACTIONS on Information and Systems, 97(10), 2597-2605.
[4] Saadatian, E., Samani, H., Vikram, A., Parsani, R., Rodriguez, L. T., & Nakatsu, R. (2013, August). Personalizable embodied telepresence system for remote interpersonal communication. In RO-MAN, 2013 IEEE (pp. 226-231). IEEE.
[5] R. F. Yu, Don't let noises murder your hearing, New naturalism, 2014.
[6] Samani, H. A. (2012). Lovotics: Loving robots. LAP LAMBERT Academic Publishing.
[7] Y. D. Li, Influence of Image Color Combination on Subjective Preference and Recognition Rate and EEG Evaluation, National Taiwan University of Science and Technology Department of Industrial Management Master Thesis, 2004.
[8] E. e. a. Kirmizi-Alsan, "Comparative analysis of event-related potentials during Go/NoGo and CPT: decomposition of electrophysiological markers of response inhibition and sustained attention.," in Brain research, 2006, pp. 114-128.
[9] B. R. a. J. P. Cahn, "Meditation states and traits: EEG, ERP, and neuroimaging studies," Psychology of Consciousness: Theory, Research, and Practice, pp. 48-96, Aug 2013.
[10] M. R. Gerrard P, "Mechanisms of modafinil: A review of current research," Neuropsychiatr Dis Treat, p. 349–364, Jun 2007.
[11] G. a. F. L. D. S. Pfurtscheller, "Event-related EEG/MEG synchronization and desynchronization: Basic principles," Clinical neurophysiology, p. 1842–1857, 1999.
[12] E. a. F. L. d. S. e. Niedermeyer, Electroencephalography: basic principles, clinical applications, and related fields., Lippincott Williams & Wilkins, 2005.
[13] G. C. Gao, Mozart effect of electromagnetic waves and brain waves, 2011.
[14] Shenghong Precision Technology Co., Ltd., [Online]. Available: http://www.brain-sh.tw/product_content.php?p_id=134.
[15] K. M. Postawski, Powerful Sleep: Secrets of the Inner Sleep Clock.
[16] Zhang, X., Liu, X., Samani, H., & Jalaian, B. (2015). Cooperative spectrum sensing in cognitive wireless sensor networks. International Journal of Distributed Sensor Networks, 11(8), 170695.
[17] H. a. M. G. Lai, Music improves sleep quality in older adults, Journal of advanced nursing, 2006.
[18] V. Marmarelis, Analysis of physiological systems: The white-noise approach., Springer Science & Business Media, 2012.
[19] Samani, H. A., Koh, J. T. K. V., Saadatian, E., & Polydorou, D. (2012, March). Towards robotics leadership: An analysis of leadership characteristics and the roles robots will inherit in future human society. In Asian Conference on Intelligent Information and Database Systems (pp. 158-165). Springer, Berlin, Heidelberg.
[20] Samani, H., Saadatian, E., Pang, N., Polydorou, D., Fernando, O. N. N., Nakatsu, R., & Koh, J. T. K. V. (2013). Cultural robotics: the culture of robotics and robotics in culture. International Journal of Advanced Robotic Systems, 10(12), 400.
[21] H. Samani, K.-Y. Tien and J.-H. Lui, "An affective mood booster robot based on emotional processing unit.," in Automatic Control Conference (CACS), IEEE, 2017.