Received 23 March 2024, accepted 29 March 2024, date of publication 2 April 2024, date of current version 9 April 2024.
Digital Object Identifier 10.1109/ACCESS.2024.3384359
PainMeter: Automatic Assessment of Pain
Intensity Levels From Multiple
Physiological Signals Using
Machine Learning
DA’AD ALBAHDAL 1, WIJDAN ALJEBREEN 1, AND DINA M. IBRAHIM 1,2
1Department of Information Technology, College of Computer, Qassim University, Buraydah 51452, Saudi Arabia
2Department of Computers and Control Engineering, Faculty of Engineering, Tanta University, Tanta 31733, Egypt
Corresponding author: Dina M. Ibrahim (d.hussein@qu.edu.sa; dina.mahmoud@f-eng.tanta.edu.eg)
Researchers would like to thank the Deanship of Scientific Research, Qassim University for funding publication of this project.
ABSTRACT Pain assessment traditionally relies on self-report, but it is subjective and influenced by
various factors. To address this, there is a need for an affordable and scalable objective pain identification
method. Current research suggests that pain has physiological markers beyond the brain, such as changes
in cardiovascular activity and electrodermal responses. Utilizing these markers, real-time pain detection
algorithms were developed using the BioVid Heat Pain dataset, consisting of 86 healthy individuals
experiencing acute pain. Three physiological signals were collected (ECG, GSR, EMG). Various machine
learning models were employed to lay the foundation for future advancements in creating sophisticated
pain categorization algorithms. The goal is to develop a machine learning model capable of accurately
classifying levels of pain experienced based solely on physiological signals. The proposed method produced
an accuracy score of 87% for binary classification and 52% accuracy for multi-class classification, with the
highest-performing machine learning model being Random Forests. These results suggest that the PainMeter
can be deployed in field settings using wearable sensors, offering real-time, unbiased pain sensing and
management capabilities.
INDEX TERMS ECG, EMG, GSR, machine learning, pain intensity, physiology, signal, and classification.
I. INTRODUCTION
Pain detection, a crucial aspect of healthcare and human
welfare, has traditionally relied on subjective assessments
and observable symptoms, leading to potential inaccuracies
and delays in effective intervention. Pain is measured using
well-defined scales based on self-reports and observations.
Patient-reported outcome measures, such as visual analog
scales (VAS) and numeric rating scales (NRS), are commonly
used for measuring acute and chronic pain [1]. Because these
tactics rely on patients’ narratives, they are only applicable to
those who do not have any language or cognitive problems
[2]. As a result, all of these proven scales are ineffective
for infants, intensive care patients, or people with dementia
or developmental and intellectual impairments [3]. Such
patients are completely reliant on others’ abilities to detect
nonverbal pain signs. Therefore, when self-reported pain
measurements are inapplicable, observational pain scales are
recommended for adults in critical conditions. Nonetheless,
the reliability and validity of observation scales remain
limited because even qualified examiners cannot guarantee
an unbiased evaluation [4].
Despite breakthroughs in technology and a complete grasp
of the pathophysiological processes behind the physical
pain response, pain is frequently mismanaged. Misdiagnosis
of pain levels due to associated subjective biases, as cur-
rently practised, can result in excessive expenses and haz-
ards. Furthermore, inadequate pain management can cause
emotional distress and has been related to a variety of
effects, including chronic pain [5]. Recently, much focus has
been placed on machine learning (ML) and its uses in the
medical field. ML-based applications are utilized to improve
the overall efficiency of healthcare systems by developing
clinical decision support, disease diagnosis, and personalized
therapies [6],[7],[8],[9]. These recent advancements in
ML techniques have opened up new avenues for accurate
and timely pain detection, revolutionizing the way healthcare
professionals approach this critical aspect of patient care [10],
[11],[12]. By leveraging data-driven models and advanced
algorithms, ML can automatically and objectively assess pain
levels across diverse populations and contexts [13]. Overall,
this approach is referred to as automatic pain assessment
(APA).
Several studies have investigated the relationship between
brain regions that engage during pain using clinical imaging
such as fMRI, revealing a pain matrix of regions reliably acti-
vated by painful stimuli [14]. While this technique resulted
in the discovery of high-specificity and high-sensitivity
pain biomarkers, it still necessitates employing expensive
imaging equipment and highly educated personnel [15]. As a
result, there is a need for an objective approach to pain
diagnosis that is both cost-effective and scalable to reduce
reliance on expensive and limited neuroimaging. The existing
evidence indicates that pain is accompanied by objective
physiological indicators that manifest beyond the brain.
Instances of both acute and chronic pain exhibit consistent
physiological changes, such as heightened cardiovascular
activity, encompassing alterations in heart rate (HR) [16],
[17], blood volume pulse (BVP) [18],[19], as well as
changes in respiration rate and depth [20]. Electrodermal
activity also shows notable variations in the presence of pain
[21],[22]. These findings open avenues for the objective
assessment of pain in ambulatory settings. The prospect of
real-time, objective pain sensing and management emerges
with advancements in non-invasive continuous physiological
monitoring devices.
The main contributions of this research are:
Introduction of the PainMeter framework: a multi-
classifier achieving results comparable to the state-of-
the-art and surpassing previous methods by relying
solely on machine learning models instead of deep
learning.
Incorporation of multiple physiological signals in
the framework to enhance classification accuracy,
diverging from previous studies focused on a single
signal.
Implementation of a feature selection technique on the BioVid dataset, which includes 155 features, to select the most informative variables for binary classification and multi-classification tasks.
Achievement of high accuracy without incorporating
facial expressions as in previous studies, simplifying the
detection process in real-world settings and reducing
time and resource complexity.
Exploration of pain sensitivity differences between
genders and age groups using multiple physiological
signals from the Biovid dataset. To the best of our
knowledge, no previous literature has undertaken such
an analysis.
The remainder of the paper is structured as follows:
Section II provides background on subjective and obser-
vational pain assessment methods. Section III presents the
problem statement and research motivation. After that,
Section IV discusses recent studies that use machine learning
for objective pain assessment. Section V then introduces
the study’s methodology. Following the presentation of the
proposed ML model’s findings in Section VI, Section VII
offers a discussion. Finally, Section VIII presents the
conclusions and potential future work.
II. BACKGROUND
A. CONVENTIONAL PAIN ASSESSMENT APPROACHES
Over the years, regular clinical practice has used the patient's self-assessment as the most precise and dependable indication of the presence of pain and its intensity, across all age groups and irrespective of communication abilities or cognitive challenges. In the absence of objective measures, healthcare providers rely on patients to provide crucial details concerning the location, nature, and severity of their pain [23]. The most commonly used scales for adults are the Visual
Analog Scale (VAS) and Numeric Rating Scale (NRS) [1].
In the first one, patients mark their pain intensity on a 10 cm line, with one end representing "no pain" and the other end representing the "worst possible pain." On the Numeric Rating Scale (NRS), patients rate their pain directly by choosing a number from 0 to 10, with 0 indicating "no pain" and 10 indicating "the worst pain imaginable" [1].
There are many other scales designed for children, like the
Wong-Baker FACES, which shows a range of faces, varying
from a 0-pleased face, which suggests no pain, to a 10-crying
face, which indicates the worst pain. The patient selects
the face that depicts their pain level [24]. Many other scales have been designed, and some have been developed and tested to verify their validity. Regardless, self-reporting necessitates
considerable cognitive, linguistic, and social abilities, which
may not consistently be present in individuals under 8 years
old and other groups, such as infants, individuals with
developmental delays, brain damage, or dementia.
B. PAIN JUDGMENTS BY HEALTH CARE PROVIDERS
(OBSERVATIONAL PAIN MEASURES)
Observational pain scales are used with patients in critical conditions when self-reported pain scales are not feasible. Examples include the Critical Care Pain Observation Tool (CPOT), designed for patients in intensive care units (ICUs), and the Pain Assessment in Advanced Dementia Scale (PAINAD), applied to patients diagnosed with dementia; both tools are intended for patients unable to self-report [3]. The CPOT assesses four criteria: facial expressions, ventilator compliance, body
movements, and muscle rigidity. On the other hand, the
PAINAD evaluates five criteria: breathing rate, vocal sound, facial expressions, body gestures, and consolability. The
Face, Legs, Activity, Cry, Consolability (FLACC) scale,
ranging from 0 to 10, is endorsed for evaluating postoperative
pain in children aged 2 months to 7 years. However, it is
not considered ideal for appraising procedural pain or for
children who exhibit fewer observable physical or vocal signs
of discomfort [25]. Despite the qualifications of the raters,
these tools are still considered highly subjective and have
limited reliability and validity due to potential biases [4],
[26],[27]. Thus, an objective measure of pain is needed to
overcome limitations.
III. PROBLEM STATEMENT AND RESEARCH MOTIVATION
The problem at hand revolves around the unreliability of
subjective pain assessment in medical settings, particularly
inapplicable to certain demographic groups. To address this,
the imperative for an objective pain assessment method
becomes evident, aiming to facilitate suitable medical
treatment. Effectively gauging pain levels holds profound
significance in improving individuals’ overall quality of life,
mitigating not only the continual physical repercussions of
pain but also its social and emotional ramifications, such as
anxiety, frustration, and feelings of isolation. An additional
crucial aspect involves the prudent use of opioids, given the
alarming statistic from the American Medical Association
(AMA) indicating that nearly 1 in 5 individuals prescribed
pain medications eventually develop addiction issues [28].
This high percentage leads to more people developing
substance use disorder (SUD), which is already reaching
high numbers worldwide. As the World Health Organization
(WHO) specified, nearly 39.5 million people lived with
SUDs in the year 2021 alone [29]. Another issue that arises
when inaccurately assessing pain is Opioid Use Disorder (OUD), the misuse of opioids, such as taking a higher dose than prescribed, combining them with other unprescribed drugs, or taking them more often than directed. Furthermore, emergency
departments worldwide rely on triage systems based on
medical knowledge and experience to evaluate the urgency
of patients’ state [30]. Triage systems may not always
identify critical patients with hidden symptoms. In such
situations, if these patients are not identified and medical
attention is delayed, the risk of complications will increase.
Besides, in clinical practice, physicians and nurses have
noticed the difficulty in implementing the triage system,
especially for elderly people, foreigners, or patients with
a low education level [31]. For this reason, there is a
need for more efficient and reliable triage systems that can
assist healthcare professionals in identifying critical patients
quickly and accurately.
The scope of this endeavour encompasses a multi-modal
approach utilizing various biomarkers (ECG, EDA, EMG) for
pain assessment across all age groups and genders, focusing
exclusively on acute, induced pain.
FIGURE 1. Schematic overview of the related work section structure.
The primary objectives of this research are to develop an accurate and effective machine
learning model for classifying pain intensity levels based on
diverse biomarkers and compare different ML models for the
classification task. The main contribution of this research is
the development of a robust machine learning multi-classifier
for pain levels, utilizing multiple physiological signals.
IV. RELATED WORK
Machine learning (ML) can leverage complex datasets to
undertake predictive modelling tasks with applications in
pain research. Specifically, data-driven models can be utilized
to overcome the constraints associated with subjective pain
assessment. The primary objective is to create robust pain
assessment methods grounded in objective, standardized,
and universally applicable elements. These methodologies
are collectively referred to as automatic pain assessment
(APA). Progress in research is advancing swiftly, with
numerous research groups worldwide actively involved in
the field of APA research. The structure of the related
work section is summarised in Fig. 1. First, the previous
works are categorized into two groups: behavioural-based
approaches encompassing facial expressions, linguistic anal-
yses, nonverbal physical indicators like body movements,
and physiological-based pain detection methods. While this
division serves narrative clarity, it’s important to note that
some approaches employ multi-modal strategies, combining
behavioural aspects with neurophysiological techniques.
Then, the third and last section highlights prior research
demonstrating machine learning integration for automatic
pain assessment within wearable devices.
A. BEHAVIOR-BASED
Observable markers, such as facial expressions, language
features, and body postures, serve as indicators of pain.
Technology can capture and analyze these behaviours to
identify the presence of pain and evaluate its intensity.
1) FACIAL EXPRESSIONS
Pain expressions often trigger spontaneous facial expressions.
Research has found that these expressions are consistent
across age groups, genders, cognitive states, and pain
categories [32]. The facial action coding system (FACS) is
an analytical tool used to characterize and evaluate observed
facial movements. FACS breaks down facial movements into
specific action units (AUs), corresponding to the activation
of particular muscles [33]. This approach is widely used in
communication studies, psychology, and sociology. FACS
has been used to study pain expressions in different popu-
lations, including healthy individuals, chronic pain patients,
and individuals with neurological or mental disorders [34].
Recent studies have made significant progress using
machine learning and deep learning techniques to assess
pain intensity from facial expressions. Researchers in [35],
[36] developed deep learning algorithms, with the former
achieving good accuracy in detecting four pain levels and
the latter outperforming existing models in estimating pain
intensity across seven levels. Sri et al. [37] provided an
automatic patient monitoring and alerting system to assist in
continuously assessing and detecting pain levels using facial
reactions.
Nonetheless, employing facial expressions to detect pain
is generally ineffective and unsuitable in most cases. Taking
images or videos requires special equipment and a prepared
environment to ensure the quality and clarity of the frames.
Still, detection by facial expressions could be useful in
emergency waiting rooms to replace the traditional triage
classification system [38].
2) LINGUISTIC ANALYSIS
Natural language processing (NLP) is a type of machine
learning that allows for the analysis and processing of free-
form text. When applied to medical records, it can assist
in predicting patient outcomes, improve emergency triage
systems, and develop a conversational chatbot for patients to
ask questions and obtain relevant information [39]. In clinical
settings and pain research, NLP has been employed to
gather and analyze information from text-based data sources
such as medical records and clinical reports. These sources
contain essential details about the patient’s pain, including
its location, intensity, and duration [40]. This utilization of
NLP contributes to a more comprehensive understanding of
how patients express their pain. It could also aid clinicians by
generating predicted diagnoses and identifying urgent cases,
particularly in the emergency department [41].
In a recent study, Naseri et al. [42] applied an NLP
pipeline on clinical notes to automatically extract and identify
physician-reported pain, even in cases where it is not
documented through structured data entry. The NLP pipeline
attained an 85% F1-score in the identification of pain from
radiation oncology clinical notes. Many studies leverage
extensive electronic medical records to enhance clinical
assessments by applying NLP. In one study, an NLP pipeline
was designed to extract nine parameters describing pain from
electronic medical records [43]. These parameters include
location, onset, quality, quantity, severity, radiation, interven-
tion, prior treatment, and frequency. The model developed
by [43] achieved a 90% F1-score in identifying pain severity
from medical records. Another study applied a clinical text
deep learning model to unstructured nursing assessments
in electronic health records, intending to determine the
prevalence of pain upon arrival at the Emergency Department
[44]. The model demonstrated an average accuracy of 93.2%.
Overall, to utilize NLP in medicine, there is a paramount
need for unbiased training data to ensure the reliability of the
conclusions reached by NLP algorithms. Additionally, clini-
cians must undergo training to understand the safe utilization
of NLP within routine practice [39]. Not to mention situations
where individuals face challenges in verbalizing pain. Some
of these situations include cognitive impairments, infants
and pre-verbal children, language barriers, and cultural
differences.
3) BODY MOVEMENTS
Several studies have investigated the role of body movements
in pain assessment. Researchers in [45] found that body
movements can serve as indicators of pain in older individuals
with cognitive impairment. Additionally, Werner et al. [46]
observed that head movements and postures frequently tilt
downward or towards the location of pain. Together, these
studies suggest that body movements can be valuable indi-
cators of pain, especially in populations where self-reporting
is challenging.
B. PHYSIOLOGICAL-BASED
Physiological-based pain detection involves evaluating and
measuring pain by studying the physiological changes that
occur in response to it. Sensors are used to capture the
biomarkers for assessing pain, and the pain estimation results
are often superior to those based on behaviour. This field is
dynamic and continuously advancing, and ongoing research
continually reveals new avenues for exploration. Table 1 reviews the presented studies that rely on sensor information to assess pain. This literature review
categorizes studies based on the model type, distinguishing
between unimodal and multimodal approaches.
1) UNIMODAL APPROACHES
Winslow et al. [16] and Naeini et al. [17] each developed an ML classifier to assess pain based on an electrocardiogram
(ECG), which measures the heart’s electrical activity through
multiple cardiac cycles. The former [16] employed a binary
logistic regression classifier to indicate the presence of pain
without specifying its degree. The dataset includes 41 healthy
participants, mostly men, with an average age of 21. This
study induced pain through a cold pressor test, with the classifier achieving an F1 score of 81.9%. In the latter study [17],
pain assessment using the same ECG biomarker involved a
dataset of 25 postoperative patients aged between 18 and 65.
Five machine learning algorithms were applied for binary
baseline (BL) classification against four pain levels (PL1
through PL4). SVM achieved the highest accuracy at 84.14%
(BL vs. PL2).
Several studies highlight the utility of electrodermal
activity (EDA) for pain assessment [21],[22]. EDA measures
the skin’s electrical changes in response to moisture levels,
offering an objective measure of pain sensitivity, especially
during the fight-or-flight response. Aqajari et al. [21]
proposed the utilization of electrodermal activity (EDA)
for pain assessment, specifically employing GSR signals to
predict pain levels in postoperative patients. It’s important
to distinguish between EDA and GSR, as EDA encompasses
overall sweat gland activity, while GSR specifically measures
changes in skin conductance. Thus, GSR is a subset of
EDA, and the terms are not interchangeable. Moreover, [22] conducted
an experiment where EDA signals were recorded from
28 healthy participants by inducing electrical pain at the hand
and forearm for each participant. The implementation of the
Artificial Neural Network (ANN) classifier achieved 90%
accuracy in binary pain detection (no pain vs. pain), while
the pain localization experiment (hand vs. forearm) yielded
an accuracy of 66.67%.
Khan et al. [18] and Pouromran et al. [19] both found
that blood volume pulse (BVP) signals, when combined
with machine learning, can accurately assess pain levels.
Khan et al. [18] recorded BVP signals from 22 healthy
subjects exposed to electrical induction. In this study, the
classification of (no pain vs. high pain) achieved a high
accuracy of 96.6% using ANNs. The classification of (no
pain vs. low pain) attained an accuracy of 83.3% using the
AdaBoost classifier. In the multi-class experiment, which
classified (no pain, low pain, and high pain), an overall
accuracy of 69% was achieved using ANNs. Additionally,
researchers in [20] further supported this, demonstrating the
effectiveness of BVP signals in detecting different pain states.
Indeed, in a related study [47], researchers thoroughly
investigated diverse physiological signals to estimate pain
intensity and identified EDA as the most informative
signal for continuous assessment of pain intensity. Another
research [48] conducted an in-depth exploration of EDA
signals for pain assessment and suggested that sympathetic
reactions recorded by EDA show a higher correlation with the
intensity of the applied stimuli than with the pain sensation
described by the subject.
Developing machine learning models reliant on a single
physiological signal for pain assessment presents limitations,
including an incomplete understanding of the multifaceted
nature of pain, reduced robustness in capturing individual
variations, susceptibility to noise, and an inability to address
the heterogeneity of pain experiences effectively. Researchers
often explore multimodal approaches integrating information
from various physiological signals to overcome these draw-
backs and enhance model performance.
2) MULTIMODAL APPROACHES
Several studies have aimed to develop an automatic pain assessment model by classifying multiple physiological parameters. Jiang et al. [49] conducted an experiment
involving 30 healthy participants exposed to thermal and
electrical pain stimuli while recording heart rate (HR), breath
rate (BR), galvanic skin response (GSR), and facial surface
electromyogram. The collected samples were labelled (no
pain, mild pain, and severe pain) based on the participants’
reported VAS ratings. Subsequently, an ANN classifier was
trained, validated, and tested on physiological parameters,
resulting in an average accuracy of 70.6%. The same method
was then applied to the medians of each class in each test,
enhancing accuracy to 83.3%.
In a comparable study, Lin et al. [20] recorded nine phys-
iological signals while participants experienced cold pain
stimuli, including facial expressions (FE), electroencephalog-
raphy (EEG), eye movement (EM), skin conductance
level (SCL), blood volume pulse (BVP), electromyography
(EMG), respiration rate (RR), skin temperature (ST), and
blood volume pressure (BP). Then, physiological signals
were employed in ML algorithms and variance analysis. Find-
ings of the study [20] emphasized the effectiveness of facial
expressions, eye movement, EEG, skin conductance level,
skin temperature, and blood volume pressure in discerning
different levels of painful states. Also, researchers in [20]
showed that decision-level multimodal fusion could make it
easier to tell the difference between different pain levels.
Othman et al. [50] introduced a multimodal approach
for automatic pain assessment, integrating facial expres-
sions, audio, and physiological signals (ECG, EMG, EDA).
Researchers highlighted the challenge of database imbalance
and outliers, leading to model failure in identifying minority
classes and handling noise. The study’s results proved that
EMG and EDA signals are the most effective features for
multimodal performance. Furthermore, multimodal models using all features demonstrated better performance, especially in handling imbalanced datasets [50].
C. WEARABLE SENSORS FOR PAIN ASSESSMENT AND
MANAGEMENT
Other studies explored the use of wearable sensors to collect
physiological data to assess pain [51]. A recent study by
Korving et al. [52] proposed employing a mobile application
that applies an AI predictor to the physiological data collected
from a conductive skin sensor in socks. While this study
opens the door for different types of pain sensors, especially
for non-communicative patients, it falls short in accuracy and is constrained by the computational power limitations of a mobile device.
All the referenced studies have centred around developing
objective automated pain assessment systems. Initiatives
have been undertaken to establish multimodal databases for
analyzing pain and identifying authentic pain patterns. While
the outcomes are promising, they primarily concentrate on
distinguishing between pain, different pain levels, and the
absence of pain. To advance this field, research efforts
should also incorporate considerations of identifying discrete
pain levels and the location of pain, as these factors offer
supplementary valuable information for more sophisticated
pain management strategies.
Table 1 summarizes earlier research on machine learning-based automatic pain assessment, taking into account different pain induction techniques, physiological signals, binary or multi-classification, and demonstrating the best outcomes.
FIGURE 2. PainMeter framework for pain levels classification.
V. MATERIALS AND METHODS
The subsequent sections present the research methodol-
ogy intending to classify pain levels based on multiple
physiological signals using six machine learning models,
followed by a comparative analysis of their outcomes.
The framework comprises six key steps, namely: dataset
collection, feature extraction, exploratory data analysis,
feature selection, building machine learning models, and,
finally, the evaluation process. Fig. 2 offers a comprehensive
overview of the PainMeter framework.
A. DATASET
The dataset used in this study was the BioVid Heat Pain Database, Part B. The data was obtained by contacting the BioVid research team lead, Philipp Werner. The data includes 86 healthy subjects between 18 and 65 years old, divided into three age groups: 18-35, 36-50, and 51-65; Fig. 3 shows that the largest group is 18-35. The data is approximately balanced between the sexes. Pain was induced in the healthy test subjects using heat stimuli. The pain levels are split into five classes, with each class having 20 samples per subject, yielding 8600 labelled samples for each signal. Each pain level had a time window of 5.5 seconds. For each subject, the five pain levels were defined with 0 being no pain, 1 the point at which the subject starts to experience pain, and 4 the point at which the subject can no longer tolerate it. Levels 2 and 3 were set so that an equal distance separated each pain level between 1 and 4.
FIGURE 3. The distribution of ages in the dataset.
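To make the labelling scheme concrete, the short sketch below shows how the binary and multi-class tasks used later in this paper could be derived from a per-window pain-level column. The DataFrame layout and column names are illustrative assumptions; the BioVid distribution does not ship in exactly this form.

```python
import pandas as pd

# Assumed layout: one row per 5.5 s window with a pain level in {0..4}
# (86 subjects x 5 levels x 20 windows = 8600 rows per signal in Part B).
df = pd.DataFrame({
    "subject": [1, 1, 2, 2, 3, 3],
    "pain_level": [0, 4, 0, 3, 2, 4],
    # ...the 155 extracted feature columns would follow here
})

def binary_task(data, level):
    """Baseline (level 0) versus a single pain level, e.g. 0 vs 3 or 0 vs 4."""
    return data[data["pain_level"].isin([0, level])]

def multiclass_task(data, levels=(0, 1, 2, 3, 4)):
    """Multi-class task over the requested subset of levels, e.g. (0, 2, 4)."""
    return data[data["pain_level"].isin(levels)]

print(binary_task(df, 4)["pain_level"].value_counts())
print(multiclass_task(df, (0, 2, 4))["pain_level"].value_counts())
```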
1) PHYSIOLOGICAL SIGNALS
The physiological signals collected included ECG, GSR and
EMG at three different muscles: trapezius, corrugator, and
zygomaticus. An overview of each signal is provided below:
1) Electrocardiogram (ECG) is a medical test used to evaluate the heart and indicate any changes in heart activity patterns. It records the electrical signals of the heart using electrodes placed at specific spots on the arms. The ECG signals were filtered by a Butterworth band-pass filter with a 0.1-250 Hz frequency band together with the Empirical Mode Decomposition technique (a filtering sketch is given after this list).
2) The galvanic skin response (GSR), a component of
electrodermal activity (EDA), refers to variations in
sweat gland function that mirror the level of emotional
arousal, indicating the intensity of the emotional state.
3) Electrical muscle activity is also an indicator of general
psychophysiological arousal, as increased muscle tone
is associated with increasing activity of the sympathetic
nervous system, while a decrease in somatomotor activ-
ity reflects predominantly parasympathetic arousal.
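As a rough illustration of the band-pass step described for the ECG above, the sketch below applies a Butterworth filter with SciPy. The 512 Hz sampling rate is an assumption (it is not stated in this section), and the Empirical Mode Decomposition step is omitted.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 512  # Hz; assumed sampling rate, not stated in this section

def bandpass_ecg(ecg, low=0.1, high=250.0, fs=FS, order=4):
    """Zero-phase Butterworth band-pass over the 0.1-250 Hz band."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, ecg)

# Example: filter one 5.5 s window of synthetic ECG-like data.
t = np.arange(0, 5.5, 1 / FS)
raw = np.sin(2 * np.pi * 1.2 * t) + 0.05 * np.random.randn(t.size)  # ~72 bpm tone + noise
clean = bandpass_ecg(raw)
```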
B. FEATURE EXTRACTION
Various mathematical groups were used to extract several
features, including signal amplitude, variability, stationarity,
entropy, linearity, similarity, and frequency attributes. Fig. 4
demonstrates the complete list of these 155 features. The
feature names were shortened to be as brief as possible
while maintaining consistency with the relevant physiological
literature. The mathematical descriptions of those features
in mathematical formulas are mentioned in the definition
column. The zygomaticus (zEMG), corrugator (cEMG), and trapezius (tEMG) muscle modalities were each characterized by 39 features (1-39), the SCL modality by 35 features (1-35), and the ECG modality by three (40-42). Fig. 4 provides a detailed overview of all features.
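To give a flavour of the per-window statistics listed in Fig. 4, the sketch below computes a few amplitude and variability descriptors with NumPy. The names only loosely mirror the abbreviations above and are not the exact BioVid feature definitions.

```python
import numpy as np

def window_features(x):
    """A handful of amplitude/variability descriptors for one 5.5 s window."""
    return {
        "peak": float(np.max(np.abs(x))),          # maximum absolute amplitude
        "p2p": float(np.ptp(x)),                   # peak-to-peak range
        "rms": float(np.sqrt(np.mean(x ** 2))),    # root mean square
        "std": float(np.std(x)),                   # standard deviation (variability)
        "mav": float(np.mean(np.abs(x))),          # mean absolute value
        "zero_cross": int(np.sum(np.diff(np.sign(x)) != 0)),  # sign changes
    }

window = np.random.randn(2816)  # placeholder window; real input is a GSR/ECG/EMG segment
print(window_features(window))
```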
C. EXPLORATORY DATA ANALYSIS
An exploratory data analysis has been conducted to under-
stand the data and the relationships between the features.
Firstly, an overview of the data is obtained by examining the
data’s basic statistics and structure, providing the data’s cen-
tral tendency. Secondly, the data is checked for missing values
to handle them appropriately. Then the data distributions
were explored and visualized to inspect patterns or outlier
observations. Figure 5 shows the pain level distribution based
on sex and age group.
TABLE 1. Review of previous studies on automatic pain assessment using machine learning, considering various pain induction methods, physiological
signals, binary or multi-classification, and highlighting best results.
D. FEATURE SELECTION
The dataset comprises 155 features, necessitating an effective
feature selection method to enhance model performance. This
research employed a meticulous feature selection approach
to identify the most informative variables for analysis. This
method aimed to streamline the dataset, improving model accuracy and interpretability. The following techniques were
employed:
Correlation coefficient: A measure that assesses the
strength and direction of the linear relationship between
two variables. In feature selection, it can identify
features highly correlated with the target variable or
each other. After assessing the correlation coefficients of the Biovid features, it becomes evident that only 10 features exhibit a correlation, while the remaining features demonstrate a lack of correlation.
Chi-Square: A statistical test that measures the independence between categorical variables. Using the chi-squared test, the top 10 features from the dataset were selected after normalizing them. The selection is based on their relationship with the target variable (pain level). Table 2 lists the selected features for each task in this research (a selection sketch follows this list).
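A minimal sketch of the two selection steps above with pandas and scikit-learn; the k = 10 choice follows the text, while the correlation threshold and the min-max scaling before the chi-squared test (chi2 requires non-negative inputs) are assumptions.

```python
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.preprocessing import MinMaxScaler

def select_features(X: pd.DataFrame, y: pd.Series, corr_threshold=0.1, k=10):
    # 1) Correlation coefficient: keep features whose absolute correlation
    #    with the pain level exceeds a threshold (threshold value assumed).
    corr = X.corrwith(y).abs()
    correlated = corr[corr > corr_threshold].index.tolist()

    # 2) Chi-squared: scale to [0, 1] and keep the k features most
    #    dependent on the target variable.
    X_scaled = MinMaxScaler().fit_transform(X)
    selector = SelectKBest(chi2, k=k).fit(X_scaled, y)
    chi2_selected = X.columns[selector.get_support()].tolist()
    return correlated, chi2_selected
```

In this study the chi-squared ranking drives the ten features listed in Table 2; the sketch simply illustrates the mechanics of both steps.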
The analysis of redundancy across the provided lists
reveals the frequency distribution of best-performing fea-
tures selected for each classification task in the research.
Leading the list is CH23_S-sd, appearing six times, followed
closely by CH22_A-PEAK, CH22_A-P2P, CH22_V-range,
CH22_Simale-corr, and CH23_Simale-corr, each occurring
five times. Subsequently, CH22_A-RmaleS, CH22_S-sd,
CH22_V-std, and CH24_Simale-corr appear four times.
TABLE 2. The ten features selected from the Biovid dataset in different
classification tasks.
FIGURE 4. Detailed information overview of all features extracted from the Biovid dataset [53].
FIGURE 5. The distribution of pain level based on sex and age group.
E. DATASET TRAIN AND TEST METHOD
After data preprocessing, the dataset is randomly divided into 80% training and 20% testing sets in the train/test splitting approach, as suggested in previous literature [54], to provide
sufficient training samples for multi-classification ML mod-
els. Then, the dataset is divided into k-fold cross-validation
for evaluation. Each ML model is trained and tested k
times, with a different fold serving as the test set each time.
Performance metrics from each fold are averaged to estimate
the models' performance. In the literature, the most commonly used values were k = 5 or k = 10. However, a recent study [55] found that k = 7 slightly increased accuracy and area under the curve while incurring less computational complexity than k = 10 across most ML models. Hence, 7-fold cross-
validation has been employed to evaluate the accuracy of ML
models.
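Under the evaluation protocol just described, the split and the 7-fold cross-validation could be outlined as follows with scikit-learn; the random seeds and placeholder data are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(8600, 10))        # placeholder for the 10 selected features
y = rng.integers(0, 5, size=8600)      # placeholder pain levels 0-4

# 80/20 hold-out split used for model development.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 7-fold cross-validation; fold accuracies are averaged to estimate performance.
clf = RandomForestClassifier(random_state=42)
print("mean 7-fold accuracy:", cross_val_score(clf, X, y, cv=7, scoring="accuracy").mean())
```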
F. MACHINE LEARNING MODELS
In implementation, a diverse set of machine learning models
will be employed to assess their performance in pain
assessment. The classifiers include Random Forest (RF),
Support Vector Machine (SVM), Logistic Regression (LR),
Decision Tree (DT), Naive Bayes (NB), and K-Nearest
Neighbour (KNN). These models will undergo training to
evaluate their effectiveness in capturing patterns and nuances
within the dataset.
Training machine learning classification algorithms with
high-dimensional data often generates overfitted models,
which present good results on the training set but falter
when tested with new data. Overfitting occurs when the
model mistakenly identifies noise and random fluctuations
in the training data as meaningful patterns. Moreover,
excessive features clutter the learning algorithm, primarily
due to irrelevant and redundant features, resulting in higher
learning and computational time. Thus, feature selection,
as detailed in Section D, is implemented to minimize
the overfitting issue, as evidenced in prior studies [56].
Additionally, as elaborated in Section E, the k-fold cross-validation approach has been implemented to further mitigate the overfitting problem, a technique validated by earlier research [57].
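One way to run the six classifiers side by side is sketched below; hyperparameters are left at scikit-learn defaults, which is an assumption rather than the configuration used for the reported results, and the data is a synthetic stand-in.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the selected BioVid features and binary labels.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

models = {
    "RF": RandomForestClassifier(random_state=42),
    "SVM": SVC(),
    "LR": LogisticRegression(max_iter=1000),
    "DT": DecisionTreeClassifier(random_state=42),
    "NB": GaussianNB(),
    "KNN": KNeighborsClassifier(),
}
for name, clf in models.items():
    acc = cross_val_score(clf, X, y, cv=7, scoring="accuracy").mean()
    print(f"{name}: mean accuracy = {acc:.3f}")
```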
VI. RESULTS
A. MACHINE LEARNING MODEL RESULT
The application of all models exhibited varying accuracies
in the research study. Initially, a binary classification was
performed, distinguishing base-level 0 from other levels
(1, 2, 3, 4). This process was executed twice, once utilizing
all features and once with feature selection. Across all
classes, Random Forest (RF) consistently achieved the
highest accuracy, with the highest accuracy observed in
the base level versus level 3 classification, reaching an
accuracy of 87%. Table 3 summarizes the results of the binary-classification models. Furthermore, to provide a deeper analysis and precise presentation of the models' performance, the best-performing model (RF) results with all features and with selected ones are visually presented using confusion matrices, classification reports, ROC curves, and AUC values for binary classification tasks in Figures 6, 7, 8, 9, 10, 11, 12, and 13.
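The kind of per-task evaluation shown in Figures 6-13 (confusion matrix, classification report, ROC/AUC) can be reproduced in spirit with standard scikit-learn metrics; the data below is a synthetic placeholder, not the BioVid features.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)  # placeholder binary task
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

rf = RandomForestClassifier(random_state=42).fit(X_tr, y_tr)
y_pred = rf.predict(X_te)
y_prob = rf.predict_proba(X_te)[:, 1]   # probability of the positive (pain) class

print(confusion_matrix(y_te, y_pred))
print(classification_report(y_te, y_pred))
print("AUC:", roc_auc_score(y_te, y_prob))
```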
For multi-classification, two tasks were undertaken: multi-
classification involving levels 0, 1, 2, 3, and 4, and a
second task involving levels 0, 2, and 4. Both tasks were
implemented with and without feature selection. In the
multi-class task (0, 1, 2, 3, 4), Random Forest (RF)
demonstrated the highest accuracy at 43% without feature selection. For the second task (0, 2, 4), RF and KNN achieved similar results of 52% accuracy, both with and without feature selection. Table 4 summarizes the results of the multi-classification models. Figures 14, 15, 16, and 17 present detailed evaluation metrics of the random forest model for each multi-classification task.
FIGURE 6. Classification results of random forest model with binary-classification task (0 Vs 4) using all features.
FIGURE 7. Classification results of random forest model with binary-classification task (0 Vs 4) using selected features.
FIGURE 8. Classification results of random forest model with binary-classification task (0 Vs 3) using all features.
FIGURE 9. Classification results of random forest model with binary-classification task (0 Vs 3) using selected features.
FIGURE 10. Classification results of random forest model with binary-classification task (0 Vs 2) using all features.
FIGURE 11. Classification results of random forest model with binary-classification task (0 Vs 2) using selected features.
In this research, k-fold cross-validation was performed to
assess the model’s generalizability and mitigate overfitting.
The Random Forest (RF) model was found to be the
best-performing model, as depicted in Tables 3 and 4. Hence,
the effectiveness of the RF model has been evaluated across
various classification tasks using a 7-fold cross-validation.
FIGURE 12. Classification results of random forest model with binary-classification task (0 Vs 1) using all features.
FIGURE 13. Classification results of random forest model with binary-classification task (0 Vs 1) using selected features.
FIGURE 14. Classification results of random forest model with multi-classification task (0,1,2,3,4) using all features.
Table 5 summarises model evaluations for each classification
task.
Individual differences in perceiving pain due to numerous
biological and psychosocial factors are noticeable. Variables
such as demographics, genetic elements, and psychosocial
processes contribute to these individual differences in pain
experiences [58]. Notably, age and gender have been
identified as influential factors, as evidenced by a study [58]. The research indicates that older adults tend to exhibit reduced sensitivity to brief cutaneous pains, such as heat pain. Additionally, in response to sustained and repeated thermal stimuli, females demonstrate greater habituation than males, suggesting a stronger pain-inhibitory response to these stimuli.
FIGURE 15. Classification results of random forest model with multi-classification task (0,1,2,3,4) using selected features.
FIGURE 16. Classification results of random forest model with multi-classification task (0,2,4) using all features.
FIGURE 17. Classification results of random forest model with multi-classification task (0,2,4) using selected features.
FIGURE 18. Classification results of random forest model with multi-classification task (0,1,2,3,4) with female sample.
FIGURE 19. Classification results of random forest model with multi-classification task (0,1,2,3,4) with male sample.
FIGURE 20. Classification results of random forest model with multi-classification task (0,1,2,3,4) with 18-35 age sample.
Consequently, the model is trained separately for each
category to explore potential impacts on performance.
The best-performing model (RF) was used to analyze the
performance metrics of various age groups and genders from
the Biovid dataset. Table 6 shows accuracy, precision, and recall results. Also, detailed evaluation metrics of the gender samples are presented in Figures 18 and 19, while those of the age group samples are presented in Figures 20, 21, and 22.
FIGURE 21. Classification results of random forest model with multi-classification task (0,1,2,3,4) with 36-50 age sample.
FIGURE 22. Classification results of random forest model with multi-classification task (0,1,2,3,4) with 51-65 age sample.
TABLE 3. Accuracy of machine learning models (binary classification).
TABLE 4. Accuracy of machine learning models (multi-classification).
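The per-group analysis described above can be implemented by fitting the best-performing model separately on each demographic slice, roughly as sketched below; the 'sex' and 'age_group' columns and the synthetic data are assumptions about how the metadata might be organised.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 1200
feature_cols = [f"f{i}" for i in range(10)]
df = pd.DataFrame(rng.normal(size=(n, 10)), columns=feature_cols)
df["pain_level"] = rng.integers(0, 5, size=n)
df["sex"] = rng.choice(["female", "male"], size=n)                  # assumed metadata column
df["age_group"] = rng.choice(["18-35", "36-50", "51-65"], size=n)   # assumed metadata column

# Fit and evaluate the RF model independently on each demographic subgroup.
for column in ("sex", "age_group"):
    for group, sub in df.groupby(column):
        acc = cross_val_score(RandomForestClassifier(random_state=42),
                              sub[feature_cols], sub["pain_level"],
                              cv=7, scoring="accuracy").mean()
        print(f"{column}={group}: mean accuracy = {acc:.3f}")
```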
The results showed that the male sample had a slightly
higher accuracy than females. For age groups, the younger
subjects showed a higher pain sensitivity than older subjects.
TABLE 5. Average accuracy of 7-fold cross-validation for best performing
model (RF) across various classification tasks.
TABLE 6. Performance metrics of random forest multiclassification
(0,1,2,3,4) for different samples of the Biovid dataset.
These results align with previous findings suggesting poten-
tial differences in pain perception between ages and genders.
Hence, the design of a sophisticated pain assessment model that considers the patient's gender, age, and health condition is suggested, since all these factors affect how physiological signals react to pain.
TABLE 7. Result comparison of the Pouromran et al. [19] and current studies, using MAE and RMSE metrics.
TABLE 8. Result comparison of previous studies from Table 1 and the current study, using the accuracy metric, with a focus only on the binary classifications.
B. RESULT COMPARISON
To examine the performance of the methodology of the
current study, we compared the MAE and RMSE values
with the only other study that used the BioVid dataset,
which is the study by Pouromran et al. [19]. Pouromran et al. applied different ML methods to Part A of the BioVid dataset, which included biomedical signals only from the trapezius muscle; however, Part B of the dataset, which was used in this study, has biomedical signals from the trapezius, corrugator, and zygomaticus muscles, thus yielding more data to improve prediction. Table 7 demonstrates the
different MAE and RMSE scores of each of the top two
performing models. In this current study, the Random Forest
and Decision Tree models scored the lowest MAE and
RMSE scores, respectively, thus producing more accurate
pain level predictions. The high-performing models of Pouromran et al. [19] were XGBoost and Random Forest, making Random Forest one of the top two performing models overall.
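For reference, the two comparison metrics in Table 7 can be computed as below; the pain-level arrays here are dummy values for illustration only, not the paper's results.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = np.array([0, 1, 2, 3, 4, 4, 2, 0])   # dummy ground-truth pain levels
y_pred = np.array([0, 1, 1, 3, 4, 3, 2, 1])   # dummy model predictions

mae = mean_absolute_error(y_true, y_pred)             # mean absolute error
rmse = np.sqrt(mean_squared_error(y_true, y_pred))    # root mean squared error
print(f"MAE = {mae:.3f}, RMSE = {rmse:.3f}")
```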
Table 8 exhibits the various accuracy levels from the previous studies compared to the current study. Studies with a binary classification were compared to the current study's results for pain levels (0 vs 4). The Support Vector Machine model proved to be the highest-performing model in two studies, Naeini et al. [17] and Khan et al. [18], with the Random Forest model being the next highest in accuracy, as shown by the current study and Aqajari et al. [21].
VII. DISCUSSION
This research contributes to advancing Automatic Pain
Assessment (APA) by addressing gaps and limitations
identified in prior studies. An improvement is made by
comparing binary and multi-classification models, aiming for
more detailed and accurate pain assessments. One limitation
in automated pain assessment research is the scarcity of
datasets, preventing the validation of proposed models and
findings across diverse scenarios. It’s important to note that
the predefined heat pain window in the BioVid dataset
is 5.5 seconds, which might not be sufficiently long for
collecting physiological signals for accurate automated pain
intensity estimation. Notably, the dataset utilized in this study,
comprising 90 subjects and four physiological signals, is the
largest available. The comparisons also suggest that the proposed feature set is better at detecting pain than some alternatives, even though the results were better for binary classification than for multi-classification.
VIII. CONCLUSION AND FUTURE WORK
The challenge of automatically recognizing pain is complex,
given that the perception and expression of pain are
influenced by various factors such as personality, social
context, the source of pain, or past experiences. This study
aims to enhance the validity and reliability of existing
pain recognition approaches by integrating multiple signal
sources, including biomedical signals (GSR, EMG, EDA, and
ECG). The approach involves feature selection to reduce the
number of features, followed by the application of various
machine learning models, including RF, SVM, LR, DT, NB,
and KNN. RF demonstrated high performance in both binary
and multi-classification tasks. The classification performance
was thoroughly evaluated by comparing the separate use and
combination of features.
In future research, it is critical to properly assess the
efficacy and scalability of the PainMeter framework in
real-time applications. This investigation will provide vital
insights into the framework’s adaptability and performance
across various scenarios, laying the groundwork for its
potential implementation in real-world settings. Moreover,
addressing the unexplored challenges in pain literature is
crucial, as emotions, anxiety, sleep, and other experiences
can also influence physiological signals responding to
pain. Thus, developing a model capable of appropriately
distinguishing the reasons for changes in physiological
signals is crucial. Additionally, physiological responses to
pain may differ between healthy subjects and patients
with specific diseases. Designing a model that can be
carefully calibrated based on different patient groups is
essential.
REFERENCES
[1] O. Karcioglu, H. Topacoglu, O. Dikme, and O. Dikme, ‘A sys-
tematic review of the pain scales in adults: Which to use?’ Amer.
J. Emergency Med., vol. 36, no. 4, pp. 707–714, Apr. 2018, doi:
10.1016/j.ajem.2018.01.008.
[2] S. E. Berger and A. T. Baria, ‘Assessing pain research: A narrative
review of emerging pain methods, their technosocial implications, and
opportunities for multidisciplinary approaches,’ Frontiers Pain Res.,
vol. 3, Jun. 2022, Art. no. 896276, doi: 10.3389/fpain.2022.896276.
[3] T. Alghamdi and G. Alaghband, ‘Facial expressions based automatic pain
assessment system,’ Appl. Sci., vol. 12, no. 13, p. 6423, Jun. 2022, doi:
10.3390/app12136423.
[4] M. Lotan and M. Icht, ‘‘Diagnosing pain in individuals with intellectual
and developmental disabilities: Current state and novel technologi-
cal solutions,’ Diagnostics, vol. 13, no. 3, p. 401, Jan. 2023, doi:
10.3390/diagnostics13030401.
[5] P. Werner, D. Lopez-Martinez, S. Walter, A. Al-Hamadi, S. Gruss,
and R. W. Picard, ‘‘Automatic recognition methods supporting pain
assessment: A survey,’’ IEEE Trans. Affect. Comput., vol. 13, no. 1,
pp. 530–552, Jan. 2022, doi: 10.1109/TAFFC.2019.2946774.
[6] A. Zhang, L. Xing, J. Zou, and J. C. Wu, ‘‘Shifting machine learning for
healthcare from development to deployment and from models to data,’’
Nature Biomed. Eng., vol. 6, no. 12, pp. 1330–1345, Jul. 2022, doi:
10.1038/s41551-022-00898-y.
[7] L. Rubinger, A. Gazendam, S. Ekhtiari, and M. Bhandari, ‘‘Machine
learning and artificial intelligence in research and healthcare,’ Injury,
vol. 54, pp. 69–73, May 2023, doi: 10.1016/j.injury.2022.01.046.
[8] M. Javaid, A. Haleem, R. Pratap Singh, R. Suman, and S. Rab,
‘‘Significance of machine learning in healthcare: Features, pillars and
applications,’ Int. J. Intell. Netw., vol. 3, pp. 58–73, Jan. 2022, doi:
10.1016/j.ijin.2022.05.002.
[9] A. Alanazi, ‘‘Using machine learning for healthcare challenges and oppor-
tunities,’ Informat. Med. Unlocked, vol. 30, Jan. 2022, Art. no. 100924,
doi: 10.1016/j.imu.2022.100924.
[10] I. van der Wal, F. Meijer, R. Fuica, Z. Silman, M. Boon, C. Martini,
M. van Velzen,A. Dahan, M. Niesters, and Y. Gozal, ‘Intraoperative use of
the machine learning-derived nociception level monitor results in less pain
in the first 90 min after surgery,’’ Frontiers Pain Res., vol. 3, Jan. 2023,
Art. no. 1086862, doi: 10.3389/fpain.2022.1086862.
[11] T. Harland, A. Hadanny, and J. G. Pilitsis, ‘‘Machine learning and pain
outcomes,’ Neurosurgery Clinics North Amer., vol. 33, no. 3, pp. 351–358,
Jul. 2022, doi: 10.1016/j.nec.2022.02.012.
[12] G. Baskozos, A. C. Themistocleous, H. L. Hebert, M. M. V. Pascal, J. John,
B. C. Callaghan, H. Laycock, Y. Granovsky, G. Crombez, D. Yarnitsky,
A. S. C. Rice, B. H. Smith, and D. L. H. Bennett, ‘‘Classification of
painful or painless diabetic peripheral neuropathy and identification of the
most powerful predictors using machine learning models in large cross-
sectional cohorts,’ BMC Med. Informat. Decis. Making, vol. 22, no. 1,
p. 144, May 2022, doi: 10.1186/s12911-022-01890-x.
[13] X. Xu and Y. Huang, ‘Objective pain assessment: A key for the
management of chronic pain,’ FResearch, vol. 9, p. 35, Jan. 2020, doi:
10.12688/f1000research.20441.1.
[14] C. P. Mao, Y. Wu, H. J. Yang, J. Qin, Q. C. Song, B. Zhang, X. Q. Zhou,
L. Zhang, and H. H. Sun, ‘‘Altered habenular connectivity in chronic low
back pain: An fMRI and machine learning study,’’ Human Brain Mapping,
vol. 44, no. 11, pp. 4407–4421, Jun. 2023, doi: 10.1002/hbm.26389.
[15] R. Sankaran, A. Kumar, and H. Parasuram, ‘Role of artificial intelligence
and machine learning in the prediction of the pain: A scoping systematic
review,’ Proc. Inst. Mech. Eng., H, J. Eng. Med., vol. 236, no. 10,
pp. 1478–1491, Sep. 2022, doi: 10.1177/09544119221122012.
[16] B. D. Winslow, R. Kwasinski, K. Whirlow, E. Mills, J. Hullfish,
and M. Carroll, ‘‘Automatic detection of pain using machine learn-
ing,’ Frontiers Pain Res., vol. 3, Nov. 2022, Art. no. 1044518, doi:
10.3389/fpain.2022.1044518.
[17] E. Kasaeyan Naeini, A. Subramanian, M.-D. Calderon, K. Zheng, N. Dutt,
P. Liljeberg, S. Salantera, A. M. Nelson, and A. M. Rahmani, ‘Pain
recognition with electrocardiographic features in postoperative patients:
DA’AD ALBAHDAL is currently pursuing the master’s degree with the
Program of Data Science, Department of Information Technology, College
of Computer, Qassim University, Saudi Arabia.
WIJDAN ALJEBREEN is currently pursuing the master’s degree with the
Program of Data Science, Department of Information Technology, College
of Computer, Qassim University, Saudi Arabia.
DINA M. IBRAHIM was born in the United Arab Emirates. She received the B.Sc., M.Sc., and
Ph.D. degrees from the Computers and Control
Engineering Department, Faculty of Engineering,
Tanta University, Egypt, in 2002, 2008, and 2014,
respectively. She was a Consultant Engineer,
a Database Administrator, and a Vice Manager
with the Management Information Systems (MIS)
Project, Tanta University, from 2008 to 2014.
She has been an Associate Professor with the
Department of Information Technology, College of Computers, Qassim
University, Buraydah, Saudi Arabia, since September 2015. She is currently
an Assistant Professor with the Department of Computers and Control
Engineering, Faculty of Engineering, Tanta University. She has published
more than 60 papers in various refereed international journals and confer-
ences. Her research interests include networking, wireless communications,
machine learning, and the Internet of Things. She has been serving as the
Co-Chair for the International Technical Committee for the Middle East
Region of the ICCMIT Conference, since 2021. She has been serving as a
Reviewer for Wireless Network (WINE) (Springer) and Journal of Mobile Communication, Computation and Information (Springer) since 2015 and, since 2021, for Multidisciplinary Digital Publishing Institute (MDPI), IEEE ACCESS, and the International Journal of Supply and Operations Management (IJSOM).