Methodological Shortcomings of Wrist-Worn Heart Rate Monitors Validations

February 2018
Journal of Medical Internet Research 20(7)

DOI:10.2196/10108

License
CC BY 4.0

Authors:

Francesco Sartor

Università degli Studi "G. d'Annunzio" Chieti-Pescara, Italy

Gabriele Papini

Philips

Lieke Cox

Philips

John Cleland

University of Glasgow

Viewpoint

Methodological Shortcomings of Wrist-Worn Heart Rate Monitors Validations

Francesco Sartor1, PhD; Gabriele Papini2, MSc; Lieke Gertruda Elisabeth Cox1, PhD; John Cleland3, MD, FRCP

1Personal Health, Philips Research, Eindhoven, Netherlands

2Department of Electrical Engineering, Technical University Eindhoven, Eindhoven, Netherlands

3Institute of Health & Well-Being, Robertson Centre for Biostatistics and Clinical Trials, University of Glasgow, Glasgow, United Kingdom

Corresponding Author:

Francesco Sartor, PhD

Personal Health

Philips Research

High Tech Campus

Eindhoven,

Netherlands

Phone: 31 61 550 9627

Email: francesco.sartor@philips.com

Abstract

Wearable sensor technology could have an important role for clinical research and in delivering health care. Accordingly, such

technology should undergo rigorous evaluation prior to market launch, and its performance should be supported by evidence-based

marketing claims. Many studies have been published attempting to validate wrist-worn photoplethysmography (PPG)-based heart

rate monitoring devices, but their contrasting results question the utility of this technology. The reason why many validations did

not provide conclusive evidence of the validity of wrist-worn PPG-based heart rate monitoring devices is mostly methodological.

The validation strategy should consider the nature of data provided by both the investigational and reference devices. There should

be uniformity in the statistical approach to the analyses employed in these validation studies. The investigators should test the

technology in the population of interest and in a setting appropriate for intended use. Device industries and the scientific community

require robust standards for the validation of new wearable sensor technology.

(J Med Internet Res 2018;20(7):e10108) doi:10.2196/10108

KEYWORDS

sensor technology; accuracy; wearable; telemonitoring

In the past 5 years, there has been a huge proliferation of

wrist-worn heart rate monitors, often embedded in smart-bands

and smartwatches, which can generate a vast amount of data on

lifestyle, physiology, and disease, providing exciting

opportunities for future health applications. Wearable sensor

technology could have an important role for clinical research

and in delivering health care [1]. Wearable sensors can be used

to encourage healthier living (possibly delaying or preventing

the onset of disease), screen for incident disease, and provide

unobtrusive continuous monitoring for people with chronic

illnesses in order to optimize care and detect disease progression

and complications. In Figure 1, we show an overview of

potential continuous heart rate monitoring applications. New

diagnostic applications could become possible thanks to the

integration of heart rate and personal information such as age,

sex, fitness, activity type, and symptoms. A large number of

lifestyle apps and games are emerging thanks to continuous

heart rate monitoring, currently most of them related to fitness

(eg, Google Fit, Strava) or biofeedback relaxation (eg, Letter

Zap, Skip a Beat). It is conceivable that health-promoting apps

or games based on heart rate will soon be developed. Wearable

heart rate monitors could also enable therapeutic monitoring

such as medication titration. Accordingly, such monitors should

undergo rigorous evaluation prior to market launch, and their

performance should be supported by evidence-based marketing

claims [1].

There are several types of validation studies. These studies may

be marketing claim validations or medical claim validations for

medical grade certification. They are usually done by the

manufacturers, sometimes in collaboration with clinical sites,

on unreleased products. There may also be benchmarking

validation studies, where several commercially available

competing products are compared to one another and against a

reference. In some cases, there may even be single-device

validation studies.

J Med Internet Res 2018 | vol. 20 | iss. 7 | e10108 | p.1 | http://www.jmir.org/2018/7/e10108/

(page number not for citation purposes)

Sartor et al | JOURNAL OF MEDICAL INTERNET RESEARCH

Figure 1. Brief overview of potential clinical and nonclinical applications derivable from continuous heart rate monitoring. AF/VT: atrial

fibrillation/ventricular tachycardia; HFrEF: heart failure with reduced ejection fraction.

The latter 2 types are generally performed by academic or

clinical centers even though industries often engage in such

comparisons as well. The only studies which go through a strict

quality regulatory framework are medical claim validation

studies for medical grade certification (eg, Food and Drug

Administration in the United States, medical CE [Conformité

Européenne] marking in Europe) [2,3]. As a consequence, many

nonmedical devices are released on the market without rigorous

validation.

In Europe, the choice of how to position a device is the

responsibility of the manufacturer, whereas in the United States,

this decision can be overruled if the device is perceived to have

potential health risks for the user [4]. Because manufacturers

can decide whether or not they wish to comply with medical

certification regulations, this inevitably leads to heterogeneity

in what validations are done. In our view, the lack of stringent

regulations for the release of nonmedical heart rate monitoring

devices should not justify the lack of standard requirements for

validating this technology. The adoption of such technology by

health care professionals could be hampered by their liability

in case of adverse events when using commercially available

nonmedical devices. The authors of this viewpoint agree with

Quinn [4], who suggests “a more pragmatic, risk-based

approach,” which takes a case-by-case look at commercial

solutions that may or may not meet the standards required of

medical devices. This approach should be applied to promote

technology adoption and at the same time safeguard the safety

of end-users. Here, we give an overview of clinical applications

exploiting wearable heart rate monitors.

In a Research Letter recently published in JAMA [5], the

performance of several commercially available, wrist-worn

photoplethysmography (PPG)-based heart rate monitors was

reported. The authors concluded that PPG-based monitoring

was not suitable “when accurate measurement of heart rate is

imperative.” The authors of that Research Letter acknowledged

their report had limitations, including testing only 1 type of

activity (treadmill), only in healthy people, and noncontinuous

monitoring. Many other studies have been published validating

wrist-worn PPG-based heart rate monitoring devices [6-14] but

fail to show consensus in favor of or against the accuracy of

this sensing technology.

The authors believe that the reason why many validations did

not provide conclusive evidence of the validity of wrist-worn

PPG-based heart rate monitoring devices is mostly

methodological. Studies conducted by teams with a biomedical

engineering background are more concerned with addressing

problems like signal synchronization and averaging, while

research teams with a sports medicine background are more

concerned with target groups and exercise protocols. Moreover,

clinicians are primarily interested in apps related to

telemonitoring, in-hospital or remote. Each approach has its

methodological shortcomings. The aim of this viewpoint is to

suggest a more consistent and robust approach to validating

monitoring technologies.

When validating heart rate monitoring devices, it is sensible to

follow a common definition of accuracy. The American National

Standards Institute standard for cardiac monitors, heart rate

meters, and alarms defines accuracy as a “readout error of no

greater than ±10% of the input rate or ±5 bpm, whichever is

greater” [15]. Once accurate heart rate is defined, it is also good

to agree on what to use as a gold standard. Electrocardiography

(ECG) is the accepted gold standard for heart rate monitoring.

Nevertheless, ECG, as with PPG, can be severely affected by

artifacts [16]. Yet it is generally accepted that PPG-based heart

rate monitoring suffers from inherent drawbacks (eg, more

difficult peak detection, higher sensitivity to motion artifacts)

compared to ECG-based monitoring [16].
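
The accuracy criterion quoted above can be applied to paired readings as in the following minimal sketch. It assumes that the "input rate" corresponds to the reference heart rate; the function names and the sample values are hypothetical and are not taken from the standard or from any of the cited studies.

```python
def within_ansi_tolerance(reference_bpm: float, measured_bpm: float) -> bool:
    """True if the readout error is within +/-10% of the reference rate
    or +/-5 bpm, whichever is greater (assumption: input rate = reference)."""
    allowed_error_bpm = max(0.10 * reference_bpm, 5.0)
    return abs(measured_bpm - reference_bpm) <= allowed_error_bpm


def fraction_within_tolerance(reference, measured) -> float:
    """Share of paired samples that satisfy the criterion."""
    pairs = list(zip(reference, measured))
    if not pairs:
        raise ValueError("no paired samples to compare")
    return sum(within_ansi_tolerance(r, m) for r, m in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical paired readings in bpm: reference first, device second.
    ref = [60, 60, 180, 180]
    dev = [64, 70, 190, 200]
    print(fraction_within_tolerance(ref, dev))  # 2 of the 4 pairs pass
```

Note that the allowed error grows with heart rate: at 60 bpm the tolerance is 6 bpm, whereas at 180 bpm it is 18 bpm.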

The validation strategy should consider the nature of data

provided by investigational devices (ID) and reference devices

(RD). Heart rate values are always derived from more complex

signals (eg, ECG, PPG). Thus, even when the ID and RD have

the same output rate (eg, 1 heart rate value per second) and these

outputs are well synchronized, the beats compared may not

belong to the same time intervals. The method used to extract

information from the raw data (eg, time domain or frequency

domain) and the averaging strategy (eg, interbeat intervals or

5-second periods) of the raw data will determine a specific time

lag for each heart rate value. Ideally, researchers should have

access to the raw data. This is often not possible, and it should

be acknowledged as a limitation.
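
When raw data are not available, one way to make the time lag between ID and RD visible is to search for the shift that best aligns the two heart rate series. The sketch below is only an illustration under stated assumptions: both series share the same output rate and length, the lag is constant over the recording, and Pearson correlation is used as the alignment score. This lag search is not a method described in this viewpoint, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat segment carries no alignment information
    return sxy / sqrt(sxx * syy)


def estimate_lag(reference, device, max_lag):
    """Return (lag, r) such that device[t + lag] best matches reference[t].

    A positive lag means the device output trails the reference.
    """
    best_lag, best_r = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            ref_part = reference[:len(reference) - lag]
            dev_part = device[lag:]
        else:
            ref_part = reference[-lag:]
            dev_part = device[:len(device) + lag]
        n = min(len(ref_part), len(dev_part))
        if n < 3:
            continue  # too little overlap to score this lag
        r = pearson(ref_part[:n], dev_part[:n])
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r
```

An estimated lag of several samples suggests that sample-by-sample differences partly reflect averaging or processing delay rather than measurement error, which is worth reporting alongside the accuracy results.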

Researchers should realize that their RD (often an ECG device)

will not always be accurate. Unless there is a quality check on

the validity of the ECG, a second reference device should be

used such as a second ECG-based sensor applied in a different

manner (eg, patch versus chest strap) and using a different

software algorithm for calculating heart rate. When the two RDs

fail to agree, no comparison should be made between RD and

ID outputs (Figure 2). As mentioned earlier, even the RD (for

example ECG patch or ECG strap) in certain circumstances may

suffer from inaccuracy due to artifacts (eg, motion artifacts).

Based on our own experience in testing hundreds of subjects,

we realized that ECG patches perform particularly badly when

the skin under the electrodes is stretched or excessively wet.

ECG straps perform rather poorly when the skin gets too dry or

the strap loosens, and with certain anatomical shapes (eg, pectus

excavatum). These problems must be reported by the researcher.
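
The rule that no comparison should be made when the two RDs disagree can be sketched as a simple filter applied before any accuracy statistic is computed. This is a minimal sketch: the 5 bpm disagreement threshold, the use of the first reference as the comparison value, and all names are assumptions made for illustration.

```python
def gate_by_reference_agreement(ref_a, ref_b, device, max_disagreement_bpm=5.0):
    """Keep (reference, device) pairs only where the two references agree.

    Returns the retained pairs and the number of rejected samples, so that
    the amount of discarded data can be reported.
    """
    kept = []
    rejected = 0
    for a, b, d in zip(ref_a, ref_b, device):
        if abs(a - b) <= max_disagreement_bpm:
            kept.append((a, d))  # assumption: first reference is the comparator
        else:
            rejected += 1
    return kept, rejected


if __name__ == "__main__":
    # Hypothetical synchronized samples in bpm.
    patch = [60, 61, 100, 62]
    strap = [61, 60, 70, 62]
    watch = [59, 63, 98, 62]
    pairs, dropped = gate_by_reference_agreement(patch, strap, watch)
    print(pairs, dropped)
```

Reporting the rejected count together with the results shows how often the reference itself was in doubt.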

Figure 2. Correlation between 3 heart rate (HR) monitoring devices and the electrocardiography (ECG) reference. When the 2 chest straps and the

wrist-worn photoplethysmography (PPG) heart rate monitors consistently disagree with the reference, their points depart from the 45-degree line in the

same way.

Figure 3. Segment of heart rate (HR) recordings by 3 devices: electrocardiography (ECG) reference, chest strap, and photoplethysmography (PPG)

watch. The red circles represent the instants when heart rate from those devices would be collected if these were a value per minute observation. It is

evident how these values do not represent the actual second by second or even the average agreement among the 3 devices.

The accuracy of the observation method should be robust (ie,

repeatable and reproducible). In some validation studies, heart

rate was logged manually after visually consulting the display

of both ID and RD [5,7]. This method carries several limitations

including human data entry errors and failure to report precisely

simultaneous values from multiple devices. This method also

limits the observation rate to, for instance, 1 value per minute

[5,6]. Taking 1 value per minute is not the same as taking an

averaged value over a minute, and both approaches fail to

capitalize on the information derived from the rates of change

in heart rate and heart rate variability and assume that participants

are in a steady-state condition. Researchers should choose the

observation rate (eg, 1 or 5 values per second) and averaging

strategy (eg, 5- or 30-second windows) according to the use

case foreseen for the heart rate monitor. Yet researchers need

to be aware that taking, or averaging, 1 value every minute will

hide variability [17]. This is evident in Figure 3, which illustrates

that 1 single time point (red circles) is not necessarily

representative of the entire minute. Consequently, for the

purpose of testing accuracy, even when a mean heart rate value

per minute would be sufficient, accuracy should be evaluated

at the highest resolution possible.
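
The difference between taking 1 value per minute, averaging over a minute, and comparing at full resolution can be illustrated with synthetic data. In the sketch below, the 1 value per second output rate, the made-up device errors, and the function names are all assumptions chosen to make the effect visible; they do not come from any recording.

```python
def mean_absolute_error(reference, device):
    """Mean absolute difference between paired samples, in bpm."""
    pairs = list(zip(reference, device))
    return sum(abs(d - r) for r, d in pairs) / len(pairs)


def one_value_per_minute(series, samples_per_minute=60):
    """Keep only the first sample of each minute (a single time point)."""
    return series[::samples_per_minute]


def minute_means(series, samples_per_minute=60):
    """Average all samples within each minute."""
    chunks = [series[i:i + samples_per_minute]
              for i in range(0, len(series), samples_per_minute)]
    return [sum(c) / len(c) for c in chunks]


if __name__ == "__main__":
    # Synthetic 2-minute recording at 1 value per second.
    reference = [100.0] * 120
    # The device matches the reference exactly on the minute and is off by
    # 10 bpm (alternating sign) at every other second.
    device = [100.0 if t % 60 == 0 else (90.0 if t % 2 else 110.0)
              for t in range(120)]

    print(mean_absolute_error(one_value_per_minute(reference),
                              one_value_per_minute(device)))
    print(mean_absolute_error(minute_means(reference), minute_means(device)))
    print(mean_absolute_error(reference, device))
```

In this constructed case, the per-minute point samples show no error at all and the minute means show an error well below 1 bpm, while the second-by-second comparison reveals an error close to 10 bpm.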

We also observed a lack of uniformity in the statistical analyses

employed in validation studies. Pearson correlations and Student

t tests are inadequate for testing agreement [18]. This is because

the Pearson correlation coefficient is not sensitive to systematic

deviations from the 45-degree line, failing to reject agreement

when these deviations occur. The Student ttest is inadequate

in rejecting agreement when means are equal but the 2 measures

do not correlate with each other, and it can reject agreement

when a very small systematic residual error shifts 1 of the means

[19]. Moreover, the t test assesses difference, which implies

that failing to reject the null hypothesis (ie, means are equal)

does not prove that the 2 means are equivalent. Concordance

correlation coefficients should be reported instead [18,19]. Also,

limits of agreement analyses should be accompanied by typical

error calculations [20]. Equivalence testing should be used when

the alternative hypothesis is that the outputs of 2 devices are

the same [21]. In equivalence testing, the null hypothesis is that

the differences between the means are outside the equivalence

limits.
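
The statistical points above can be made concrete with a short sketch. It computes the Pearson correlation and Lin's concordance correlation coefficient for a device with a constant offset, and it runs a paired equivalence test as two one-sided tests. Several choices are assumptions made for illustration: the equivalence limit of 5 bpm, the normal approximation used in place of the t distribution (reasonable only for large samples), the synthetic data, and all function names.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def pearson_r(x, y):
    """Pearson correlation: insensitive to a constant offset between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient, using 1/n moments.

    The squared difference of the means in the denominator penalizes any
    systematic departure from the 45-degree line.
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


def tost_paired_p(reference, device, limit_bpm):
    """P value of two one-sided tests on the paired differences.

    Null hypothesis: the mean difference lies outside +/-limit_bpm.
    Assumes a large sample (normal approximation) and differences that are
    not all identical (nonzero standard deviation).
    """
    diffs = [d - r for r, d in zip(reference, device)]
    se = stdev(diffs) / sqrt(len(diffs))
    z = NormalDist()
    p_lower = 1.0 - z.cdf((mean(diffs) + limit_bpm) / se)  # H0: mean <= -limit
    p_upper = z.cdf((mean(diffs) - limit_bpm) / se)        # H0: mean >= +limit
    return max(p_lower, p_upper)


if __name__ == "__main__":
    ref = [60, 80, 100, 120, 140]
    biased = [v + 10 for v in ref]  # constant 10 bpm overestimation
    print(pearson_r(ref, biased), concordance_ccc(ref, biased))
```

For the offset device, the Pearson coefficient is 1 although every reading is 10 bpm too high, whereas the concordance coefficient drops below 1. In the equivalence test, a small P value rejects the null hypothesis that the mean difference lies outside the limits; a large P value means equivalence has not been shown.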

Finally, there are some practical considerations. The

investigators should test the technology in the population of

interest and in a setting appropriate for intended use.

Measurements taken at rest or in the period after exercise cannot

be considered to validate measurements done during exercise.

Results gathered on healthy individuals with no abnormal heart

rhythm are inappropriate for applications aimed at patients with

cardiovascular disease where the burden of arrhythmias will be

substantially higher. Additionally, due to the effect that the

contact of the sensor with the skin and the environmental

conditions can have on the PPG signal, information such as

sensor placement, strap tightness, skin type, temperature, and

possibly light intensity should be reported.

Although many studies have been published to assess the

validity and usability of wrist-worn PPG-based heart rate

monitoring, their methodological differences and shortcomings

hamper research into their clinical utility and their introduction

into health care. Such devices could make an important

contribution to the future of mobile health and, in our view,

should be rigorously evaluated as outlined above. For the reasons

discussed in this viewpoint, we advocate standard requirements

generally accepted by both the scientific community and the

device industries in order to provide a fair and consistent

validation of new wearable sensor technology.

Acknowledgments

The authors would like to thank Dr Helma de Morree for reviewing the first draft of the manuscript.

Conflicts of Interest

FS and LGEC work for Royal Philips Electronics. GP is funded by Stichting voor de Technische Wetenschappen/Instituut voor

Innovatie door Wetenschap en Technologie in the context of the Obstructive Sleep Apnea+ project (No 14619). JC has no conflicts

of interest.

References

1. Sperlich B, Holmberg H. Wearable, yes, but able...? it is time for evidence-based marketing claims!. Br J Sports Med 2016

Dec 16;51(16):1240 [FREE Full text] [doi: 10.1136/bjsports-2016-097295] [Medline: 27986762]

2. Regulation (EU) 2017/745 of the European Parliament and of the Council of 5 April 2017 on medical devices, amending

Directive 2001/83/EC, Regulation (EC) No 178/2002 and Regulation (EC) No 1223/2009 and repealing Council Directives

90/385/EEC and 93/42/EEC (Text with EEA relevance).: Official Journal of the European Union; 2017 May 05. URL:

https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32017R0745&from=EN [accessed 2018-06-08]

3. US Code of Federal Regulations, Title 21: Food and Drugs, Part 820—Quality System Regulation, Subpart C—Design

Controls. URL: https://www.ecfr.gov/cgi-bin/text-idx?SID=6f0a2b9ffb8a3ac8fe2619b2381bc725&mc=true&node=se21.8.820_130&rgn=div8 [accessed 2018-06-08]

4. Quinn P. The EU commission's risky choice for a non-risk based strategy on assessment of medical devices. Comput Law

Security Rev 2017 Jun;33(3):361-370. [doi: 10.1016/j.clsr.2017.03.019]

5. Wang R, Blackburn G, Desai M, Phelan D, Gillinov L, Houghtaling P, et al. Accuracy of wrist-worn heart rate monitors.

JAMA Cardiol 2016 Oct 12;2(1):104-106. [doi: 10.1001/jamacardio.2016.3340] [Medline: 27732703]

6. Cadmus-Bertram L, Gangnon R, Wirkus EJ, Thraen-Borowski KM, Gorzelitz-Liebhauser J. The accuracy of heart rate

monitoring by some wrist-worn activity trackers. Ann Intern Med 2017 Apr 18;166(8):610-612. [doi: 10.7326/L16-0353]

[Medline: 28395305]

7. Gillinov S, Etiwy M, Wang R, Blackburn G, Phelan D, Gillinov AM, et al. Variable accuracy of wearable heart rate monitors

during aerobic exercise. Med Sci Sports Exerc 2017 Aug;49(8):1697-1703. [doi: 10.1249/MSS.0000000000001284]

[Medline: 28709155]

8. Kroll RR, Boyd JG, Maslove DM. Accuracy of a wrist-worn wearable device for monitoring heart rates in hospital inpatients:

a prospective observational study. J Med Internet Res 2016 Sep 20;18(9):e253 [FREE Full text] [doi: 10.2196/jmir.6025]

[Medline: 27651304]

9. Wallen MP, Gomersall SR, Keating SE, Wisløff U, Coombes JS. Accuracy of heart rate watches: implications for weight

management. PLoS One 2016;11(5):e0154420 [FREE Full text] [doi: 10.1371/journal.pone.0154420] [Medline: 27232714]

10. Delgado-Gonzalo R, Parak J, Tarniceriu A, Renevey P, Bertschi M, Korhonen I. Evaluation of accuracy and reliability of

PulseOn optical heart rate monitoring device. Conf Proc IEEE Eng Med Biol Soc 2015 Aug;2015:430-433. [doi:

10.1109/EMBC.2015.7318391] [Medline: 26736291]

11. Parak J, Korhonen I. Evaluation of wearable consumer heart rate monitors based on photoplethysmography. Conf Proc

IEEE Eng Med Biol Soc 2014;2014:3670-3673. [doi: 10.1109/EMBC.2014.6944419] [Medline: 25570787]

12. Shcherbina A, Mattsson CM, Waggott D, Salisbury H, Christle JW, Hastie T, et al. Accuracy in wrist-worn, sensor-based

measurements of heart rate and energy expenditure in a diverse cohort. J Pers Med 2017 May 24;7(2):1 [FREE Full text]

[doi: 10.3390/jpm7020003] [Medline: 28538708]

13. Spierer DK, Rosen Z, Litman LL, Fujii K. Validation of photoplethysmography as a method to detect heart rate during rest

and exercise. J Med Eng Technol 2015;39(5):264-271. [doi: 10.3109/03091902.2015.1047536] [Medline: 26112379]

14. Valenti G, Westerterp K. Optical heart rate monitoring module validation study. 2013 Presented at: IEEE International

Conference on Consumer Electronics; 2013; Las Vegas p. 195-196. [doi: 10.1109/ICCE.2013.6486856]

15. ANSI/AAMI. Cardiac Monitors, Heart Rate Meters, and Alarms. Arlington: American National Standards Institute, Inc;

2002.

16. Lang M. Beyond Fitbit: a critical appraisal of optical heart rate monitoring wearables and apps, their current limitations

and legal implications. Albany Law J Sci Technol 2017;28(1):39-72.

17. Guidance for Industry, Investigating Out-of-Specification (OOS), Test Results for Pharmaceutical Production.: Department

of Health and Human Services, Center for Drug Evaluation and Research (CDER); 2006. URL: https://www.fda.gov/downloads/drugs/guidances/ucm070287.pdf [accessed 2018-06-08] [WebCite Cache ID 701iuEKAU]

18. Schäfer A, Vagedes J. How accurate is pulse rate variability as an estimate of heart rate variability? A review on studies

comparing photoplethysmographic technology with an electrocardiogram. Int J Cardiol 2013 Jun 05;166(1):15-29. [doi:

10.1016/j.ijcard.2012.03.119] [Medline: 22809539]

19. Lin LI. A concordance correlation coefficient to evaluate reproducibility. Biometrics 1989 Mar;45(1):255-268. [Medline:

2720055]

20. Hopkins WG. Measures of reliability in sports medicine and science. Sports Med 2000 Jul;30(1):1-15. [Medline: 10907753]

21. Lakens D. Equivalence tests: a practical primer for t tests, correlations, and meta-analyses. Soc Psychol Personal Sci 2017

May;8(4):355-362 [FREE Full text] [doi: 10.1177/1948550617697177] [Medline: 28736600]

Abbreviations

ECG: electrocardiography

ID: investigational device

PPG: photoplethysmography

RD: reference device

Edited by G Eysenbach; submitted 13.02.18; peer-reviewed by M Lang, P Wark; comments to author 22.03.18; revised version received

16.05.18; accepted 29.05.18; published 02.07.18

Please cite as:

Sartor F, Papini G, Cox LGE, Cleland J

Methodological Shortcomings of Wrist-Worn Heart Rate Monitors Validations

J Med Internet Res 2018;20(7):e10108

URL: http://www.jmir.org/2018/7/e10108/

doi:10.2196/10108

PMID:

©Francesco Sartor, Gabriele Papini, Lieke Gertruda Elisabeth Cox, John Cleland. Originally published in the Journal of Medical

Internet Research (http://www.jmir.org), 02.07.2018. This is an open-access article distributed under the terms of the Creative

Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and

reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly

cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright

and license information must be included.

J Med Internet Res 2018 | vol. 20 | iss. 7 | e10108 | p.6http://www.jmir.org/2018/7/e10108/

(page number not for citation purposes)

Sartor et alJOURNAL OF MEDICAL INTERNET RESEARCH

XSL

•

RenderX

Photoplethysmograhic sensors, potential and limitations: Is it time for regulation? A comprehensive review

Article

Jun 2023

Assessment of Samsung Galaxy Watch4 PPG-Based Heart Rate During Light-to-Vigorous Physical Activities

Article

Jun 2024

Abstract—There has been lately a notable increase in the adoption of smartwatches and fitness trackers. These wearables have gained popularity by monitoring vital health parameters, such as heart rate (HR), employing a low-cost and noninvasive technique known as photoplethysmography (PPG). Despite its ease of continuous monitoring the cardiovascular system during daily activities, PPG signal is susceptible to noise, mainly caused by user movement during monitoring, particularly in activities involving vigorous body motion. Therefore, it is crucial that PPG-based devices undergo scientific validation to ensure their reliability and accuracy in real-world conditions. In this study, we conducted an assessment of the smartwatch Samsung Galaxy Watch4 (SGW4), comparing its performance with the Polar H10, an ECG device acknowledged as the reference for HR measurement during intense activities with intense body movements. Our research protocol involved 14 participants engaged in activities that induced a range of HR variations, covering light, moderate and vigorous intensities, including stationary cycling and treadmill workouts. We employed five metrics to compare HR measurements obtained from the SGW4 with those from the reference device. Additionally, a Bland-Altman analysis was conducted to evaluate the agreement between the two devices. The results of our study indicate that the SGW4 is well-suited for accurately measuring HR during usual exercises, spanning from low to high-intensity physical activities, and considering scenarios in which the SGW4 was either nearly stationary or in free movement.

WaveFlex Sensor: Advancing Wearable Cardiorespiratory Monitoring With Flexible Wave-Shaped Polymer Optical Fiber

Article

Sep 2023
IEEE J SEL TOP QUANT

In this work, we propose a wearable cardiorespiratory monitoring sensor, the WaveFlex sensor, which incorporates a flexible wave-shaped polymer optical fiber. The stretchable polydimethylsiloxane (PDMS) material maintains the wave-shaped structure of the polymethyl methacrylate (PMMA) optical fiber, connected to the inelastic fabric on each side to form the sensor. The WaveFlex sensor enables accurate cardiorespiratory monitoring, tested on ten subjects in standing, sitting, and lying positions. Signal processing techniques such as power spectrum analysis and finite impulse response (FIR) theory extract respiration rate (RR), heart rate (HR), respiratory, and heartbeat waveforms. The sensor exhibits high accuracy, with errors of less than 9% for RR and less than 3% for HR. The proposed WaveFlex sensor also allows for the timely monitoring of apnea through frequency and time domain analysis, offering a simple and cost-effective optical solution for wearable health applications.

A Sliding Scale Signal Quality Metric of Photoplethysmography Applicable to Measuring Heart Rate across Clinical Contexts with Chest Mounting as a Case Study

Article

Full-text available

Mar 2023
SENSORS-BASEL

Unlabelled: Photoplethysmography (PPG) signal quality as a proxy for accuracy in heart rate (HR) measurement is useful in various public health contexts, ranging from short-term clinical diagnostics to free-living health behavior surveillance studies that inform public health policy. Each context has a different tolerance for acceptable signal quality, and it is reductive to expect a single threshold to meet the needs across all contexts. In this study, we propose two different metrics as sliding scales of PPG signal quality and assess their association with accuracy of HR measures compared to a ground truth electrocardiogram (ECG) measurement. Methods: We used two publicly available PPG datasets (BUT PPG and Troika) to test if our signal quality metrics could identify poor signal quality compared to gold standard visual inspection. To aid interpretation of the sliding scale metrics, we used ROC curves and Kappa values to calculate guideline cut points and evaluate agreement, respectively. We then used the Troika dataset and an original dataset of PPG data collected from the chest to examine the association between continuous metrics of signal quality and HR accuracy. PPG-based HR estimates were compared with reference HR estimates using the mean absolute error (MAE) and the root-mean-square error (RMSE). Point biserial correlations were used to examine the association between binary signal quality and HR error metrics (MAE and RMSE). Results: ROC analysis from the BUT PPG data revealed that the AUC was 0.758 (95% CI 0.624 to 0.892) for signal quality metrics of STD-width and 0.741 (95% CI 0.589 to 0.883) for self-consistency. There was a significant correlation between criterion poor signal quality and signal quality metrics in both Troika and originally collected data. Signal quality was highly correlated with HR accuracy (MAE and RMSE, respectively) between PPG and ground truth ECG. 
Conclusion: This proof-of-concept work demonstrates an effective approach for assessing signal quality and demonstrates the effect of poor signal quality on HR measurement. Our continuous signal quality metrics allow estimations of uncertainties in other emergent metrics, such as energy expenditure that relies on multiple independent biometrics. This open-source approach increases the availability and applicability of our work in public health settings.

Fitbit Data Show Poor Correlation with Measures of Activity and Sleep among Hospitalized General Medicine Patients: A Prospective Cohort Study

Article

Full-text available

Nov 2022

Background: Wearable devices could provide important insights about hospitalized patients, including data on variations in heart rate, low activity, and poor sleep. Objective: To determine the accuracy of Fitbit heart rate, sleep, and physical activity measurements in patients hospitalized in a general medical ward. Methods: We conducted a prospective study enrolling 50 inpatients and providing them with a Fitbit Charge. Our main measures were Fitbit heart rate, activity, and sleep, as well as nurse-recorded heart rate, nurse assessments of activity, and patient-reported sleep. Results: Comparing heart rate data, the mean difference was 0.45 beats per minute (Pearson correlation: 0.68, P < 0.001). The correlation between nurses' recorded activity and Fitbit daily steps was 0.06 (P = 0.52). The association between patient-reported sleep score and Fitbit total sleep duration was 0.19 (P = 0.24). Conclusions: Fitbit heart rate appeared to correlate well with nurse-recorded heart rate, but Fitbit measurements of activity and sleep did not correlate well with the corresponding assessments.

Continuous Blood Pressure Estimation in Wearable Devices Using Photoplethysmography: A Review

Article

Oct 2022

Cardiovascular diseases (CVD) are among those with the highest mortality rates, and various wearable devices for continuous monitoring are emerging as a complement to medical procedures. Blood pressure (BP) monitoring in wearable devices, in order to be continuous, must be performed noninvasively, thus involving photoplethysmography (PPG), a technology that has been widely studied in recent years as a noninvasive solution for BP estimation. However, continuous data acquisition in a wearable system is still a challenge, for reasons that include noise caused by movement, correct use of the PPG signal, and the choice of estimation method. This paper reviews the advances in blood pressure estimation based on photoplethysmography, focusing on the analysis of signal preprocessing (ICA, FIR, adaptive filters). Among the filters reviewed, the most suitable for dealing with motion artifacts (MA) in a wearable system are adaptive filters, because conventional filters work only in the band for which they are designed, which does not always cover the spectrum of the MA. A review of the estimation methods is also carried out; among them, machine learning stands out, showing the greatest growth owing to new proposals that use more signals and obtain better accuracy. The objective is to identify and analyze the appropriate preprocessing filters and estimation methods from the perspective of wearable systems using PPG sensors affected by MA. Keywords: Blood Pressure Estimation, PAT, PTT, Machine Learning, Photoplethysmography, adaptive filtering.

Heart Rate Monitoring Using Infrared Imaging

Conference Paper

Oct 2023

Heart rate responses, agreement and accuracy among persons with severe disabilities participating in the indirect movement program: Team Twin—an observational study

Article

Oct 2023

Introduction: Heart rate (HR) monitors are rarely used by people living with disabilities (PLWD), and their accuracy is undocumented. Thus, this study aims to describe the HR response during the Team Twin co-running program and, secondly, to assess the agreement and accuracy of using HR monitors among PLWD. Methods: This 16-week single-arm observational study included 18 people with various disabilities. During the study, the subjects wore a Garmin Vivosmart 4 watch (wrist). To evaluate the agreement and accuracy, we applied Garmin's HRM-DUAL™ chest-worn HR monitors for comparison with the Vivosmart 4. The HR response analysis was performed descriptively and with a mixed regression model. The HR agreement and accuracy procedure was conducted on a subsample of five subjects and analyzed using Lin's concordance analysis, Bland and Altman's limits of agreement, and Cohen's kappa analysis of intensity zone agreement. This study was prospectively registered at ClinicalTrials.gov (NCT04536779). Results: The subjects had a mean age of 35 (±12.6), 61% were male, and 72% had cerebral palsy, where 85% had GMFCS V-IV. HR was monitored for 202:10:33 (HH:MM:SS), with a mean HR of 90 ± 17 bpm during training and race. A total of 19% of the time was spent in intensity zones between light and moderate (30%–59% HR reserve) and 1% in vigorous (60%–84% HR reserve). The remaining 80% was in the very light intensity zone (<29% HR reserve). HR was highest at the start of race and training and steadily decreased. Inter-rater agreement was high (k = 0.75), limits of agreement were between −16 and 13 bpm, and accuracy was acceptable (Rc = 0.86). Conclusion: Disability type, individual, and contextual factors will likely affect HR responses and the agreement and accuracy for PLWD. The Vivosmart 4, while overall accurate, had low precision due to high variability in the estimation. These findings point to the methodological and practical difficulties of utilizing HR monitors to measure HR, and thus physical activity, in adapted sports activities for severely disabled individuals.

A Robust Metric of Heart Rate Signal Quality Using Chest Mounted Photoplethysmography

Article

Jan 2022

Measurement of Heart Rate Using the Withings ScanWatch Device During Free-living Activities: Validation Study

Article

Sep 2022

Background: Wrist-worn devices that incorporate photoplethysmography (PPG) sensing represent an exciting means of measuring heart rate (HR). A number of studies have evaluated the accuracy of HR measurements produced by these devices in controlled laboratory environments. However, it is also important to establish the accuracy of measurements produced by these devices outside the laboratory, in real-world, consumer use conditions. Objective: This study sought to examine the accuracy of HR measurements produced by the Withings ScanWatch during free-living activities. Methods: A sample of convenience of 7 participants volunteered (3 male and 4 female; mean age 64, SD 10 years; mean height 164, SD 4 cm; mean weight 77, SD 16 kg) to take part in this real-world validation study. Participants were instructed to wear the ScanWatch for a 12-hour period on their nondominant wrist as they went about their day-to-day activities. A Polar H10 heart rate sensor was used as the criterion measure of HR. Participants used a study diary to document activities undertaken during the 12-hour study period. These activities were classified according to the 11 following domains: desk work, eat or drink, exercise, gardening, household activities, self-care, shopping, sitting, sleep, travel, and walking. Validity was assessed using the Bland-Altman analysis, concordance correlation coefficient (CCC), and mean absolute percentage error (MAPE). Results: Across all activity domains, the ScanWatch measured HR with MAPE values

Beyond Fitbit: A Critical Appraisal of Optical Heart Rate Monitoring Wearables and Apps, Their Current Limitations and Legal Implications

Article

Dec 2017

Michael Lang

Fitness and health-care-oriented wearables and apps have been around for a couple of years and are still gaining momentum. Over time, they have begun to harness considerable computational power and to incorporate increasingly sophisticated sensors, eventually resulting in a blurring of the lines between consumer electronics and medical devices. While their benefits and potentials are undisputed, the overly optimistic appraisal commonly encountered in both mass media and academic literature does not adequately reflect unsolved problems and inherent limitations of these devices. This Article will argue that while these issues have long been known to the engineering community, their relevance and legal implications appear to have been grossly underestimated. January 2016 marked a turning point, as news of two class-action lawsuits filed against major manufacturer Fitbit brought widespread attention to accuracy, reliability, and safety concerns regarding these devices. This Article will provide a concise overview of optical heart rate monitoring technology, the current state of the art, and research trends. It will be argued that under real-world scenarios these apps and devices are currently inherently inaccurate and unreliable, with even greater problems on the horizon as the industry shifts towards areas such as heart rate variability monitoring or the detection of cardiac arrhythmias. Available at http://heinonline.org/HOL/P?h=hein.journals/albnyst28&i=45

Accuracy in Wrist-Worn, Sensor-Based Measurements of Heart Rate and Energy Expenditure in a Diverse Cohort

Article

May 2017

The ability to measure physical activity through wrist-worn devices provides an opportunity for cardiovascular medicine. However, the accuracy of commercial devices is largely unknown. The aim of this work is to assess the accuracy of seven commercially available wrist-worn devices in estimating heart rate (HR) and energy expenditure (EE) and to propose a wearable sensor evaluation framework. We evaluated the Apple Watch, Basis Peak, Fitbit Surge, Microsoft Band, Mio Alpha 2, PulseOn, and Samsung Gear S2. Participants wore devices while being simultaneously assessed with continuous telemetry and indirect calorimetry while sitting, walking, running, and cycling. Sixty volunteers (29 male, 31 female, age 38 ± 11 years) of diverse age, height, weight, skin tone, and fitness level were selected. Error in HR and EE was computed for each subject/device/activity combination. Devices reported the lowest error for cycling and the highest for walking. Device error was higher for males, greater body mass index, darker skin tone, and walking. Six of the devices achieved a median error for HR below 5% during cycling. No device achieved an error in EE below 20 percent. The Apple Watch achieved the lowest overall error in both HR and EE, while the Samsung Gear S2 reported the highest. In conclusion, most wrist-worn devices adequately measure HR in laboratory-based activities, but poorly estimate EE, suggesting caution in the use of EE measurements as part of health improvement programs. We propose reference standards for the validation of consumer health devices (http://precision.stanford.edu/).
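The per-combination error summary described in this abstract can be illustrated with a minimal sketch. It assumes that "error" means the absolute difference from the reference expressed as a percentage of the reference value, which the abstract does not spell out; the function name `median_percent_error` and the record layout are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import median


def median_percent_error(records):
    """Median absolute percentage error per (device, activity) pair.

    records: iterable of (device, activity, measured_hr, reference_hr).
    The error definition below is an assumption made for illustration.
    """
    grouped = defaultdict(list)
    for device, activity, measured, reference in records:
        error = abs(measured - reference) / reference * 100.0
        grouped[(device, activity)].append(error)
    return {key: median(errors) for key, errors in grouped.items()}


# Made-up readings for a hypothetical device "A".
records = [
    ("A", "cycling", 98, 100),
    ("A", "cycling", 104, 100),
    ("A", "cycling", 100, 100),
    ("A", "walking", 90, 100),
]
summary = median_percent_error(records)
```

Grouping before summarizing keeps a device that does well in one activity from hiding poor performance in another, which is the pattern the abstract reports for cycling versus walking.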

Equivalence Tests: A Practical Primer for t Tests, Correlations, and Meta-Analyses

Article

May 2017

Daniël Lakens

Scientists should be able to provide support for the absence of a meaningful effect. Currently, researchers often incorrectly conclude an effect is absent based on a nonsignificant result. A widely recommended approach within a frequentist framework is to test for equivalence. In equivalence tests, such as the two one-sided tests (TOST) procedure discussed in this article, an upper and lower equivalence bound is specified based on the smallest effect size of interest. The TOST procedure can be used to statistically reject the presence of effects large enough to be considered worthwhile. This practical primer with accompanying spreadsheet and R package enables psychologists to easily perform equivalence tests (and power analyses) by setting equivalence bounds based on standardized effect sizes and provides recommendations to prespecify equivalence bounds. Extending your statistical tool kit with equivalence tests is an easy way to improve your statistical and theoretical inferences.
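As a rough illustration of the TOST logic applied to a device validation, the sketch below tests whether the mean paired difference between a device and a reference lies within prespecified bounds. It is not the spreadsheet or R package mentioned above: the function name `tost_paired_z` is hypothetical, the bounds are in raw units (bpm) rather than standardized effect sizes, and a large-sample normal approximation is used in place of the t distribution so that only the Python standard library is needed.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def tost_paired_z(device, reference, low, high, alpha=0.05):
    """Two one-sided tests on paired differences (normal approximation).

    Returns (mean difference, TOST p-value, equivalence declared?).
    Assumes the differences are not all identical (non-zero SD).
    """
    diffs = [d - r for d, r in zip(device, reference)]
    m = mean(diffs)
    se = stdev(diffs) / sqrt(len(diffs))
    # Test 1: H0 "mean difference <= low" against "mean difference > low".
    p_low = 1.0 - NormalDist().cdf((m - low) / se)
    # Test 2: H0 "mean difference >= high" against "mean difference < high".
    p_high = NormalDist().cdf((m - high) / se)
    # Equivalence requires rejecting both, so the larger p-value decides.
    p = max(p_low, p_high)
    return m, p, p < alpha


# Made-up data: the device alternates 1 bpm above and below the reference.
reference = list(range(60, 100))
device = [r + (1 if i % 2 else -1) for i, r in enumerate(reference)]
m_wide, p_wide, ok_wide = tost_paired_z(device, reference, -5.0, 5.0)
m_tight, p_tight, ok_tight = tost_paired_z(device, reference, -0.1, 0.1)
```

With bounds of ±5 bpm the sketch declares equivalence, while with ±0.1 bpm it does not, even though the data are the same. This is why the bounds need to be fixed before looking at the data.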

The EU commission's risky choice for a non-risk based strategy on assessment of medical devices

Article

Apr 2017
Comput Law Secur Rep

Paul Quinn

Regulation of medical devices has been one of the most notable regulatory initiatives of the European Union. The need to ensure that medical devices are of a high quality is self-evident in nature. This is demonstrated by the lack of willingness of both healthcare institutions and professionals to use medical devices that have not properly been certified. In determining which devices are medical devices and should therefore meet the requirements of the regulatory framework, both the current and the proposed frameworks foresee a central place for the concept of ‘intended purpose’. This means that only those manufacturers that have explicitly stated that their device is to be used for a medical purpose should have to comply with the medical device framework. Unfortunately, however, this concept has become increasingly problematic given the rise in mHealth (mobile health) practices and ‘appification’ (shift to mobile devices) in particular, arguably posing potentially serious risks to human health in certain cases. This article discusses the problems that are created by the ever-increasing amount of ‘well-being’ apps and the fact that most will not be classed as medical devices. Despite apparently being aware of these problems, the EU Commission has opted to maintain its current approach in the newly proposed regulation, choosing not to employ other approaches as the FDA has for example done in opting to use a ‘risk based case-by-case approach’.

Variable Accuracy of Wearable Heart Rate Monitors during Aerobic Exercise

Article

Mar 2017

Purpose: Athletes and members of the public increasingly rely on wearable HR monitors to guide physical activity and training. The accuracy of newer, optically based monitors is unconfirmed. We sought to assess the accuracy of five optically based HR monitors during various types of aerobic exercise. Methods: Fifty healthy adult volunteers (mean ± SD age = 38 ± 12 yr, 54% female) completed exercise protocols on a treadmill, a stationary bicycle, and an elliptical trainer (±arm movement). Each participant underwent HR monitoring with an electrocardiographic chest strap monitor (Polar H7), forearm monitor (Scosche Rhythm+), and two randomly assigned wrist-worn HR monitors (Apple Watch, Fitbit Blaze, Garmin Forerunner 235, and TomTom Spark Cardio), one on each wrist. For each exercise type, HR was recorded at rest, light, moderate, and vigorous intensity. Agreement between HR measurements was assessed using Lin's concordance correlation coefficient (rc). Results: Across all exercise conditions, the chest strap monitor (Polar H7) had the best agreement with ECG (rc = 0.996) followed by the Apple Watch (rc = 0.92), the TomTom Spark (rc = 0.83), and the Garmin Forerunner (rc = 0.81). Scosche Rhythm+ and Fitbit Blaze were less accurate (rc = 0.75 and rc = 0.67, respectively). On treadmill, all devices performed well (rc = 0.88-0.93) except the Fitbit Blaze (rc = 0.76). While bicycling, only the Garmin, Apple Watch, and Scosche Rhythm+ had acceptable agreement (rc > 0.80). On the elliptical trainer without arm levers, only the Apple Watch was accurate (rc = 0.94). None of the devices was accurate during elliptical trainer use with arm levers (all rc < 0.80). Conclusion: The accuracy of wearable, optically based HR monitors varies with exercise type and is greatest on the treadmill and lowest on elliptical trainer. Electrode-containing chest monitors should be used when accurate HR measurement is imperative.
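For readers unfamiliar with the agreement statistic used here, the following is a minimal sketch of Lin's concordance correlation coefficient for paired readings. The function name `lins_ccc` and the sample values are illustrative assumptions, and the sketch uses the population (1/n) variance and covariance form; it is not the authors' analysis code.

```python
from statistics import mean


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for paired samples.

    Uses population (1/n) variances and covariance. Unlike Pearson's r,
    the squared mean difference in the denominator penalizes any
    systematic offset from the 45-degree line of identity.
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mx - my) ** 2)


# Made-up readings: the device reads exactly 10 bpm above the reference.
reference = [60, 80, 100, 120]
device = [r + 10 for r in reference]
ccc_offset = lins_ccc(device, reference)
ccc_identical = lins_ccc(reference, reference)
```

In this made-up case the two series are perfectly linearly related, so Pearson's r would be 1, yet the concordance coefficient falls to about 0.91 because of the constant 10 bpm offset.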

Wearable, yes, but able...?: It is time for evidence-based marketing claims!

Article

Dec 2016

With great interest, we¹ have been following the growing popularity of non-invasive wearable sensor technology as a way to increase physical performance, assist recovery or monitor health. These sensors, integrated into clothing worn on the body, are often referred to as ‘wearables’ or ‘wearable technology’. The popularity of the wearables is mainly due to three recent advances: (1) miniature sensor technology,¹ (2) telemetric transfer and (web-based) storage of personal data and (3) extension of battery life. According to a worldwide survey of fitness trends, wearable technology appears set to be the number 1 trend in 2017,² with expected sales for some wearables in the range of 1.5–2.6 billion US$.² We believe that this type of technology will be a central tool in the fitness and health industry, provided some fundamental issues …

Accuracy of Wrist-Worn Heart Rate Monitors

Article

Oct 2016

Wrist-worn fitness and heart rate (HR) monitors are popular.¹,² While the accuracy of chest strap, electrode-based HR monitors has been confirmed,³,⁴ the accuracy of wrist-worn, optically based HR monitors is uncertain.⁵,⁶ Assessment of the monitors’ accuracy is important for individuals who use them to guide their physical activity and for physicians to whom these individuals report HR readings. The objective of this study was to assess the accuracy of 4 popular wrist-worn HR monitors under conditions of varying physical exertion.

Accuracy of a Wrist-Worn Wearable Device for Monitoring Heart Rates in Hospital Inpatients: A Prospective Observational Study

Article

Sep 2016
J MED INTERNET RES

Background: As the sensing capabilities of wearable devices improve, there is increasing interest in their application in medical settings. Capabilities such as heart rate monitoring may be useful in hospitalized patients as a means of enhancing routine monitoring or as part of an early warning system to detect clinical deterioration. Objective: To evaluate the accuracy of heart rate monitoring by a personal fitness tracker (PFT) among hospital inpatients. Methods: We conducted a prospective observational study of 50 stable patients in the intensive care unit who each completed 24 hours of heart rate monitoring using a wrist-worn PFT. Accuracy of heart rate recordings was compared with gold standard measurements derived from continuous electrocardiographic (cECG) monitoring. The accuracy of heart rates measured by pulse oximetry (Spo2.R) was also measured as a positive control. Results: On a per-patient basis, PFT-derived heart rate values were slightly lower than those derived from cECG monitoring (average bias of -1.14 beats per minute [bpm], with limits of agreement of 24 bpm). By comparison, Spo2.R recordings produced more accurate values (average bias of +0.15 bpm, limits of agreement of 13 bpm, P<.001 as compared with PFT). Personal fitness tracker device performance was significantly better in patients in sinus rhythm than in those who were not (average bias -0.99 bpm vs -5.02 bpm, P=.02). Conclusions: Personal fitness tracker-derived heart rates were slightly lower than those derived from cECG monitoring in real-world testing and not as accurate as Spo2.R-derived heart rates. Performance was worse among patients who were not in sinus rhythm. Further clinical evaluation is indicated to see if PFTs can augment early warning systems in hospitals. Trial registration: ClinicalTrials.gov NCT02527408; https://clinicaltrials.gov/ct2/show/NCT02527408 (Archived by WebCite at http://www.webcitation.org/6kOFez3on).
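The bias and limits of agreement reported in this abstract come from a Bland-Altman style analysis. The sketch below shows the basic calculation under simplifying assumptions: each pair is treated as an independent observation, and the limits are taken as bias ± 1.96 SD of the differences. It does not reproduce the per-patient analysis of the study, and the function name `bland_altman` and sample values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(device, reference):
    """Bias and 95% limits of agreement for paired measurements.

    Returns (bias, lower limit, upper limit), where the limits are
    bias -/+ 1.96 times the sample SD of the paired differences.
    Assumes independent pairs; repeated measures per subject would
    need an adjusted method.
    """
    diffs = [d - r for d, r in zip(device, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up readings in bpm.
device = [71, 69, 82, 78]
reference = [70, 70, 80, 80]
bias, lower, upper = bland_altman(device, reference)
```

A bias near zero with wide limits, as in this toy example, describes a device that is right on average but unreliable for any single reading, so both numbers need to be reported together.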

Accuracy of Heart Rate Monitoring by Some Wrist-Worn Activity Trackers

Article