Diabetes Disease through Machine Learning: A comparative study

Gonçalo Marques
Instituto de Telecomunicações,
Universidade da Beira Interior,
Covilhã, Portugal
Ivan Miguel Pires
Instituto de Telecomunicações,
Universidade Da Beira Interior,
Covilhã, Portugal, Computer Science
Department, Polytechnic Institute of
Viseu, Viseu, Portugal, and UICISA:E
Research Centre, Polytechnic Institute
of Viseu, Viseu, Portugal
Nuno M. Garcia
Instituto de Telecomunicações,
Universidade da Beira Interior,
Covilhã, Portugal
ABSTRACT
Diabetes is a critical problem in developed and developing countries.
The early detection of this disease is crucial for efficient and
effective treatment. Moreover, the application of machine learning
for disease detection is a trending topic. There are numerous
machine learning methods available in the literature. The main
contribution of this paper is to present a preliminary study on the
application of machine learning methods on a public and widely
used diabetes dataset. The authors have applied eight different
machine learning techniques using the PIMA diabetes dataset. The data
have been normalized, and Neural Networks, SGD, Random Forest,
kNN, Naïve Bayes, AdaBoost, Decision Tree and SVM methods have
been applied. First, the techniques have been validated using
stratified 10-fold cross-validation. Second, the confusion matrix has been
extracted for each method, and the accuracy, recall, precision and
F1-score have been calculated. The three methods with the highest
accuracy are Neural Networks, SGD and kNN. These methods report
77.47%, 76.43% and 73.96% of average accuracy between classes.
CCS CONCEPTS
Computing methodologies; Machine Learning; Machine Learning Approaches.
KEYWORDS
Diabetes, Machine Learning, Health Informatics
ACM Reference Format:
Gonçalo Marques, Ivan Miguel Pires, and Nuno M. Garcia. 2020. Diabetes
Disease through Machine Learning: A comparative study. In 2020 4th
International Conference on Computer Science and Artificial Intelligence (CSAI
2020), December 11–13, 2020, Zhuhai, China. ACM, New York, NY, USA,
6 pages. https://doi.org/10.1145/3445815.3445828
Permission to make digital or hard copies of part or all of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for third-party components of this work must be honored.
For all other uses, contact the owner/author(s).
CSAI 2020, December 11–13, 2020, Zhuhai, China
©2020 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-8843-6/20/12.
https://doi.org/10.1145/3445815.3445828
1 INTRODUCTION
Chronic diseases are a critical problem nowadays. In particular, diabetes
is one of the major challenges for healthcare researchers. The World
Health Organization (WHO) reports that 402 million people have
diabetes. These people are mainly located in low- and middle-income
countries. Diabetes is closely related to serious damage to the
heart, blood vessels and eyes. Furthermore, 1.6 million deaths
are directly connected with diabetes, according to WHO [11].
Additionally, WHO has launched several activities to support patients
with diabetes and is expanding access to diabetes treatments [14].
The early diagnosis of diabetes is critical for its treatment [4].
Health professionals make the diagnosis according to their
experience and knowledge [9]. Currently, healthcare units support
electronic healthcare records (EHR), especially in developed countries.
EHR hold a massive number of records and organized
information about patients [13].
Articial intelligence is widely used in multiple engineering ap-
plications as well as in the healthcare domain [
3
,
6
]. Numerous
research units are studying the implementation of machine learn-
ing for disease recognition [
3
]. On the one hand, the successive
advances in computers science enabled the design of cyber-physical
systems that can be used for real-time data collection [
7
]. On the
other hand, the evolution of the methods available for data consult-
ing and storage enables the development of intelligent systems [
10
].
The implementation of machine learning can also be associated
with the creation of enhanced living environments [
8
]. Neverthe-
less, enhanced living environments and ambient assisted living
technologies aim to promote people life quality.
Machine learning can transform these data into automated systems
that can be used to support medical decisions [1]. These systems
can identify patterns and provide significant inputs for enhanced
medical decisions [2]. The high impact and healthcare costs associated
with diabetes can be attenuated using machine learning [5].
The application of automated methods can transform data into
knowledge and detect patterns that are difficult for human beings
to evaluate given the amount of data available [12].
This study aims to evaluate the results of the application of different
machine learning methods for diabetes disease diagnosis. The
authors have applied eight different machine learning techniques
using the PIMA diabetes dataset. A public dataset has been used, and
Neural Networks, SGD, Random Forest, kNN, Naïve Bayes, AdaBoost,
Decision Tree and SVM methods have been applied. The methods
have been validated using stratified 10-fold cross-validation. The
confusion matrix for each method has been extracted, and the
performance evaluated.
Table 1: Dataset statistical analysis.
Features      Range   Minimum  Maximum  Mean     Std. Error (Mean)  Std. Deviation  Variance
Pregnancies   17      0        17       3.85     0.122              3.370           11.354
Glucose       199     0        199      120.89   1.154              31.973          1022.248
BP            122     0        122      69.11    0.698              19.356          374.647
SFT           99      0        99       20.54    0.576              15.952          254.473
Insulin       846     0        846      79.80    4.159              115.244         13281.180
BMI           67.1    0.0      67.1     31.993   0.2845             7.8842          62.160
DPF           2.342   0.078    2.420    0.47188  0.011956           0.331329        0.110
Age           60      21       81       33.24    0.424              11.760          138.303
2 MATERIALS AND METHODS
2.1 Dataset
The dataset used is the Pima Indians Diabetes Dataset [11]. The data
were collected from the National Institute of Diabetes and Digestive
and Kidney Diseases. The original goal of this data was to test the
identification of the presence of diabetes. The data include only female
individuals who are at least 21 years old. The dataset consists of several
medical predictor variables and one target variable, Outcome. The
features included are the number of pregnancies, glucose level,
blood pressure (BP), triceps skinfold thickness (SFT), insulin, Body
Mass Index (BMI), diabetes pedigree function (DPF), and age. The
dataset used is publicly available. The target variable indicates the
presence or absence of diabetes disease. The dataset has 768 entries:
268 have diabetes disease and 500 do not have diabetes.
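The class balance implied by these counts can be checked with simple arithmetic. The sketch below uses only the counts stated above; the variable names are illustrative, and the majority-class baseline is added here as a reference point, not as a result reported by the authors.

```python
# Class counts as stated in the dataset description.
n_total = 768
n_diabetic = 268
n_healthy = 500

assert n_diabetic + n_healthy == n_total

# Share of each class in the dataset.
share_diabetic = n_diabetic / n_total
share_healthy = n_healthy / n_total

# Accuracy of a trivial classifier that always predicts the majority class
# (an illustrative reference point, not a figure from the paper).
majority_baseline = max(share_diabetic, share_healthy)

print(f"diabetic: {share_diabetic:.2%}, non-diabetic: {share_healthy:.2%}")
print(f"majority-class baseline accuracy: {majority_baseline:.2%}")
```

Because roughly two thirds of the entries belong to the non-diabetic class, accuracy values are easier to interpret when read against this baseline.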
The statistical analysis of the dataset is presented in Table 1. The
analysis has been conducted in IBM SPSS version 26. The range,
minimum, maximum, mean, standard deviation and variance
of each feature have been extracted. The data have been normalized
to the range [-1, 1] before the application of the machine learning
methods.
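The text does not state which formula was used to reach the [-1, 1] range. A minimal sketch, assuming a per-feature linear (min-max) rescaling, is given below; the function name `scale_to_range` and the sample values are hypothetical.

```python
def scale_to_range(values, low=-1.0, high=1.0):
    """Linearly rescale a list of numbers so that min -> low and max -> high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant feature has no spread; map every value to the midpoint.
        return [(low + high) / 2.0 for _ in values]
    span = v_max - v_min
    return [low + (high - low) * (v - v_min) / span for v in values]


# Illustrative ages only; 21 and 81 are the minimum and maximum in Table 1.
ages = [21, 33, 51, 81]
scaled_ages = scale_to_range(ages)
print(scaled_ages)
```

Under this assumption the youngest age maps to -1, the oldest to 1, and every other value falls proportionally in between.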
2.2 Machine Learning Methods
There are numerous machine learning methods available in the
literature. The authors have applied eight different machine learning
techniques: Neural Networks, SGD, Random Forest, kNN, Naïve
Bayes, AdaBoost, Decision Tree and SVM. The specification of the
machine learning methods is presented in this section. Presenting
the parameters used is essential for the reproduction of the results.
In the Decision Tree, pruning is defined as at least two instances
in leaves, at least five instances in internal nodes, and a maximum depth
of 100. Splitting stops when the majority reaches 95%, and binary
trees are used.
The kNN method parameters are defined as follows. The number
of neighbours is 5, the metric used is Euclidean, and the weight is
Uniform.
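To make these three settings concrete, the following is a minimal sketch of a kNN vote with k = 5, Euclidean distance and uniform weights. It is not the authors' implementation; the function name `knn_predict` and the toy points are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=5):
    """Predict the label of `query` by majority vote among its k nearest
    training points (Euclidean distance, every neighbour weighted equally)."""
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    ]
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up points in two normalized features (not taken from the dataset).
train_x = [(-0.9, -0.8), (-0.7, -0.9), (-0.8, -0.6), (-0.6, -0.7),
           (0.8, 0.7), (0.9, 0.6), (0.7, 0.9), (0.6, 0.8)]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]

print(knn_predict(train_x, train_y, query=(0.75, 0.75)))
print(knn_predict(train_x, train_y, query=(-0.75, -0.75)))
```

With two classes and an odd k, the uniform vote cannot end in a tie, which is one practical reason for choosing k = 5.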
The AdaBoost uses a tree as the base estimator, the number of
estimators is 50, and the classification algorithm is SAMME.R.
The Random Forest was built using 10 trees, the maximal number
of considered features is unlimited, replicable training is not
used, the maximal tree depth is unlimited, and it stops
splitting nodes with a maximum of 5 instances.
The SVM method uses C = 1.0 and ϵ = 0.1, the kernel is RBF,
$\exp(-\mathrm{auto}\,|x-y|^{2})$, the numerical tolerance is 0.001 and the
iteration limit is 100.
The Neural Network consists of 500 neurons, the activation
function is Identity, the solver is ADAM, the alpha is 0.0009, the
maximum number of iterations is 250, and replicable training is used.
The SGD uses the Hinge classification loss function, the Squared
Loss is implemented, the regularization is Ridge (L2), the regularization
strength ($\alpha$) is $1 \times 10^{-5}$, the learning rate is Constant, the initial
learning rate ($\eta_0$) is 0.01, and the option to shuffle data after each
iteration is set to True.
2.3 Validation and Study Design
The experiments have been carried out on a MacBook Pro (15-inch, 2018).
The machine incorporates a 2.6 GHz 6-Core Intel Core i7 CPU and
16 GB 2400 MHz DDR4 memory. The data have been normalized,
and the machine learning methods applied. The stratified 10-fold
cross-validation method has been used, and the confusion matrix
has been extracted for each method. Finally, the accuracy (1), precision
(2), recall (3) and F1-score (4) have been calculated.
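$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \tag{1}$$
$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$
$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$
$$\text{F1-score} = \frac{2 \cdot \text{Recall} \cdot \text{Precision}}{\text{Recall} + \text{Precision}} \tag{4}$$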
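The four measures can be computed directly from the confusion-matrix counts, where TP, FP, FN and TN denote true positives, false positives, false negatives and true negatives for the target class. The sketch below is illustrative; the function name `binary_metrics` is hypothetical and the counts are invented, chosen only so that they add up to the 268 positive and 500 negative instances of the dataset.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1-score for one target class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts; they are not results from the paper.
m = binary_metrics(tp=160, fp=70, fn=108, tn=430)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

The guards against empty denominators only matter for degenerate cases, such as a classifier that never predicts the positive class.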
The study design used in this work is presented in Figure 1. It
is composed of the dataset, machine learning, validation, test and
score, and comparative study stages.
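The validation stage relies on stratified 10-fold cross-validation, in which every fold keeps approximately the class proportions of the whole dataset. A minimal sketch of one way to build such folds is shown below. The paper does not describe its fold assignment, so the round-robin dealing, the function name `stratified_folds` and the `seed` are assumptions.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign each sample index to one of n_folds folds so that every fold
    keeps approximately the class proportions of the full label list."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the indices of this class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[(pos + offset) % n_folds].append(idx)
        offset += len(indices)  # continue where the previous class stopped
    return folds


# Label list with the class counts of the dataset (268 positive, 500 negative).
labels = [1] * 268 + [0] * 500
folds = stratified_folds(labels)
for fold in folds:
    positives = sum(labels[i] for i in fold)
    print(len(fold), positives)
```

Each fold is then held out once for testing while the remaining nine folds are used for training.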
3 RESULTS AND DISCUSSION
The average results between classes have been calculated,
considering the extracted confusion matrix for each method. The
performance has been validated using stratified 10-fold cross-validation.
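The text does not say how the per-class scores are averaged. One reading, assumed here, is an average weighted by the number of instances in each class; under that reading the average recall is mathematically equal to the accuracy, which matches the identical accuracy and recall values reported for Neural Networks (77.47%). The sketch below illustrates this with an invented confusion matrix; the function names are hypothetical.

```python
def per_class_recall(confusion):
    """confusion[actual][predicted] -> recall for each actual class."""
    return {cls: row[cls] / sum(row.values()) for cls, row in confusion.items()}


def weighted_average(per_class, confusion):
    """Average per-class scores weighted by the number of actual instances."""
    support = {cls: sum(row.values()) for cls, row in confusion.items()}
    total = sum(support.values())
    return sum(per_class[cls] * support[cls] for cls in per_class) / total


# Hypothetical confusion matrix (rows: actual class, columns: predicted class).
confusion = {
    0: {0: 430, 1: 70},
    1: {0: 108, 1: 160},
}
recalls = per_class_recall(confusion)
avg_recall = weighted_average(recalls, confusion)
accuracy = (confusion[0][0] + confusion[1][1]) / 768

print(f"recall per class: {recalls}")
print(f"weighted average recall: {avg_recall:.4f}, accuracy: {accuracy:.4f}")
```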
Figure 2 presents the accuracy reported by the implemented
methods. The average values range from 66.54% to 77.47%. On the
one hand, the lowest accuracy is reported by SVM (66.54%). On the
other hand, the three methods with the highest accuracy are Neural
Networks (77.47%), SGD (76.43%) and kNN (73.96%).
Figure 1: Study Design.
Figure 2: Average accuracy values.
Figure 3: Average F1-Score values.
The F1-score values have been calculated and are presented in
Figure 3. The best results are again obtained by Neural Networks,
with 76.84%, followed by SGD with 75.67%, and the
lowest performance is reported by SVM (67.08%).
Figure 4 presents the average precision values of the different
implemented models. The values range from 76.96% reported by
Neural Networks to 68.14% presented by SVM.
As presented in Figure 5, the best recall value reported is 77.47%,
for Neural Networks. The lowest recall value is 68.54%, for the
SVM method. Moreover, the kNN method provides a recall value of
73.96%.
Figure 4: Average precision values.
Figure 5: Average recall values.
The experiments conducted allow the authors to suggest Neural
Networks for the development of automated decision support systems
for diabetes. This method reports 77.47%, 76.84%, 76.96% and
77.47% concerning accuracy, F1-score, precision and recall, respectively.
Furthermore, a more detailed analysis can be conducted by testing different
parameters such as the number of neurons, the activation function
and the maximum number of iterations.
The three methods with the highest accuracy are Neural Networks,
SGD and kNN. These methods report 77.47%, 76.43% and 73.96% of
average accuracy between classes. The Receiver Operating
Characteristic (ROC) curve is an efficient method to evaluate the performance
of a classification model. The ROC curves of the Neural Networks, SGD
and kNN models for target class 1 and class 0 are represented in
Figure 6 and Figure 7, respectively. The analysis of the Area Under the
Curve (AUC) across different classifiers is a relevant method to
summarize their performance. The AUC results averaged
over classes for Neural Networks, SGD and kNN are 82.84%, 71.60%
and 77.02%, respectively.
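The AUC can be read as the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one. A minimal sketch of that pairwise definition is given below; it is an illustration, not the tool used by the authors, and the function name `roc_auc` and the scores are made up.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive is scored above a
    random negative (ties count as one half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up classifier scores for the positive class.
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(f"AUC: {auc:.4f}")
```

Because the AUC depends only on how the scores rank the instances, a model can have a higher AUC than another while reporting a lower accuracy at a fixed decision threshold.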
Nevertheless, the present study has several limitations. On the
one hand, the number of instances in this dataset is limited. On the
other hand, this dataset only includes female individuals.
This study aims to support future research activities as a base
point for forthcoming analysis. This work should be considered as a
preliminary study. Several parameters can be updated in the different
methods implemented. The impact of normalization, feature
selection, data imputation and augmentation is not addressed in
this study and should be examined in future work.
Figure 6: ROC Curve for target class 1.
Figure 7: ROC Curve for target class 0.
Currently, the application of machine learning methods in the
design and development of automated systems to support healthcare
is a trending and essential topic. Furthermore, it is crucial to
promote research in this field to decrease the cost and improve the
quality of healthcare. However, medical analysis and judgement
must always be ensured. These methods should be effective systems
to support medical diagnostics, but they will never replace
the crucial role of doctors.
4 CONCLUSIONS
This paper has presented the application of different machine learning
methods for the identification of diabetes disease using a public
dataset. In total, eight different approaches have been applied. After
data normalization, Neural Networks, SGD, Random Forest, kNN,
Naïve Bayes, AdaBoost, Decision Tree and SVM methods have been
applied. The results suggest the use of Neural Networks, SGD and
kNN. On the one hand, the application of Neural Networks presents
an accuracy of 77.47%, an F1-score of 76.84%, a precision of 76.96%
and a recall of 77.47%. On the other hand, SGD reports 76.43%,
75.67%, 75.84% and 76.43% concerning the accuracy, F1-score,
precision and recall, respectively. Finally, the kNN method reports an
average accuracy of 73.96%, an F1-score of 73.54%, a precision of
73.38% and a recall of 73.96%. Future work should focus on the
analysis of the impact of feature selection in different methods. Moreover,
other relevant studies can be done with Neural Networks by testing
different parameters such as the activation function.
ACKNOWLEDGMENTS
This work is funded by FCT/MEC through national funds and
co-funded by FEDER PT2020 partnership agreement under the
project UIDB/50008/2020.
This work is funded by National Funds through the FCT - Foundation
for Science and Technology, I.P., within the scope of the
project UIDB/00742/2020.
This article is based upon work from COST Action IC1303–
AAPELE–Architectures, Algorithms and Protocols for Enhanced
Living Environments and COST Action CA16226–SHELD-ON–
Indoor living space improvement: Smart Habitat for the Elderly,
supported by COST (European Cooperation in Science and
Technology). More information at www.cost.eu.
Furthermore, we would like to thank the Politécnico de Viseu
for their support.
REFERENCES
[1] Ahmed Abdelaziz, Ahmed S. Salama, A. M. Riad, and Alia N. Mahmoud. 2019. A
Machine Learning Model for Predicting of Chronic Kidney Disease Based Internet
of Things and Cloud Computing in Smart Cities. In Security in Smart Cities: Models,
Applications, and Challenges, Aboul Ella Hassanien, Mohamed Elhoseny, Syed
Hassan Ahmed and Amit Kumar Singh (eds.). Springer International Publishing,
Cham, 93–114. https://doi.org/10.1007/978-3-030-01560-2_5
[2] Flávio H.D. Araújo, André M. Santana, and Pedro de A. Santos Neto. 2016. Using
machine learning to support healthcare professionals in making preauthorisation
decisions. International Journal of Medical Informatics 94: 1–7.
https://doi.org/10.1016/j.ijmedinf.2016.06.007
[3] Igor Vyacheslavovich Buzaev, Vladimir Vyacheslavovich Plechev, Irina
Evgenievna Nikolaeva, and Rezida Maratovna Galimova. 2016. Artificial intelligence:
Neural network model as the multidisciplinary team member in clinical decision
support to avoid medical mistakes. Chronic Diseases and Translational Medicine
2, 3: 166–172. https://doi.org/10.1016/j.cdtm.2016.09.007
[4] Imane Chakour, Yousef El Mourabit, Cherki Daoui, and Mohamed Baslam. 2020.
Multi-Agent System Based on Machine Learning for Early Diagnosis of Diabetes.
In 2020 IEEE 6th International Conference on Optimization and Applications (ICOA),
1–6.
[5] Irene Dankwa-Mullan, Marc Rivo, Marisol Sepulveda, Yoonyoung Park, Jane
Snowdon, and Kyu Rhee. 2019. Transforming Diabetes Care Through Artificial
Intelligence: The Future Is Here. Population Health Management 22, 3: 229–242.
https://doi.org/10.1089/pop.2018.0129
[6] Thomas Davenport and Ravi Kalakota. 2019. The potential for artificial
intelligence in healthcare. Future Healthcare Journal 6, 2: 94–98.
https://doi.org/10.7861/futurehosp.6-2-94
[7] Nilanjan Dey, Amira S. Ashour, Fuqian Shi, Simon James Fong, and João Manuel
R. S. Tavares. 2018. Medical cyber-physical systems: A survey. Journal of Medical
Systems 42, 4: 74. https://doi.org/10.1007/s10916-018-0921-x
[8] Ivan Ganchev, Nuno M. Garcia, Ciprian Dobre, Constandinos X. Mavromoustakis,
and Rossitza Goleva (eds.). 2019. Enhanced Living Environments: Algorithms,
Architectures, Platforms, and Systems. Springer International Publishing, Cham.
https://doi.org/10.1007/978-3-030-10752-9
[9] Jigna J Hathaliya, Sudeep Tanwar, Sudhanshu Tyagi, and Neeraj Kumar. 2019.
Securing electronics healthcare records in Healthcare 4.0: A biometric-based
approach. Computers & Electrical Engineering 76: 398–410.
https://doi.org/10.1016/j.compeleceng.2019.04.017
[10] Gonçalo Marques, Rui Pitarma, Nuno M. Garcia, and Nuno Pombo. 2019. Internet
of Things Architectures, Technologies, Applications, Challenges, and Future
Directions for Enhanced Living Environments and Healthcare Systems: A Review.
Electronics 8, 10: 1081. https://doi.org/10.3390/electronics8101081
[11] Huma Naz and Sachin Ahuja. 2020. Deep learning approach for diabetes
prediction using PIMA Indian dataset. Journal of Diabetes & Metabolic Disorders 19, 1:
391–403. https://doi.org/10.1007/s40200-020-00520-5
[12] Andreas K Triantafyllidis and Athanasios Tsanas. 2019. Applications of Machine
Learning in Real-Life Digital Health Interventions: Review of the Literature.
Journal of Medical Internet Research 21, 4: e12286. https://doi.org/10.2196/12286
[13] Le Zheng, Oliver Wang, Shiying Hao, Chengyin Ye, Modi Liu, Minjie Xia, Alex
N. Sabo, Liliana Markovic, Frank Stearns, Laura Kanov, Karl G. Sylvester, Eric
Widen, Do B. McElhinney, Wei Zhang, Jiayu Liao, and Xuefeng B. Ling. 2020.
Development of an early-warning system for high-risk patients for suicide
attempt using deep learning and electronic health records. Translational Psychiatry
10, 1: 72. https://doi.org/10.1038/s41398-020-0684-2
[14] Diabetes. Retrieved August 20, 2020 from
https://www.who.int/westernpacific/health-topics/diabetes
... The distances drawn between the two are maximized, thus reducing the classification error [16]. ...
Article
Full-text available
The dengue virus has become an increasingly critical problem for humanity due to its extensive spread. This is transmitted through a vector that sprouts in certain climatic conditions (tropical and subtropical climates). The transmission of the disease can be associated with certain climatic variables that reinforce the outbreak. Data were collected on dengue cases by epidemiological week registered in Loreto-Peru from January 1, 2016, to January 31, 2022. Likewise, data on meteorological variables (maximum and minimum temperature; dry and humid bulb temperature; wind speed and total precipitation in the area). In this study, four Machine learning modeling techniques were considered: Support Vector Machine (SVM), Decision Tree, Random Forest and AdaBoost; and the parameters defined to evaluate the models are: Accuracy, Precision, Recall and F-1. As a result, optimal AUC values were obtained in a range from 0.818 to 0.996 for the SVM, Random Forest and AdaBoost algorithms, likewise, in all weather stations the ROC curve showed good performance for all models, except for the Decision Tree algorithm. As a conclusion for this study, we propose the optimal model to associate dengue cases with climatic conditions is SVM.
Article
Full-text available
PurposeInternational Diabetes Federation (IDF) stated that 382 million people are living with diabetes worldwide. Over the last few years, the impact of diabetes has been increased drastically, which makes it a global threat. At present, Diabetes has steadily been listed in the top position as a major cause of death. The number of affected people will reach up to 629 million i.e. 48% increase by 2045. However, diabetes is largely preventable and can be avoided by making lifestyle changes. These changes can also lower the chances of developing heart disease and cancer. So, there is a dire need for a prognosis tool that can help the doctors with early detection of the disease and hence can recommend the lifestyle changes required to stop the progression of the deadly disease.Method Diabetes if untreated may turn into fatal and directly or indirectly invites lot of other diseases such as heart attack, heart failure, brain stroke and many more. Therefore, early detection of diabetes is very significant so that timely action can be taken and the progression of the disease may be prevented to avoid further complications. Healthcare organizations accumulate huge amount of data including Electronic health records, images, omics data, and text but gaining knowledge and insight into the data remains a key challenge. The latest advances in Machine learning technologies can be applied for obtaining hidden patterns, which may diagnose diabetes at an early phase. This research paper presents a methodology for diabetes prediction using a diverse machine learning algorithm using the PIMA dataset.ResultsThe accuracy achieved by functional classifiers Artificial Neural Network (ANN), Naive Bayes (NB), Decision Tree (DT) and Deep Learning (DL) lies within the range of 90–98%. Among the four of them, DL provides the best results for diabetes onset with an accuracy rate of 98.07% on the PIMA dataset. 
Hence, this proposed system provides an effective prognostic tool for healthcare officials. The results obtained can be used to develop a novel automatic prognosis tool that can be helpful in early detection of the disease.Conclusion The outcome of the study confirms that DL provides the best results with the most promising extracted features. DL achieves the accuracy of 98.07% which can be used for further development of the automatic prognosis tool. The accuracy of the DL approach can further be enhanced by including the omics data for prediction of the onset of the disease.
Article
Full-text available
Suicide is the tenth leading cause of death in the United States (US). An early-warning system (EWS) for suicide attempt could prove valuable for identifying those at risk of suicide attempts, and analyzing the contribution of repeated attempts to the risk of eventual death by suicide. In this study we sought to develop an EWS for high-risk suicide attempt patients through the development of a population-based risk stratification surveillance system. Advanced machine-learning algorithms and deep neural networks were utilized to build models with the data from electronic health records (EHRs). A final risk score was calculated for each individual and calibrated to indicate the probability of a suicide attempt in the following 1-year time period. Risk scores were subjected to individual-level analysis in order to aid in the interpretation of the results for health-care providers managing the at-risk cohorts. The 1-year suicide attempt risk model attained an area under the curve (AUC ROC) of 0.792 and 0.769 in the retrospective and prospective cohorts, respectively. The suicide attempt rate in the “very high risk” category was 60 times greater than the population baseline when tested in the prospective cohorts. Mental health disorders including depression, bipolar disorders and anxiety, along with substance abuse, impulse control disorders, clinical utilization indicators, and socioeconomic determinants were recognized as significant features associated with incident suicide attempt.
Article
Full-text available
Internet of Things (IoT) is an evolution of the Internet and has been gaining increased attention from researchers in both academic and industrial environments. Successive technological enhancements make the development of intelligent systems with a high capacity for communication and data collection possible, providing several opportunities for numerous IoT applications, particularly healthcare systems. Despite all the advantages, there are still several open issues that represent the main challenges for IoT, e.g., accessibility, portability, interoperability, information security, and privacy. IoT provides important characteristics to healthcare systems, such as availability, mobility, and scalability, that o�er an architectural basis for numerous high technological healthcare applications, such as real-time patient monitoring, environmental and indoor quality monitoring, and ubiquitous and pervasive information access that benefits health professionals and patients. The constant scientific innovations make it possible to develop IoT devices through countless services for sensing, data fusing, and logging capabilities that lead to several advancements for enhanced living environments (ELEs). This paper reviews the current state of the art on IoT architectures for ELEs and healthcare systems, with a focus on the technologies, applications, challenges, opportunities, open-source platforms, and operating systems. Furthermore, this document synthesizes the existing body of knowledge and identifies common threads and gaps that open up new significant and challenging future research directions.
Article
Full-text available
The complexity and rise of data in healthcare means that artificial intelligence (AI) will increasingly be applied within the field. Several types of AI are already being employed by payers and providers of care, and life sciences companies. The key categories of applications involve diagnosis and treatment recommendations, patient engagement and adherence, and administrative activities. Although there are many instances in which AI can perform healthcare tasks as well or better than humans, implementation factors will prevent large-scale automation of healthcare professional jobs for a considerable period. Ethical issues in the application of AI to healthcare are also discussed.
Article
Full-text available
Background: Machine learning has attracted considerable research interest toward developing smart digital health interventions. These interventions have the potential to revolutionize health care and lead to substantial outcomes for patients and medical professionals. Objective: Our objective was to review the literature on applications of machine learning in real-life digital health interventions, aiming to improve the understanding of researchers, clinicians, engineers, and policy makers in developing robust and impactful data-driven interventions in the health care domain. Methods: We searched the PubMed and Scopus bibliographic databases with terms related to machine learning, to identify real-life studies of digital health interventions incorporating machine learning algorithms. We grouped those interventions according to their target (ie, target condition), study design, number of enrolled participants, follow-up duration, primary outcome and whether this had been statistically significant, machine learning algorithms used in the intervention, and outcome of the algorithms (eg, prediction). Results: Our literature search identified 8 interventions incorporating machine learning in a real-life research setting, of which 3 (37%) were evaluated in a randomized controlled trial and 5 (63%) in a pilot or experimental single-group study. The interventions targeted depression prediction and management, speech recognition for people with speech disabilities, self-efficacy for weight loss, detection of changes in biopsychosocial condition of patients with multiple morbidity, stress management, treatment of phantom limb pain, smoking cessation, and personalized nutrition based on glycemic response. The average number of enrolled participants in the studies was 71 (range 8-214), and the average follow-up study duration was 69 days (range 3-180). Of the 8 interventions, 6 (75%) showed statistical significance (at the P=.05 level) in health outcomes. 
Conclusions: This review found that digital health interventions incorporating machine learning algorithms in real-life studies can be useful and effective. Given the low number of studies identified in this review and that they did not follow a rigorous machine learning evaluation methodology, we urge the research community to conduct further studies in intervention settings following evaluation principles and demonstrating the potential of machine learning in clinical practice.
Article
Full-text available
An estimated 425 million people globally have diabetes, accounting for 12% of the world's health expenditures, and yet 1 in 2 persons remain undiagnosed and untreated. Applications of artificial intelligence (AI) and cognitive computing offer promise in diabetes care. The purpose of this article is to better understand what AI advances may be relevant today to persons with diabetes (PWDs), their clinicians, family, and caregivers. The authors conducted a predefined, online PubMed search of publicly available sources of information from 2009 onward using the search terms "diabetes" and "artificial intelligence." The study included clinically-relevant, high-impact articles, and excluded articles whose purpose was technical in nature. A total of 450 published diabetes and AI articles met the inclusion criteria. The studies represent a diverse and complex set of innovative approaches that aim to transform diabetes care in 4 main areas: automated retinal screening, clinical decision support, predictive population risk stratification, and patient self-management tools. Many of these new AI-powered retinal imaging systems, predictive modeling programs, glucose sensors, insulin pumps, smartphone applications, and other decision-support aids are on the market today with more on the way. AI applications have the potential to transform diabetes care and help millions of PWDs to achieve better blood glucose control, reduce hypoglycemic episodes, and reduce diabetes comorbidities and complications. AI applications offer greater accuracy, efficiency, ease of use, and satisfaction for PWDs, their clinicians, family, and caregivers.
Article
In recent years, there has been an exponential increase in the use of Healthcare 4.0-based diagnostic systems across the globe. In Healthcare 4.0, patients' records are stored in an electronic health record (EHR) repository, at either a centralized or a distributed location, so that doctors can easily access patients' healthcare data from anywhere at any time. Because these data are accessed from the database repository over an open channel, i.e., the Internet, security and privacy are major concerns when accessing them from any location. Motivated by these facts, in this paper we propose a biometric-based authentication scheme to ensure secure access to a patient's EHR from any location. In the proposal, we first identify various security threats and challenges in accessing EHRs from the database repository. Then, a secure biometric-based scheme is designed and validated using the Automated Validation of Internet Security Protocols and Applications (AVISPA) tool. The results demonstrate that the proposed scheme is superior (in terms of computation and communication costs) to traditional state-of-the-art schemes.
Book
This open access book is the final publication of the COST Action IC1303 “Algorithms, Architectures and Platforms for Enhanced Living Environments (AAPELE)” project. Ambient Assisted Living (AAL) is an area of research based on Information and Communication Technologies (ICT), medical research, and sociological research. AAL is based on the notion that technology and science can improve the quality of life for people in their homes and can reduce the financial burden on the budgets of healthcare providers. The concept of Enhanced Living Environments (ELE) refers to the part of the AAL area that is more closely related to ICT. Effective ELE solutions require appropriate ICT algorithms, architectures, platforms, and systems that take into account the advance of science in this area and the development of new and innovative solutions. The aim of this book is to become a state-of-the-art reference, discussing progress made as well as prompting future directions on theories, practices, standards, and strategies related to the ELE area. The book contains 12 chapters and can serve as a valuable reference for undergraduate students, post-graduate students, educators, faculty members, researchers, engineers, medical doctors, healthcare organizations, insurance companies, and research strategists working in this field.
Chapter
Cloud computing and the Internet of Things (IoT) play an important role in health care services, especially in the prediction of diseases in smart cities. IoT devices (such as digital sensors) can send large volumes of data on chronic kidney disease (CKD) to be stored in the cloud. These data can then be used to increase the accuracy of CKD prediction in a cloud environment. Predicting dangerous diseases such as CKD with cloud-IoT systems is considered a major problem facing health care stakeholders in smart cities. This paper focuses on CKD prediction as an example of a health care service in a cloud computing environment. Cloud computing enables patients to obtain CKD predictions anywhere and at any time in smart cities. To that end, this paper proposes a hybrid intelligent model for predicting CKD based on cloud-IoT using two intelligent techniques: linear regression (LR) and neural network (NN). LR is used to determine the critical factors that influence CKD, and NN is used to predict CKD. The results show that the accuracy of the hybrid intelligent model in predicting CKD is 97.8%. In addition, the hybrid intelligent model is deployed on Windows Azure, as an example of a cloud computing environment, to predict CKD and support patients in smart cities. The proposed model is superior to most of the models referred to in the related works by 64%.