A Novel Approach to Predict Chronic Kidney Disease using Machine Learning Algorithms

November 2020

DOI:10.1109/ICECA49313.2020.9297392

Conference: 2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA)

Authors:

Bhavya Gudeti

VIT University

Shaveta Malik

Terrance Frederick Fernandez

Saveetha Institute of Medical and Technical Sciences (SIMATS)

A Novel Approach to Predict Chronic Kidney Disease

using Machine Learning Algorithms

Bhavya Gudeti

School of Computer Science and

Engineering

Vellore Institute of Technology,

Chennai, 600127, Tamilnadu, India.

bhavyagudeti@gmail.com

Terrance Frederick Fernandez

[0000-0002-7317-3362]*

Rajiv Gandhi College of Engineering

and Technology, Puducherry, India

frederick@pec.edu

Shashvi Mishra

School of Computer Science and

Engineering

Vellore Institute of Technology,

Chennai, 600127,

Tamilnadu, India.

shashvimishra@gmail.com

Amit Kumar Tyagi [0000-0003-2657-8700]*

School of Computer Science and

Engineering,

Vellore Institute of Technology,

Chennai, 600127, Tamilnadu, India.

amitkrtyagi025@gmail.com

Shaveta Malik

Terna College of Engineering,

University of Mumbai,

Maharashtra, India

shavetamalik687@gmail.com

Shabnam Kumari

Anumit Academy of Research and

Research Network, India

shabnam.kt25@gmail.com

Abstract—According to India's health statistics, a staggering 63,538 cases of Chronic Kidney Disease (CKD) have been registered. The average age of nephropathy patients lies between 48 and 70 years, and CKD is more prevalent among males than females. Regrettably, India has ranked among the top 17 countries for CKD since 2015. The disease is characterized by a gradual loss of kidney function over time. Early detection followed by treatment could keep this dreaded disease at bay. Machine learning is finding practical applications in areas such as analyzing medical outcomes and detecting fraud, and various machine learning algorithms have been applied to the prediction of chronic diseases.

Our main aim is to compare the performance of various machine learning algorithms, primarily on the basis of their accuracy. In this work we used R code to compare their performance. The central purpose of this study is to analyze the Chronic Kidney Disease dataset and classify cases as CKD or non-CKD.

Keywords— Machine Learning, Chronic Kidney Disease, Classification, Accuracy, Logistic Regression, Support Vector Machine

I. INTRODUCTION

Back in the 1950s, communication among humans was predominantly oral. As technology has progressed since then, mankind has become obsessed with it. The million-dollar question remains: "Why are humans so obsessed with technology?" The answer is straightforward. Rising demand in manufacturing generates surplus data on trade growth, products, business perspectives and sales. These days, industries such as automation, aerospace and health care operate through machine-to-machine communication or interconnected Internet of Things (IoT) devices. These interconnected IoT devices produce heaps of data that must be analyzed efficiently with effective, modern tools and approaches. The traditional tools currently available do not seem sufficient to analyze such huge volumes of data.

Clustering can be thought of as the grouping of objects into clusters that are similar in nature. Objects within a cluster resemble each other, while objects in other clusters are dissimilar. Clustering is widely used in applications such as marketing, the World Wide Web (WWW), earthquake studies, aerospace, biology and insurance. On the other hand, if the data comes with category or class labels, a classification technique is employed to assign the data to a number of classes or categories based on their similarities. Applications of classification include speech and handwriting recognition, biometric identification and document classification. Association Rule Mining (ARM) uses if-then statements to indicate the relationships between data items in transactional databases. Further, regression (or linear regression) is employed to find the relationship between two continuous variables. One variable is termed the predictor or independent variable and the other the dependent or response variable. Outlier detection is defined as "The method of distinguishing the extreme points within the data. It is a branch of data mining." All of the techniques discussed above are part of Data Mining, Machine Learning and Computer Vision.

In the human body, the kidneys are instrumental in filtering and discharging toxic and unneeded materials, typically wastes, from the body through excretion. In India, there are approximately one million cases of Chronic Kidney Disease (CKD) every year. CKD damages the kidneys and causes a gradual loss of kidney function. Because its symptoms develop gradually and are not specific to the disorder, CKD is hard to predict, so it is important to detect it at an early stage. The kidneys filter wastes and excess fluids from the blood, which are then excreted in urine. In the early stages of CKD, a patient may have few signs or symptoms.

In healthcare, classification is one of the most commonly used machine learning methods. A classification model outputs the predicted class for each data point.

Common classification methods are Decision Tree, Support Vector Machine, K-Nearest Neighbor, Naïve Bayes and Neural Networks. KNN is used to examine the relationship between different CKD risk factors in order to predict the disease at an early stage. Machine Learning is a growing field that deals with the study of large and varied data. It grew out of the study of pattern recognition (speech and handwriting) and computational learning theory in Artificial Intelligence, and it offers numerous methods, algorithms and techniques to analyze data and make predictions. Machine Learning techniques have proved highly successful in the detection and recognition of many major diseases in medical science. Machine Learning is therefore useful for predicting whether or not a patient has CKD. It does so by using past CKD patient data to train a predictive model.

A. Analysis of Chronic Kidney Disease (CKD)

The kidney acts as a powerful and effective filter that rids the body of waste and toxic substances and returns nutrients, amino acids, insulin, hormones and other essential substances to the bloodstream. Occasionally, however, things can go badly wrong. The term "Chronic Kidney Disease (CKD)" is used throughout the world to refer to any form of kidney disease that persists. "Disease" here includes any abnormality of kidney structure or function, regardless of whether or not it is likely to make a person feel unwell or cause complications. It is a common problem that can affect anyone at any age. It is estimated that almost 3 million people in the United Kingdom are at risk of CKD. CKD is usually caused by a combination of different conditions that place a strain on the kidneys.

The rest of the manuscript is organized as follows. Section II discusses related work on this research topic. Section III discusses the proposed system for classifying Chronic Kidney Disease (CKD). Section IV describes the dataset used and gives a brief introduction to its attributes. Section V deals with the machine learning algorithms, their code and the results obtained for each classification algorithm. Section VI presents an open discussion of the current view and results on chronic disease. Finally, Section VII concludes the research work and outlines future enhancements.

II. RELATED WORK

Various researchers have worked on CKD prediction with the help of several different classification algorithms, and all of them evaluated the performance of their models. Gunarathne, W.H.S.D. [1] compared the results of different models and concluded that the Multiclass Decision Forest algorithm provides higher accuracy for the reduced 14-attribute dataset. S. Dilli Arasu and Dr. R. Thirumalaiselvi [2] worked on missing values in a Chronic Kidney Disease dataset. They observed that missing values in the dataset reduce not only the model's accuracy but also the quality of its predictions. They addressed this issue by devising a numerical method based on the stages of Chronic Kidney Disease, which they used to estimate the unknown values. They then substituted the missing values with the recalculated ones.

Asif Salekin and John Stankovic [3] used a novel approach to detect Chronic Kidney Disease using machine learning algorithms. They obtained their findings on a dataset of 400 records and 25 attributes, which indicates whether or not a patient is prone to CKD. They used KNN, Random Forest and Neural Network algorithms, and applied a wrapper method for feature reduction, which detects CKD with high accuracy.

Sahil Sharma, Vinod Sharma and Atul Sharma [4] tested 12 different classification algorithms on various datasets, each with 400 records and 24 attributes. They compared the predicted outcomes with the actual results in order to determine predictive accuracy, and they used metrics such as precision, sensitivity, accuracy and specificity to measure the performance of the classifiers. Note that Chronic Kidney Disease (CKD) is not uncommon.

However, more accurate information about the risk of progression to nephropathy is badly needed for clinical decisions concerning testing, treatment and referral. This section has highlighted the state of the art in the field of CKD prediction. The next section discusses our work in detail.
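The evaluation metrics named in this section (precision, sensitivity, accuracy and specificity) can all be derived from the four counts of a binary confusion matrix. The following is a minimal Python sketch of that arithmetic, not code from any of the cited studies. The function names and the toy labels are assumptions made purely for illustration, and "ckd" is treated as the positive class.

```python
def confusion_counts(actual, predicted, positive="ckd"):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def binary_metrics(actual, predicted, positive="ckd"):
    """Derive the four metrics from the confusion-matrix counts."""
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Made-up labels for illustration only; not taken from the CKD dataset.
actual = ["ckd", "ckd", "ckd", "notckd", "notckd", "ckd", "notckd", "notckd"]
predicted = ["ckd", "ckd", "notckd", "notckd", "ckd", "ckd", "notckd", "notckd"]
metrics = binary_metrics(actual, predicted)
print(metrics)
```

Accuracy alone can hide errors on the minority class, which is why studies on this dataset often report sensitivity and specificity alongside it.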

III. PROPOSED SYSTEM

The proposed system deals with the detection of Chronic Kidney Disease. Healthcare systems generate colossal amounts of data, so it is essential to use this data productively to analyze, predict and treat a specific disease. A classification model derives a prediction from observed values: one or more inputs are used to predict the value of an outcome. As a supervised machine learning technique, a classification algorithm learns from a training dataset and predicts the categorical class labels of the data.

This research work presents a machine learning framework for knowledge discovery on the Chronic Kidney Disease dataset. To classify the disease at an early stage, three machine learning algorithms are used, namely Logistic Regression, Support Vector Machine and K-Nearest Neighbors, and the performance of each algorithm is examined. Our proposed model combines the Support Vector Machine, Logistic Regression and K-Nearest Neighbors (KNN), as shown in Fig. 1.

This section has given a snapshot of the system proposed in this paper. The next section discusses the dataset employed and introduces a table that lists its attributes and their descriptions.

Fig.1. Proposed Model using Various Classification Algorithms

IV. DATASET USED

The proposed framework uses the Chronic Kidney Disease (CKD) dataset from the UCI Machine Learning Repository, which has 25 attributes, of which 11 are numerical and 14 are nominal. All 400 instances of the dataset are used to train and test the machine learning algorithms. Of the 400 instances, 250 are labeled as Chronic Kidney Disease (CKD) and 150 are labeled as non-CKD. The attributes in the dataset are Bacteria, Sodium, Age, Hemoglobin, Diabetes Mellitus, Classification, Appetite, Coronary Artery Disease, Blood Pressure, Pus Cell, Anemia, Pedal Edema, Sugar, White Blood Cell Count, Hypertension, Red Blood Cell Count, Potassium, Specific Gravity, Pus Cell Clumps, Packed Cell Volume, Albumin, Serum Creatinine, Red Blood Cells, Blood Urea and Blood Glucose Random.

The dataset is divided into two groups, one for training and one for testing. The data is split 70% for training and 30% for testing. The attributes of the dataset are listed in Table 1. Readers can obtain the data from the URL given in [16]. The next section discusses the machine learning algorithms used to classify CKD.
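A 70/30 split of this kind can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the authors' R code: the helper name `train_test_split`, the seed value and the placeholder records are all hypothetical, and the real rows would come from the UCI file.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=1):
    """Shuffle row indices and split the rows into training and test sets."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    n_test = round(len(rows) * test_fraction)
    test_idx = set(indices[:n_test])
    train = [row for i, row in enumerate(rows) if i not in test_idx]
    test = [row for i, row in enumerate(rows) if i in test_idx]
    return train, test


# Placeholder records standing in for the 400 dataset rows (250 ckd, 150 notckd).
rows = [{"id": i, "class": "ckd" if i < 250 else "notckd"} for i in range(400)]
train, test = train_test_split(rows)
print(len(train), len(test))
```

Fixing the random seed before sampling means the same rows land in the test set on every run, which makes reported accuracies repeatable.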

TABLE 1: DATA SET USED

S.No  Attribute                            Description about the attribute
1.    Bacteria (nominal)                   ba – (present / not present)
2.    Sodium (numerical)                   sod in mEq/L
3.    Age (numerical)                      Person's age in years
4.    Haemoglobin (numerical)              Hemo in grams
5.    Diabetes Mellitus (nominal)          dm – (yes / no)
6.    Class (nominal)                      class – (ckd / notckd)
7.    Appetite (nominal)                   appet – (good / poor)
8.    Coronary Artery Disease (nominal)    CAD – (yes / no)
9.    Blood Pressure (numerical)           BP in mm/Hg
10.   Pus Cell (nominal)                   PC – (normal / abnormal)
11.   Anemia (nominal)                     ane – (yes / no)
12.   Pedal Edema (nominal)                pe – (yes / no)
13.   Sugar (nominal)                      su – (0/1/2/3/4/5)
14.   White Blood Cell Count (numerical)   Wc in cells/cumm
15.   Hypertension (nominal)               htn – (yes / no)
16.   Red Blood Cell Count (numerical)     Rc in cells/cumm
17.   Potassium (numerical)                Pot in mEq/L
18.   Specific Gravity (nominal)           Sg – (1.005/1.010/1.015/1.020/1.025)
19.   Pus Cell Clumps (nominal)            pcc – (present / notpresent)
20.   Packed Cell Volume (numerical)       pcv
21.   Albumin (nominal)                    al – (0/1/2/3/4/5)
22.   Serum Creatinine (numerical)         Sc in mgs/dl
23.   Red Blood Cells (nominal)            RBC – (normal / abnormal)
24.   Blood Urea (numerical)               Bu in mgs/dl
25.   Blood Glucose Random (numerical)     BGR in mgs/dl

V. SIMULATION RESULTS

This section presents the simulation results obtained in this work.

A. Logistic Regression

Logistic Regression is a classification algorithm. Given a set of independent variables, it predicts a binary outcome such as 1/0, Yes/No or True/False. It uses the log of the odds as the dependent variable. Logistic Regression is a predictive analysis algorithm based on the concept of probability: given certain factors, it is used to predict an outcome that has two possible values. The source code is given in Table II and the output in Fig. 2. From them, we can deduce that the accuracy of Logistic Regression is 0.7725.
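To make the link between the log-odds and the predicted class concrete, the following minimal Python sketch applies the logistic (sigmoid) function to a linear combination of four attributes and thresholds the result at 0.5. The coefficients, intercept and patient values are hypothetical and are not fitted to the CKD data. The label assignment assumes that "notckd" is the class coded 1, as R does when it orders the two class labels alphabetically.

```python
import math


def sigmoid(z):
    """Map a log-odds value z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, weights, intercept):
    """Probability of the class coded 1 for a single record."""
    log_odds = intercept + sum(w * x for w, x in zip(weights, features))
    return sigmoid(log_odds)


def classify(probability, threshold=0.5):
    """Turn a probability into one of the two class labels."""
    return "notckd" if probability > threshold else "ckd"


# Hypothetical coefficients for (age, bp, pcv, bu); NOT fitted to the real data.
weights = [-0.02, -0.03, 0.15, -0.04]
intercept = -1.0
patient = [50, 80, 40, 30]  # made-up attribute values

p = predict_probability(patient, weights, intercept)
print(round(p, 3), classify(p))
```

The 0.5 threshold is the conventional default; lowering it would label more patients "notckd" and raising it would label more of them "ckd".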

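TABLE II: R CODE FOR LOGISTIC REGRESSION

# Load the CKD dataset; text columns are read as factors so the class label can be modeled
ckd <- read.csv("C:/Users/bhavya/Desktop/ckd.csv", stringsAsFactors = TRUE)
ckd$Type <- NULL
head(ckd)
dim(ckd)
summary(ckd)
names(ckd)
# Show how R codes the two class labels (the level coded 1 is the one being predicted)
contrasts(ckd$classification)
# Logistic Regression
glm.fit <- glm(classification ~ age + bp + pcv + bu, data = ckd, family = binomial)
summary(glm.fit)
# predict() provides a vector of fitted probabilities
glm.probab <- predict(glm.fit, type = "response")
glm.probab[1:20]
# Start with every prediction as "ckd", then relabel probabilities above 0.5
glm.predc <- rep("ckd", 400)
glm.predc[glm.probab > .5] <- "notckd"
# Confusion matrix and accuracy (fraction of correct predictions)
table(glm.predc, ckd$classification)
mean(glm.predc == ckd$classification)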

Fig.2. Output for Logistic Regression

B. Support Vector Machines (SVM)

The Support Vector Machine, abbreviated as SVM, can be used for both regression and classification tasks. Many researchers favor it because it provides high accuracy with less computational power. In machine learning, SVMs are supervised learning models. SVM can be used to solve both linear and non-linear problems. The algorithm uses a hyperplane to separate the data points. Each data point is plotted as a point in n-dimensional space, with the value of each attribute being the value of a particular coordinate. Classification is accomplished by finding the hyperplane that best separates the two groups, CKD and not CKD. Table III presents the SVM code, and from the results in Fig. 3 we can see that the accuracy of SVM is 0.9925187.
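The role of the hyperplane can be illustrated with a small Python sketch of the linear case: a point is assigned to a class according to the sign of w·x + b. The weight vector, bias and points below are hypothetical and are not learned from the CKD data. A kernel SVM produces a non-linear boundary instead, so this sketch only shows the basic geometric idea.

```python
def decision_value(point, weights, bias):
    """Signed score w.x + b; its sign tells which side of the hyperplane the point is on."""
    return sum(w * x for w, x in zip(weights, point)) + bias


def classify(point, weights, bias):
    """Assign a class label from the side of the hyperplane."""
    return "ckd" if decision_value(point, weights, bias) >= 0 else "notckd"


# Hypothetical hyperplane in a two-attribute space; not learned from the CKD data.
weights = [1.0, -2.0]
bias = 0.5

print(classify([3.0, 1.0], weights, bias))  # score 3 - 2 + 0.5 = 1.5
print(classify([1.0, 2.0], weights, bias))  # score 1 - 4 + 0.5 = -2.5
```

Training an SVM amounts to choosing the weights and bias so that the margin between the hyperplane and the closest points of each class is as large as possible.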

TABLE III: R CODE FOR SVM

# Fix the seed, then sample 70% of the row indices for training
set.seed(1)
ckd1 <- sample(1:nrow(ckd), 0.7 * nrow(ckd))
ckd.train <- ckd[ckd1, ]
ckd.test <- ckd[-ckd1, ]
# Columns 9 to 13 are the predictors; classification is the response.
# Rows with missing values are dropped so that predictions align with labels.
train <- na.omit(data.frame(ckd.train[, 9:13], y = as.factor(ckd.train$classification)))
test <- na.omit(data.frame(ckd.test[, 9:13], y = as.factor(ckd.test$classification)))
library(e1071) ## Support Vector Machine
svmfit <- svm(y ~ ., data = train, kernel = "radial", gamma = 1, cost = 1)
summary(svmfit)
# predict() returns the predicted class label for each test row
svm.pred <- predict(svmfit, newdata = test)
# Confusion matrix and accuracy on the held-out test rows
table(svm.pred, test$y)
mean(svm.pred == test$y)

Fig.3. Output for SVM

C. K-Nearest Neighbors Classification:

The K-nearest neighbor classifier predicts the target variable of a sample from the classes of its nearest neighbors. The nearest neighbors are identified using a distance measure such as Euclidean distance.

Algorithm:

1. Initialize the parameter K.

2. Calculate the distance between the test sample and all

the training samples

3. Sort the distance in the ascending order.

4. Take K-nearest neighbors.

5. Assign the test sample the class most common among its K nearest neighbors.

With this algorithm, the accuracy of KNN is 0.7875.
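The steps of the algorithm above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the code used to obtain the reported accuracy: the function names and the two-attribute toy samples are hypothetical, and the final class is chosen by majority vote among the K neighbors, which is the usual convention.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3):
    # Step 2: distance from the query to every training sample
    distances = [(euclidean(p, query), label)
                 for p, label in zip(train_points, train_labels)]
    # Step 3: sort in ascending order of distance
    distances.sort(key=lambda pair: pair[0])
    # Step 4: keep the k nearest neighbours
    nearest = [label for _, label in distances[:k]]
    # Step 5: majority vote among their classes
    return Counter(nearest).most_common(1)[0][0]


# Made-up two-attribute samples; not taken from the CKD dataset.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["ckd", "ckd", "ckd", "notckd", "notckd", "notckd"]

print(knn_predict(train_points, train_labels, (1.1, 1.0), k=3))
print(knn_predict(train_points, train_labels, (5.0, 4.8), k=3))
```

Because Euclidean distance is sensitive to scale, attributes measured in very different units are normally rescaled before KNN is applied.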

From the results above, it is evident that the Support Vector Machine performs best, with an accuracy of 0.9925187. The results are summarized in Table IV. The subsequent sections discuss these results and conclude this work, with some possible future enhancements.

TABLE IV: SIMULATION RESULTS

Name of Classifier              Accuracy
Logistic Regression             0.7725
Support Vector Machine (SVM)    0.9925187
K-Nearest Neighbour             0.7875

VI. AN OPEN DISCUSSION

Each classifier's results were evaluated using different evaluation parameters and checked for over-fitting with 10-fold cross-validation. Nested cross-validation was also used to fine-tune the model parameters. The tests were carried out using the Python 3.3 programming language through the Jupyter Notebook web application. Several libraries from scikit-learn, a free machine learning library for Python, were used. Accuracy, F1-measure, sensitivity, specificity and Area Under the Curve (AUC) are the evaluation measures considered in this analysis. Each model produces different outputs depending on its parameter values. With the GB model we achieve the best detection performance. This result is better than that obtained using a multilayer perceptron (MLP) with a single split and seven attributes, which gave a 98.4 percent F1-measure. It is also better than that of a study using RF and five features, which obtained a 98.0 percent F1-measure.
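The 10-fold cross-validation mentioned above partitions the samples into ten folds and uses each fold once as the test set while training on the other nine. A minimal Python sketch of the index bookkeeping follows. The helper name `k_fold_indices` is hypothetical, and the folds here are contiguous for simplicity, whereas in practice the rows would normally be shuffled first.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) pairs for k roughly equal folds."""
    # The first (n_samples % k) folds get one extra sample each
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# 400 samples, as in the CKD dataset, split into 10 folds
folds = list(k_fold_indices(400, k=10))
for train_idx, test_idx in folds[:2]:
    print(len(train_idx), len(test_idx))
```

Averaging a metric over the ten test folds gives a less optimistic estimate than scoring a model on the same rows it was trained on.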

Some limitations of the dataset used are, however, important to this analysis. First, the sample size (400 instances) is small and may affect the reliability of the study. Second, it is difficult to find another dataset with the same features against which to assess performance. Readers are also encouraged to read [17, 18, 19, 20, 21 and 22] to learn more about artificial intelligence, machine learning and deep learning techniques and their scope in the near future.

VII. CONCLUSION AND FUTURE WORK

With the aim of diagnosing Chronic Kidney Disease (CKD) at an early stage, this manuscript presented a variety of machine learning algorithms. The models are trained and validated on data obtained from CKD patients using the input parameters mentioned above. Support Vector Machine, Logistic Regression and KNN were analyzed for the study of CKD, and their performance was determined primarily on the basis of accuracy. Our results show that the Support Vector Machine algorithm predicts Chronic Kidney Disease better than Logistic Regression and K-Nearest Neighbors within the narrow limits of this medical scenario. The benefit of this approach is that the prediction process takes far less time, helps doctors to initiate treatment as early as possible for patients with CKD and makes it possible to classify a larger population of patients in a shorter span of time. Because the dataset used in this paper is small, with 400 instances, we intend to work with larger datasets in the future or to compare the results on this dataset with those on a different dataset with the same attributes. In addition, to help reduce the incidence of CKD, we will try to predict, using an appropriate dataset, whether a person with chronic risk factors such as hypertension, family history of kidney failure and diabetes is likely to develop this disease.

AUTHORS' CONTRIBUTIONS

Shashvi and Bhavya have drafted this manuscript. Amit

Kumar Tyagi and Terrance Frederick Fernandez have

analyzed and approved this manuscript for publication.

CONFLICT OF INTEREST

The authors do not have any conflict concerning

publication of this manuscript.

REFERENCES

[1] L. Rubini, "Early stage of chronic kidney disease UCI machine learning repository," 2015. [Online]. Available: http://archive.ics.uci.edu/ml/datasets/Chronic_Kidney_Disease.

[2] Asif Salekin, John Stankovic, "Detection of Chronic Kidney Disease and Selecting Important Predictive Attributes," Proc. IEEE International Conference on Healthcare Informatics (ICHI), IEEE, Oct. 2016, doi:10.1109/ICHI.2016.36.

[3] Q. Zhang and D. Rothenbacher, "Prevalence of chronic kidney disease

in population-based studies: systematic review," BMC Public Health,

vol. 8, (1), pp. 117, 2008.

[4] K. A. Padmanaban and G. Parthiban, "Applying Machine Learning

Techniques for Predicting the Risk of Chronic Kidney Disease," Indian

Journal of Science and Technology, vol. 9, (29), 2016.

[5] J. Xiao et al, "Comparison and development of machine learning tools in

the prediction of chronic kidney disease progression," Journal of

Translational Medicine, vol. 17, (1), pp. 119, 2019.

[6] Sahil Sharma, Vinod Sharma, Atul Sharma, “Performance Based

Evaluation of Various Machine Learning Classification Techniques for

Chronic Kidney Disease Diagnosis,” July18, 2016.

[7] Gunarathne W.H.S.D, Perera K.D.M, Kahandawaarachchi K.A.D.C.P, "Performance Evaluation on Machine Learning Classification Techniques for Disease Classification and Forecasting through Data Analytics for Chronic Kidney Disease (CKD)," 2017 IEEE 17th International Conference on Bioinformatics and Bioengineering.

[8] S.Ramya, Dr.N.Radha, "Diagnosis of Chronic Kidney Disease Using

Machine Learning Algorithms," Proc. International Journal of

Innovative Research in Computer and Communication Engineering, Vol.

4, Issue 1, January 2016.

[9] S. A. Shinde and P. R. Rajeswari, “Intelligent health risk prediction

systems using machine learning: a review,” IJET, vol. 7, no. 3, pp.

1019– 1023, 2018.

[10] A. J. Aljaaf et al, "Early prediction of chronic kidney disease using machine learning supported by predictive analytics," in 2018 IEEE Congress on Evolutionary Computation (CEC), 2018.

[11] J. Xiao et al, "Comparison and development of machine learning tools in the prediction of chronic kidney disease progression," Journal of Translational Medicine, vol. 17, (1), pp. 119, 2019.

[12] P. Yang et al, "A review of ensemble methods in bioinformatics," Current Bioinformatics, vol. 5, (4), pp. 296-308, 2010.

[13] L. Deng et al, "Prediction of protein-protein interaction sites using an ensemble method," BMC Bioinformatics, vol. 10, (1), pp. 426, 2009.

[14] M. Moslem and M. Pasha, "Survey of machine learning algorithms for disease diagnostic," Journal of Intelligent Learning Systems and Applications, vol. 9, (01), pp. 1, 2017.

[15] S. Karamizadeh et al, "Advantage and drawback of support vector machine functionality," in 2014 International Conference on Computer, Communications, and Control Technology (I4CT), 2014.

[16] http://archive.ics.uci.edu/ml/datasets/Chronic_Kidney_Disease

[17] Akshara Pramod, Harsh Sankar Naicker, Amit Kumar Tyagi, “Machine

Learning and Deep Learning: Open Issues and Future Research

Directions for Next Ten Years”, Book: Computational Analysis and

Understanding of Deep Learning for Medical Care: Principles, Methods,

and Applications, 2020, Wiley Scrivener, 2020.

[18] Tyagi, Amit Kumar and G, Rekha, Machine Learning with Big Data

(March 20, 2019). Proceedings of International Conference on

Sustainable Computing in Science, Technology and Management

(SUSCOM), Amity University Rajasthan, Jaipur - India, February 26-

28, 2019.

[19] Amit Kumar Tyagi, Poonam Chahal, “Artificial Intelligence and

Machine Learning Algorithms”, Book: Challenges and Applications for

Implementing Machine Learning in Computer Vision, IGI Global, 2020.

[20] Amit Kumar Tyagi, G. Rekha, “Challenges of Applying Deep Learning

in Real-World Applications”, Book: Challenges and Applications for

Implementing Machine Learning in Computer Vision, IGI Global 2020,

p. 92-118.

[21] Terrance Frederick Fernandez and M. Pradeep, "Multi-level Predictive

with Training Framework (MP with TF) for ranking machine learning

algorithms", IEEE proceeding of 4th International conference on I-

SMAC (IoT in Social, Mobile, Analytics and Cloud (I-SMAC 2020),

pp.697-703, ISBN: 978-1-7281-5464-0/20, 7th to 9th October 2020,

SCAD Palladum, Tamil Nadu.

[22] Aravindan C, Terrance Frederick Fernandez, Hema Malini V and

Catherine Madhu Vidha J, “An Extensive Research on Cyber Threats

using Learning Algorithm”, IEEE proceeding of International

Conference on Emerging Trends in Information Technology and

Engineering, ISBN: 978-1-7281-4141-1, 25th February 2020, Vellore,

India.

Machine Learning-Based Identification and Forecasting of Chronic Illnesses

Article

Full-text available

Jun 2024

People today suffer from a wide range of illnesses as a result of their lifestyle choices and the state of the environment. In order to stop such diseases from getting worse, it is crucial to recognize and anticipate them early on. Most of the time, doctors find it challenging to precisely identify the disorders by hand. This article aims to predict and identify patients with more prevalent chronic illnesses. This might be accomplished by making sure that this classification accurately identifies those who have chronic illnesses by applying a state-of-the-art machine learning technique. Another difficult task is predicting diseases. Therefore, data mining is essential to the prediction of disease. By using machine learning algorithms like convolutional neural network (CNN) for automatic feature extraction and disease prediction and K-nearest neighbor (KNN) for distance calculation to find the exact match in the data set and the final disease prediction outcome, the proposed system provides a broad disease prognosis based on the patient's symptoms. The creation of the data set involved gathering information on the symptoms of the sickness, the individual's lifestyle, and specifics about medical consultations, all of which were factored into this broad illness prediction. In conclusion, this research presents a comparative analysis of the suggested system using different algorithms, including logistic regression, decision trees, and Naive Bayes.

On the diagnosis of chronic kidney disease using a machine learning-based interface with explainable artificial intelligence

Article

Full-text available

Jun 2024

Chronic Kidney Disease (CKD) is increasingly recognised as a major health concern due to its rising prevalence. The average survival period without functioning kidneys is typically limited to approximately 18 days, creating a significant need for kidney transplants and dialysis. Early detection of CKD is crucial, and machine learning methods have proven effective in diagnosing the condition, despite their often opaque decision-making processes. This study utilised explainable machine learning to predict CKD, thereby overcoming the 'black box' nature of traditional machine learning predictions. Of the six machine learning algorithms evaluated, the extreme gradient boost (XGB) demonstrated the highest accuracy. For interpretability, the study employed Shapley Additive Explanations (SHAP) and Partial Dependency Plots (PDP), which elucidate the rationale behind the predictions and support the decision-making process. Moreover, for the first time, a graphical user interface with explanations was developed to diagnose the likelihood of CKD. Given the critical nature and high stakes of CKD, the use of explainable machine learning can aid healthcare professionals in making accurate diagnoses and identifying root causes.

Federated-Reinforcement Learning-Assisted IoT Consumers System for Kidney Disease Images

Article

Full-text available

Apr 2024
IEEE T CONSUM ELECTR

The number of people with kidney disease rises every day for many reasons. Many existing machine-learning-enabled mechanisms for processing kidney disease suffer from long delays and consume much more resources during processing. In this paper, the study shows how federated and reinforcement learning schemes can be used to develop the best delay scheme. The scheme must optimize both the internal and external states of reinforcement learning and the federated learning fog cloud network. This work presents the Adaptive Federated Reinforcement Learning-Enabled System (AFRLS) for Internet of Things (IoT) consumers’ kidney disease image processing. The main relationship between IoT consumers and kidney image is that the data is collected from different IoT consumer sources, such as ultrasound and X-rays in healthcare clinics. In healthcare applications, kidney urinary tasks reduce the time it takes to preprocess federated learning datasets for training and testing and run them on different fog and cloud nodes. AFRLS decides the scheduling on other nodes and improves constraints based on the decision tree. Based on the simulation results, AFRLS is a new strategy that reduces the time tasks need to be delayed compared to other machine learning methods used in fog cloud networks. The AFRLS improved the delay among nodes by 55%, the delay among internal states by 40%, and the training and testing delay by 51%.

Blockchain-Based Intelligent, Interactive Healthcare Systems

Chapter

May 2024

When it comes to the smart healthcare sector, blockchain technology presents several prospects. Aside from its usage in the financial industry, blockchain technology is now also utilised in the process of establishing trust, protecting privacy, and ensuring security. Within the scope of this work, we will provide an explanation of a new development in the healthcare business that strives to enhance the effectiveness and safety of the administration of healthcare data. We employ blockchain technology to construct a decentralised and tamper-proof network that facilitates safe data exchange among healthcare stakeholders such as patients, providers, and insurers. This technique is known as Blockchain-based Intelligent and Interactive Healthcare Systems (Blockchain-based IHS). The purpose of this chapter is to present an overview of BIIHS, including its advantages, disadvantages, and potential future paths. The BIIHS has the potential to enhance patient outcomes by facilitating personalised treatment plans, lowering the number of medical mistakes, and offering real-time access to vital and sensitive health data. Nevertheless, in order to fully realise the promise of BIIHS, it is necessary to solve problems such as regulation compliance, interoperability, and privacy concerns. Artificial intelligence and the internet of things are two examples of upcoming technologies that might be included into BIIHS in the future. This would allow for the healthcare sector to further improve its capabilities.

Digital Twin-Based Smart Healthcare Services for the Next Generation Society

Chapter

May 2024

In today's smart era, the healthcare landscape is rapidly evolving, driven by advancements in technology and the growing healthcare needs of an aging and increasingly interconnected society. To address these challenges, the concept of digital twins has emerged as a promising solution to transform healthcare services for the next generation. This work provides an overview of the key aspects and benefits of digital twin-based smart healthcare services and their potential to revolutionize the healthcare industry. A digital twin involves creating a digital replica or model of a physical entity, in this case, an individual's health and medical data. By harnessing real-time data from various sources, including wearable devices, electronic health records, and medical imaging, digital twins provide a holistic view of an individual's health status, treatment history, and predictive analytics for future health outcomes. This data-driven approach enables healthcare providers to make more informed decisions and tailor personalized treatment plans, improving patient outcomes.

Early Detection of Chronic Kidney Disease Using Different Machine Learning Algorithms

Conference Paper

Mar 2024

Kidney disease encompasses various abnormalities in renal function, ranging from subtle damage to severe conditions such as excessive cell expansion, impaired blood filtration, and the deposition of crystalline minerals. Recognizing its significant impact on mortality and years of life lost, early detection becomes paramount in providing timely and focused medical interventions. This research employs a diverse set of machine learning algorithms, including K-Nearest Neighbors, Decision Tree Classifier, Random Forest Classifier, AdaBoost Classifier, Gradient Boosting Classifier, Stochastic Gradient Boosting, XGBoost, CatBoost Classifier, Extra Trees Classifier, Light Gradient Boosting Machine (LGBM) Classifier, Logistic Regression, Support Vector Machine (SVM), Naive Bayes, and Artificial Neural Network (ANN). Evaluation of these algorithms reveals outstanding performance by AdaBoost, XGBoost, and the Light Gradient Boosting Machine (LGBM) Classifier, each achieving an accuracy of 98%. This study underscores the pivotal role of machine learning in the early prediction of kidney disease, paving the way for personalized patient care.

A Review on Kidney Failure Prediction Using Machine Learning Models

Chapter

Apr 2024

End-stage renal disease (ESRD), commonly known as kidney failure, is a critical medical condition that has a significant impact on global health. Early detection of kidney failure is crucial in preventing and managing this condition. In recent years, machine learning (ML) models have emerged as promising tools for predicting kidney failure, offering the potential to improve patient outcomes through timely intervention. This comprehensive review provides an overview of the current state of research on kidney failure prediction using various ML models. The review begins by presenting an overview of kidney failure, its prevalence, and the challenges associated with its early detection. It then delves into the role of ML in healthcare and specifically focuses on its application in predicting kidney failure. The discussion encompasses a wide range of ML techniques, including logistic regression, decision trees, support vector machines, and deep learning. The review analyzes key studies and methodologies employed in predicting kidney failure, highlighting the strengths and limitations of different ML approaches. It emphasizes the importance of feature selection, data preprocessing, and model evaluation in enhancing the accuracy and reliability of predictions. Furthermore, it addresses the issue of data imbalance, a common challenge in medical datasets, and explores strategies to mitigate its impact on model performance. In addition to summarizing existing research, the review identifies current gaps in the literature and suggests avenues for future research. This includes the exploration of novel data sources, the integration of multi-modal data, and the development of interpretable models that can assist healthcare professionals in making informed decisions. Overall, this review serves as a valuable resource for researchers, clinicians, and healthcare professionals interested in the application of ML models for kidney failure prediction. 
By synthesizing the current state of knowledge, it provides insights into the potential of ML models to improve patient outcomes and highlights areas for further research.

Prediction of Kidney Disease Utilizing a Hybrid Deep Learning Methodology

Conference Paper

Feb 2024

A Comparative Analysis of Machine Learning and Deep Learning Approaches for Prediction of Chronic Kidney Disease Progression

Article

Mar 2024

Chronic kidney disease is a significant health problem worldwide that affects millions of people, and early detection of this disease is crucial for successful treatment and improved patient outcomes. In this research paper, we conducted a comprehensive comparative analysis of several machine learning algorithms, including logistic regression, Gaussian Naive Bayes, Bernoulli Naive Bayes, Support Vector Machine, X Gradient Boosting, Decision Tree Classifier, Grid Search CV, Random Forest Classifier, AdaBoost Classifier, Gradient Boosting Classifier, XgBoost, Cat Boost Classifier, Extra Trees Classifier, KNN, MLP Classifier, Stochastic gradient descent, and Artificial Neural Network, for the prediction of kidney disease. In this study, a dataset of patient records was utilized, where each record consisted of twenty-five clinical features, including hypertension, blood pressure, diabetes mellitus, appetite and blood urea. The results of our analysis showed that Artificial Neural Network (ANN) outperformed other machine learning algorithms with a maximum accuracy of 100%, while Gaussian Naive Bayes had the lowest accuracy of 94.0%. This suggests that ANN can provide accurate and reliable predictions for kidney disease. The comparative analysis of these algorithms provides valuable insights into their strengths and weaknesses, which can help clinicians choose the most appropriate algorithm for their specific requirements.

Machine Learning and Deep Learning: Open Issues and Future Research Directions for the Next 10 Years

Chapter

Jul 2021

With the development of technology, many technologies such as machine learning (ML), deep learning, blockchain, the Internet of Things, and quantum computing have emerged in the current era. These technologies help human beings live comfortably and without hurdles. Today, technology helps humans and protects nature with minimal waste of limited resources. Among these inventions, ML and deep learning are two that have attracted many researchers and research communities to solve complex problems. ML is now used in many sectors to increase business productivity, for example in retail and marketing, customer churn prediction, e-healthcare, and early-stage disease detection. These are a few examples of where ML is used in the current smart era. Deep learning has also grown in importance relative to ML in many applications such as bioinformatics, health informatics, recognition of images or handwriting, and audio recognition. Many researchers face difficulty when they are unsure whether machine learning or deep learning suits a particular use. This work addresses that need and provides complete details about ML and deep learning, from their evolution and forefront use, through their applications and benefits to society, to the challenges and potential limitations of the respective learning techniques.

Challenges of Applying Deep Learning in Real-World Applications

Chapter

Jan 2020

Due to developments in technology, millions of Internet of Things (IoT) devices are generating a large amount of data (called big data). This data requires analysis processes, analytics tools, or techniques. Over the past several decades, a lot of research has used data mining, machine learning, and deep learning techniques. Here, machine learning is a subset of artificial intelligence and deep learning is a subset of machine learning. Deep learning is more efficient than machine learning (in terms of the accuracy of results) because it uses perceptrons and neurons with the back propagation method (i.e., these techniques solve a problem by learning by themselves, without being programmed by a human being). Deep learning is used in several applications such as healthcare and retail (or any real-world problem). However, using deep learning techniques in such applications creates several problems and raises several critical issues and challenges, which need to be overcome to obtain accurate results.

Artificial Intelligence and Machine Learning Algorithms

Chapter

Jan 2020

With recent developments in technology and the integration of millions of Internet of Things devices, a lot of data is being generated every day (known as Big Data). This data is required to support the growth of several organizations and applications such as e-healthcare. We are also entering the era of a smart world, where robotics will feature in most applications (to solve the world's problems). Implementing robotics in applications such as medicine and automobiles is a goal of computer vision. Computer vision (CV) is realized through several components such as artificial intelligence (AI), machine learning (ML), and deep learning (DL). Here, machine learning and deep learning techniques and algorithms are used to analyze Big Data. Today, various organizations such as Google and Facebook use ML techniques to search for particular data or recommend posts. Hence, the requirements of computer vision are fulfilled through these three terms: AI, ML, and DL.

Comparison and development of machine learning tools in the prediction of chronic kidney disease progression

Article

Apr 2019
J TRANSL MED

Background: Urinary protein quantification is critical for assessing the severity of chronic kidney disease (CKD). However, the current procedure for determining the severity of CKD is completed through evaluating 24-h urinary protein, which is inconvenient during follow-up. Objective: To quickly predict the severity of CKD using more easily available demographic and blood biochemical features during follow-up, we developed and compared several predictive models using statistical, machine learning and neural network approaches. Methods: The clinical and blood biochemical results from 551 patients with proteinuria were collected. Thirteen blood-derived tests and 5 demographic features were used as non-urinary clinical variables to predict the 24-h urinary protein outcome response. Nine predictive models were established and compared, including logistic regression, Elastic Net, lasso regression, ridge regression, support vector machine, random forest, XGBoost, neural network and k-nearest neighbor. The AU-ROC, sensitivity (recall), specificity, accuracy, log-loss and precision of each of the models were evaluated. The effect sizes of each variable were analysed and ranked. Results: The linear models including Elastic Net, lasso regression, ridge regression and logistic regression showed the highest overall predictive power, with an average AUC and a precision above 0.87 and 0.8, respectively. Logistic regression ranked first, reaching an AUC of 0.873, with a sensitivity and specificity of 0.83 and 0.82, respectively. The model with the highest sensitivity was Elastic Net (0.85), while XGBoost showed the highest specificity (0.83). In the effect size analyses, we identified that ALB, Scr, TG, LDL and EGFR had important impacts on the predictability of the models, while other predictors such as CRP, HDL and SNA were less important. Conclusions: Blood-derived tests could be applied as non-urinary predictors during outpatient follow-up. Features in routine blood tests, including ALB, Scr, TG, LDL and EGFR levels, showed predictive ability for CKD severity. The developed online tool can facilitate the prediction of proteinuria progress during follow-up in clinical practice. Electronic supplementary material: The online version of this article (10.1186/s12967-019-1860-0) contains supplementary material, which is available to authorized users.
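The evaluation measures named in this abstract (AU-ROC, sensitivity, specificity, precision, accuracy) can be illustrated with a minimal sketch in plain Python. The labels, scores, 0.5 threshold, and function names below are made-up assumptions for illustration only; they are not data or code from the study.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(y_true, y_pred):
    """Sensitivity (recall), specificity, precision and accuracy as a dict."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "precision": safe_div(tp, tp + fp),
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return safe_div(wins, len(pos) * len(neg))


# Hypothetical labels and model scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a few hundred records; a rank-based implementation would be preferable for larger datasets.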

Intelligent health risk prediction systems using machine learning: A review

Article

Jun 2018

Humans are considered the most intelligent species on earth and are inherently health conscious. Over the centuries, mankind has discovered various proven healthcare systems. To automate the process and predict diseases more accurately, machine learning methods are gaining popularity in the research community. Machine learning methods build intelligence into a machine so that it can perform better in the future using learned experience. Applying machine learning methods to electronic health record datasets could provide valuable information and predictions of health risks. The aims of this review paper are four-fold: i) to serve as a guideline for researchers who are new to the machine learning area and want to contribute to it, ii) to provide a state-of-the-art survey of machine learning, iii) to cover the application of machine learning techniques in health prediction, and iv) to provide further research directions for health prediction systems using machine learning.

Survey of Machine Learning Algorithms for Disease Diagnostic

Article

Jan 2017

In medical imaging, Computer Aided Diagnosis (CAD) is a rapidly growing, dynamic area of research. In recent years, significant attempts have been made to enhance computer aided diagnosis applications, because errors in medical diagnostic systems can result in seriously misleading medical treatments. Machine learning is important in Computer Aided Diagnosis. Objects such as organs may not be indicated accurately by a simple equation, so pattern recognition fundamentally involves learning from examples. In the biomedical field, pattern recognition and machine learning promise improved accuracy in the perception and diagnosis of disease. They also promote the objectivity of the decision-making process. For the analysis of high-dimensional and multimodal biomedical data, machine learning offers a worthy approach for building sophisticated and automatic algorithms. This survey paper provides a comparative analysis of different machine learning algorithms for the diagnosis of diseases such as heart disease, diabetes, liver disease, dengue and hepatitis. It draws attention to the suite of machine learning algorithms and tools that are used for the analysis of diseases and the corresponding decision-making process.

Multi-level Predictive with Training Framework (MP with TF) for ranking machine learning algorithms

Conference Paper

Oct 2020

An Extensive Research on Cyber Threats using Learning Algorithm

Conference Paper

Feb 2020

Early Prediction of Chronic Kidney Disease Using Machine Learning Supported by Predictive Analytics

Conference Paper

Jul 2018

Chronic Kidney Disease is a serious lifelong condition that is induced by either kidney pathology or reduced kidney function. Early prediction and proper treatment can possibly stop or slow the progression of this chronic disease to end-stage, where dialysis or kidney transplantation is the only way to save the patient’s life. In this study, we examine the ability of several machine-learning methods to predict Chronic Kidney Disease early. This matter has been studied widely; however, we support our methodology with predictive analytics, in which we examine the relationships among data parameters as well as with the target class attribute. Predictive analytics enables us to identify the optimal subset of parameters to feed to machine learning in order to build a set of predictive models. This study starts with 24 parameters in addition to the class attribute and ends up with 30% of them as the ideal subset to predict Chronic Kidney Disease. A total of 4 machine learning based classifiers have been evaluated within a supervised learning setting, achieving highest performance outcomes of AUC 0.995, sensitivity 0.9897, and specificity 1. The experimental procedure concludes that advances in machine learning, with the assistance of predictive analytics, represent a promising setting in which to identify intelligent solutions, which in turn prove the ability of prediction in the kidney disease domain and beyond.
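The idea of examining how each parameter relates to the target class and keeping roughly 30% of the parameters can be sketched as a simple correlation filter. This is only an illustration under stated assumptions: the abstract does not say which selection technique was used, so ranking by absolute Pearson correlation is a stand-in, and the feature names, values, and the `select_top_features` helper are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.
    Returns 0.0 when either sequence has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_top_features(columns, target, keep_fraction=0.3):
    """Rank features by |correlation with the target| and keep the top share.
    At least one feature is always kept."""
    strength = {name: abs(pearson(values, target)) for name, values in columns.items()}
    keep = max(1, round(len(columns) * keep_fraction))
    ranked = sorted(strength, key=strength.get, reverse=True)
    return ranked[:keep]


# Hypothetical toy records (1 = CKD, 0 = not CKD); values are invented.
target = [1, 1, 1, 0, 0, 0]
columns = {
    "serum_creatinine": [4.1, 3.8, 5.0, 0.9, 1.1, 1.0],
    "hemoglobin": [9.5, 10.1, 11.0, 14.8, 13.2, 12.5],
    "age": [60, 45, 52, 50, 58, 47],
    "constant_flag": [1, 1, 1, 1, 1, 1],
}

selected = select_top_features(columns, target, keep_fraction=0.3)
print(selected)
```

A univariate filter like this ignores interactions between parameters, so it should be read as a baseline rather than a reconstruction of the study's predictive-analytics step.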

Performance Evaluation on Machine Learning Classification Techniques for Disease Classification and Forecasting through Data Analytics for Chronic Kidney Disease (CKD)

Conference Paper

Oct 2017
