Performance Evaluation of Random Forests and Artificial Neural Networks for the Classification of Liver Disorder

February 2018

DOI:10.1109/IC4ME2.2018.8465658

Conference: International Conference on Computer, Communication, Chemical, Materials and Electronic Engineering
At: Rajshahi, Bangladesh

Authors:

Md. Rezwanul Haque

Khulna University of Engineering and Technology

Md. Milon Islam

University of Waterloo

Hasib Iqbal

Khulna University of Engineering and Technology

Md. Rezwanul Haque, Md. Milon Islam, Hasib Iqbal, Md. Sumon Reza, and Md. Kamrul Hasan

Department of Computer Science and Engineering
Khulna University of Engineering & Technology, Khulna-9203, Bangladesh
Daffodil International University, Dhaka-1207, Bangladesh

r.haque.249.rh@gmail.com, milonislam@cse.kuet.ac.bd, pranto00250@gmail.com, sumon.info2015@gmail.com, and mhgolap11@gmail.com

Abstract- The liver is a major organ of the human body that supports digesting food, eliminating toxins, and storing energy. The number of patients with liver disorder is rising rapidly all over the world. However, the disorder is very hard to identify from its ambiguous symptoms, which increases the mortality rate of the disease. This paper presents an expert scheme for the classification of liver disorder using Random Forests (RFs) and Artificial Neural Networks (ANNs). Both methods are trained on the input features using 10-fold cross-validation. The BUPA liver dataset, retrieved from the UCI machine learning repository, is used for the study. The performance of the proposed scheme is assessed in terms of accuracy, positive predictive value, negative predictive value, sensitivity, specificity, and F1 score. The scheme delivers a better result for training but a comparatively lower one for testing. In the testing phase, the scheme obtained accuracies of 80% and 85.29% with RFs and ANNs, respectively, along with F1 scores of 75.86% and 82.76%.

Keywords- Liver Disorder; Random Forests; Artificial Neural Networks; Performance Measure Indices.

I. INTRODUCTION

According to UK national statistics [1], liver disorder is the fifth most common cause of death in the UK. It is also recognized as the second leading cause of death among all digestive diseases in the US [2]. Recent statistics from the International Liver Congress [3] show that about 29 million people in the European region suffer from a chronic liver condition and about 30 million people in America have a liver disorder.

The liver is the largest solid organ in the human body; because it produces and secretes bile, it is also considered a gland. The liver is located in the upper right portion of the abdomen, where it is surrounded by the rib cage. The liver performs many complex tasks inside the body, and when it is damaged these tasks cannot be completed, which leads to illness. Liver disorder is any disturbance of liver function that causes illness [4]. Alcohol consumption plays a major role in liver disorder. Parasites and viruses may infect the liver, causing inflammation that reduces liver function. An abnormal gene inherited from the parents can also cause liver disorder.

It is generally very difficult to detect liver disorder because its symptoms are ambiguous and easily confused with those of other health problems [5]. Doctors typically use various types of blood tests to check whether the liver is functioning properly. The indicators used in liver tests (LTs) include gamma-glutamyl transpeptidase (GGT), alanine aminotransferase (ALT), alkaline phosphatase (ALP), and aspartate aminotransferase (AST), which are found in liver cells. Elevated levels of these enzymes in the blood generally indicate liver disorder. ALT and AST are used for the accurate diagnosis of liver disorder in cases of viral hepatitis, and abnormal ALP and GGT levels also point to liver disorder [6].

The techniques described above cannot always deliver an exact and consistent result, and they depend on doctors, physicians, or other medical staff. A scheme that can operate without medical equipment or medical staff could therefore be a suitable solution. We introduce an expert scheme that classifies the input attributes according to the selector (class label) of liver disorder. We use two supervised machine learning techniques, RFs and ANNs, which are learning algorithms that analyze data for classification and regression, to identify liver disorder.

The rest of the paper is organized as follows: Section II reviews the related works in this field. The working principles of RFs and ANNs are described in Section III. The proposed methodology and its steps are explained in Section IV. The implementation and the analysis of the results are presented in Section V. Section VI concludes the paper.

II. RELATED WORKS

Several techniques have recently been developed for the classification of liver disorder. The recent works in this field are described briefly as follows.

Ramana et al. [7] proposed a technique for the detection of liver disorder using selected classification algorithms. The approaches used are Support Vector Machines (SVM), C4.5, the Naïve Bayes classifier (NBC), and a Back Propagation Neural Network, and the performance of the technique is measured in terms of accuracy, precision, sensitivity, etc. They simulated their work in WEKA. The accuracy obtained by the technique is 56.62% with NBC, 68.69% with C4.5, 71.59% with Back Propagation, 62.89% with K-NN, and 58.26% with SVM. They reported the results obtained after tuning the features.

In [8], the authors demonstrated a scheme for the identification of liver disorder using KNN with Euclidean distance. The results of the KNN-based method are compared with those of other classifiers. The scheme obtained an accuracy of 92.53% in testing and 100% in training.

Bahramirad et al. [9] conducted a comparative study for the detection of liver disorder. Rapid Miner and Weka are used for simulation. They used eleven classification algorithms in their study. The highest accuracy obtained by their technique is 73.91%, using Neural Net and Gaussian Processes. They also measured precision and recall for all the algorithms. The precision and recall of the scheme are 79.01% and 79.52%, using Gaussian Processes and Neural Net, respectively.

The authors in [10] surveyed and compared various data mining algorithms for the prediction of liver disorder using Rapid Miner and IBM SPSS Modeler. C5.0, C4.5, Decision Tree, and Neural Network are used as classification algorithms. The C4.5 and C5.0 algorithms obtained accuracies of 72.37% and 87.91% using the IBM SPSS Modeler and Rapid Miner tools, respectively.

Dixon et al. [11] proposed a scheme based on artificial immune systems for liver disorder diagnosis that works on blood test outcomes. The scheme employed ANN and other classification algorithms. The detection rate obtained by the scheme is 81.18% using MVD.

III. THEORETICAL EXPLANATION

Learning procedures in machine learning can be divided into two principal categories: supervised and unsupervised learning. Supervised learning is used to predict a certain outcome from a given input when examples of input-output pairs are available. The machine learning model is built from these input-output pairs, which comprise the training set. The goal is to make accurate predictions for new, never-before-seen data.

A. Random Forests

Random Forests (RFs) classification is a supervised classification technique. It is an ensemble method that can be thought of as a form of nearest neighbor predictor. Ensemble learning is the process by which multiple models, such as classifiers or experts, are strategically generated and combined to solve a particular computational intelligence problem. Random Forests are essentially a collection of decision trees: the model is built from a number of trees [12], [13].

The basic steps of the Random Forests technique are as follows.

Step 1: Pick K random data points from the training set.

Step 2: Build the decision tree associated with these K data points.

Step 3: Choose the number N of trees to build and repeat Steps 1 and 2.

Step 4: For a new data point, let each of the N trees predict the category to which the data point belongs, and assign the new data point to the category that wins the majority vote.
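The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: each "tree" is simplified to a one-split decision stump, the K points are drawn with replacement, and the names `fit_stump`, `fit_forest`, and `predict_forest` are hypothetical helpers introduced only for this example.

```python
import random
from collections import Counter


def fit_stump(points, labels):
    """Fit a one-split tree (decision stump) that minimizes training errors."""
    fallback = Counter(labels).most_common(1)[0][0]
    best = None
    for f in range(len(points[0])):
        for threshold in sorted({p[f] for p in points}):
            left = [y for p, y in zip(points, labels) if p[f] <= threshold]
            right = [y for p, y in zip(points, labels) if p[f] > threshold]
            # Each side predicts its own majority class.
            left_label = Counter(left).most_common(1)[0][0] if left else fallback
            right_label = Counter(right).most_common(1)[0][0] if right else fallback
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    return best[1:]


def predict_stump(stump, point):
    f, threshold, left_label, right_label = stump
    return left_label if point[f] <= threshold else right_label


def fit_forest(points, labels, n_trees=10, k=None, seed=1234):
    rng = random.Random(seed)
    k = k or len(points)
    forest = []
    for _ in range(n_trees):  # Step 3: repeat Steps 1 and 2 for N trees
        # Step 1: pick K random data points (with replacement, an assumption)
        idx = [rng.randrange(len(points)) for _ in range(k)]
        # Step 2: build the tree for these K points (a stump in this sketch)
        forest.append(fit_stump([points[i] for i in idx],
                                [labels[i] for i in idx]))
    return forest


def predict_forest(forest, point):
    # Step 4: every tree votes; the majority category wins
    votes = Counter(predict_stump(stump, point) for stump in forest)
    return votes.most_common(1)[0][0]
```

A full Random Forest grows deeper trees and also samples a random subset of features at each split; the stump keeps the example short while preserving the sample, build, repeat, and vote structure.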

Fig. 1. Working procedure of Artificial Neural Networks [12].

B. Artificial Neural Networks

Artificial neural networks are based on a rather simple model of a neuron. Most neurons have three parts: a dendrite, which collects inputs from other neurons (or from an external stimulus); a soma, which performs an important nonlinear processing step; and an axon, a cable-like wire along which the output signal is transmitted to other neurons further down the processing chain.

In a neural network, the output value can be continuous, binary, or categorical. Here, we apply a multilayer perceptron (also known as a feed-forward network) for classification and regression, which can serve as a starting point for more involved deep learning methods. In the construction of this network, the number of neurons in the input layer equals the number of features available for deciding about each sample; here the input layer has six neurons. The working procedure of Artificial Neural Networks [12] is illustrated in Fig. 1. The input nodes of the network thus correspond to the attributes of the samples. The remaining part of the network is the hidden layer. A single hidden layer is used because one hidden layer is sufficient for most problems [12].
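The feed-forward computation of such a network, with six inputs, one hidden layer, and one output, can be sketched as follows. The sigmoid activation, the hidden width of two, and all weight values are assumptions chosen for illustration; the text does not state the activation function or the trained weights, and the inputs are assumed to be scaled to a small range.

```python
import math


def sigmoid(z):
    """Logistic activation, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Feed-forward pass: input layer -> one hidden layer -> one output neuron."""
    # Each hidden neuron takes a weighted sum of all inputs plus a bias.
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    # The output neuron combines the hidden activations in the same way.
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


# Six scaled input features and two hidden neurons (illustrative values only).
x = [0.4, 0.1, 0.7, 0.3, 0.9, 0.2]
w_hidden = [[0.5, -0.2, 0.1, 0.3, -0.4, 0.2],
            [-0.3, 0.4, 0.2, -0.1, 0.5, -0.2]]
b_hidden = [0.0, 0.1]
w_out = [0.8, -0.6]
b_out = 0.05

probability = forward(x, w_hidden, b_hidden, w_out, b_out)
print(probability)  # a value strictly between 0 and 1
```

Training would adjust the weights and biases by backpropagation; only the prediction step is shown here.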

IV. THE PROPOSED METHODOLOGY

A. Data Collection and Preparation

The liver disorder dataset, named BUPA, is collected from the University of California, Irvine (UCI) machine learning repository [14]. The dataset contains 345 records of liver patients, each labeled as either class label 1 or class label 2: 145 (42.03%) of the cases have class label 1 and 200 (57.97%) have class label 2. The dataset has 7 attributes; the six features, excluding the class label, are as follows.

• MCV (x1)
• Alkphos (x2)
• SGPT (x3)
• SGOT (x4)
• Gammagt (x5)
• Drinks / Day (x6)

If the diagnosis is positive, the record falls under class label 1; if the diagnosis is negative, it falls under class label 2. The Pearson correlation is a measure of the linear correlation between two attributes. The correlation among the six attributes is calculated for class label 1 and class label 2; the results, which indicate a high correlation between the positive and negative classes, are illustrated in Fig. 2 and Fig. 3. All correlations among the attributes are positive.

Fig. 2. Pearson correlation for class label 1.

Fig. 3. Pearson correlation for class label 2.
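As a small illustration of how such a correlation matrix can be computed, the sketch below implements the Pearson coefficient directly from its definition. The function names and the toy columns are assumptions made for this example; the real analysis would pass the six attribute columns of one class.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_matrix(columns):
    """Pairwise Pearson correlations between a list of attribute columns."""
    return [[pearson(ci, cj) for cj in columns] for ci in columns]


# Toy columns standing in for attributes x1..x3 (not values from the dataset).
columns = [[1.0, 2.0, 3.0, 4.0],
           [2.0, 4.0, 6.0, 8.0],
           [4.0, 3.0, 2.0, 1.0]]
for row in correlation_matrix(columns):
    print([round(value, 2) for value in row])
```

The matrix is symmetric with ones on the diagonal, which is why Fig. 2 and Fig. 3 carry the same values above and below the diagonal.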

B. Training and Testing Phase

The training phase fits the model to the features of the data set, and the testing phase measures how well the model makes predictions on the testing set. In k-fold cross-validation, the given data set is split into k chunks of equal size. A single chunk is used for testing and the remaining k-1 chunks are used for training. The process is repeated k times, so that every record is used for both training and testing. k-fold cross-validation helps to avoid overfitting.
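The splitting procedure can be sketched as an index generator. This is an illustrative helper (`k_fold_indices` is a name introduced here, and records are not shuffled, which is an assumption); when the number of records is not divisible by k, the first folds receive one extra record.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds get one additional record.
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# With the 345 records of the dataset and k = 10, the test folds
# hold either 35 or 34 records each.
fold_sizes = [len(test) for _, test in k_fold_indices(345, 10)]
print(fold_sizes)
```

Test folds of 35 and 34 records are consistent with the totals of the testing-phase confusion matrices reported for the two models.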

C. The Performance Measure Indices

The performance of machine learning techniques is measured in terms of performance measure indices. A confusion matrix of actual versus predicted classes, comprising TP, FP, TN, and FN, is formed to evaluate these indices. The meaning of the terms is given below.

TP = True Positive (Correctly Identified)

TN = True Negative (Correctly Rejected)

FP = False Positive (Incorrectly Identified)

FN = False Negative (Incorrectly Rejected)
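The four counts can be tallied from the actual and predicted labels as in the sketch below. It assumes that class label 1 is the positive class, following the dataset description; the function name and the toy labels are illustrative only.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (TP, TN, FP, FN) for a binary classification problem."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if a == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, tn, fp, fn


# Toy labels (not taken from the dataset): 1 = positive, 2 = negative.
actual = [1, 1, 2, 2, 1, 2]
predicted = [1, 2, 2, 1, 1, 2]
print(confusion_counts(actual, predicted))  # (2, 2, 1, 1)
```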

The performance of the proposed system is measured by the formulas given in Eqs. (1)-(6).

V. IMPLEMENTATION AND RESULTS ANALYSIS

We have developed a model using Random Forests and Artificial Neural Networks, implemented on a computer with an Intel Core i5 processor and 8 GB of RAM. We used Scikit-learn, an open-source machine learning library for Python. The Spyder integrated development environment is used to run the program.

We set some tuning parameters for both models. The tuning parameters are shown in Table I.

We used the 10-fold cross-validation technique, i.e., the data set was split into 10 chunks. The 10-fold technique is used to validate the model: in each round, 9 folds are used for training and the remaining one for testing.

TABLE I. TUNING PARAMETER

No.   RFs                          ANNs
1     Number of estimators = 10    Hidden layer size = 2
2     Criterion = 'entropy'        Input dimension = 6
3     Random state = 1234          Batch size = 12
4     Cross-validation = 10        Cross-validation = 10
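The settings in Table I map naturally onto Scikit-learn estimators, as in the hedged sketch below. Several points are assumptions rather than facts from the paper: the data are synthetic (generated with `make_classification` as a stand-in for the 345 x 6 BUPA feature matrix), `MLPClassifier` is used as one possible ANN implementation, and `max_iter` and the ANN random state are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the 345 records with six features (not the real data).
X, y = make_classification(n_samples=345, n_features=6, n_informative=4,
                           n_redundant=0, random_state=1234)

# RFs column of Table I.
rf = RandomForestClassifier(n_estimators=10, criterion="entropy",
                            random_state=1234)

# ANNs column of Table I: one hidden layer of size 2, batch size 12.
# The input dimension (6) follows from the shape of X.
ann = MLPClassifier(hidden_layer_sizes=(2,), batch_size=12,
                    max_iter=300, random_state=1234)

# Cross-validation = 10 for both models; the default score is accuracy.
rf_scores = cross_val_score(rf, X, y, cv=10)
ann_scores = cross_val_score(ann, X, y, cv=10)

print("RFs  mean accuracy:", rf_scores.mean())
print("ANNs mean accuracy:", ann_scores.mean())
```

Because the data here are synthetic, the printed accuracies say nothing about the results reported in the paper; the sketch only shows how the parameters fit together.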

Pearson correlation for class label 1 (Fig. 2)

      x1     x2     x3     x4     x5     x6
x1    1      0.08   0.21   0.27   0.3    0.43
x2    0.08   1      0.02   0.16   0.17   0.06
x3    0.21   0.02   1      0.68   0.6    0.42
x4    0.27   0.16   0.68   1      0.56   0.43
x5    0.3    0.17   0.6    0.56   1      0.49
x6    0.43   0.06   0.42   0.43   0.49   1

Pearson correlation for class label 2 (Fig. 3)

      x1     x2     x3     x4     x5     x6
x1    1      0.01   0.12   0.18   0.21   0.24
x2    0.01   1      0.11   0.17   0.14   0.14
x3    0.12   0.11   1      0.78   0.48   0.07
x4    0.18   0.17   0.78   1      0.5    0.22
x5    0.21   0.14   0.48   0.5    1      0.26
x6    0.24   0.14   0.07   0.22   0.26   1

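$$\text{Accuracy (Acc)} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

$$\text{Sensitivity (Sen)} = \frac{TP}{TP + FN} \tag{2}$$

$$\text{Specificity (Spec)} = \frac{TN}{TN + FP} \tag{3}$$

$$\text{Positive Predictive Value (PPV)} = \frac{TP}{TP + FP} \tag{4}$$

$$\text{Negative Predictive Value (NPV)} = \frac{TN}{TN + FN} \tag{5}$$

$$\text{F1 Score} = \frac{2\,TP}{2\,TP + FP + FN} \tag{6}$$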
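Equations (1) to (6) translate directly into code. The sketch below uses a hypothetical helper, `performance_indices`, and applies it to the testing-phase counts reported in the confusion matrix of Fig. 4 (RFs: TP = 11, TN = 17, FP = 2, FN = 5; ANNs: TP = 12, TN = 17, FP = 2, FN = 3).

```python
def performance_indices(tp, tn, fp, fn):
    """Compute the six performance measure indices from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),  # Eq. (1)
        "sensitivity": tp / (tp + fn),                # Eq. (2)
        "specificity": tn / (tn + fp),                # Eq. (3)
        "ppv": tp / (tp + fp),                        # Eq. (4)
        "npv": tn / (tn + fn),                        # Eq. (5)
        "f1": 2 * tp / (2 * tp + fp + fn),            # Eq. (6)
    }


rf_test = performance_indices(tp=11, tn=17, fp=2, fn=5)
ann_test = performance_indices(tp=12, tn=17, fp=2, fn=3)

print(round(100 * rf_test["accuracy"], 2))   # 80.0
print(round(100 * rf_test["f1"], 2))         # 75.86
print(round(100 * ann_test["accuracy"], 2))  # 85.29
print(round(100 * ann_test["f1"], 2))        # 82.76
```

These values agree with the testing-phase accuracy and F1 score stated in the abstract.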

Fig. 4. Confusion matrix. (a) Training phase (b) Testing phase.

We formed a confusion matrix from each model. We used 90% of the instances to train the Random Forests (RFs) and Artificial Neural Networks (ANNs) classifiers individually, and the remaining 10% of the instances to test each of them. The confusion matrix for each model is illustrated in Fig. 4.

The performance measure indices are calculated for both training and testing using the equations described above. The calculated values are given in Table II, and a graphical view of the performance measure indices is shown in Fig. 5. Table II and Fig. 5 show that, in the testing phase, ANNs outperform RFs in sensitivity and accuracy and match RFs in specificity: the ANNs obtain a specificity, sensitivity, and accuracy of 89.47%, 80%, and 85.29%, respectively.

A comparison with existing methods in the testing phase is given in Table III. In [9], the authors measured the performance of eleven classification algorithms, such as Gaussian Processes, Linear Logistic Regression, Multilayer Perceptron, Neural Network, and Support Vector Machine, on the BUPA liver dataset. The accuracy, sensitivity, and PPV obtained by the Neural Network are 73.91%, 79.52%, and 77.65%, respectively, in the testing phase. Lin et al. [15] proposed a hybrid approach, and the accuracy obtained by the system is 78.18%. Polat et al. [16] introduced a technique for the diagnosis of liver disorder using the Artificial Immune Recognition System (AIRS) and obtained an accuracy of 83.38%. The accuracy achieved by our model is 80% with Random Forests and 85.29% with Artificial Neural Networks. The developed model using ANNs also provides higher sensitivity and positive predictive value than the compared methods. We used the 10-fold technique to build our model.

TABLE II. PERFORMANCE MEASURE INDICES

Parameters                       Training Phase       Testing Phase
                                 ANNs      RFs        ANNs      RFs
Accuracy (%)                     84.24     98.57      85.29     80
Positive Predictive Value (%)    74.81     100        85.71     84.61
Negative Predictive Value (%)    91.11     96.15      85        77.27
Sensitivity (%)                  85.96     97.78      80        68.75
Specificity (%)                  83.25     100        89.47     89.47
F1 Score (%)                     80        98.88      82.76     75.86

Fig. 5. Performance measure indices for the classification of liver disorder. (a) Training phase (b) Testing phase.

TABLE III. COMPARISON WITH EXISTING METHODS IN TESTING PHASE

Methods                        Acc (%)   PPV (%)   Sen (%)
Gaussian Processes             73.91     79.01     77.11
Linear Logistic Regression     69.57     74.70     74.70
Multilayer Perceptron          68.84     76.32     69.88
Neural Net                     73.91     77.65     79.52
Support Vector Machine         69.23     75        75
CBR-PSO                        78.18     -         -
AIRS                           83.38     -         -
Developed model using RFs      80        84.61     68.75
Developed model using ANNs     85.29     85.71     80

Confusion matrix in training phase (Fig. 4a)

        TP     TN     FP     FN
RFs     132    75     0      3
ANNs    98     164    33     16

Confusion matrix in testing phase (Fig. 4b)

        TP     TN     FP     FN
RFs     11     17     2      5
ANNs    12     17     2      3

VI. CONCLUSION

Liver disorder classification is very significant in medical care and biomedicine. In this paper, we focused on building a model that classifies liver disorder, a severe disease. Liver disorder is a highly dangerous disease that causes many deaths all over the world, so early detection of this disorder can save many valuable lives. We developed a model based on Random Forests and Artificial Neural Networks. Both techniques were implemented in Python to classify the diagnostic data set into the two classes, in view of the seriousness of the disease. We obtained accuracies of 80% and 85.29% with RFs and ANNs, respectively, in the testing phase. The developed model can be helpful for medical staff as well as the general public. Models obtained by supervised machine learning techniques can support the diagnosis of medical disorders.

REFERENCES

[1] UK National Statistics, http://www.statistics.gov.uk, accessed on: Sep. 25, 2017.

[2] J. E. Everhart and C. E. Ruhl, "Burden of digestive diseases in the United States Part III: Liver, biliary tract, and pancreas," Gastroenterology, vol. 136, pp. 1134-1144, 2009.

[3] American Liver Foundation, The Liver Lowdown – Liver Disease: the big picture, http://www.liverfoundation.org/education/liverlowdown/ll1013/bigpicture, accessed on: Sep. 25, 2017.

[4] Liver Disease, http://www.medicinenet.com/liver_disease/article.htm, accessed on: Sep. 25, 2017.

[5] F. Gonzalez, D. Dasgupta, and L. F. Nino, "A Randomized Real-Valued Negative Selection Algorithm," Second International Conference on Artificial Immune Systems, United Kingdom, September 2003.

[6] Diagnosing Liver Disease, http://www.liver.ca/liverdisease/diagnosing-liver-disease, accessed on: Sep. 25, 2017.

[7] B. Ramana, M. Babu, and N. Venkateswarlu, "A Critical Study of Selected Classification Algorithms for Liver Disease Diagnosis," International Journal of Database Management Systems (IJDMS), vol. 3, no. 2, pp. 101-114, May 2011.

[8] A. Singh and B. Pandey, "An euclidean distance based KNN computational method for assessing degree of liver damage," 2016 International Conference on Inventive Computation Technologies (ICICT), vol. 1, pp. 1-4, 2016.

[9] S. Bahramirad, A. Mustapha, and M. Eshraghi, "Classification of liver disease diagnosis: A comparative study," 2013 Second International Conference on Informatics & Applications (ICIA), Lodz, pp. 42-46, 2013.

[10] M. Abdar, "A Survey and Compare the Performance of IBM SPSS Modeler and Rapid Miner Software for Predicting Liver Disease by Using Various Data Mining Algorithms," J. Sci. (CSJ), vol. 36, 2015.

[11] S. Dixon and X. H. Yu, "Liver disorder detection based on artificial immune systems," 2015 11th International Conference on Natural Computation (ICNC), Zhangjiajie, pp. 743-748, 2015.

[12] G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning, 2013.

[13] S. Guido and A. C. Müller, Introduction to Machine Learning with Python, O'Reilly Media, Inc., 2016.

[14] BUPA Liver Disorders Dataset, UCI Repository of Machine Learning Databases, http://archive.ics.uci.edu/ml/datasets/Liver+Disorders, accessed on: Sep. 25, 2017.

[15] J. J. Lin and P. C. Chang, "A particle swarm optimization based classifier for liver disorders classification," International Conference on Computational Problem-Solving, Lijiang, pp. 63-65, 2010.

[16] K. Polat, S. Sahan, H. Kodaz, and S. Gunes, "A new classification method to diagnosis liver disorders: supervised artificial immune system (AIRS)," Proceedings of the IEEE 13th Signal Processing and Communications Applications Conference, pp. 169-174, 2005.

Predictive Prognostic Factors in Non-Calcific Supraspinatus Tendinopathy Treated with Focused Extracorporeal Shock Wave Therapy: An Artificial Neural Network Approach

Article

Full-text available

May 2024

Citation: Santilli, G.; Vetrano, M.; Mangone, M.; Agostini, F.; Bernetti, A.; Coraci, D.; Paoloni, M.; de Sire, A.; Paolucci, T.; Latini, E.; et al. Predictive Abstract: The supraspinatus tendon is one of the most involved tendons in the development of shoulder pain. Extracorporeal shockwave therapy (ESWT) has been recognized as a valid and safe treatment. Sometimes the symptoms cannot be relieved, or a relapse develops, affecting the patient's quality of life. Therefore, a prediction protocol could be a powerful tool aiding our clinical decisions. An artificial neural network was run, in particular a multilayer perceptron model incorporating input information such as the VAS and Constant-Murley score, administered at T0 and at T1 after six months. It showed a model sensitivity of 80.7%, and the area under the ROC curve was 0.701, which demonstrates good discrimination. The aim of our study was to identify predictive factors for minimal clinically successful therapy (MCST), defined as a reduction of ≥40% in VAS score at T1 following ESWT for chronic non-calcific supraspinatus tendinopathy (SNCCT). From the male gender, we expect greater and more frequent clinical success. The more severe the patient's initial condition, the greater the possibility that clinical success will decrease. The Constant and Murley score, Roles and Maudsley score, and VAS are not just evaluation tools to verify an improvement; they are also prognostic factors to be taken into consideration in the assessment of achieving clinical success. Due to the lower clinical improvement observed in older patients and those with worse clinical and functional scales, it would be preferable to also provide these patients with the possibility of combined treatments. The ANN predictive model is reasonable and accurate in studying the influence of prognostic factors and achieving clinical success in patients with chronic non-calcific tendinopathy of the supraspinatus treated with ESWT.

An Enhanced Hybrid Model for Liver Disease Detection Utilizing Deep Learning and Machine Learning

Article

Apr 2024

Enhancing Breast Cancer Detection with Ensemble Machine Learning Models

Article

Full-text available

Apr 2024

Dawood Salim Mohammed

Breast cancer remains a significant public health issue worldwide, underlining the need for accurate and efficient diagnostic methods. In this paper, we propose a new technique to enhance breast cancer diagnosis through the integration of multiple machine-learning models. Our strategy employs a combination of the Naive Bayes classifier, Stochastic Gradient Descent (SGD), Bagging, and the ZeroR classifier, alongside Bayes Network learning. The cornerstone of our approach is Bayes Network learning, a probabilistic graphical model designed to map out the intricate interconnections among various diagnostic factors. This is significant in the way that it can help to uncover complex relationships in the data for the sake of leading to more accurate predictions. Added to the above, we use the Naïve Bayes classifier, a classifier showing good validity in classification tasks and based on probabilistic reasoning, for the screening of breast cancer. Further, a refined model's parameter is included using the SGD and leads to enhancement of the generalization and overall performance of the model. In addition, as part of controlling overfitting, one can also use Bagging. It is an ensemble method in the sense that it considers several models. ZeroR classifier is a very basic classifier and is just used to compare its performance with our composite approach. We are comparing complex ensemble results to its simplicity. We will validate the ability of our proposed methodology to compare the performance of our integrated models against ZeroR.

Optimization of Deep Neural Networks with Particle Swarm Optimization Algorithm for Liver Disease Classification

Article

Feb 2023

Liver disease has affected more than one million new patients in the world. which is where the liver organ has an important role function for the body's metabolism in channeling several vital functions. Liver disease has symptoms including jaundice, abdominal pain, fatigue, nausea, vomiting, back pain, abdominal swelling, weight loss, enlarged spleen and gallbladder and has abnormalities that are very difficult to detect because the liver works as usual even though some liver functions have been damaged. Diagnosis of liver disease through Deep Neural Network classification, optimizing the weight value of neural networks with the Particle Swarm Optimization algorithm. The results of optimizing the PSO weight value get the best accuracy of 92.97% of the Hepatitis dataset, 79.21%, Hepatitis 91.89%, and Hepatocellular 92.97% which is greater than just using a Deep Neural Network.

Breast Cancer Prediction using Genetic Algorithm via Mixed Radial Basis Function Neural Network

Conference Paper

May 2024

DETECTION OF LIVER DISEASE USING MACHINE LEARNING MODELS

Article

Full-text available

May 2024

Improved Kepler Optimization Algorithm for enhanced feature selection in liver disease classification

Article

May 2024
KNOWL-BASED SYST

Liver diseases represent a significant healthcare challenge, impacting millions globally and posing complexities in diagnosis. To address this global health concern, this paper introduces a groundbreaking enhancement to the Kepler Optimization Algorithm, termed I-KOA, designed specifically for feature selection in high-dimensional datasets. By harnessing the synergies of Opposition-Based Learning and a Local Escaping Operator grounded in the k-nearest Neighbor (kNN) classifier, I-KOA asserts itself as a potent tool for local exploitation, balanced exploration, and evasion of local optima. To our knowledge, this is the first work to exploit KOA as a feature selection method. Pioneering the utilization of KOA as a feature selection method, the paper rigorously tests I-KOA in two extensive experiments, tackling the complex CEC’22 benchmark suite functions and the intricate landscape of five liver disease datasets. Results underscore I-KOA’s unparalleled performance, validated through the Friedman test, where it surpasses seven rival optimization algorithms. Achieving an outstanding overall classification accuracy of 93.46%, Feature selection size of 0.1042, sensitivity of 97.46%, precision of 94.37%, and F1-score of 90.35% across the liver disease datasets, I-KOA’s randomized algorithm ensures robust feature selection, striking a compelling balance between subset size and classification efficacy. Acknowledging computational demands and generalization nuances, I-KOA is a formidable tool ready to revolutionize medical diagnosis and decision support systems. The open source codes of the proposed I-KOA are available at https://www.mathworks.com/matlabcentral/fileexchange/161376- improved-kepler-optimization-algorithm.

Determining the minimum data size for the development of artificial neural network-based prediction models for rice pests in Korea

Article

May 2024
COMPUT ELECTRON AGR

Technical Review of Breast Cancer Screening and Detection using Artificial Intelligence and Radiomics

Conference Paper

Feb 2024

Visualizing Transformers for Breast Histopathology By IJISRT

Article

Mar 2024

Detecting breast cancer early is crucial for improving patient survival rates. Using machine learning models to predict breast cancer holds promise for enhancing early detection methods. However, evaluating the effectiveness of these models remains challenging. Therefore, achieving high accuracy in cancer prediction is essential for improving treatment strategies and patient outcomes. By applying various machine learning algorithms to the Breast Cancer Wisconsin Diagnostic dataset, researchers aim to identify the most efficient approach for breast cancer diagnosis. They evaluate the performance of classifiers such as Random Forest, Naïve Bayes, Decision Tree (C4.5), KNN, SVM, and Logistic Regression, considering metrics like confusion matrix, accuracy, and precision. The assessment reveals that Random Forest outperforms other classifiers, achieving the highest accuracy rate of 97%. This study is conducted using the Anaconda environment, Python programming language, and Sci-Kit Learn library, ensuring replicability and accessibility of the findings. In summary, this study demonstrates the potential of machine learning algorithms for breast cancer prediction and highlights Random Forest as the most effective approach. Its findings contribute valuable insights to the field of breast cancer diagnosis and treatment.

A Critical Study of Selected Classification Algorithms for Liver Disease Diagnosis

Article

Full-text available

Apr 2011

Patients with Liver disease have been continuously increasing because of excessive consumption of alcohol, inhale of harmful gases, intake of contaminated food, pickles and drugs. Automatic classification tools may reduce burden on doctors. This paper evaluates the selected classification algorithms for the classification of some liver patient datasets. The classification algorithms considered here are Naïve Bayes classifier, C4.5, Back propagation Neural Network algorithm, and Support Vector Machines. These algorithms are evaluated based on four criteria: Accuracy, Precision, Sensitivity and Specificity.

A Randomized Real-Valued Negative Selection Algorithm

Conference Paper

Sep 2003
Lect Notes Comput Sci

This paper presents a real-valued negative selection algorithm with a good mathematical foundation that solves some of the drawbacks of our previous approach (11). Specifically, it can produce a good estimate of the optimal number of detectors needed to cover the non-self space, and the maximization of the non-self coverage is done through an optimization algorithm with proven convergence properties. The proposed method is a randomized algorithm based on Monte Carlo methods. Experiments are performed to validate the assumptions made while designing the algorithm and to evaluate its performance.

CEC01-03 (B): An Introduction to Machine Learning in Python

Article

Sep 2023
TOXICOL LETT

D.-A. Clevert

A Survey and Compare the Performance of IBM SPSS Modeler and Rapid Miner Software for Predicting Liver Disease by Using Various Data Mining Algorithms

Article

Jan 2015

Moloud Abdar

Today, with the development of industry and a mechanized lifestyle, the prevalence of diseases is rising steadily. In the meantime, the number of patients with liver diseases (such as fatty liver, cirrhosis and liver cancer) is also rising. Since prevention is better than treatment, early diagnosis can help the treatment process, so it is essential to develop methods for detecting high-risk individuals who are likely to develop liver diseases and to adopt appropriate solutions for early diagnosis and initiation of treatment in the early stages of the disease. In this study, we applied common data mining techniques, currently used for the diagnosis and treatment of different diseases, to the diagnosis and treatment of liver disease. For this purpose, we used the Rapid Miner and IBM SPSS Modeler data mining tools together. The accuracy of different data mining algorithms, such as C5.0 and C4.5, Decision tree and Neural Network, was examined with these two tools for predicting the prevalence of these diseases or diagnosing them early. According to the results, the C4.5 and C5.0 algorithms, run with the IBM SPSS Modeler and Rapid Miner tools, achieved accuracies of 72.37% and 87.91%, respectively. Further, the Neural Network algorithm in Rapid Miner was able to show more details.

An euclidean distance based KNN computational method for assessing degree of liver damage

Conference Paper

Aug 2016

Liver disorder detection based on artificial immune systems

Conference Paper

Aug 2015

Classification of liver disease diagnosis: A comparative study

Conference Paper

Sep 2013

Medical Data Mining (MDM) is one of the most critical aspects of automated disease diagnosis and disease prediction. MDM involves developing data mining algorithms and techniques to analyze medical data. In recent years, liver disorders have increased sharply, and liver diseases are becoming one of the most fatal diseases in several countries. In this study, two real liver patient datasets were investigated for building classification models in order to predict liver diagnosis. Eleven data mining classification algorithms were applied to the datasets, and the performance of all classifiers is compared in terms of accuracy, precision, and recall. Several investigations have also been carried out to improve the performance of the classification models. Finally, the results show a promising methodology for diagnosing liver disease during the earlier stages.

A particle swarm optimization based classifier for liver disorders classification

Article

Jan 2010

A hybrid model is developed by integrating a case-based reasoning approach and a particle swarm optimization model for medical data classification. The Liver Disorders Data Set from the UCI Machine Learning Repository is employed as a benchmark test. Initially, a case-based reasoning method is applied to preprocess the data set, so that a weight vector for each feature is derived. A particle swarm optimization model is then applied to construct a decision-making system based on the selected features and diseases identified. The PSO algorithm starts by partitioning the data set into a relatively large number of clusters to reduce the effects of initial conditions and then reduces the number of clusters to two. The average for liver disorders of CBRPSO is 78.18%. The proposed case-based particle swarm optimization model is able to produce more accurate and comprehensible results for medical experts in medical diagnosis.

Burden of Digestive Diseases in the United States Part III: Liver, Biliary Tract, and Pancreas

Article