
Associating Measles Vaccine Uptake Classification and its Underlying Factors Using an Ensemble of Machine Learning Models

Received August 2, 2021, accepted August 24, 2021, date of publication August 27, 2021, date of current version September 3, 2021.
Digital Object Identifier 10.1109/ACCESS.2021.3108551
MD. KAMRUL HASAN 1, MD. TASNIM JAWAD 1, AISHWARIYA DUTTA 2, MD. ABDUL AWAL 3,
MD. AKHTARUL ISLAM 4, MEHEDI MASUD 5, (Senior Member, IEEE),
AND JEHAD F. AL-AMRI 6
1Department of Electrical and Electronic Engineering (EEE), Khulna University of Engineering & Technology (KUET), Khulna 9203, Bangladesh
2Department of Biomedical Engineering (BME), Khulna University of Engineering & Technology (KUET), Khulna 9203, Bangladesh
3Electronics and Communication Engineering (ECE) Discipline, Khulna University (KU), Khulna 9208, Bangladesh
4Statistics Discipline, Khulna University (KU), Khulna 9208, Bangladesh
5Department of Computer Science, College of Computers and Information Technology, Taif University, Taif 21944, Saudi Arabia
6Department of Information Technology, College of Computer and Information Technology, Taif University, Taif 21994, Saudi Arabia
Corresponding author: Md. Abdul Awal (m.awal@ece.ku.ac.bd)
This work was supported by Taif University Researchers Supporting Project, Taif University, Taif, Saudi Arabia, under
Grant TURSP-2020/211.
This work involved human subjects or animals in its research. Approval of all ethical and experimental procedures and protocols was
granted by the ICF Institutional Review Board (ICF-IRB).
ABSTRACT Measles is one of the significant public health issues responsible for the high mortality rate around the globe, especially in developing countries. Using nationally representative demographic and health survey data, measles vaccine utilization has been classified, and its underlying factors have been identified through an ensemble Machine Learning (ML) approach. First, missing values are imputed employing various approaches, and then several feature selection techniques are applied to identify the crucial attributes for predicting measles vaccination. A grid search hyperparameter optimization technique is applied to tune the critical hyperparameters of different ML models, such as Naive Bayes, random forest, decision tree, XGBoost, and LightGBM. The categorization performance of the individual optimized ML models, as well as that of their ensembles, is reported utilizing our proposed BDHS dataset. Individually, the optimized LightGBM provides the highest precision and AUC of 79.90% and 77.80%, respectively. This result improves when the optimized LightGBM is ensembled with XGBoost, providing a precision and AUC of 84.60% and 80.0%, respectively. Our results reveal that the statistical median imputation technique with the XGBoost-based attribute selection method and the LightGBM classifier provides the best individual result. The performance improved when the proposed weighted ensemble of XGBoost and LightGBM was adopted with the same preprocessing, and this ensemble is recommended for measles vaccine utilization. The significance of our proposed approach is that it utilizes a minimal set of attributes collected from the child and their family members and yielded 80.0% accuracy, making it easily explainable by caregivers and healthcare personnel. Finally, our predictive model provides an early detection procedure to help national policymakers enforce new policies with specific rules and regulations. The data and source codes that support the findings of this study are available at https://github.com/kamruleee51/measles_vaccine_uptake.
INDEX TERMS Attribute selection, measles vaccine uptake classification, measles BDHS data, missing
value imputation, weighted ensemble ML model.
The associate editor coordinating the review of this manuscript and approving it for publication was Emre Koyuncu.
VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
I. INTRODUCTION
Measles is a highly contagious viral disease that is very common in developing countries and is associated with a significant level of mortality and morbidity [1], [2]. Although the disease is vaccine-preventable, measles remains a leading cause of death among children from vaccine-preventable diseases, with a fatality rate of up to 10.0% [3]–[5]. It is a crucial public health issue in sub-Saharan Africa and South-East Asia, including Bangladesh [6], [7]. Every year, more than 0.1 million deaths occur due to measles, and in the first three months of 2019, measles cases increased by 300.0% compared with 2018 [1], [8], [9]. To reduce measles and increase community-level immunity, 95.0% measles vaccination coverage with two doses is crucial, which will decrease related causes of mortality and lead to elimination [10]–[13]. The World Health Organization (WHO) (2017) defines the elimination of measles as “The interruption of measles transmission in a defined geographical area that has lasted at least 12 months and is verified after it has been sustained for at least 36 months [14]”. In Bangladesh, measles vaccination coverage was 88.0% among children below the age of one year. The Fourth Health, Population and Nutrition Sector Program (HPNSP) sets a goal of 90.0% coverage by 2022 [15]. The key to increasing the vaccination rate is to recognize the influencing factors associated with the utilization of measles vaccination [16], [17]. Existing literature has revealed several influencing factors related to measles vaccination uptake [6], [17]–[19].
This study focused on recognizing the contributing factors to the non-utilization of measles vaccination among children in Bangladesh. We have employed Machine Learning (ML) techniques on four consecutive Bangladesh Demographic and Health Survey (BDHS) datasets from 2007 to 2017-18. Utilizing the ML procedure may accelerate the recognition of appropriate features related to the non-utilization of the measles vaccine compared to other methods frequently applied to variable selection challenges, as well as improve the prediction accuracy of the final classification model [20]. For instance, the authors in [21] utilized the Synthetic Minority Over-Sampling Technique (SMOTE) to address the class imbalance problem and obtained a true-positive rate of 93.90%, whereas the false-positive and false-negative rates were 5.80% and 5.10%, respectively. To evaluate the influencing factors that place individuals at a higher risk of measles, the authors in [22] utilized ML techniques and found that contact with measles patients, age, rhinorrhea, vaccination, male sex, cough, conjunctivitis, ethnicity, and fever were the crucial factors associated with measles disease. The authors in [23] adopted the LASSO (Least Absolute Shrinkage and Selection Operator) logistic regression model on electronic health records to identify vaccine-resistant families and obtained 72.0% precision. They used 25 features based on the history of the child and their family members. The authors in [24] employed an ML approach based on area-level features to predict vaccine hesitancy for a broad range of vaccine-preventable diseases, including measles. They found that the random forest performed better than the gradient boosting machine, LASSO, and neural network. The authors in [25] explored and identified features associated with measles non-vaccination from the Philippine National Demographic and Health Survey data. They employed an Elastic Net ML model using 32 relevant attributes comprising geographic location, socioeconomic condition, and features related to children and family information. As a result, they obtained an accuracy, sensitivity, and specificity of 79.02%, 97.73%, and 23.41%, respectively. A review article [26] explored the usefulness of data mining and ML approaches for studying the clinical significance of measles and its prediction. A multiple linear regression model was applied in [27], which found that the factors associated with measles vaccine uptake were parenting and knowledge, nutritional status, and behavior. The authors of [28] applied a logistic regression model to examine the association between socioeconomic characteristics and measles vaccine uptake and revealed that measles vaccine utilization rates are highly socially determined. A study conducted in Germany [29] illustrated a positive relationship between child daycare centers, maternal and paternal education, and measles vaccine uptake. Finally, a systematic review of primary studies [30] found that community health, peer judgment, confidence in experts and vaccines, responsibility toward children, and measles severity are strongly associated with measles, mumps, and rubella vaccine uptake. Unfortunately, research on measles and its vaccine using ML approaches remains minimal, and to the best of our knowledge, this article is the first attempt in Bangladesh using our proposed BDHS data. The significant contributions and key topics covered by this article are as follows:
• Proposing nationally representative demographic and health survey measles data from Bangladesh, called the BDHS dataset.
• Developing a framework for linking measles vaccine uptake classification and its underlying factors.
• Incorporating an integral preprocessing, which includes missing value imputation and attribute selection strategies.
• Optimizing the hyperparameters of different ML-based models and proposing a weighted ensemble ML model for the aimed task of this article.
• Conducting complete ablation studies for the preprocessing and classifier determination for recommending the best possible framework for measles vaccine utilization.
The remaining sections of the article are arranged as follows: Section II describes the proposed BDHS dataset and framework. Section III presents the results of the extensive experiments, together with their possible explanations. Finally, Section IV concludes the article with future research directions.
II. MATERIALS AND METHODS
This section describes the materials and methodologies of the article. Section II-A presents the proposed datasets, which were collected from Bangladesh. Section II-B explains the proposed framework, incorporating missing value imputation (see Section II-B1), attribute selection (see Section II-B2), and different ML classifiers with the proposed ensemble classifier (see Section II-B3). In Section II-B3, we also describe the hyperparameter optimization for different ML models. Finally, we define the evaluation indices of the different experiments in Section II-B4.
A. PROPOSED DATASETS
1) CLINICAL INTERPRETATION OF MEASLES
The disease is caused by an RNA respiratory virus of the Morbillivirus genus and Paramyxoviridae family [31]–[33]. According to the WHO, the clinical definition of measles is any individual with fever and a generalized maculopapular rash, together with cough, coryza, or conjunctivitis [31], [34], [35]. Sometimes, unusual tiny white spots on the buccal mucosa, called Koplik spots, can be observed in measles [31], [34]. The symptoms of measles, namely fever (which may be as high as 40 °C), cough, conjunctivitis, and rash, are similar to those of other seasonal respiratory infections [28], [36]. This similarity of symptoms may be why measles cases increase and spread rapidly through close contact and routine interaction in public places [36], [37]. The measles virus infects individuals through respiratory droplets produced by sneezing or coughing or through direct contact. These tiny droplets or small-particle aerosols can remain suspended in the air for prolonged durations, and the typical contagious period lasts until four days after the rash appears [31]–[33]. Therefore, vaccine utilization to prevent measles is crucial for building herd immunity in the community.
2) DATA SOURCES AND VARIABLES
This study utilized four consecutive nationally representative Demographic and Health Survey datasets of Bangladesh, from 2007, 2011, 2014, and 2017-18 [15], [38]–[40]. These datasets were collected under the authority of the National Institute of Population Research and Training (NIPORT) of the Ministry of Health and Family Welfare (MOHFW). A Bangladeshi research organization, Mitra and Associates, implemented the survey. The survey used a two-stage stratified cluster sampling technique. In the first stage, the total area was divided into several enumeration areas (EAs), from which a sample was selected; in the second stage, several households were selected. For instance, in the 2017-18 survey, a list of 675 EAs was established in the first stage, with 250 in urban and 425 in rural areas. In the second stage, 30 households on average were taken from each EA. The BDHS 2017-18 was conducted using five types of questionnaires. In this study, we used data from the woman's questionnaire. This questionnaire was based on the model questionnaires developed for the worldwide DHS-7 Program, adjusted to the circumstances and requirements in Bangladesh, and took into account the content of the instruments employed in earlier DHS surveys in Bangladesh [15].
Our focal questions were related to children's immunizations. During this survey, women were asked questions regarding their socioeconomic characteristics (for instance, age, education, religion, and media exposure), reproductive history, knowledge of uses and sources of family planning methods, antenatal, delivery, postnatal, and newborn care, husbands' background, etc. [15]. Note that we have utilized publicly accessible datasets, which are secondary data for this study. The data were collected with due consideration of all ethical issues, details of which can be found on the DHS website (https://dhsprogram.com/), and are now published at Harvard Dataverse [41]. Therefore, a separate ethical review endorsement was not sought for this study.
Dependent Variable: We consider measles vaccine uptake as the dependent variable, with two categories. Children who took the measles vaccine were categorized as “Yes”, and those who did not were categorized as “No”. Fig. 1 presents the prevalence rate of measles uptake in the different divisions of Bangladesh. Children from the Barisal division recorded the lowest prevalence (59.67%) of measles uptake, whereas children from the Rajshahi division showed the highest prevalence (67.91%). In the Dhaka and Khulna divisions, the rates were 65.21% and 63.91%, respectively.
Independent Variable: Table 1 lists the independent variables, which are classified as categorical and continuous attributes.
B. PROPOSED METHODOLOGIES
Fig. 2 displays our proposed framework for the Measles Disease Classification (MDC), which incorporates two crucial preprocessing steps: missing value imputation and attribute selection. We apply different imputation and attribute selection techniques to perform complete ablation studies on the proposed BDHS datasets. After preprocessing, the BDHS datasets are partitioned into $K$ folds, where $K-1$ folds are utilized for training and for fine-tuning the hyperparameters in the inner loop, employing the grid search algorithm [42]. In the outer loop (repeated $K$ times), the best hyperparameters and the unseen test data are utilized to evaluate the classifier in the proposed framework. Since the proposed BDHS datasets contain imbalanced class samples, stratified cross-fold validation [43] has been adopted to preserve the underlying class ratio. After training all the ML models, they are evaluated on the unseen test data. The obtained prediction probabilities ($P_i$, $i \in N$, where $N$ is the number of candidate classifiers for ensembling) are then employed to build an ensemble classifier for the MDC. The following sections describe the integrated parts of the proposed framework in Fig. 2.
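The stratified partitioning described above can be sketched in a few lines of standard-library Python. This is a minimal illustration and not the authors' released code: the function name `stratified_kfold`, the toy labels, and the round-robin dealing strategy are assumptions made for the example.

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds preserve the class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets its share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train_idx, test_idx

# Hypothetical toy labels: 12 "Yes" (1) and 6 "No" (0), an imbalanced 2:1 ratio.
labels = [1] * 12 + [0] * 6
splits = list(stratified_kfold(labels, k=3))
for train_idx, test_idx in splits:
    # The inner grid search would use train_idx only; test_idx stays unseen.
    print(len(train_idx), len(test_idx), sum(labels[i] for i in test_idx))
```

In this toy run every test fold keeps the same 2:1 class ratio as the full label list, which is the property the stratified scheme is meant to guarantee.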
FIGURE 1. The prevalence rate of measles uptake in different divisions in Bangladesh in the proposed BDHS dataset, with heatmap intensities indicating higher to lower prevalence rates.
1) MISSING VALUE IMPUTATION
Real-world datasets often include missing values, encoded as NaNs, blanks, undefined, null, or other placeholders, for various reasons [44]. There are many methods for handling missing values, such as case deletion (Raw), missing data imputation, and model-based prediction [45]. The last of these, model-based prediction, suffers from various complications: it fails for complex and blended patterns and takes a long time to converge [46]. Therefore, this article integrates statistical imputation methods, namely Median and Mode, into the proposed framework in Fig. 2, as they are simple and fast [46]. The steps applied for the Filling Missing Value (FMV) are presented in Algorithm 1.
Algorithm 1 The Steps for Achieving the Applied FMV Technique
Input: The n-dimensional uncurated data, $X_{in} \in \mathbb{R}^{n}$, and outcome, $Y \in [0, 1]$.
Output: The n-dimensional curated data, $X_{out} \in \mathbb{R}^{n}$.
1 Estimate the class median or mode as $M_{C=i}$, $i \in [0, 1]$.
2 Impute the missing values as
$$X_{out}(x) = \begin{cases} M_{C=i},\ i \in [0, 1], & \text{if } x = \text{missed} \\ x, & \text{otherwise,} \end{cases}$$
3 where $x \in X_{in}$ is the observation of $X_{in}$ and lies in the n-dimensional attribute space.
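As a hedged illustration of Algorithm 1, the sketch below fills each missing entry with the median of that attribute within the sample's class. The function name `impute_class_median`, the use of `None` as the missing-value marker, and the toy data are assumptions for this example; class labels are assumed to be known, as they are in the training folds.

```python
from statistics import median

def impute_class_median(rows, labels):
    """Replace None entries with the median of that attribute within the row's class."""
    n_features = len(rows[0])
    medians = {}
    for c in set(labels):
        for j in range(n_features):
            observed = [r[j] for r, y in zip(rows, labels) if y == c and r[j] is not None]
            # Assumes every class has at least one observed value per attribute.
            medians[(c, j)] = median(observed)
    return [
        [medians[(y, j)] if v is None else v for j, v in enumerate(r)]
        for r, y in zip(rows, labels)
    ]

# Hypothetical toy data: two attributes, None marks a missing value.
rows = [[1.0, 10.0], [3.0, None], [None, 30.0], [7.0, 40.0], [9.0, None]]
labels = [0, 0, 0, 1, 1]
filled = impute_class_median(rows, labels)
print(filled)
```

For categorical attributes, replacing `median` with `statistics.mode` gives the Mode variant of the same step.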
2) ATTRIBUTE SELECTION
The accuracy of ML models initially increases as attributes are added. However, beyond a certain point, adding dimensions degrades the results, which is known as the curse of dimensionality. When the dimension of the feature vector grows without an increase in the number of samples, the attribute space becomes sparser, which pushes the ML models to overfit and lose generalization capacity [43]. Additionally, constructing models from datasets with many attributes is more computationally demanding [47]. Therefore, it is essential to incorporate attribute reduction techniques in a classification framework to build a generic ML model. Among supervised, semi-supervised, and unsupervised Attribute Selection (AS) techniques, the supervised methods usually perform better [43], [48]. This article applies four of the most commonly employed supervised AS methods to reduce attribute redundancy, namely Fisher Score (FS) [49], RF [50], LGB [51], and XGB [52], for conducting the ablation studies on our BDHS datasets; they are briefly described in the following paragraphs.
a: FS ATTRIBUTE SELECTOR
The core intention of the FS is to obtain a subset of attributes such that the distances between data points in different classes are as large as possible, whereas the distances between data points in the same class are as small as possible [49]. The steps applied for the FS scheme in the AS are presented in Algorithm 2.
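The Fisher score of Algorithm 2 can be sketched as the ratio of the class-size-weighted between-class scatter to the class-size-weighted within-class variance of one attribute. The code below is an illustrative sketch rather than the authors' implementation; the helper names `fisher_score` and `select_top_m`, the use of the population variance, and the toy data are assumptions.

```python
from statistics import mean, pvariance

def fisher_score(column, labels):
    """Fisher score of one attribute: between-class scatter over within-class variance."""
    overall = mean(column)
    between = within = 0.0
    for c in set(labels):
        values = [v for v, y in zip(column, labels) if y == c]
        between += len(values) * (mean(values) - overall) ** 2
        within += len(values) * pvariance(values)
    return between / within if within > 0 else float("inf")

def select_top_m(rows, labels, m):
    """Rank attributes by Fisher score and return the indices of the top m."""
    n_features = len(rows[0])
    scores = [fisher_score([r[j] for r in rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:m], scores

# Hypothetical toy data: attribute 0 separates the classes, attribute 1 does not.
rows = [[1, 5], [2, 1], [3, 3], [7, 3], [8, 5], [9, 1]]
labels = [0, 0, 0, 1, 1, 1]
top, scores = select_top_m(rows, labels, m=1)
print(top, scores)
```

The discriminative attribute receives a much larger score than the uninformative one, so it is the one retained when m = 1.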
b: RF ATTRIBUTE SELECTOR
RF, a tree-based strategy, is employed for the AS in our framework in Fig. 2, as it directly ranks the attributes by how well they improve the purity of the node, that is, by the decrease in impurity over all trees. Nodes with the most significant reduction in impurity occur at the start of the trees, while nodes with a minor drop in impurity occur at the end of the trees. Thus, by pruning the trees below a particular node, a subset of the essential attributes can be picked. The steps applied for the RF-based AS are displayed in Algorithm 3.
TABLE 1. Description of the independent attributes (categorical and continuous) utilized in this research. A $\chi^2$-test is used for categorical attributes to describe the significant relationship with the dependent variable measles uptake, whereas the mean ± std is used to describe continuous variables. The respondent is the mother of the child who is considering vaccine utilization.
FIGURE 2. The complete workflow of the study, where the training dataset is further divided to perform grid search optimization for finding the best hyperparameters of the ML models.
c: LGB AND XGB ATTRIBUTE SELECTORS
The feature importances obtained from LGB and XGB are likely to be more accurate, as these models are more reliable than linear models [52]. Both models use regularized learning and cache-aware block-structure tree learning for ensemble learning. Their gain represents the gain score of each tree split, and the final feature importance score is calculated as the average gain. Finally, the top-$m$ ranked features are selected according to their gain (see Algorithms 7 and 8).
Algorithm 2 The Implementing Steps for the Involved FS Technique
Input: The d-dimensional data, $X_{in} \in \mathbb{R}^{n \times d}$, and outcome, $Y \in [0, 1]$.
Output: The reduced m-dimensional data, $X_{out} \in \mathbb{R}^{n \times m}$, where $m < d$.
1 Estimate the Fisher score ($F$), considering the jth feature $x^{j} \in \mathbb{R}^{1 \times n}$ of $X_{in}$, as
$$F(x^{j}) = \frac{\sum_{k=1}^{c} n_k \left(\mu_k^{j} - \mu^{j}\right)^2}{(\sigma^{j})^2},$$
where $(\sigma^{j})^2 = \sum_{k=1}^{c} n_k (\sigma_k^{j})^2$, $n_k$ is the number of samples in the kth class, and $C \in [0, 1]$ is the class number. The kth-class mean and standard deviation are $\mu_k^{j}$ and $\sigma_k^{j}$, while $\mu^{j}$ and $\sigma^{j}$ denote the mean and standard deviation of the whole data set corresponding to the jth feature.
2 Select the top-$m$ ranked features with large scores ($F$) and store them in $X_{out}$.

Algorithm 3 The Steps for Implementing the Applied RF Technique
Input: The d-dimensional data, $X_{in} \in \mathbb{R}^{n \times d}$, and outcome, $Y \in [0, 1]$.
Output: The reduced m-dimensional data, $X_{out} \in \mathbb{R}^{n \times m}$, where $m < d$.
1 Compute the Out-of-Bag (OOB) error of a tree.
2 Randomly assign each observation with $\hat{P}_k$ to the child nodes if the parent node $k$ is split in $X$, where $\hat{P}_k$ is the relative frequency of observations that initially went in the same direction of the tree.
3 Recompute the OOB error of the tree (following step 2).
4 Compute the difference between the original and recomputed OOB errors.
5 Repeat steps 1–4 for each tree and apply the average deviation over all trees as the overall importance score ($F$).
6 Select the top-$m$ ranked features with large scores ($F$) and store them in $X_{out}$.

3) CLASSIFIERS AND HYPERPARAMETER OPTIMIZATION
Different ML classifiers, such as Gaussian Naive Bayes (GNB), Bernoulli Naive Bayes (BNB), Decision Tree (DT), Random Forest (RF), XGBoost (XGB), and LightGBM (LGB), are trained and evaluated for the measles classification in the proposed framework. The following paragraphs explain the algorithmic steps of these ML classifiers.
a: GNB & BNB CLASSIFIER
The Bayesian methods are supervised learning algorithms based on applying Bayes' theorem with the assumption of conditional independence between every pair of attributes given the value of the class variable. We employ two variants of this classifier, GNB and BNB. The former uses a Gaussian function as the likelihood of the attributes, whereas the latter applies multivariate Bernoulli distributions. The steps for implementing these two Bayesian classifiers are illustrated in Algorithm 4.
Algorithm 4 The Steps of Implementing GNB & BNB Classifiers
Input: The d-dimensional data $X \in \mathbb{R}^{n \times d}$ with $n$ samples, and target $Y \in \mathbb{R}^{n \times 1}$.
Output: The posterior probability $P \in [0, 1]$ of unseen test set $x$, necessitating $\sum_{i=1}^{C} P_i = 1$, $i \in C$, where $C = 2$ is the class number.
1 Compute the prior as $P(Y = C_j) = \frac{n_j}{n}$, $j \in C$, where $n_j$ is the number of samples in the jth class.
2 Estimate the output posterior probability as
$$P(C_j \mid X) = \frac{P(X \mid C_j) \times P(Y = C_j)}{P(X)},$$
where $P(X \mid C_j)$ is the likelihood of the predictor for a given class ($j \in C$).
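The two steps of Algorithm 4 can be illustrated for the Gaussian variant with a short sketch. It is a simplified example and not the authors' code: the names `fit_gnb` and `predict_proba`, the small variance floor `eps`, and the toy data are assumptions. The evidence $P(X)$ is obtained by normalizing the joint terms over the classes.

```python
import math
from statistics import mean, pvariance

def fit_gnb(rows, labels, eps=1e-9):
    """Estimate the class prior and per-attribute Gaussian parameters for each class."""
    model = {}
    for c in set(labels):
        members = [r for r, y in zip(rows, labels) if y == c]
        cols = list(zip(*members))
        model[c] = {
            "prior": len(members) / len(rows),          # step 1: n_j / n
            "mean": [mean(col) for col in cols],
            "var": [pvariance(col) + eps for col in cols],  # eps avoids zero variance
        }
    return model

def predict_proba(model, x):
    """Step 2: posterior = likelihood * prior / evidence, computed in log space."""
    log_joint = {}
    for c, p in model.items():
        ll = math.log(p["prior"])
        for v, mu, var in zip(x, p["mean"], p["var"]):
            ll += -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
        log_joint[c] = ll
    top = max(log_joint.values())
    unnorm = {c: math.exp(ll - top) for c, ll in log_joint.items()}
    total = sum(unnorm.values())  # plays the role of P(X)
    return {c: u / total for c, u in unnorm.items()}

# Hypothetical toy data with one attribute and two well-separated classes.
rows = [[1.0], [2.0], [3.0], [7.0], [8.0], [9.0]]
labels = [0, 0, 0, 1, 1, 1]
model = fit_gnb(rows, labels)
print(predict_proba(model, [2.5]), predict_proba(model, [8.0]))
```

The posteriors always sum to one across the two classes, matching the constraint stated in the Output line of Algorithm 4.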
b: RF CLASSIFIER
RF models apply the bagging method to the individual trees in the ensemble: they repeatedly draw a random sample with replacement from the training set and fit a tree to each sample. The number of trees in the ensemble is a free parameter that can readily be tuned automatically using out-of-bag errors. The algorithmic steps for developing the RF classifier are defined in Algorithm 5.
Algorithm 5 The Steps of Implementing RF Classifier
Input: The d-dimensional data $X \in \mathbb{R}^{n \times d}$ with $n$ samples, and target $Y \in \mathbb{R}^{n \times 1}$.
Output: The posterior probability $P \in [0, 1]$ of unseen test set $x$, necessitating $\sum_{i=1}^{C} P_i = 1$, $i \in C$, where $C = 2$ is the class number.
1 for $b = 1 \to N$ (n_Bagging) do
2 Draw a bootstrap sample, $(X_b, Y_b)$, from the given ($X \in \mathbb{R}^{n \times d}$, $Y \in \mathbb{R}^{n \times 1}$).
3 Grow a random-forest tree $T_b$ using $X_b$ and $Y_b$ by recursively repeating the following steps until the minimum node size $n_{min}$ is reached.
   1) Randomly select $m$ variables from the given $d$ variables.
   2) Pick the best variable or split-point among the $m$ variables.
   3) Split the node into two daughter nodes.
   Output the ensemble of trees $\{T_b\}_{1}^{N}$.
4 The posterior probability is $\hat{P}_{RF}^{N}(x) = \text{Voting}\{\hat{P}_k(x)\}_{1}^{N}$, where $\hat{P}_k(x)$ is the class prediction of the kth RF tree.
c: DT CLASSIFIER
DT builds classification models in a tree structure, breaking down a data set into smaller and smaller subsets. The final result is a tree with decision nodes and leaf nodes, where a decision node has two or more branches, and a leaf node represents a classification or decision. The topmost decision node in a tree, called the root node, corresponds to the best predictor. Algorithm 6 explains the steps of a DT model.
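Algorithm 6 The Steps of Implementing DT Classifier
Input: The d-dimensional data $X \in \mathbb{R}^{n \times d}$ with $n$ samples, and target $Y \in \mathbb{R}^{n \times 1}$.
Output: The posterior probability $P \in [0, 1]$ of unseen test set $x$, necessitating $\sum_{i=1}^{C} P_i = 1$, $i \in C$, where $C = 2$ is the class number.
1 Split the node data $Q$ with a candidate $\theta = (j, t_m)$ into the subsets $Q_{left}(\theta)$ and $Q_{right}(\theta)$, where $\theta$ consists of a feature $j$ and a threshold $t_m$.
2 Compute the impurity at the mth node using an impurity function ($H$),
$$G(Q, \theta) = \frac{n_{left}}{N_m} H(Q_{left}(\theta)) + \frac{n_{right}}{N_m} H(Q_{right}(\theta)),$$
where
$$H = \sum_{C} P_{mC} \times (1 - P_{mC}) \quad \text{or} \quad H = -\sum_{C} P_{mC} \times \log(P_{mC}),$$
and
$$P_{mC} = \frac{1}{N_m} \sum_{x_i \in R_m} I(y_i = C).$$
3 Minimise the impurity by selecting the parameters, $\theta^{*} = \arg\min_{\theta} G(Q, \theta)$.
4 Repeat the above processes for the subsets $Q_{left}(\theta^{*})$ and $Q_{right}(\theta^{*})$ until the maximum depth is reached, $N_m < \min_{samples}$, or $N_m = 1$.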
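Steps 1 to 3 of Algorithm 6 can be sketched for a single node with the Gini impurity. This is an illustrative sketch only: the function names `gini` and `best_split`, the exhaustive search over observed thresholds, and the toy data are assumptions, and the recursion of step 4 is omitted.

```python
def gini(labels):
    """Gini impurity H = sum_C p_C * (1 - p_C) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((labels.count(c) / n) * (1 - labels.count(c) / n) for c in set(labels))

def best_split(rows, labels):
    """Return (feature j, threshold t, weighted impurity G) minimizing G(Q, theta)."""
    best = (None, None, float("inf"))
    n = len(rows)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue  # a valid split must send samples to both children
            g = len(left) / n * gini(left) + len(right) / n * gini(right)
            if g < best[2]:
                best = (j, t, g)
    return best

# Hypothetical toy data: attribute 0 separates the classes at threshold 3.
rows = [[2, 1], [3, 0], [10, 1], [11, 0]]
labels = [0, 0, 1, 1]
print(best_split(rows, labels))
```

Replacing `gini` with an entropy function yields the second impurity option listed in step 2.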
d: XGB CLASSIFIER
XGB falls under the category of boosting techniques in ensemble learning, which combine multiple models to improve predictive accuracy. In this boosting technique, the errors made by previous models are corrected by succeeding models by adding weights to the models. The steps for implementing the XGB classifier are given in Algorithm 7.
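The generic boosting loop of Algorithm 7 can be illustrated with a deliberately small sketch. It is not XGBoost itself: it assumes a squared loss (so the pseudo-residuals are simply the current errors), uses one-attribute regression stumps as base learners, and replaces the line-search multiplier with a fixed `learning_rate`. All names and the toy data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Least-squares regression stump on a single attribute."""
    best = None
    for t in sorted(set(xs))[:-1]:  # dropping the largest value keeps the right side non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def gradient_boost(xs, ys, n_iterations=20, learning_rate=0.5):
    """Additive model F_m = F_{m-1} + learning_rate * h_m fitted to pseudo-residuals."""
    f0 = sum(ys) / len(ys)  # constant that minimizes the squared loss
    stumps = []
    preds = [f0] * len(xs)
    for _ in range(n_iterations):
        # Negative gradient of 0.5 * (y - F)^2 with respect to F is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        h = fit_stump(xs, residuals)
        stumps.append(h)
        preds = [p + learning_rate * h(x) for p, x in zip(preds, xs)]
    return lambda x: f0 + learning_rate * sum(h(x) for h in stumps)

# Hypothetical toy data: the target switches from 0 to 1 between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
boosted = gradient_boost(xs, ys)
print(boosted(2), boosted(5))
```

Each iteration halves the remaining error on this toy problem, which shows how succeeding learners correct the mistakes of the earlier ones.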
e: LGB CLASSIFIER
LGB is also a gradient boosting framework built on decision tree algorithms. It applies two techniques, Gradient-Based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB), and benefits from both leaf-wise and level-wise strategies. These techniques accelerate the training process [54], [55]. Algorithm 8 describes the steps of the LGB classifier.
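The GOSS resampling step of Algorithm 8 can be sketched as follows: samples with the largest absolute gradients are always kept, a random subset of the remaining samples is drawn, and the drawn samples are re-weighted by $(1-a)/b$. The function name `goss_sample`, the default ratios, and the toy gradients are assumptions for illustration; this is not LightGBM's internal implementation.

```python
import random

def goss_sample(gradients, a=0.2, b=0.1, seed=0):
    """Return (indices, weights) following the GOSS resampling idea."""
    rng = random.Random(seed)
    n = len(gradients)
    top_n, rand_n = int(a * n), int(b * n)
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    large = order[:top_n]                      # always keep large-gradient samples
    small = rng.sample(order[top_n:], rand_n)  # randomly subsample the rest
    weights = {i: 1.0 for i in large}
    # Amplify small-gradient samples so the gain estimate stays approximately unbiased.
    weights.update({i: (1 - a) / b for i in small})
    return large + small, weights

# Hypothetical gradients whose magnitude grows with the index.
gradients = [(-1) ** i * i / 20 for i in range(20)]
indices, weights = goss_sample(gradients)
print(indices, weights)
```

With 20 samples and the default ratios, four large-gradient samples and two randomly chosen small-gradient samples are retained, so each tree is built on a fraction of the data.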
Algorithm 7 The Steps of Implementing XGB Classifier
Input: The d-dimensional data $X \in \mathbb{R}^{n \times d}$ with $n$ samples, and target $Y \in \mathbb{R}^{n \times 1}$.
Output: The posterior probability $P \in [0, 1]$ of unseen test set $x$, necessitating $\sum_{i=1}^{C} P_i = 1$, $i \in C$, where $C = 2$ is the class number.
1 Initialize the model with a constant value: $F_0(x) = \arg\min_{\gamma} \sum_{i=1}^{N} L(Y_i, \gamma)$ [53], where $L(Y, F(x))$ is the differentiable loss function and $N$ is the number of samples.
2 for $m = 1 \to M$ (n_Iterations) do
3 Compute the pseudo-residuals, $r_{im} = -\left[\frac{\partial L(Y_i, F(X_i))}{\partial F(X_i)}\right]$, where $i = 1, 2, \ldots, N$.
4 Fit a base tree $h_m$ using the training set $(X_i, r_{im})$ for $i = 1, 2, \ldots, N$.
5 Compute the multiplier $\gamma_m$ by $\gamma_m = \arg\min_{\gamma} \sum_{i=1}^{N} L(Y_i, F_{m-1}(X_i) + \gamma h_m(X_i))$.
6 Update the model by $F_m(x) = F_{m-1}(x) + \gamma_m h_m(x)$.
7 $F_m(x)$ is the desired posterior probability, $P \in [0, 1]$.

f: ENSEMBLE CLASSIFIER
The six different ML models described earlier are employed for the ensemble models, as ensembling can boost the performance of ML-based classifiers [43], [56] and has been shown to perform well in many applications, such as pneumonia and diabetic retinopathy classification [57], [58]. In ensembling approaches, aggregating the outputs of different models can improve the measles vaccine uptake prediction precision. The output from each model, $P_j \in \mathbb{R}^{C}$, $j \in \{1, 2, \ldots, m = 6\}$ ($m$ is the number of classifiers), assigns $C = 2$ confidence values $y_i \in \mathbb{R}$ ($i = 1, 2$) to the unseen test data, where $y_i \in [0, 1]$ and $\sum_{i=1}^{C} y_i = 1$. The weighted aggregation of the various ML models was conducted employing the equation in (1).
$$P_i^{en} = \frac{\sum_{j=1}^{m=6} (W_j \times P_{ij})}{\sum_{i=1}^{C=2} \sum_{j=1}^{m=6} (W_j \times P_{ij})}, \tag{1}$$
where the weight $W_j$ is the jth classifier's AUC. We choose the AUC as the weight for the proposed ensemble classifier since a class-unbiased metric is needed as the weight to introduce weighted soft voting ensembling. The output of the ensemble model, $Y \in \mathbb{R}^{C}$, has the confidence values $P_i^{en} \in [0, 1]$. The final class label of the unseen data of our BDHS datasets, $X \in \mathbb{R}^{n}$, from the ensemble model will be $C_i$ if $P_i^{en} = \max(Y(X))$.
g: HYPERPARAMETER OPTIMIZATION
The performance of ML algorithms depends critically
on identifying a good set of hyperparameters, as those
algorithms are sensitive to many hyperparameters [43],
[59], [60]. Grid search [42] is the most basic
method: the user specifies a finite set of values
for each hyperparameter, and the grid search evaluates the
Cartesian product of these sets [60]. Let us consider
the space of problem parameters $P = (p_1, p_2, \ldots, p_m)$
over which we maximize the p-value. A simple way to
set up a grid search consists in defining a vector of lower
bounds $L = (l_1, l_2, \ldots, l_m)$ and a vector of upper bounds
VOLUME 9, 2021 119619
M. K. Hasan et al.: Associating Measles Vaccine Uptake Classification and Its Underlying Factors
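Algorithm 8 The Steps of Implementing LGB Classifier
Input: The $d$-dimensional data $X \in \mathbb{R}^{n \times d}$ with $n$ samples, and target $Y \in \mathbb{R}^{n \times 1}$
Output: The posterior probability $P \in [0, 1]$ of unseen test set $x$, necessitating $\sum_{i=1}^{C} P_i = 1$, $i \in C = 2$, where $C$ is the class number
1 Combine mutually exclusive features of $X \in \mathbb{R}^{n \times d}$ by the exclusive feature bundling technique and set $\theta_0(x) = \arg\min_{C} \sum_{i}^{n} L(Y_i, C)$
2 for $m = 1$ to $M$ (no_Iteration) do
3 Calculate gradient absolute values as $r_i = \left| \frac{\partial L(y_i, \theta(x_i))}{\partial \theta(x_i)} \right|_{\theta(x) = \theta_{m-1}(x)}$, $i \in n$
4 Resample data set using GOSS process as top_n $= a \times$ len($X$), rand_n $= b \times$ len($X$), sorted $=$ GetSortedIndices(abs($r_i$)), $A =$ sorted[1 : top_n], $B =$ RandomPick(sorted[top_n : len($X$)], rand_n), and $\hat{X} = A + B$, where $a$ and $b$ are the big and slight gradient data sampling ratios, respectively.
5 Estimate information gain as $V_j(d) = \frac{1}{n} \left[ \frac{\left( \sum_{x_i \in A_l} r_i + \frac{1-a}{b} \sum_{x_i \in B_l} r_i \right)^2}{n_l^j(d)} + \frac{\left( \sum_{x_i \in A_r} r_i + \frac{1-a}{b} \sum_{x_i \in B_r} r_i \right)^2}{n_r^j(d)} \right]$
6 Build a new decision tree as $\theta_m(\hat{x})$ on set $\hat{X}$
7 Update $\theta_m(\chi) = \theta_{m-1}(\chi) + \theta_m(\chi)$
8 Finally, obtained $\theta_m(x)$ is the desired posterior probability, $P \in [0, 1]$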
$U = (u_1, u_2, \ldots, u_m)$ for each component of $P$. It involves
taking $n$ equally spaced points in each interval of the form
$[l_i, u_i]$, $i \in \{1, \ldots, m\}$, including $l_i$ and $u_i$. This creates a
total of $n^m$ possible grid points to check. Finally, once
the objective has been evaluated at each grid point, the maximum of these
values is chosen. Table 3 lists the hyperparameters
of the six ML models that are optimized in this
article.
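The grid construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: `grid_search` and `toy_objective` are hypothetical names, and the toy objective merely stands in for a cross-validated AUC that a real run would compute by training a classifier at each grid point.

```python
from itertools import product


def grid_search(objective, lower, upper, n_points):
    """Evaluate `objective` on an equally spaced grid and keep the best point."""
    axes = []
    for lo, hi in zip(lower, upper):
        step = (hi - lo) / (n_points - 1)
        # n_points equally spaced values in [lo, hi], both endpoints included.
        axes.append([lo + k * step for k in range(n_points)])
    best_point, best_score = None, float("-inf")
    evaluated = 0
    for point in product(*axes):  # Cartesian product of the per-parameter grids
        score = objective(point)
        evaluated += 1
        if score > best_score:
            best_point, best_score = point, score
    return best_point, best_score, evaluated


# Toy objective standing in for a cross-validated AUC (hypothetical).
def toy_objective(p):
    return -((p[0] - 3.0) ** 2) - (p[1] - 0.5) ** 2


best_point, best_score, evaluated = grid_search(
    toy_objective, lower=[1.0, 0.0], upper=[5.0, 1.0], n_points=5
)
```

With two parameters and five points per interval, the loop visits every combination of the two axes, which is why the cost of an exhaustive grid grows quickly with the number of hyperparameters.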
4) EVALUATION INDICES
The experiments in this article are evaluated
utilizing several metrics: Sensitivity (Sn), Precision
(Pr), Accuracy (Acc), and the ROC curve with its AUC
value [61], [62]. The first three metrics estimate the
true-positive rate, the positive predictive value, and the fraction of correctly
classified samples among all the samples, respectively. A ROC curve
shows the performance of a classification model at all
classification thresholds, whereas the AUC expresses the
degree of separability achieved by the classifier. Since all
the experiments are conducted using a k-fold cross-validation
technique, the final evaluation metrics are estimated using the
equation in (2) [63], [64].
$$\text{Metric} = \frac{1}{K} \times \sum_{n=1}^{K} P_n \pm \sqrt{\frac{\sum_{n=1}^{K} (P_n - \bar{P})^2}{K - 1}}, \qquad (2)$$
where $K$ is the number of folds, $P_n \in \mathbb{R}$, $n \in K$, is the
performance metric for each fold, and $\bar{P}$ is the mean of $P_n$ over the folds.
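As a small worked illustration of these indices, the sketch below computes Sn, Pr, and Acc from confusion-matrix counts and aggregates per-fold scores into the mean and sample standard deviation of Eq. (2). The helper names and the per-fold AUC values are assumptions made for the example, not results from this article.

```python
from statistics import mean, stdev


def sn_pr_acc(tp, fp, tn, fn):
    """Sensitivity, precision, and accuracy from confusion-matrix counts."""
    sn = tp / (tp + fn)  # true-positive rate
    pr = tp / (tp + fp)  # positive predictive value
    acc = (tp + tn) / (tp + fp + tn + fn)  # share of correctly classified samples
    return sn, pr, acc


def summarize(fold_scores):
    """Mean and sample standard deviation (K - 1 denominator), as in Eq. (2)."""
    return mean(fold_scores), stdev(fold_scores)


# Hypothetical per-fold AUCs from a 5-fold run (illustrative values only).
fold_aucs = [0.78, 0.80, 0.79, 0.81, 0.82]
avg, sd = summarize(fold_aucs)
print(f"AUC = {avg:.3f} ± {sd:.3f}")
```

`statistics.stdev` uses the $K - 1$ denominator, which matches the square-root term of Eq. (2); `statistics.pstdev` would divide by $K$ instead.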
III. RESULTS AND DISCUSSION
This section presents the experiments of
this article with the corresponding results in several
subsections. The best missing value imputation and attribute
selection methods are analyzed through comprehensive
ablation studies in Sections III-A and III-B, respectively.
The hyperparameters of different ML models are optimized
in Section III-C. Finally, Section III-D describes the
results obtained from the different ML models and the proposed
weighted ensemble classifiers with complete ablation studies.
The effectiveness of the proposed classifier has also been
validated employing a statistical ANOVA test in this section.
A. FILLING MISSING VALUES
To alleviate the missing value obstacle (see Section II-B1),
we have applied three strategies, namely Raw (removing
the samples with missing values), Median (filling with the median value), and Mode
(filling with the most frequent value), as presented in Table 2. We have
applied four different BDHS datasets (see Section II-A)
and six separate ML classifiers to produce the ablation
studies on the various FMV methods and to choose the
best-performing FMV technique for the measles categorization.
The experimental results in Table 2 reveal that the Median
and Mode techniques outperform the Raw method by
a significant margin in most cases, while the Raw
method beats them by a low margin in the remaining
cases. The observation in all the BDHS datasets (as
explained in Section II-A) reveals that the percentage of
missing values is small relative to the total samples
(13.7 %). Moreover, only one of the nineteen features (Antenatal visits (A19))
contains missing values. Since
the number of missing values and the number of attributes containing
missing values are both small, the AUCs obtained
from all the classifiers for all the proposed datasets are
almost similar for all the FMV strategies, with slightly
better values for the Median and Mode methods in most cases (see
Table 2).
Again, the visual inspection in Fig. 3 exposes that the
populations of the A19 feature for all the BDHS datasets
follow the normal distribution, conferring similar values of
mode, median, and mean. Such similar median and mode values
are responsible for the similar AUCs of the Median
and Mode FMV methods for all the datasets and
classifiers. Since the Median method outperforms the other
two FMV methods (see Table 2), this method is applied in
the rest of the experiments of this article.
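The three FMV strategies can be illustrated on a single attribute column. The sketch below is a simplified stand-in, assuming missing entries are encoded as `None`; the function name `impute` and the sample values are hypothetical, and a real Raw strategy would drop the whole sample (row), not just the entry.

```python
from statistics import median, mode


def impute(values, strategy):
    """Fill None entries of one attribute column using the chosen strategy."""
    observed = [v for v in values if v is not None]
    if strategy == "raw":
        return observed  # Raw: discard the entries (samples) with missing values
    # Median: middle observed value; Mode: most frequent observed value.
    fill = median(observed) if strategy == "median" else mode(observed)
    return [fill if v is None else v for v in values]


# Hypothetical antenatal-visit counts with two missing entries.
a19 = [2, 3, None, 4, 3, None, 1, 3]
filled_median = impute(a19, "median")
filled_mode = impute(a19, "mode")
kept_raw = impute(a19, "raw")
```

When the attribute is roughly symmetric, as the text argues for A19, the median and the mode coincide and the two filled columns are identical, which is consistent with the near-equal AUCs reported for the two methods.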
TABLE 2. Extensive experimental results in terms of AUC for the missing value imputation, employing three imputation methods, four different BDHS
datasets, and six different classifiers, where the best imputation method for each dataset and classifier is underlined with a blue color.
FIGURE 3. Normal distribution of an A19 attribute of all the BDHS datasets containing the missing values, where (a) for BDHS-2007, (b) for
BDHS-2011, (c) for BDHS-2014, and (d) for BDHS-2020.
B. ATTRIBUTE SELECTION
AS methods have been integrated into the recommended
framework to find the smallest subset of features
that yields increased performance. However, it is impractical
to guess the proper AS method without ablation
studies, as the performance of those methods often varies with
the application. This article explores four distinct AS
methods without attribute transformation (thus conserving
the interpretation) and six different classifiers for the
measles uptake classification task to conduct a complete
ablation study. Fig. 4 displays the AS results from the different
experiments.
The AS results from the FS-based method confirm that
the LGB classifier achieves its highest AUC of
approximately 0.75 utilizing the top 13–14 attributes (see
Fig. 4(a)). The other classifiers also demonstrate their
corresponding highest AUC at that number of features.
Again, the RF-based method shows the highest performance
utilizing the top 9–11 attributes, with a maximum AUC of
0.74 for the same LGB classifier (see Fig. 4(b)). Another
AS method, named LGB-based AS, reaches its maximum
AUC of 0.74 at the top 7–8 attributes for the LGB classifier
(see Fig. 4(c)). Although the FS outperforms the RF-
and LGB-based approaches by a margin of 1.0 %, the former
technique demands more attributes, approximately double
the number needed by the LGB-based scheme. The remaining method,
called XGB-based AS, confers the best AUC of roughly 0.76
for the same LGB classifier with the top 3–5 attributes (see
Fig. 4(d)).
All the results in Fig. 4 demonstrate that the XGB-based
AS method outperforms the RF- and LGB-based
techniques by a margin of 2.0 % and the FS-based system
by a margin of 1.0 %. The FS-based AS method reveals the
discriminative power of each feature independently from
the others, without indicating anything about the
mutual information of feature combinations, leading to poor MDC results. Like the
FS-based method, the RF-based approach also points to low
MDC results, as it assigns higher importance to attributes
without considering their correlation. It is noteworthy that
the classifiers expose their corresponding highest AUC at the
top 3–5 attributes when the XGB-based AS approach is
employed. It is clear from all the panels of Fig. 4
that almost all the classifiers depict the same pattern with
varying attribute numbers, yielding their best
results for the same attribute numbers. The AS experiments
quantitatively support the MDC attribute ranking by the
XGB-based AS process, providing an order of A13, A14,
A1, A17, A19, A7, A12, A11, A16, A18, A8, A6, A4, A15,
A3, A2, A9, A5, and A10 (high to low importance), where
the first 3–5 attributes yield the best AUCs for the MDC. The
obtained attribute ranking is logical, as it
gives a higher rank to the features that are related
to respondents’ ever-born children’s numbers, age at first
birth, current age, birth order, and antenatal visits during the
FIGURE 4. AUC versus the number of features of the proposed BDHS dataset, employing four distinct attribute selection algorithms and six individual
classifiers. The attribute numbers are varied from the top 2 to the top 19 to explore their characteristics in the proposed BDHS dataset.
TABLE 3. The tuned hyperparameters of six ML models with the highest possible AUC for the MDC.
pregnancy etc. Since the XGB-based AS scheme produces
the best results for the measles classification with fewer
attributes, it has been used in the rest of the
experiments in this article.
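The top-k sweep behind Fig. 4 can be sketched as a loop over prefixes of a ranked attribute list. In the sketch below, the ranking is the XGB-based order quoted in the text, while `best_top_k` and `toy_auc` are hypothetical names; `toy_auc` is only a placeholder for training a classifier on the selected attributes and measuring its cross-validated AUC.

```python
def best_top_k(ranking, evaluate, k_min=2):
    """Score the top-k attributes for every k; return (best_k, best_score, curve)."""
    curve = {}
    for k in range(k_min, len(ranking) + 1):
        curve[k] = evaluate(ranking[:k])  # e.g. cross-validated AUC on these attributes
    # Highest score wins; on ties the smaller attribute count is preferred.
    best_k = max(curve, key=lambda k: (curve[k], -k))
    return best_k, curve[best_k], curve


# XGB-based ranking reported in the text (high to low importance).
ranking = ["A13", "A14", "A1", "A17", "A19", "A7", "A12", "A11", "A16", "A18",
           "A8", "A6", "A4", "A15", "A3", "A2", "A9", "A5", "A10"]


# Hypothetical stand-in for a real AUC measurement: peaks at five attributes.
def toy_auc(attrs):
    return 0.76 - 0.01 * abs(len(attrs) - 5)


best_k, best_auc, curve = best_top_k(ranking, toy_auc)
selected = ranking[:best_k]
```

Preferring the smaller subset on ties reflects the stated goal of finding the smallest attribute set that still gives the best performance.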
C. HYPERPARAMETER OPTIMIZATION
The best FMV and AS methods obtained from the two
previous experiments are used for the hyperparameter
optimization of six different ML models to attain the
maximum possible AUCs. Table 3 lists the
ML models’ hyperparameters with their optimized values,
employing a grid search strategy in the proposed framework.
The optimized hyperparameter values are picked from
the set of predefined values in a grid by a searching
algorithm that maximizes the AUC for the MDC, as described in
Section II-B3.
TABLE 4. The measles classification results employing six separate ML models and the proposed weighted ensemble models, incorporating missing value
imputation, attribute selection, and hyperparameter optimization. The best metrics obtained from a single ML model are presented in bold fonts, and
those metrics from the proposed ensembling models are underlined with a blue color.
D. CLASSIFIERS
The measles classification results employing different ML
models, the best performing FMV and AS methods, utilizing
the proposed BDHS datasets, are presented in Table 4.
a: INDIVIDUAL ML CLASSIFIERS
Again, the measles classification from the tree-based
classifiers, such as RF and DT, shows that the RF model
outperforms the DT model by significant
margins in three cases out of four. Although the DT model is
less biased towards the positive class, the performance of
the RF model is far better in terms of Acc and AUC.
Technically, the RF model reduces the variance component
of the error rather than the bias component, as in the DT model.
Hence, the DT model deals better with bias, while the
RF model has better accuracy. These concepts are
reflected in the measles classification of this article, as the
DT model wins in terms of positive predictive value (Pr),
and the RF model outperforms in terms of Acc. Furthermore,
contrasting the results of the boosting-based classifiers (XGB and LGB),
it is perceived that the LGB has higher Sn, Acc, and
AUC, while the XGB has better Pr. Although the XGB model
has a slightly better positive predictive value (Pr), the LGB
model is better in the remaining three metrics (see Table 4).
Although both the XGB and LGB models are based on
the boosting mechanism, the XGB model cannot handle
categorical attributes by itself, unlike LGB or CatBoost
[65], [66]. Therefore, the LGB is the winning model for
the given BDHS dataset, which mainly holds categorical
attributes. Overall, comparing all the single ML models,
the applied LGB deals better with the measles
categorization in the proposed BDHS dataset when the proposed
preprocessing and hyperparameter optimization are practiced
(see the first six rows of Table 4). Such a result demonstrates
the superiority of the LGB model in the measles
classification concerning accuracy and AUC.
b: ENSEMBLING ML CLASSIFIERS
To further enhance the measles categorization results, we performed
an ablation study to build an ensembling classifier,
FIGURE 5. 2D visualization of the proposed BDHS dataset to demonstrate
the inter-class homogeneities using a principal component analysis,
where the x-axis and y-axis respectively denote the first and second
principal components.
as it has been proven earlier that such a classifier provides
better results (see details in Section II-B3). Table 4 displays
the results for all the proposed weighted ensembling models.
Firstly, we aggregate the Bayesian, tree-based, and boosting
ML models to build three ensembling models, where the
AUC of each individual model acts as its weight
in the aggregation. The results of those three models
show that the proposed LGB+XGB wins three cases out of four, namely
Pr, Acc, and AUC, by a high
margin (see the 7th–9th rows of Table 4).
Although the GNB+BNB model
shows 100.0 % Sn, this model
predicts all the samples as positive, as its positive predictive
value (Pr) is the same as the positive class prior probability
($P_{pos}$), i.e., $Pr = P_{pos} = 0.749$. Such results reveal that
an ensemble of Bayesian models is not a suitable choice for classifying
a dataset with many inter-class homogeneities (see the
class similarity in the BDHS dataset in Fig. 5),
as is experimentally confirmed in this
article.
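The link between an all-positive prediction and the observed 100.0 % Sn with Pr equal to the class prior can be checked with a short worked example. The labels below are synthetic and only mirror the 0.749 positive share quoted above; `sn_and_pr` is a hypothetical helper name.

```python
def sn_and_pr(y_true, y_pred):
    """Sensitivity and precision for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn), tp / (tp + fp)


# Synthetic labels with a 74.9 % positive share, mirroring the prior quoted above.
y_true = [1] * 749 + [0] * 251
y_pred = [1] * len(y_true)  # degenerate classifier: every sample is called positive
sn, pr = sn_and_pr(y_true, y_pred)
```

With no negative predictions there are no false negatives, so Sn is exactly one, and every sample counts towards the precision denominator, so Pr collapses to the share of positives in the data.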
FIGURE 6. The ROC curves of two different ensemble models, such as (a) LGB+XGB and (b) GNB+BNB+XGB+LGB, for the measles
classification utilizing the proposed approach.
Secondly, the weighted aggregation of two different
types of model mechanisms, namely Bayesian with tree-based,
Bayesian with boosting-based, and tree-based with boosting-based,
points out that the proposed GNB+BNB+XGB+LGB
increases the overall accuracy with reduced Sn, Pr,
and AUC (see the 10th–12th rows of Table 4). The
other two models, GNB+BNB+DT+RF and
DT+RF+XGB+LGB, do not gain anything from this
type of ensembling. However, the ROC curves in Fig. 6
help explain the superiority of the
LGB+XGB and GNB+BNB+XGB+LGB models.
Although those two ROC curves confer almost similar
AUC values, they mainly differ in their accuracy point (see the
red cross points in both ROC curves). The left ROC curve
for the LGB+XGB model shows a true-positive rate of around 88.0 %
with a false-positive rate of 47.0 % at its accuracy point (see the
blue dashed line in the left figure). Similarly, the right ROC curve
for the GNB+BNB+XGB+LGB model produces a true-positive rate of around
98.0 % with a false-positive rate of 70.0 %
at its accuracy point (see the blue dashed line in the right figure).
Such results confer that to gain 10.0 % in the true-positive rate,
we must accept roughly 23.0 % more false positives, which is not
a good trade in a medical diagnostic application.
Therefore, the LGB+XGB model deals better with both the
true- and false-positive rates, providing the highest possible
AUC of 80.0 %. Thirdly, the weighted ensembling of the
Bayesian-, tree-, and boosting-based models cannot further
improve the classification results; instead, it reduces the
performance. Again, we explore the two AS techniques,
LGB- and XGB-based AS, on all the proposed ensembling
models, whose results are visualized in Fig. 7.
The AS results in Fig. 7 again exhibit a pattern similar to
that in Section III-B. The varying-attribute results
on all the proposed models (see Fig. 7) confirm that
the XGB-based AS method again outperforms the LGB-based
AS process, providing the maximum AUC of 0.80. All
the models exhibit a similar pattern with varying attributes,
demonstrating better results for the XGB+LGB classifier
with the top-5 attributes. The attribute ranking obtained using
the XGB-based AS method and the proposed XGB+LGB
classifier yields similar logical results, as in Section III-B,
giving a higher rank to respondents’ ever-born
children numbers, age at first birth, current age, birth order,
and antenatal visits during the pregnancy.
Furthermore, the experimental results from the different
classification models, utilizing the proposed best preprocessing,
have been validated employing a statistical test called
ANOVA and 10-fold cross-validation. Fig. 8 displays the
Box and Whisker plot of the AUC values of this validation
test. For ANOVA testing, $\alpha = 0.05$ is applied as the
threshold to reject the null hypothesis (all classifiers’
means are equal) if the p-value $< 0.05$, which indicates a
significant result. The ANOVA test yields a p-value of
$7.93 \times 10^{-38}$ ($< 0.05$), so the alternative
hypothesis is accepted, strongly indicating that not all of the
means are equal (also displayed in Fig. 8). Again, a post
hoc T-test (with Bonferroni correction) is incorporated with the
ANOVA test to decide the better classification model in
the recommended classification system, which confirms the
superiority of the offered weighted ensemble XGB+LGB
classifier.
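The test statistic behind a one-way ANOVA and the Bonferroni-adjusted threshold for the post hoc comparisons can be sketched as follows. This is an illustration under stated assumptions: the per-fold AUC values are invented, the helper names are hypothetical, and the sketch stops at the F statistic, since turning it into a p-value requires the F distribution (available, for example, through SciPy's `scipy.stats.f_oneway`).

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA: between-group over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical per-fold AUCs for three classifiers (illustrative only).
aucs = [
    [0.70, 0.71, 0.69, 0.70],
    [0.75, 0.76, 0.74, 0.75],
    [0.80, 0.79, 0.81, 0.80],
]
f_stat = one_way_anova_f(aucs)
pairwise_alpha = bonferroni_alpha(0.05, n_comparisons=3)  # three classifier pairs
```

A large F value means the spread between classifier means dominates the fold-to-fold spread within each classifier; the pairwise T-tests are then judged against the stricter corrected threshold.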
c: YEAR-WISE CROSS-FOLD VALIDATION
All the previous results are obtained utilizing a one-year
BDHS dataset employing 5-fold cross-validation, whereas we
have proposed BDHS datasets for four years ($n = 4$) (see
Section II-A). We evaluate the proposed framework,
incorporating missing value imputation, the AS method, and the
proposed weighted ensembling model, utilizing all the BDHS
datasets, where the data of each year acts as one fold. In this
experiment, the $i$th-year dataset ($i \in n$) is utilized as the test
set, and the remaining three datasets are used as the training
set; this is iterated $n = 4$ times to test all the data in a
year-wise fashion. In this way, we have validated our proposed
FIGURE 7. AUC versus the number of features of the proposed BDHS dataset, employing two distinct AS algorithms and the proposed weighted
ensembling classifiers. The attribute numbers are varied from the top 2 to the top 19 to explore their characteristics in the proposed BDHS dataset.
FIGURE 8. Box and Whisker plot of the AUC values obtained from 10-fold
cross-validation on different ML-based classifiers, where Model-1 to
Model-13, respectively, denote GNB, BNB, RF, DT, XGB, LGB, GNB+BNB,
RF+DT, XGB+LGB, GNB+BNB+RF+DT, RF+DT+LGB+XGB,
GNB+BNB+LGB+XGB, and GNB+BNB+RF+DT+XGB+LGB classifiers.
prediction and showed the generalization capability of our
proposed approach. The ROC curve in Fig. 9 represents the
results of this experiment. The obtained ROC curve shows
that the proposed framework achieves an average AUC of
0.781 with a standard deviation of 0.005. Although the
average AUC using all the
BDHS datasets is less than that of the individual dataset utilization,
the standard deviation (inter-fold variation) is much lower.
Such a result reveals that the utilization of more samples
increases the model’s generality with significantly fewer
inter-fold variations.
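The year-wise protocol amounts to leave-one-year-out cross-validation. A minimal sketch is given below, assuming the data are held in a dictionary keyed by survey year; the function name `year_wise_folds` and the placeholder rows are hypothetical, and the four years follow the dataset names in Fig. 3.

```python
def year_wise_folds(datasets):
    """Yield (test_year, train_rows, test_rows), holding out one survey year per fold."""
    for test_year in datasets:
        # Training set: every row from all the other survey years.
        train_rows = [
            row
            for year, rows in datasets.items()
            if year != test_year
            for row in rows
        ]
        yield test_year, train_rows, datasets[test_year]


# Hypothetical placeholder rows keyed by the four BDHS years.
datasets = {2007: ["r1", "r2"], 2011: ["r3"], 2014: ["r4", "r5"], 2020: ["r6"]}
folds = list(year_wise_folds(datasets))
```

Each fold would then run the full pipeline (imputation, attribute selection, weighted ensemble) on `train_rows` only and score it on the held-out year, so that no information from the test year leaks into training.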
d: FRAMEWORK SUPERIORITY COMPARED TO OTHER
STUDIES
It is unreasonable to compare the recommended framework
with the published frameworks, as we utilized our newly
proposed BDHS datasets (see dataset details in Section II-A).
FIGURE 9. The ROC curve of the best-performing ensemble model, named
LGB+XGB, for the measles classification utilizing the proposed approach
and all the BDHS datasets.
However, it is the first attempt to suggest an AI-based
framework for this task using nationally
representative demographic and health survey data from Bangladesh.
Additionally, the contributions in this article focus on
identifying the factors contributing to the non-utilization
of measles vaccination among children in Bangladesh.
In comparison, the work in [25] utilized Philippine National
Demographic and Health Survey data, using 32 relevant
attributes comprising geographic location, socioeconomic
condition, and features related to children and family
information, and obtained an accuracy of 79.02 %. Another
article [23] reported 72.0 % precision, using 25 attributes
based on the history of the child and their family members.
In contrast, our framework achieved an accuracy of 78.70 %
and a precision of 84.60 %, using only 3–5 attributes, such
as respondents’ ever-born children numbers, age at first birth,
current age, birth order number, and antenatal visits during the
pregnancy. The above discussion reveals the preponderance
of the recommended AI-based system as it provides better
results with the least number of attributes.
IV. CONCLUSION
This article designs and optimizes a novel ML-based
framework for measles vaccine uptake classification and
correlates its underlying factors. The entire study has been
carried out on the newly proposed BDHS datasets. The
recommended framework reveals that a weighted ensemble of
ML models successfully enhances the classification results,
as it aggregates the weighted output probabilities of the
candidate models. Furthermore, the integration
of missing value imputation and attribute selection as
preprocessing also improves the outcome. Adopting
those preprocessing methods is critical and necessitates a
complete ablation study to determine the most suitable
methods. Moreover, compared to other studies, our research
provides a more accurate model using only 3–5 attributes,
namely respondents’ ever-born children numbers, age at first
birth, current age, birth order number, and antenatal visits
during the pregnancy, which are easily explainable. We hope
that this study will help national policymakers to give more
importance to these attributes and to ensure ‘herd immunity’
in the community.
CONFLICT OF INTEREST
The authors have no conflicts of interest to disclose regarding this research.
AUTHOR CONTRIBUTIONS
Md. Kamrul Hasan and Md. Abdul Awal conceived of the
presented idea and planned the experiments. Md. Abdul
Awal and Md. Akhtarul Islam conceptualized the original
idea. Md. Kamrul Hasan and Md. Abdul Awal designed
the model and the computational framework, analyzed the
data, and Md. Kamrul Hasan carried out the implementation.
Md. Kamrul Hasan, Md. Tasnim Jawad, and Aishwariya
Dutta carried out the experiments. Md. Kamrul Hasan,
Md. Tasnim Jawad, Aishwariya Dutta, Md. Abdul Awal, and
Md. Akhtarul Islam wrote the manuscript with support from
Mehedi Masud and Jehad F. Al-Amri, and Mehedi Masud and
Jehad F. Al-Amri edited the manuscript. All authors provided
critical feedback and helped shape the research, analysis,
and manuscript. Md. Kamrul Hasan, Md. Abdul Awal, and
Md. Akhtarul Islam supervised the project.
MATERIAL AVAILABILITY
The data were collected in accordance with the ethical guidelines
described on the DHS website (https://dhsprogram.com/)
and are now published at Harvard Dataverse [41]. This study
did not undergo a separate ethical review. The data
and source codes that support the findings of this study are
available at https://github.com/kamruleee51/measles_vaccine_uptake.
REFERENCES
[1] W. J. Moss, ‘‘Measles,’’ Lancet, vol. 390, no. 10111, pp. 2490–2502,
2017. [Online]. Available: https://www.sciencedirect.com/science/
article/pii/S0140673617314630
[2] H. Q. McLean, A. P. Fiebelkorn, J. L. Temte, and G. S. Wallace,
‘‘Prevention of measles, rubella, congenital rubella syndrome, and
mumps, 2013: Summary recommendations of the Advisory Committee
on Immunization Practices (ACIP),’’ Morbidity Mortality Weekly Rep.,
Recommendations Rep., vol. 62, no. 4, pp. 1–34, 2013.
[3] R. Fernandez, A. Rammohan, and N. Awofeso, ‘‘Correlates of first dose
of measles vaccination delivery and uptake in Indonesia,’’ Asian Pacific J.
Tropical Med., vol. 4, no. 2, pp. 140–145, Feb. 2011.
[4] S. Izadi, S.-M. Zahraie, and M. Sartipi, ‘‘An investigation into a measles
outbreak in southeast Iran,’’ Jpn. J. Infectious Diseases, vol. 65, no. 1,
pp. 45–51, 2012.
[5] A. Mahamud, A. Burton, M. Hassan, J. A. Ahmed, J. B. Wagacha,
P. Spiegel, C. Haskew, R. B. Eidex, S. Shetty, S. Cookson,
C. Navarro-Colorado, and J. L. Goodson, ‘‘Risk factors for measles
mortality among hospitalized Somali refugees displaced by famine,
Kenya, 2011,’’ Clin. Infectious Diseases, vol. 57, no. 8, pp. e160–e166,
Oct. 2013.
[6] N. Sheikh, M. Sultana, N. Ali, R. Akram, R. Mahumud, M. Asaduzzaman,
and A. Sarker, ‘‘Coverage, timelines, and determinants of incomplete
immunization in Bangladesh,’’ Tropical Med. Infectious Disease, vol. 3,
no. 3, p. 72, Jun. 2018.
[7] R. E. Black, S. Cousens, H. L. Johnson, J. E. Lawn, I. Rudan, D. G. Bassani,
P. Jha, H. Campbell, C. F. Walker, R. Cibulskis, T. Eisele, L. Liu, and
C. Mathers, ‘‘Global, regional, and national causes of child mortality in
2008: A systematic analysis,’’ Lancet, vol. 375, no. 9730, pp. 1969–1987,
Jun. 2010.
[8] New Measles Surveillance Data for 2019, World Health Organization,
Geneva, Switzerland, 2019, vol. 24.
[9] A. C. Kantner, S. H. van Wees, E. M. G. Olsson, and S. Ziaei, ‘‘Factors
associated with measles vaccination status in children under the age of
three years in a post-Soviet context: A cross-sectional study using the DHS
VII in Armenia,’’ BMC Public Health, vol. 21, no. 1, pp. 1–10, Dec. 2021.
[10] P. Plans-Rubió, ‘‘Why does measles persist in Europe?’’ Eur. J. Clin.
Microbiol. Infectious Diseases, vol. 36, no. 10, pp. 1899–1906, Oct. 2017.
[11] Y. Hu, Y. Chen, Y. Wang, and H. Liang, ‘‘Evaluation of potentially
achievable vaccination coverage of the second dose of measles containing
vaccine with simultaneous administration and risk factors for missed
opportunities among children in Zhejiang province, East China,’’ Hum.
Vaccines Immunotherapeutics, vol. 14, no. 4, pp. 875–880, Apr. 2018.
[12] P. Plans-Rubió, ‘‘Low percentages of measles vaccination coverage with
two doses of vaccine and low herd immunity levels explain measles
incidence and persistence of measles in the European union in 2017–
2018,’’ Eur. J. Clin. Microbiol. Infectious Diseases, vol. 38, no. 9,
pp. 1719–1729, Sep. 2019.
[13] J. P. Higgins, K. Soares-Weiser, J. A. López-López, A. Kakourou,
K. Chaplin, H. Christensen, N. K. Martin, J. A. Sterne, and A. L. Reingold,
‘‘Association of BCG, DTP, and measles containing vaccines with
childhood mortality: Systematic review,’’ Brit. Med. J., vol. 355, Oct. 2016,
Art. no. i5170.
[14] O. M. de la Santé, ‘‘Measles vaccines: Who position paper—April 2017-
note de synthèse de l’OMS sur les vaccins contre la rougeole-avril 2017,’’
Weekly Epidemiolog. Record= Relevé épidémiologique hebdomadaire,
vol. 92, no. 17, pp. 205–227, 2017.
[15] Bangladesh Demographic and Health Survey 2017–18: Key Indicators,
National Institute of Population Research and Training (NIPORT), Dhaka,
Bangladesh, 2019.
[16] M. D. C. Tauil, A. P. S. Sato, and E. A. Waldman, ‘‘Factors associated with
incomplete or delayed vaccination across countries: A systematic review,’’
Vaccine, vol. 34, no. 24, pp. 2635–2643, May 2016.
[17] S. Bhattacherjee, P. Dasgupta, A. Mukherjee, and S. Dasgupta, ‘‘Vaccine
hesitancy for childhood vaccinations in slum areas of Siliguri, India,’’
Indian J. Public Health, vol. 62, no. 4, p. 253, 2018.
[18] R. Rossi, ‘‘Do maternal living arrangements influence the vaccination
status of children age 12–23 months? A data analysis of demographic
health surveys 2010–11 from Zimbabwe,’’ PLoS ONE, vol. 10, no. 7,
Jul. 2015, Art. no. e0132357.
[19] S. Walsh, D. R. Thomas, B. W. Mason, and M. R. Evans, ‘‘The impact of
the media on the decision of parents in south Wales to accept measles-
mumps-rubella (MMR) immunization,’’ Epidemiol. Infection, vol. 143,
no. 3, pp. 550–560, Feb. 2015.
[20] S. Engebretsen and J. Bohlin, ‘‘Statistical predictions with glmnet,’’ Clin.
Epigenetics, vol. 11, no. 1, pp. 1–3, Dec. 2019.
[21] W. M. T. W. Ahmad, N. Ghani, and S. M. Drus, ‘‘Handling imbalanced
class problem of measles infection risk prediction model,’’ Int. J. Eng. Adv.
Technol., vol. 9, no. 1, pp. 3431–3435, 2019.
[22] J. Nazari, P.-S. Fathi, N. Sharahi, M. Taheri, P. Amini, and
A. Almasi-Hashiani, ‘‘Evaluating measles incidence rates using machine
learning and time series methods in the center of Iran; 1997–2020,’’
Tech. Rep., 2020.
[23] A. Bell, A. Rich, M. Teng, T. Oreskovic, N. B. Bras, L. Mestrinho,
S. Golubovic, I. Pristas, and L. Zejnilovic, ‘‘Proactive advising: A machine
learning driven approach to vaccine hesitancy,’’ in Proc. IEEE Int. Conf.
Healthcare Informat. (ICHI), Jun. 2019, pp. 1–6.
[24] V. Carrieri, R. Lagravinese, and G. Resce, ‘‘Predicting vaccine hesitancy
from area-level indicators: A machine learning approach,’’ MedRxiv,
Mar. 2021.
[25] O. O. Bucaro, ‘‘Exploring relevant features associated with
measles nonvaccination using a machine learning approach,’’
Tech. Rep., 2020. [Online]. Available: https://www.diva-portal.org/smash/
get/diva2:1461628/FULLTEXT01.pdf
[26] A. S. Rao, D. A. D’Mello, R. Anand, and S. Nayak, ‘‘Clinical significance
of measles and its prediction using data mining techniques: A systematic
review,’’ in Advances in Artificial Intelligence and Data Engineering.
Singapore: Springer, 2021, pp. 737–759.
[27] A. Susilowati, Y. Wijayanti, and I. M. Sudana, ‘‘The influencing risk
factors of measles in Bantul regency,’’ Public Health Perspective J., vol. 4,
no. 2, pp. 129–140, 2019.
[28] V. D. Kien, H. Van Minh, K. B. Giang, V. Q. Mai, N. T. Tuan,
and M. B. Quam, ‘‘Trends in childhood measles vaccination highlight
socioeconomic inequalities in Vietnam,’’ Int. J. Public Health, vol. 62,
no. S1, pp. 41–49, Feb. 2017.
[29] C. Hagemann, A. Streng, A. Kraemer, and J. G. Liese, ‘‘Heterogeneity
in coverage for measles and varicella vaccination in toddlers—Analysis
of factors influencing parental acceptance,’’ BMC Public Health, vol. 17,
no. 1, pp. 1–10, Dec. 2017.
[30] A. B. Wilder-Smith and K. Qureshi, ‘‘Resurgence of measles in Europe:
A systematic review on parental attitudes and beliefs of measles vaccine,’’
J. Epidemiol. Global Health, vol. 10, no. 1, p. 46, 2019.
[31] D. E. Griffin, ‘‘Measles vaccine,’’ Viral Immunol., vol. 31, no. 2, pp. 86–95,
2018.
[32] R. D. de Vries, A. W. Mesman, T. B. Geijtenbeek, W. P. Duprex, and
R. L. de Swart, ‘‘The pathogenesis of measles,’’ Current Opinion Virol.,
vol. 2, no. 3, pp. 248–255, 2012.
[33] R. Buchanan and D. J. Bonthius, ‘‘Measles virus and associated central
nervous system sequelae,’’ Seminars Pediatric Neurol., vol. 19, no. 3,
pp. 107–114, Sep. 2012.
[34] J. C. Bester, ‘‘Measles and measles vaccination: A review,’’ JAMA
Pediatrics, vol. 170, no. 12, pp. 1209–1215, 2016.
[35] W. J. Moss and D. E. Griffin, ‘‘Global measles elimination,’’ Nature Rev.
Microbiol., vol. 4, no. 12, pp. 900–908, Dec. 2006.
[36] L. K. Tannous, G. Barlow, and N. H. Metcalfe, ‘‘A short clinical
review of vaccination against measles,’’ JRSM open, vol. 5, no. 4, 2014,
Art. no. 2054270414523408.
[37] R. T. Perry and N. A. Halsey, ‘‘The clinical significance of measles: A
review,’’ J. Infectious Diseases, vol. 189, no. 1, pp. S4–S16, May 2004.
[38] Bangladesh Demographic and Health Survey, Mitra and Associates
(Firm), M. I. I. for Resource Development Demographic and Health
Survey, National Institute of Population Research and Training (NIPORT),
Dhaka, Bangladesh, 2011.
[39] Bangladesh Demographic and Health Survey 2014: Key Indicators,
National Institute of Population Research and Training (NIPORT), Mitra,
Associates, and II, Dhaka, Bangladesh, 2015.
[40] Bangladesh Demographic Health Survey, 2007, National Institute of
Population Research and Training (NIPORT), Mitra, Associates, (Firm),
and Macro International, Dhaka, Bangladesh, 2009.
[41] M. K. Hasan, J. M. Tasnim, A. Dutta, A. M. Abdul, M. A. Islam,
M. Mehedi, and F. Al-Amr Jehad, ‘‘Measles,’’ Harvard Dataverse, V1,
Tech. Rep. UNF:6:CG4S8sYltZv8Btm5uCF/aA==[fileUNF], 2021, doi:
10.7910/DVN/S76AZS.
[42] D. Krstajic, L. J. Buturovic, D. E. Leahy, and S. Thomas, ‘‘Cross-validation
pitfalls when selecting and assessing regression and classification models,’’
J. Cheminform., vol. 6, no. 1, pp. 1–15, Dec. 2014.
[43] M. K. Hasan, M. A. Alam, D. Das, E. Hossain, and M. Hasan, ‘‘Diabetes
prediction using ensembling of different machine learning classifiers,’’
IEEE Access, vol. 8, pp. 76516–76531, 2020.
[44] A. Purwar and S. K. Singh, ‘‘Hybrid prediction model with missing
value imputation for medical data,’’ Expert Syst. Appl., vol. 42, no. 13,
pp. 5621–5631, Aug. 2015.
[45] P. J. García-Laencina, J.-L. Sancho-Gómez, and A. R. Figueiras-Vidal,
‘‘Pattern classification with missing data: A review,’’ Neural Comput.
Appl., vol. 19, no. 2, pp. 263–282, 2010.
[46] T. Aljuaid and S. Sasi, ‘‘Proper imputation techniques for missing values in
data sets,’’ in Proc. Int. Conf. Data Sci. Eng. (ICDSE), Aug. 2016, pp. 1–5.
[47] F. Korn, B.-U. Pagel, and C. Faloutsos, ‘‘On the ‘dimensionality curse’
and the ‘self-similarity blessing,’’’ IEEE Trans. Knowl. Data Eng., vol. 13,
no. 1, pp. 96–111, Jan./Feb. 2001.
[48] A. Jovic, K. Brkic, and N. Bogunovic, ‘‘A review of feature selection
methods with applications,’’ in Proc. 38th Int. Conv. Inf. Commun.
Technol., Electron. Microelectron. (MIPRO), May 2015, pp. 1200–1205.
[49] Q. Gu, Z. Li, and J. Han, ‘‘Generalized Fisher score for feature
selection,’’ 2012, arXiv:1202.3725. [Online]. Available: http://arxiv.org/
abs/1202.3725
[50] B. H. Menze, B. M. Kelm, R. Masuch, U. Himmelreich, P. Bachert,
W. Petrich, and F. A. Hamprecht, ‘‘A comparison of random forest and
its Gini importance with standard chemometric methods for the feature
selection and classification of spectral data,’’ BMC Bioinf., vol. 10, no. 1,
pp. 1–16, 2009.
[51] Y. Ye, C. Liu, N. Zemiti, and C. Yang, ‘‘Optimal feature selection for EMG-
based finger force estimation using LightGBM model,’’ in Proc. 28th
IEEE Int. Conf. Robot Hum. Interact. Commun. (RO-MAN), Oct. 2019,
pp. 1–7.
[52] C. Chen, Q. Zhang, B. Yu, Z. Yu, P. J. Lawrence, Q. Ma, and
Y. Zhang, ‘‘Improving protein-protein interactions prediction accuracy
using XGBoost feature selection and stacked ensemble classifier,’’
Comput. Biol. Med., vol. 123, Aug. 2020, Art. no. 103899.
[53] T. Chen and C. Guestrin, ‘‘XGBoost: A scalable tree boosting system,’’
in Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining,
Aug. 2016, pp. 785–794.
[54] M. Ustuner and F. Balik Sanli, ‘‘Polarimetric target decompositions and
light gradient boosting machine for crop classification: A comparative
evaluation,’’ ISPRS Int. J. Geo-Inf., vol. 8, no. 2, p. 97, Feb. 2019.
[55] A. A. Taha and S. J. Malebary, ‘‘An intelligent approach to credit card
fraud detection using an optimized light gradient boosting machine,’’ IEEE
Access, vol. 8, pp. 25579–25587, 2020.
[56] S.-L. Hsieh, S.-H. Hsieh, P.-H. Cheng, C.-H. Chen, K.-P. Hsu, I.-S. Lee,
Z. Wang, and F. Lai, ‘‘Design ensemble machine learning model for breast
cancer diagnosis,’’ J. Med. Syst., vol. 36, no. 5, pp. 2841–2847, Oct. 2012.
[57] N. Sikder, M. Masud, A. K. Bairagi, A. S. M. Arif, A.-A. Nahid,
and H. A. Alhumyani, ‘‘Severity classification of diabetic retinopathy
using an ensemble learning algorithm through analyzing retinal images,’’
Symmetry, vol. 13, no. 4, p. 670, Apr. 2021.
[58] M. Masud, A. K. Bairagi, A.-A. Nahid, N. Sikder, S. Rubaiee, A. Ahmed,
and D. Anand, ‘‘A pneumonia diagnosis scheme based on hybrid
features extracted from chest radiographs using an ensemble learning
algorithm,’’ J. Healthcare Eng., vol. 2021, pp. 1–11, Feb. 2021, doi:
10.1155/2021/8862089.
[59] M. A. Awal, M. Masud, M. S. Hossain, A. A.-M. Bulbul,
S. M. H. Mahmud, and A. K. Bairagi, ‘‘A novel Bayesian optimization-
based machine learning framework for COVID-19 detection from inpatient
facility data,’’ IEEE Access, vol. 9, pp. 10263–10281, 2021.
[60] L. Li, K. Jamieson, G. DeSalvo, A. Rostamizadeh, and A. Talwalkar,
‘‘Hyperband: A novel bandit-based approach to hyperparameter
optimization,’’ J. Mach. Learn. Res., vol. 18, no. 1, pp. 6765–6816, 2017.
[61] R. Dai, W. Zhang, W. Tang, E. Wynendaele, Q. Zhu, Y. Bin,
B. De Spiegeleer, and J. Xia, ‘‘BBPpred: Sequence-based prediction of
blood-brain barrier peptides with feature representation learning and
logistic regression,’’ J. Chem. Inf. Model., vol. 61, no. 1, pp. 525–534,
Jan. 2021.
[62] N. Cheng, M. Li, L. Zhao, B. Zhang, Y. Yang, C.-H. Zheng, and J. Xia,
‘‘Comparison and integration of computational methods for deleterious
synonymous mutation prediction,’’ Briefings Bioinf., vol. 21, no. 3,
pp. 970–981, May 2020, doi: 10.1093/bib/bbz047.
[63] M. K. Hasan, T. A. Aleef, and S. Roy, ‘‘Automatic mass classification
in breast using transfer learning of deep convolutional neural network
and support vector machine,’’ in Proc. IEEE Region Symp. (TENSYMP),
Jun. 2020, pp. 110–113.
[64] M. A. Awal, M. S. Hossain, K. Debjit, N. Ahmed, R. D. Nath,
G. M. M. Habib, M. S. Khan, M. A. Islam, and M. A. P. Mahmud,
‘‘An early detection of asthma using BOMLA detector,’’ IEEE Access,
vol. 9, pp. 58403–58420, 2021.
[65] A. V. Dorogush, V. Ershov, and A. Gulin, ‘‘CatBoost: Gradient boosting
with categorical features support,’’ 2018, arXiv:1810.11363. [Online].
Available: http://arxiv.org/abs/1810.11363
[66] G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T.-Y. Liu,
‘‘LightGBM: A highly efficient gradient boosting decision tree,’’ in Proc.
Adv. Neural Inf. Process. Syst., vol. 30, 2017, pp. 3146–3154.
MD. KAMRUL HASAN received the B.Sc.
and M.Sc. degrees in electrical and electronic
engineering (EEE) from Khulna University of
Engineering & Technology (KUET), in 2014 and
2017, respectively, and the M.Sc. degree in
medical imaging and application (MAIA) from the
University of Burgundy, France, the University of
Cassino and Southern Lazio, Italy, and the Uni-
versity of Girona, Spain, as an Erasmus Scholar,
in 2019. He is currently working as an Assistant
Professor with the EEE Department, KUET. His research interests include
medical image and data analysis, machine learning, deep convolutional
neural network, medical image reconstruction, augmented reality, and
surgical robotics in minimally invasive surgery. He is currently a supervisor
of several undergraduate students on the classification, segmentation, and
registration of medical images with different modalities. His previous works
were published in various journals such as Medical Image Analysis (MIA;
Elsevier), Computers in Biology and Medicine (CBM; Elsevier), Artificial
Intelligence in Medicine (AIIM; Elsevier), Biomedical Signal Processing
and Control (BSPC; Elsevier), and IEEE ACCESS.
MD. TASNIM JAWAD was born in Rangpur,
Bangladesh, in 2000. He is currently pursuing
the B.Sc. degree in electrical and electronic engi-
neering with Khulna University of Engineering
& Technology. He is also taking supplementary
courses from online educational providers, such as
Coursera and Udemy in machine learning and deep
learning. His current research interests include
image classification, audio classification, medical
image processing, convolutional neural networks,
recurrent neural networks, and generative adversarial networks.
AISHWARIYA DUTTA received the B.Sc. degree
in biomedical engineering (BME) from Khulna
University of Engineering & Technology (KUET),
where she is currently pursuing the master’s
degree with the Department of Biomedical Engi-
neering (BME). She has published one con-
ference paper in the 4th International Joint
Conference on Advances in Computational Intel-
ligence (IJCACI), in 2020, and also coauthored
one international journal article. Her research
interests include machine learning and its applications, deep learning,
biomedical imaging, biomedical signal processing, and nanotechnology in
bioengineering.
MD. ABDUL AWAL received the B.Sc. degree in
electronics and communication engineering (ECE)
from the ECE Discipline, Khulna University,
in 2009, the M.Sc. degree in biomedical engi-
neering from Khulna University of Engineering
& Technology, in 2011, and the Ph.D. degree
in biomedical engineering from The University
of Queensland, Australia, in 2018. He is cur-
rently working as an Associate Professor with
the ECE Discipline, Khulna University, Khulna,
Bangladesh. He is also investigating some projects as the Principal
Investigator and a Co-Investigator and supervising several undergraduate
and post-graduate students. His research interests include signal processing,
especially biomedical signal processing, big data analysis, image processing,
time-frequency analysis, machine learning algorithms, deep learning,
optimization, and computational intelligence in biomedical engineering. He has
more than 40 papers published in internationally accredited journals and
conferences.
MD. AKHTARUL ISLAM received the B.Sc. and
M.S. degrees in statistics, biostatistics & informatics
from Dhaka University, Dhaka, Bangladesh,
in 2012 and 2013, respectively. He is currently
working as an Assistant Professor with the
Statistics Discipline, Khulna University, Khulna,
Bangladesh. He has authored or coauthored
around 12 publications in different peer-reviewed
journals. His research interests include bio-
statistics, epidemiology, public health, infectious
disease, meta-analysis, statistical computing, and multivariate analysis.
MEHEDI MASUD (Senior Member, IEEE)
received the Ph.D. degree in computer science
from the University of Ottawa, Canada. He is
currently a Full Professor with the Department
of Computer Science, Taif University, Taif, Saudi
Arabia. He has authored or coauthored around
50 publications, including refereed IEEE, ACM,
Springer, and Elsevier journals, conference papers,
books, and book chapters. His research interests
include cloud computing, distributed algorithms,
data security, data interoperability, formal methods, and cloud and
multimedia for healthcare. He has served as a Technical Program Committee
Member of different international conferences. He is a recipient of a number
of awards, including the Research in Excellence Award from Taif University.
He is on the Associate Editorial Board of IEEE ACCESS and International
Journal of Knowledge Society Research (IJKSR). He is an Editorial Board
Member of Journal of Software. He also served as the Guest Editor of
ComSIS journal and Journal of Universal Computer Science (JUCS). He is
a member of ACM.
JEHAD F. AL-AMRI received the degree from
the Centre for Computing and Social Responsi-
bility, De Montfort University. He is currently
an Associate Professor with the Department of
Information Technology, Faculty of Computers
and Information Technology, Taif University,
Saudi Arabia. His research interests include cloud
computing security, multimedia security, image
encryption, steganography, and medical image
processing.
... The ensemble of the ML model is a prevalent technique for increasing performance by combining a group of classifiers [31,64,65]. Integrating the outputs from different classifier models in ensemble procedures can boost diabetes prediction accuracy. ...
... The six different ML models, as previously explained (GNB, BNB, RF, DT, XGB, LGB), are utilized for the ensemble frameworks as they can enhance the effectiveness of ML-based classifiers [31,66] and outperform in numerous medical fields, for instance, pneumonia, diabetic retinopathy, and measles vaccination uptake classifications [64,67,68]. We calculate each model's output, Y_j (j = 1, 2, 3, . . ...
... However, their performances for the DDC dataset and aimed tasks are more promising than the other four tree-based and Bayesian classifiers. The boosting classifiers applied in this article are extreme gradient boosting and one of the well-known gradient boosting procedures (ensemble), which improved interpretation and swiftness in tree-based ML algorithms [31,64]. Additionally, they minimize a regularized (L1 and L2) objective function that integrates a convex loss function and a correction term for model complexity, producing a more generic classification in any given assignment, including the aspired task in this article. ...
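As a reading aid for the snippet above, the regularized objective it describes is commonly written in the form below, following the XGBoost formulation of Chen and Guestrin. The symbols are not defined in the snippet and are introduced here as assumptions: \(l\) is a convex loss, \(f_k\) is the \(k\)-th tree, \(T\) is its number of leaves, \(w\) is its vector of leaf weights, and \(\gamma\), \(\lambda\), \(\alpha\) are penalty coefficients (the L1 term with \(\alpha\) is the optional penalty exposed by the library rather than part of the original paper's equation).

```latex
\mathcal{L} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega\left(f_k\right),
\qquad
\Omega(f) = \gamma T + \frac{1}{2}\,\lambda \lVert w \rVert_2^2 + \alpha \lVert w \rVert_1
```

The first sum measures fit to the training labels; the second is the correction term for model complexity that the snippet mentions.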
Article
Diabetes is one of the most rapidly spreading diseases in the world, resulting in an array of significant complications, including cardiovascular disease, kidney failure, diabetic retinopathy, and neuropathy, among others, which contribute to an increase in morbidity and mortality rate. If diabetes is diagnosed at an early stage, its severity and underlying risk factors can be significantly reduced. However, there is a shortage of labeled data and the occurrence of outliers or data missingness in clinical datasets that are reliable and effective for diabetes prediction, making it a challenging endeavor. Therefore, we introduce a newly labeled diabetes dataset from a South Asian nation (Bangladesh). In addition, we suggest an automated classification pipeline that includes a weighted ensemble of machine learning (ML) classifiers: Naive Bayes (NB), Random Forest (RF), Decision Tree (DT), XGBoost (XGB), and LightGBM (LGB). Grid search hyperparameter optimization is employed to tune the critical hyperparameters of these ML models. Furthermore, missing value imputation, feature selection, and K-fold cross-validation are included in the framework design. A statistical analysis of variance (ANOVA) test reveals that the performance of diabetes prediction significantly improves when the proposed weighted ensemble (DT + RF + XGB + LGB) is executed with the introduced preprocessing, with the highest accuracy of 0.735 and an area under the ROC curve (AUC) of 0.832. In conjunction with the suggested ensemble model, our statistical imputation and RF-based feature selection techniques produced the best results for early diabetes prediction. Moreover, the presented new dataset will contribute to developing and implementing robust ML models for diabetes prediction utilizing population-level data. Keywords: artificial intelligence; diabetes prediction; ensemble ML classifier; filling missing value; outlier rejection; South Asian diabetes dataset
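The weighted ensemble mentioned in this abstract can be illustrated with a minimal sketch that averages each model's positive-class probability using per-model weights. The abstract does not say how the weights are chosen, so the weights below (imagined validation AUCs) and the probability values are hypothetical, and `weighted_ensemble` is an illustrative helper rather than the authors' implementation.

```python
from typing import Dict, List


def weighted_ensemble(probas: Dict[str, List[float]],
                      weights: Dict[str, float]) -> List[float]:
    """Weighted average of per-model positive-class probabilities."""
    total = sum(weights[name] for name in probas)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(next(iter(probas.values())))
    combined = []
    for i in range(n_samples):
        # Each model contributes in proportion to its weight.
        score = sum(weights[name] * probas[name][i] for name in probas)
        combined.append(score / total)
    return combined


# Hypothetical probabilities from four models for three samples.
probas = {
    "DT": [0.60, 0.20, 0.50],
    "RF": [0.70, 0.30, 0.40],
    "XGB": [0.80, 0.10, 0.60],
    "LGB": [0.90, 0.20, 0.70],
}
# Hypothetical weights, e.g. each model's validation AUC (an assumption).
weights = {"DT": 0.70, "RF": 0.78, "XGB": 0.81, "LGB": 0.83}

ensemble_proba = weighted_ensemble(probas, weights)
labels = [int(p >= 0.5) for p in ensemble_proba]  # threshold at 0.5
```

With equal weights this reduces to plain soft voting; unequal weights let stronger models dominate the combined score.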
... After feature extraction, a feature selection step is crucial [592], and has been employed for the SLC task to determine the most relevant features and reduce the dimensionality of the feature space [1,356,376,410,443,467,469,502,593]. Moreover, such features may influence the performance of the classification process, i.e., render it slower. ...
... The identification of a good set of hyperparameters is essential for the robust performance of the CNN model [592]. However, hyperparameters cannot be directly learned from regular training processes and must be tuned separately. ...
Article
The Computer-aided Diagnosis or Detection (CAD) approach for skin lesion analysis is an emerging field of research that has the potential to alleviate the burden and cost of skin cancer screening. Researchers have recently indicated increasing interest in developing such CAD systems, with the intention of providing a user-friendly tool to dermatologists to reduce the challenges encountered or associated with manual inspection. This article aims to provide a comprehensive literature survey and review of a total of 594 publications (356 for skin lesion segmentation and 238 for skin lesion classification) published between 2011 and 2022. These articles are analyzed and summarized in a number of different ways to contribute vital information regarding the methods for the development of CAD systems. These ways include: relevant and essential definitions and theories, input data (dataset utilization, preprocessing, augmentations, and fixing imbalance problems), method configuration (techniques, architectures, module frameworks, and losses), training tactics (hyperparameter settings), and evaluation criteria. We intend to investigate a variety of performance-enhancing approaches, including ensemble and post-processing. We also discuss these dimensions to reveal their current trends based on utilization frequencies. In addition, we highlight the primary difficulties associated with evaluating skin lesion segmentation and classification systems using minimal datasets, as well as the potential solutions to these difficulties. Findings, recommendations, and trends are disclosed to inform future research on developing an automated and robust CAD system for skin lesion analysis.
... Therefore, a data-driven approach that involves statistical analysis and machine learning has emerged as a tool that can model spatiotemporal patterns of infectious diseases. The machine learning approach has been used to assess factors that place people at a higher risk of measles [15,16], and researchers have worked on influenza forecasting for a long time using statistical and machine learning methods, such as the autoregressive integrated moving average model and random forest algorithm [17]. Statistical and machine learning models have mainly attempted to simulate the effects of driving factors (i.e., predictive variables) on the spread dynamics of infectious diseases [18][19][20]. ...
Article
Data-driven approaches predict infectious disease dynamics by considering various factors that influence severity and transmission rates. However, these factors may not fully capture the dynamic nature of disease transmission, limiting prediction accuracy and consistency. Our proposed data-driven approach integrates spatiotemporal human mobility patterns from detailed point-of-interest clustering and population flow data. These patterns inform the creation of mobility-informed risk indices, which serve as auxiliary factors in data-driven models for detecting outbreaks and predicting prevalence trends. We evaluated our approach using real-world COVID-19 outbreaks in Beijing and Guangzhou, China. Incorporating the risk indices, our models successfully identified 87% (95% Confidence Interval: 83–90%) of affected subdistricts in Beijing and Guangzhou. These findings highlight the effectiveness of our approach in identifying high-risk areas for targeted disease containment. Our approach was also tested with COVID-19 prevalence data in the United States, which showed that including the risk indices reduced the mean absolute error and improved the R-squared value for predicting weekly case increases at the county level. It demonstrates applicability for spatiotemporal forecasting of widespread diseases, contributing to routine transmission surveillance. By leveraging comprehensive mobility data, we provide valuable insights to optimize control strategies for emerging infectious diseases and facilitate proactive measures against long-standing diseases.
... Next, the MobileNet model [27] was implemented as an example of a lightweight DNN. This model uses depthwise separable CNN [28][29][30][31][32][33][34][35][36][37][38][39][40][41][42][43]. ...
[Figure 8: The first figure represents the identity block, and the figure below is the conv block, where conv denotes the convolution layer, batch denotes the batch normalization layer, and X is the input of the first layer.]
Article
The accuracy of a fingerprint recognition model is extremely important due to its usage in forensic and security fields. Any fingerprint recognition system has a particular network architecture, whereas many other networks achieve higher accuracy. To solve this problem in a unified model, this paper proposes a model that can automatically specify itself. So, it is called an automatic deep neural network (ADNN). Our algorithm can specify the appropriate architecture of the neural network used and some significant parameters of this network. These parameters are the number of filters, epochs, and iterations. It guarantees the highest accuracy by updating itself until achieving 99% accuracy; then it stops and outputs the result. Moreover, this paper proposes an end-to-end methodology for recognizing a person’s identity from the input fingerprint image based on a residual convolutional neural network. It is a complete system and is fully automated in both the feature extraction stage and the classification stage. Our goal is to automate this fingerprint recognition system because the more automatic the system is, the more time and effort it saves. Our model also allows users to react by inputting the initial values of these parameters. Then, the model updates itself until it finds the optimal values for the parameters and achieves the best accuracy. Another advantage of our algorithm is that it can recognize people from their thumb and other fingers, and it can also recognize distorted samples. Our algorithm achieved 99.75% accuracy on the public fingerprint dataset (SOCOFing). This is the best accuracy compared with other models.
Article
Estimating rainfall accurately is crucial for both the community and various institutions involved in managing water resources and preventing disasters. The XGBoost model has demonstrated its effectiveness in predicting rainfall, but it still requires fine-tuning of hyperparameters to enhance its performance. This study seeks to determine the optimal learning rate for rainfall prediction while keeping the max_depth and n_estimator parameters fixed. The hyperparameter optimization process was carried out using a two-step approach: an initial coarse search using RandomizedSearchCV followed by a more detailed fine-tuning using GridSearchCV. The model's foundation relied on historical rainfall data gathered over three months from the Automated Weather Observed System (AWOS) at the Pontianak Meteorological Station, recorded on an hourly basis. To assess the model's performance, several metrics were employed, including accuracy, precision, recall, F1 score, and ROC-AUC. The model demonstrated promising results, with accuracy, precision, recall, and F1 score all reaching 95%, indicating its ability to effectively predict rainfall. However, the ROC-AUC score was somewhat lower at 62%. After conducting the hyperparameter search, the optimal learning rate determined for the model, utilizing the 2040 dataset, was found to be 0.204.
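The two-step search described in this abstract (a coarse randomized pass, then a finer grid around the best coarse value) can be sketched without any ML library. This is only an illustration under stated assumptions: `toy_score` is a hypothetical stand-in for a cross-validated model score, and the search range, iteration count, and grid width are made-up values, not the study's settings. In scikit-learn, the two stages correspond to the `RandomizedSearchCV` and `GridSearchCV` classes that the abstract names.

```python
import random
from typing import Callable, Tuple


def coarse_random_search(score: Callable[[float], float], low: float,
                         high: float, n_iter: int, seed: int = 0) -> float:
    """Stage 1: sample candidates uniformly at random and keep the best."""
    rng = random.Random(seed)
    candidates = [rng.uniform(low, high) for _ in range(n_iter)]
    return max(candidates, key=score)


def fine_grid_search(score: Callable[[float], float], center: float,
                     half_width: float, n_points: int,
                     low: float, high: float) -> Tuple[float, float]:
    """Stage 2: evaluate an evenly spaced grid around the coarse winner."""
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    lo = max(low, center - half_width)
    hi = min(high, center + half_width)
    step = (hi - lo) / (n_points - 1)
    grid = [min(hi, lo + k * step) for k in range(n_points)]
    grid.append(center)  # keep the coarse winner so stage 2 cannot do worse
    best = max(grid, key=score)
    return best, score(best)


def toy_score(learning_rate: float) -> float:
    # Hypothetical stand-in for a cross-validated score; peaks at 0.1 by construction.
    return 1.0 - (learning_rate - 0.1) ** 2


coarse_best = coarse_random_search(toy_score, low=0.01, high=0.5, n_iter=20)
fine_best, fine_score = fine_grid_search(
    toy_score, center=coarse_best, half_width=0.05, n_points=11,
    low=0.01, high=0.5)
```

The coarse stage keeps the number of evaluations small over a wide range, and the fine stage spends its budget only near the most promising value.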
Article
Machine learning is a branch of artificial intelligence. In this work, we use machine learning methods such as K-nearest neighbor (KNN) and linear regression to detect and diagnose illnesses. The dataset is used to train models with supervised learning and reinforcement learning methods in order to construct a logical mathematical model. In the context of learning models, the datasets are employed for purposes such as data analysis and illness diagnosis. The purpose of the Disease Prediction using Machine Learning (ML) system is to make predictions about diseases based on the symptoms reported by patients or other users. The user inputs their symptoms, and the machine returns the likelihood that they have a certain ailment. In machine learning, disease prognosis relies on disease prediction.
Chapter
Parkinson’s disease (PD) is a common dynamic neurodegenerative disorder due to the lack of the brain’s chemical dopamine, impairing motor and nonmotor symptoms. The PD patients undergo vocal cord dysfunctions, producing speech impairment, an early and essential PD indicator. The researchers are contributing to building generic data-driven decision-making systems due to the non-availability of the medical test(s) for the early PD diagnosis. This article has provided an automatic decision-making framework for PD detection by proposing a weighted ensemble of machine learning (ML) boosting classifiers: random forest (RF), AdaBoost (AdB), and XGBoost (XGB). The introduced framework has incorporated outlier rejection (OR) and attribute selection (AS) as the recommended preprocessing. The experimental results reveal that the one-class support vector machine-based OR followed by information gain-based AS performs the best preprocessing in the aimed task. Additionally, one of the proposed ensemble models has outputted an average area under the ROC curve (AUC) of 0.972, outperforming the individual RF, AdB, and XGB classifiers with the margins of \(0.5\,\%\), \(3.7\,\%\), and \(1.4\,\%\), respectively, while the advised preprocessing is incorporated. Since the suggested system provides better PD diagnosis results, it can be a practical decision-making tool for clinicians in PD diagnosis. Keywords: Parkinson disease; outlier rejection; attribute selection; machine learning models; ensemble classifiers
Article
Background: Measles is a feverish condition labeled among the most infectious viral illnesses in the globe. Despite the presence of a secure, accessible, affordable and efficient vaccine, measles continues to be a worldwide concern. Methods: This epidemiologic study used machine learning and time series methods to assess factors that placed people at a higher risk of measles. The study contained the measles incidence in Markazi Province, the center of Iran, from Apr 1997 to Feb 2020. In addition to machine learning, zero-inflated negative binomial regression for time series was utilized to assess development of measles over time. Results: The incidence of measles was 14.5% over the recent 24 years and a constant trend of almost zero cases were observed from 2002 to 2020. The order of independent variable importance were recent years, age, vaccination, rhinorrhea, male sex, contact with measles patients, cough, conjunctivitis, ethnic, and fever. Only 7 new cases were forecasted for the next two years. Bagging and random forest were the most accurate classification methods. Conclusion: Even if the numbers of new cases were almost zero during recent years, age and contact were responsible for non-occurrence of measles. October and May are prone to have new cases for 2021 and 2022.
Article
Vaccine hesitancy (VH) might represent a serious threat to the next COVID‐19 mass immunization campaign. We use machine learning algorithms to predict communities at a high risk of VH relying on area‐level indicators easily available to policymakers. We illustrate our approach on data from child immunization campaigns for seven nonmandatory vaccines carried out in 6062 Italian municipalities in 2016. A battery of machine learning models is compared in terms of area under the receiver operating characteristics curve. We find that the Random Forest algorithm best predicts areas with a high risk of VH improving the unpredictable baseline level by 24% in terms of accuracy. Among the area‐level indicators, the proportion of waste recycling and the employment rate are found to be the most powerful predictors of high VH. This can support policymakers to target area‐level provaccine awareness campaigns.
Article
Asthma, a chronic airway disease that causes bronchial inflammation, breathlessness, and wheezing, is increasingly becoming life-threatening. Even in the worst cases, it may destroy the quality to lead. Therefore, early detection of asthma is urgently needed, and machine learning can help identify asthma accurately. In this paper, a novel machine learning framework, namely BOMLA (Bayesian Optimisation-based Machine Learning framework for Asthma) detector has been proposed to detect asthma. Ten classifiers have been utilized in the BOMLA detector, where Support Vector Classifier (SVC), Random Forest (RF), Gradient Boosting Classifier (GBC), eXtreme Gradient Boosting (XGB), and Artificial Neural Network (ANN) are state-of-the-art classifiers. In contrast, Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QLDA), Naive Bayes (NB), Decision Tree (DT), and K-Nearest Neighbor (KNN) are conventional popular classifiers. The ADASYN algorithm has also been employed in the BOMLA detector to address the issues created by the imbalanced dataset, and its effect on the classification performance has been examined. The highest accuracy (ACC) and Matthews’s correlation coefficient (MCC) on an asthma dataset are 94.35% and 88.97%, respectively, using the BOMLA detector when SVC is adapted, and they increase to 96.52% and 93.04%, respectively, when the ensemble technique is adapted. The one-way analysis of variance (ANOVA) has also been performed in the 10-fold cross-validation to measure the statistical significance. A decision support system has been built as a potential application of the proposed system to visualize the probable outcome of the patient. Finally, it is expected that the BOMLA detector will help patients in their early diagnosis of asthma.
Article
Diabetic Retinopathy (DR) refers to the damages endured by the retina as an effect of diabetes. DR has become a severe health concern worldwide, as the number of diabetes patients is soaring uncountably. Periodic eye examination allows doctors to detect DR in patients at an early stage to initiate proper treatments. Advancements in artificial intelligence and camera technology have allowed us to automate the diagnosis of DR, which can benefit millions of patients indeed. This paper inscribes a novel method for DR diagnosis based on the gray-level intensity and texture features extracted from fundus images using a decision tree-based ensemble learning technique. This study primarily works with the Asia Pacific Tele-Ophthalmology Society 2019 Blindness Detection (APTOS 2019 BD) dataset. We undertook several steps to curate its contents to make them more suitable for machine learning applications. Our approach incorporates several image processing techniques, two feature extraction techniques, and one feature selection technique, which results in a classification accuracy of 94.20% (margin of error: 0.32%) and an F-measure of 93.51% (margin of error: 0.5%). Several other parameters regarding the proposed method’s performance have been presented to manifest its robustness and reliability. Details on each employed technique have been included to make the provided results reproducible. This method can be a valuable tool for mass retinal screening to detect DR, thus drastically reducing the rate of vision loss attributed to it.
Article
Background: The resurgence of measles globally and the increasing number of unvaccinated clusters call for studies exploring factors that influence measles vaccination uptake. Armenia is a middle-income post-Soviet country with an officially high vaccination coverage. However, concerns about vaccine safety are common. The purpose of this study was to measure the prevalence of measles vaccination coverage in children under three years of age and to identify factors that are associated with measles vaccination in Armenia by using nationally representative data. Methods: Cross-sectional analysis using self-report data from the most recent Armenian Demographic Health Survey (ADHS VII 2015/16) was conducted. Among 588 eligible women with a last-born child aged 12–35 months, 63 women were excluded due to unknown status of measles vaccination, resulting in 525 women included in the final analyses. We used logistic regression models in order to identify factors associated with vaccination status in the final sample. Complex sample analyses were used to account for the study design. Results: In the studied population 79.6% of the children were vaccinated against measles. After adjusting for potential confounders, regression models showed that the increasing age of the child (AOR 1.07, 95% CI: 1.03–1.12), secondary education of the mothers (AOR 3.38, 95% CI: 1.17–9.76) and attendance at postnatal check-up within two months after birth (AOR 2.71, 95% CI: 1.17–6.30) were significantly associated with the vaccination status of the child. Conclusions: The measles vaccination coverage among the children was lower than the recommended percentage. The study confirmed the importance of maternal education and attending postnatal care visits. However, the study also showed that there might be potential risks for future measles outbreaks because of delayed vaccinations and a large group of children with an unknown vaccination status.
Article
Pneumonia is a fatal disease responsible for almost one in five child deaths worldwide. Many developing countries have high mortality rates due to pneumonia because of the unavailability of proper and timely diagnostic measures. Machine learning-based diagnosis methods can help detect the disease early, in less time and at lower cost. In this study, we proposed a novel method to determine the presence of pneumonia and identify its type (bacterial or viral) by analyzing chest radiographs. We performed a three-class classification based on features containing diverse information about the samples. After using an augmentation technique to balance the dataset’s sample sizes, we extracted the chest X-ray images’ statistical features, as well as global features by employing a deep learning architecture. We then combined both sets of features and performed the final classification using the RandomForest classifier. A feature selection method was also incorporated to identify the features with the highest relevance. We tested the proposed method on a widely used (but relabeled) chest radiograph dataset to evaluate its performance. The proposed model can classify the dataset’s samples with an 86.30% classification accuracy and 86.03% F-score, which support the model’s efficacy and reliability. However, results show that the classifier struggles to distinguish between viral and bacterial pneumonia samples. Implementing this method will provide a fast and automatic way to detect pneumonia in a patient and identify its type.
Article
The whole world faces a pandemic due to the deadly COVID-19 virus. It takes considerable time before the virus can be detected in an infected person, and during this time it may be transmitted to other people. To address this situation, quick identification of COVID-19 patients is required. We have designed and optimized a machine learning-based framework using inpatient facility data that gives a user-friendly, cost-effective, and time-efficient solution to this pandemic. The proposed framework uses Bayesian optimization to optimize the hyperparameters of the classifier and the ADAptive SYNthetic (ADASYN) algorithm to balance the COVID and non-COVID classes of the dataset. Although the proposed technique has been applied to nine state-of-the-art classifiers to show its efficacy, it can be applied to many other classifiers and classification problems. It is evident from this study that eXtreme Gradient Boosting (XGB) provides the highest Kappa index of 97.00%. Compared with the same pipeline without ADASYN, our proposed approach yields an improvement in the kappa index of 96.94%. Bayesian optimization has also been compared with grid search and random search to show its efficiency. Furthermore, the most dominant features have been identified using SHapley Additive exPlanations (SHAP) analysis. A comparison has also been made with other related works. The proposed method can identify COVID-19 patients in less time than conventional techniques. Finally, two potential applications, namely a clinically operable decision tree and a decision support system, have been demonstrated to support clinical staff and build a recommender system.
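For readers unfamiliar with the Kappa index used as the headline metric above, the following is a minimal sketch of Cohen's kappa, which measures agreement between true and predicted labels after correcting for agreement expected by chance. The toy labels are invented for illustration and are not taken from the study; the abstract reports kappa as a percentage, so 97.00% corresponds to 0.97 on the scale used here.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    # Fraction of samples where prediction and truth agree
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected if labels were assigned independently
    # with the same class frequencies
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: both labelings use one identical class
    return (observed - expected) / (1 - expected)


# Toy labels (1 = COVID, 0 = non-COVID), invented for illustration only
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
kappa = cohen_kappa(y_true, y_pred)
print(f"kappa = {kappa:.3f}")
```

Because kappa discounts chance agreement, it is less flattering than raw accuracy on imbalanced data, which is one plausible reason to report it alongside a class-balancing step such as ADASYN.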
Article
Blood-brain barrier peptides (BBPs) have a large range of biomedical applications since they can cross the blood-brain barrier through different mechanisms. As experimental methods for the identification of BBPs are laborious and expensive, computational approaches for predicting BBPs are needed. In this work, we describe a computational method, BBPpred (blood-brain barrier peptides prediction), that can efficiently identify BBPs using logistic regression. We investigate a wide variety of features from amino acid sequence information, and then a feature learning method is adopted to represent the informative features. To improve the prediction performance, seven informative features are selected for classification by eliminating redundant and irrelevant features. In addition, we specifically create two benchmark data sets (training and independent test), which contain a total of 119 BBPs from public databases and the literature. On the training data set, BBPpred shows promising performance with an AUC score of 0.8764 and an AUPR score of 0.8757 using 10-fold cross-validation. We also test our new method on the independent test data set and obtain a favorable performance. We envision that BBPpred will be a useful tool for identifying, annotating, and characterizing BBPs. BBPpred is freely available at http://BBPpred.xialab.info.
Article
Protein-protein interactions (PPIs) are involved with most cellular activities at the proteomic level, making the study of PPIs necessary for understanding any biological process. Machine learning approaches have been explored, leading to more accurate and generalized PPI predictions. In this paper, we propose a predictive framework called StackPPI. First, we use pseudo amino acid composition, Moreau-Broto, Moran and Geary autocorrelation descriptor, amino acid composition position-specific scoring matrix, Bi-gram position-specific scoring matrix and composition, transition and distribution to encode biologically relevant features. Secondly, we employ XGBoost to reduce feature noise and perform dimensionality reduction through gradient boosting and average gain. Finally, the resulting optimized features are analyzed by StackPPI, a PPI predictor we have developed from a stacked ensemble classifier consisting of random forest, extremely randomized trees and logistic regression algorithms. Five-fold cross-validation shows StackPPI can successfully predict PPIs with an ACC of 89.27%, MCC of 0.7859, AUC of 0.9561 on Helicobacter pylori, and with an ACC of 94.64%, MCC of 0.8934, AUC of 0.9810 on Saccharomyces cerevisiae. We find StackPPI improves protein interaction prediction accuracy on independent test sets compared to the state-of-the-art models. Finally, we highlight StackPPI's ability to infer biologically significant PPI networks. StackPPI's accurate prediction of functional pathways makes it the logical choice for studying the underlying mechanism of PPIs, especially as it applies to drug design. The datasets and source code used to create StackPPI are available here: https://github.com/QUST-AIBBDRC/StackPPI/.
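The ACC and MCC figures quoted above are both derived from a binary confusion matrix. The sketch below shows the standard formulas under that assumption; the confusion counts are hypothetical and chosen only to make the arithmetic easy to follow, not taken from the StackPPI experiments.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


# Hypothetical confusion counts (interacting vs. non-interacting pairs)
tp, tn, fp, fn = 90, 85, 15, 10
acc = accuracy(tp, tn, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(f"ACC = {acc:.4f}, MCC = {mcc:.4f}")
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, so it can be noticeably lower than accuracy when errors are unevenly distributed between the classes.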