Associating Measles Vaccine Uptake Classification and its Underlying Factors Using an Ensemble of Machine Learning Models

August 2021
IEEE Access PP(99):1-1

DOI:10.1109/ACCESS.2021.3108551

License
CC BY-NC-ND 4.0

Authors:

Md. Kamrul Hasan

Khulna University of Engineering and Technology

Md Tasnim Jawad

Khulna University of Engineering and Technology

Aishwariya Dutta

Military Institution of Science & Technology (MIST)

Md.abdul Awal

Khulna University

Measles is one of the significant public health issues responsible for the high mortality rate around the globe, especially in developing countries. Using nationally representative demographic and health survey data, measles vaccine utilization has been classified, and its underlying factors have been identified through an ensemble Machine Learning (ML) approach. Firstly, missing values are imputed employing various approaches, and then several feature selection techniques are applied to identify the crucial attributes for predicting measles vaccination. A grid search hyperparameter optimization technique has been applied for tuning the critical hyperparameters of different ML models, such as Naive Bayes, random forest, decision tree, XGboost, and lightgbm. The categorization performance of the individual optimized ML models, as well as their ensembles, has been reported utilizing our proposed BDHS dataset. Individually, the optimized lightgbm provides the highest precision and AUC of 79.90% and 77.80%, respectively. This result improved when the optimized lightgbm was ensembled with XGboost, providing a precision and AUC of 84.60% and 80.0%, respectively. Our results reveal that the statistical median imputation technique with the XGboost-based attribute selection method and the lightgbm classifier provides the best individual result. The performance improved further when the proposed weighted ensemble of XGboost and lightgbm was adopted with the same preprocessing, and this ensemble is recommended for measles vaccine utilization. The significance of our proposed approach is that it utilizes a minimum number of attributes collected from the child and their family members and yields 80.0% accuracy, making it easily explainable by caregivers and healthcare personnel. Finally, our predictive model provides an early detection procedure to help national policymakers enforce new policies with specific rules and regulations.
The data and source codes that support the findings of this study are available at https://github.com/kamruleee51/measles_vaccine_uptake.

2D visualization of the proposed BDHS dataset to demonstrate the inter-class homogeneities using a principal component analysis, where the x-axis and y-axis respectively denote the first and second principal components.

Received August 2, 2021, accepted August 24, 2021, date of publication August 27, 2021, date of current version September 3, 2021.

Digital Object Identifier 10.1109/ACCESS.2021.3108551

Associating Measles Vaccine Uptake Classification

and Its Underlying Factors Using an Ensemble of

Machine Learning Models

MD. KAMRUL HASAN 1, MD. TASNIM JAWAD 1, AISHWARIYA DUTTA 2, MD. ABDUL AWAL 3,

MD. AKHTARUL ISLAM 4, MEHEDI MASUD 5, (Senior Member, IEEE),

AND JEHAD F. AL-AMRI 6

1Department of Electrical and Electronic Engineering (EEE), Khulna University of Engineering & Technology (KUET), Khulna 9203, Bangladesh

2Department of Biomedical Engineering (BME), Khulna University of Engineering & Technology (KUET), Khulna 9203, Bangladesh

3Electronics and Communication Engineering (ECE) Discipline, Khulna University (KU), Khulna 9208, Bangladesh

4Statistics Discipline, Khulna University (KU), Khulna 9208, Bangladesh

5Department of Computer Science, College of Computers and Information Technology, Taif University, Taif 21944, Saudi Arabia

6Department of Information Technology, College of Computer and Information Technology, Taif University, Taif 21994, Saudi Arabia

Corresponding author: Md. Abdul Awal (m.awal@ece.ku.ac.bd)

This work was supported by Taif University Researchers Supporting Project, Taif University, Taif, Saudi Arabia, under

Grant TURSP-2020/211.

This work involved human subjects or animals in its research. Approval of all ethical and experimental procedures and protocols was

granted by the ICF Institutional Review Board (ICF-IRB).

ABSTRACT Measles is one of the signiﬁcant public health issues responsible for the high mortality

rate around the globe, especially for developing countries. Using nationally representative demographic

and health survey data, measles vaccine utilization has been classiﬁed, and its underlying factors are

identiﬁed through an ensemble Machine Learning (ML) approach. Firstly, missing values are imputed

employing various approaches, and then several feature selection techniques have been applied to identify the

crucial attributes for predicting measles vaccination. A grid search hyperparameter optimization technique

has been applied for tuning the critical hyperparameters of different ML models, such as Naive Bayes,

random forest, decision tree, XGboost, and lightgbm. The categorization performance of the individual

optimized ML models, as well as their ensembles, has been reported utilizing our proposed BDHS dataset. Individually,

the optimized lightgbm provides the highest precision and AUC of 79.90% and 77.80 %, respectively. This

result improved when the optimized lightgbm is ensembled with XGboost, providing the precision and AUC

of 84.60 % and 80.0%, respectively. Our result reveals that the statistical median imputation technique with

the XGboost-based attribute selection method and the lightgbm classiﬁer provides the best individual result.

The performance improved when the proposed weighted ensemble of the XGboost and lightgbm approach

was adapted with the same preprocessing and recommended for measles vaccine utilization. The signiﬁcance

of our proposed approach is that it utilizes minimum attributes collected from the child and their family

members and yielded 80.0 % accuracy, making it easily explainable by caregivers and healthcare personnel.

Finally, our predictive model provides an early detection procedure to help national policymakers enforce

new policies with speciﬁc rules and regulations. The data and source codes that support the ﬁndings of this

study are available at https://github.com/kamruleee51/measles_vaccine_uptake.

INDEX TERMS Attribute selection, measles vaccine uptake classiﬁcation, measles BDHS data, missing

value imputation, weighted ensemble ML model.

I. INTRODUCTION

The associate editor coordinating the review of this manuscript and approving it for publication was Emre Koyuncu.

Measles is a highly contagious viral disease, which is very common in developing countries and is associated with a significant level of mortality and morbidity [1], [2]. This viral disease is vaccine-preventable, yet among vaccine-preventable diseases, measles is a leading cause of death in children, and its fatality rate is up to 10.0% [3]–[5].

This vaccine-preventable disease is a crucial public health

issue in sub-Saharan Africa and South-East Asia, including

VOLUME 9, 2021

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.

For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/ 119613

M. K. Hasan et al.: Associating Measles Vaccine Uptake Classification and Its Underlying Factors

Bangladesh [6], [7]. Every year, more than 0.1 million deaths

occur due to measles, and in the first three months of 2019,

measles cases increased by 300.0% compared with 2018 [1],

[8], [9]. To reduce measles and increase community-level

immunity, 95.0% measles vaccination coverage is crucial

with two doses, which will decrease related causes of

mortality and lead to elimination [10]–[13]. The elimination

of measles is understood here according to the statement made

by the World Health Organization (WHO) (2017): ‘‘The

interruption of measles transmission in a deﬁned geograph-

ical area that has lasted at least 12 months and is veriﬁed

after it has been sustained for at least 36 months [14]’’.

In Bangladesh, measles vaccination coverage was 88.0 %

among children below the age of one year. The Fourth

Health, Population and Nutrition Sector Program (HPNSP)

sets a goal of 90.0% coverage by 2022 [15]. A crucial

step toward increasing the vaccination rate is to recognize the

inﬂuencing factors associated with the utilization of measles

vaccination [16], [17]. Existing literature revealed several

inﬂuencing factors related to measles vaccination uptake [6],

[17]–[19].

This study focused on recognizing the factors contributing to the non-utilization of measles vaccination among children in Bangladesh. We applied Machine Learning (ML) techniques to data from four consecutive Bangladesh Demographic and Health Surveys (BDHS) conducted from 2007 to 2017−18. Compared with other methods frequently applied to variable selection problems, ML may accelerate the recognition of the features related to the non-utilization of the measles vaccine and improve the prediction accuracy of the final classification model [20]. For example, the authors in [21] utilized the Synthetic Minority Over-Sampling Technique (SMOTE) to address class imbalance and obtained a true-positive rate of 93.90%, whereas the false-positive and false-negative rates were 5.80% and 5.10%, respectively. To evaluate the factors that place individuals at a higher risk of measles, the authors in [22] utilized ML techniques and found that contact with measles patients, age, rhinorrhea, vaccination, male sex, cough, conjunctivitis, ethnicity, and fever were the crucial factors associated with measles disease. The authors in [23] applied a LASSO (Least Absolute Shrinkage and Selection Operator) logistic regression model to electronic health records to identify measles vaccine-resistant families and obtained 72.0% precision, using 25 features based on the history of the child and their family members. The authors in [24] employed an ML approach based on area-level features to predict vaccine hesitancy for a broad range of vaccine-preventable diseases, including measles, and found that the random forest outperformed the gradient boosting machine, LASSO, and neural network.

The authors in [25] explored and identified associated features to predict measles non-vaccination from the Philippine National Demographic and Health Survey data. They employed an Elastic Net ML model using 32 relevant attributes comprising geographic location, socioeconomic condition, and features related to children and family information, and obtained an accuracy, sensitivity, and specificity of 79.02%, 97.73%, and 23.41%, respectively. A review article [26] explored the usefulness of data mining and ML approaches for studying the clinical significance of measles and its prediction. A multiple linear regression model was applied in [27], which found that the factors associated with measles uptake were parenting and knowledge, nutritional status, and behavior. The authors of [28] applied a logistic regression model to examine the association between socioeconomic characteristics and measles uptake and revealed that measles vaccine utilization rates are highly socially determined. A positive relationship between child daycare centers, maternal and paternal education, and measles vaccine uptake was demonstrated in [29] in Germany. Finally, a systematic review of primary studies [30] found that community health, peer judgment, confidence in experts and vaccines, responsibility toward children, and measles severity are strongly associated with measles, mumps, and rubella vaccine uptake.

Research on measles and its vaccine using ML approaches remains limited, and to the best of our knowledge, this article is the first such attempt in Bangladesh with our proposed BDHS data. The significant contributions and key topics covered by this article are as follows:

• Proposing nationally representative demographic and health survey measles data from Bangladesh, called the BDHS dataset.

• Developing a framework for linking measles vaccine uptake classification and its underlying factors.

• Incorporating an integral preprocessing, which includes missing value imputation and attribute selection strategies.

• Optimizing the hyperparameters of different ML-based models and proposing a weighted ensemble ML model for the aimed task of this article.

• Conducting complete ablation studies for the preprocessing and classifier determination for recommending the best possible framework for measles vaccine utilization.

The remaining sections of the article are arranged as follows: Section II describes the proposed BDHS dataset and framework. Section III presents the results of the extensive experiments, along with their explainability. Finally, Section IV concludes the article with future working directions.

II. MATERIALS AND METHODS

This section details the materials and methodologies

of the article. Section II-A describes the proposed

datasets, which were collected from Bangladesh. Section II-B

explains the proposed framework, incorporating missing

value imputation (see Section II-B1), attribute selection

(see Section II-B2), and different ML classifiers with

the proposed ensemble classifier (see Section II-B3).

In Section II-B3, we also describe the hyperparameter optimization

for the different ML models. Finally, we define the

evaluation indices of the different comprehensive experiments in

Section II-B4.

A. PROPOSED DATASETS

1) CLINICAL INTERPRETATION OF MEASLES

The disease is caused by an RNA respiratory virus of the Morbillivirus genus and Paramyxoviridae family [31]–[33]. According to the WHO, the clinical definition of measles is any individual with fever, a generalized maculopapular rash, and cough, coryza, or conjunctivitis [31], [34], [35]. Sometimes, unusual tiny white spots on the buccal mucosa, called Koplik spots, can be observed in measles [31], [34]. The symptoms of measles, namely fever (which may be as high as 40 °C), cough, conjunctivitis, and rash, are similar to the symptoms of other seasonal respiratory infections [28], [36]. This similarity may explain why measles cases increase rapidly and spread quickly through close contact and routine interaction in public places [36], [37]. The measles virus infects individuals through respiratory droplets produced by sneezing or coughing or through direct contact. These tiny droplets or small-particle aerosols can remain in the air for prolonged durations, and the typical contagious period lasts until four days after the rash appears [31]–[33]. Therefore, vaccine utilization to prevent measles is crucial for building herd immunity in the community.

2) DATA SOURCES AND VARIABLES

This study utilized data from four consecutive nationally representative

Demographic and Health Surveys of Bangladesh, conducted

in 2007, 2011, 2014, and 2017-18 [15], [38]–[40].

These datasets were collected under the National Institute

of Population Research and Training (NIPORT) authority

of the Ministry of Health and Family Welfare (MOHFW).

A Bangladeshi research organization, Mitra and Associates,

implemented the survey. In this survey, a two-stage stratified

cluster sampling technique was utilized. In the first stage,

the total area was divided into several enumeration areas (EAs),

from which a sample was selected; in the second stage, households

were selected within each sampled EA. For instance, in the 2017-18 survey, a list

of 675 EAs was established in the

first stage, with 250 in urban and 425 in rural areas.

In the second stage, an average of 30 households was selected

from each EA. The BDHS 2017-18 was conducted using five

types of questionnaires. In this study, we used data from

the woman’s questionnaire. This questionnaire was based on

the model questionnaires developed for the worldwide DHS-

7 Program, adjusted to the circumstances and requirements in

Bangladesh, and considering the content of the instruments

employed in earlier DHS surveys in Bangladesh [15].

Our focused question was related to children’s immu-

nizations. During this survey, women were asked questions

regarding their socioeconomic characteristics (for instance,

age, education, religion, and media exposure), reproductive

history, knowledge of uses and sources of family planning

methods, antenatal, delivery, postnatal, and newborn care,

husbands’ background, etc. [15]. Note that we have utilized

publicly accessible secondary datasets

for this study. These data were collected in compliance with

the ethical standards described on the DHS website

(https://dhsprogram.com/) and are now published at Harvard

Dataverse [41]. Therefore, this study did not require a separate

ethical review approval.

Dependent Variable: We consider measles as the depen-

dent variable with two categories. Children who took the

measles vaccine were categorized as ‘‘Yes’’, and those who

did not take the measles vaccine were categorized as ‘‘No’’.

In the following Fig. 1, we represent the prevalence rate of

measles uptake in different divisions in Bangladesh. Children

from the Barisal division recorded the lowest prevalence

(59.67 %) of measles uptake, whereas children from the

Rajshahi division showed the highest prevalence (67.91 %).

Also, in the Dhaka and Khulna divisions, the rates were

65.21 % and 63.91 %, respectively.

Independent Variable: Table 1 illustrates the different

independent variables, which are classiﬁed as categorical and

continuous attributes.

B. PROPOSED METHODOLOGIES

Fig. 2 displays our proposed framework for the Measles Disease Classification (MDC), which incorporates two crucial preprocessing steps: missing value imputation and attribute selection. We apply different imputation and attribute selection techniques to perform complete ablation studies for the proposed BDHS datasets. After preprocessing, the BDHS datasets are partitioned into $K$ folds, where $K-1$ folds are utilized for training and for fine-tuning the hyperparameters in the inner loop, employing the grid search algorithm [42]. In the outer loop (repeated $K$ times), the best hyperparameters and the unseen test fold are utilized to evaluate the classifier in the proposed framework. Since the proposed BDHS datasets contain imbalanced class samples, stratified cross-validation [43] has been adopted to preserve the original class ratio in each fold. After training all the ML models, they are evaluated on the unseen test data. The obtained prediction probabilities ($P_i, \forall i \in N$, where $N$ is the number of candidate classifiers for ensembling) are then employed to build an ensemble classifier for the MDC. The following sections describe the integrated parts of the proposed framework in Fig. 2.
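The nested validation scheme described above can be illustrated with a minimal Python sketch. It assumes scikit-learn, uses synthetic imbalanced data in place of the BDHS data, and the parameter grid, fold counts, and classifier choice are hypothetical; it is not the authors' released code.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic, imbalanced stand-in for the BDHS data (assumption, not the real dataset).
X, y = make_classification(n_samples=400, n_features=10,
                           weights=[0.35, 0.65], random_state=0)

# Hypothetical grid; the grid used in the study is not reproduced here.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, 5]}

outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
outer_aucs = []
for train_idx, test_idx in outer_cv.split(X, y):
    # Inner loop: grid search on the K-1 training folds only.
    inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
    search = GridSearchCV(RandomForestClassifier(random_state=0),
                          param_grid, scoring="roc_auc", cv=inner_cv)
    search.fit(X[train_idx], y[train_idx])
    # Outer loop: the refitted best model is scored on the unseen fold.
    proba = search.predict_proba(X[test_idx])[:, 1]
    outer_aucs.append(roc_auc_score(y[test_idx], proba))

mean_auc = float(np.mean(outer_aucs))
print(f"Mean outer-fold AUC: {mean_auc:.3f}")
```

Stratified splitting is used in both loops so that each fold keeps roughly the same class ratio as the full sample.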

1) MISSING VALUE IMPUTATION

Real-world datasets often include missing values,

encoded as NaNs, blanks, undefined, null, or other

placeholders, for various reasons [44]. There are many

methods for handling missing values,

such as case deletion (Raw), missing data imputation,

and model-based prediction [45]. The last of these,

FIGURE 1. The prevalence rate of measles uptake in different divisions in Bangladesh in the proposed BDHS dataset, with heatmap intensities indicating higher to lower prevalence rates.

model-based prediction, suffers from various complications: it fails for complex and blended patterns and takes a long time to converge [46]. Therefore, this article integrates statistical imputation methods, such as Median and Mode, in the proposed framework in Fig. 2, as they are simple and fast [46]. The steps applied for the Filling Missing Value (FMV) technique are presented in Algorithm 1.

Algorithm 1 The Steps for Achieving the Applied FMV Technique

Input: The n-dimensional uncurated data, $X_{in} \in \mathbb{R}^n$, and outcome, $Y \in [0, 1]$.

Output: The n-dimensional curated data, $X_{out} \in \mathbb{R}^n$.

1: Estimate the class median or mode as $M_{C_i}, \forall i \in [0, 1]$.

2: Impute the missing values as

$$X_{out}(x) = \begin{cases} M_{C=i}, \ \forall i \in [0, 1], & \text{if } x = \text{missed} \\ x, & \text{otherwise,} \end{cases}$$

where $x \in X_{in}$ is an observation of $X_{in}$ and lies in the n-dimensional attribute space.
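As an illustration of Algorithm 1, the following is a minimal sketch assuming pandas and a toy data frame; the function name `fill_missing_by_class` and the column names are hypothetical and not taken from the study's code.

```python
import numpy as np
import pandas as pd

def fill_missing_by_class(X, y, categorical=()):
    """Replace NaNs with the per-class median (or mode for categorical columns)."""
    X_out = X.copy()
    for col in X.columns:
        for label in y.unique():
            in_class = (y == label)
            observed = X.loc[in_class, col].dropna()
            if observed.empty:
                continue  # nothing observed in this class; leave the gap
            if col in categorical:
                fill = observed.mode().iloc[0]   # most frequent value
            else:
                fill = observed.median()
            X_out.loc[in_class & X[col].isna(), col] = fill
    return X_out

# Toy example: "age" is treated as continuous, "edu" as categorical.
X_in = pd.DataFrame({
    "age": [20.0, np.nan, 30.0, 40.0, np.nan, 50.0],
    "edu": [1.0, 1.0, np.nan, 2.0, 2.0, np.nan],
})
y = pd.Series([0, 0, 0, 1, 1, 1])
X_out = fill_missing_by_class(X_in, y, categorical=("edu",))
print(X_out)
```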

2) ATTRIBUTE SELECTION

The accuracy of ML models can increase as attributes are added. However, beyond a certain point, increasing the dimension degrades the results, which is known as the curse of dimensionality. When the dimension of the feature vector grows without an increase in the number of samples, the attribute space becomes sparser, which pushes the ML models to overfit and lose generalization capacity [43]. Additionally, constructing models from datasets with many attributes is more computationally demanding [47]. Therefore, it is essential to incorporate attribute reduction techniques in a classification framework to build a generic ML model. Among supervised, semi-supervised, and unsupervised Attribute Selection (AS) techniques, the supervised methods usually perform better [43], [48]. This article applies four of the most commonly employed supervised AS methods to reduce attribute redundancy, namely Fisher Score (FS) [49], RF [50], LGB [51], and XGB [52], for conducting the ablation studies on our BDHS datasets. These methods are briefly detailed in the following paragraphs.

a: FS ATTRIBUTE SELECTOR

The core intention of the FS is to obtain a subset of attributes such that the distances between data points in different classes are as large as possible, while the distances between data points in the same class are as small as possible [49]. The steps applied for the FS scheme in the AS are presented in Algorithm 2.
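The Fisher score criterion of Algorithm 2 can be sketched in a few lines of NumPy. This is an illustrative sketch on synthetic data; the helper names `fisher_scores` and `select_top_m` are hypothetical, and a small constant is added to the denominator only to avoid division by zero.

```python
import numpy as np

def fisher_scores(X, y):
    """Per-feature ratio of between-class scatter to within-class scatter."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xk = X[y == label]
        n_k = Xk.shape[0]
        between += n_k * (Xk.mean(axis=0) - overall_mean) ** 2
        within += n_k * Xk.var(axis=0)
    return between / (within + 1e-12)

def select_top_m(X, y, m):
    """Keep the m features with the largest Fisher scores."""
    scores = fisher_scores(X, y)
    top = np.argsort(scores)[::-1][:m]
    return X[:, top], top

# Synthetic check: only the middle column carries class information.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
informative = y * 3.0 + rng.normal(size=200)
noise = rng.normal(size=(200, 2))
X = np.column_stack([noise[:, 0], informative, noise[:, 1]])
X_out, top = select_top_m(X, y, m=1)
print(top)
```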

b: RF ATTRIBUTE SELECTOR

RF, a tree-based strategy, is employed for the AS in our framework in Fig. 2, as it directly ranks the attributes by how much they improve node purity, that is, by the decrease in impurity they produce over all trees. Nodes with the most significant reduction in impurity occur near the root of the trees, while nodes with a minor drop in impurity occur near the leaves. Thus, by pruning the trees below a particular node, a subset of the essential attributes can be picked.

TABLE 1. Description of the independent attributes (categorical and continuous) utilized in this research. A χ²-test is used for categorical attributes to describe the significant relationship with the dependent variable measles uptake, whereas the Mean ± std is used to describe continuous variables. The respondent is the mother of the child who is considering vaccine utilization.

FIGURE 2. The complete workflow of the study, where the training dataset is further divided to perform grid search optimization for finding

the best hyperparameters of the ML models.

The applied steps for the RF-based AS are displayed

in Algorithm 3.
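A tree-based ranking of this kind can be sketched with scikit-learn. Note that this sketch uses the impurity-based `feature_importances_` attribute of `RandomForestClassifier` (mean decrease in impurity) rather than the out-of-bag procedure of Algorithm 3, and the data, the number of trees, and the value of m are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (assumption); 8 attributes, 3 of them informative.
X, y = make_classification(n_samples=300, n_features=8, n_informative=3,
                           n_redundant=0, shuffle=False, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

m = 3  # hypothetical number of attributes to keep
ranking = np.argsort(forest.feature_importances_)[::-1]  # most important first
selected = np.sort(ranking[:m])
X_out = X[:, selected]
print("selected attribute indices:", selected)
```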

c: LGB AND XGB ATTRIBUTE SELECTORS

The feature importances obtained from LGB and XGB are likely to be more accurate, as these models are more reliable than linear models [52]. Both models use regularized learning and cache-aware block-structured tree learning for ensemble learning. The gain is the score contributed by each tree split, and the average gain gives the final feature importance score. Finally, the top-$m$ ranked features are selected according to their gain (see Algorithms 7 and 8).

Algorithm 2 The Implementing Steps for the Involved FS Technique

Input: The d-dimensional data, $X_{in} \in \mathbb{R}^{n \times d}$, and outcome, $Y \in [0, 1]$.

Output: The reduced m-dimensional data, $X_{out} \in \mathbb{R}^{n \times m}$, where $m < d$.

1: Estimate the Fisher score ($F$) of the jth feature, $x_j \in \mathbb{R}^{1 \times n}$, of $X_{in}$ as

$$F(x_j) = \frac{\sum_{k=1}^{c} n_k \times (\mu_k^j - \mu^j)^2}{(\sigma^j)^2},$$

where $(\sigma^j)^2 = \sum_{k=1}^{c} n_k \times (\sigma_k^j)^2$ and $C \in [0, 1]$ is the class number. The kth-class mean and standard deviation are $\mu_k^j$ and $\sigma_k^j$, and $\mu^j$ and $\sigma^j$ denote the mean and standard deviation of the whole dataset corresponding to the jth feature.

2: Select the top-m ranked features with large scores ($F$) and store them in $X_{out}$.

Algorithm 3 The Steps for Implementing the Applied RF Technique

Input: The d-dimensional data, $X_{in} \in \mathbb{R}^{n \times d}$, and outcome, $Y \in [0, 1]$.

Output: The reduced m-dimensional data, $X_{out} \in \mathbb{R}^{n \times m}$, where $m < d$.

1: Compute the Out of Bag (OOB) error of a tree.

2: Randomly assign each observation with probability $\hat{P}_k$ to the child nodes if the parent node $k$ is split in $X$, where $\hat{P}_k$ is the relative frequency of observations that initially went in the same direction of the tree.

3: Recompute the OOB error of the tree (following step 2).

4: Compute the difference between the original and recomputed OOB errors.

5: Repeat steps 1–4 for each tree and apply the average deviation over all trees as the overall importance score ($F$).

6: Select the top-m ranked features with large scores ($F$) and store them in $X_{out}$.

3) CLASSIFIERS AND HYPERPARAMETER OPTIMIZATION

Different ML classifiers, such as Gaussian Naive Bayes (GNB), Bernoulli Naive Bayes (BNB), Decision Tree (DT), Random Forest (RF), XGboost (XGB), and Lightgbm (LGB), are trained and evaluated for the measles classification in the proposed framework. The following paragraphs explain the algorithmic steps of these ML classifiers.

a: GNB & BNB CLASSIFIER

The Bayesian methods are supervised learning algorithms based on applying Bayes’ theorem with the assumption of conditional independence between every pair of attributes given the value of the class variable. We employ two variants of this classifier, namely GNB and BNB. The former utilizes a Gaussian function as the likelihood of the attributes, whereas the latter applies multivariate Bernoulli distributions. The steps for implementing these two Bayesian classifiers are presented in Algorithm 4.

Algorithm 4 The Steps of Implementing GNB & BNB Classifiers

Input: The d-dimensional data $X \in \mathbb{R}^{n \times d}$ with $n$ samples, and target $Y \in \mathbb{R}^{n \times 1}$.

Output: The posterior probability $P \in [0, 1]$ of the unseen test set $x$, necessitating $\sum_{i=1}^{C} P_i = 1, \forall i \in C = 2$, where $C$ is the class number.

1: Compute the prior as $P(Y = C_j) = \frac{n_j}{n}, \forall j \in C$, where $n_j$ is the number of samples in the jth class.

2: Estimate the output posterior probability as

$$P(C_j \mid X) = \frac{P(X \mid C_j) \times P(Y = C_j)}{P(X)},$$

where $P(X \mid C_j)$ is the likelihood of the predictor for a given class ($\forall j \in C$).
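The two steps of Algorithm 4 can be sketched for the Gaussian variant as follows. This is a minimal NumPy illustration on synthetic data; the function names are hypothetical, and the computation is done in log space with a small variance floor purely for numerical stability.

```python
import numpy as np

def fit_gnb(X, y):
    """Estimate class priors and per-class Gaussian parameters."""
    classes = np.unique(y)
    priors = np.array([(y == c).mean() for c in classes])          # n_j / n
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    variances = np.array([X[y == c].var(axis=0) + 1e-9 for c in classes])
    return classes, priors, means, variances

def predict_proba_gnb(model, X):
    """Posterior P(C_j | X) = P(X | C_j) P(C_j) / P(X), assuming independent attributes."""
    _, priors, means, variances = model
    diff = X[:, None, :] - means[None, :, :]                        # (n, C, d)
    log_lik = -0.5 * (np.log(2 * np.pi * variances)[None]
                      + diff ** 2 / variances[None]).sum(axis=2)    # log P(X | C_j)
    log_joint = log_lik + np.log(priors)[None, :]
    log_joint -= log_joint.max(axis=1, keepdims=True)               # stabilize exp
    joint = np.exp(log_joint)
    return joint / joint.sum(axis=1, keepdims=True)                 # divide by P(X)

# Two synthetic Gaussian classes (assumption, not the BDHS data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(3.0, 1.0, (100, 2))])
y = np.repeat([0, 1], 100)
model = fit_gnb(X, y)
posterior = predict_proba_gnb(model, X)
accuracy = float((posterior.argmax(axis=1) == y).mean())
print(f"training accuracy: {accuracy:.2f}")
```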

b: RF CLASSIFIER

RF models apply the bagging method to the individual trees in the ensemble, repeatedly drawing a random sample with replacement from the training set and fitting a tree to each sample. The number of trees in the ensemble is a free parameter that can be readily tuned using the out-of-bag error. The algorithmic steps for developing the RF classifier are defined in Algorithm 5.

Algorithm 5 The Steps of Implementing RF Classifier

Input: The d-dimensional data $X \in \mathbb{R}^{n \times d}$ with $n$ samples, and target $Y \in \mathbb{R}^{n \times 1}$.

Output: The posterior probability $P \in [0, 1]$ of the unseen test set $x$, necessitating $\sum_{i=1}^{C} P_i = 1, \forall i \in C = 2$, where $C$ is the class number.

1: for $b = 1$ to $N$ (n_Bagging) do

2: Draw a bootstrap sample, $(X_b, Y_b)$, from the given ($X \in \mathbb{R}^{n \times d}$, $Y \in \mathbb{R}^{n \times 1}$).

3: Grow a random-forest tree $T_b$ using $X_b$ and $Y_b$ by recursively repeating the following steps until the minimum node size $n_{min}$ is reached:

1) Randomly select $m$ variables from the given $d$ variables.

2) Pick the best variable or split-point among the $m$ variables.

3) Split the node into two daughter nodes.

Output the ensemble of trees, $\{T_b\}_1^N$.

4: The posterior probability is $\hat{P}_{RF}(x) = \text{Voting}\{\hat{P}_k(x)\}_1^N$, where $\hat{P}_k(x)$ is the class prediction of the kth random-forest tree.
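The bagging loop of Algorithm 5 can be sketched as below. The sketch assumes scikit-learn's `DecisionTreeClassifier` as the base tree, synthetic data, and averaging of the per-tree class probabilities as the voting rule; the tree count and node size are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

def fit_forest(X, y, n_trees=25, min_node_size=5, seed=0):
    """Fit one randomized tree per bootstrap sample."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    trees = []
    for b in range(n_trees):
        idx = rng.integers(0, n, size=n)          # bootstrap sample with replacement
        tree = DecisionTreeClassifier(
            max_features="sqrt",                  # random subset of variables per split
            min_samples_leaf=min_node_size,
            random_state=b,
        )
        trees.append(tree.fit(X[idx], y[idx]))
    return trees

def forest_proba(trees, X):
    """Soft voting: average the class probabilities over all trees."""
    return np.mean([t.predict_proba(X) for t in trees], axis=0)

# Synthetic stand-in data (assumption).
X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           random_state=0)
trees = fit_forest(X, y)
proba = forest_proba(trees, X)
accuracy = float((proba.argmax(axis=1) == y).mean())
print(f"training accuracy: {accuracy:.2f}")
```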

c: DT CLASSIFIER

DT builds classiﬁcation models in a tree structure, breaking

down a data set into smaller and smaller subsets. The ﬁnal

result is a tree with decision nodes and leaf nodes, where

a decision node has two or more branches, and a leaf node

represents a classiﬁcation or decision. The topmost decision

node in a tree corresponds to the best predictor, called the root

node. Algorithm 6 explains the steps of a DT model.

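Algorithm 6 The Steps of Implementing DT Classifier

Input: The d-dimensional data $X \in \mathbb{R}^{n \times d}$ with $n$ samples, and target $Y \in \mathbb{R}^{n \times 1}$.

Output: The posterior probability $P \in [0, 1]$ of the unseen test set $x$, necessitating $\sum_{i=1}^{C} P_i = 1, \forall i \in C = 2$, where $C$ is the class number.

1: Split the data $Q$ at the mth node, using $\theta = (j, t_m)$ consisting of a feature $j$ and a threshold $t_m$, into the subsets $Q_{left}(\theta)$ and $Q_{right}(\theta)$.

2: Compute the impurity at the mth node using an impurity function ($H$),

$$G(Q, \theta) = \frac{n_{left}}{N_m} H(Q_{left}(\theta)) + \frac{n_{right}}{N_m} H(Q_{right}(\theta)),$$

where $H = \sum_{C} P_{mC} \times (1 - P_{mC})$ or $H = -\sum_{C} P_{mC} \times \log(P_{mC})$, and $P_{mC} = \frac{1}{N_m} \sum_{x_i \in R_m} I(y_i = C)$.

3: Minimise the impurity by selecting the parameters $\theta^* = \operatorname{argmin}_{\theta} G(Q, \theta)$.

4: Repeat the above steps for the subsets $Q_{left}(\theta^*)$ and $Q_{right}(\theta^*)$ until $N_m < \text{min}_{samples}$ or $N_m = 1$.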
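Steps 1 to 3 of Algorithm 6 can be made concrete with a small NumPy sketch that searches one node for the split minimizing the weighted impurity. The function names and the toy data are hypothetical, and the search is exhaustive for clarity rather than efficiency.

```python
import numpy as np

def gini(y):
    """Gini impurity: sum over classes of p * (1 - p)."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(np.sum(p * (1.0 - p)))

def entropy(y):
    """Entropy impurity: minus the sum over classes of p * log(p)."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))

def best_split(X, y, impurity=gini):
    """Return (feature j, threshold t, weighted impurity G) of the best split."""
    n, d = X.shape
    best = (None, None, np.inf)
    for j in range(d):
        # Drop the largest value so the right subset is never empty.
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            g = (left.sum() / n) * impurity(y[left]) \
                + ((~left).sum() / n) * impurity(y[~left])
            if g < best[2]:
                best = (j, float(t), float(g))
    return best

# Toy node: feature 0 separates the classes perfectly at threshold 2.
X = np.array([[1.0, 5.0], [2.0, 1.0], [3.0, 7.0], [4.0, 2.0]])
y = np.array([0, 0, 1, 1])
split = best_split(X, y)
print(split)  # (0, 2.0, 0.0)
```

A full tree would apply `best_split` recursively to the left and right subsets until the stopping condition of step 4 is met.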

d: XGB CLASSIFIER

XGB falls under the category of boosting techniques in ensemble learning, which combine multiple models to improve prediction accuracy. In this boosting technique, the errors made by previous models are corrected by succeeding models, each added to the ensemble with a weight. The steps for the XGB classifier implementation are presented in Algorithm 7.
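The boosting loop outlined in Algorithm 7 can be illustrated with a plain gradient boosting sketch for the logistic loss. This is not the XGBoost library, which additionally uses regularization and second-order information; the sketch assumes scikit-learn regression trees as base learners, replaces the line search for the multiplier with a fixed learning rate, and uses synthetic data. All names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeRegressor

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y, p):
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def fit_boosting(X, y, n_iterations=50, learning_rate=0.1, max_depth=2):
    p0 = np.clip(y.mean(), 1e-6, 1 - 1e-6)
    f0 = np.log(p0 / (1 - p0))             # constant score minimizing the log-loss
    scores = np.full(len(y), f0)
    trees = []
    for _ in range(n_iterations):
        residuals = y - sigmoid(scores)    # pseudo-residuals (negative gradient)
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residuals)
        scores += learning_rate * tree.predict(X)   # additive model update
        trees.append(tree)
    return f0, trees, learning_rate

def predict_proba(model, X):
    f0, trees, lr = model
    scores = f0 + lr * sum(t.predict(X) for t in trees)
    return sigmoid(scores)                 # map the score to a probability

# Synthetic stand-in data (assumption).
X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           random_state=0)
model = fit_boosting(X, y)
baseline = log_loss(y, np.full(len(y), y.mean()))
final = log_loss(y, predict_proba(model, X))
print(f"log-loss: constant model {baseline:.3f}, boosted model {final:.3f}")
```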

e: LGB CLASSIFIER

LGB is also a gradient boosting framework built on decision tree algorithms. It applies two techniques, Gradient-Based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB), and benefits from both leaf-wise and level-wise strategies. These techniques accelerate the training process in LGB [54], [55]. Algorithm 8 describes the steps of implementing the LGB classifier.

f: ENSEMBLE CLASSIFIER

The six different ML models, as described earlier, are employed for the ensemble models, as ensembling can boost the performance of the ML-based classifiers [43], [56] and has been shown to perform well in many applications, such as pneumonia and diabetic retinopathy classification [57], [58]. In ensembling approaches, the aggregation of the outputs from different models can improve the measles vaccine uptake prediction precision. The output from each model $P_j \in \mathbb{R}^{C}, \forall j \in \{1, 2, \ldots, m = 6\}$ ($m$ is the number of classifiers) assigns $C = 2$ confidence values $y_i \in \mathbb{R}$ ($i = 1, 2$) to the unseen test data, where $y_i \in [0, 1]$ and $\sum_{i=1}^{C} y_i = 1$. The weighted

Algorithm 7 The Steps of Implementing XGB Classifier

Input: The d-dimensional data $X \in \mathbb{R}^{n \times d}$ with $n$ samples, and target $Y \in \mathbb{R}^{n \times 1}$

Output: The posterior probability $P \in [0, 1]$ of unseen test set $x$, necessitating $\sum_{i=1}^{C} P_i = 1, \forall i \in C = 2$, where $C$ is the class number

1 Initialize the model with a constant value: $F_0(x) = \arg\min_{\gamma} \sum_{i=1}^{N} L(Y_i, \gamma)$ [53], where $L(Y, F(x))$ is the differentiable loss function and $N$ is the number of samples

2 for $m = 1 \sim M$ (n_Iterations) do

3 Compute pseudo-residuals, $r_{im} = -\left[\frac{\partial L(Y_i, F(X_i))}{\partial F(X_i)}\right]$, where $i = 1, 2, \ldots, N$

4 Fit a base tree, $h_m$, using training set $(X_i, r_{im})$ for $i = 1, 2, \ldots, N$

5 Compute multiplier $\gamma_m$ by $\gamma_m = \arg\min_{\gamma} \sum_{i=1}^{N} L(Y_i, F_{m-1}(X_i) + \gamma h_m(X_i))$

6 Update the model by $F_m(x) = F_{m-1}(x) + \gamma_m h_m(x)$

7 $F_m(x)$ is the desired posterior probability, $P \in [0, 1]$

aggregation of various ML models was conducted employing the equation in (1).

$$P_i^{en} = \frac{\sum_{j=1}^{m=6} (W_j \times P_{ij})}{\sum_{i=1}^{C=2} \sum_{j=1}^{m=6} (W_j \times P_{ij})}, \quad (1)$$

where the weight $W_j$ is the $j$th classifier's AUC. We choose AUC as the weight for the proposed ensemble classifier since a weighted soft voting ensemble needs a class-unbiased metric as its weight. The output of the ensemble model, $Y \in \mathbb{R}^{C}$, has the confidence values $P_i^{en} \in [0, 1]$. The final class label of the unseen data of our BDHS datasets, $X \in \mathbb{R}^{n}$, from the ensemble model will be $C_i$ if $P_i^{en} = \max(Y(X))$.
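The weighted soft voting rule in (1) amounts to a weighted sum of the class probabilities followed by a normalization. The sketch below shows this for one test sample; the function name `weighted_soft_vote` and the probability and weight values are hypothetical and are not taken from Table 4.

```python
def weighted_soft_vote(probs, weights):
    # probs[j][i]: probability of class i from classifier j
    # weights[j]: weight of classifier j (its AUC in the proposed ensemble)
    n_classes = len(probs[0])
    scores = [sum(w * p[i] for w, p in zip(weights, probs)) for i in range(n_classes)]
    total = sum(scores)  # denominator of (1): sum over classes and classifiers
    p_en = [s / total for s in scores]
    label = max(range(n_classes), key=lambda i: p_en[i])  # class with the largest confidence
    return p_en, label

# Two classifiers, two classes (illustrative values only)
probs = [[0.30, 0.70], [0.55, 0.45]]
weights = [0.80, 0.70]
p_en, label = weighted_soft_vote(probs, weights)
```

Because the first classifier carries the larger weight, its preference for class 1 dominates, and the normalized confidences still sum to one.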

g: HYPERPARAMETER OPTIMIZATION

The performance of ML algorithms depends critically on identifying a good set of hyperparameters, as those algorithms are sensitive to many hyperparameters [43], [59], [60]. The grid search [42] is the most basic method, where the user specifies a finite set of values for each hyperparameter, and the grid search evaluates the Cartesian product of these sets [60]. Let us consider the space of problem parameters $P = (p_1, p_2, \ldots, p_m)$ over which we maximize the p-value. A simple way to set up a grid search consists in defining a vector of lower bounds $L = (l_1, l_2, \ldots, l_m)$ and a vector of upper bounds


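Algorithm 8 The Steps of Implementing LGB Classifier

Input: The d-dimensional data $X \in \mathbb{R}^{n \times d}$ with $n$ samples, and target $Y \in \mathbb{R}^{n \times 1}$

Output: The posterior probability $P \in [0, 1]$ of unseen test set $x$, necessitating $\sum_{i=1}^{C} P_i = 1, \forall i \in C = 2$, where $C$ is the class number

1 Combine mutually exclusive features of $X \in \mathbb{R}^{n \times d}$ by the exclusive feature bundling technique and set $\theta_0(x) = \arg\min_{C} \sum_{i=1}^{n} L(Y_i, C)$

2 for $m = 1 \sim M$ (no_Iteration) do

3 Calculate gradient absolute values as $r_i = \left|\frac{\partial L(y_i, \theta(x_i))}{\partial \theta(x_i)}\right|_{\theta(x) = \theta_{m-1}(x)}, \forall i \in n$

4 Resample data set using GOSS process as top_n = $a \times$ len($X$), rand_n = $b \times$ len($X$), sorted = GetSortedIndices(abs($r_i$)), $A$ = sorted[1 : top_n], $B$ = RandomPick(sorted[top_n : len($X$)], rand_n), and $\hat{X} = A + B$, where $a$ and $b$ are the big and slight gradient data sampling ratios, respectively.

5 Estimate information gain as

$V_j(d) = \frac{1}{n} \left( \frac{\left(\sum_{x_i \in A_l} r_i + \frac{1-a}{b} \sum_{x_i \in B_l} r_i\right)^2}{n_l^j(d)} + \frac{\left(\sum_{x_i \in A_r} r_i + \frac{1-a}{b} \sum_{x_i \in B_r} r_i\right)^2}{n_r^j(d)} \right)$

6 Build a new decision tree as $\theta_m(\hat{x})$ on set $\hat{X}$

7 Update $\theta_m(\chi) = \theta_{m-1}(\chi) + \theta_m(\chi)$

8 Finally, obtained $\theta_m(x)$ is the desired posterior probability, $P \in [0, 1]$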

$U = (u_1, u_2, \ldots, u_m)$ for each component of $P$. It involves taking $n$ equally spaced points in each interval of the form $[L_i, U_i], \forall i \in m$, including $L_i$ and $U_i$. This creates a total of $n^m$ possible grid points to check, since the grid is the Cartesian product of the $m$ sets of $n$ points. Finally, once each grid point is evaluated, the maximum of these values is chosen. Table 3 lists the hyperparameters of the six separate ML models that are optimized in this article.
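A grid search over bounded intervals, as described above, can be sketched with the standard library alone. The helper names (`linspace`, `grid_search`) and the toy objective are assumptions for illustration; in the actual framework the objective would be the cross-validated AUC of a classifier, which is not reproduced here.

```python
from itertools import product

def linspace(lo, hi, n):
    # n equally spaced points in [lo, hi], including both ends
    if n == 1:
        return [lo]
    step = (hi - lo) / (n - 1)
    return [lo + k * step for k in range(n)]

def grid_search(score, lower, upper, n):
    # One axis per hyperparameter, then the Cartesian product of all axes
    axes = [linspace(l, u, n) for l, u in zip(lower, upper)]
    grid = list(product(*axes))
    best = max(grid, key=score)  # keep the grid point with the highest score
    return best, len(grid)

def toy_score(p):
    # Hypothetical objective peaking at (2.0, 0.5); stands in for a validation AUC
    return -((p[0] - 2.0) ** 2 + (p[1] - 0.5) ** 2)

best, n_points = grid_search(toy_score, lower=[0.0, 0.0], upper=[4.0, 1.0], n=5)
```

Here five points on each of two axes give 25 grid points, and the search returns (2.0, 0.5). The cost grows quickly with the number of hyperparameters, which is why the candidate sets are usually kept small.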

4) EVALUATION INDICES

The extensive experiments of this article are evaluated utilizing various metrics, such as Sensitivity (Sn), Precision (Pr), Accuracy (Acc), and the ROC curve with AUC value [61], [62]. The former three metrics estimate the true-positive rate, the positive predictive value, and the fraction of correctly classified samples among all the samples, respectively. A ROC curve shows the performance of a classification model at all classification thresholds, whereas the AUC expresses the degree or measure of separability by the classifiers. Since all the experiments are conducted using a k-fold cross-validation technique, the final evaluation metrics are estimated using the equation in (2) [63], [64].

$$\text{Metric} = \frac{1}{K} \times \sum_{n=1}^{K} P_n \pm \sqrt{\frac{\sum_{n=1}^{K} (P_n - \bar{P})^2}{K - 1}}, \quad (2)$$

where $K$ is the number of folds, $P_n \in \mathbb{R}, \forall n \in K$, is the performance metric for each fold, and $\bar{P}$ is the mean of $P_n$ over the folds.
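The fold aggregation in (2) is the mean of the per-fold metric reported together with its sample standard deviation (divisor K - 1). A minimal sketch with the standard library follows; the per-fold AUC values are hypothetical and only serve to show the arithmetic.

```python
from statistics import mean, stdev

def summarize_folds(values):
    # mean over K folds and sample standard deviation with K - 1 in the denominator
    return mean(values), stdev(values)

fold_auc = [0.78, 0.80, 0.79, 0.81, 0.77]  # hypothetical per-fold AUCs
m, s = summarize_folds(fold_auc)
report = f"{m:.3f} ± {s:.3f}"
```

For these five values the summary is 0.790 ± 0.016.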

III. RESULTS AND DISCUSSION

This section presents the extensive experiments of this article with the corresponding results in several subsections. The best missing value imputation and attribute selection methods are analyzed through comprehensive ablation studies in Sections III-A and III-B, respectively. The hyperparameters of different ML models are optimized in Section III-C. In the end, Section III-D describes the obtained results from the individual ML models and the proposed weighted ensemble classifiers with complete ablation studies. The effectiveness of the proposed classifier has also been validated employing a statistical ANOVA test in this section.

A. FILLING MISSING VALUES

To alleviate the missing value obstacle (see Section II-B1), we have applied three strategies, namely Raw (removing those samples), Median (using the median value), and Mode (using the most frequent value), as presented in Table 2. We have applied four different BDHS datasets (see Section II-A) and six separate ML classifiers to produce the ablation studies on various methods of filling missing values (FMV) to choose the best-performing FMV technique for the measles categorization. The experimental results in Table 2 reveal that the Median and Mode techniques outperform the Raw method in most of the cases with a significant margin, while the Raw method beats them in the remaining cases with a low margin. The observation in all the BDHS datasets (as explained in Section II-A) reveals that the share of missing values (13.7 %) is small relative to the total samples. Moreover, only one feature (Antenatal visits (A19)) out of nineteen features contains the missing values. Since the number of missing values and the number of attributes containing missing values are both small, the obtained AUCs from all the classifiers for all the proposed datasets are almost similar for all the FMV strategies, with slightly better values for the Median and Mode methods in most cases (see Table 2).

Again, the visual inspection in Fig. 3 exposes that the populations of the A19 feature for all the BDHS datasets follow the normal distribution, conferring similar values of mode, median, and mean. Such similar median and mode values are responsible for the similar AUCs of the Median and Mode FMV methods for all the datasets and classifiers. Since the Median method outperforms the other two FMV methods (see Table 2), this method is applied in the rest of the experiments of this article.
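The three FMV strategies compared above (Raw, Median, Mode) can be sketched for a single attribute as follows. The function name `impute`, the use of `None` as the missing-value marker, and the toy values for the A19 attribute are assumptions; note that the Raw strategy would drop the whole sample in a real table, not only the one value.

```python
from statistics import median, mode

def impute(values, strategy="median"):
    observed = [v for v in values if v is not None]
    if strategy == "raw":
        return observed  # Raw: discard the samples with a missing value
    fill = median(observed) if strategy == "median" else mode(observed)
    return [fill if v is None else v for v in values]

# Toy antenatal-visit counts with two missing entries (hypothetical)
a19 = [4, None, 2, 4, 3, None, 6, 4, 1]
filled_median = impute(a19, "median")
filled_mode = impute(a19, "mode")
kept_raw = impute(a19, "raw")
```

In this toy column the median and the mode coincide, so both strategies fill the gaps with the same value, which mirrors the situation described for a roughly symmetric attribute.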


TABLE 2. Extensive experimental results in terms of AUC for the missing value imputation, employing three imputation methods, four different BDHS

datasets, and six different classifiers, where the best imputation method for each dataset and classifier is underlined with a blue color.

FIGURE 3. Normal distribution of an A19 attribute of all the BDHS datasets containing the missing values, where (a) for BDHS-2007, (b) for

BDHS-2011, (c) for BDHS-2014, and (d) for BDHS-2020.

B. ATTRIBUTE SELECTION

Attribute selection (AS) methods have been integrated into the recommended framework for finding the smallest subset of features that yields increased performance. However, it is impractical to guess the proper AS method without ablation studies, as those methods' performance often varies with the applications. This article explores four distinct AS methods without attribute transformation (thus conserving the interpretation) and six different classifiers for the measles vaccine uptake classification task to conduct a complete ablation study. Fig. 4 displays the AS results from different experiments.

The AS results from the FS-based method confirm that the LGB classifier achieves its highest AUC of approximately 0.75 utilizing the top 13 ∼ 14 attributes (see Fig. 4(a)). The other classifiers also demonstrate their corresponding highest AUC at that number of features. Again, the RF-based method also shows the highest performance utilizing the top 9 ∼ 11 attributes, with a maximum AUC of 0.74 for the same LGB classifier (see Fig. 4(b)). Another AS method, named LGB-based AS, reaches its maximum AUC of 0.74 at the top 7 ∼ 8 attributes for the LGB classifier (see Fig. 4(c)). Although the FS outperforms the RF- and LGB-based approaches by a margin of 1.0 %, the former technique demands more attributes, approximately double the number needed by the LGB-based scheme. The remaining method, called XGB-based AS, confers the best AUC of roughly 0.76 for the same LGB classifier with the top 3 ∼ 5 attributes (see Fig. 4(d)).

All the results in Fig. 4 demonstrate that the XGB-based AS method outperforms the RF- and LGB-based techniques by a margin of 2.0 % and the FS-based method by a margin of 1.0 %. The FS-based AS method reveals the discriminative power of each feature independently from the others, without indicating anything on the combination of mutual information, leading to poor MDC results. Like the FS-based method, the RF-based approach also leads to low MDC results, as it assigns higher importance to the attributes without considering their correlation. It is noteworthy that the classifiers reach their corresponding highest AUC at the top 3 ∼ 5 attributes when the XGB-based AS approach is employed. It is clear from all the panels in Fig. 4 that almost all the classifiers depict the same patterns with varying attribute numbers, where the classifiers yield the best results for the same attribute numbers. The AS experiments quantitatively support the MDC attribute ranking by the XGB-based AS process, providing an order of A13, A14, A1, A17, A19, A7, A12, A11, A16, A18, A8, A6, A4, A15, A3, A2, A9, A5, and A10 (high to low importance), where the first 3 ∼ 5 attributes yield the best AUCs for the MDC. The obtained attribute ranking is logical, as it gives a higher rank to the features related to the respondent's number of children ever born, age at first birth, current age, birth order, and antenatal visits during the


FIGURE 4. AUC versus the number of features of the proposed BDHS dataset, employing four distinct attribute selection algorithms and six individual

classifiers. The attribute numbers are varied from top 2 ∼19 to explore their characteristics in the proposed BDHS dataset.

TABLE 3. The tuned hyperparameters of six ML models with the highest possible AUC for the MDC.

pregnancy, among others. Since the XGB-based AS scheme gives the best results for the measles classification with fewer attributes, it is used in the rest of the experiments in this article.
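The attribute-number sweep behind Fig. 4 can be sketched as a loop over the top-k prefixes of a ranking. The ranking below is the XGB-based order reported above; the function name `sweep_top_k` and the scoring function `toy_score` are hypothetical placeholders, since the real score would be a cross-validated AUC that is not reproduced here.

```python
def sweep_top_k(ranking, evaluate, k_min=2):
    # Score the classifier on the top-k attributes for every k from k_min to all attributes
    results = {k: evaluate(ranking[:k]) for k in range(k_min, len(ranking) + 1)}
    best_score = max(results.values())
    # Prefer the smallest attribute subset that reaches the best score
    best_k = min(k for k, score in results.items() if score == best_score)
    return best_k, results

ranking = ["A13", "A14", "A1", "A17", "A19", "A7", "A12", "A11", "A16", "A18",
           "A8", "A6", "A4", "A15", "A3", "A2", "A9", "A5", "A10"]

def toy_score(features):
    # Placeholder objective (not a measured AUC): saturates at 4 attributes, then decays
    return min(len(features), 4) / 4 - 0.01 * len(features)

best_k, results = sweep_top_k(ranking, toy_score)
```

With this placeholder objective the sweep covers k = 2 to 19 and selects the top 4 attributes; swapping `toy_score` for a real evaluation routine would give the curve for one classifier and one AS method.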

C. HYPERPARAMETER OPTIMIZATION

The best FMV and AS methods obtained from the two previous experiments are used for the hyperparameter optimization of six different ML models to attain the maximum possible AUCs. Table 3 lists the ML models' hyperparameters with their optimized values, employing a grid search strategy in the proposed framework. The optimized hyperparameter values are picked from the set of predefined values in a grid by a search algorithm that maximizes the AUC for the MDC, as described in Section II-B3.


TABLE 4. The measles classification results employing six separate ML models and proposed weighted ensemble models, incorporating missing value imputation, attribute selection, and hyperparameter optimization. The best metrics obtained from a single ML model are presented in bold fonts, and those metrics from the proposed ensembling models are underlined with a blue color.

D. CLASSIFIERS

The measles classiﬁcation results employing different ML

models, the best performing FMV and AS methods, utilizing

the proposed BDHS datasets, are presented in Table 4.

a: INDIVIDUAL ML CLASSIFIERS

The measles classification from the tree-based classifiers, RF and DT, shows that the RF model outperforms the DT model in three cases out of four with significant margins. Although the DT model is less biased towards the positive class, the performance of the RF model is far better in terms of Acc and AUC. Technically, the RF model reduces the variance component of the error rather than the bias component, as in the DT model. Hence, the DT model deals better with bias, while the RF model has better accuracy. Such concepts are reflected in the measles classification of this article, as the DT model wins in terms of positive predictive value (Pr), and the RF model outperforms in terms of Acc. Furthermore, contrasting the results of the boosting-based classifiers (XGB and LGB), it is perceived that the LGB has higher Sn, Acc, and AUC, while the XGB has a slightly better Pr (see Table 4). Although both the XGB and LGB models are based on the boosting mechanism, the XGB model cannot handle categorical attributes by itself, unlike LGB or CatBoost [65], [66]. Therefore, the LGB is the winner model for the given BDHS dataset, which mainly holds categorical attributes. Comparing all the single ML models, the applied LGB deals better with the measles categorization in the proposed BDHS dataset when the proposed preprocessing and hyperparameter optimization are applied (see the first six rows in Table 4). Such a result demonstrates the superiority of the LGB model for the measles classification concerning accuracy and AUC.

b: ENSEMBLING ML CLASSIFIERS

To further enhance the measles categorization results, we performed an ablation study to build an ensembling classifier,

FIGURE 5. 2D visualization of the proposed BDHS dataset to demonstrate

the inter-class homogeneities using a principal component analysis,

where the x-axis and y-axis respectively denote the first and second

principal components.

as it has been shown earlier that such a classifier provides better results (see details in Section II-B3). Table 4 displays the results for all the proposed weighted ensembling models. Firstly, we aggregate the Bayesian, tree-based, and boosting ML models to build three ensembling models, where the AUC of the individual model acts as the weight of that model for the aggregation. The results of those three models show that the proposed LGB+XGB wins on three metrics out of four, namely Pr, Acc, and AUC, with a high margin (see the 7th ∼ 9th rows of Table 4). Although the results obtained from the GNB+BNB model show 100.0 % Sn, this model unfortunately predicts all the samples as positive, as the positive predictive value (Pr) is the same as the positive class prior probability ($P_{pos}$), i.e., Pr = $P_{pos}$ = 0.749. Such results reveal that the ensemble of Bayesian models is not a suitable choice for classifying a dataset with many inter-class homogeneities (see the class similarity in the BDHS dataset in Fig. 5), as is experimentally confirmed in this article.
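The reasoning that Pr equal to the positive prior signals an all-positive predictor can be checked with a small worked example. The sample size of 1000 and the helper name `confusion_metrics` are assumptions; only the prior of 0.749 comes from the text.

```python
def confusion_metrics(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    sn = tp / (tp + fn)           # sensitivity (true-positive rate)
    pr = tp / (tp + fp)           # precision (positive predictive value)
    acc = (tp + tn) / len(y_true) # accuracy
    return sn, pr, acc

# Hypothetical test set with a positive prior of 0.749
y_true = [1] * 749 + [0] * 251
y_pred = [1] * 1000  # a degenerate classifier that predicts every sample as positive
sn, pr, acc = confusion_metrics(y_true, y_pred)
```

The degenerate predictor reaches Sn = 1.0 while both Pr and Acc collapse to the prior 0.749, which is the pattern reported for the GNB+BNB ensemble.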


FIGURE 6. The ROC curves of two different ensemble models, such as (a) LGB+XGB and (b) GNB+BNB+XGB+LGB, for the measles

classification utilizing the proposed approach.

Secondly, the weighted aggregation of two different types of model mechanisms, namely Bayesian with tree-based, Bayesian with boosting-based, and tree-based with boosting-based, points out that the proposed GNB+BNB+XGB+LGB increases the overall accuracy at the cost of reduced Sn, Pr, and AUC (see the 10th ∼ 12th rows of Table 4). The other two models, GNB+BNB+DT+RF and DT+RF+XGB+LGB, do not gain from this type of ensembling. However, the ROC curves in Fig. 6 help explain the superiority of the LGB+XGB and GNB+BNB+XGB+LGB models.

Although those two ROC curves confer almost similar AUC values, they mainly differ in their accuracy point (see the red cross points in both the ROC curves). The left ROC curve for the LGB+XGB model shows around 88.0 % true-positive rates with 47.0 % false-positive rates at its accuracy point (see the blue dashed line in the left figure). Similarly, the right ROC curve for the GNB+BNB+XGB+LGB model produces around 98.0 % true-positive rates with 70.0 % false-positive rates at its accuracy point (see the blue dashed line in the right figure). Such results confer that to gain 10.0 % in true-positive rates, we must accept roughly 22.0 % more false-positive rates, which is not a better alternative in the medical diagnostic application. Therefore, the LGB+XGB model deals better with both the true- and false-positive rates, providing the highest possible AUC of 80.0 %. Thirdly, the weighted ensembling of the Bayesian-, tree-, and boosting-based models cannot further improve the classification results; instead, it reduces the performance. Again, we explore the two AS techniques, LGB- and XGB-based AS, on all the proposed ensembling models, whose results are visualized in Fig. 7.

The AS results in Fig. 7 again exhibit a similar pattern to that in Section III-B. The varying attribute results on all the proposed models (see Fig. 7) confirm that the XGB-based AS method again outperforms the LGB-based AS process, providing the maximum AUC of 0.80. All the models exhibit a similar pattern with varying attributes, demonstrating better results for the XGB+LGB classifier with the top-5 attributes. The attribute ranking obtained using the XGB-based AS method and the proposed XGB+LGB classifier gives similar logical results, as in Section III-B, giving a higher rank to the respondent's number of children ever born, age at first birth, current age, birth order, and antenatal visits during the pregnancy.

Furthermore, the experimental results from different classification models, utilizing the proposed best preprocessing, have been validated employing a statistical test called ANOVA and 10-fold cross-validation. Fig. 8 shows the Box and Whisker plot of the AUC values of this validation test. For ANOVA testing, $\alpha = 0.05$ is applied as a threshold to reject the null hypothesis (all classifiers' means are equal) if p-value $\leq 0.05$, which indicates significant results. The ANOVA test yields a p-value of $7.93 \times 10^{-38}$ ($\leq 0.05$), so the alternative hypothesis is accepted, strongly indicating that the classifiers' means are not all equal (also displayed in Fig. 8). Again, a post hoc T-test (Bonferroni correction) is incorporated with the ANOVA test for deciding the better classification model in the recommended classification system, which confirms the superiority of the offered weighted ensemble XGB+LGB classifier.
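The test statistic behind a one-way ANOVA, and the per-comparison threshold implied by a Bonferroni correction, can be sketched without external libraries. This is an illustration under assumptions: the toy groups are invented, the post hoc procedure is assumed to compare all classifier pairs, and the p-value itself (which needs the F distribution, available for example through `scipy.stats.f_oneway`) is deliberately not computed here.

```python
from itertools import combinations
from statistics import mean

def one_way_anova_f(groups):
    # F = (between-group mean square) / (within-group mean square)
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean([v for g in groups for v in g])
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def bonferroni_alpha(n_groups, alpha=0.05):
    # Divide alpha by the number of pairwise comparisons (assumed: all pairs are tested)
    n_pairs = len(list(combinations(range(n_groups), 2)))
    return alpha / n_pairs

# Toy per-fold scores for three classifiers (hypothetical)
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(groups)
alpha_13 = bonferroni_alpha(13)  # 13 models, as listed for Fig. 8
```

For the toy groups the statistic is F = 13. With 13 models there are 78 pairwise comparisons, so each post hoc test would be judged against 0.05 / 78 under this assumption.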

c: YEAR-WISE CROSS-FOLD VALIDATION

All the previous results are obtained utilizing a one-year BDHS dataset employing 5-fold cross-validation, whereas we have proposed four-year BDHS datasets ($n = 4$) (see Section II-A). We evaluate the proposed framework, incorporating missing value imputation, the AS method, and the proposed weighted ensembling model, utilizing all the BDHS datasets, where each year's data acts as one fold. In this experiment, the $i$th ($\forall i \in n$) year's dataset is utilized as a test set, the remaining three datasets are used as a training set, and the process iterates $n = 4$ times to test all the data in a year-wise fashion. In this way, we have validated our proposed


FIGURE 7. AUC versus the number of features of the proposed BDHS dataset, employing two distinct AS algorithms and the proposed weighted

ensembling classifiers. The attribute numbers are varied from top 2 ∼19 to explore their characteristics in the proposed BDHS dataset.

FIGURE 8. Box and Whisker plot of the AUC values obtained from 10-fold cross-validation on different ML-based classifiers, where Model-1 to Model-13, respectively, denote GNB, BNB, RF, DT, XGB, LGB, GNB+BNB, RF+DT, XGB+LGB, GNB+BNB+RF+DT, RF+DT+LGB+XGB, GNB+BNB+LGB+XGB, and GNB+BNB+RF+DT+XGB+LGB classifiers.

prediction and showed the generalization capability of our proposed approach. The ROC curve in Fig. 9 represents the results of this experiment. The obtained ROC curve shows that the proposed framework achieves an average AUC of 0.781 with a standard deviation of 0.005. Although the average AUC using all the BDHS datasets is less than that of the individual dataset utilization, the standard deviation (inter-fold variation) is much lower. Such a result reveals that the utilization of more samples increases the model's generality with significantly fewer inter-fold variations.
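The year-wise validation described above is a leave-one-group-out split in which each survey year is one fold. A minimal sketch follows; the function name `leave_one_year_out` and the placeholder rows are assumptions, while the four survey years are those named in the Fig. 3 caption.

```python
def leave_one_year_out(datasets):
    # datasets: mapping from survey year to its list of samples
    for test_year in sorted(datasets):
        train = [row for year, rows in datasets.items()
                 if year != test_year for row in rows]
        yield test_year, train, datasets[test_year]

# Placeholder samples per BDHS survey year (hypothetical contents)
datasets = {2007: ["a", "b"], 2011: ["c"], 2014: ["d", "e"], 2020: ["f"]}
folds = list(leave_one_year_out(datasets))
```

Each of the four iterations trains on three years and tests on the held-out year, so every sample is tested exactly once and never appears in its own training set.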

d: FRAMEWORK SUPERIORITY COMPARED TO OTHER

STUDIES

It is difficult to directly compare the recommended framework with the published frameworks, as we utilized our newly proposed BDHS datasets (see dataset details in Section II-A).

FIGURE 9. The ROC curve of the best-performing ensemble model, named LGB+XGB, for the measles classification utilizing the proposed approach and all the BDHS datasets.

However, it is the first attempt to suggest an AI-based framework for the endeavored task using nationally representative demographic and health survey data from Bangladesh. Additionally, the contributions in this article focus on identifying the contributing factors of the non-utilization of measles vaccination among children in Bangladesh. The authors in [25] utilized Philippine National Demographic and Health Survey data, using 32 relevant attributes comprising geographic location, socioeconomic condition, and features related to children and family information, and obtained an accuracy of 79.02 %. Another article [23] reported 72.0 % precision, using 25 attributes based on the history of the child and their family members. In contrast, our framework achieved an accuracy of 78.70 % and a precision of 84.60 %, using only 3 ∼ 5 attributes, such as the respondent's number of children ever born, age at first birth, current age, birth order number, and antenatal visits during the pregnancy. The above discussions reveal the preponderance


of the recommended AI-based system as it provides better

results with the least number of attributes.

IV. CONCLUSION

This article designs and optimizes a novel ML-based framework for measles vaccine uptake classification and correlates its underlying factors. The whole research has been carried out on the newly proposed BDHS datasets. The recommended framework reveals that a weighted ensemble of ML models successfully enhances the classification results, as it aggregates the weighted output probabilities of the ensemble candidate models. Furthermore, the integration of missing value imputation and attribute selection as preprocessing also improves the aimed outcome. Adopting those preprocessing methods is critical, necessitating a complete ablation study to determine the most suitable methods. Moreover, compared to other studies, our research provides a more accurate model using only 3 ∼ 5 attributes, namely the respondent's number of children ever born, age at first birth, current age, birth order number, and antenatal visits during the pregnancy, which are easily explainable. We hope that this study will help national policymakers to give more importance to these attributes and to ensure ‘‘herd immunity’’ in the community.

CONFLICT OF INTEREST

The authors have no conflicts of interest to disclose regarding this research.

AUTHOR CONTRIBUTIONS

Md. Kamrul Hasan and Md. Abdul Awal conceived of the

presented idea and planned the experiments. Md. Abdul

Awal and Md. Akhtarul Islam conceptualized the original

idea. Md. Kamrul Hasan and Md. Abdul Awal designed

the model and the computational framework, analyzed the

data, and Md. Kamrul Hasan carried out the implementation.

Md. Kamrul Hasan, Md. Tasnim Jawad, and Aishwariya

Dutta carried out the experiments. Md. Kamrul Hasan,

Md. Tasnim Jawad, Aishwariya Dutta, Md. Abdul Awal, and

Md. Akhtarul Islam wrote the manuscript with support from Mehedi Masud and Jehad F. Al-Amri, and Mehedi Masud and Jehad F. Al-Amri edited the manuscript. All authors provided

critical feedback and helped shape the research, analysis,

and manuscript. Md. Kamrul Hasan, Md. Abdul Awal, and

Md. Akhtarul Islam supervised the project.

MATERIAL AVAILABILITY

The data were collected considering all ethical issues, which can be found on the DHS website (https://dhsprogram.com/), and are now published at Harvard Dataverse [41]. This study did not seek separate ethical review approval. The data and source codes that support the findings of this study are available at https://github.com/kamruleee51/measles_vaccine_uptake.

REFERENCES

[1] W. J. Moss, ‘‘Measles,’’ Lancet, vol. 390, no. 10111, pp. 2490–2502, 2017. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0140673617314630

[2] H. Q. McLean, A. P. Fiebelkorn, J. L. Temte, and G. S. Wallace,

‘‘Prevention of measles, rubella, congenital rubella syndrome, and

mumps, 2013: Summary recommendations of the Advisory Committee

on Immunization Practices (ACIP),’’ Morbidity Mortality Weekly Rep.,

Recommendations Rep., vol. 62, no. 4, pp. 1–34, 2013.

[3] R. Fernandez, A. Rammohan, and N. Awofeso, ‘‘Correlates of ﬁrst dose

of measles vaccination delivery and uptake in Indonesia,’’ Asian Paciﬁc J.

Tropical Med., vol. 4, no. 2, pp. 140–145, Feb. 2011.

[4] S. Izadi, S.-M. Zahraie, and M. Sartipi, ‘‘An investigation into a measles

outbreak in southeast Iran,’’ Jpn. J. Infectious Diseases, vol. 65, no. 1,

pp. 45–51, 2012.

[5] A. Mahamud, A. Burton, M. Hassan, J. A. Ahmed, J. B. Wagacha,

P. Spiegel, C. Haskew, R. B. Eidex, S. Shetty, S. Cookson,

C. Navarro-Colorado, and J. L. Goodson, ‘‘Risk factors for measles

mortality among hospitalized Somali refugees displaced by famine,

Kenya, 2011,’’ Clin. Infectious Diseases, vol. 57, no. 8, pp. e160–e166,

Oct. 2013.

[6] N. Sheikh, M. Sultana, N. Ali, R. Akram, R. Mahumud, M. Asaduzzaman,

and A. Sarker, ‘‘Coverage, timelines, and determinants of incomplete

immunization in Bangladesh,’’ Tropical Med. Infectious Disease, vol. 3,

no. 3, p. 72, Jun. 2018.

[7] R. E. Black, S. Cousens, H. L. Johnson, J. E. Lawn, I. Rudan, D. G. Bassani,

P. Jha, H. Campbell, C. F. Walker, R. Cibulskis, T. Eisele, L. Liu, and

C. Mathers, ‘‘Global, regional, and national causes of child mortality in

2008: A systematic analysis,’’ Lancet, vol. 375, no. 9730, pp. 1969–1987,

Jun. 2010.

[8] New Measles Surveillance Data for 2019, World Health Organization,

Geneva, Switzerland, 2019, vol. 24.

[9] A. C. Kantner, S. H. van Wees, E. M. G. Olsson, and S. Ziaei, ‘‘Factors

associated with measles vaccination status in children under the age of

three years in a post-Soviet context: A cross-sectional study using the DHS

VII in Armenia,’’ BMC Public Health, vol. 21, no. 1, pp. 1–10, Dec. 2021.

[10] P. Plans-Rubió, ‘‘Why does measles persist in Europe?’’ Eur. J. Clin.

Microbiol. Infectious Diseases, vol. 36, no. 10, pp. 1899–1906, Oct. 2017.

[11] Y. Hu, Y. Chen, Y. Wang, and H. Liang, ‘‘Evaluation of potentially

achievable vaccination coverage of the second dose of measles containing

vaccine with simultaneous administration and risk factors for missed

opportunities among children in Zhejiang province, East China,’’ Hum.

Vaccines Immunotherapeutics, vol. 14, no. 4, pp. 875–880, Apr. 2018.

[12] P. Plans-Rubió, ‘‘Low percentages of measles vaccination coverage with

two doses of vaccine and low herd immunity levels explain measles

incidence and persistence of measles in the European union in 2017–

2018,’’ Eur. J. Clin. Microbiol. Infectious Diseases, vol. 38, no. 9,

pp. 1719–1729, Sep. 2019.

[13] J. P. Higgins, K. Soares-Weiser, J. A. López-López, A. Kakourou,

K. Chaplin, H. Christensen, N. K. Martin, J. A. Sterne, and A. L. Reingold,

‘‘Association of BCG, DTP, and measles containing vaccines with

childhood mortality: Systematic review,’’ Brit. Med. J., vol. 355, Oct. 2016,

Art. no. i5170.

[14] O. M. de la Santé, ‘‘Measles vaccines: WHO position paper, April 2017. Note de synthèse de l’OMS sur les vaccins contre la rougeole, avril 2017,’’ Weekly Epidemiolog. Record= Relevé épidémiologique hebdomadaire, vol. 92, no. 17, pp. 205–227, 2017.

[15] Bangladesh Demographic and Health Survey 2017–18: Key Indicators,

National Institute of Population Research and Training (NIPORT), Dhaka,

Bangladesh, 2019.

[16] M. D. C. Tauil, A. P. S. Sato, and E. A. Waldman, ‘‘Factors associated with

incomplete or delayed vaccination across countries: A systematic review,’’

Vaccine, vol. 34, no. 24, pp. 2635–2643, May 2016.

[17] S. Bhattacherjee, P. Dasgupta, A. Mukherjee, and S. Dasgupta, ‘‘Vaccine

hesitancy for childhood vaccinations in slum areas of Siliguri, India,’’

Indian J. Public Health, vol. 62, no. 4, p. 253, 2018.

[18] R. Rossi, ‘‘Do maternal living arrangements inﬂuence the vaccination

status of children age 12–23 months? A data analysis of demographic

health surveys 2010–11 from Zimbabwe,’’ PLoS ONE, vol. 10, no. 7,

Jul. 2015, Art. no. e0132357.

[19] S. Walsh, D. R. Thomas, B. W. Mason, and M. R. Evans, ‘‘The impact of

the media on the decision of parents in south Wales to accept measles-

mumps-rubella (MMR) immunization,’’ Epidemiol. Infection, vol. 143,

no. 3, pp. 550–560, Feb. 2015.


[20] S. Engebretsen and J. Bohlin, ‘‘Statistical predictions with glmnet,’’ Clin.

Epigenetics, vol. 11, no. 1, pp. 1–3, Dec. 2019.

[21] W. M. T. W. Ahmad, N. Ghani, and S. M. Drus, ‘‘Handling imbalanced

class problem of measles infection risk prediction model,’’ Int. J. Eng. Adv.

Technol., vol. 9, no. 1, pp. 3431–3435, 2019.

[22] J. Nazari, P.-S. Fathi, N. Sharahi, M. Taheri, P. Amini, and

A. Almasi-Hashiani, ‘‘Evaluating measles incidence rates using machine

learning and time series methods in the center of Iran; 1997–2020,’’

Tech. Rep., 2020.

[23] A. Bell, A. Rich, M. Teng, T. Oreskovic, N. B. Bras, L. Mestrinho,

S. Golubovic, I. Pristas, and L. Zejnilovic, ‘‘Proactive advising: A machine

learning driven approach to vaccine hesitancy,’’ in Proc. IEEE Int. Conf.

Healthcare Informat. (ICHI), Jun. 2019, pp. 1–6.

[24] V. Carrieri, R. Lagravinese, and G. Resce, ‘‘Predicting vaccine hesitancy

from area-level indicators: A machine learning approach,’’ MedRxiv,

Mar. 2021.

[25] O. O. Bucaro, ‘‘Exploring relevant features associated with measles nonvaccination using a machine learning approach,’’ Tech. Rep., 2020. [Online]. Available: https://www.diva-portal.org/smash/get/diva2:1461628/FULLTEXT01.pdf

[26] A. S. Rao, D. A. D’Mello, R. Anand, and S. Nayak, “Clinical significance of measles and its prediction using data mining techniques: A systematic review,” in Advances in Artificial Intelligence and Data Engineering. Singapore: Springer, 2021, pp. 737–759.

[27] A. Susilowati, Y. Wijayanti, and I. M. Sudana, “The influencing risk factors of measles in Bantul regency,” Public Health Perspective J., vol. 4, no. 2, pp. 129–140, 2019.

[28] V. D. Kien, H. Van Minh, K. B. Giang, V. Q. Mai, N. T. Tuan, and M. B. Quam, “Trends in childhood measles vaccination highlight socioeconomic inequalities in Vietnam,” Int. J. Public Health, vol. 62, no. S1, pp. 41–49, Feb. 2017.

[29] C. Hagemann, A. Streng, A. Kraemer, and J. G. Liese, “Heterogeneity in coverage for measles and varicella vaccination in toddlers – Analysis of factors influencing parental acceptance,” BMC Public Health, vol. 17, no. 1, pp. 1–10, Dec. 2017.

[30] A. B. Wilder-Smith and K. Qureshi, “Resurgence of measles in Europe: A systematic review on parental attitudes and beliefs of measles vaccine,” J. Epidemiol. Global Health, vol. 10, no. 1, p. 46, 2019.

[31] D. E. Griffin, “Measles vaccine,” Viral Immunol., vol. 31, no. 2, pp. 86–95, 2018.

[32] R. D. de Vries, A. W. Mesman, T. B. Geijtenbeek, W. P. Duprex, and R. L. de Swart, “The pathogenesis of measles,” Current Opinion Virol., vol. 2, no. 3, pp. 248–255, 2012.

[33] R. Buchanan and D. J. Bonthius, “Measles virus and associated central nervous system sequelae,” Seminars Pediatric Neurol., vol. 19, no. 3, pp. 107–114, Sep. 2012.

[34] J. C. Bester, “Measles and measles vaccination: A review,” JAMA Pediatrics, vol. 170, no. 12, pp. 1209–1215, 2016.

[35] W. J. Moss and D. E. Griffin, “Global measles elimination,” Nature Rev. Microbiol., vol. 4, no. 12, pp. 900–908, Dec. 2006.

[36] L. K. Tannous, G. Barlow, and N. H. Metcalfe, “A short clinical review of vaccination against measles,” JRSM Open, vol. 5, no. 4, 2014, Art. no. 2054270414523408.

[37] R. T. Perry and N. A. Halsey, “The clinical significance of measles: A review,” J. Infectious Diseases, vol. 189, no. 1, pp. S4–S16, May 2004.

[38] Bangladesh Demographic and Health Survey, Mitra and Associates (Firm), M. I. I. for Resource Development Demographic and Health Survey, National Institute of Population Research and Training (NIPORT), Dhaka, Bangladesh, 2011.

[39] Bangladesh Demographic and Health Survey 2014: Key Indicators, National Institute of Population Research and Training (NIPORT), Mitra, Associates, and II, Dhaka, Bangladesh, 2015.

[40] Bangladesh Demographic Health Survey, 2007, National Institute of Population Research and Training (NIPORT), Mitra, Associates, (Firm), and Macro International, Dhaka, Bangladesh, 2009.

[41] M. K. Hasan, J. M. Tasnim, A. Dutta, A. M. Abdul, M. A. Islam, M. Mehedi, and F. Al-Amr Jehad, “Measles,” Harvard Dataverse, V1, Tech. Rep. UNF:6:CG4S8sYltZv8Btm5uCF/aA==[fileUNF], 2021, doi: 10.7910/DVN/S76AZS.

[42] D. Krstajic, L. J. Buturovic, D. E. Leahy, and S. Thomas, “Cross-validation pitfalls when selecting and assessing regression and classification models,” J. Cheminform., vol. 6, no. 1, pp. 1–15, Dec. 2014.

[43] M. K. Hasan, M. A. Alam, D. Das, E. Hossain, and M. Hasan, “Diabetes prediction using ensembling of different machine learning classifiers,” IEEE Access, vol. 8, pp. 76516–76531, 2020.

[44] A. Purwar and S. K. Singh, “Hybrid prediction model with missing value imputation for medical data,” Expert Syst. Appl., vol. 42, no. 13, pp. 5621–5631, Aug. 2015.

[45] P. J. García-Laencina, J.-L. Sancho-Gómez, and A. R. Figueiras-Vidal, “Pattern classification with missing data: A review,” Neural Comput. Appl., vol. 19, no. 2, pp. 263–282, 2010.

[46] T. Aljuaid and S. Sasi, “Proper imputation techniques for missing values in data sets,” in Proc. Int. Conf. Data Sci. Eng. (ICDSE), Aug. 2016, pp. 1–5.

[47] F. Korn, B.-U. Pagel, and C. Faloutsos, “On the ‘dimensionality curse’ and the ‘self-similarity blessing,’” IEEE Trans. Knowl. Data Eng., vol. 13, no. 1, pp. 96–111, Jan./Feb. 2001.

[48] A. Jovic, K. Brkic, and N. Bogunovic, “A review of feature selection methods with applications,” in Proc. 38th Int. Conv. Inf. Commun. Technol., Electron. Microelectron. (MIPRO), May 2015, pp. 1200–1205.

[49] Q. Gu, Z. Li, and J. Han, “Generalized Fisher score for feature selection,” 2012, arXiv:1202.3725. [Online]. Available: http://arxiv.org/abs/1202.3725

[50] B. H. Menze, B. M. Kelm, R. Masuch, U. Himmelreich, P. Bachert, W. Petrich, and F. A. Hamprecht, “A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data,” BMC Bioinf., vol. 10, no. 1, pp. 1–16, 2009.

[51] Y. Ye, C. Liu, N. Zemiti, and C. Yang, “Optimal feature selection for EMG-based finger force estimation using LightGBM model,” in Proc. 28th IEEE Int. Conf. Robot Hum. Interact. Commun. (RO-MAN), Oct. 2019, pp. 1–7.

[52] C. Chen, Q. Zhang, B. Yu, Z. Yu, P. J. Lawrence, Q. Ma, and Y. Zhang, “Improving protein-protein interactions prediction accuracy using XGBoost feature selection and stacked ensemble classifier,” Comput. Biol. Med., vol. 123, Aug. 2020, Art. no. 103899.

[53] T. Chen and C. Guestrin, “XGBoost: A scalable tree boosting system,” in Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Aug. 2016, pp. 785–794.

[54] M. Ustuner and F. Balik Sanli, “Polarimetric target decompositions and light gradient boosting machine for crop classification: A comparative evaluation,” ISPRS Int. J. Geo-Inf., vol. 8, no. 2, p. 97, Feb. 2019.

[55] A. A. Taha and S. J. Malebary, “An intelligent approach to credit card fraud detection using an optimized light gradient boosting machine,” IEEE Access, vol. 8, pp. 25579–25587, 2020.

[56] S.-L. Hsieh, S.-H. Hsieh, P.-H. Cheng, C.-H. Chen, K.-P. Hsu, I.-S. Lee, Z. Wang, and F. Lai, “Design ensemble machine learning model for breast cancer diagnosis,” J. Med. Syst., vol. 36, no. 5, pp. 2841–2847, Oct. 2012.

[57] N. Sikder, M. Masud, A. K. Bairagi, A. S. M. Arif, A.-A. Nahid, and H. A. Alhumyani, “Severity classification of diabetic retinopathy using an ensemble learning algorithm through analyzing retinal images,” Symmetry, vol. 13, no. 4, p. 670, Apr. 2021.

[58] M. Masud, A. K. Bairagi, A.-A. Nahid, N. Sikder, S. Rubaiee, A. Ahmed, and D. Anand, “A pneumonia diagnosis scheme based on hybrid features extracted from chest radiographs using an ensemble learning algorithm,” J. Healthcare Eng., vol. 2021, pp. 1–11, Feb. 2021, doi: 10.1155/2021/8862089.

[59] M. A. Awal, M. Masud, M. S. Hossain, A. A.-M. Bulbul, S. M. H. Mahmud, and A. K. Bairagi, “A novel Bayesian optimization-based machine learning framework for COVID-19 detection from inpatient facility data,” IEEE Access, vol. 9, pp. 10263–10281, 2021.

[60] L. Li, K. Jamieson, G. DeSalvo, A. Rostamizadeh, and A. Talwalkar, “Hyperband: A novel bandit-based approach to hyperparameter optimization,” J. Mach. Learn. Res., vol. 18, no. 1, pp. 6765–6816, 2017.

[61] R. Dai, W. Zhang, W. Tang, E. Wynendaele, Q. Zhu, Y. Bin, B. De Spiegeleer, and J. Xia, “BBPpred: Sequence-based prediction of blood-brain barrier peptides with feature representation learning and logistic regression,” J. Chem. Inf. Model., vol. 61, no. 1, pp. 525–534, Jan. 2021.

[62] N. Cheng, M. Li, L. Zhao, B. Zhang, Y. Yang, C.-H. Zheng, and J. Xia, “Comparison and integration of computational methods for deleterious synonymous mutation prediction,” Briefings Bioinf., vol. 21, no. 3, pp. 970–981, May 2020, doi: 10.1093/bib/bbz047.

[63] M. K. Hasan, T. A. Aleef, and S. Roy, “Automatic mass classification in breast using transfer learning of deep convolutional neural network and support vector machine,” in Proc. IEEE Region Symp. (TENSYMP), Jun. 2020, pp. 110–113.

[64] M. A. Awal, M. S. Hossain, K. Debjit, N. Ahmed, R. D. Nath, G. M. M. Habib, M. S. Khan, M. A. Islam, and M. A. P. Mahmud, “An early detection of asthma using BOMLA detector,” IEEE Access, vol. 9, pp. 58403–58420, 2021.

[65] A. V. Dorogush, V. Ershov, and A. Gulin, “CatBoost: Gradient boosting with categorical features support,” 2018, arXiv:1810.11363. [Online]. Available: http://arxiv.org/abs/1810.11363

[66] G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T.-Y. Liu, “LightGBM: A highly efficient gradient boosting decision tree,” in Proc. Adv. Neural Inf. Process. Syst., vol. 30, 2017, pp. 3146–3154.

MD. KAMRUL HASAN received the B.Sc. and M.Sc. degrees in electrical and electronic engineering (EEE) from Khulna University of Engineering & Technology (KUET), in 2014 and 2017, respectively, and the M.Sc. degree in medical imaging and application (MAIA) from the University of Burgundy, France, the University of Cassino and Southern Lazio, Italy, and the University of Girona, Spain, as an Erasmus Scholar, in 2019. He is currently working as an Assistant Professor with the EEE Department, KUET. His research interests include medical image and data analysis, machine learning, deep convolutional neural networks, medical image reconstruction, augmented reality, and surgical robotics in minimally invasive surgery. He currently supervises several undergraduate students working on the classification, segmentation, and registration of medical images with different modalities. His previous works were published in various journals such as Medical Image Analysis (MIA; Elsevier), Computers in Biology and Medicine (CBM; Elsevier), Artificial Intelligence in Medicine (AIIM; Elsevier), Biomedical Signal Processing and Control (BSPC; Elsevier), and IEEE ACCESS.

MD. TASNIM JAWAD was born in Rangpur, Bangladesh, in 2000. He is currently pursuing the B.Sc. degree in electrical and electronic engineering with Khulna University of Engineering & Technology. He is also taking supplementary courses from online educational providers, such as Coursera and Udemy, in machine learning and deep learning. His current research interests include image classification, audio classification, medical image processing, convolutional neural networks, recurrent neural networks, and generative adversarial networks.

AISHWARIYA DUTTA received the B.Sc. degree in biomedical engineering (BME) from Khulna University of Engineering & Technology (KUET), where she is currently pursuing the master’s degree with the Department of Biomedical Engineering (BME). She has published one conference paper in the 4th International Joint Conference on Advances in Computational Intelligence (IJCACI), in 2020, and also coauthored one international journal article. Her research interests include machine learning and its applications, deep learning, biomedical imaging, biomedical signal processing, and nanotechnology in bioengineering.

MD. ABDUL AWAL received the B.Sc. degree in electronics and communication engineering (ECE) from the ECE Discipline, Khulna University, in 2009, the M.Sc. degree in biomedical engineering from Khulna University of Engineering & Technology, in 2011, and the Ph.D. degree in biomedical engineering from The University of Queensland, Australia, in 2018. He is currently working as an Associate Professor with the ECE Discipline, Khulna University, Khulna, Bangladesh. He is also investigating some projects as the Principal Investigator and a Co-Investigator and supervising several undergraduate and post-graduate students. His research interests include signal processing, especially biomedical signal processing, big data analysis, image processing, time-frequency analysis, machine learning algorithms, deep learning, optimization, and computational intelligence biomedical engineering. He has more than 40 papers published in internationally accredited journals and conferences.

MD. AKHTARUL ISLAM received the B.Sc. and M.S. degrees in statistics, biostatistics & informatics from Dhaka University, Dhaka, Bangladesh, in 2012 and 2013, respectively. He is currently working as an Assistant Professor with the Statistics Discipline, Khulna University, Khulna, Bangladesh. He has authored or coauthored around 12 publications in different peer-reviewed journals. His research interests include biostatistics, epidemiology, public health, infectious disease, meta-analysis, statistical computing, and multivariate analysis.

MEHEDI MASUD (Senior Member, IEEE) received the Ph.D. degree in computer science from the University of Ottawa, Canada. He is currently a Full Professor with the Department of Computer Science, Taif University, Taif, Saudi Arabia. He has authored or coauthored around 50 publications, including refereed IEEE, ACM, Springer, and Elsevier journals, conference papers, books, and book chapters. His research interests include cloud computing, distributed algorithms, data security, data interoperability, formal methods, and cloud and multimedia for healthcare. He has served as a Technical Program Committee Member of different international conferences. He is a recipient of a number of awards, including the Research in Excellence Award from Taif University. He is on the Associate Editorial Board of IEEE ACCESS and International Journal of Knowledge Society Research (IJKSR). He is an Editorial Board Member of Journal of Software. He also served as the Guest Editor of ComSIS journal and Journal of Universal Computer Science (JUCS). He is a member of ACM.

JEHAD F. AL-AMRI received the degree from the Centre for Computing and Social Responsibility, De Montfort University. He is currently an Associate Professor with the Department of Information Technology, Faculty of Computers and Information Technology, Taif University, Saudi Arabia. His research interests include cloud computing security, multimedia security, image encryption, steganography, and medical image processing.

Early Prediction of Diabetes Using an Ensemble of Machine Learning Models

Article

Full-text available

Sep 2022
Int J Environ Res Publ Health

Diabetes is one of the most rapidly spreading diseases in the world, resulting in an array of significant complications, including cardiovascular disease, kidney failure, diabetic retinopathy, and neuropathy, among others, which contribute to an increase in morbidity and mortality rate. If diabetes is diagnosed at an early stage, its severity and underlying risk factors can be significantly reduced. However, there is a shortage of labeled data and the occurrence of outliers or data missingness in clinical datasets that are reliable and effective for diabetes prediction, making it a challenging endeavor. Therefore, we introduce a newly labeled diabetes dataset from a South Asian nation (Bangladesh). In addition, we suggest an automated classification pipeline that includes a weighted ensemble of machine learning (ML) classifiers: Naive Bayes (NB), Random Forest (RF), Decision Tree (DT), XGBoost (XGB), and LightGBM (LGB). Grid search hyperparameter optimization is employed to tune the critical hyperparameters of these ML models. Furthermore, missing value imputation, feature selection, and K-fold cross-validation are included in the framework design. A statistical analysis of variance (ANOVA) test reveals that the performance of diabetes prediction significantly improves when the proposed weighted ensemble (DT + RF + XGB + LGB) is executed with the introduced preprocessing, with the highest accuracy of 0.735 and an area under the ROC curve (AUC) of 0.832. In conjunction with the suggested ensemble model, our statistical imputation and RF-based feature selection techniques produced the best results for early diabetes prediction. Moreover, the presented new dataset will contribute to developing and implementing robust ML models for diabetes prediction utilizing population-level data. Keywords: artificial intelligence; diabetes prediction; ensemble ML classifier; filling missing value; outlier rejection; South Asian diabetes dataset

A survey, review, and future trends of skin lesion segmentation and classification

Article

Full-text available

Feb 2023
COMPUT BIOL MED

The Computer-aided Diagnosis or Detection (CAD) approach for skin lesion analysis is an emerging field of research that has the potential to alleviate the burden and cost of skin cancer screening. Researchers have recently indicated increasing interest in developing such CAD systems, with the intention of providing a user-friendly tool to dermatologists to reduce the challenges encountered or associated with manual inspection. This article aims to provide a comprehensive literature survey and review of a total of 594 publications (356 for skin lesion segmentation and 238 for skin lesion classification) published between 2011 and 2022. These articles are analyzed and summarized in a number of different ways to contribute vital information regarding the methods for the development of CAD systems. These ways include: relevant and essential definitions and theories, input data (dataset utilization, preprocessing, augmentations, and fixing imbalance problems), method configuration (techniques, architectures, module frameworks, and losses), training tactics (hyperparameter settings), and evaluation criteria. We intend to investigate a variety of performance-enhancing approaches, including ensemble and post-processing. We also discuss these dimensions to reveal their current trends based on utilization frequencies. In addition, we highlight the primary difficulties associated with evaluating skin lesion segmentation and classification systems using minimal datasets, as well as the potential solutions to these difficulties. Findings, recommendations, and trends are disclosed to inform future research on developing an automated and robust CAD system for skin lesion analysis.

Data-Driven Models Informed by Spatiotemporal Mobility Patterns for Understanding Infectious Disease Dynamics

Article

Full-text available

Jul 2023
ISPRS

Data-driven approaches predict infectious disease dynamics by considering various factors that influence severity and transmission rates. However, these factors may not fully capture the dynamic nature of disease transmission, limiting prediction accuracy and consistency. Our proposed data-driven approach integrates spatiotemporal human mobility patterns from detailed point-of-interest clustering and population flow data. These patterns inform the creation of mobility-informed risk indices, which serve as auxiliary factors in data-driven models for detecting outbreaks and predicting prevalence trends. We evaluated our approach using real-world COVID-19 outbreaks in Beijing and Guangzhou, China. Incorporating the risk indices, our models successfully identified 87% (95% Confidence Interval: 83–90%) of affected subdistricts in Beijing and Guangzhou. These findings highlight the effectiveness of our approach in identifying high-risk areas for targeted disease containment. Our approach was also tested with COVID-19 prevalence data in the United States, which showed that including the risk indices reduced the mean absolute error and improved the R-squared value for predicting weekly case increases at the county level. It demonstrates applicability for spatiotemporal forecasting of widespread diseases, contributing to routine transmission surveillance. By leveraging comprehensive mobility data, we provide valuable insights to optimize control strategies for emerging infectious diseases and facilitate proactive measures against long-standing diseases.

An Automatic Deep Neural Network Model for Fingerprint Classification

Article

Full-text available

Jan 2023

The accuracy of fingerprint recognition model is extremely important due to its usage in forensic and security fields. Any fingerprint recognition system has particular network architecture whereas many other networks achieve higher accuracy. To solve this problem in a unified model, this paper proposes a model that can automatically specify itself. So, it is called an automatic deep neural network (ADNN). Our algorithm can specify the appropriate architecture ofthe neural network used and some significant parameters of this network. These parameters are the number offilters, epochs, and iterations. It guarantees the highest accuracy by updating itself until achieving 99% accuracy then it stops and outputs the result. Moreover, this paper proposes an end-to-end methodology for recognizing a person’s identity from the input fingerprint image based on a residual convolutional neural network. It is a complete system and is fully automated whether in the features extraction stage or the classification stage. Our goal is to automate this fingerprint recognition system because the more automatic the system is, the more time and effort it saves. Our model also allows users to react by inputting the initial values of these parameters. Then, the model updates itself until it finds the optimal values for the parameters and achieves the best accuracy. Another advantage of our algorithm is that it can recognize people from their thumb and other fingers and its ability to recognize distorted samples. Our algorithm achieved 99.75% accuracy on the public fingerprint dataset (SOCOFing). This is the best accuracy compared with other models.

Gestational Diabetes Prediction in Pregnancy: A Machine Learning and Data Preprocessing Approach

Conference Paper

Dec 2023

Optimized Ensembled Model to Predict Diabetes Using Machine Learning

Chapter

Feb 2024

Hyperparameters Optimization in XGBoost Model for Rainfall Estimation: A Case Study in Pontianak City

Article

Full-text available

Sep 2023

Estimating rainfall accurately is crucial for both the community and various institutions involved in managing water resources and preventing disasters. The XGBoost model has demonstrated its effectiveness in predicting rainfall, but it still requires fine-tuning of hyperparameters to enhance its performance. This study seeks to determine the optimal learning rate for rainfall prediction while keeping the max_depth and n_estimator parameters fixed. The hyperparameter optimization process was carried out using a two-step approach: an initial coarse search using RandomizedSearchCV followed by a more detailed fine-tuning using GridSearchCV. The model's foundation relied on historical rainfall data gathered over three months from the Automated Weather Observed System (AWOS) at the Pontianak Meteorological Station, recorded on an hourly basis. To assess the model's performance, several metrics were employed, including accuracy, precision, recall, F1 score, and ROC-AUC. The model demonstrated promising results, with accuracy, precision, recall, and F1 score all reaching 95%, indicating its ability to effectively predict rainfall. However, the ROC-AUC score was somewhat lower at 62%. After conducting the hyperparameter search, the optimal learning rate determined for the model, utilizing the 2040 dataset, was found to be 0.204.

A SURVEY ON IDENTIFICATION AND DIAGNOSIS OF DISEASES USING MACHINE LEARNING

Article

Jan 2023

The field of artificial intelligence to which machine learning belongs. We use machine learning methods like K-nearest neighbor(KNN), and Linear regression algorithm to detect and diagnose illnesses in this work. The dataset is trained using supervised learning, Reinforcement learning methods in order to construct a logical mathematical model. In the context of learning models, the datasets are employed for purposes such as data analysis and illness diagnosis. The purpose of the Disease Prediction using Machine Learning (ML) system is to make predictions about diseases based on the symptoms reported by patients or other users. The user inputs their symptoms, and the machine returns the likelihood that they have a certain ailment. In machine learning, disease prognosis relies on disease prediction.

Ensemble of Boosting Algorithms for Parkinson Disease Diagnosis

Chapter

Full-text available

Jan 2023

Parkinson’s disease (PD) is a common dynamic neurodegenerative disorder due to the lack of the brain’s chemical dopamine, impairing motor and nonmotor symptoms. The PD patients undergo vocal cord dysfunctions, producing speech impairment, an early and essential PD indicator. The researchers are contributing to building generic data-driven decision-making systems due to the non-availability of the medical test(s) for the early PD diagnosis. This article has provided an automatic decision-making framework for PD detection by proposing a weighted ensemble of machine learning (ML) boosting classifiers: random forest (RF), AdaBoost (AdB), and XGBoost (XGB). The introduced framework has incorporated outlier rejection (OR) and attribute selection (AS) as the recommended preprocessing. The experimental results reveal that the one-class support vector machine-based OR followed by information gain-based AS performs the best preprocessing in the aimed task. Additionally, one of the proposed ensemble models has outputted an average area under the ROC curve (AUC) of 0.972, outperforming the individual RF, AdB, and XGB classifiers with the margins of \(0.5\,\%\), \(3.7\,\%\), and \(1.4\,\%\), respectively, while the advised preprocessing is incorporated. Since the suggested system provides better PD diagnosis results, it can be a practical decision-making tool for clinicians in PD diagnosis.KeywordsParkinson diseaseOutlier rejectionAttribute selectionMachine learning modelsEnsemble classifiers

Optimization of the ADMET Properties for the Anti-Breast Cancer Medicine Based on Agent Model

Article

Jan 2022

Evaluating Measles Incidence Rates Using Machine Learning and Time Series Methods in the Center of Iran, 1997-2020

Article

Full-text available

Apr 2022

Background: Measles is a feverish condition labeled among the most infectious viral illnesses in the globe. Despite the presence of a secure, accessible, affordable and efficient vaccine, measles continues to be a worldwide concern. Methods: This epidemiologic study used machine learning and time series methods to assess factors that placed people at a higher risk of measles. The study contained the measles incidence in Markazi Province, the center of Iran, from Apr 1997 to Feb 2020. In addition to machine learning, zero-inflated negative binomial regression for time series was utilized to assess development of measles over time. Results: The incidence of measles was 14.5% over the recent 24 years and a constant trend of almost zero cases were observed from 2002 to 2020. The order of independent variable importance were recent years, age, vaccination, rhinorrhea, male sex, contact with measles patients, cough, conjunctivitis, ethnic, and fever. Only 7 new cases were forecasted for the next two years. Bagging and random forest were the most accurate classification methods. Conclusion: Even if the numbers of new cases were almost zero during recent years, age and contact were responsible for non-occurrence of measles. October and May are prone to have new cases for 2021 and 2022.

Predicting vaccine hesitancy from area‐level indicators: A machine learning approach

Article

Full-text available

Sep 2021
HEALTH ECON

Vaccine hesitancy (VH) might represent a serious threat to the next COVID‐19 mass immunization campaign. We use machine learning algorithms to predict communities at a high risk of VH relying on area‐level indicators easily available to policymakers. We illustrate our approach on data from child immunization campaigns for seven nonmandatory vaccines carried out in 6062 Italian municipalities in 2016. A battery of machine learning models is compared in terms of area under the receiver operating characteristics curve. We find that the Random Forest algorithm best predicts areas with a high risk of VH improving the unpredictable baseline level by 24% in terms of accuracy. Among the area‐level indicators, the proportion of waste recycling and the employment rate are found to be the most powerful predictors of high VH. This can support policymakers to target area‐level provaccine awareness campaigns.

An Early Detection of Asthma using BOMLA Detector

Article

Full-text available

Apr 2021

Asthma is a chronic and airway-induced disease, causing the incidence of bronchus inflammation, breathlessness, wheezing, is drastically becoming life-threatening. Even in the worst cases, it may destroy the quality to lead. Therefore, early detection of asthma is urgently needed, and machine learning can help identify asthma accurately. In this paper, a novel machine learning framework, namely BOMLA (Bayesian Optimisation-based Machine Learning framework for Asthma) detector has been proposed to detect asthma. Ten classifiers have been utilized in the BOMLA detector, where Support Vector Classifier (SVC), Random Forest (RF), Gradient Boosting Classifier (GBC), eXtreme Gradient Boosting (XGB), and Artificial Neural Network (ANN) are state-of-the-art classifiers. In contrast, Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QLDA), Naive Bayes (NB), Decision Tree (DT), and K-Nearest Neighbor (KNN) are conventional popular classifiers. ADASYN algorithm has also been employed in the BOMLA detector to eradicate the issues created due to the imbalanced dataset. It has even been attempted to delineate how the ADASYN algorithm affects the classification performance. The highest accuracy (ACC) and Matthews’s correlation coefficient (MCC) for an Asthma dataset provide 94.35% and 88.97%, respectively, using BOMLA detector when SVC is adapted, and it has been increased to 96.52% and 93.04%, respectively, when ensemble technique is adapted. The one-way analysis of variance (ANOVA) has also been performed in the 10-fold cross-validation to measure the statistical significance. A decision support system has been built as a potential application of the proposed system to visualize the probable outcome of the patient. Finally, it is expected that the BOMLA detector will help patients in their early diagnosis of asthma.

Severity Classification of Diabetic Retinopathy Using an Ensemble Learning Algorithm through Analyzing Retinal Images

Article

Full-text available

Apr 2021

Diabetic Retinopathy (DR) refers to the damages endured by the retina as an effect of diabetes. DR has become a severe health concern worldwide, as the number of diabetes patients is soaring uncountably. Periodic eye examination allows doctors to detect DR in patients at an early stage to initiate proper treatments. Advancements in artificial intelligence and camera technology have allowed us to automate the diagnosis of DR, which can benefit millions of patients indeed. This paper inscribes a novel method for DR diagnosis based on the gray-level intensity and texture features extracted from fundus images using a decision tree-based ensemble learning technique. This study primarily works with the Asia Pacific Tele-Ophthalmology Society 2019 Blindness Detection (APTOS 2019 BD) dataset. We undertook several steps to curate its contents to make them more suitable for machine learning applications. Our approach incorporates several image processing techniques, two feature extraction techniques, and one feature selection technique, which results in a classification accuracy of 94.20% (margin of error: 0.32%) and an F-measure of 93.51% (margin of error: 0.5%). Several other parameters regarding the proposed method’s performance have been presented to manifest its robustness and reliability. Details on each employed technique have been included to make the provided results reproducible. This method can be a valuable tool for mass retinal screening to detect DR, thus drastically reducing the rate of vision loss attributed to it.

Factors associated with measles vaccination status in children under the age of three years in a post-soviet context: a cross-sectional study using the DHS VII in Armenia

Article

Full-text available

Mar 2021
BMC PUBLIC HEALTH

Background: The resurgence of measles globally and the increasing number of unvaccinated clusters call for studies exploring factors that influence measles vaccination uptake. Armenia is a middle-income post-Soviet country with an officially high vaccination coverage. However, concerns about vaccine safety are common. The purpose of this study was to measure the prevalence of measles vaccination coverage in children under three years of age and to identify factors that are associated with measles vaccination in Armenia by using nationally representative data. Methods: Cross-sectional analysis using self-report data from the most recent Armenian Demographic Health Survey (ADHS VII 2015/16) was conducted. Among 588 eligible women with a last-born child aged 12–35 months, 63 women were excluded due to unknown status of measles vaccination, resulting in 525 women included in the final analyses. We used logistic regression models in order to identify factors associated with vaccination status in the final sample. Complex sample analyses were used to account for the study design. Results: In the studied population 79.6% of the children were vaccinated against measles. After adjusting for potential confounders, regression models showed that the increasing age of the child (AOR 1.07, 95% CI: 1.03–1.12), secondary education of the mothers (AOR 3.38, 95% CI: 1.17–9.76) and attendance at postnatal check-up within two months after birth (AOR 2.71, 95% CI: 1.17–6.30) were significantly associated with the vaccination status of the child. Conclusions: The measles vaccination coverage among the children was lower than the recommended percentage. The study confirmed the importance of maternal education and attending postnatal care visits. However, the study also showed that there might be potential risks for future measles outbreaks because of delayed vaccinations and a large group of children with an unknown vaccination status.

A Pneumonia Diagnosis Scheme Based on Hybrid Features Extracted from Chest Radiographs Using an Ensemble Learning Algorithm

Article

Full-text available

Feb 2021

Pneumonia is a fatal disease responsible for almost one in five child deaths worldwide. Many developing countries have high mortality rates due to pneumonia because of the unavailability of proper and timely diagnostic measures. Machine learning-based diagnosis methods can help to detect the disease early, faster, and at lower cost. In this study, we proposed a novel method to determine the presence of pneumonia and identify its type (bacterial or viral) by analyzing chest radiographs. We performed a three-class classification based on features containing diverse information about the samples. After using an augmentation technique to balance the dataset’s sample sizes, we extracted the chest X-ray images’ statistical features, as well as global features by employing a deep learning architecture. We then combined both sets of features and performed the final classification using the RandomForest classifier. A feature selection method was also incorporated to identify the features with the highest relevance. We tested the proposed method on a widely used (but relabeled) chest radiograph dataset to evaluate its performance. The proposed model can classify the dataset’s samples with an 86.30% classification accuracy and 86.03% F-score, which support the model’s efficacy and reliability. However, results show that the classifier struggles to distinguish between viral and bacterial pneumonia samples. Implementing this method will provide a fast and automatic way to detect pneumonia in a patient and identify its type.

A Novel Bayesian Optimization-Based Machine Learning Framework for COVID-19 Detection from Inpatient Facility Data

Article

Full-text available

Jan 2021

The whole world faces a pandemic situation due to the deadly virus, namely COVID-19. It takes considerable time for the virus to mature enough to be traced, and during this time, it may be transmitted to other people. To overcome this situation, quick identification of COVID-19 patients is required. We have designed and optimized a machine learning-based framework using inpatient facility data that will give a user-friendly, cost-effective, and time-efficient solution to this pandemic. The proposed framework uses Bayesian optimization to optimize the hyperparameters of the classifier and the ADAptive SYNthetic (ADASYN) algorithm to balance the COVID and non-COVID classes of the dataset. Although the proposed technique has been applied to nine state-of-the-art classifiers to show its efficacy, it can be applied to many classifiers and classification problems. It is evident from this study that eXtreme Gradient Boosting (XGB) provides the highest Kappa index of 97.00%. Compared to without ADASYN, our proposed approach yields an improvement in the kappa index of 96.94%. Besides, Bayesian optimization has been compared to grid search and random search to show its efficiency. Furthermore, the most dominant features have been identified using SHapley Additive exPlanations (SHAP) analysis. A comparison has also been made with other related works. The proposed method is capable of tracing COVID patients in less time than the conventional techniques. Finally, two potential applications, namely, a clinically operable decision tree and a decision support system, have been demonstrated to support clinical staff and build a recommender system.
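
The Kappa index quoted in this abstract is usually understood as Cohen's kappa, which corrects observed agreement for the agreement expected by chance. The sketch below shows that calculation for a binary confusion matrix. The function name and the example counts are illustrative assumptions, and the cited paper may compute or scale the index differently.

```python
def cohen_kappa(tp: int, tn: int, fp: int, fn: int) -> float:
    """Cohen's kappa for a binary confusion matrix.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from the marginals.
    Kappa is undefined when p_e == 1; this sketch returns 0.0 in that
    case (an assumption, not a standard).
    """
    n = tp + tn + fp + fn
    p_observed = (tp + tn) / n
    # Chance agreement: product of the marginals for each class, summed.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if p_expected == 1.0:
        return 0.0
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the calculation.
    print(f"kappa = {cohen_kappa(45, 40, 10, 5):.4f}")
```

Because kappa discounts chance agreement, it is a stricter measure than raw accuracy on imbalanced data, which is consistent with the abstract pairing it with a class-balancing step.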

Clinical Significance of Measles and Its Prediction Using Data Mining Techniques: A Systematic Review

Chapter

Full-text available

Jan 2021

BBPpred: Sequence-Based Prediction of Blood-Brain Barrier Peptides with Feature Representation Learning and Logistic Regression

Article

Jan 2021

Blood-brain barrier peptides (BBPs) have a large range of biomedical applications since they can cross the blood-brain barrier based on different mechanisms. As experimental methods for the identification of BBPs are laborious and expensive, computational approaches need to be developed for predicting BBPs. In this work, we describe a computational method, BBPpred (blood-brain barrier peptides prediction), that can efficiently identify BBPs using logistic regression. We investigate a wide variety of features from amino acid sequence information, and then a feature learning method is adopted to represent the informative features. To improve the prediction performance, seven informative features are selected for classification by eliminating redundant and irrelevant features. In addition, we specifically create two benchmark data sets (training and independent test), which contain a total of 119 BBPs from public databases and the literature. On the training data set, BBPpred shows promising performance with an AUC score of 0.8764 and an AUPR score of 0.8757 using 10-fold cross-validation. We also test our new method on the independent test data set and obtain a favorable performance. We envision that BBPpred will be a useful tool for identifying, annotating, and characterizing BBPs. BBPpred is freely available at http://BBPpred.xialab.info.

Improving protein-protein interactions prediction accuracy using XGBoost feature selection and stacked ensemble classifier

Article

Jul 2020
COMPUT BIOL MED

Protein-protein interactions (PPIs) are involved with most cellular activities at the proteomic level, making the study of PPIs necessary for comprehending any biological process. Machine learning approaches have been explored, leading to more accurate and generalized PPI predictions. In this paper, we propose a predictive framework called StackPPI. First, we use pseudo amino acid composition, Moreau-Broto, Moran and Geary autocorrelation descriptor, amino acid composition position-specific scoring matrix, Bi-gram position-specific scoring matrix and composition, transition and distribution to encode biologically relevant features. Secondly, we employ XGBoost to reduce feature noise and perform dimensionality reduction through gradient boosting and average gain. Finally, the resulting optimized features are analyzed by StackPPI, a PPI predictor we have developed from a stacked ensemble classifier consisting of random forest, extremely randomized trees and logistic regression algorithms. Five-fold cross-validation shows StackPPI can successfully predict PPIs with an ACC of 89.27%, MCC of 0.7859, AUC of 0.9561 on Helicobacter pylori, and with an ACC of 94.64%, MCC of 0.8934, AUC of 0.9810 on Saccharomyces cerevisiae. We find StackPPI improves protein interaction prediction accuracy on independent test sets compared to the state-of-the-art models. Finally, we highlight StackPPI's ability to infer biologically significant PPI networks. StackPPI's accurate prediction of functional pathways makes it the logical choice for studying the underlying mechanism of PPIs, especially as it applies to drug design. The datasets and source code used to create StackPPI are available here: https://github.com/QUST-AIBBDRC/StackPPI/.
