Classification and Categorization of COVID-19 Outbreak in Pakistan

Tech Science Press
Computers, Materials & Continua
DOI:10.32604/cmc.2021.015655
Amber Ayoub1, Kainaat Mahboob1, Abdul Rehman Javed2, Muhammad Rizwan1,
Thippa Reddy Gadekallu2, Mustufa Haider Abidi3,* and Mohammed Alkahtani4,5
1Department of Computer Science, Kinnaird College for Women, Lahore, 54000, Pakistan
2Department of Cyber Security, Air University, Islamabad, Pakistan
3School of Information Technology and Engineering, Vellore Institute of Technology, Tamil Nadu, India
4Raytheon Chair for Systems Engineering, Advanced Manufacturing Institute, King Saud University,
Riyadh, 11421, Saudi Arabia
5Industrial Engineering Department, College of Engineering, King Saud University, Riyadh, 11421, Saudi Arabia
*Corresponding Author: Mustufa Haider Abidi. Email: mabidi@ksu.edu.sa
Received: 01 December 2020; Accepted: 05 February 2021
Abstract: Coronavirus is a potentially fatal disease that normally occurs in
mammals and birds. Generally, in humans, the virus spreads through airborne
droplets of any type of fluid secreted from the body of an infected person.
Coronavirus is a family of viruses that is more lethal than other unpremeditated
viruses. In December 2019, a new variant, i.e., a novel coronavirus
(COVID-19), emerged in Wuhan, China. Since January 23, 2020,
the number of infected individuals has increased rapidly, affecting the health
and economies of many countries, including Pakistan. The objective of this
research is to provide a system to classify and categorize the COVID-19
outbreak in Pakistan based on the data collected every day from different
regions of Pakistan. This research also compares the performance of machine
learning classifiers (i.e., Decision Tree (DT), Naive Bayes (NB), Support Vector
Machine (SVM), and Logistic Regression (LR)) on the COVID-19 dataset collected in
Pakistan. According to the experimental results, the DT and NB classifiers
outperformed the other classifiers. In addition, the classified data is categorized by
implementing a Bayesian Regularization Artificial Neural Network (BRANN)
classifier. The results demonstrate that the BRANN classifier outperforms
state-of-the-art classifiers.
Keywords: COVID-19; pandemic; neural network; BRANN; machine
learning
CMC, 2021, vol.69, no.1
This work is licensed under a Creative Commons Attribution 4.0 International License,
which permits unrestricted use, distribution, and reproduction in any medium, provided
the original work is properly cited.
1 Introduction
The COVID-19 outbreak that appeared in Wuhan, China at the end of December 2019 was
initially considered a pneumonia of unknown etiology. The virus soon spread worldwide at a
rapid rate [1]. On January 30, 2020, the World Health Organization (WHO) declared the COVID-19
outbreak a Public Health Emergency of International Concern [2,3]. This virus has affected
people in more than 209 nations around the world. The costs of the coronavirus outbreak
are continually increasing. When this virus first started to spread, there were approximately 600
confirmed cases in China. Globally, the number of people who have died because of this virus has
been increasing daily [4]. The WHO determined that the most common symptoms of this virus
are tiredness, fever, and dry cough [5]. Most people with these symptoms can recover without
extraordinary treatment or prescriptions. However, some patients have additional symptoms,
such as a runny nose, sore throat, nasal congestion, and general or severe pain. Typically, about
80% of people who become infected have mild symptoms [6]. In the United Kingdom, the National
Health Service (NHS) has reported cases with more severe symptoms, including high fever and
persistent cough. The NHS recommends that anybody with these sorts of symptoms should
self-quarantine for 7 to 14 days [7]. The infection spreads between individuals in close contact
through respiratory droplets that are emitted when an infected person coughs, sneezes, shouts,
sings, or talks.
For the most part, the droplets do not travel significant distances. Typically, they fall to
the ground or onto nearby surfaces. Transmission may also occur through smaller droplets that
can remain suspended in the air for longer periods of time [8]. People may become infected by
touching a contaminated surface and then touching their face [9]. The virus can spread rapidly
even before symptoms are noticeable, and it can be passed on by carriers who never develop
symptoms [10]. Fig. 1 represents the worldwide spread of this coronavirus. It is believed that
the virus did not spread in Pakistan the way it spread in other countries, like China, the USA,
and Italy. Pakistan, with permeable borders, is sandwiched between two focal points of this
coronavirus (China and Iran).
Figure 1: Worldwide spread of COVID-19
Recently, Pakistan has reinforced its precautions against COVID-19 through various strategies,
such as national emergency preparedness planning, compulsory thermal screening at all entry
points, monitoring of regional spread, contact tracing, and information collection through
various sources. Testing has been reinforced by bringing in Polymerase Chain Reaction units
for SARS-CoV-2 diagnostics [11]. Resources have been deployed to set up quarantine centers in
preparation for expected cases. Locations for these stations include a few urban areas and
emergency clinics, and surveillance units have been activated to track the contacts of confirmed
cases, as suggested by the WHO [10]. The COVID-19 infection has spread to more than 213 nations,
and as of April 17, 2020, there were 1,995,983 confirmed cases and 131,037 deaths [12].
Pakistan reported its first two positive cases on February 26, 2020. These cases were
connected to travel to Iran [13]. The number of positive cases across the nation rose to 7,025 on
April 17, 2020: 3,276 positive cases and 135 deaths in Punjab, 2,008 cases in Sindh, 993 cases
in Khyber Pakhtunkhwa, 303 cases in Balochistan, 237 cases in Gilgit-Baltistan, 154 cases in
Islamabad Capital Territory (ICT), and 46 cases in Azad Jammu and Kashmir [14].
The number of positive cases is rising rapidly every day. In fact, in most countries, the number
of cases is probably much higher than recorded, due to limited testing [14,15]. Fig. 2 shows the
number of total coronavirus cases in Pakistan. The exponential increase in cases has driven the
Government to impose strict, complete lockdowns in numerous urban areas [16].
Figure 2: COVID-19 cases in Pakistan
Fig. 3 shows the total number of COVID-19 cases, the total number of deaths, and the total
number of recovered cases in different regions of Pakistan.
Figure 3: Total recovered cases, deaths, and confirmed cases in Pakistan
1.1 Problem Statement
Deaths due to COVID-19 are increasing day by day in Pakistan. The nature of the COVID-19
outbreak differs across countries. For example, in China, Iran, and France, the COVID-19
outbreak is characterized by extremely high case numbers and severe cases. The severity of an
outbreak can be detected through an increase in the number of deaths. Thus, in this research,
the nature of the outbreak is detected with the help of the COVID-19 dataset collected by the
Government over the past few months in Pakistan. If the nature of the COVID-19 outbreak can be
detected from the past months’ death rate, then with the help of standard operating procedures
and precautionary measures, the death rate can be reduced in the coming months in Pakistan. For
outbreak detection, the COVID-19 dataset is first classified with machine learning (ML) classifiers.
Then the classified dataset is categorized into severe and normal COVID-19 outbreaks, using the
Bayesian regularized artificial neural network (BRANN) classifier.
1.2 Motivation and Contribution
The COVID-19 death rate is high and is increasing day by day globally [17]. This research
is intended to classify and categorize the nature of the outbreak in Pakistan using machine
learning classifiers. In this study, a dataset of COVID-19 patients from different regions (primarily
populated regions) of Pakistan is preprocessed and then classified to understand the nature of
the virus and its outbreak in Pakistan. Four machine learning classifiers, namely Decision Tree
(DT), Naive Bayes (NB), Support Vector Machine (SVM), and Logistic Regression (LR), are implemented,
and their results are compared based on performance measures (i.e., accuracy, precision, and recall).
The comparison of machine learning classifiers indicates that the DT and NB classifiers return
100% accuracy. The classified data is input to the BRANN to categorize the COVID-19 outbreak in
Pakistan, i.e., to determine whether the nature of the outbreak is normal or severe.
The remainder of this paper is organized as follows. Section 2 discusses the related work.
Section 3 provides the proposed methodology to classify and categorize COVID patients. Section 4
provides the experimental analysis and results. Conclusions and suggestions for future work are
presented in Section 5.
2 Literature Review
The COVID-19 virus was initially discovered in December 2019 in the population of Wuhan,
China. Later, it spread to other regions of China and other parts of the world [18]. Various
papers and studies have applied different techniques to COVID-19 datasets. In this section, several
studies that investigate the application of machine learning algorithms to different diseases are
discussed.
SVM and Mutual Information techniques have been applied to classify genes [19]. In that
study, the authors claimed that the SVM classifier achieved the best mean accuracy rate. In
addition, the fuzzy KNN approach has been used on a Parkinson’s dataset to help build a
diagnostic system that supports better clinical diagnostic decisions [20]. Other researchers utilized
different machine learning techniques to propose a novel method. They computed significant
features by implementing machine learning techniques to improve the accuracy rate of predicting
cardiovascular disease. Their prediction model gives 88.7% accuracy [21]. In 2015, a combination
of SVM and fuzzy logic was applied for the risk classification of diabetes. Fuzzy reasoning was
used to predict the risk factors of Type-II diabetes, and an SVM was used to generate fuzzy
rules from the Pima diabetes dataset [22].
Other researchers used the NB classifier to improve the accuracy of predicting heart
disease [23]. Different machine learning techniques, such as Artificial Neural Network (ANN),
random forest (RF), and K-means clustering techniques, were implemented to predict diabetes.
The ANN technique provided the best accuracy rate (75.7%) in the prediction of diabetes [24].
Some researchers also implemented machine learning techniques to predict hypertension outcomes
based on medical data. In that study, the researchers evaluated four classifiers, i.e., SVM, DT, RF,
and XGBoost, to meet the desired accuracy level of the prediction system. XGBoost produced
the best results among the four classifiers and provided a system accuracy of 94.36% [25,26].
Other researchers used histopathological data from patients who had a lung lobectomy to treat
adenocarcinoma. In both “accidental” cases, apart from the tumors, the lungs showed edema
and prominent proteinaceous exudates as large protein globules [27]. The researchers
documented vascular congestion together with inflammatory clusters of fibrinoid material,
multinucleated giant cells, and pneumocyte hyperplasia. In addition, some researchers used the
adaptive neuro-fuzzy inference system (ANFIS) model to estimate landslide susceptibility and to
develop a model to predict landslides. The ANFIS model was used to train and validate the
dataset [28]. Different ML classifiers have been used to develop predictive
models [29,30]. In 2017, researchers proposed an SVM and fuzzy logic-based system to
automatically block pornographic content on the web. SVMs have also been used in statistical
learning approaches to classify hypothesis test data and compute the error rate using the
Gaussian-density function [31,32].
3 Proposed Methodology
Machine learning classifiers (DT, NB, LR, and SVM) are used to classify and categorize the
COVID-19 outbreak in different regions of Pakistan. The proposed system is shown in Fig. 4.
Figure 4: Proposed system for COVID-19 data classification and prediction
3.1 Dataset
The “Corona-Virus Pakistan Dataset 2020” was downloaded from Kaggle [33]. The dataset
contains 13 features that represent the lab tests and the suspected, confirmed, and fatal COVID-19
cases per day in the most populated regions of Pakistan (Tab. 1). The dataset features are listed in
Tab. 2. The dataset has 315 rows and 13 columns, i.e., 11089 data items. The dataset was checked
for null and missing values of categorical features; none were found. The data distribution of
categorical features, such as Date and Province, is shown in Fig. 5.
Table 1: Selected regions of Pakistan in dataset
Sr. No. Regions
1. AJK
2. Balochistan
3. GB
4. ICT
5. KP
6. Punjab
7. Sindh
Table 2: Features of COVID-19 dataset
Sr. No. Features
1. Date
2. Province old
3. Suspected cases last date
4. Suspected cases last 24 h
5. Suspected cases cumulative
6. Lab tests last 24 h
7. Lab tests cumulative
8. Confirmed cases last date
9. Confirmed cases last 24 h
10. Confirmed cases cumulative
11. Deaths last date
12. Deaths last 24 h
13. Deaths cumulative
Figure 5: Data distribution of categorical features
3.2 Dataset Preprocessing
Preprocessing is necessary to avoid misclassified results and errors [34,35]. Data preprocessing
involved data preparation, data exploration, data distribution, and replacing categorical features.
Preprocessing resulted in a clean dataset suitable for classification. This preprocessed dataset is fed
to the machine learning classifiers to produce classified results [36].
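The preprocessing steps are described only at a high level, so the following is a minimal sketch of what the missing-value check and the replacement of categorical features might look like in Python with pandas. The column names, the toy rows, and the choice of integer codes for the categorical columns are assumptions made for illustration, not the authors' pipeline. Since Section 3.1 reports 315 rows and 13 columns but 11089 data items (315 × 13 = 4095), printing the shape of the loaded frame is a worthwhile first check.

```python
import pandas as pd

# Toy frame with made-up values; column names loosely follow Tab. 2 (assumed).
raw = pd.DataFrame({
    "Date": ["2020-03-10", "2020-03-10", "2020-03-11", "2020-03-11"],
    "Province": ["Punjab", "Sindh", "Punjab", "Sindh"],
    "Deaths Cumulative": [0, 1, 1, 2],
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Check for missing values and replace categorical columns with numbers."""
    out = df.copy()
    if out.isnull().values.any():
        raise ValueError("dataset contains null or missing values")
    # Replace the date string with the number of days since the first record.
    out["Date"] = pd.to_datetime(out["Date"])
    out["Day"] = (out["Date"] - out["Date"].min()).dt.days
    # Replace province names with integer codes (alphabetical order).
    out["Province"] = out["Province"].astype("category").cat.codes
    return out.drop(columns="Date")


clean = preprocess(raw)
print(clean.shape)
print(clean)
```

Integer codes impose an arbitrary order on the provinces; one-hot encoding is a common alternative when a classifier should not treat the codes as magnitudes.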
4 Experimental Analysis and Results
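For the classification of the dataset, Google Colab was used for Python coding, and dataset
categorization was implemented in MATLAB. The dataset was split into training (70%) and
testing (30%) sets. The metrics used in this work are as follows.
$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \tag{1}$$
$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$
$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$
Here, TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives,
and false negatives, respectively.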
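As a quick illustration of formulas (1) to (3), the sketch below computes the three metrics from the four confusion-matrix counts. The function name and the example counts are hypothetical and are not taken from the paper's confusion matrices.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        # Guard against division by zero when a class is never predicted/present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical counts for a 100-sample test set.
metrics = classification_metrics(tp=45, fp=5, tn=40, fn=10)
print(metrics)
```

With these counts the accuracy is 85/100, the precision is 45/50, and the recall is 45/55.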
4.1 Decision Tree Classifier
The COVID-19 dataset was classified using the DT ID3 classifier. The results are shown in
Tab. 3. As can be seen, this classifier achieved 100% accuracy, precision, and recall. The confusion
matrix for the DT classifier is plotted in Fig. 6a.
Table 3: Results achieved for decision tree classifier
Sr. No. Measures Result
1. Accuracy 1.0 => 100%
2. Precision 1.0 => 100%
3. Recall 1.0 => 100%
4.2 Naive Bayes Classifier
The NB classifier is implemented on the COVID-19 dataset because it is a continuous dataset.
The NB classifier also achieved 100% accuracy (Tab. 4). The confusion matrix for this classifier is
shown in Fig. 6b.
4.3 Logistic Regression Classifier
The LR classifier has been used successfully to predict various diseases [37,38]. Predictions
are made for the first 25 entries of the testing data. The histogram of the predicted
probabilities is shown in Fig. 7. Figs. 8a and 8b depict the confusion matrices for the LR and
SVM classifiers, respectively. The Receiver Operating Characteristic (ROC) plot for the COVID-19
dataset, based on true positive rate and false positive rate, is shown in Fig. 9a. The area under
the LR ROC curve is 91%. The results obtained for LR are listed in Tab. 5.
Figure 6: Confusion matrices for (a) the decision tree classifier and (b) the Naive Bayesian
classifier
Table 4: Results achieved for Naive Bayesian classifier
Sr. No. Measures Result
1. Accuracy 1.0 => 100%
2. Precision 1.0 => 100%
3. Recall 1.0 => 100%
Figure 7: Histogram of predicted probabilities
4.4 Support Vector Machine Classifier
The linear SVM classifier achieved a precision of 98%. The ROC curve for the multiclass SVM is
depicted in Fig. 9b. The area under the ROC curve is 100% for class 1 and 88% for class 2.
Tab. 6 lists the SVM results computed using formulas (1)–(3).
Figure 8: Confusion matrices for (a) LR and (b) SVM classifiers
Figure 9: ROC curves for (a) LR and (b) SVM classifiers
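The ROC areas quoted for LR and SVM can be understood as the area under the curve traced by the (false positive rate, true positive rate) pairs as the decision threshold is lowered. The sketch below computes that area with the trapezoidal rule in plain Python; the labels and scores are made up for illustration and the function names are hypothetical.

```python
def roc_points(y_true, scores):
    """Sweep the threshold from high to low and collect (FPR, TPR) points."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / negatives)
        tpr.append(tp / positives)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
               for k in range(1, len(fpr)))


# Made-up labels and classifier scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(labels, scores)
print(auc(fpr, tpr))  # 0.75
```

An area of 1.0 means every positive sample is scored above every negative one, while 0.5 corresponds to random ranking.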
Table 5: Results achieved for logistic regression classifier
Sr. No. Measures Result (%)
1. Accuracy 91
2. Precision 86
3. Recall 94
DT and NB classifiers yielded 100% accuracy for this dataset. Tab. 7 shows the results for
the DT, NB, LR, and SVM classifiers. The classified dataset is input to an ANN (Section 4.5) for
data categorization.
Table 6: Results achieved for SVM classifier
Sr. No. Measures Result (%)
1. Accuracy 97
2. Precision 98
3. Recall 96
Table 7: Comparison of classification results
Sr. No. Classifier Accuracy (%)
1. Decision tree 100
2. Naive Bayesian 100
3. Logistic regression 91
4. Support vector machine 97
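A comparison of this kind can be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the authors' code: the data is synthetic (generated with `make_classification`), so the scores will not match Tab. 7, and scikit-learn's decision tree is a CART implementation, so `criterion="entropy"` only approximates the information-gain splitting of ID3.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed dataset (assumption).
X, y = make_classification(n_samples=300, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)

# 70% training / 30% testing split, as described in Section 4.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0, stratify=y)

models = {
    "DT": DecisionTreeClassifier(criterion="entropy", random_state=0),
    "NB": GaussianNB(),
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(kernel="linear"),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predicted = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, predicted),
        "precision": precision_score(y_test, predicted),
        "recall": recall_score(y_test, predicted),
    }

for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

Fixing `random_state` keeps the split and the tree reproducible, which makes a side-by-side comparison of the four classifiers easier to repeat.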
4.5 Artificial Neural Network
The Artificial Neural Network classifier is trained with Bayesian regularization to categorize
the search space into two classes: normal outbreak and severe outbreak. This classifier is
used to categorize the nature of the COVID-19 outbreak in Pakistan based on data collected from
various regions. Fig. 10 shows the COVID-19 dataset simulation architecture.
Figure 10: COVID-19 dataset simulation architecture
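The paper does not state the objective function it minimizes. For orientation, Bayesian regularization of a neural network is commonly formulated as minimizing a weighted combination of the squared errors and the squared weights; the symbols below are introduced here for illustration and are not taken from the paper:

$$F(\mathbf{w}) = \beta E_D + \alpha E_W, \qquad E_D = \sum_{i=1}^{N} (t_i - a_i)^2, \qquad E_W = \sum_{j=1}^{M} w_j^2$$

Here $t_i$ and $a_i$ are the target and the network output for sample $i$, $w_j$ are the $M$ network weights, and the hyperparameters $\alpha$ and $\beta$ are re-estimated during training instead of being fixed by hand. A larger ratio $\alpha / \beta$ favors smaller weights and therefore a smoother network response.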
Algorithm 1: Algorithm for Classification
1. Provide the Input Parameters
2. Data Preprocessing
3. Checking of Conditional Probability
4. While (error-rate > threshold-value)
5.     Training and Testing of Model
6.     If error-rate > threshold-value
7.         Back Propagation
8.         Weight setting
9.     End If
10. End While
11. Neural Network’s Bayesian regularization
12. Classification Results
13. Classification of the Outbreak Nature
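The training loop in steps 4 to 10 can be sketched in Python with NumPy as shown below. This is a simplified stand-in, not the authors' MATLAB implementation: it trains a small one-hidden-layer network by plain gradient descent with a fixed L2 weight penalty, which is not full Bayesian regularization (the penalty weight is not re-estimated). The function name, the hyperparameters, and the synthetic two-feature data are all assumptions.

```python
import numpy as np


def train_until_threshold(X, y, hidden=8, lr=0.5, weight_decay=1e-4,
                          threshold=0.02, max_epochs=5000, seed=0):
    """Train until the mean squared error drops to `threshold` (or epochs run out)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.5, size=(d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    targets = y.reshape(-1, 1).astype(float)
    history = []

    def forward(inputs):
        hidden_out = np.tanh(inputs @ W1 + b1)
        return hidden_out, 1.0 / (1.0 + np.exp(-(hidden_out @ W2 + b2)))

    for _ in range(max_epochs):
        # Step 5: evaluate the current model.
        H, out = forward(X)
        err = out - targets
        mse = float(np.mean(err ** 2))
        history.append(mse)
        # Steps 4 and 6: stop once the error is no longer above the threshold.
        if mse <= threshold:
            break
        # Step 7: backpropagation of the MSE through sigmoid and tanh layers.
        d_out = 2.0 * err * out * (1.0 - out) / n
        grad_W2 = H.T @ d_out + weight_decay * W2
        grad_b2 = d_out.sum(axis=0)
        d_hidden = (d_out @ W2.T) * (1.0 - H ** 2)
        grad_W1 = X.T @ d_hidden + weight_decay * W1
        grad_b1 = d_hidden.sum(axis=0)
        # Step 8: weight update.
        W1 -= lr * grad_W1
        b1 -= lr * grad_b1
        W2 -= lr * grad_W2
        b2 -= lr * grad_b2

    def predict(inputs):
        """Return 0 (normal) or 1 (severe) for each row."""
        return (forward(inputs)[1] >= 0.5).astype(int).ravel()

    return predict, history


# Synthetic two-feature data with a linear class boundary (assumption).
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.normal(size=(200, 2))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)

predict, history = train_until_threshold(X_demo, y_demo)
print(len(history), history[0], history[-1])
```

The `max_epochs` cap is a practical addition: without it, a threshold that the network cannot reach would keep the loop running indefinitely.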
The output is labeled 0 or 1, where 0 represents a normal outbreak and 1 represents
a severe outbreak. The output is labeled based on input parameter values. Tab. 8 lists the
highest-ranked features of the classified dataset, which are selected as inputs for the neural
network. The mean square error (MSE) and regression values are given in Tab. 9.
Table 8: Selected inputs of COVID-19 dataset
Sr. No. Inputs
1. Province old
2. Suspected cases cumulative
3. Lab tests cumulative
4. Confirmed cases cumulative
5. Deaths cumulative
As shown in Tab. 9, of the 852 dataset entries, 596 instances are selected for training, 128
for validation, and 128 for testing. Furthermore, 50 hidden neurons with one epoch
are used for the neural network. The confusion matrix shows that the predicted class matches
the actual class with 99.88% accuracy. This indicates that the BRANN classifier
predicts the results accurately for this dataset. Tab. 9 shows that the BRANN classifier correctly
categorized 128 data items each for the validation and testing processes.
Table 9: Bayesian regularization results
Bayesian regularization Samples MSE Regression
Training 596 4.55463e-5 9.99908e-1
Validation 128 0.00000e-0 0.00000e-0
Testing 128 8.41366e-0 2.75570e-1
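The sample counts in Tab. 9 are consistent with a 70%/15%/15% division of the 852 entries, although the paper does not state the ratios, so they are an assumption here. The sketch below reproduces the counts and shows how the two reported quantities, the mean square error and the regression value R (taken here to be the Pearson correlation between targets and outputs), can be computed. The function names and the short example vectors are hypothetical.

```python
import math


def split_counts(n_samples, train=0.70, val=0.15):
    """Sizes of the training, validation, and testing subsets."""
    n_train = round(n_samples * train)
    n_val = round(n_samples * val)
    return n_train, n_val, n_samples - n_train - n_val


def mse(targets, outputs):
    """Mean square error between targets and network outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


def pearson_r(targets, outputs):
    """Correlation coefficient R between targets and network outputs."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    return cov / math.sqrt(var_t * var_o)


print(split_counts(852))  # (596, 128, 128)

# Made-up targets (0 = normal, 1 = severe) and network outputs.
targets = [0, 1, 0, 1]
outputs = [0.1, 0.9, 0.2, 0.8]
print(mse(targets, outputs), pearson_r(targets, outputs))
```

An R value close to 1 together with a small MSE indicates that the outputs track the targets closely.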
Figure 11: Error histogram of Bayesian regularization ANN algorithm
Figure 12: Bayesian regularization regression plot
From Fig. 11, it is evident that the BRANN errors are close to 0. This indicates that the
neural network fits the data closely. Fig. 12 shows how accurately the neural network
approximates the regression function for the dataset. The network outputs are shown in
comparison with the target outputs. How accurately the model fits the data is represented by the
colored line shown in Fig. 12. This line should pass closely through the actual outputs from the
left to the right corner of the regression plot. Fig. 12 shows that the COVID-19 dataset is
closely fitted by the BRANN model.
Fig. 13 shows the training state of the BRANN (gradient, mu, number of parameters, the sum of
squared parameters, and validation checks). Each quantity is plotted over 1000 epochs, which
indicates good performance on the dataset. Fig. 14 represents the mean square error of the BRANN.
The blue and red lines represent the training and testing mean square errors, respectively, and
the dotted line marks epoch 1000. Fig. 14 shows the best training performance of the BRANN.
Figure 13: Training state of BRANN
Figure 14: Mean square error of neural network BRANN
Fig. 15 is the confusion matrix of the BRANN classifier. The BRANN classifier gives 99.88%
accuracy for training, testing, and validation on the COVID-19 dataset for Pakistan.
The output is divided into two classes, 0 and 1, where 0 denotes that the outbreak
is normal and 1 denotes that the outbreak is severe. Five features are selected as inputs
according to their importance, as ranked by the ML classifiers.
Figure 15: BRANN confusion matrix
The COVID-19 dataset for Pakistan is classified through machine learning techniques, and
their accuracy results are compared. The results show that the DT and NB classifiers give 100%
accuracy for this dataset. The BRANN then fits the classified dataset closely and categorizes it
into a normal class and a severe class for the COVID-19 outbreak in Pakistan.
5 Conclusion
The proposed system categorizes the COVID-19 outbreak in Pakistan based on a dataset
collected in different regions of Pakistan. Machine learning classifiers play a vital role in the
classification, categorization, and prediction of dangerous diseases such as COVID-19. With the
help of various machine learning techniques, the loss from COVID-19 can be minimized in the
upcoming months in Pakistan. First, we classified the COVID-19 dataset using different machine
learning classifiers. Then, the BRANN classifier was used to categorize the nature of the outbreak
as normal or severe. The experiments show that the BRANN provides a best-fit regression plot with
a minimal error rate. In the future, the proposed model can be further tested on a larger
dataset [39,40] to test its scalability.
Funding Statement: The authors are grateful to the Raytheon Chair for Systems Engineering for
funding.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding
the present study.
References
[1] W. H. Organization, “Health topics. Coronavirus,” Coronavirus: Symptoms, World Health Organization, 2020. [Online]. Available: https://www.who.int/healthtopics/coronavirus#tab=tab_3.
[2] K. Karim, S. Guha and R. Beni, “Globalism after covid-19 pandemic: A turning point in the separation of social and economic aspects,” Voice of the Publisher, vol. 6, no. 2, pp. 7–17, 2020.
[3] N. N. Thilakarathne, M. K. Kagita, T. R. Gadekallu and P. K. R. Maddikunta, “The adoption of ict powered healthcare technologies towards managing global pandemics,” arXiv e-prints, arXiv:2009.05716, 2020.
[4] B. G. Ali, T. Announce and G. Amr, “The day after tomorrow: Cardiac surgery post-covid-19,” Authorea Preprints, 2020. https://doi.org/10.22541/au.159284828.87817861.
[5] J. M. Read, J. R. Bridgen, D. A. Cummings, A. Ho and C. P. Jewell, “Novel coronavirus 2019-ncov: Early estimation of epidemiological parameters and epidemic predictions,” MedRxiv, 2020. https://doi.org/10.1101/2020.01.23.20018549.
[6] A. Wnuk, T. Oleksy and D. Maison, “The acceptance of covid-19 tracking technologies: The role of perceived threat, lack of control, and ideological beliefs,” PLoS One, vol. 15, no. 9, pp. e0238973, 2020.
[7] K. B. Mitchell and S. R. Weinstein, “Concerns regarding the article entitled ‘safe handling of containers of expressed human milk in all settings during the sars-cov-2 (covid-19)’,” Journal of Human Lactation, vol. 36, no. 3, pp. 542, 2020.
[8] L. Bourouiba, “Turbulent gas clouds and respiratory pathogen emissions: Potential implications for reducing transmission of covid-19,” JAMA, vol. 323, no. 18, pp. 1837–1838, 2020.
[9] M. Begum, M. S. Farid, S. Barua and M. J. Alam, “Covid-19 and Bangladesh: Socio-economic analysis towards the future correspondence,” Asian Journal of Agricultural Extension, Economics & Sociology, pp. 143–155, 2020. https://doi.org/10.20944/preprints202004.0458.v1.
[10] N. Noreen, S. Dil, S. Niazi, I. Naveed, N. Khan et al., “Covid 19 pandemic & Pakistan; limitations and gaps,” Global Biosecurity, vol. 1, no. 4, pp. 1–11, 2020.
[11] U. Ramzan, “Coronavirus diagnostic kits arrived in Pakistan_ace news,” ACE News, 2020. https://acenews.pk/coronavirus-diagnostic-kits-arrived-in-pakistan/.
[12] Pakistan Government, “Covid-19 situation,” 2020. [Online]. Available: http://covid.gov.pk/ [Last accessed 16 September 2020].
[13] N. Noreen, S. Dil, S. U. K. Niazi, I. Naveed, N. U. Khan et al., “COVID 19 Pandemic & Pakistan; limitations and gaps,” Global Biosecurity, vol. 2, no. 1, pp. 1–11, 2020.
[14] S. Montanari, “Japan has a remarkably low number of coronavirus cases that experts worry may lead to a false sense of security,” pp. 1–6, 2020. https://www.businessinsider.com/why-japan-cases-of-coronavirus-are-so-low-2020-3.
[15] J. M. Goraya, “Testing people for covid,” 2020. [Online]. Available: https://www.geo.tv/latest/279454-is-pakistan-testing-enough-people-for-covid-19 [Last accessed 16 September 2020].
[16] M. MK, G. Srivastava, S. R. K. Somayaji, T. R. Gadekallu, K. Reddy et al., “An incentive-based approach for COVID-19 using blockchain technology,” arXiv preprint arXiv:2011.01468, 2020.
[17] S. Bhattacharya, P. K. Reddy, Q. Pham, T. R. Gadekallu, C. Chowdhary et al., “Deep learning and medical image processing for coronavirus (COVID-19) pandemic: A survey,” Sustainable Cities and Society, vol. 65, p. 102589, 2021. https://doi.org/10.1016/j.scs.2020.102589.
[18] F. Times, “Coronavirus tracked: The latest figures as the pandemic spreads,” Financial Times, 2020. https://www.ft.com/content/a2901ce8-5eb7-4633-b89ccbdf5b386938.
[19] N. Jafarpisheh and M. Teshnehlab, “Cancers classification based on deep neural networks and emotional learning approach,” IET Systems Biology, vol. 12, no. 6, pp. 258–263, 2018.
[20] Z. Cai, J. Gu, C. Wen, D. Zhao, C. Huang et al., “An intelligent parkinson’s disease diagnostic system based on a chaotic bacterial foraging optimization enhanced fuzzy knn approach,” Computational and Mathematical Methods in Medicine, vol. 2018, Article ID 2396952, 2018. https://doi.org/10.1155/2018/2396952.
[21] C. Iwendi, A. K. Bashir, P. Atharv, R. Sujatha, J. M. Chatterjee et al., “COVID-19 patient health prediction using boosted random forest algorithm,” Frontiers in Public Health, vol. 8, pp. 357, 2020.
[22] T. T. Ramanathan and D. Sharma, “An SVM-fuzzy expert system design for diabetes risk classification,” International Journal of Computer Science and Information Technologies, vol. 6, no. 3, pp. 2221–2226, 2015.
[23] C. B. C. Latha and S. C. Jeeva, “Improving the accuracy of prediction of heart disease risk based on ensemble classification techniques,” Informatics in Medicine Unlocked, vol. 16, no. 6, pp. 100203, 2019.
[24] S. Vyas, R. Ranjan, N. Singh and A. Mathur, “Review of predictive analysis techniques for analysis of diabetes risk,” in 2019 Amity Int. Conf. on Artificial Intelligence, Dubai, United Arab Emirates, IEEE, pp. 626–631, 2019.
[25] W. Chang, Y. Liu, Y. Xiao, X. Yuan, X. Xu et al., “A machine-learning-based prediction method for hypertension outcomes based on medical data,” Diagnostics, vol. 9, no. 4, pp. 178, 2019.
[26] R. C. Lacson, B. Baker, H. Suresh, K. Andriole, P. Szolovits et al., “Use of machine-learning algorithms to determine features of systolic blood pressure variability that predict poor outcomes in hypertensive patients,” Clinical Kidney Journal, vol. 12, no. 2, pp. 206–212, 2019.
[27] S. Tian, W. Hu, L. Niu, H. Liu, H. Xu et al., “Pulmonary pathology of early phase 2019 novel coronavirus (covid-19) pneumonia in two patients with lung cancer,” Journal of Thoracic Oncology, vol. 15, no. 5, pp. 700–704, 2020.
[28] A. Jaafari, M. Panahi, B. T. Pham, H. Shahabi, D. T. Bui et al., “Meta optimization of an adaptive neuro-fuzzy inference system with grey wolf optimizer and biogeography-based optimization algorithms for spatial prediction of landslide susceptibility,” Catena, vol. 175, no. 3, pp. 430–445, 2019.
[29] E. Kirkos, C. Spathis and Y. Manolopoulos, “Support vector machines, decision trees and neural networks for auditor selection,” Journal of Computational Methods in Sciences and Engineering, vol. 8, no. 3, pp. 213–224, 2008.
[30] D. R. Amancio, C. H. Comin, D. Casanova, G. Travieso, O. Martinez Bruno et al., “A systematic comparison of supervised classifiers,” PLoS One, vol. 9, no. 4, pp. 1–13, 2014.
[31] J. M. Górriz, J. Ramírez, J. Suckling, I. A. Illán, A. Ortiz et al., “Case-based statistical learning: A non-parametric implementation with a conditional-error rate SVM,” IEEE Access, vol. 5, pp. 11468–11478, 2017.
[32] A. Al-Nasheri, G. Muhammad, M. Alsulaiman, Z. Ali, K. H. Malki et al., “Voice pathology detection and classification using auto-correlation and entropy features in different frequency regions,” IEEE Access, vol. 6, pp. 6961–6974, 2017.
[33] Mesum Raza Hemani, “Coronavirus Pakistan dataset 2020,” [Online]. Available: https://www.kaggle.com/mesumraza/coronavirus-pakistan-dataset-2020?select=COVID_FINAL_DATA.xlsx [Last accessed 16 September 2020].
[34] T. Reddy, M. P. K. Reddy, K. Lakshmana, R. Kaluri, D. S. Rajput et al., “Analysis of dimensionality reduction techniques on big data,” IEEE Access, vol. 8, pp. 54776–54788, 2020.
[35] T. Reddy, S. Bhattacharya, P. K. R. Maddikunta, S. Hakak, W. Z. Khan et al., “Antlion re-sampling based deep neural network model for classification of the imbalanced multimodal dataset,” Multimedia Tools and Applications, pp. 1–25, 2020. https://doi.org/10.1007/s11042-020-09988-y.
[36] C. Iwendi, Celestine, S. A. Moqurrab, A. Anjum, S. Khan et al., “N-Sanitization: A semantic privacy-preserving framework for unstructured medical datasets,” Computer Communications, vol. 161, pp. 160–171, 2020.
[37] B. Tripathy, M. Parimala and G. T. Reddy, “Innovative classification, a regression model for predicting various diseases,” in Data Analytics in Biomedical Engineering and Healthcare, Academic Press, pp. 179–203, 2020. https://doi.org/10.1016/B978-0-12-819314-3.00012-4.
[38] T. R. Gadekallu, N. Khare, S. Bhattacharya, S. Singh, P. K. R. Maddikunta et al., “Deep neural networks to predict diabetic retinopathy,” Journal of Ambient Intelligence and Humanized Computing, 2020. https://doi.org/10.1007/s12652-020-01963-7.
[39] N. Deepa, Q. V. Pham, D. C. Nguyen, S. Bhattacharya, T. R. Gadekallu et al., “A survey on blockchain for big data: Approaches, opportunities, and future directions,” arXiv preprint arXiv:2009.00858, 2020.
[40] M. Tang, M. Alazab and Y. Luo, “Big data for cybersecurity: Vulnerability disclosure trends and dependencies,” IEEE Transactions on Big Data, vol. 5, no. 3, pp. 317–329, 2017.