
Machine Learning Towards Intelligent Systems: Applications, Challenges, and Opportunities

January 2021


Authors:

Mohammadnoor Ahmad Mohammad Injadat

Zarqa University

Abdallah Moubayed

Lebanese American University

Ali Bou Nassif

University of Sharjah

Abdallah Shami

The University of Western Ontario


Abstract The emergence of, and continued reliance on, the Internet and related technologies has resulted in the generation of large amounts of data that can be made available for analyses. However, humans do not possess the cognitive capabilities to understand such large amounts of data. Machine learning (ML) provides a mechanism for humans to process large amounts of data, gain insights about the behavior of the data, and make more informed decisions based on the resulting analysis. ML has applications in various fields. This review focuses on some of these fields and applications, such as education, healthcare, network security, banking and finance, and social media. Each of these fields faces multiple unique challenges. However, ML can provide solutions to these challenges, as well as create further research opportunities. Accordingly, this work surveys some of the challenges facing the aforementioned fields and presents some of the previous literature works that tackled them. Moreover, it suggests several research opportunities that benefit from the use of ML to address these challenges.


Artificial Intelligence Review manuscript No. (will be inserted by the editor)



Keywords Machine learning · Data Analytics · Application fields · Research Opportunities

MohammadNoor Injadat, Abdallah Moubayed, Abdallah Shami
Electrical & Computer Engineering Dept., University of Western Ontario, London, ON, Canada
E-mail: minjadat@uwo.ca, amoubaye@uwo.ca, abdallah.shami@uwo.ca

Ali Bou Nassif
Computer Engineering Dept., University of Sharjah, Sharjah, UAE, and Electrical & Computer Engineering Dept., University of Western Ontario, London, ON, Canada
E-mail: anassif@sharjah.ac.ae

arXiv:2101.03655v1 [cs.LG] 11 Jan 2021

1 Introduction

The rapid growth of the Internet and related technologies has provided individuals, organizations, and society with the opportunity to collect large amounts of data (Van Der Aalst, 2016). However, these large amounts of data often lead to information overload. Information overload occurs when the amount of input (e.g. data) that a human is trying to process exceeds their cognitive capacities (Halford et al., 2005). Information overload can lead to humans ignoring, overlooking, or misinterpreting crucial information (Caban and Gotz, 2015).

Humans do not have the cognitive capacity to process large amounts of data. Therefore,

the discipline of data science has emerged. Data science combines the classic disciplines

of statistics, data mining, databases, and distributed systems in order to extract information

from large sets of data (Van Der Aalst, 2016). One approach to data analysis that data scientists can apply is machine learning (ML) (Moubayed, 2018). ML allows computers

to learn without being explicitly programmed. Once the computer learns patterns from a

training set of data, it can apply what it has learned to ﬁnd these patterns in similar data

(Kearns et al., 1994). Furthermore, ML allows computer systems to adapt and learn from

their experience (Wilson and Keil, 2001; Mitchell, 1997).

ML algorithms have many applications, including house price prediction, spam filtering, education, structuring of data in healthcare systems, drug response prediction, diabetes research, network security, banking and finance, and social media. This work aims to

provide a brief literature review of the challenges facing diﬀerent ﬁelds such as education,

healthcare, network security, banking and ﬁnance, and social media. Moreover, it presents

several research opportunities on the role and potential of using ML to address these chal-

lenges. Hence, the contributions of this paper are summarized as follows:

– Briefly describing the different challenges facing a variety of modern fields, including education, healthcare, network security, banking and finance, and social media.

– Presenting some of the previous literature works that addressed these challenges, along with their shortcomings.

– Discussing the role and potential of ML in addressing these challenges and presenting potential frameworks for its deployment.

The remainder of this paper is organized as follows: Section 2 presents some of the recent

trends concerning the development and deployment of ML algorithms. Section 3 discusses

the education ﬁeld. Section 4 focuses on the healthcare system and ﬁeld. Section 5 presents

the challenges in network security and the potential role of ML in addressing these chal-

lenges. Section 6 sheds light on the banking and ﬁnance sector. Section 7 focuses on the

area of social media. Finally, Section 8 concludes the paper.

2 Recent Trends in Machine Learning

ML has become an extremely popular topic within development organizations that are

looking to adopt a data-driven approach to improve their business by gaining useful infor-

mation from the data they collect. With ML models, organizations can continually predict

changes in their business and make decisions accordingly. ML uses algorithms that iteratively learn from data in order to improve, describe the data, and predict outcomes. Once an ML model has been trained, it can make predictions on new data given as input. The model's output on new data depends on the data used to train it.

The growing adoption of ML in various fields is underscored by the amount of financial resources being allocated to deploy ML models. As illustrated in Figure 1, the global ML market is expected to reach close to $42.5 billion CAD by the year 2024 (Columbus, 2020). Furthermore, as per McKinsey & Company's "Notes from the AI Frontier, Tackling Europe's Gap in Digital and AI" discussion paper, the ML market could boost economic activity growth throughout the EU by as much as 20% by the year 2030 (Bughin et al., 2019). Moreover, the World Economic Forum predicted that a net of 58 million jobs will be created in the coming years due to ML technologies (Forum, 2018). This highlights the importance and the potential positive impact that ML will have on the global economy.

Fig. 1: Global ML Forecast (Columbus, 2020)

Within the area of ML, one promising paradigm is federated learning (FL). FL is an ML paradigm in which a high-quality centralized model is trained using data that is distributed over a large number of locations. The term was coined by Google in 2016, when they proposed a mechanism in which the data at each location is used to independently compute an update to the current ML model. This update is then communicated back to a central service that aggregates the updates to compute a new global model, which is distributed back to the different locations (Konečný et al., 2016). Accordingly, this paradigm adopts the "bringing the code to the data" philosophy rather than "bringing the data to the code". As such, the FL paradigm addresses concerns regarding data privacy, ownership, and locality (Bonawitz et al., 2019; Yang et al., 2019b). Given the distributed nature of water leak monitoring systems, with sensors collecting data at various geographical locations, FL promises to be a viable solution for extracting meaningful information from the collected data while still maintaining its privacy and locality.
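The update-and-aggregate loop described above can be illustrated with a small sketch. It assumes a one-parameter linear model y = w * x trained by gradient descent, and it averages the locally trained parameters weighted by each location's sample count. The function names (`local_update`, `federated_round`), the learning rate, and the toy data are hypothetical choices for illustration and are not taken from the cited works.

```python
def local_update(w, data, lr=0.01, epochs=5):
    """Run a few gradient steps on one location's private (x, y) pairs."""
    for _ in range(epochs):
        # Gradient of the mean squared error for the model y = w * x.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global, clients):
    """One round: every client trains locally, the server averages the results.

    Only model parameters leave a client; the raw data never does.
    """
    total = sum(len(data) for data in clients)
    return sum(local_update(w_global, data) * len(data) for data in clients) / total


# Two hypothetical locations whose data follow y = 2x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0), (5.0, 10.0)],
]

w = 0.0
for _ in range(50):
    w = federated_round(w, clients)
```

After the rounds complete, `w` approaches 2, the slope shared by both locations' data, even though neither dataset was ever sent to the central service.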

The continued projected market growth of ML technologies and the privacy-preserving characteristic of FL (due to its distributed learning nature) have resulted in increased demand for FL, with new technologies and frameworks currently being developed. For example, Google has released a TensorFlow-based FL framework named TensorFlow Federated (TFF). TFF enables developers to deploy an AI system and train it across data from multiple sources, all while keeping each of those sources separate and local (Ingerman and Ostrowski, 2019). Other FL frameworks include Federated AI Technology Enabler (Webank's, AI, 2019), PySyft (Ryffel et al., 2018), Leaf (Caldas et al., 2019), PaddleFL (Dong et al., 2019), and Clara Training Framework (Wen et al., 2019). Together with the popular ML algorithms previously proposed in the literature, such as neural networks and support vector machines, this illustrates the recent and continued efforts to develop and deploy ML algorithms and paradigms.

3 Education

The ﬁrst area considered is that of the education sector. There are three main ways that

education can be delivered: onsite, online, and blended learning. Onsite education, or tradi-

tional education, refers to educational content delivery within a traditional classroom setting

(Moubayed et al., 2018). This setting requires that the educator and the students are in the

same room at the same time. This allows the educator to deliver his/her lecture to the at-

tending class. As such, traditional classrooms provide face-to-face interaction between the

educator and the students (Black, 2002).

On the other hand, online education, one category of e-learning systems, refers to edu-

cation that is provided over the Internet. E-learning provides students with the opportunity

to access educational curriculum outside of a traditional classroom at any time from any

geographical location. However, there is no face-to-face interaction with the educator as all

the content is delivered remotely (Moubayed et al., 2018).

Last but not least, blended or hybrid learning is a combination of onsite and online edu-

cation. For the education delivery system to be considered blended, up to 30% of the course

requirements must be conducted face-to-face in a traditional classroom setting, while the re-

maining percentage of the course requirements can be completed online. Blended learning

offers students the opportunity to have face-to-face interactions with the educator and other students while also providing them with the opportunity to access course materials at any time from any location (Moubayed et al., 2018).

However, the educational sector faces a variety of challenges, some of which are pedagogical and others technical. This section identifies some of the challenges in the education sector. Moreover, it presents some of the previous literature works that tried to address each of these challenges. Furthermore, it discusses the role of ML in addressing them. More specifically, this section discusses how ML can be used to grade essays, predict and prevent student dropout, improve intelligent tutoring systems, recommend online courses, and provide personalized learning.

3.1 Essay Grading

3.1.1 Challenge Description:

Essays provide a tool for assessing students’ critical thinking, analysis, and communica-

tion skills. However, it is time consuming for educators to grade essays. Furthermore, when

humans grade essays there is a great level of subjectivity which can lead to two diﬀerent

graders scoring an essay very diﬀerently (Mahana et al., 2012). Within this context, using

ML algorithms to grade essays can reduce the workload of educators and provide more

objectivity during the grading process. A common approach to creating an essay grading

algorithm is to ﬁrst collect a large pool of essays which have characteristics that are com-

putationally measurable (e.g., sentence length, word frequency distributions, grammar, and

spelling), and have been scored by humans (Ramalingam et al., 2018). This allows the algo-

rithm to ﬁrst learn the characteristics which are important for grading an essay. Then, when


the algorithm is used to score essays, the algorithm’s scores can be compared to those of the

human graders in order to determine if the algorithm has properly learned the grading char-

acteristics. It is worth mentioning that this challenge is not only from the technical aspect,

but also from the social aspect in terms of accepting the output of the automated ML-based

models. However, this work only focuses on the technical aspect of this challenge rather

than the social aspect of it.
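To make "computationally measurable characteristics" concrete, the following sketch extracts a few surface features from an essay. The feature set, the function name `essay_features`, and the seven-character threshold for a "long word" are illustrative assumptions, not the features used by any of the cited systems.

```python
import re


def essay_features(text, long_word_len=7):
    """Return simple surface features of an essay as a dictionary."""
    # Words are runs of letters/apostrophes; sentences end at ., ! or ?.
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "long_word_count": sum(len(w) >= long_word_len for w in words),
    }


features = essay_features(
    "Machine learning helps teachers. It can grade essays quickly!"
)
```

Feature vectors of this kind, paired with human-assigned scores, form the training data that a grading model learns from.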

3.1.2 Previous Works:

Mahana et al. (2012) built an automated essay scoring system using essays from kaggle.com.

The authors selected roughly 13,000 essays from a pool of essays that were submitted to a

competition by the William and Flora Hewlett Foundation. The essays were written by stu-

dents from Grade 7 to Grade 10 and were approximately 150 to 550 words long. The selected

essays were divided into eight sets, with each of the sets having unique grading characteris-

tics. Eight diﬀerent sets were selected to ensure that the automated grader was trained across

different types of essays. Furthermore, each essay had one or more human scores. In the latter case, the essay also had a final resolved score which considered all human scores. After selecting the eight sets of training essays, the authors extracted several features from them (e.g., total word count per essay, sentence count, number of long words, part of speech counts, etc.). These features were selected because they are characteristics that a human grader would commonly look for when grading an essay. The authors then used a linear regression model to allow their algorithm to learn parameters for grading based on the selected features. After the algorithm learned the parameters for scoring the eight different essay types, it was used to score a distinct set of test essays. These scores were compared against human graded scores to compute an agreement metric (Quadratic Weighted Kappa). The average kappa score of the authors' algorithm across the eight essay types was 0.73. Essay set 8 had the lowest kappa at 0.68 and essay set 1 had the highest kappa at 0.80. Although the proposed framework achieved good performance, as seen from the high kappa values, this work has a limited contribution as it only considered one ML algorithm.
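As a rough illustration of the evaluation metric mentioned above, the sketch below computes Quadratic Weighted Kappa between two lists of integer scores using its standard definition. The function name and the toy scores are hypothetical, and the code assumes the score range contains at least two distinct values and that the two raters do not both assign one identical constant score.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_score, max_score):
    """Agreement between two integer score lists, penalizing large gaps more."""
    n = max_score - min_score + 1
    # observed[i][j]: number of essays scored i by rater A and j by rater B.
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_score][b - min_score] += 1
    total = len(rater_a)
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]
    numerator = denominator = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2
            # Count expected in this cell if the raters were independent.
            expected = hist_a[i] * hist_b[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected
    return 1.0 - numerator / denominator


human_scores = [1, 2, 3, 4]
model_scores = [1, 2, 3, 3]
kappa = quadratic_weighted_kappa(human_scores, model_scores, 1, 4)
```

A value of 1 means the two sets of scores agree perfectly, while a value near 0 means the agreement is about what chance alone would produce.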

Ramalingam et al. (2018) also used ML techniques to develop an automated essay assessment system. The authors selected essays from a pool of essays that were submitted to a competition by The Hewlett Foundation on kaggle.com. All of the essays had been graded by humans. Similarly, the authors divided these essays into eight unique sets and then used Bayesian Linear Regression as their algorithm. The Bayesian essay scoring system used features such as specific words, specific phrases, the order in which certain noun-verb pairs appear, and the order of the concepts explained to score the essays. In the end, the authors tested their algorithm on eighty essays divided into two groups of forty, which had also been scored by humans. The authors' algorithm was over 80% accurate at scoring the essays. Yet, this work also considered only one ML technique, which limits its contribution.

Ullmann (2019) also discussed using ML to assess essays. More specifically, the author investigated the potential of ML algorithms to automate the analysis of reflective essays. To that end, the author explored eight different categories that are often used as metrics to evaluate the quality of a reflective passage, namely reflection, experience, feeling, belief, difficulty, perspective, learning, and intention. The author collected data from 76 students containing a total of 5080 sentences. Then, the author created a training dataset using a random sample consisting of 80% of the sentences and tested the performance of four different ML classification algorithms, namely support vector machines (SVM), neural networks, random forests (RF), and Naive Bayes, on the remaining 20% of the sentences. Experiments showed that the accuracy of the different ML models ranged between 80% and 96% for the different reflective essay metrics.
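The random 80%/20% partition described above can be sketched as follows. The helper name `train_test_split`, the fixed seed, and the use of integer placeholders in place of real sentences are assumptions made for illustration; only the sentence count (5080) and the split ratio come from the text.

```python
import random


def train_test_split(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of the items and cut it into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder identifiers standing in for the 5080 labelled sentences.
sentences = list(range(5080))
train_set, test_set = train_test_split(sentences)
```

Holding out the test sentences in this way means the reported accuracy reflects performance on data the classifiers did not see during training.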

Mathias and Bhattacharyya (2020) explored the use of deep learning models to automate the essay grading process. To that end, the authors used the ASAP AEG dataset, which describes different essay sets with multiple essay traits such as Content, Organization, Word Choice, Sentence Fluency, and Conventions (Mathias and Bhattacharyya, 2018). Moreover, the authors proposed the use of a feature engineering system, a string kernel, and an attention-based neural network as part of their automated essay grading framework. Additionally, Cohen's Kappa was used to evaluate the performance of the proposed framework, as it accounts for data samples being classified correctly by chance. Results showed that the attention-based neural network algorithm outperformed other works from the literature across multiple prompts (with each prompt having a different set of essay traits), with a Kappa value between 0.586 and 0.820. However, the work only considered deep neural networks without considering other potential classification algorithms that may be more computationally efficient and have similar performance.
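For reference, Cohen's Kappa is commonly defined in terms of the observed agreement $p_o$ between two raters (here, the model and the human grader) and the agreement $p_e$ expected by chance. The symbols below follow the usual textbook notation rather than the notation of the cited work; $p_{a,k}$ and $p_{b,k}$ denote the fraction of samples that raters $a$ and $b$ assign to category $k$.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e},
\qquad
p_e = \sum_{k} p_{a,k}\, p_{b,k}
```

A value of $\kappa = 1$ indicates perfect agreement, while $\kappa = 0$ indicates agreement no better than chance.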

3.1.3 Research Opportunities:

As can be seen from this review of the literature, essays are an important tool for assessing

students’ comprehension and expression. However, grading essays is a time consuming task.

Furthermore, essay grading is prone to subjectivity, which can lead to the same essay being

scored diﬀerently by two graders. Hence, essay grading presents the challenges of time con-

sumption and human subjectivity.

ML oﬀers a potential solution to address these issues. Firstly, ML algorithms can be

used so that graders no longer need to spend time on grading. Secondly, such algorithms can

be used to provide objective scores of essays. Although the previous subsection presented

research that has shown how ML can be used to address the challenges of essay grading,

there are still opportunities for further research.

One potential opportunity is exploring and evaluating diﬀerent ML models (e.g. logistic

model trees or deep neural networks). This is mainly due to the fact that most of the previous

work only used one algorithm for essay grading. Therefore, it is important to explore and

compare the performance of other ML models to obtain a more robust essay grading frame-

work, especially given the eﬀectiveness of other models such as deep neural networks in

natural language processing problems. Another potential opportunity is studying the impact

of more advanced Natural Language Processing features (e.g. N-grams, k-nearest neighbors

(k-NN) in bag of words), selecting features that are grammar and usage speciﬁc, and explor-

ing other polynomial basis functions like neural networks (NN) as part of the essay grading

framework.

Such frameworks can be applied to any assessment task that contains an essay compo-

nent. This includes exams and tests that contain essay sections. The application of essay

grading algorithms to exam and test essays could increase the consistency of scoring while

reducing the grader bias. Furthermore, there is the possibility to use essay grading algo-

rithms as components of interactive knowledge and writing tutorial systems.


3.2 Dropout Prevention

3.2.1 Challenge Description:

Student dropout is another challenge that is prevalent in the education sector. The term dropout refers to the case in which a student leaves/quits a course before completing it. Recent studies showed that students were more likely to drop out of online courses than traditional classes (Coussement et al., 2020). High dropout rates can affect the future of colleges and universities. This is because different stakeholders within the education field, including policymakers, funding bodies, and educators, consider dropout rates to be an objective outcome-based measure of the educational institutions' quality (Sneyers and De Witte, 2017).

While the higher dropout rate of students in online classes is a known issue, there are many possible reasons for students to drop out. In turn, this makes predicting dropout challenging. From the student side, possible reasons for online course dropout include higher than expected workload, inability to manage academic responsibilities in a self-driven learning environment, unfamiliarity with the online educational delivery system, less student–teacher interaction, family and social obligations, and motivation level (Bawa, 2016). However, students may also drop out of online courses because the course is poorly designed and delivered, which can occur when the professor who created and taught the online course is unfamiliar with technology and/or is provided with no training by their institution on how to teach in an online environment (Bawa, 2016).

As can be seen, there are many possible reasons students may drop out, which can make predicting which students will drop out a complicated task. Nevertheless, it is important to identify students at risk of dropping out so that professors can address the needs of these students and take the appropriate actions to reduce their probability of dropping out (Sneyers and De Witte, 2017). One way to make the task of identifying at-risk students easier is to use ML algorithms. Again, this challenge does not only entail a technical aspect, but also has a pedagogical aspect in terms of the context used for data collection purposes. However, as mentioned earlier, this work only focuses on the technical aspect in which ML can play a role, without tackling the pedagogical context of the data collection.

3.2.2 Previous Works:

Coussement et al. (2020) used ML techniques in order to predict dropout in e-learning

courses. More speciﬁcally, the authors proposed the use of logit leaf model (LLM), a deci-

sion tree (DT)-based classiﬁcation model, to accurately predict student dropout in subscription-

based e-learning environments. LLM was chosen due to its capability to balance between

comprehensibility and predictive performance. To that end, the authors compared the per-

formance of the proposed LLM model to that of eight other ML classiﬁcation algorithms

on a real-life dataset containing more than 10,000 students of a global subscription-based

e-learning provider. Results showed that the proposed LLM model was one of the top per-

forming student dropout prediction models with a high area under the curve (AUC) value

above 0.8, highlighting its eﬀectiveness in achieving its target task.

Similarly, Chung and Lee (2019) proposed the use of an ML classiﬁcation model to pre-

dict student dropout in high schools. In particular, they proposed the use of RF algorithm

for this task due to its high prediction accuracy in multiple scenarios and applications. To

that end, the authors used the National Education Information System (NEIS) data collected

in Korea in 2014 and evaluated the performance of the RF model using multiple metrics


such as accuracy, sensitivity, speciﬁcity, and AUC. Experiments showed that the proposed

RF achieved high accuracy (close to 95%) and high AUC value (close to 0.97), highlighting

the eﬀectiveness of this model. However, one shortcoming is that the work only considered

one classiﬁcation algorithm without comparing it to other potential ML algorithms.
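The evaluation metrics named above can be computed directly from a model's predictions. The sketch below uses hypothetical labels (1 = dropped out, 0 = stayed) and hypothetical predicted probabilities; it is not based on the NEIS data, and the function names are illustrative. AUC is computed as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 as the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn


def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical ground truth and predicted dropout probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
y_pred = [int(s >= 0.5) for s in scores]  # 0.5 threshold is an assumption

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
sensitivity = tp / (tp + fn)  # share of actual dropouts that were flagged
specificity = tn / (tn + fp)  # share of retained students correctly cleared
auc_value = auc(y_true, scores)
```

Sensitivity is usually the metric of most interest in this setting, since a missed at-risk student cannot be offered support, whereas AUC summarizes ranking quality independently of the chosen threshold.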

3.2.3 Research Opportunities:

Although ML has been proposed to predict dropout in e-learning courses, there are still

research opportunities within this area. One potential opportunity is studying the perfor-

mance of diﬀerent ML dropout prediction frameworks and models in other course delivery

settings such as blended learning, distance and classical education. This would highlight

the generality of the dropout prediction framework. Another opportunity worth exploring

is investigating the impact of different student attributes on dropout prediction. This is essential as it can result in more accurate models. A third opportunity to

consider is comparing the performance of diﬀerent base and ensemble learning methods to

achieve more accurate and robust prediction models and studying their impact on retention

strategies through correlation and association rules mining.

3.3 Intelligent Tutoring

3.3.1 Challenge Description:

The third challenge facing modern education systems is that of providing and improving

intelligent tutoring systems (Di Pietro and Distefano, 2019). For example, Troussas et al.

(2018) proposed an intelligent tutoring system to teach English and French languages using

ML-based models. However, in such a scenario, multiple challenges arise, including how to detect spelling mistakes, verb tense mistakes, and auxiliary verb mistakes.

As such, developing eﬀective intelligent tutoring systems can be a challenging task given

the signiﬁcant impact of multiple factors that need to be considered.

3.3.2 Previous Works:

Di Pietro and Distefano (2019) combined the concepts of ML and gamiﬁcation with cloud

technologies in a uniﬁed framework to improve intelligent tutoring systems. More speciﬁ-

cally, the authors proposed the use of diﬀerent ML models such as optical character recog-

nition, sentiment analysis, and speech recognition to create a virtual study buddy. The goal

of this system is to help students develop better study strategies by interacting with a digital

study partner. However, one limitation of this work was the fact that the authors did not test

their proposed framework to explore its eﬀectiveness.

On the other hand, Barron-Estrada et al. (2017) proposed the use of sentiment analysis

to improve the performance of an aﬀective intelligent tutoring system by better gauging the

opinions of students about the course contents. To that end, the authors used a collection

of texts containing more than 68,000 Twitter messages written in Spanish and transformed

them into numerical feature vectors along with their associated sentiment. Then, the authors

used Naive Bayes classiﬁer (chosen due to its simplicity) to predict the sentiment of future

texts/Twitter messages. Again, the authors used multiple metrics such as accuracy, precision,

recall, and f1-score to evaluate the performance of the proposed module. Experimental re-

sults showed that their proposed module achieved high accuracy (above 80%) coupled with


high f1-scores. However, one shortcoming of this work is the fact that they did not compare

the performance of their proposed module to other potential classiﬁers.
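The pipeline described above (texts converted to word counts, a Naive Bayes classifier, and precision/recall/f1 evaluation) can be sketched as follows. This is a minimal multinomial Naive Bayes with add-one smoothing written for illustration; the tiny English training set, the labels, and all function names are hypothetical and unrelated to the Spanish Twitter corpus used by the authors.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Count words per class; returns (word_counts, class_counts, vocab)."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab


def predict_nb(model, doc):
    """Return the class with the highest log posterior (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in doc.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and f1-score for one class treated as positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical student comments with sentiment labels.
train_docs = ["great course loved it", "boring lecture hated it",
              "loved the examples", "hated the exercises"]
train_labels = ["pos", "neg", "pos", "neg"]
model = train_nb(train_docs, train_labels)

test_docs = ["loved it", "hated lecture", "great examples", "boring exercises"]
test_labels = ["pos", "neg", "pos", "neg"]
predictions = [predict_nb(model, d) for d in test_docs]
precision, recall, f1 = precision_recall_f1(test_labels, predictions, "pos")
```

Swapping `train_nb` and `predict_nb` for another classifier while keeping the same evaluation function is one straightforward way to run the classifier comparison that the shortcoming above calls for.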

3.3.3 Research Opportunities:

Despite the fact that ML has been used to improve intelligent tutors, more research opportu-

nities still exist. One such opportunity is considering more features (for example phonemic

and history-based features) and investigating their impact on the performance of the de-

veloped model. Another potential opportunity to consider is comparing the performance of

other classiﬁers such as NN and SVM. This comparison can help determine whether the

classifiers previously proposed in the literature are biased. Moreover, such comparisons will lead to more adaptive and robust intelligent tutors.

3.4 Course Recommendation

3.4.1 Challenge Description:

Massive Open Online Courses (MOOCs), such as those offered by Coursera, Udacity, EdX, and MOOC.org, are a form of online distance education/e-learning. As such, MOOCs provide online courses

that can be accessed by a student at any time from any geographical location (Moubayed

et al., 2018). MOOCs are open to anyone that is interested in enrolling and are often free

or low-cost. However, the courses do not provide course credit and are not applied towards

a degree. Instead, MOOCs tend to be used by people who want to learn new skills, be it to

advance their career or for fun. MOOCs provide open access to a plethora of courses from

various top-rated universities and institutions. For example, the website mooc.org provides

a course titled “Data Science: R Basics” from Harvard University, and a course titled “In-

troduction to Data Analysis using Excel” from Microsoft (Mooc.org, 2019).

Since MOOCs are open access, hundreds of thousands of students can be enrolled in

each course, with MOOC platforms offering thousands of different courses. This means

that MOOC platforms are privy to mass amounts of data. This data can then be used to im-

prove the MOOC system. For example, having thousands of diﬀerent courses available can

be overwhelming for students. Therefore, if a student is looking to improve a speciﬁc skill,

it would be beneﬁcial for the MOOC system to recommend which courses are needed to

acquire those skills (Symeonidis and Malakoudis, 2016).

3.4.2 Previous Works:

Several previous literature works focused on the problem of course recommendation for

students. One such example is the work by Aher and Lobo (2013). The authors used prior

student data and a combination of ML algorithms to recommend courses to students in

an e-learning system. The authors combined Simple K-means (a clustering technique) and

Apriori (an association rule algorithm) to investigate prior students’ data from Moodle.org in

order to determine which courses to recommend to new students. The authors found that the

results of their combination approach matched real world student course selection patterns.

However, one limitation of this work is that it only considered one unsupervised clustering

algorithm.

Mondal et al. (2020) also proposed the use of ML algorithms as part of a course recommendation system for online learning environments. More specifically, the authors first used the K-means algorithm to group students based on their performance in previous courses.

This was followed by applying collaborative ﬁltering to recommend new suitable courses.

The results showed that the proposed model achieved a low root mean squared error and

mean absolute error. Furthermore, the model also achieved high precision and recall values,

indicating that it can return correct results and preserve the majority of true positives.
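
The two error metrics mentioned above can be illustrated with a minimal sketch. The ratings are made-up values for illustration only, not data from Mondal et al. (2020), and the function names `rmse` and `mae` are hypothetical helpers rather than part of the authors' system.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical course ratings (1-5 scale) and a recommender's predictions.
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 3.0]

print("RMSE:", rmse(actual, predicted))
print("MAE:", mae(actual, predicted))
```

RMSE penalizes large individual errors more heavily than MAE, which is one reason the two are often reported together.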

Similarly, Zhang et al. (2018a) proposed the use of a distributed association rule algo-

rithm as part of their course recommendation system. The authors used a combination of

Hadoop and Spark platforms to implement the proposed framework so that it is suitable for

MOOC environments. The experimental results on three diﬀerent datasets illustrated the ef-

fectiveness of the proposed framework by having a high conﬁdence value (close to 0.5) for

multiple association rules.
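
The confidence of an association rule can be sketched as follows. This is a generic illustration of the standard support and confidence definitions, not the distributed algorithm of Zhang et al. (2018a); the enrollment records and the helper names `support` and `confidence` are assumptions made for this example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) as support(A and C) / support(A).

    Assumes the antecedent appears in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical enrollment records: each set is one student's courses.
enrollments = [
    {"python", "statistics", "ml"},
    {"python", "ml"},
    {"python", "statistics"},
    {"statistics", "databases"},
]

# Confidence of the rule "python -> ml": of the students who took
# "python", what fraction also took "ml"?
rule_conf = confidence(enrollments, {"python"}, {"ml"})
print(rule_conf)
```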

3.4.3 Research Opportunities:

ML algorithms can be further applied to the large amounts of data that MOOC platforms

possess in order to determine which courses would be best for a student who is interested

in improving a speciﬁc skill set. One potential research opportunity for students’ course

recommendation is evaluating the courses that other students have taken that are related to

the skill that the student is interested in. Using that information can help build an eﬀective

course recommender. Another opportunity is to consider multiple supervised classification algorithms. This is particularly important given the substantial impact that the classification

process has on the overall performance of the recommender. Therefore, it is worth exploring

the performance of diﬀerent classiﬁcation algorithms to study their impact on the eﬀective-

ness of the recommendation process.

3.5 Personalized Learning

3.5.1 Challenge Description:

Personalized learning is based on the individual students and how they learn. Each individual

learns diﬀerently and has a unique learner proﬁle. This proﬁle is based on the individual’s

learning style (Klašnja-Milićević et al., 2011; Dwivedi and Bharadwaj, 2015; Bourkoukou

and El Bachari, 2016), which consists of speciﬁc behaviors and attitudes (Truong, 2016).

Personalizing each learner’s education can lead to better learning. One way to personal-

ize education is by using recommender systems that provide useful suggestions for users

(books, movies, products, etc.) based on their preferences and their similarity to other stu-

dents (Bourkoukou and El Bachari, 2018).

3.5.2 Previous Works:

Bourkoukou and El Bachari (2018) tested the ability of LearnFitII to act as a recommender

system. LearnFitII is an adaptive learning system that automatically adapts to the dynamic

preferences of learners. By mining the server logs of students, LearnFitII was able to rec-

ognize the diﬀerent learning styles and habits of students. Then, using the Felder-Silverman

model of learning styles, LearnFitII proposed personalized learning scenarios. The Felder-

Silverman model of learning styles consists of four learning dimensions (1. Information

Processing, 2. Information Perception, 3. Information Reception, and 4. Information Understanding) (Felder et al., 1988). These dimensions can be assessed via the Index of Learning Styles Questionnaire (ILSQ), which consists of 44 questions.

After proposing personalized learning scenarios, LearnFitII analyzed the habits and the

preferences of learners by mining information about the learners’ actions and interactions.

After mining this information, the learning scenarios were revisited and updated using a hybrid recommender system that combined k-NN and association rule mining algorithms. The authors found that when LearnFitII was tested in real environments, learning quality increased, as did the learners’ satisfaction with the learning process (Bourkoukou and El Bachari, 2018).

Another way that personalized learning can be beneﬁcial is in helping students select the

learning-pathway that is appropriate for them. Elfaki et al. (2014) investigated how student

learning-pathways can be improved with ML. The term learning-pathway can be under-

stood as the path of academic courses that is appropriate for a student to achieve a degree.

Ideally, one’s learning-pathway is in their ﬁeld of interest. Typically, students spend some

time taking various courses in order to discover which topics they are interested in. How-

ever, this process of taking various courses can lead to a mismatch between a student’s

current and preferred learning pathway. When mismatches occur, the student may experi-

ence academic diﬃculties (e.g. weak performance, high absentee rate). These mismatches

may lead students to lower their level of education or drop out of university altogether. To improve students’ levels of achievement, it would be beneficial to help them determine their desired learning-pathway sooner. To this end, the authors sent a questionnaire to 900 students from the Faculty of Computers and Information Technology at Tabuk University in Saudi Arabia, 450 of whom returned it. The questionnaire addressed four topics: basic information, personal information, academic information, and learning pathway information.

After collecting this data, the authors applied a DT algorithm to the data. Then, induction

rules were deduced from the tree paths in order to provide learning-pathway recommen-

dations. In order to validate their results, the authors divided the questionnaires into two

groups, a developing group (70%) and a test group (30%). Using these two groups of data,

the authors found that their algorithm could accurately provide learning-pathway recom-

mendations (Elfaki et al., 2014).
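
The idea of deriving rules from tree paths can be sketched as follows. The tree, feature names, and pathway labels are hypothetical and are not taken from Elfaki et al. (2014); the sketch only shows how each root-to-leaf path of a decision tree can be read as one IF-THEN rule.

```python
def extract_rules(tree, conditions=()):
    """Walk every root-to-leaf path and return (conditions, label) pairs."""
    if isinstance(tree, str):  # leaf: a recommended pathway label
        return [(conditions, tree)]
    feature, branches = tree
    rules = []
    for value, subtree in branches.items():
        rules.extend(extract_rules(subtree, conditions + ((feature, value),)))
    return rules

# Hypothetical toy tree: an internal node is (feature, {value: subtree}),
# and a leaf is a plain string.
toy_tree = ("likes_programming", {
    "yes": ("math_grade", {
        "high": "computer science",
        "low": "information systems",
    }),
    "no": "information technology",
})

for conds, label in extract_rules(toy_tree):
    body = " AND ".join(f"{feature} = {value}" for feature, value in conds)
    print(f"IF {body} THEN recommend {label}")
```

In a full pipeline, the tree would first be learned from the development portion of the data, and the resulting rules would then be checked against the held-out test portion.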

Moubayed et al. (2018) and Moubayed et al. (2019) studied the problem of student engagement level identification in an e-learning environment using the K-means algorithm. In addition, the authors used the Apriori association rules algorithm to extract a set of rules relating student engagement to academic performance. Experimental results showed a positive relationship between students’ engagement level and their academic performance.
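
A minimal sketch of K-means clustering on a single engagement feature is given below. The feature (weekly logins), the data, the initial centroids, and the function name `kmeans_1d` are all assumptions made for illustration; the cited works use their own features and settings.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Plain K-means on one-dimensional data, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical weekly login counts for six students.
weekly_logins = [1, 2, 3, 10, 11, 12]
centroids, clusters = kmeans_1d(weekly_logins, centroids=[1, 12])
print(centroids, clusters)
```

With two clusters, the groups can be read as lower and higher engagement levels; more clusters would give a finer grading.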

Injadat et al. (2020c,a) proposed the use of optimized ML ensemble classification models to predict student performance during the course delivery time at two stages. The authors explored the use of various base learners such as SVM, K-NN, NB, RF, and neural

networks to form the ensembles and tested them on two diﬀerent datasets. Results showed

that their proposed ensemble models achieved high accuracy for the target class in both the

binary and multi-class cases despite the small number of instances available.

3.5.3 Research Opportunities:

There are still further research opportunities to use ML to provide personalized learning.

One opportunity is to consider more complex recommendation approaches by including

other factors such as learner motivation and knowledge level as well as additional personality

traits. Another opportunity is to study the performance of different classification algorithms to predict student performance during the course delivery. This can help identify students who may need help and provide them with a personalized plan to improve their predicted performance.

Table 1: Challenges, Previous Works, and Research Opportunities within Education Sector

Challenge: Essay Grading
Previous Work:
- Regular linear regression is used for essay grading (Mahana et al., 2012)
- Bayesian linear regression is used for essay grading (Ramalingam et al., 2018)
- SVM, RF, neural networks, and naive Bayes are used to evaluate reflective essays (Ullmann, 2019)
- Deep neural networks were used to classify essays based on five different potential traits (Mathias and Bhattacharyya, 2020)
Research Opportunity:
- Explore different ML models such as LR, DT, and DNN.
- Study the impact of additional language and usage-specific features as well as other polynomial basis functions.
- Compare the performance of the different models on various tasks to get a more accurate and robust essay grading framework.

Challenge: Dropout Prevention
Previous Work:
- Studied the performance of LLM to predict dropout (Coussement et al., 2020)
- Studied the performance of RF classification model for dropout prediction (Chung and Lee, 2019)
Research Opportunity:
- Study the performance of different models (base learners and ensemble learner models) in different course delivery settings.
- Investigate the impact of different student attributes on the dropout prediction frameworks.

Challenge: Intelligent Tutors
Previous Work:
- Sentiment analysis was proposed as part of a virtual study buddy framework (Di Pietro and Distefano, 2019)
- NB was used to predict the sentiment of text messages to be used to improve an affective intelligent tutoring system (Barron-Estrada et al., 2017)
Research Opportunity:
- Consider more features such as phonemic and history-based features as well as investigate their impact on the performance of the developed model.
- Study the performance of other classifiers such as NN and SVM to determine whether the classifiers previously proposed in the literature are biased.

Challenge: Course Recommendation
Previous Work:
- Combined K-means and Apriori algorithm to recommend courses (Aher and Lobo, 2013)
- Used a combination of K-means algorithm and collaborative filtering for course recommendation (Mondal et al., 2020)
- Used an improved version of Apriori association rules algorithm for course recommendation (Zhang et al., 2018a)
Research Opportunity:
- Evaluate the courses that other students have taken that are related to the skill the student is interested in using multiple metrics.
- Consider multiple supervised classification algorithms to study their impact on the effectiveness of the recommendation process.

Challenge: Personalized Learning
Previous Work:
- Combined a DT algorithm and induction rules algorithm to provide learning-pathway recommendations (Elfaki et al., 2014)
- Mined server logs to determine student learning style (Bourkoukou and El Bachari, 2018)
- Used K-means and Apriori algorithms to identify student engagement and its relation with academic performance (Moubayed et al., 2018; Moubayed et al., 2019)
- Used multiple optimized ensemble classification algorithms to predict student performance during course delivery (Injadat et al., 2020c,a)
Research Opportunity:
- Study the performance of different classification algorithms to predict student performance during the course delivery.
- Consider more complex recommendation approaches by including other factors.

Fig. 2: Potential Deployment of ML in Learning Management System (LMS)

Table 1 summarizes the challenges within the education sector, lists some of the previous

works, and presents the diﬀerent research opportunities. Furthermore, Figure 2 illustrates

the potential deployment framework of the ML modules within the Learning Management

System (LMS).

4 Healthcare

Another area where ML has shown promise is in the ﬁeld of healthcare. Many modern

medical organizations use electronic health records (EHRs) (Caban and Gotz, 2015). EHRs consist of heterogeneous data elements, including patient demographics, diagnoses, laboratory test results, previous prescriptions, and clinical notes (Xiao

et al., 2018). Patient data can also include imaging, sensor and text data (Miotto et al., 2018).

Furthermore, this data often comes in various formats, including structured, semi-structured

and weakly structured data (Holzinger et al., 2014). Originally it was thought that having

access to more information about individual patients would lead to more informed medical decisions. However, health professionals are often overwhelmed by the amount of information that is now available to them (Caban and Gotz, 2015). Hence, a challenge

with big data in healthcare is making the data easily interpretable for medical professionals.

ML oﬀers a solution to this problem because it can be used to identify relevant patterns in

complex data. This section discusses how ML algorithms can be used in applications such as predicting individual patients’ responses to cancer drugs, diabetes research, retinopathy detection, and cancer detection. Note that although ML can be applied to many applications within the field of healthcare, this section covers only some of the most prominent ones.

4.1 Drug Response Prediction

4.1.1 Challenge Description:

One way that ML can be applied to medical data is to predict an individual patient’s response

to a drug or drugs (Vidyasagar, 2015). For example, ML can be used to predict the responses

of individual cancer patients to therapeutic drugs (Huang et al., 2018). When working with

cancer patients, it is possible to use precision cancer medicine. Precision cancer medicine

aims to accurately predict the optimal drug therapies for a patient based upon the person-

alized molecular proﬁles of their tumors (Prasad et al., 2016). In order to provide precision

cancer medicine, it is necessary to search for signiﬁcant correlations between patient tumor

proﬁles and the output predictions of optimal drug responses in cancer-relevant datasets.

Once these correlations are found in previously established datasets, they can be used to

predict an individual patient’s response to various series of therapeutic drugs (Vidyasagar,

2015). As mentioned before, ML oﬀers a solution to this problem, because it can be used to

identify relevant patterns in complex data.

4.1.2 Previous Works:

Huang et al. (2018) applied their open-source SVM-based algorithm to the gene-expression

profiles of the tumors of 175 individual cancer patients. The algorithm was able to predict the

responses of these 175 individuals to a variety of standard-of-care chemotherapeutic drugs

with >80% accuracy.

Xia et al. (2018) also used ML to predict tumor cell line response to drug pairs. The

authors used a computational deep learning model to predict cell line response to a subset of

drug pairs in the National Cancer Institute-ALMANAC database. When the authors ranked

the drug pairs for each cell line based on the model’s predicted combination eﬀect, they

were able to determine 80% of the top drug pairs.

Chiu et al. (2019) also used deep neural networks (DNN) and the genomic proﬁles of

cancer tumors in order to predict the tumors’ responses to therapeutic drugs. The authors

created DeepDR, a deep neural network model, then trained it to learn the genetic back-

ground of tumors based on data from The Cancer Genome Atlas (TCGA). DeepDR was also

trained on pharmacogenomics data from human cancer cell lines provided by the Genomics

of Drug Sensitivity in Cancer (GDSC) Project. After training on these data sets, DeepDR

was applied to TCGA data again in order to predict the drug response of tumors. The au-

thors’ work provides insights into the ability of a deep neural network model to translate

pharmacogenomics features identiﬁed from in vitro drug screening to predict the response

of tumors.

Mucaki et al. (2019) used ML and genetic data to predict patients’ responses to chemotherapy. More specifically, the authors used a supervised support vector machine (SVM) to determine the gene sets whose expression was related to the specific tumor cell line GI50. The authors discovered that specific genes and functional pathways can be used to distinguish which tumor

cell lines are sensitive to chemotherapy drugs and which tumor cell lines are resistant to

chemotherapy drugs. They tested their algorithm on bladder, ovarian and colorectal cancer

patient data from The Cancer Genome Atlas (TCGA) in order to determine the response

of tumor cell line GI50 to three chemotherapy drugs (cisplatin, carboplatin and oxaliplatin).

Through experimental results, the authors found that for cisplatin, their algorithm was 71.0%

accurate at predicting disease recurrence and 59% accurate at predicting remission. In the

case of carboplatin, their algorithm was 60.2% accurate at predicting disease recurrence and

61% accurate at predicting remission. Finally, for oxaliplatin, their algorithm was 54.5%

accurate at predicting disease recurrence and 72% accurate at predicting remission. Fur-

thermore, in patients who used cisplatin and had a speciﬁc genetic signature, the algorithm

was able to predict 100% of recurrence in non-smoking bladder cancer patients and 79%

recurrence in smokers.

4.1.3 Research Opportunities:

Many research opportunities still exist in applying ML for drug response prediction. One

such opportunity is extending existing models to predict the drug responses of cancer

patients who are receiving emerging immuno- and other targeted gene therapies. This will

validate the comprehensiveness and generality of the considered frameworks. Another po-

tential research opportunity is to build more comprehensive models by using more drug

features (such as concentration, SMILES strings, molecular graph convolution and atomic

convolution). This again will help extract more information and potentially uncover more

correlations and inter-dependencies that can make the models more robust and accurate. A

third opportunity is to investigate other methods and techniques including semi-supervised

learning methods to encode molecular features with external gene expression and other types

of data. This particularly would be helpful given that access to labeled data is not always

possible. Therefore, having semi-supervised based ML models can help healthcare profes-

sionals gain insight from labeled data and apply it to the unlabeled data that they have.

Last but not least, researchers should also investigate ways to adapt existing models to other

drugs, cancer types, and diseases. This is essential as it would provide one adaptive sys-

tem that can help healthcare professionals from diﬀerent specializations make use of the

available data.

4.2 Diabetes Research

4.2.1 Challenge Description:

As mentioned before, many modern medical organizations use electronic health records

(EHRs) to store the medical data of patients. The large amounts of data present in EHRs

can be a valuable source for researching diabetes mellitus (DM). Kavakiotis et al. (2017)

discuss what DM is and why it is a medical concern. DM is a group of metabolic disorders

that are mainly caused by abnormal insulin secretion and/or action. Abnormal insulin secre-

tion can result in a patient’s body not producing enough insulin which causes the patient’s

metabolism of carbohydrates, fat and proteins to be impaired, which in turn results in ele-

vated blood glucose levels (hyperglycaemia).

There are two major clinical types of DM, type 1 diabetes (T1D) and type 2 diabetes

(T2D). T1D is linked to the auto-immunological destruction of the Langerhans islets; whereas

T2D is linked to lifestyle, little physical activity, poor dietary habits and heredity. The main

treatment for T1D is insulin administration, which can also be applied to T2D patients. However, the main treatment for T2D is improved diet, weight loss, exercise and oral medication. DM affects more than 200 million people worldwide, with 10% of those affected having T1D and 90% having T2D. DM poses a health threat as chronic hyperglycaemia results

in several complications, including diabetic nephropathy, retinopathy, neuropathy, diabetic

coma and cardiovascular disease.

4.2.2 Previous Works:

DM has high mortality and morbidity rates; therefore, detecting and treating DM is of high

interest to the medical community as well as those who may or already do suﬀer from DM

(Kavakiotis et al., 2017). In recent years, researchers have been able to apply ML algorithms

to the data of patients with DM in order to improve the methods of detecting and treating

DM.

Hemoglobin is a substance in red blood cells that carries oxygen to tissues. However,

it can also attach to sugar in the blood and form a substance called glycated hemoglobin

(HbA1c) (Michael Dansinger, 2019). A patient’s HbA1c level can be checked in order to

determine if they have T2D. Alternatively, a patient’s HbA1c level along with their fasting

blood glucose level and oral glucose tolerance test results can be used to determine if they

have T2D (Jelinek et al., 2016). Currently, to diagnose a patient with T2D, their HbA1c value must be at or above 6.5%. However, studies have shown that the cut-off value of 6.5% leads to inconsistencies in the diagnosis of T2D. Hence, using HbA1c with a 6.5% cut-off

value as a single marker for T2D may lead to undiagnosed cases of diabetes.

Jelinek et al. (2016) applied ML algorithms to the data of 840 patients from the Diabetes Health screening (DiabHealth) in order to identify an optimal cut-off value for HbA1c and to determine whether additional biomarkers could be used along with HbA1c to improve the diagnosis of T2D. The authors used T2D as the class feature and generated a conventional DT using an information gain (IG) measure. Using this algorithm, the authors found that including an oxidative stress marker (8-OHdG) in the model along with HbA1c increased the accuracy of detecting T2D at the 6.5% HbA1c level from 78.71% to 86.64%. The authors also found that including interleukin-6 (IL-6) in the model along with HbA1c increased the accuracy of detecting T2D from 78.71% to 85.63%. However, in this model, the optimal HbA1c range was between 5.73% and 6.22%.
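
The information gain measure used to choose decision tree splits can be sketched as follows. The HbA1c values and labels below are invented for illustration and are not data from Jelinek et al. (2016); the helper names `entropy` and `information_gain` are likewise assumptions. The sketch compares two candidate cut-off values on the toy data.

```python
import math

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n) for c in set(labels))

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting samples into value < threshold and value >= threshold."""
    left = [y for x, y in zip(values, labels) if x < threshold]
    right = [y for x, y in zip(values, labels) if x >= threshold]
    n = len(labels)
    remainder = sum(len(part) / n * entropy(part) for part in (left, right) if part)
    return entropy(labels) - remainder

# Hypothetical HbA1c values (%) and T2D labels (1 = T2D present).
hba1c = [5.2, 5.6, 5.9, 6.1, 6.4, 6.8, 7.2, 7.9]
has_t2d = [0, 0, 0, 1, 1, 1, 1, 1]

ig_65 = information_gain(hba1c, has_t2d, 6.5)
ig_60 = information_gain(hba1c, has_t2d, 6.0)
print(ig_60, ig_65)
```

On this toy data the 6.0% cut-off separates the classes perfectly and so yields the larger gain; a tree learner would pick it over 6.5% for that reason alone.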

Herrero et al. (2014) used ML to improve the treatment methods of T1D. Diabetics with T1D need insulin in order to maintain normal blood sugar levels. They must self-administer multiple daily injections of insulin, both before meals

and basally, in order to mimic the natural insulin secretion of the pancreas. Before diabet-

ics administer these injections, they must prick their ﬁngertip to draw blood that is placed

in an electronic glucose meter that determines the amount of glucose in the patient’s blood

(Schiﬀrin and Belmonte, 1982; A. Flyvbjerg and Goldstein, 2010). However, in recent years,

an alternative form of therapy, insulin pump therapy, has become available. In this therapy, insulin is provided by continuous subcutaneous

insulin infusion. The application of insulin by a machine allows diabetics to avoid multiple

uncomfortable ﬁnger pricks and injections. Insulin that is taken at meal times is referred to as

bolus insulin. Typically, bolus insulin doses are calculated by estimating carbohydrate intake

and dividing this number by a ﬁxed carbohydrate to insulin ratio, then adding a correction

dose derived from the individual’s insulin sensitivity factor (Herrero et al., 2014). Although

several algorithms have been developed to calculate bolus insulin dose (Jovanovic and Pe-

terson, 1982; Chanoch et al., 1985; Schiﬀrin et al., 1985; Chiarelli et al., 1990; Albisser,

2003; Owens et al., 2006), these algorithms have only been incorporated in commercially

available insulin pumps and in some glucose meters (Zisser et al., 2008). Moreover, these algorithms have not been widely adopted commercially due to economic risk, security issues, inertia to change, and lack of ease of use (Bellazzi, 2008). Based on these challenges,

Herrero et al. (2014) set out to create a more user-friendly bolus insulin calculating system.

The authors used a decision support algorithm that incorporated Run-To-Run (R2R) control

and case-based reasoning (CBR). They tested their algorithm via in-silico scenarios by using a simulator that emulated intra-subject insulin sensitivity variations and uncertainty in the capillary measurements and carbohydrate intake. Via these simulations, the authors found that the CBR(R2R) algorithm significantly reduced the mean blood glucose level and completely eliminated hypoglycemia. When the authors compared the CBR(R2R) algorithm to a standalone (R2R only) version of the algorithm, they found that the CBR(R2R) algorithm was better at reducing blood glucose levels, which was the goal of the algorithm, in both populations.
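
The conventional bolus calculation described earlier in this subsection (carbohydrate intake divided by a fixed carbohydrate-to-insulin ratio, plus a correction dose) can be written as a short sketch. The text only states that the correction is derived from the insulin sensitivity factor; the specific form used here, (current glucose - target glucose) / sensitivity factor, is an assumption based on common practice, and all names and numbers are hypothetical. It is an arithmetic illustration only, not the CBR(R2R) method of Herrero et al. (2014).

```python
def bolus_dose(carbs_g, carb_ratio_g_per_unit, glucose, target_glucose, sensitivity_per_unit):
    """Meal bolus plus correction bolus, in insulin units.

    The correction term (glucose - target) / sensitivity is an assumed,
    commonly used form; glucose values share one unit (e.g. mg/dL).
    """
    meal = carbs_g / carb_ratio_g_per_unit
    correction = (glucose - target_glucose) / sensitivity_per_unit
    return meal + correction

# Hypothetical numbers for illustration only.
dose = bolus_dose(
    carbs_g=60,                # estimated carbohydrate intake
    carb_ratio_g_per_unit=10,  # grams of carbohydrate covered by one unit
    glucose=180,               # current blood glucose
    target_glucose=120,        # target blood glucose
    sensitivity_per_unit=30,   # glucose drop per unit of insulin
)
print(dose)
```

Because the ratio and sensitivity factor are fixed in this formula, it cannot adapt to intra-subject variation, which is the gap that adaptive approaches such as run-to-run control aim to address.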

Fig. 3: Potential Deployment of ML in Diabetes Research

4.2.3 Research Opportunities:

Despite the fact that ML has been used in diabetes research, more opportunities still exist.

One suggestion is testing existing algorithms in the real-world via clinical trials. This is

particularly important given that simulation environments tend to over-estimate the benefits

of an intervention and may not always provide an accurate representation of the behavior of

the body. Another opportunity worth exploring is investigating the performance of diﬀerent

ML classiﬁcation models. This can help validate whether existing models have any bias.

Therefore, it is important to compare the performance of diﬀerent models to have a more

accurate and sensitive model for insulin calculation. Figure 3 provides a visualization of

how these topics ﬁt into a precision medicine framework.

4.3 Retinopathy Detection Through Image Classiﬁcation

4.3.1 Challenge Description:

Another healthcare-related area in which ML is playing a major role is the detection of

retinopathy. Retinopathy refers to the damage of the retina, the light-sensing inner part of the

human eye (Harvard Medical School, 2017). This can be due to multiple causes and diseases

which can lead to partial or complete vision loss. There are diﬀerent types of retinopathy

including: retinopathy of prematurity, diabetic retinopathy, hypertensive retinopathy, and

central serous retinopathy. Detecting the damage at an early stage is crucial to facilitate

the treatment and slow down the vision loss process. To that end, multiple previous works

proposed the use of ML algorithms and paradigms to accurately detect retinopathy.

4.3.2 Previous Works:

Bhatia et al. (2016) proposed the use of ensemble ML models to perform early detection of

diabetic retinopathy. More specifically, the authors proposed ensembles based on DT, AdaBoost, Naive Bayes, K-NN, RF, and SVM to detect this disease using features extracted from retinal images, such as the diameter of the optic disk, lesion-specific features, and image-level features.

Their experiments showed that the proposed models achieved a detection accuracy of up to

94%.

Similarly, Müller et al. (2020) also proposed the use of ensemble models to detect ABCA4-Related Retinopathy. The authors used a high-dimensional microstructural eye image dataset and extracted multiple features. The authors then developed different ensemble

learning models based on K-NN, RF, SVM, and eXtreme Gradient Boosting (XGBoost).

Their experimental results showed that the proposed model achieved detection accuracies

ranging between 86%-93%.

Reddy et al. (2020) also proposed the use of ensemble-based ML models to detect di-

abetic retinopathy. In their work, the authors considered multiple classiﬁers including RF,

DT, AdaBoost, K-NN, and Logistic Regression (LR). These methods were applied to a diabetic retinopathy dataset that was normalized using the min-max method. Results showed

that the proposed ensemble model outperformed the base models and achieved a detection

accuracy above 80%.

On the other hand, Gadekallu et al. (2020) proposed the combination of principal com-

ponent analysis (PCA) and DNN for the early detection of diabetic retinopathy. The authors

used a diabetic retinopathy dataset available at the UCI machine learning repository and normalized it using the Z-score technique. After normalization, PCA was applied to extract the

most signiﬁcant features. The reduced dataset was then given as an input to a DNN model

for classiﬁcation. The experimental results showed that the proposed DNN model achieved

training accuracies between 72%-82% and testing accuracies between 68%-79%.
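
The Z-score normalization step can be sketched as follows. The feature values are made up, the function name `z_score` is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which variant Gadekallu et al. (2020) used.

```python
import statistics

def z_score(column):
    """Rescale a feature column to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(column)
    std = statistics.pstdev(column)  # assumption: population, not sample, deviation
    return [(x - mean) / std for x in column]

# Hypothetical values of a single feature across eight samples.
feature = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = z_score(feature)
print(scaled)
```

Standardizing each feature in this way before PCA prevents features with large numeric ranges from dominating the principal components.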

Similarly, Zhang et al. (2019) proposed a DNN-based system for the automated identification and grading of diabetic retinopathy (DR). The proposed system uses transfer learning and ensemble learning to detect the presence and severity of DR from fundus images. The authors’ experimental results showed that their developed model has a high identification sensitivity of

97.5% and a speciﬁcity of 97.7%. On the other hand, the grading model achieved a sensitiv-

ity of 98.1% and a speciﬁcity of 98.9%.
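
Sensitivity and specificity, the two metrics reported above, can be computed from binary predictions as in the following sketch. The labels are invented for illustration and do not come from Zhang et al. (2019); the function name is hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 = disease present."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))  # true negatives
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical ground truth and predictions for ten fundus images.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

Sensitivity measures how many diseased cases are caught, while specificity measures how many healthy cases are correctly cleared; a screening system generally needs both to be high.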

4.3.3 Research Opportunities:

Despite the promising results shown by ML models for retinopathy detection, further op-

portunities still exist. One such opportunity is developing optimized ML models. This is

because many of the previous works in the literature only consider the default version of the

classiﬁers. However, it is important to explore the impact of diﬀerent optimization methods

on the overall performance of the classiﬁers. Another opportunity is to consider diﬀerent

deep learning models such as convolutional neural networks (CNN) and recurrent neural

networks (RNN). More specifically, CNNs and RNNs can be combined to extend the capabilities of traditional CNN models from binary to multi-label image classification, as illustrated by Wang et al. (2016) and Shi and Pun (2018). Such architectures have the potential to

further improve the performance of deep learning models for early detection of retinopathy.

4.4 Cancer Detection Through Image Classiﬁcation

4.4.1 Challenge Description:

Another prominent area in which ML is used within the healthcare ﬁeld is the detection of

diﬀerent types of cancer including breast cancer, prostate cancer, and lung cancer. This is

often done by applying ML methods to images of the diﬀerent organs or tissue suspected to

have cancer (Saba, 2020). Due to their success in image classiﬁcation problems in general,

ML models have been proposed in multiple research works from the literature to detect

cancer based on tissue images.

4.4.2 Previous Works:

Agarap (2018) investigated the performance of six diﬀerent ML algorithms to detect breast

cancer. More specifically, the author compared linear regression, multi-layer perceptron (MLP), K-NN, softmax regression, and two variants of the SVM algorithm. To evaluate

the performance of these algorithms, the Wisconsin diagnostic breast cancer dataset was

used which is composed of features extracted from digitized images of tests on breast mass.

Experimental results showed that the detection accuracy ranged between 93%-99% with the

MLP algorithm being the most accurate.

Shen et al. (2019) also proposed the use of ML algorithms for breast cancer detection. In this case, however, the authors developed a deep learning model, namely a CNN, that was applied to digitized film mammograms from the Digital Database for Screening Mammography. Experimental results showed that the developed CNN model achieved accuracies ranging from 63% to 99%, depending on whether the CNN was pre-trained. Moreover, the developed model achieved an area under the curve (AUC) value of up to 0.88, illustrating its effectiveness and robustness in detecting breast cancer.

On the other hand, Hussain et al. (2018) proposed the use of ML models for prostate cancer detection. To that end, the authors explored different ML algorithms such as a Bayesian approach, SVM with multiple kernels, and DT. The authors also investigated different feature extraction strategies based on texture, morphological, scale-invariant feature transform (SIFT), and elliptic Fourier descriptor (EFD) features. Experimental results showed that the SVM classifier with the RBF kernel achieved the highest accuracy, ranging from 98% to 99%.

Wu and Zhao (2017) proposed the use of ML algorithms to detect lung cancer based on computed tomography (CT) scan images. To that end, the authors proposed a neural network-based model, namely the entropy degradation method (EDM), to detect cancerous images. The performance of the proposed model was evaluated using high-resolution CT scan images provided by the National Cancer Institute. Experimental results showed that the proposed model achieved a detection accuracy of 77.8%, illustrating its effectiveness in detecting small-cell lung cancer at an early stage.

Table 2: Challenges, Previous Works, and Research Opportunities within Healthcare Sector

| Challenge | Previous Work | Research Opportunity |
|---|---|---|
| Drug Response Prediction | Applied an SVM-based algorithm to predict the responses of individuals to a variety of standard-of-care chemotherapeutic drugs (Huang et al., 2018) | Extend existing models to predict the drug responses of cancer patients who are receiving emerging immuno- and other targeted gene therapies. |
| | Developed a deep learning model to predict cell line response to a subset of drug pairs in the National Cancer Institute-ALMANAC database (Xia et al., 2018) | Build more comprehensive models by using more drug features. |
| | Used DNN and the genomic profiles of cancer tumors to predict the tumors’ responses to therapeutic drugs (Chiu et al., 2019) | Investigate other methods and techniques including semi-supervised learning methods on other types of data. |
| | Used SVM to determine the gene sets whose expression was related to the specific tumor cell line GI50 (Mucaki et al., 2019) | Investigate ways to adapt existing models to other drugs, cancer types, and diseases. |
| Diabetes Research | Used conventional DT to identify an optimal cut-off value for HbA1c (Jelinek et al., 2016) | Investigate the performance of different ML classification models to validate whether existing models have any bias. |
| | Used a decision support algorithm to calculate the bolus insulin levels (Herrero et al., 2014) | Test existing algorithms in the real world via clinical trials. |
| Retinopathy Detection Through Image Classification | Used ensemble ML models based on multiple algorithms to perform early detection of diabetic retinopathy (Bhatia et al., 2016) | Develop optimized ML models to further improve detection accuracy. |
| | Used ensemble models to detect ABCA4-Related Retinopathy (Müller et al., 2020) | Consider different deep learning models and architectures such as CNN and RNN. |
| | Proposed ensemble-based ML models to detect diabetic retinopathy (Reddy et al., 2020) | Use DNN techniques to build models that predict if a patient is diabetic or not. |
| | Proposed the combination of PCA and DNN for the early detection of diabetic retinopathy (Gadekallu et al., 2020) | |
| | Used DNN for the automated identification and grading system of diabetic retinopathy (Zhang et al., 2019) | |
| Cancer Detection Through Image Classification | Investigated the performance of six different ML algorithms to detect breast cancer (Agarap, 2018) | Develop optimized ML models to further improve detection accuracy. |
| | Proposed a deep learning CNN model for breast cancer detection (Shen et al., 2019) | Apply different types of ensemble models that can combine multiple ML classifiers to improve their effectiveness and robustness. |
| | Explored different ML classifiers to detect prostate cancer (Hussain et al., 2018) | Consider different deep learning models and architectures such as RNN. |
| | Proposed the EDM (a neural network-based model) to detect small-cell lung cancer (Wu and Zhao, 2017) | Explore different algorithms to improve the feature extraction and selection process. |


4.4.3 Research Opportunities:

As shown above, ML algorithms have been successfully applied to detect different types of cancer. However, there still exist further opportunities to improve the detection performance. One such opportunity is to consider different hyper-parameter optimization methods to improve the performance of the ML models. Another potential opportunity is applying different types of ensemble models that combine multiple ML classifiers to improve their effectiveness and robustness. A third opportunity is studying other deep learning techniques and architectures, such as combined CNN-RNN models, to investigate their effectiveness in detecting cancer. An additional opportunity is to improve the algorithms that extract and select features from digital images. This is crucial because the output of this process is the input to the ML model development stage. Therefore, it is important to extract and select relevant and high-quality features to be fed to the ML models under consideration.
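To make the ensemble opportunity concrete, the following is a minimal sketch of hard majority voting, one of the simplest ways to combine several classifiers. The function name `majority_vote` and the toy predictions are illustrative assumptions and are not taken from any of the surveyed works.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    predictions: list of lists, one inner list per classifier,
    all of the same length (one label per sample).
    """
    combined = []
    # zip(*predictions) groups the votes cast for each sample.
    for votes in zip(*predictions):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical outputs of three classifiers on four images
# (1 = cancerous, 0 = benign).
clf_a = [1, 0, 1, 1]
clf_b = [1, 1, 0, 1]
clf_c = [0, 0, 1, 1]
ensemble = majority_vote([clf_a, clf_b, clf_c])
```

With an odd number of binary classifiers there are no ties, so each individual error above is outvoted by the other two classifiers.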

Similar to the previous section on education, Table 2 summarizes some of the challenges

facing the healthcare sector, lists some of the previous works, and presents the diﬀerent re-

search opportunities.

5 Network Security

Turning to a different sector, ML can also be beneficial in network security. Cisco Systems, Inc., an American multinational technology conglomerate that specializes in information technology, networking, and cybersecurity solutions, defines network security as any activity designed to protect the usability and integrity of a network and its data (Cisco, 2019). According to Cisco Systems, Inc., network security allows authorized users to access a network while preventing outside threats from entering or spreading on it (Moubayed et al., 2018, 2020). Cisco Systems, Inc. lists fourteen types of network security. However, this section focuses on Intrusion Detection Systems (IDSs). IDSs monitor and analyze network traffic in order to determine whether the traffic patterns show normal activity or signs of malicious activity (Javaid et al., 2016; Sommer and Paxson, 2010). More specifically, this section discusses how ML can be used to improve network intrusion detection systems (NIDS) in general, how to better detect botnets, and how to improve intrusion detection in vehicles.

5.1 Network Intrusion Detection Systems

5.1.1 Challenge Description:

A Network Intrusion Detection System (NIDS) helps system administrators to detect net-

work security breaches in their organizations (Javaid et al., 2016; Salo et al., 2018). NIDSs

are classiﬁed based on the style of detection that they use. Misuse-detection NIDSs use pre-

cise descriptions of known malicious behavior. Anomaly-detection NIDSs ﬂag deviations

from normal activity. Speciﬁcation-based NIDSs deﬁne allowed types of activity and ﬂag

any other activity as forbidden. Behavioral detection NIDSs analyze patterns of activity and

surrounding context to ﬁnd secondary evidence of attacks. Although there are many types

of NIDS, misuse-detection and anomaly-detection NIDSs are the most common (Sommer

and Paxson, 2010).

Misuse-detection NIDSs are also referred to as signature-based NIDSs (SNIDSs). In an SNIDS, attack signatures are pre-installed in the NIDS and pattern matching is then performed between network traffic and the installed signatures. When a match is found, the traffic is considered an intrusion. There are advantages and disadvantages to both SNIDSs and anomaly-detection NIDSs (ADNIDSs). SNIDSs are effective in the detection of known attacks and show high detection accuracy with lower false-alarm rates, but they are ineffective at detecting unknown or new attacks whose signatures have not been installed on the IDS. On the other hand, ADNIDSs are the better option for the detection of unknown and new attacks, but they produce high false-positive rates. The current deployment frameworks and usage patterns of NIDSs make it hard for these systems to be efficient and flexible when considering unknown future attacks (Javaid et al., 2016).
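The contrast between the two detection styles can be illustrated with a small sketch. It is a toy under stated assumptions: the signature patterns, the use of packet size as the only traffic feature, and the three-standard-deviation threshold are all hypothetical choices made for illustration, not part of any surveyed system.

```python
from __future__ import annotations

from statistics import mean, stdev

# Hypothetical signature database: byte patterns of known attacks.
SIGNATURES = {
    "sql_injection": b"' OR 1=1",
    "path_traversal": b"../../",
}


def signature_alerts(payload: bytes) -> list[str]:
    """Misuse detection: traffic that MATCHES an installed signature is flagged."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]


def fit_baseline(normal_sizes: list[float]) -> tuple[float, float]:
    """Anomaly detection, step 1: summarize normal traffic (here, packet sizes)."""
    return mean(normal_sizes), stdev(normal_sizes)


def is_anomalous(size: float, baseline: tuple[float, float], z_max: float = 3.0) -> bool:
    """Anomaly detection, step 2: flag large deviations from the normal profile."""
    mu, sigma = baseline
    return abs(size - mu) > z_max * sigma


baseline = fit_baseline([500, 510, 495, 505, 490])
```

The signature check can only fire on patterns it already knows, whereas the anomaly check fires on anything unusual, including benign but rare traffic, which is the source of its higher false-positive rate.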

5.1.2 Previous Works:

Javaid et al. (2016) proposed a solution to the challenges of using NIDSs to detect known and unknown future attacks. The authors’ solution involved a deep learning approach known as Self-Taught Learning (STL). The authors verified their method on the benchmark intrusion dataset NSL-KDD, an improved version of the former benchmark intrusion dataset KDD Cup 99. The authors reported various metrics for their algorithm, including accuracy, precision, recall, and f-measure values. Experimental results showed that the authors’ algorithm achieved a classification accuracy above 98%.

Injadat et al. (2018) proposed using Bayesian optimization to tune the hyper-parameters of different supervised ML algorithms for anomaly-based IDSs. More specifically, they tuned the parameters of the SVM, Random Forest (RF), and k-NN algorithms. The authors then evaluated the performance of the regular and optimized versions of these classifiers in terms of accuracy, precision, and false alarm rate. Their experimental results showed that the proposed framework achieved high accuracy, precision, and recall together with a low false alarm rate.
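The evaluation metrics mentioned in the works above can all be derived from the four confusion-matrix counts. The sketch below uses the standard definitions, treating "positive" as "labeled as an attack"; the function name and the example counts are illustrative assumptions rather than figures from any cited paper.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common IDS metrics from confusion-matrix counts.

    tp: attacks correctly flagged      fp: normal traffic wrongly flagged
    tn: normal traffic correctly passed fn: attacks that were missed
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # also called detection rate
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        # Share of normal traffic that raised an alarm.
        "false_alarm_rate": fp / (fp + tn) if fp + tn else 0.0,
        # Harmonic mean of precision and recall.
        "f_measure": (2 * precision * recall / (precision + recall)
                      if precision + recall else 0.0),
    }


# Hypothetical test set: 100 attack records and 900 normal records.
m = detection_metrics(tp=90, fp=5, tn=895, fn=10)
```

A good IDS aims for high accuracy, precision, and recall together with a low false alarm rate, since every false alarm costs analyst time.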

Injadat et al. (2020b) extended this work by proposing a novel multi-stage optimized ML-based NIDS framework. The goal of this framework is to reduce the computational complexity while maintaining the detection performance. The performance of the proposed framework was measured using two state-of-the-art intrusion detection datasets, the CICIDS 2017 and the UNSW-NB 2015 datasets. Experimental results showed that the proposed model significantly reduced the required training sample size and feature set size. More specifically, the model reduced the training sample size by 74% and the feature set size by up to 50%. Moreover, hyper-parameter optimization helped improve the model performance, with detection accuracies over 99% for both datasets. This represents an improvement of 1-2% in accuracy and a reduction of 1-2% in false alarm rate when compared to other works from the literature.

Salo et al. (2019) proposed an ensemble feature selection and an anomaly detection

method for network intrusion detection. The proposed framework combined unsupervised

and supervised ML techniques to classify network traﬃc and identify previously unseen at-

tack patterns. To that end, the authors used three diﬀerent feature selection techniques that

identiﬁed 8 common and representative features. Moreover, the authors adopted k-Means

clustering to segregate the training instances and developed the classiﬁcation model ac-

cordingly. Their experimental results showed that the proposed framework was eﬀective in

detecting previously unseen attack patterns in comparison to the traditional classiﬁcation

approaches.

Wang et al. (2017) proposed the use of an SVM with augmented features for their intrusion detection framework. More specifically, the authors used the logarithm marginal density ratios transformation to obtain better-quality features. Using the NSL-KDD dataset, their experiments showed that the proposed framework achieved better performance in terms of accuracy, detection rate, false alarm rate, and efficiency.
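The idea behind a logarithm marginal density ratio transformation can be sketched as follows: each feature value is replaced by the log of the ratio between its estimated density under the attack class and under the normal class. This is only a sketch under a loud assumption: the per-class marginal densities are modeled here as Gaussians, which is a simplification chosen for illustration and not necessarily the estimator used by Wang et al. (2017). All function names are hypothetical.

```python
import math
from statistics import mean, stdev


def gaussian_log_pdf(x, mu, sigma):
    """Log-density of a normal distribution (the ASSUMED marginal density)."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def fit_marginals(rows, labels):
    """Estimate (mu, sigma) per feature and per class (0 = normal, 1 = attack)."""
    params = []
    for j in range(len(rows[0])):
        per_class = {}
        for cls in (0, 1):
            values = [r[j] for r, y in zip(rows, labels) if y == cls]
            # Guard against zero spread so the log-density stays finite.
            per_class[cls] = (mean(values), max(stdev(values), 1e-9))
        params.append(per_class)
    return params


def log_density_ratio(row, params):
    """Map each feature x_j to log p_j(x_j | attack) - log p_j(x_j | normal)."""
    return [gaussian_log_pdf(x, *p[1]) - gaussian_log_pdf(x, *p[0])
            for x, p in zip(row, params)]


# Toy training data with two features per record.
rows = [[1.0, 10.0], [1.2, 11.0], [0.8, 9.0],   # normal
        [5.0, 10.5], [5.5, 9.5], [4.5, 10.0]]   # attack
labels = [0, 0, 0, 1, 1, 1]
params = fit_marginals(rows, labels)
```

Positive transformed values indicate that a feature value is more typical of attack traffic, negative values that it is more typical of normal traffic. The transformed features can then be fed to a classifier such as an SVM.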

5.1.3 Research Opportunities:

Although the use of ML for network intrusion detection is popular, it still requires further re-

search. One potential opportunity is to study the performance of more complex models such

as bagging ensemble models or deep learning models. This is particularly important for

real-time or near real-time network intrusion detection. Another opportunity is to study the

impact of diﬀerent optimization models and techniques in enhancing the current intrusion

detection frameworks and models. A third research opportunity is to consider time-series

analysis techniques to identify and detect temporal-based anomalies and intrusion attempts.

This is crucial given that many attacks such as denial-of-service (DoS) attacks span a period

of time rather than being instantaneous. Another opportunity is to investigate the perfor-

mance of reinforcement learning and transfer learning techniques in IDSs. This is based on

the fact that such techniques have the potential to make the IDSs more ﬂexible and eﬀective.

Although some works, such as that of Nguyen et al. (2020), have considered the use of deep learning models and time-series analysis, this should be extended to other network security problems and not limited to IDSs.
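As a rough illustration of the time-series opportunity, the sketch below flags a traffic rate only when it stays above a rolling baseline for several consecutive samples, reflecting the observation that attacks such as DoS span a period of time. The window size, threshold, and run length are hypothetical parameters chosen for the toy data; this is a simple rolling z-score heuristic, not a method from the surveyed literature.

```python
from collections import deque
from statistics import mean, stdev


def sustained_anomalies(rates, window=5, z_max=3.0, min_run=3):
    """Return indices at which the rate has exceeded the rolling baseline
    for at least `min_run` consecutive samples."""
    history = deque(maxlen=window)  # recent samples considered normal
    run = 0
    alerts = []
    for i, rate in enumerate(rates):
        if len(history) == window:
            mu, sigma = mean(history), stdev(history)
            if rate > mu + z_max * max(sigma, 1e-9):
                run += 1
                if run >= min_run:
                    alerts.append(i)
                # Skip the append so suspected attack traffic
                # does not pollute the baseline.
                continue
        run = 0
        history.append(rate)
    return alerts


# Requests per second: a sustained flood versus a single spike.
flood = [100, 102, 98, 101, 99, 103, 900, 950, 920, 940, 100]
spike = [100, 102, 98, 101, 99, 103, 900, 100, 101]
```

Requiring a run of anomalous samples trades detection delay for fewer alarms on isolated spikes.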

5.2 Botnets Detection

5.2.1 Challenge Description:

The term botnet refers to a network of computers (bots) which have been compromised

by an attacker (aka botmaster) who has installed malicious software on the network via an

attacking technique such as trojan horses, worms and viruses. Botmasters often choose to

attack computer networks that contain many computers due to the large amounts of band-

width and powerful computing capabilities available for such networks. Once the botmaster

has control of a network, they use the network to initiate various malicious activities such

as email spam, distributed denial-of-service (DDoS) attacks, password cracking, and key

logging (Wang et al., 2015; Moubayed et al., 2020a; Injadat et al., 2020).

Zaidi and Tanveer (2017) divided the botnet life-cycle into three phases: (1) formation, (2) command and control (C&C), and (3) botnet application. In the formation phase, the botmaster infects other machines on the Internet, turning them into bots on the botnet. In the C&C phase, bots receive instructions from the botmaster. During the botnet application phase, the bots carry out malicious activities based on the instructions of the botmaster. Although some bots might be detected and removed from the botnet, the botmaster will continue to probe the botnet in a stealthy manner for information about active bots and will plan to form a new botnet (Vormayr et al., 2017).

One common type of botnet is the Internet relay chat (IRC) botnet. This botnet uses IRC to facilitate C&C communication between bots and botmasters. IRC botnets can connect to one or more servers, making it easy for the botmaster to execute commands. However, an IRC botnet can be stopped by shutting down its C&C server. Once attackers realized this central flaw of IRC botnets, they began to utilize peer-to-peer (P2P) botnets. In a P2P botnet, there is no centralized server; bots are connected to each other and act as both C&C server and client. Therefore, even if a P2P botnet loses some of its bots, its communication will not be disrupted. According to Wang et al. (2015), botnets have become one of the most significant threats to the Internet.

5.2.2 Previous Works:

McDermott et al. (2018) proposed a deep learning-based model to detect botnet activity

within IoT devices and networks. More speciﬁcally, the authors developed a Bidirectional

Long Short Term Memory based Recurrent Neural Network (BLSTM-RNN). The detec-

tion model was compared to a default LSTM-RNN. The performance was evaluated using

the accuracy and loss metrics. Experimental results demonstrated that the proposed model

achieved a detection accuracy ranging between 92%-99%, highlighting the eﬀectiveness of

RNN-based models for botnet detection.

Pektaş and Acarman (2017) investigated the ability of three ML algorithms (RF, LR, and SVM) to effectively select features for botnet detection during network flow analysis. More specifically, the authors investigated three different feature selection methods, namely Lasso linear regression models, Recursive Feature Elimination (RFE), and tree-based feature selection, along with three different classifiers (LR, NB, and RF). The authors found that applying the RF meta-classifier to the features selected by RF yielded an accuracy of nearly 99.9%, making it the most accurate model tested and achieving almost perfect classification of botnet and normal traffic.

Chen et al. (2017) also discussed using ML to detect botnets. According to the authors,

in the past, signature-based and anomaly-based intrusion detection systems (IDS) were used

to detect botnets. However, as the speed of the Internet has increased, these methods are no

longer as eﬀective. The authors proposed a method that uses conversation-based network

traffic analysis and supervised ML to identify malicious botnet traffic. The authors showed

that their approach outperformed other approaches which are based on network ﬂow analy-

sis. More speciﬁcally, the authors’ model resulted in a 13.2% decrease in the false positive

rate of botnet traﬃc detection. Furthermore, it was shown that the RF algorithm had a high

detection accuracy (93.6%) and a low false positive rate (0.3%).

5.2.3 Research Opportunities:

There are still many research opportunities in the use of ML for botnet detection that are worth exploring. One such opportunity is investigating whether hybrid ML models can satisfy all the requirements of an online botnet detection framework. Another potential opportunity is to consider non-numerical features as part of any botnet detection model, since such features may contain valuable information. A third opportunity is studying the impact of different optimization models and techniques on the performance of current botnet detection models.

5.3 Intrusion Detection in Vehicles

5.3.1 Challenge Description:

In recent years, the conventional mechanical control parts in cars have largely been replaced by Electronic Control Units (ECUs) (Liu et al., 2017). ECUs are computing devices that are used for controlling and monitoring the subsystems of a vehicle for energy efficiency enhancement and noise and vibration reduction (Kang and Kang, 2016; Moubayed et al., 2020b). The use of computing devices in vehicles has led to the use of automotive networking services such as Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) services. V2V services require computing devices to perform inter-vehicular communication, while V2I services require computing devices to communicate with roadside infrastructure (Moubayed and Shami, 2020).

One standard communication protocol for in-vehicle network communication is the Controller Area Network (CAN). CAN connects sensors and actuators with ECUs (Yang et al., 2019a). Important information such as diagnostic, informative, and control data is delivered through the CAN bus, and it is important that this information is secured in order to keep the driver safe. However, whenever networks are used, there is potential for significant security concerns, and in-vehicle networks have several security flaws. For example, an ECU can receive any ECU-to-ECU message broadcast on the same bus, but it is unable to identify the sender (Kang and Kang, 2016).
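The broadcast weakness described above can be made concrete with a toy model of a CAN bus. The class names below are hypothetical and the model omits arbitration, error handling, and timing. It only shows that a frame carries an identifier and data but no authenticated sender, so a receiving ECU sees identical frames regardless of which node transmitted them.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CanFrame:
    arbitration_id: int  # identifies the message type and priority, not the sender
    data: bytes          # payload (up to 8 bytes in classical CAN)


@dataclass
class Ecu:
    name: str
    received: list = field(default_factory=list)

    def on_frame(self, frame: CanFrame) -> None:
        # The receiver only ever sees the frame itself.
        self.received.append(frame)


class CanBus:
    """Toy broadcast bus: every attached node except the sender gets each frame."""

    def __init__(self):
        self.nodes = []

    def attach(self, ecu: Ecu) -> None:
        self.nodes.append(ecu)

    def send(self, sender: Ecu, frame: CanFrame) -> None:
        for node in self.nodes:
            if node is not sender:
                node.on_frame(frame)  # the sender's identity is not transmitted


bus = CanBus()
engine, brake, rogue = Ecu("engine"), Ecu("brake"), Ecu("rogue")
for ecu in (engine, brake, rogue):
    bus.attach(ecu)

bus.send(engine, CanFrame(0x0C0, b"\x10"))  # legitimate message
bus.send(rogue, CanFrame(0x0C0, b"\x10"))   # spoofed copy from a compromised node
```

From the brake ECU's point of view the two frames are indistinguishable, which is why an IDS has to rely on traffic patterns rather than sender identity.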

5.3.2 Previous Works:

Motivated by concerns about the security of in-vehicular networks, especially the CAN bus, Kang and Kang (2016) created an intrusion detection system that uses a deep neural network (DNN). The authors’ DNN was able to detect intrusions more accurately than a traditional ANN. According to the authors, this increased accuracy is due to the deep learning framework, which allows for the initialization of parameters through the unsupervised pre-training of deep belief networks (DBN). Finally, using experimental results, the authors showed that their algorithm can provide a real-time response to an attack with an average detection ratio of 98%.

In a similar manner, Yang et al. (2019a) proposed an IDS for autonomous and connected vehicles using DT structures. The goal of the IDS is to detect network attacks both within the vehicle and external to it. Experimental results showed that the proposed framework improved the detection accuracy, detection rate, and F1 score by close to 2-3% and achieved a lower false alarm rate than other traditional methods proposed in the literature. Moreover, the developed IDS detected various types of attacks. The proposed model achieved accuracies of 100% and 99.86% on the CAN intrusion and CICIDS2017 datasets, respectively. Additionally, it reduced the computational time by 73.7% to 325.6 s and by 38.6% to 2774.8 s, respectively.

Zeng et al. (2019) proposed a deep learning-based intrusion detection model composed of a combination of CNN and LSTM to detect malware traffic for on-board units. The proposed model is fed the raw traffic instead of the human-extracted private information features. The performance of the proposed model was compared with previous methods on a public dataset and a simulated real-life VANET dataset. Experimental results showed that the proposed model outperformed the other methods, achieving a precision between 95% and 99% and an F1-score between 0.92 and 0.99.

5.3.3 Research Opportunities:

There are still ample research opportunities to integrate ML as part of IDSs for vehicular networks. For example, it is worth exploring the impact of different optimization techniques and meta-heuristics, such as particle swarm optimization and Bayesian optimization, to tune the hyper-parameters of existing IDS models (Yang and Shami, 2020). This should be done in order to improve the overall performance of such models. Another potential research opportunity is developing more complex and hybrid ML systems that can detect both known and unknown attacks in vehicular networks. This is particularly important since more novel attacks targeting autonomous and connected vehicles are being introduced.

Fig. 4: Potential Deployment of ML in Network Security

Table 3: Challenges, Previous Works, and Research Opportunities within Network Security Field

| Challenge | Previous Work | Research Opportunity |
|---|---|---|
| Network Intrusion Detection Systems | Used a deep learning approach to detect network intrusions (Javaid et al., 2016) | Study the performance of more complex models such as bagging ensemble models, deep learning models, reinforcement learning, and transfer learning. |
| | Used Bayesian optimization to tune the hyper-parameters of three classification algorithms for anomaly-based IDSs (Injadat et al., 2018) | Study the impact of different optimization models and techniques in enhancing the current intrusion detection frameworks and models. |
| | Proposed a novel multi-stage optimized ML-based NIDS framework that reduced the computational complexity while maintaining its detection performance (Injadat et al., 2020b) | Consider time-series analysis techniques to identify and detect temporal-based anomalies and intrusion attempts. |
| | Proposed an ensemble feature selection and an anomaly detection method for network intrusion detection (Salo et al., 2019) | Explore the performance of NIDS using recent datasets such as CICIDS2017, CSE-CIC-IDS2018 and Kyoto 2006+. |
| | Used SVM with augmented features for their intrusion detection framework (Wang et al., 2017) | |
| Botnets Detection | Proposed a BLSTM-RNN model to detect botnets (McDermott et al., 2018) | Investigate whether hybrid ML models can satisfy all the requirements of an online botnet detection framework. |
| | Used three ML algorithms to perform botnet detection during network flow analysis (Pektaş and Acarman, 2017) | Study the impact of different optimization models and techniques on current botnet detection frameworks and models. |
| | Used conversation-based network traffic analysis and supervised ML to identify malicious botnet traffic (Chen et al., 2017) | Consider non-numerical features as part of any botnet detection models since such features may contain valuable information. |
| Intrusion Detection in Vehicles | Used a deep neural network (DNN) for intrusion detection in vehicles (Kang and Kang, 2016) | Explore the impact of different optimization techniques and meta-heuristics to tune the hyper-parameters of existing IDS models. |
| | Proposed a DT-based IDS for autonomous/connected vehicles (Yang et al., 2019a) | Develop more complex and hybrid ML systems that can detect both the known and unknown attacks in vehicular networks. |
| | Proposed a deep learning-based intrusion detection model composed of a combination of CNN and LSTM to detect malware traffic for on-board units (Zeng et al., 2019) | |

Table 3 summarizes the previously discussed challenges and presents some of the literature work that has been conducted within this field. Moreover, it lists some of the potential research opportunities in which ML can play a role. In addition, Figure 4 provides a visualization of how Network Intrusion Detection Systems (NIDS), botnet detection, and intrusion detection in vehicles fit into the Detect function of the NIST Cybersecurity Framework.

6 Banking & Finance

Moving on from network security, ML also has applications in the banking and finance sectors. After the financial crises of the 1980s and 1990s, risk assessment of financial intermediaries became a hot topic. “A financial intermediary is an entity that acts as the middleman between two parties in a financial transaction, such as a commercial bank, investment banks, mutual funds and pension funds” (Chen, 2019). Researchers such as Chen et al. (2016) believe that ML algorithms can be used to predict individual risk in the credit portfolios of institutions. In turn, this will help in determining who will and will not repay various forms of credit (e.g., loans, mortgages, and credit cards). Khandani et al. (2010) echo this sentiment as they discuss the importance of using “hard” information (e.g., characteristics contained in consumer credit files collected by credit bureau agencies) to determine the creditworthiness of consumers. In the past, human discretion was used to make this determination. However, ML offers a way to determine the creditworthiness of consumers based on vast amounts of hard information. This section discusses how ML can be used to assess the credit risk of potential borrowers, predict whether borrowers will go bankrupt, and predict currency crises.

6.1 Credit Risk Assessment

6.1.1 Challenge Description:

With the increased dependency on mortgages and banks for financial support, credit risk assessment has garnered significant interest from both practitioners and researchers. It is especially crucial for financial institutions to be able to differentiate between “good” and “bad” applicants to minimize their risk (Bao et al., 2019). This applies to both individual applicants and small and medium-sized enterprise (SME) applicants (Zhu et al., 2019). Multiple factors are typically considered when using traditional assessment systems (Zhang et al., 2018b). However, an applicant’s dynamic transaction history, an important indicator of the applicant’s trustworthiness and creditworthiness, is often not considered. Additionally, in the case of SMEs, the enterprise’s “self-oriented” factor and “supply chain finance-oriented” factor are often neglected when assessing credit risk. Therefore, it is important for any credit assessment system to consider multiple factors to better utilize available resources.


6.1.2 Previous Works:

Bao et al. (2019) proposed the combined use of unsupervised and supervised ML models

to assess the credit risk of individuals. More speciﬁcally, the authors explored two diﬀerent

clustering models, namely k-means and the self-organizing map (SOM) in addition to seven

potential supervised classiﬁcation models including LR, DT, gradient boosting decision tree

(GBDT), RF, SVM, K-NN, and artiﬁcial neural networks (ANN). The authors studied the

performance of their proposed models using three datasets from China, Germany, and Aus-

tralia respectively. Experimental results showed that the detection accuracy of the potential

models ranged between 81%-91% for the Chinese dataset, between 64%-79% for the Ger-

man dataset, and 64%-86% for the Australian dataset.

Zhu et al. (2019) proposed a hybrid ensemble ML model to assess the credit risk of

SMEs in supply chain ﬁnance. The authors integrated two ensemble learning models, namely

the random subspace (RS) model and the multi-boosting model based on DT algorithm to

improve the performance of the credit risk assessment process. The performance of the pro-

posed model was evaluated on a Chinese dataset collected between 31 March 2014 and 31

December 2015. Experimental results showed that the assessment accuracy ranged between

67%-84% with the hybrid RS-multi-boosting model achieving the highest accuracy.

On the other hand, Xu and He (2020) proposed a deep learning model for SME credit risk assessment. More specifically, the authors proposed a DBN composed of the Restricted Boltzmann Machine (RBM) and a Softmax classifier to predict the credit risk of SMEs working in the online supply chain space. The authors evaluated the performance of their proposed model using three different datasets and compared it to the performance of SVM and LR models. Experimental results showed that the proposed DBN achieved the highest accuracy of 96%, compared to 82% and 87% for the LR and SVM models, respectively.

6.1.3 Research Opportunities:

Although the literature shows that ML has great potential for credit risk prediction, there are still research opportunities in this field. One potential opportunity is optimizing the hyper-parameters of the ML models considered. As shown in the previous works, most of the proposed models only consider default parameters without any attempt to optimize them, which may result in reduced performance. Another potential opportunity is exploring the performance of different models that can make short-, medium-, and long-term risk predictions rather than short-term predictions only. In a similar manner to the work in (Lei et al., 2019), a third opportunity is to consider other deep learning models and architectures such as CNN, RNN, DBNs, and Generative Adversarial Networks (GANs) to investigate the improvement in credit assessment accuracy.
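To illustrate why default hyper-parameters can leave performance on the table, the sketch below tunes the number of neighbors of a toy k-NN classifier with an exhaustive grid search on a hold-out set. Grid search is used here only as the simplest stand-in for the more sophisticated optimization methods discussed in this survey (such as Bayesian optimization). The one-dimensional "risk score" feature, the data, and all function names are hypothetical.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Predict the label of x from its k nearest training points (1-D feature)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, valid, k):
    """Fraction of validation points classified correctly for a given k."""
    hits = sum(knn_predict(train, x, k) == y for x, y in valid)
    return hits / len(valid)


def grid_search(train, valid, candidates):
    """Evaluate every candidate k and return the best one with all scores."""
    scores = {k: accuracy(train, valid, k) for k in candidates}
    best = max(scores, key=scores.get)  # first candidate with the top score
    return best, scores


# Toy applicants: (risk score, label) with 0 = repaid and 1 = defaulted.
# The point (2.2, 1) plays the role of a noisy label.
train = [(1.0, 0), (1.5, 0), (2.0, 0), (2.2, 1), (6.0, 1), (6.5, 1), (7.0, 1)]
valid = [(1.8, 0), (2.3, 0), (6.2, 1), (1.9, 0)]

best_k, scores = grid_search(train, valid, candidates=[1, 3, 5])
```

On this toy data, a single neighbor is misled by the noisy training point, while larger neighborhoods are not, so the tuned model outperforms the k = 1 setting.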

6.2 Bankruptcy Prediction

6.2.1 Challenge Description:

When selecting potential clients one aspect of their amount of credit risk is the probabil-

ity that they will go bankrupt. Quantitative risk management systems, which are based on

ML models, can provide ﬁnancial institutions with early warning signs of clients whose po-

tential business may fail (Antunes et al., 2017). Such failure can result in bankruptcy and

the client defaulting on their bank payments. In turn, this can have a devastating impact on

the ﬁrm owner, society, and the country’s overall economy (Alaka et al., 2018). This would

force governments to increase their rescue plans in order to maintain the economic growth

of the country which is a challenging task in itself. Prior work has used linear probability

and multivariate conditional probability models, the recursive partitioning algorithm, arti-

ﬁcial intelligence, multi-criteria decision making, and mathematical programming in order

to predict a person’s amount of credit risk. However, the performance of the previously

proposed models heavily depends on the features and data collected.

6.2.2 Previous Works:

Kim et al. (2019) discuss how the ﬁnancial sustainability of a company can maintain the

soundness of the state and society. They further discuss how the sustainability of ﬁnancial

institutions is directly dependent on the ﬁnancial sustainability of the bank’s borrowers.

Hence, it is important for ﬁnancial institutions to evaluate the sustainability of their borrow-

ers, which is often done with the corporate ﬁnancial distress prediction model. The authors

propose a novel hybrid SVM model that uses globally optimized SVMs (GOSVM) and

the genetic algorithm (GA) to predict potentially distressed borrowers. GOSVM optimizes

feature selection, instance selection, and kernel parameters; while GA simultaneously opti-

mizes multiple heterogeneous design factors of SVMs. The authors trained and tested their

model on real-world data from H commercial bank in Korea. The authors randomly chose

1,548 heavy industry companies, 774 of which had ﬁled for ﬁnancial distress between 1999

and 2002, and 774 which were non-bankrupt in this same time period. Experimental re-

sults showed that the proposed GOSVM model outperformed both non-SVM based models

and other SVM-based models at accurately predicting ﬁnancial distress during the hold-out

phase. Based on these results, the authors concluded that their model improves the predic-

tion accuracy of conventional SVMs.

Similarly, Barboza et al. (2017) proposed the use of diﬀerent ML models to predict

bankruptcy and default events of companies and institutions. More speciﬁcally, the authors

studied four models, namely SVM, DT bagging, DT boosting, and RF models in comparison

with other traditional models such as LR and ANN. Experimental results showed that the

proposed models achieved higher prediction accuracy during both the training and testing

stages. In particular, the bagging, boosting, and RF models all achieved a training accuracy

above 96% and a testing accuracy between 86%-87% as compared to the LR and ANN

methods which achieved a training accuracy between 82%-84% and a testing accuracy be-

tween 72%-76%.

Lin et al. (2019) also proposed the use of ensemble learning models in combination with

feature selection as part of their bankruptcy prediction models. To that end, the authors inves-

tigated two feature selection methods, namely information gain and genetic algorithm. The

authors also explored six ML models including LR, NB, ANN, DT, SVM, and K-NN. Ex-

perimental results illustrated that the bagging ensemble models achieved better performance

when compared to the single classiﬁers by having a lower false positive rate. Moreover, the

results also showed that the genetic algorithm outperformed the information gain algorithm for

feature selection as it allowed the classifiers to achieve better performance.
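To make the information gain criterion mentioned above more concrete, the following sketch ranks two discrete features by how much they reduce the entropy of a binary bankruptcy label. The records, feature names, and values are hypothetical toy data; the sketch is not the feature selection pipeline of Lin et al. (2019).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the expected entropy after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


# Hypothetical toy records: (leverage level, sector, bankrupt?)
records = [
    ("high", "retail", 1), ("high", "tech", 1), ("high", "retail", 1), ("low", "tech", 0),
    ("low", "retail", 0), ("low", "tech", 0), ("high", "retail", 0), ("low", "tech", 1),
]
labels = [r[2] for r in records]
gains = {
    "leverage": information_gain([r[0] for r in records], labels),
    "sector": information_gain([r[1] for r in records], labels),
}
# Features with higher information gain are kept first.
ranked = sorted(gains, key=gains.get, reverse=True)
print(ranked, gains)
```

In this toy example the sector feature splits the labels evenly and therefore carries no information, whereas the leverage feature reduces the label entropy and would be ranked first.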

On the other hand, Mai et al. (2019) proposed the use of deep learning models to predict bankruptcy based on textual data in conjunction with accounting-based ratios and market-based variables. In particular, the authors proposed a CNN-based model with word embedding as part of their bankruptcy prediction model. To evaluate the performance of their proposed models, the authors used a dataset consisting of 11,827 firms and 94,994 firm-years collected from Compustat North America, the Center for Research in Security Prices (CRSP), and the Securities and Exchange Commission (SEC). Experimental results showed that the proposed model outperformed other traditional models such as LR, RF, and SVM by achieving higher prediction accuracies.

Fig. 5: Potential ML-based Credit Risk and Bankruptcy Assessment Framework

6.2.3 Research Opportunities:

Again, there are still many research opportunities that would beneﬁt from ML to better pre-

dict bankruptcy. For example, one open area is studying the impact of other ML models and

kernels in performing the prediction. This is based on the fact that most previous work only

focused on a single kernel or a single ML model. Another research opportunity that should

be considered is investigating the performance of diﬀerent optimization models and meta-

heuristics such as simulated annealing, tabu search, or particle swarm optimization to study

the potential trade-off between performance improvement and computational complexity. A

third research opportunity is to explore other deep learning models and architectures such as

RNN, DBN, and GANs to investigate their performance in comparison with CNN models

proposed in the literature. Figure 5 provides a potential ML-based credit risk and bankruptcy

assessment framework that can be deployed by banks and ﬁnancial institutions.

6.3 Currency Crises Prediction

6.3.1 Challenge Description:

In the 1990s, many countries suffered currency crises in which the value of their currency became unstable. Europe experienced a currency crisis in 1992, Mexico in 1994, Asia in 1997-98, and Russia in 1998 (Lin et al., 2008). Interest in predicting such crises increased further in the aftermath of the 2008-09 global financial crisis (McMahon, 2019; Basu et al., 2019). The interest stems from the fact that a currency crisis can damage the world economy. Hence, it would be beneficial to create an early warning system in order to prevent, or at least manage, such events, particularly given their serious socio-economic impact. To that end, ML models can play a major role as part of such early warning systems given their ability to act as accurate prediction models and their promising performance in other business-focused applications.

6.3.2 Previous Works:

McMahon (2019) proposed the use of ML models to predict crises in diﬀerent currencies.

In particular, the author proposed the use of an SVM model as it can overcome many of the

limitations of traditional crises prediction approaches including data non-linearity and the

variance-bias trade-off. To that end, the author applied the SVM model to data collected

between 1996 and 2014. Experimental results showed that the proposed model accurately

predicted the majority of crises within that period from 17 emerging markets, thus illustrat-

ing its potential as a valuable tool for economists to use.

On the other hand, Xu et al. (2018) proposed the combination of RF and wavelet trans-

formation to predict currency crises. More speciﬁcally, the authors proposed the use of

discrete wavelet transformation (DWT) to systematically extract key time-based features

related to the exchange rate behavior over different time horizons. To evaluate the performance

of the proposed model, the authors used a dataset containing instances of currency

crises between 1992 and 2015. Experimental results showed that the proposed RF-DWT

model achieved a high prediction accuracy between 89%-90%, outperforming the LR model

which achieved an accuracy between 84%-85%.
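The idea of extracting features over different time horizons with a DWT can be sketched as follows. The example assumes the Haar wavelet (the wavelet family is not stated in this survey) and uses the energy of the detail coefficients at each decomposition level as a feature vector that could then be passed to a classifier such as RF. The exchange-rate series and function names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail) coefficients."""
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / math.sqrt(2) for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / math.sqrt(2) for i in pairs]
    return approx, detail


def dwt_energy_features(signal, levels):
    """Energy of the detail coefficients at each level, plus the final approximation energy."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features


# Hypothetical monthly exchange-rate changes (length must be divisible by 2**levels).
returns = [0.01, -0.02, 0.00, 0.03, -0.01, 0.02, -0.15, 0.12]
features = dwt_energy_features(returns, levels=3)
print(features)
```

Early entries of the feature vector capture short-horizon fluctuations and later entries capture slower movements. Because the Haar transform used here is orthonormal, the features sum to the total energy of the input series.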

Similarly, Kinkyo (2020) also proposed the combination of RF and DWT for bi-annual

currency crises prediction based on the exchange market pressure (EMP) index. To that end,

the author used a dataset covering 101 industrial and developing countries between 1994 and

2018 from the International Financial Statistics of the International Monetary Fund (IMF).

Experimental results showed that the proposed model achieved a prediction accuracy of

close to 73%, outperforming other models by at least 5%. This highlighted the potential of

the proposed bi-annual forecasting model in providing guidance for both policy makers and

investors to detect currency risks.

In contrast, Alaminos et al. (2019) proposed a deep learning model to predict currency

crises. More speciﬁcally, the authors proposed the use of deep neural decision trees (DNDT)

and compared its performance to other widely adopted methodologies. To compare the per-

formance of the diﬀerent models, the authors used a dataset consisting of 162 developed,

emerging, and developing countries with information between 1970–2017. Experimental

results showed the proposed model achieved a training prediction accuracy ranging between

98%-99% and a testing accuracy between 97%-99% across multiple geographical regions.

This highlights the potential of deep learning models for accurate currency crises prediction.

6.3.3 Research Opportunities:

Similar to other opportunities in the banking and ﬁnance sector, there are multiple research

opportunities in which ML can play a role for ﬁnancial crises prediction. One opportunity

is investigating the performance of existing models in predicting financial crises in specific

fields rather than just at the macro/country level. For example, one could study the effectiveness of

existing models in predicting crises in the housing ﬁeld since such models can be extremely

helpful for real-estate developers and landlords. Another potential opportunity is exploring

the eﬀectiveness of diﬀerent classiﬁcation models in predicting such crises and studying

their complexity. A third potential research opportunity is exploring other deep learning models such as CNN and RNN to investigate their performance given the promising results achieved by other deep learning architectures.

Table 4 briefly summarizes the challenges, previous works, and potential research opportunities of ML within the banking and finance sector.

Table 4: Challenges, Previous Works, and Research Opportunities within Banking and Finance Field

| Challenge | Previous Work | Research Opportunity |
| --- | --- | --- |
| Credit Risk Assessment | Proposed the combined use of unsupervised and supervised ML models to assess the credit risk of individuals (Bao et al., 2019) | Explore hyper-parameter optimization of the ML models considered to improve their performance. |
| | Proposed a hybrid ensemble ML model composed of two ensemble learning models and seven classification models to assess the credit risk of SMEs in supply chain finance (Zhu et al., 2019) | Explore the performance of different models that can make short, medium, and long term risk prediction. |
| | Proposed DBN composed of RBM and Softmax classifier to predict the credit risk of SMEs working in the online supply chain space | Consider other deep learning models and architectures such as CNN and RNN to investigate the improvement in the credit assessment accuracy. |
| Bankruptcy Prediction | Compared performance of optimized SVM with ANN and other ML techniques to predict institution bankruptcy (Kim et al., 2019) | Study the impact of other ML models and kernels in performing the bankruptcy prediction. |
| | Explored four different ML models including SVM, DT bagging, DT boosting, and RF models for bankruptcy prediction (Barboza et al., 2017) | Investigate the performance of different optimization models and meta-heuristics to study the potential trade-off between performance improvement and computational complexity. |
| | Proposed the use of ensemble learning models in combination with feature selection as part of their bankruptcy prediction models (Lin et al., 2019) | Explore other deep learning models and architectures such as RNN to investigate their performance in comparison with CNN models proposed in the literature. |
| | Proposed the use of CNN model to predict bankruptcy based on textual data in conjunction with accounting-based ratio and market-based variables (Mai et al., 2019) | |
| Currency Crises Prediction | Proposed the use of SVM model to predict currency crises (McMahon, 2019) | Investigate the performance of existing models in predicting financial crises in specific fields rather than just at the macro/country level. |
| | Proposed the combination of RF and wavelet transformation to predict currency crises (Xu et al., 2018) | Explore the effectiveness of different classification models in predicting such crises and studying their complexity. |
| | Proposed a combined RF-DWT model to predict currency crises (Kinkyo, 2020) | Explore other deep learning models such as CNN and RNN to investigate their performance. |
| | Proposed the use of deep neural decision trees (DNDT) and compared its performance to other widely adopted methodologies (Alaminos et al., 2019) | |

7 Social Media

Another emerging area in which ML has been playing a major role is the area of social me-

dia. Communications and Marketing Oﬃce, Tufts University (2019) deﬁnes social media

as “the means of interactions among people in which they create, share, and/or exchange

information and ideas in virtual communities and networks”. The ﬁrst form of social me-

dia appeared in 1979 when USENET created a decentralized system of discussion boards

(Carvin, 2007). Since then, the Internet has advanced well beyond discussion boards where

interactions occur in text only. In the early 21st century, many websites were launched that

provide users with a platform to not only communicate via text, but also to share videos

and/or photos (Injadat et al., 2016).

According to Communications and Marketing Office, Tufts University (2019), eight of the most popular social media platforms are Facebook, Twitter, YouTube, Vimeo, Flickr, Instagram, Snapchat, and LinkedIn. As of 2019, there were 2.77 billion social media users worldwide, a number projected to reach 3.02 billion in 2021 (Clement, 2019). There are 2.5 quintillion bytes of social media data created each day (Marr, 2019). Furthermore, every minute of the day Snapchat users share 527,760 photos, 456,000 tweets are sent on Twitter, Instagram users post 46,740 photos, and Facebook users post 510,000 comments and 293,000 status updates (Marr, 2019). Also, more than 300 million photos are uploaded to Facebook per day (Marr, 2019). Such a vast amount of data cannot be processed manually by humans. Hence, social media provides another area of opportunity for the use of ML. This section discusses how ML techniques can be applied to social media data in order to make discoveries in the fields of pharmacovigilance, vaccine sentiment analysis, and politics.

7.1 Pharmacovigilance

7.1.1 Challenge Description:

One way that social media data is being used is in pharmacovigilance. Pharmacovigilance

(PhV) is deﬁned as “the science and activities relating to the detection, assessment, under-

standing, and prevention of adverse eﬀects or any other drug-related problem” (Lezotre,

2014). Adverse effects, also known as Adverse Drug Reactions (ADRs), are harmful reactions

that are caused by the intake of medication (Sarker et al., 2015). ADRs have led to

millions of deaths and hospitalizations and cost nearly seventy-ﬁve billion dollars annu-

ally. Governmental agencies such as the U.S. Food and Drug Administration (FDA) and

the European Medicines Agency (EMA), along with international organizations such as

the World Health Organization (WHO) engage in pharmacovigilance by requiring manu-

facturers to report adverse events (Nikfarjam et al., 2015). These agencies also encourage

voluntary reporting by healthcare professionals and the public. However, there is no guaran-

tee that healthcare professionals or the public will report ADRs. Furthermore, when ADRs

are voluntarily reported, the information may not be timely, may be incomplete, duplicated,

under-reported or over-reported. Due to the limited quantity and lack of quality of voluntar-

ily reported ADRs, it has become necessary to supplement voluntary reports with other data

forms. For example, information about ADRs can be acquired from health-related social

networks such as DailyStrength or on social media sites such as Twitter and Facebook.

Although these sites provide a vast amount of data for potential ADR detection, it is im-

possible for a human to analyze all of the data. Hence, natural language processing (NLP)

and ML algorithms have been used to process the data (Sarker et al., 2015; Nikfarjam et al.,

2015). A survey of the literature shows that NLP techniques are commonly used to analyze

social media data for ADRs via text classiﬁcation using lexicon-based approaches. Further-

more, SVM, NB, and Maximum Entropy algorithms have been used to classify text. While

these approaches provide a novel opportunity for collecting data about ADRs, there are still

many challenges to using these approaches. For example, pure lexicon-based approaches

are often impeded by consumers not using technical terms, misspelling words, using ab-

breviations, or sentence structure irregularities. Furthermore, when supervised learning ap-

proaches are used, they require substantial amounts of data to be manually annotated, of-

ten by a domain expert. That being said, researchers have begun to use partially supervised

(semi-supervised) algorithms in order to reduce the amount of annotated data that is required

(Sloane et al., 2015).

7.1.2 Previous Works:

O’Connor et al. (2014) utilized ML on tweets in order to discover mentions of ADRs. How-

ever, in their work the authors found that false positive errors were occurring due to non-

ADR extracted terms being classiﬁed as ADRs. As an example, the authors discuss the

username TScpCancer, which was classiﬁed as an ADR even though the word cancer is be-

ing used as a name in this context.
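One plausible way such a false positive can arise is sketched below: a naive lexicon matcher that searches for terms anywhere in the raw text will flag the term inside the username, whereas a matcher that removes user handles and compares whole tokens will not. This is only an illustration of the failure mode; the mini-lexicon, the example tweet, and both functions are hypothetical and do not reproduce the pipeline of O'Connor et al. (2014).

```python
import re

# Hypothetical mini-lexicon of ADR-related terms (not from any surveyed dataset).
ADR_LEXICON = {"cancer", "nausea", "headache"}


def substring_matches(text):
    """Naive matching: flags any lexicon term occurring anywhere in the text."""
    lowered = text.lower()
    return sorted(term for term in ADR_LEXICON if term in lowered)


def token_matches(text):
    """Stricter matching: drops @usernames, then matches whole word tokens only."""
    without_handles = re.sub(r"@\w+", " ", text.lower())
    tokens = set(re.findall(r"[a-z]+", without_handles))
    return sorted(ADR_LEXICON & tokens)


tweet = "@TScpCancer thanks for the follow! This med gave me a headache"
print(substring_matches(tweet))  # includes "cancer" picked up from the username
print(token_matches(tweet))      # only "headache"
```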

Patki et al. (2014) used ML techniques on social media data to automatically classify

drugs into either a normal category or a blackbox category (blackbox is a category of drugs

that the FDA has identiﬁed as having serious or life-threatening safety concerns). The au-

thors’ approach showed promise at classifying social media comments as ADRs or non-

ADRs. However, their approach was only marginally successful at classifying drugs into the

normal or blackbox categories. The authors believe that they encountered this challenge due

to their limited annotated dataset. Furthermore, the authors found it challenging to distin-

guish true signals from the noisy social media text data.

Liu and Chen (2015) proposed an SVM-based framework for integrated and high-performance

patient reported adverse drug event extraction from social media. More speciﬁcally, the au-

thors used data collected from four major diabetes and heart disease forums in the United

States and applied various natural language processing models to create the lexicon-based

datasets to be fed to the SVM classiﬁer. Experimental results showed that the proposed

model achieved a high precision ranging between 91%-94% for ADR and between 79%-

87% for medical events.

Similarly, Alimova and Tutubalina (2017) proposed the use of SVM to accurately identify

ADRs posted by patients on social media platforms. To that end, the authors considered

two datasets, namely the CSIRO Adverse Drug Event Corpus (CADEC) and the Twitter Cor-

pus. From these datasets, a set of context-level and entity-level features were extracted and

provided as an input to the proposed linear SVM model. Experimental results showed that

the proposed model achieved a high precision ranging between 81%-84% for the CADEC

corpus dataset and between 64%-73% for the Twitter corpus dataset.

In contrast, Cocos et al. (2017) proposed a scalable deep-learning model to analyze

and identify ADRs in social media posts. More specifically, the authors proposed the use of an

RNN that labels words in an input sequence with ADR membership tags. To that end, the au-

thors used a Twitter corpus dataset to evaluate the performance of the proposed RNN-based

framework. Experimental results showed that the authors’ model outperformed other tradi-

tional models such as the baseline lexicon matching (LM) system and conditional random

ﬁeld model (CRF) by achieving an F1-measure of 0.755 for ADR identiﬁcation compared

to 0.63 and 0.65 for the LM and CRF models respectively.
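Since several of the works above report precision or the F1-measure, the following sketch shows how these quantities are computed from per-token tags, where the F1-measure is the harmonic mean of precision and recall. The tag sequences are hypothetical and unrelated to the datasets used in the surveyed works.

```python
def precision_recall_f1(gold, predicted, positive="ADR"):
    """Compute precision, recall, and F1 for the `positive` tag."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, predicted))
    fp = sum(g != positive and p == positive for g, p in zip(gold, predicted))
    fn = sum(g == positive and p != positive for g, p in zip(gold, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical per-token tags for one short post ("O" = not part of an ADR mention).
gold = ["O", "O", "ADR", "ADR", "O", "ADR", "O", "O"]
predicted = ["O", "ADR", "ADR", "ADR", "O", "O", "O", "O"]
precision, recall, f1 = precision_recall_f1(gold, predicted)
print(precision, recall, f1)
```

Here two of the three predicted ADR tokens are correct and two of the three true ADR tokens are found, so precision, recall, and F1 all equal 2/3.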

7.1.3 Research Opportunities:

Although many previous works have utilized ML for analyzing social media posts concerning

drugs and medications, there still exist many further opportunities. One potential oppor-

tunity is examining the eﬀectiveness of ML classiﬁcation in modeling the contextual and

semantic features of tweets. Another opportunity worth exploring is enriching the ADR

lexicon datasets so that the sentiment analysis of tweets and social media posts becomes

more accurate. Another potential research opportunity is performing temporal analyses to

mine drug-ADR patterns and investigate ADRs related to the interaction of drugs taken by

patients. Also, researchers can explore more complex classiﬁcation models such as hidden

Markov models (HMMs) to distinguish between symptoms and side-effects mentioned in

the posts. Moreover, a transfer learning model can be explored to transfer knowledge from

one classiﬁcation domain to another, i.e. potentially from one drug to another or from one

platform to another.

7.2 Social Media and Vaccines

7.2.1 Challenge Description:

Another way that ML can be used to gather information from social media data is by determining

people's beliefs, thoughts, and feelings about various vaccines. In recent years it has

been observed that some individuals and/or groups have negative opinions about the safety

and value of vaccines, and these negative opinions are being expressed online via social

media. These negative opinions may inﬂuence some people’s decisions to receive vaccines

or to vaccinate their children (Dunn et al., 2015; Centers for Disease Control and Preven-

tion (CDC), 2019b,a; Huang et al., 2017; Du et al., 2017). In the past decade, in the United

States and other countries, there has been an increase in parents refusing to vaccinate their

children due to their concerns about the safety of vaccines. Vaccine refusal for one’s self

or one’s child can result in unnecessary harm or even death. One way that scientists and

researchers are combating the anti-vaccination movement is by analyzing social media data

with ML algorithms in order to understand how negative opinions about vaccines spread

through social media. Once these patterns are understood, scientists and researchers hope

that they can combat the spread of misinformation.

7.2.2 Previous Works:

Dunn et al. (2015) hypothesized that when Twitter users were exposed to negative opinions

about human papillomavirus (HPV) vaccines in Twitter communities, these users would

subsequently express the negative opinions that they were exposed to by re-posting simi-

lar negative opinions. In order to examine their hypothesis, the authors analyzed temporal

sequences of messages posted on Twitter (tweets) related to HPV vaccines and the social

connections between users. The researchers’ dataset was collected between October 2013

and April 2014. The dataset consisted of 83,551 tweets written in English that included

terms related to HPV vaccines. Furthermore, the social connections (N = 957,865) of the

30,621 users who posted or reposted the tweets were examined to see if they also posted or

reposted such tweets. In order to analyze this large dataset, the authors utilized a supervised

ML approach to classify the tweets. This approach required the researchers to ﬁrst manually

label a random sample of tweets. Then, the labeled tweets were used to train an ML classifier

to recognize similar patterns in the remaining tweets. More speciﬁcally, the classiﬁer was

an ensemble of four classiﬁers that used the content of the tweets (the words and word com-

binations in the tweets themselves) or the social relations between users (the users followed

by the user responsible for the tweet) in order to classify the sentiment of the tweets. The

sentiment of the users’ tweets about HPV vaccines was classiﬁed either as negative or neu-

tral/positive. When the four classiﬁers were trained and tested in a 10-fold cross validation,

their accuracy ranged between 87.6% and 94.0%. The researchers concluded that Twitter

users who were more often exposed to negative opinions about HPV vaccines were more

likely to subsequently post negative tweets about HPV vaccines.
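The 10-fold cross validation procedure mentioned above can be sketched as follows: the labeled samples are split into ten disjoint folds, and each fold is held out once for testing while the remaining nine are used for training. The labels and the placeholder majority-class "classifier" are hypothetical and only stand in for the content-based and connection-based classifiers of Dunn et al. (2015).

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k disjoint test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(labels, k, fit, predict):
    """Average accuracy over k folds; each fold is held out once for testing."""
    scores = []
    for test_idx in k_fold_indices(len(labels), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        model = fit(train_idx)
        hits = sum(predict(model, i) == labels[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / k


# Hypothetical labels: 1 = negative tweet, 0 = neutral/positive tweet.
labels = [1] * 30 + [0] * 70


def fit_majority(train_idx):
    """Placeholder "classifier" that memorises the most common training label."""
    train_labels = [labels[i] for i in train_idx]
    return max(set(train_labels), key=train_labels.count)


def predict_majority(model, index):
    """Always predict the memorised majority label, regardless of the sample."""
    return model


mean_accuracy = cross_validate(labels, k=10, fit=fit_majority, predict=predict_majority)
print(round(mean_accuracy, 3))
```

Because every sample is tested exactly once, the averaged score of this placeholder equals the share of the majority class, which gives a simple baseline against which a real classifier can be compared.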

Similarly, Du et al. (2017) proposed a hierarchical SVM-based model to predict tweet

sentiments about HPV vaccines. To that end, the authors collected tweets written in En-

glish containing HPV vaccines-related keywords in the time between November 2, 2015

and March 28, 2016. Experimental results showed that the proposed model achieved a high

precision ranging between 71%-78%. Moreover, the model also showed particularly high

precision in identifying negative sentiments pertaining to HPV vaccine safety with a value

of around 80%.

Huang et al. (2017) used natural language classiﬁers to examine and analyze data from

Twitter in order to track ﬂu vaccinations over time, as well as by geography and gender. The

researchers collected a dataset of 1,007,582 tweets. From this dataset, the researchers cre-

ated a training dataset by annotating a random sample of 10,000 tweets. After testing various

classiﬁers, the researchers chose the best-performing classiﬁer, namely LR, and used it in

the rest of their experiments. When the researchers compared the results of their algorithm

to published government survey data about vaccination from the US Centers for Disease

Control and Prevention (CDC), they found that their results were highly correlated with the

CDC's data (r = 0.90). These results suggest that ML algorithms can be applied to Twitter

data in order to track people’s attitudes and behaviors about ﬂu vaccinations.
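The reported value r is a correlation coefficient between the Twitter-derived signal and the survey data. Assuming it denotes the Pearson correlation, a comparison of this kind can be sketched as follows. Both series are hypothetical numbers chosen for illustration and are not the data of Huang et al. (2017).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical monthly values: share of tweets classified as reporting a flu shot
# versus a survey-based vaccination rate for the same months.
tweet_share = [0.8, 1.9, 3.1, 2.2, 1.0, 0.5]
survey_rate = [10.0, 22.0, 30.0, 26.0, 14.0, 8.0]
r = pearson_r(tweet_share, survey_rate)
print(round(r, 3))
```

A value of r close to 1 indicates that the two series rise and fall together, although it does not by itself show that the tweet-based estimate is unbiased.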

7.2.3 Research Opportunities:

As evidenced by the different research works discussed above, ML has great potential as it can

be used to examine large amounts of social media data in order to track and determine how

social media users may inﬂuence each other’s opinions of vaccines. While these studies have

shown great potential for the use of ML, there are still some limitations that oﬀer possibilities

for future research. One limitation that warrants further work is that social media

users’ connections change over time and this may not be reﬂected in data that is taken

from a set time period. Therefore, it is important to develop adaptive ML models that can

change with the social connection changes of the social media platform. Another potential

opportunity is creating new datasets with updated vocabulary to better track the sentiment of

users based on the non-standard abbreviations, slang and phrases commonly used on social

media platforms. A third research opportunity worth exploring is to study the performance

of deep learning architectures such as CNNs and RNNs for sentiment analysis of various

vaccines. This is particularly important given the promising performance illustrated by such

architectures in analyzing social media posts.

7.3 Social Media and Politics

7.3.1 Challenge Description:

Moving beyond health-related topics, ML techniques can also be applied to social media in

order to collect, monitor, analyze, summarize, and visualize politically relevant information

(Hosni and Li, 2019). In recent years, social media platforms such as Twitter and Facebook

have been used to increase political participation. For example, social media users publicly

spread information about their political opinions on Twitter and political institutions have

begun to use Facebook pages or groups to engage with citizens (Stieglitz and Dang-Xuan,

2013). Furthermore, politicians and political parties are interested in social media data, be-

cause they can beneﬁt from understanding what the public thinks about them (Maynard

et al., 2012). Due to their interest in public opinion about politics, politicians or political

parties may monitor social media data in order to detect social media content that is directly

or indirectly associated with them. Furthermore, the monitoring of political social media

data is also important because it may provide information about potential political crises or

scandals. Additionally, the spread of political information through social networks can lead

to administrative, political and societal changes. For example, social media played a central

role in shaping political debates in the Arab Spring (a series of pro-democracy protests, up-

risings, and armed rebellions that spread across North Africa and the Middle East beginning

in the spring of 2011 (Editors of History.com Website, 2018)).

A common method that is used for the detection and analysis of political social media

content is opinion mining (also known as sentiment analysis). The process of political opin-

ion mining consists of collecting text that contains political opinions (or sentiments) and

extracting attributes and components about a specific political feature from said text, and then

determining whether the text is positive, negative, or neutral.
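The opinion mining process described above can be sketched in its simplest lexicon-based form: keep only texts that mention the political target of interest, count positive and negative opinion words, and map the resulting score to one of the three polarity classes. The lexicon, the target name, and the posts are hypothetical; practical systems rely on much larger lexicons or on trained classifiers.

```python
import re

# Hypothetical opinion lexicon; real systems use far larger, curated resources.
POSITIVE = {"great", "support", "win", "strong"}
NEGATIVE = {"scandal", "corrupt", "fail", "weak"}


def classify_opinion(text, target):
    """Return "positive", "negative", or "neutral" for a text mentioning `target`,
    or None when the target is not mentioned at all."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if target.lower() not in tokens:
        return None
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


posts = [
    "Great speech by candidateA tonight, strong on policy",
    "Another scandal for candidateA, so corrupt",
    "candidateA visits the city tomorrow",
    "Traffic is terrible today",
]
results = [classify_opinion(p, "candidateA") for p in posts]
print(results)
```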

7.3.2 Previous Works:

Jahanbakhsh and Moon (2014) have implemented sentiment analysis on tweets. They were

interested in the predictive power of social media. In their study, the authors analyzed 32

million tweets related to the 2012 US presidential election using a combination of ML tech-

niques. The authors implemented a Twitter crawler from September 29, 2012 until Novem-

ber 16, 2012 using keywords such as Barack Obama, Mitt Romney, US election, Paul Ryan,

and Joe Biden. The authors reported several findings. Firstly, their analysis (that Obama was

leading on Twitter for the 2012 US presidential election) matched the outcome of the

election. Secondly, the authors found that by analyzing geo-tweets (tweets with a geo-tag)

with geographical sentiment analysis, they were able to uncover the popularity of candidates

across the US states. Thirdly, the authors' work demonstrated that LDA is a powerful

unsupervised algorithm when combined with the NB classiﬁer as it was able to “predict” the

result of the 2012 US election. Hence, the authors have presented a system of mining social

media data that may be used for predicting future events.

In a similar fashion, Ramteke et al. (2016) proposed the use of ML models to predict the

results of the 2016 US presidential election based on sentiment analysis of the correspond-

ing tweets. To that end, the authors proposed the use of NB and SVM models to classify the

tweets about Donald Trump and Hillary Clinton. The performance of the models was evaluated

using a dataset collected through Twitter between March 16th-17th, 2016, which was

later labeled manually. Experimental results showed that the proposed models achieved a

sentiment prediction accuracy ranging between 97%-99% with an F1-score between 0.94-

0.97, highlighting the potential of ML models in predicting election results based on the

sentiment analysis of tweets.
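Accuracy and F1-score, the two metrics reported above, measure different things and can diverge when the classes are imbalanced. The following minimal sketch shows how both are computed from labeled predictions; the labels and predictions are hypothetical toy values, not data from Ramteke et al. (2016).

```python
def classification_metrics(true_labels, predicted_labels, positive_label):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    tp = fp = fn = tn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if guess == positive_label and truth == positive_label:
            tp += 1          # correctly predicted positive
        elif guess == positive_label:
            fp += 1          # predicted positive, actually negative
        elif truth == positive_label:
            fn += 1          # missed positive
        else:
            tn += 1          # correctly predicted negative
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical manually labeled tweets and model predictions.
TRUE = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg"]
PRED = ["pos", "pos", "neg", "neg", "neg", "neg", "neg", "pos"]

metrics = classification_metrics(TRUE, PRED, positive_label="pos")
print(metrics)
```

In this toy case the accuracy is 0.75 while the F1-score is about 0.67, which shows why the two figures are usually reported together.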

Fig. 6: Potential ML-based Social Media Analytics Framework

Oyebode and Orji (2019) also proposed the use of ML models to predict the results of the 2019 Nigerian presidential election by comparing three lexicon-based classifiers (VADER, VADER-EXT, and TextBlob) and five ML-based classifiers (SVM, LR, NB, stochastic gradient descent (SGD), and RF). To evaluate the performance of the different models, the authors collected 118,421 posts between January 1 and February 22, 2019. Experimental results showed that the VADER-EXT approach outperformed the other two lexicon-based approaches, achieving a precision of approximately 81%. Similarly, among the ML models, the LR method achieved the highest accuracy and precision, at 77% and 78% respectively.
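For contrast with the ML-based classifiers, a lexicon-based approach scores a post by looking up its words in a polarity dictionary rather than learning from labeled data. The sketch below is a deliberately simplified illustration with a hypothetical six-word lexicon and an assumed decision threshold; it does not reproduce VADER, VADER-EXT, or TextBlob, which rely on far larger lexicons and additional rules.

```python
# Hypothetical miniature lexicon: word -> polarity score.
LEXICON = {
    "good": 1.0,
    "great": 2.0,
    "win": 1.0,
    "bad": -1.0,
    "corrupt": -2.0,
    "fail": -1.0,
}


def lexicon_sentiment(text, threshold=0.5):
    """Label text as positive, negative, or neutral from summed word scores."""
    # Lowercase and strip basic punctuation so "win!" matches "win".
    tokens = [token.strip(".,!?").lower() for token in text.split()]
    score = sum(LEXICON.get(token, 0.0) for token in tokens)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


print(lexicon_sentiment("A great win!"))
print(lexicon_sentiment("Corrupt and bad."))
print(lexicon_sentiment("The election is on Saturday"))
```

Because no training data is needed, such a classifier is cheap to deploy, but its quality depends entirely on how well the lexicon covers the vocabulary of the posts being analyzed.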

On the other hand, Tsai et al. (2019) extended the concept of sentiment analysis by using deep learning models to predict the results of multiple local elections rather than a national election. To that end, the authors proposed the use of a recursive neural tensor network (RNTN) to analyze the sentiment expressed in Twitter posts about the 2018 US midterm elections. The performance of the proposed model was evaluated using a manually collected dataset of approximately 800 tweets. Experimental results showed that the proposed model achieved a high prediction accuracy: it predicted an advantage of 9.2% for the Democratic candidate, compared to a measured actual advantage of 8.6%. As such, the authors concluded that the proposed model has great potential for accurately predicting multiple local election results.
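As a quick check on the reported figures, the gap between the predicted and the actual advantage can be worked out directly, assuming both percentages are measured on the same scale:

```latex
\[
\left| 9.2\% - 8.6\% \right| = 0.6 \text{ percentage points},
\qquad
\frac{0.6}{8.6} \approx 0.07 = 7\%.
\]
```

That is, the prediction overshoots the actual advantage by 0.6 percentage points, or roughly 7% of the actual margin.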

7.3.3 Research Opportunities:

Again, there are still many research opportunities in which ML can play a role as part of a political sentiment analysis framework. One such opportunity is investigating other ML algorithms such as SVM and ANN, given their previous success in determining linguistic features for opinion classification. Another potential opportunity is collecting more features such as swear words, sarcasm, negative and conditional detection, and contextual clues to make the sentiment analysis framework more accurate and effective. A third opportunity is to consider other deep learning models such as CNN and RNN and compare their performance with the currently proposed deep learning models. This is crucial given the promising results achieved by deep learning models such as the RNTN model.
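As a rough illustration of the second opportunity, the sketch below derives a few extra features from a post before it is passed to a classifier. The word lists and feature names are hypothetical, and interpreting "negative detection" as detecting negation words is an assumption; sarcasm and contextual clues are left out because they cannot be captured by simple word lists.

```python
import re

# Hypothetical word lists for illustration only.
SWEAR_WORDS = {"damn", "hell"}
NEGATION_WORDS = {"not", "no", "never"}
CONDITIONAL_WORDS = {"if", "unless", "would"}


def extract_features(text):
    """Return simple hand-crafted features for one social media post."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        "swear_count": sum(token in SWEAR_WORDS for token in tokens),
        # Contractions such as "won't" also signal negation.
        "has_negation": any(
            token in NEGATION_WORDS or token.endswith("n't") for token in tokens
        ),
        "has_conditional": any(token in CONDITIONAL_WORDS for token in tokens),
        "token_count": len(tokens),
    }


features = extract_features("I won't vote if he runs, damn it")
print(features)
```

Features of this kind would typically be appended to the bag-of-words or embedding representation already used by the sentiment classifier.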

Table 5 summarizes some of the diﬀerent challenges and research opportunities of ML

within the social media ﬁeld. Moreover, Figure 6 provides a visualization of how these topics

ﬁt into a Social Media Analytics Framework.


Table 5: Challenges, Previous Works, and Research Opportunities within Social Media Field

| Challenge | Previous Work | Research Opportunity |
|---|---|---|
| Pharmacovigilance | Utilized ML on tweets in order to discover mentions of ADRs (O’Connor et al., 2014) | Examine the effectiveness of ML classification in modeling the contextual and semantic features of tweets. |
| | Used ML techniques on social media data to automatically classify drugs into either a normal category or a blackbox category (Patki et al., 2014) | Enrich the ADR lexicon datasets so that the sentiment analysis of tweets and social media posts becomes more accurate. |
| | Proposed an SVM-based framework for integrated and high-performance patient reported adverse drug event extraction from social media (Liu and Chen, 2015) | Perform temporal analyses to mine drug-ADR patterns and investigate ADRs related to the interaction of drugs taken by patients. |
| | Proposed the use of SVM to accurately identify ADRs posted by patients on social media platforms (Alimova and Tutubalina, 2017) | Explore more complex classification models to distinguish between symptoms and side-effects mentioned in the posts. |
| | Proposed a scalable RNN model to analyze and identify ADRs in social media posts (Cocos et al., 2017) | Explore transfer learning models to transfer knowledge from one classification domain to another. |
| Social Media and Vaccines | Utilized a supervised ML approach to classify vaccine-related tweets (Dunn et al., 2015) | Develop adaptive ML models that can change with the social connection changes of the social media platform. |
| | Used SVM model on Twitter data in order to assess HPV vaccination sentiments (Du et al., 2017) | Create new datasets with updated vocabulary to better track the sentiment of users based on the non-standard abbreviations, slang and phrases commonly used on social media platforms. |
| | Used natural language classifiers to examine and analyze data from Twitter in order to track flu vaccinations over time, geography, and gender (Huang et al., 2017) | Study and compare the performance of deep learning architectures such as CNNs and RNNs. |
| Social Media and Politics | Created an ML-NLP engine that implemented an NB classifier for sentiment analysis, and the Latent Dirichlet Allocation (LDA) algorithm for topic modeling (Jahanbakhsh and Moon, 2014) | Investigate other ML algorithms such as SVM and ANN given their previous success in determining linguistic features for opinion classification. |
| | Proposed the use of NB and SVM models to predict the results of the 2016 US presidential election based on corresponding tweets (Ramteke et al., 2016) | Collect more features such as swear words, sarcasm, and negative and conditional detection as well as contextual clues features to make the sentiment analysis framework more accurate and effective. |
| | Compared the performance of three lexicon-based and five ML-based models in predicting the results of the 2019 Nigerian presidential election (Oyebode and Orji, 2019) | Consider other deep learning models such as CNN and RNN to compare their performance with the currently proposed deep learning models. |
| | Proposed the use of RNTN deep learning model to predict the results of the 2018 US Midterm elections (Tsai et al., 2019) | |

8 Conclusion

The availability and popularity of the Internet and related technologies have resulted in large amounts of data being available for analysis. However, humans do not possess the cognitive capabilities to understand such large amounts of data. Machine learning (ML) provides a way for humans to process large amounts of data and draw conclusions from them. ML has applications in various fields. This review focused on some of these fields and applications, such as education, healthcare, network security, banking and finance, and social media. These fields each have their own unique challenges. However, ML can provide solutions to these challenges, as well as create further research opportunities. Accordingly, this work briefly described some of the challenges facing the aforementioned fields and surveyed some of the previous works in the literature that focused on them. Moreover, it presented several research opportunities on the role and potential of using ML to address these challenges. Figure 7 summarizes the challenges and the ML techniques that have addressed, or can potentially address, them.

Fig. 7: Summary of Challenges and ML Techniques

Acknowledgments This study was funded by Ontario Graduate Scholarship (OGS) Program.

Conﬂict of Interest The authors declare that they have no conﬂict of interest.

Informed Consent This study does not involve any experiments on animals.

References

A Flyvbjerg CC G Holt, Goldstein B (2010) Textbook of Diabetes: A Clinical Approach, 4th edn. Wiley, New

Jersey, USA

Agarap AFM (2018) On breast cancer detection: an application of machine learning algorithms on the wisconsin diagnostic dataset. In: Proceedings of the 2nd International Conference on Machine Learning and Soft Computing, pp 5–9

Aher SB, Lobo L (2013) Combination of machine learning algorithms for recommendation of courses in

e-learning system based on historical data. Knowledge-Based Systems 51:1–14

Alaka HA, Oyedele LO, Owolabi HA, Kumar V, Ajayi SO, Akinade OO, Bilal M (2018) Systematic review of

bankruptcy prediction models: Towards a framework for tool selection. Expert Systems with Applications

94:164–184

Alaminos D, Becerra-Vicario R, Fernández-Gámez MÁ, Cisneros Ruiz AJ (2019) Currency crises prediction using deep neural decision trees. Applied Sciences 9(23):5227


Albisser AM (2003) Analysis: Toward algorithms in diabetes self-management. Diabetes technology & therapeutics 5(3):371–373

Alimova I, Tutubalina E (2017) Automated detection of adverse drug reactions from social media posts

with machine learning. In: International Conference on Analysis of Images, Social Networks and Texts,

Springer, pp 3–15

Antunes F, Ribeiro B, Pereira F (2017) Probabilistic modeling and visualization for bankruptcy prediction.

Applied Soft Computing 60:831–843

Bao W, Lianju N, Yue K (2019) Integration of unsupervised and supervised machine learning algorithms for

credit risk assessment. Expert Systems with Applications 128:301–315

Barboza F, Kimura H, Altman E (2017) Machine learning models and bankruptcy prediction. Expert Systems

with Applications 83:405–417

Barron-Estrada ML, Zatarain-Cabada R, Oramas-Bustillos R, Gonzalez-Hernandez F (2017) Sentiment analysis in an affective intelligent tutoring system. In: 2017 IEEE 17th International Conference on Advanced Learning Technologies (ICALT), pp 394–397

Basu SS, Perrelli RA, Xin W (2019) External crisis prediction using machine learning: Evidence from three

decades of crises around the world. Computing in Economics and Finance, Ottawa, Canada

Bawa P (2016) Retention in online courses: Exploring issues and solutions a literature review. Sage Open

6(1)

Bellazzi R (2008) Telemedicine and diabetes management: current challenges and future research directions.

Journal of diabetes science and technology 2(1):98–104

Bhatia K, Arora S, Tomar R (2016) Diagnosis of diabetic retinopathy using machine learning classiﬁcation

algorithm. In: 2016 2nd International Conference on Next Generation Computing Technologies (NGCT),

pp 347–351

Black G (2002) A comparison of traditional, online, and hybrid methods of course delivery. Journal of Business Administration Online 1(1):1–9

Bonawitz K, Eichner H, Grieskamp W, Huba D, Ingerman A, Ivanov V, Kiddon C, Konečný J, Mazzocchi S, McMahan HB, et al. (2019) Towards federated learning at scale: System design. In: 2nd Conference on Machine Learning and Systems (SysML 2019)

Bourkoukou O, El Bachari E (2016) E-learning personalization based on collaborative ﬁltering and learner’s

preference. Journal of Engineering Science and Technology 11(11):1565–1581

Bourkoukou O, El Bachari E (2018) Toward a hybrid recommender system for e-learning personnalization

based on data mining techniques. JOIV: International Journal on Informatics Visualization 2(4):271–278

Bughin J, Seong J, Manyika J, Hämäläinen L, Windhagen E, Hazan E (2019) Notes from the AI frontier: Tackling Europe's gap in digital and AI. McKinsey & Company: New York, NY, USA

Caban JJ, Gotz D (2015) Visual analytics in healthcare–opportunities and research challenges

Caldas S, Meher Karthik Duddu S, Wu P, Li T, Konečný J, McMahan HB, Smith V, Talwalkar A (2019) Leaf: A benchmark for federated settings. In: Workshop on Federated Learning for Data Privacy and Confidentiality

Carvin A (2007) Timeline: The life of the blog. URL https://www.npr.org/templates/story/story.php?storyId=17421022, Accessed on: Jan. 5, 2020

Centers for Disease Control and Prevention (CDC) (2019a) Attention adults: You need vaccines too! URL

https://www.cdc.gov/features/adultimmunizations/index.html, Accessed on: Jan. 13, 2020

Centers for Disease Control and Prevention (CDC) (2019b) If you choose not to vaccinate your child, understand the risk and responsibilities. URL https://www.cdc.gov/vaccines/parents/vaccine-decision/no-vaccination.html, Accessed on: Jan. 13, 2020

Chanoch LH, Jovanovic L, Peterson CM (1985) The evaluation of a pocket computer as an aid to insulin dose

determination by patients. Diabetes Care 8(2):172–176

Chen J (2019) Financial intermediary. URL https://www.investopedia.com/terms/f/financialintermediary.asp, Accessed on: Nov. 1, 2019

Chen N, Ribeiro B, Chen A (2016) Financial credit risk assessment: a recent review. Artiﬁcial Intelligence

Review 45(1):1–23

Chen R, Niu W, Zhang X, Zhuo Z, Lv F (2017) An eﬀective conversation-based botnet detection method.

Mathematical Problems in Engineering 2017

Chiarelli F, Tumini S, Morgese G, Albisser AM (1990) Controlled study in diabetic children comparing

insulin-dosage adjustment by manual and computer algorithms. Diabetes Care 13(10):1080–1084

Chiu YC, Chen HIH, Zhang T, Zhang S, Gorthi A, Wang LJ, Huang Y, Chen Y (2019) Predicting drug

response of tumors from integrated genomic proﬁles by deep neural networks. BMC medical genomics

12(1):18

Chung JY, Lee S (2019) Dropout early warning systems for high school students using machine learning.

Children and Youth Services Review 96:346–353


Cisco (2019) What is network security? URL https://www.cisco.com/c/en/us/products/security/what-is-network-security.html, Accessed on: Feb. 1, 2020

Clement J (2019) Number of social network users worldwide from 2010 to 2021. URL https://www.statista.com/statistics/278414/number-of-worldwide-social-network-users/, Accessed on: Dec. 1, 2019

Cocos A, Fiks AG, Masino AJ (2017) Deep learning for pharmacovigilance: recurrent neural network architectures for labeling adverse drug reactions in twitter posts. Journal of the American Medical Informatics Association 24(4):813–821

Columbus L (2020) Roundup of machine learning forecasts and market estimates, 2020. Forbes

Communications and Marketing Office, Tufts University (2019) Social media overview. URL https://communications.tufts.edu/marketing-and-branding/social-media-overview/, Accessed on: Jan. 19, 2020

Coussement K, Phan M, De Caigny A, Benoit DF, Raes A (2020) Predicting student dropout in subscription-based online learning environments: The beneficial impact of the logit leaf model. Decision Support Systems p 113325

Di Pietro R, Distefano S (2019) An intelligent tutoring system tool combining machine learning and gamification in education. In: TOOLS: International Conference on Objects, Components, Models and Patterns, Springer International Publishing, pp 218–226

Dong D, Zhang W, Jing Q (2019) Paddle Federated Learning. Available at: https://paddlefl.readthedocs.io/en/latest/introduction.html

Du J, Xu J, Song HY, Tao C (2017) Leveraging machine learning-based approaches to assess human papillomavirus vaccination sentiment trends with twitter data. BMC medical informatics and decision making 17(2):69

Dunn AG, Leask J, Zhou X, Mandl KD, Coiera E (2015) Associations between exposure to and expression of

negative opinions about human papillomavirus vaccines on social media: an observational study. Journal

of medical Internet research 17(6):e144

Dwivedi P, Bharadwaj KK (2015) e-learning recommender system for a group of learners based on the uniﬁed

learner proﬁle approach. Expert Systems 32(2):264–276

Editors of Historycom Website (2018) Arab spring. URL https://www.history.com/topics/middle-east/arab-spring, Accessed on: Jan. 21, 2020

Elfaki AO, Alhawiti KM, AlMurtadha YM, Abdalla OA, Elshiekh AA (2014) Rule-based recommendation for supporting student learning-pathway selection. Recent Advances in Electrical Engineering and Educational Technologies pp 155–160

Felder RM, Silverman LK, et al. (1988) Learning and teaching styles in engineering education. Engineering

education 78(7):674–681

Forum WE (2018) The future of jobs report 2018. World Economic Forum Geneva

Gadekallu TR, Khare N, Bhattacharya S, Singh S, Reddy Maddikunta PK, Ra IH, Alazab M (2020) Early

detection of diabetic retinopathy using pca-ﬁreﬂy based deep learning model. Electronics 9(2):274

Halford GS, Baker R, McCredden JE, Bain JD (2005) How many variables can humans process? Psychological science 16(1):70–76

Harvard Medical School (2017) Retinopathy. Available at: https://www.health.harvard.edu/a_to_z/retinopathy-a-to-z

Herrero P, Pesl P, Reddy M, Oliver N, Georgiou P, Toumazou C (2014) Advanced insulin bolus advisor

based on run-to-run control and case-based reasoning. IEEE journal of biomedical and health informatics

19(3):1087–1096

Holzinger A, Dehmer M, Jurisica I (2014) Knowledge discovery and interactive data mining in

bioinformatics-state-of-the-art, future challenges and research directions. BMC bioinformatics 15(6):I1

Hosni AIE, Li K (2019) Minimizing the inﬂuence of rumors during breaking news events in online social

networks. Knowledge-Based Systems p 105452

Huang C, Clayton EA, Matyunina LV, McDonald LD, Benigno BB, Vannberg F, McDonald JF (2018) Machine learning predicts individual cancer patient responses to therapeutic drugs with high accuracy. Scientific reports 8(1):1–8

Huang X, Smith MC, Paul MJ, Ryzhkov D, Quinn SC, Broniatowski DA, Dredze M (2017) Examining

patterns of inﬂuenza vaccination in social media. In: Workshops at the Thirty-First AAAI Conference on

Artiﬁcial Intelligence

Hussain L, Ahmed A, Saeed S, Rathore S, Awan IA, Shah SA, Majid A, Idris A, Awan AA (2018) Prostate

cancer detection using machine learning techniques by employing combination of features extracting

strategies. Cancer Biomarkers 21(2):393–413

Ingerman A, Ostrowski K (2019) Introducing tensorflow federated. Available at: https://medium.com/tensorflow/introducing-tensorflow-federated-a4147aa20041


Injadat M, Salo F, Nassif AB (2016) Data mining techniques in social media: A survey. Neurocomputing 214:654–670, DOI https://doi.org/10.1016/j.neucom.2016.06.045, URL http://www.sciencedirect.com/science/article/pii/S092523121630683X

Injadat M, Salo F, Nassif AB, Essex A, Shami A (2018) Bayesian optimization with machine learning algorithms towards anomaly detection. In: 2018 IEEE Global Communications Conference (GLOBECOM), pp 1–6, DOI 10.1109/GLOCOM.2018.8647714

Injadat M, Moubayed A, Nassif AB, Shami A (2020a) Multi-split Optimized Bagging Ensemble Model Selection for Multi-class Educational Datasets. Applied Intelligence DOI https://doi.org/10.1007/s10489-020-01776-3

Injadat M, Moubayed A, Nassif AB, Shami A (2020b) Multi-Stage Optimized Machine Learning Framework

for Network Intrusion Detection. IEEE Transactions On Network and Service Management

Injadat M, Moubayed A, Nassif AB, Shami A (2020c) Systematic ensemble model selection approach for educational data mining. Knowledge-Based Systems 200:105992, DOI https://doi.org/10.1016/j.knosys.2020.105992, URL http://www.sciencedirect.com/science/article/pii/S0950705120302999

Injadat M, Moubayed A, Shami A (2020) Detecting Botnet Attacks in IoT Environments: An Optimized

Machine Learning Approach. In: IEEE 32nd International Conference on Microelectronics (ICM2020)

Jahanbakhsh K, Moon Y (2014) The predictive power of social media: On the predictability of us presidential elections using twitter. arXiv preprint arXiv:1407.0622

Javaid A, Niyaz Q, Sun W, Alam M (2016) A deep learning approach for network intrusion detection system.

In: Proceedings of the 9th EAI International Conference on Bio-inspired Information and Communications

Technologies (formerly BIONETICS), pp 21–26

Jelinek HF, Stranieri A, Yatsko A, Venkatraman S (2016) Data analytics identify glycated haemoglobin co-markers for type 2 diabetes mellitus diagnosis. Computers in biology and medicine 75:90–97

Jovanovic L, Peterson CM (1982) Optimal insulin delivery for the pregnant diabetic patient. Diabetes Care

5:24–37

Kang MJ, Kang JW (2016) Intrusion detection system using deep neural network for in-vehicle network

security. PloS one 11(6)

Kavakiotis I, Tsave O, Salifoglou A, Maglaveras N, Vlahavas I, Chouvarda I (2017) Machine learning and data mining methods in diabetes research. Computational and structural biotechnology journal 15:104–116

Kearns MJ, Vazirani UV, Vazirani U (1994) An introduction to computational learning theory. MIT press

Khandani AE, Kim AJ, Lo AW (2010) Consumer credit-risk models via machine-learning algorithms. Journal

of Banking & Finance 34(11):2767–2787

Kim Kj, Lee K, Ahn H (2019) Predicting corporate ﬁnancial sustainability using novel business analytics.

Sustainability 11(1):64

Kinkyo T (2020) A bi-annual forecasting model of currency crises. Applied Economics Letters 27(4):255–

261

Klašnja-Milićević A, Vesin B, Ivanović M, Budimac Z (2011) E-learning personalization based on hybrid recommendation strategy and learning style identification. Computers & Education 56(3):885–899

Konečný J, McMahan HB, Felix XY, Richtárik P, Suresh AT, Bacon D (2016) Federated learning: Strategies for improving communication efficiency. In: 29th Conference on Neural Information Processing Systems (NIPS 2016)

Lei K, Xie Y, Zhong S, Dai J, Yang M, Shen Y (2019) Generative adversarial fusion network for class

imbalance credit scoring. Neural Computing and Applications pp 1–12

Lezotre PL (2014) Part iii - recommendations to support the next phase of international cooperation, convergence, and harmonization in the pharmaceutical domain. In: Lezotre PL (ed) International Cooperation, Convergence and Harmonization of Pharmaceutical Regulations, Academic Press, Boston, pp 221–294, DOI https://doi.org/10.1016/B978-0-12-800053-3.00004-5, URL http://www.sciencedirect.com/science/article/pii/B9780128000533000045

Lin CS, Khan HA, Chang RY, Wang YC (2008) A new approach to modeling early warning systems for currency crises: Can a machine-learning fuzzy expert system predict the currency crises effectively? Journal of International Money and Finance 27(7):1098–1121

Lin WC, Lu YH, Tsai CF (2019) Feature selection in single and ensemble learning-based bankruptcy prediction models. Expert Systems 36(1):e12335

Liu X, Chen H (2015) A research framework for pharmacovigilance in health social media: identiﬁcation and

evaluation of patient adverse drug event reports. Journal of biomedical informatics 58:268–279

Liu X, Zhang P, Wang F, Hu Y, Liu H (2017) Research on automotive brake-by-wire system based on flexray bus. In: 2017 5th International Conference on Frontiers of Manufacturing Science and Measuring Technology (FMSMT 2017), Atlantis Press


Mahana M, Johns M, Apte A (2012) Automated essay grading using machine learning. Mach Learn Session,

Stanford University

Mai F, Tian S, Lee C, Ma L (2019) Deep learning models for bankruptcy prediction using textual disclosures.

European journal of operational research 274(2):743–758

Marr B (2019) How much data do we create every day? the mind-blowing stats everyone should read. URL https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read/#259844e160ba, Accessed on: Nov. 20, 2019

Mathias S, Bhattacharyya P (2018) ASAP++: Enriching the ASAP automated essay grading dataset with

essay attribute scores. In: Proceedings of the Eleventh International Conference on Language Resources

and Evaluation (LREC 2018)

Mathias S, Bhattacharyya P (2020) Can Neural Networks Automatically Score Essay Traits? In: Proceedings

of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications, Association

for Computational Linguistics, pp 85–91, DOI 10.18653/v1/2020.bea-1.8

Maynard D, Bontcheva K, Rout D (2012) Challenges in developing opinion mining tools for social media.

In: Proceedings of the Language Resources and Evaluation Conference (LREC), pp 15–22

McDermott CD, Majdani F, Petrovski AV (2018) Botnet detection in the internet of things using deep learning

approaches. In: 2018 International Joint Conference on Neural Networks (IJCNN), pp 1–8

McMahon MJ (2019) Rethinking early warning systems: Using the radial based support vector machine to

forecast currency crises. PhD thesis, Claremont Graduate University

Michael Dansinger (2019) What is a glycated hemoglobin test (hba1c)? URL https://www.webmd.com/diabetes/qa/what-is-a-glycated-hemoglobin-test-hba1c

Miotto R, Wang F, Wang S, Jiang X, Dudley JT (2018) Deep learning for healthcare: review, opportunities

and challenges. Brieﬁngs in bioinformatics 19(6):1236–1246

Mitchell TM (1997) Machine Learning. McGraw-Hill, New York, NY, USA

Mondal B, Patra O, Mishra S, Patra P (2020) A course recommendation system based on grades. In: 2020

International Conference on Computer Science, Engineering and Applications (ICCSEA), pp 1–5

Moocorg (2019) Massive open online courses: An edx site. URL https://www.mooc.org/, Accessed on:

Dec. 15, 2019

Moubayed A (2018) Optimization Modeling and Machine Learning Techniques Towards Smarter Systems

and Processes. PhD thesis, University of Western Ontario

Moubayed A, Shami A (2020) Softwarization, virtualization, & machine learning for intelligent & eﬀective

v2x communications. IEEE Intelligent Transportation Systems Magazine

Moubayed A, Injadat M, Nassif AB, Lutfiyya H, Shami A (2018) E-learning: Challenges and research opportunities using machine learning data analytics. IEEE Access 6:39117–39138, DOI 10.1109/ACCESS.2018.2851790

Moubayed A, Injadat M, Shami A, Lutﬁyya H (2018) Dns typo-squatting domain detection: A data analytics

& machine learning based approach. In: 2018 IEEE Global Communications Conference (GLOBECOM),

IEEE, pp 1–7

Moubayed A, Injadat M, Shami A, Lutfiyya H (2018) Relationship between student engagement and performance in e learning environment using association rules. In: 2018 IEEE World Engineering Education Conference (EDUNINE), pp 1–6, DOI 10.1109/EDUNINE.2018.8451005

Moubayed A, Injadat M, Shami A, Lutfiyya H (2019) Student engagement level in e learning environment: Clustering using k means. American Journal of Distance Education 34(2), DOI 10.1080/08923647.2020.1696140

Moubayed A, Aqeeli E, Shami A (2020) Ensemble-based Feature Selection and Classiﬁcation Model for

DNS Typo-squatting Detection. In: 33rd Canadian Conference on Electrical and Computer Engineering

(CCECE’20), IEEE, pp 1–6

Moubayed A, Injadat M, Shami A (2020a) Optimized Random Forest Model for Botnet Detection Based on

DNS Queries. In: IEEE 32nd International Conference on Microelectronics (ICM2020)

Moubayed A, Shami A, Heidari P, Larabi A, Brunner R (2020b) Edge-enabled v2x service placement for

intelligent transportation systems. IEEE Transactions on Mobile Computing

Mucaki EJ, Zhao JZ, Lizotte DJ, Rogan PK (2019) Predicting responses to platin chemotherapy agents with

biochemically-inspired machine learning. Signal transduction and targeted therapy 4(1):1–12

Müller PL, Treis T, Odainic A, Pfau M, Herrmann P, Tufail A, Holz FG (2020) Prediction of function in abca4-related retinopathy using ensemble machine learning. Journal of Clinical Medicine 9(8):2428

Nguyen G, Dlugolinsky S, Tran V, Lopez Garcia A (2020) Deep learning for proactive network monitoring

and security protection. IEEE Access 8:19696–19716, DOI 10.1109/ACCESS.2020.2968718

Nikfarjam A, Sarker A, O’Connor K, Ginn R, Gonzalez G (2015) Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features. Journal of the American Medical Informatics Association 22(3):671–681

O’Connor K, Pimpalkhute P, Nikfarjam A, Ginn R, Smith KL, Gonzalez G (2014) Pharmacovigilance on

twitter? mining tweets for adverse drug reactions. In: AMIA annual symposium proceedings, American

Medical Informatics Association, vol 2014, pp 924–933

Owens C, Zisser H, Jovanovic L, Srinivasan B, Bonvin D, Doyle FJ (2006) Run-to-run control of blood glucose concentrations for people with type 1 diabetes mellitus. IEEE Transactions on Biomedical Engineering 53(6):996–1005

Oyebode O, Orji R (2019) Social media and sentiment analysis: The nigeria presidential election 2019. In:

2019 IEEE 10th Annual Information Technology, Electronics and Mobile Communication Conference

(IEMCON), pp 0140–0146

Patki A, Sarker A, Pimpalkhute P, Nikfarjam A, Ginn R, O’Connor K, Smith K, Gonzalez G (2014) Mining

adverse drug reaction signals from social media: going beyond extraction. Proceedings of BioLinkSig

2014:1–8

Pektaş A, Acarman T (2017) Effective feature selection for botnet detection based on network flow analysis

Prasad V, Fojo T, Brada M (2016) Precision oncology: origins, optimism, and potential. The Lancet Oncology

17(2):e81–e86

Ramalingam V, Pandian A, Chetry P, Nigam H (2018) Automated essay grading using machine learning

algorithm. In: Journal of Physics: Conference Series, IOP Publishing, vol 1000, p 012030

Ramteke J, Shah S, Godhia D, Shaikh A (2016) Election result prediction using twitter sentiment analysis.

In: 2016 International Conference on Inventive Computation Technologies (ICICT), vol 1, pp 1–5

Reddy GT, Bhattacharya S, Siva Ramakrishnan S, Chowdhary CL, Hakak S, Kaluri R, Praveen Kumar Reddy

M (2020) An ensemble based machine learning model for diabetic retinopathy classiﬁcation. In: 2020

International Conference on Emerging Trends in Information Technology and Engineering (ic-ETITE), pp

1–6

Ryﬀel T, Trask A, Dahl M, Wagner B, Mancuso J, Rueckert D, Passerat-Palmbach J (2018) A generic

framework for privacy preserving deep learning. In: PRIVACY PRESERVING MACHINE LEARNING

NeurIPS Workshop

Saba T (2020) Recent advancement in cancer detection using machine learning: Systematic survey of decades,

comparisons and challenges. Journal of Infection and Public Health

Salo F, Injadat M, Nassif AB, Shami A, Essex A (2018) Data mining techniques in intrusion detection systems: A systematic literature review. IEEE Access 6:56046–56058

Salo F, Injadat M, Moubayed A, Nassif AB, Essex A (2019) Clustering enabled classiﬁcation using ensemble

feature selection for intrusion detection. In: 2019 International Conference on Computing, Networking

and Communications (ICNC), IEEE, pp 276–281

Sarker A, Ginn R, Nikfarjam A, O’Connor K, Smith K, Jayaraman S, Upadhaya T, Gonzalez G (2015) Utilizing social media data for pharmacovigilance: a review. Journal of biomedical informatics 54:202–212

Schiﬀrin A, Belmonte M (1982) Multiple daily self-glucose monitoring: its essential role in long-term glucose

control in insulin-dependent diabetic patients treated with pump and multiple subcutaneous injections.

Diabetes Care 5(5):479–484

Schiﬀrin A, Mihic M, Leibel BS, Albisser AM (1985) Computer-assisted insulin dosage adjustment. Diabetes

Care 8(6):545–552

Shen L, Margolies LR, Rothstein JH, Fluder E, McBride R, Sieh W (2019) Deep learning to improve breast

cancer detection on screening mammography. Scientiﬁc reports 9(1):1–12

Shi C, Pun CM (2018) Multi-scale hierarchical recurrent neural networks for hyperspectral image classification. Neurocomputing 294:82–93

Sloane R, Osanlou O, Lewis D, Bollegala D, Maskell S, Pirmohamed M (2015) Social media and pharmacovigilance: a review of the opportunities and challenges. British journal of clinical pharmacology 80(4):910–920

Sneyers E, De Witte K (2017) The interaction between dropout, graduation rates and quality ratings in universities. Journal of the Operational Research Society 68(4):416–430

Sommer R, Paxson V (2010) Outside the closed world: On using machine learning for network intrusion

detection. In: 2010 IEEE symposium on security and privacy, IEEE, pp 305–316

Stieglitz S, Dang-Xuan L (2013) Social media and political communication: a social media analytics framework.

Social network analysis and mining 3(4):1277–1291

Symeonidis P, Malakoudis D (2016) Moocrec.com: Massive open online courses recommender system. In:

RecSys Posters

Troussas C, Chrysafiadi K, Virvou M (2018) Machine learning and fuzzy logic techniques for personalized

tutoring of foreign languages. In: International Conference on Artificial Intelligence in Education, Springer

International Publishing, pp 358–362

Truong HM (2016) Integrating learning styles and adaptive e-learning system: Current developments, problems

and opportunities. Computers in human behavior 55:1185–1193

Tsai M, Wang Y, Kwak M, Rigole N (2019) A machine learning based strategy for election result prediction.

In: 2019 International Conference on Computational Science and Computational Intelligence (CSCI), pp

1408–1410

Ullmann TD (2019) Automated analysis of reflection in writing: Validating machine learning approaches.

International Journal of Artificial Intelligence in Education 29(2):217–257

Van Der Aalst W (2016) Data science in action. In: Process mining, Springer, pp 3–23

Vidyasagar M (2015) Identifying predictive features in drug response using machine learning: opportunities

and challenges. Annual review of pharmacology and toxicology 55:15–34

Vormayr G, Zseby T, Fabini J (2017) Botnet communication patterns. IEEE Communications Surveys & Tutorials

19(4):2768–2796

Wang H, Gu J, Wang S (2017) An effective intrusion detection framework based on SVM with feature augmentation.

Knowledge-Based Systems 136:130–139

Wang J, Yang Y, Mao J, Huang Z, Huang C, Xu W (2016) CNN-RNN: A unified framework for multi-label

image classification. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp

2285–2294, DOI 10.1109/CVPR.2016.251

Wang P, Wu L, Aslam B, Zou CC (2015) Analysis of Peer-to-Peer botnet attacks and defenses. In: Propagation

phenomena in real world networks, Springer, pp 183–214

WeBank’s AI (2019) Federated AI technology enabler

Wen Y, Li W, Roth H, Dogra P (2019) Federated Learning powered by NVIDIA Clara. Available at:

https://developer.nvidia.com/blog/federated-learning-clara/

Wilson RA, Keil FC (2001) The MIT encyclopedia of the cognitive sciences. MIT Press

Wu Q, Zhao W (2017) Small-cell lung cancer detection using a supervised machine learning algorithm. In:

2017 International Symposium on Computer Science and Intelligent Controls (ISCSIC), pp 88–91

Xia F, Shukla M, Brettin T, Garcia-Cardona C, Cohn J, Allen JE, Maslov S, Holbeck SL, Doroshow JH,

Evrard YA, et al. (2018) Predicting tumor cell line response to drug pairs with deep learning. BMC bioinformatics

19(18):71–79

Xiao C, Choi E, Sun J (2018) Opportunities and challenges in developing deep learning models using electronic

health records data: a systematic review. Journal of the American Medical Informatics Association

25(10):1419–1428

Xu L, Kinkyo T, Hamori S (2018) Predicting currency crises: A novel approach combining random forests

and wavelet transform. Journal of Risk and Financial Management 11(4):86

Xu R, He M (2020) Application of deep learning neural network in online supply chain financial credit

risk assessment. In: 2020 International Conference on Computer Information and Big Data Applications

(CIBDA), pp 224–232

Yang L, Shami A (2020) On hyperparameter optimization of machine learning algorithms: Theory

and practice. Neurocomputing, DOI 10.1016/j.neucom.2020.07.061, URL

http://www.sciencedirect.com/science/article/pii/S0925231220311693

Yang L, Moubayed A, Hamieh I, Shami A (2019a) Tree-based intelligent intrusion detection system in internet

of vehicles. In: 2019 IEEE Global Communications Conference (GLOBECOM)

Yang Q, Liu Y, Chen T, Tong Y (2019b) Federated machine learning: Concept and applications. ACM Transactions

on Intelligent Systems and Technology (TIST) 10(2):1–19

Zaidi R, Tanveer S (2017) Reviewing anatomy of botnets and botnet detection techniques. International Journal

of Advanced Research in Computer Science 8(5)

Zeng Y, Qiu M, Zhu D, Xue Z, Xiong J, Liu M (2019) DeepVCM: A deep learning based intrusion detection

method in VANET. In: 2019 IEEE 5th Intl Conference on Big Data Security on Cloud (BigDataSecurity),

IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on

Intelligent Data and Security (IDS), pp 288–293

Zhang H, Huang T, Lv Z, Liu S, Zhou Z (2018a) MCRS: A course recommendation system for MOOCs. Multimedia

Tools and Applications 77(6):7051–7069

Zhang T, Zhang W, Wei X, Haijing H (2018b) Multiple instance learning for credit risk assessment with

transaction data. Knowledge-Based Systems 161:65–77

Zhang W, Zhong J, Yang S, Gao Z, Hu J, Chen Y, Yi Z (2019) Automated identification and grading system

of diabetic retinopathy using deep neural networks. Knowledge-Based Systems 175:12–25

Zhu Y, Zhou L, Xie C, Wang GJ, Nguyen TV (2019) Forecasting SMEs’ credit risk in supply chain finance with

an enhanced hybrid ensemble machine learning approach. International Journal of Production Economics

211:22–33

Zisser H, Robinson L, Bevier W, Dassau E, Ellingsen C, Doyle III FJ, Jovanovic L (2008) Bolus calculator:

a review of four smart insulin pumps. Diabetes technology & therapeutics 10(6):441–444


