International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
Volume 6, Issue 10, October 2017, ISSN: 2278 1323
1548
All Rights Reserved © 2017 IJARCET
A Review on Machine Learning Algorithms,
Tasks and Applications
Diksha Sharma1, Neeraj Kumar2
ABSTRACT: Machine learning is a field of computer science that gives
computers the ability to learn without being explicitly programmed. It
is used in a variety of computational tasks where designing and
programming explicit algorithms with good performance is difficult.
Applications include email filtering and the detection of network
intruders or malicious insiders working towards a data breach. One of
the fundamental objectives of machine learning is to train computers to
use data to solve a specified problem. Many applications of machine
learning already exist, such as training a classifier on email messages
to distinguish between spam and non-spam messages, and fraud detection.
In this article we focus on the basics of machine learning, machine
learning tasks and problems, and various machine learning algorithms.
Keywords: Machine learning, supervised
learning, unsupervised learning,
classification
I. INTRODUCTION
Machine learning is a branch of artificial intelligence that allows
computer systems to learn directly from examples, data, and experience.
By enabling computers to perform specific tasks intelligently, machine
learning systems can carry out complex processes by learning from data,
rather than following pre-programmed rules. Increasing data availability
has allowed machine learning systems to be trained on a large pool of
examples, while growing computer processing power has supported the
analytical capabilities of these systems. Within the field itself there
have also been algorithmic advances, which have made machine learning
more powerful. As a result of these advances, systems which performed at
noticeably below-human levels can now outperform humans at some specific
tasks. Many people now interact with systems based on machine learning
every day, for example in image recognition systems. Nowadays the
concept of machine learning is used in many applications and is a core
concept for intelligent systems [1][3]. As the field develops further,
machine learning shows promise of supporting potentially transformative
advances in a range of areas, and the social and economic opportunities
which follow are significant. In healthcare, machine learning is
creating systems that can help doctors give more accurate or efficient
diagnoses for specific conditions. For public services it has the
potential to target support more effectively to those in need, or to
tailor services to users. Machine learning is helping to make sense of
the vast quantity of data available to researchers today, offering new
insights into biology, physics and medicine.
II. MACHINE LEARNING TASKS
Machine learning tasks are typically
classified into three broad categories,
depending on the nature of the learning
"signal" or "feedback" available to a
learning system.
Supervised learning
Unsupervised learning
Reinforcement learning
Supervised learning: This is the machine learning task of inferring a
function from labeled training data. The training data consists of a set
of training examples. A supervised learning algorithm analyzes the
training data and produces an inferred function that can be used to map
new examples. To solve a given supervised learning problem, one has to
carry out the following steps:
(i) Decide the kind of training examples. The user should decide what
kind of data is to be used as a training set.
(ii) Collect a training set. The training set needs to be representative
of the real-world use of the function. Thus, a set of input objects is
collected together with the corresponding outputs.
(iii) Decide the input feature representation of the learned function.
The accuracy of the learned function depends strongly on how the input
object is represented. Normally, the input object is transformed into a
feature vector that contains a number of features that are descriptive
of the object. The number of features should not be too large.
(iv) Decide the structure of the learned function and the corresponding
learning algorithm.
(v) Complete the design. Run the learning algorithm on the gathered
training set. Some supervised learning algorithms require the user to
determine certain control parameters.
(vi) Assess the accuracy of the learned function. After parameter
adjustment and learning, the performance of the resulting function
should be measured on a test set that is separate from the training set.
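The six steps above can be sketched end to end in a few lines of Python. This is only an illustrative sketch, not a method taken from the article: the synthetic two-class dataset, the choice of a nearest-centroid classifier, the 30/10 train/test split and all function names are assumptions made for the example.

```python
import random

def make_dataset(n_per_class, rng):
    """Steps (i)-(ii): collect labeled examples (synthetic here, an assumption)."""
    data = []
    for label, centre in ((0, (0.0, 0.0)), (1, (5.0, 5.0))):
        for _ in range(n_per_class):
            # Step (iii): each input object is represented as a 2-feature vector.
            x = (centre[0] + rng.uniform(-1, 1), centre[1] + rng.uniform(-1, 1))
            data.append((x, label))
    return data

def train_nearest_centroid(train):
    """Steps (iv)-(v): the learned function is 'pick the closest class mean'."""
    sums, counts = {}, {}
    for (x1, x2), y in train:
        s = sums.setdefault(y, [0.0, 0.0])
        s[0] += x1
        s[1] += x2
        counts[y] = counts.get(y, 0) + 1
    return {y: (s[0] / counts[y], s[1] / counts[y]) for y, s in sums.items()}

def predict(centroids, x):
    """Map a new example to the label whose centroid is nearest."""
    def squared_distance(y):
        cx, cy = centroids[y]
        return (x[0] - cx) ** 2 + (x[1] - cy) ** 2
    return min(centroids, key=squared_distance)

def accuracy(centroids, examples):
    """Step (vi): fraction of examples that are labeled correctly."""
    correct = sum(predict(centroids, x) == y for x, y in examples)
    return correct / len(examples)

rng = random.Random(0)
dataset = make_dataset(20, rng)
rng.shuffle(dataset)
train_set, test_set = dataset[:30], dataset[30:]   # test set kept separate
model = train_nearest_centroid(train_set)
test_accuracy = accuracy(model, test_set)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

Because the two synthetic classes are deliberately well separated, the held-out accuracy here is perfect; on real data the same procedure would normally report a lower figure, which is exactly why step (vi) uses examples the learner has not seen.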
Unsupervised learning: This is the machine learning task of inferring a
function to describe hidden structure from "unlabeled" data. Since the
examples given to the learner are unlabeled, there is no assessment of
the accuracy of the structure that is output by the relevant algorithm,
which is one way of distinguishing unsupervised learning from supervised
learning and reinforcement learning. A central case of unsupervised
learning is the problem of density estimation in statistics [1].
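As a small illustration of density estimation from unlabeled data, the sketch below uses a Gaussian kernel density estimate. The article does not prescribe a particular estimator, so the kernel choice, the bandwidth value and the sample values are all assumptions made for the example.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density which generated `samples`."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Each sample contributes one Gaussian bump centred on itself.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Unlabeled observations: no target values are attached to them.
samples = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9]
density = gaussian_kde(samples, bandwidth=0.3)

for x in (1.0, 3.0, 5.0):
    print(f"estimated density at {x}: {density(x):.3f}")
```

The estimate is high near 1 and near 5, where the observations cluster, and close to zero around 3. That grouping is the hidden structure the learner recovers without ever being told that two groups exist.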
Reinforcement learning: A computer program interacts with a dynamic
environment in which it must achieve a certain goal. The program is
given feedback in terms of rewards and punishments as it navigates its
problem space.
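The reward-and-punishment loop can be made concrete with a toy example. The sketch below uses tabular Q-learning, a technique the article does not name, on a hypothetical five-cell corridor; the environment, the reward values and the hyperparameters (alpha, gamma, epsilon) are all assumptions chosen for illustration.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
LEFT, RIGHT = 0, 1

def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    if next_state == N_STATES - 1:
        return next_state, 1.0, True       # reward for reaching the goal
    return next_state, -0.01, False        # small punishment for every other move

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values purely from reward feedback."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(10_000):            # safety cap on episode length
            if rng.random() < epsilon:
                action = rng.choice((LEFT, RIGHT))                       # explore
            else:
                action = max((LEFT, RIGHT), key=lambda a: q[state][a])   # exploit
            next_state, reward, done = step(state, action)
            # Q-learning update: move the estimate towards the observed target.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

q_table = train()
# Greedy policy for the non-goal cells.
policy = ["left" if q[LEFT] > q[RIGHT] else "right" for q in q_table[:-1]]
print(policy)
```

The program is never told that moving right is correct. It only receives the reward at the goal and the small penalty elsewhere, and the learned action values should come to favour moving right in every cell.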
III. MACHINE LEARNING
ALGORITHMS
There are a number of machine learning algorithms, such as Linear
Regression, Logistic Regression, Decision Tree, SVM [2], and KNN. Linear
Regression is used to estimate real values (cost of houses, number of
calls, total sales, etc.) based on continuous variable(s). Here, we
establish a relationship between the independent and dependent variables
by fitting a best-fit line. Logistic Regression is used to estimate
discrete values based on a given set of independent variable(s). In
simple words, it predicts the probability of occurrence of an event by
fitting data to a logit function. Decision Tree is a type of supervised
learning algorithm that is mostly used for classification problems. SVM
(support vector machine) is a classification method. In this algorithm,
we plot each data item as a point in n-dimensional space (where n is the
number of features) with the value of each feature being the value of a
particular coordinate, and then look for the hyperplane that best
separates the classes. K nearest neighbors (KNN) is a simple algorithm
which stores all available cases and classifies a new case by a majority
vote of its k nearest neighbors.
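Three of the ideas in this section are compact enough to sketch directly: fitting a best-fit line, the logistic function that turns a score into a probability, and the KNN majority vote. The function names, the single-feature restriction for the line fit, the Euclidean distance for KNN and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter

def fit_line(xs, ys):
    """Linear regression: least-squares slope and intercept for one feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def logistic(z):
    """Logistic function: maps a linear score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def knn_predict(train, query, k=3):
    """KNN: majority vote among the k stored cases closest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy regression data lying exactly on the line y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)

# Toy labeled cases for KNN: three near (1, 1), two near (6, 6).
cases = [((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
         ((6, 6), "b"), ((7, 6), "b")]
label = knn_predict(cases, (1.5, 1.5), k=3)
print(label)
```

Note that KNN has no training step beyond storing the cases, so all of the work happens at prediction time. That is why its cost grows with the number of stored examples.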
Fig.1: Machine learning algorithms
IV. MACHINE LEARNING
APPLICATIONS
Machine learning algorithms are widely used in a variety of applications
such as digital image processing (image recognition) [5], big data
analysis [4], speech recognition, medical diagnosis, statistical
arbitrage, learning associations, classification, prediction, etc.
V. CONCLUSION
The article illustrates the concept of machine learning along with its
tasks and applications. It also highlights the main types of learning:
supervised learning, unsupervised learning and reinforcement learning. A
detailed procedure for solving a problem using supervised learning has
also been discussed.
VI. REFERENCES
1. Talwar, A. and Kumar, Y., 2013. Machine Learning: An Artificial
Intelligence Methodology. International Journal of Engineering and
Computer Science, 2, pp. 3400-3404.
2. Muhammad, I. and Yan, Z., 2015. Supervised Machine Learning
Approaches: A Survey. ICTACT Journal on Soft Computing, 5(3).
3. Singh, S., Kumar, N. and Kaur, N., 2014. Design and Development of
RFID Based Intelligent Security System. International Journal of
Advanced Research in Computer Engineering & Technology (IJARCET), 3.
4. Sharma, D., Pabby, G. and Kumar, N. Challenges Involved in Big Data
Processing & Methods to Solve Big Data Processing Problems. IJRASET,
5(8), pp. 841-844.
5. Kumar, N. and Gupta, S., 2016. Offline Handwritten Gurmukhi Character
Recognition: A Review. International Journal of Software Engineering and
Its Applications, 10(5), pp. 77-86.
Ms. Diksha completed her B.Tech in Electronics and Communication
Engineering at Chitkara University, Himachal Pradesh. She is now
planning to pursue a master's degree in science abroad.
Mr. Neeraj Kumar is presently working as an Assistant Professor in the
Electronics and Communication Engineering Department at Chitkara
University, Himachal Pradesh, India. He has more than 6 years of
teaching experience. His areas of interest are digital image processing
and digital signal processing.