
Evaluating Students Performance by Artificial Neural Network using WEKA

June 2015
International Journal of Computer Applications 119(23):36-39


DOI:10.5120/21380-4370

Authors:

Jiby J Puthiyidam

Model Engineering College


International Journal of Computer Applications (0975 – 8887)

Volume 119 – No.23, June 2015

Evaluating Students Performance by Artificial Neural Network using WEKA

Sumam Sebastian

M-Tech Computer and Information Science

College of Engineering Poonjar

Jiby J Puthiyidam

Assistant Professor

Dept. of Computer Science and Engineering

College of Engineering Poonjar

ABSTRACT

Data mining, the process of extracting hidden patterns and useful information from large data sets, is becoming part of current innovations. It can now be applied in fields such as marketing, education, and health. Data mining in the field of education is called educational data mining. Educational data mining can help institutions predict the performance of their students so as to improve their academic results. In this paper an artificial neural network is used to predict student performance. A Multilayer Perceptron neural network is used to implement the prediction strategy. The experiment is conducted using WEKA and an available real-world data set.

Keywords

Data Mining; Educational Data Mining; Artificial Neural Network; Multilayer Perceptron Neural Network (MLP); Association Rule Mining

1. INTRODUCTION

Data mining [1] is the process of analyzing data from different perspectives and summarizing it into useful information so as to find hidden patterns in a large data set. Data mining [2] refers to the discovery of implicit, previously unknown, and practically useful information from the data in databases. It uses techniques from machine learning, statistics, and visualization to discover and present knowledge in a form that is easily understandable. The rapid growth of the data mining discipline comes from its large variety of research areas. Data mining applications adopt different kinds of parameters to examine the data. Educational Data Mining [3] is a recently emerged discipline that develops methods for exploring the unique types of data found in educational databases and helps to predict students' academic performance.

Figure 1: Educational Data mining System

An institution needs to maintain good academic results, and for that its students' academic performance has to be kept at a high level. A strategy for continuous evaluation of student performance is therefore needed. Different data mining techniques can be used for this, such as association rule mining, K-means clustering, and artificial neural networks. Among these methods, evaluation using an artificial neural network is considered the most advanced and accurate.

2. OBJECTIVES

The main objectives are, first, to determine the personal and academic factors that affect student performance; second, to transform these factors into a form suitable for system coding; and third, to model a neural network that can predict performance from student data. The main concept used in this paper is the artificial neural network.

3. THE ARTIFICIAL NEURAL NETWORKS

An artificial neural network (ANN) [4], often called a "neural network" (NN), is a computational model based on biological neural networks; in other words, it is a representation and emulation of the human neural system. It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. In practical terms, neural networks are non-linear statistical data modeling tools [5]. They can be used to model complex relationships between inputs and outputs or to find patterns in data. Using neural networks as a tool, data warehousing firms harvest information from data sets in the process known as data mining [6].

Multilayer Perceptron

The most popular form of neural network architecture is the multilayer perceptron (MLP). A multilayer perceptron:

• Has any number of inputs.

• Has one or more hidden layers with any number of units.

• Generally uses sigmoid activation functions in the hidden layers.

• Has connections between the input layer and the first hidden layer, between the hidden layers, and between the last hidden layer and the output layer.

Figure 2: Feed forward neural network


An MLP is especially suitable for approximating a classification function, which assigns an example described by a vector of attribute values to one or more classes. An MLP trained with the back propagation algorithm is used here for data mining.
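The layered structure described above can be illustrated with a small feed-forward computation. The following is only a minimal sketch, assuming sigmoid units in every layer; the network size and the weight values are made up for illustration and are not taken from the experiments in this paper.

```python
import math


def sigmoid(z):
    """Logistic activation assumed for the hidden and output units."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """One fully connected layer: out_j = sigmoid(b_j + sum_i w[j][i] * x_i)."""
    return [
        sigmoid(b + sum(w * x for w, x in zip(row, inputs)))
        for row, b in zip(weights, biases)
    ]


def mlp_forward(inputs, layers):
    """Propagate an input vector through a list of (weights, biases) layers."""
    activation = inputs
    for weights, biases in layers:
        activation = layer_forward(activation, weights, biases)
    return activation


# Hypothetical network: 3 inputs, 2 hidden units, 1 output (made-up weights).
layers = [
    ([[0.5, -0.4, 0.1], [0.3, 0.8, -0.6]], [0.0, 0.1]),  # input -> hidden
    ([[1.2, -0.7]], [0.05]),                              # hidden -> output
]
output = mlp_forward([1.0, 0.0, 1.0], layers)
print(output)
```

Each layer only consumes the activations of the layer before it, which is what makes the network feed-forward.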

4. DATA COLLECTION AND PREPROCESSING

In this research the data was collected from classes 8 and 9 of Sacred Hearts Girls' High School, Bharananganam, Kerala. A data set of 300 students was used for the evaluation, and a neural network is used to predict student performance. The selected attributes are of two types: academic attributes, which relate to the academic details of the student, and personal attributes, which relate to personal details that affect the student's study and performance.

The academic attributes selected are:

1. Interest of study: the student's interest in studying, categorized as low, average, or good.

2. Unit test mark: the average mark in the unit tests conducted, categorized as low, average, or good.

3. Assignment mark: the average of all assignment marks, categorized as average or good.

4. Attendance score: the student's average attendance, categorized as average or good.

5. Extracurricular activities: the student's performance in activities other than studies, categorized as low, average, or good.

6. Residence: whether the student is a hosteler or a day scholar.

The personal attributes selected are:

1. Parents' education, categorized as poor, average, or good.

2. Family status, categorized as low, average, or good.

Data preprocessing is applied to the data set to identify noisy data, missing attribute values, and irrelevant and redundant data.
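The preprocessing step can be pictured as a simple validation pass over the records. The sketch below is an illustration under stated assumptions, not the procedure actually used in this study: the attribute keys, the `student_id` field, and the sample records are all hypothetical, and only the allowed values come from the attribute descriptions above.

```python
# Allowed values per attribute, following the attribute descriptions above.
# The attribute keys and the student_id field are hypothetical names.
ALLOWED = {
    "interest": {"low", "average", "good"},
    "unit_test": {"low", "average", "good"},
    "assignment": {"average", "good"},
    "attendance": {"average", "good"},
    "extracurricular": {"low", "average", "good"},
    "residence": {"hosteler", "day_scholar"},
    "parent_education": {"poor", "average", "good"},
    "family_status": {"low", "average", "good"},
}


def check_records(records):
    """Return (clean, problems): accepted rows and (index, reason) for the rest."""
    clean, problems, seen_ids = [], [], set()
    for i, row in enumerate(records):
        missing = [a for a in ALLOWED if not row.get(a)]
        invalid = [a for a in ALLOWED if row.get(a) and row[a] not in ALLOWED[a]]
        if missing:
            problems.append((i, "missing: " + ", ".join(missing)))
        elif invalid:
            problems.append((i, "invalid: " + ", ".join(invalid)))
        elif row.get("student_id") in seen_ids:
            problems.append((i, "duplicate"))  # redundant record
        else:
            seen_ids.add(row.get("student_id"))
            clean.append(row)
    return clean, problems


# Made-up sample records for illustration only.
base = {
    "interest": "good", "unit_test": "average", "assignment": "good",
    "attendance": "good", "extracurricular": "low",
    "residence": "day_scholar", "parent_education": "average",
    "family_status": "average",
}
records = [
    dict(base, student_id="s1"),
    dict(base, student_id="s2", attendance=""),       # missing value
    dict(base, student_id="s3", assignment="low"),    # value outside the allowed set
    dict(base, student_id="s1"),                      # repeated record
]
clean, problems = check_records(records)
print(len(clean), problems)
```

Rows are rejected rather than repaired here; whether to drop or impute such rows is a design choice the paper does not specify.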

Table 1. Attributes and their possible values

5. METHODOLOGY

In this research WEKA was used for the entire implementation. First, simple training and testing with a multilayer perceptron neural network was carried out. For this, the entire data set was divided into two separate sets: half as the training set and the other half as the testing set. Training was repeated with different learning rates and momentum values, and the best of the training results was taken for analysis. Second, MLP training was done after association rule mining. Association rule mining extracts the important rules, which helps to identify the important attributes. Unnecessary attributes are removed from the data set and the MLP neural network is trained on the remaining ones. This gives a better result than the simple train-and-test method. Third, for the fairest evaluation of the result, the K-fold cross-validation method of MLP training was used. Here the data set is not divided into two fixed sets as before; instead, the whole data set is given as input to the system. 10-fold cross-validation is used: the entire data set is divided into 10 subsets, and in each round one subset is used as the test set while the remaining nine are used as the training set. The evaluation steps are depicted in the following figure.

Figure 3: Work Methodology

5.1 WEKA Environment

WEKA [8] stands for Waikato Environment for Knowledge Analysis. It was developed at the University of Waikato, New Zealand. WEKA supports many data mining tasks, such as data preprocessing, classification, clustering, regression, and feature selection, to name a few.

The supported data formats are ARFF, CSV, C4.5, and binary. Data can also be imported from a URL or an SQL database. After the data is loaded, preprocessing filters can be used for adding or removing attributes, discretization, sampling, randomizing, and so on. WEKA is a collection of machine learning algorithms for data mining and machine learning tasks. It is open source software issued under the GNU General Public License.
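To make the ARFF format concrete, the sketch below writes a small ARFF file for nominal attributes like those in Table 1. It is a hedged illustration: the relation name, the attribute names, the `performance` class attribute with its three labels, and the data row are all assumptions made for this example, since the paper does not list the actual file.

```python
def to_arff(relation, attributes, rows):
    """Render nominal attributes and data rows as ARFF text."""
    lines = ["@relation " + relation, ""]
    for name, values in attributes:
        # A nominal attribute lists its allowed values in braces.
        lines.append("@attribute %s {%s}" % (name, ",".join(values)))
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(row))
    return "\n".join(lines) + "\n"


# Hypothetical attribute names; value sets follow Table 1.
ATTRIBUTES = [
    ("interest", ["low", "average", "good"]),
    ("unit_test", ["low", "average", "good"]),
    ("assignment", ["average", "good"]),
    ("attendance", ["average", "good"]),
    ("extracurricular", ["low", "average", "good"]),
    ("residence", ["hosteler", "day_scholar"]),
    ("parent_education", ["poor", "average", "good"]),
    ("family_status", ["low", "average", "good"]),
    ("performance", ["low", "average", "good"]),  # assumed class attribute
]

# One made-up student record, in the same order as ATTRIBUTES.
rows = [["good", "good", "good", "good", "average",
         "day_scholar", "good", "average", "good"]]

arff_text = to_arff("student_performance", ATTRIBUTES, rows)
print(arff_text)
```

By WEKA's default convention the last attribute is treated as the class, which is why the assumed `performance` attribute is placed last.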

5.2 MLP Training with WEKA

Two sets are used for MLP neural network training in WEKA: a training set and a test set [9].

ATTRIBUTES                 | DESCRIPTION                                                 | VALUES
INTEREST OF STUDY          | Interest of student in studying                             | Low, Average, Good
UNIT TEST MARK             | Average mark of student in unit tests                       | Low, Average, Good
ASSIGNMENT                 | Average of marks of assignments                             | Average, Good
ATTENDANCE                 | Average attendance of student in the class                  | Average, Good
EXTRACURRICULAR ACTIVITIES | Performance of student in extracurricular activities        | Low, Average, Good
RESIDENCE                  | Residence of student while studying, i.e. in hostel or not  | Non-hosteler, Hosteler
PARENT EDUCATION           | Education of the parents of student                         | Poor, Average, Good
FAMILY STATUS              | The total family status of the student                      | Low, Average, Good


• Training set: a set of examples used for learning, that is, to fit the parameters (i.e., weights) of the classifier.

• Test set: a set of examples used only to assess the performance (generalization) of a fully specified classifier.

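The back propagation algorithm is used for network training.

Back propagation algorithm

Initialize all weights to small random numbers.

Until satisfied, do:

• For each training example, do:

1. Input the training example to the network and compute the network outputs.

2. For each output unit $k$, compute the error term $\delta_k = o_k (1 - o_k)(t_k - o_k)$.

3. For each hidden unit $h$, compute the error term $\delta_h = o_h (1 - o_h) \sum_{k} w_{h,k}\, \delta_k$.

4. Update each network weight: $w_{i,j} \leftarrow w_{i,j} + \Delta w_{i,j}$,

where $\Delta w_{i,j} = \eta\, \delta_j\, x_{i,j}$. Here $o$ denotes a unit's output, $t_k$ is the target output of unit $k$, $\eta$ is the learning rate, $w_{i,j}$ is the weight from unit $i$ to unit $j$, and $x_{i,j}$ is the input that unit $j$ receives from unit $i$. The error terms are those of sigmoid units.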

Here we are adjusting the learning rate and momentum to get

a better training result.
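The role of the learning rate and the momentum can be seen in a small from-scratch version of back propagation. This is a sketch under assumptions, not WEKA's implementation: it uses one hidden layer of sigmoid units, a single sigmoid output, per-example updates, and a momentum term of the common form "new step = learning rate × gradient step + momentum × previous step". The OR data set is a toy example, unrelated to the student data.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_mlp(examples, n_hidden, learning_rate, momentum, epochs, seed=0):
    """Train a one-hidden-layer MLP with back propagation and momentum."""
    rng = random.Random(seed)
    n_in = len(examples[0][0])
    # Every weight row ends with a bias weight; start from small random values.
    w_hid = [[rng.uniform(-0.05, 0.05) for _ in range(n_in + 1)]
             for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.05, 0.05) for _ in range(n_hidden + 1)]
    dw_hid = [[0.0] * (n_in + 1) for _ in range(n_hidden)]  # previous steps
    dw_out = [0.0] * (n_hidden + 1)

    for _ in range(epochs):
        for x, target in examples:
            # 1. Forward pass (the constant 1.0 feeds the bias weight).
            xb = list(x) + [1.0]
            hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hid]
            hb = hidden + [1.0]
            out = sigmoid(sum(w * v for w, v in zip(w_out, hb)))
            # 2. Error term of the output unit.
            delta_out = out * (1.0 - out) * (target - out)
            # 3. Error terms of the hidden units (using the pre-update weights).
            delta_hid = [h * (1.0 - h) * w_out[j] * delta_out
                         for j, h in enumerate(hidden)]
            # 4. Weight updates with momentum.
            for j in range(n_hidden + 1):
                dw_out[j] = learning_rate * delta_out * hb[j] + momentum * dw_out[j]
                w_out[j] += dw_out[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    dw_hid[j][i] = (learning_rate * delta_hid[j] * xb[i]
                                    + momentum * dw_hid[j][i])
                    w_hid[j][i] += dw_hid[j][i]
    return w_hid, w_out


def predict(x, w_hid, w_out):
    xb = list(x) + [1.0]
    hb = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hid] + [1.0]
    return sigmoid(sum(w * v for w, v in zip(w_out, hb)))


def squared_error(examples, w_hid, w_out):
    return sum((t - predict(x, w_hid, w_out)) ** 2 for x, t in examples)


# Toy data (logical OR), for illustration only.
OR_DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
untrained = train_mlp(OR_DATA, 2, learning_rate=0.3, momentum=0.2, epochs=0)
trained = train_mlp(OR_DATA, 2, learning_rate=0.3, momentum=0.2, epochs=500)
print(squared_error(OR_DATA, *untrained), squared_error(OR_DATA, *trained))
```

Rerunning `train_mlp` with other `learning_rate` and `momentum` values mirrors, in miniature, the manual tuning described above.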

The main parameters for learning are hiddenLayers, learningRate, momentum, trainingTime (epochs), and seed. The parameter setting in WEKA is given as

weka.classifiers.functions.MultilayerPerceptron -L 0.3 -M 0.2 -N 500 -V 0 -S 0 -E 20 -H

hiddenLayers -- This defines the hidden layers of the neural network. It is a comma-separated list of positive whole numbers, one for each hidden layer. To have no hidden layers, put a single 0 here. This will only be used if autobuild is set. There are also wildcard values: 'a' = (attribs + classes) / 2, 'i' = attribs, 'o' = classes, 't' = attribs + classes.

learningRate -- The amount the weights are updated.

momentum -- Momentum applied to the weights during updating.
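The wildcard arithmetic for hiddenLayers can be worked through with a small helper. This is an illustrative sketch, not WEKA code: the function name is hypothetical, integer division is assumed for the 'a' wildcard, and the counts of 8 attributes and 3 classes are example numbers only. How WEKA counts attribs when nominal inputs are converted internally is not covered here.

```python
def hidden_layer_sizes(spec, n_attribs, n_classes):
    """Expand a hiddenLayers string such as "a" or "4,t" into unit counts."""
    wildcards = {
        "a": (n_attribs + n_classes) // 2,  # assumed integer division
        "i": n_attribs,
        "o": n_classes,
        "t": n_attribs + n_classes,
    }
    sizes = []
    for token in spec.split(","):
        token = token.strip()
        size = wildcards[token] if token in wildcards else int(token)
        if size > 0:  # a single 0 stands for "no hidden layer"
            sizes.append(size)
    return sizes


# Example numbers only: 8 input attributes and 3 classes.
print(hidden_layer_sizes("a", 8, 3))    # (8 + 3) // 2 = 5 hidden units
print(hidden_layer_sizes("t", 8, 3))    # 8 + 3 = 11 hidden units
print(hidden_layer_sizes("4,2", 8, 3))  # two hidden layers with 4 and 2 units
```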

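For fair evaluation, the 'cross-validation' scheme is used.

K-fold Cross Validation

• The data set is randomly divided into k subsets.

• One of the k subsets is used as the test set and the other k-1 subsets are put together to form a training set.

• This is repeated k times so that each subset is used as the test set exactly once, and the results are combined over the k runs.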
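The splitting scheme can be sketched in a few lines. The function below is a plain illustration with a hypothetical name; WEKA's own cross-validation may differ in detail (for example, it may stratify the folds by class), which this sketch does not attempt.

```python
import random


def k_fold_splits(n_instances, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)          # random division of the data
    folds = [indices[i::k] for i in range(k)]     # k roughly equal subsets
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 300 instances and k = 10, as in this study: 270 training and 30 test instances per fold.
splits = list(k_fold_splits(300, 10))
for train, test in splits[:2]:
    print(len(train), len(test))
```

Because every instance lands in exactly one test fold, each student record is used for testing once and for training nine times.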

5.3 Association Rule Mining

In EDM, association rule learning is a conventional and well-researched method for determining interesting relations between attributes in large databases [10]. Association rule mining is mainly intended to identify strong rules in databases using the measures of support and confidence. Support (s) and confidence (c) are two measures of rule interestingness; they reflect, respectively, the usefulness and the certainty of a discovered rule.
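Support and confidence can be computed directly from their usual definitions: the support of an item set is the fraction of records that contain it, and the confidence of a rule A → B is support(A ∪ B) / support(A). The sketch below uses made-up records with hypothetical attribute=value items, purely to show the arithmetic.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of the item set."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante > 0 else 0.0


# Four made-up student records expressed as attribute=value items.
records = [
    {"unit_test=good", "attendance=good", "result=good"},
    {"unit_test=good", "attendance=average", "result=good"},
    {"unit_test=low", "attendance=good", "result=low"},
    {"unit_test=good", "attendance=good", "result=good"},
]

# Rule: unit_test=good -> result=good
print(support(records, {"unit_test=good", "result=good"}))       # 3 of 4 records
print(confidence(records, {"unit_test=good"}, {"result=good"}))  # 3 of the 3 matching records
```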

Apriori Algorithm

Apriori is a seminal algorithm proposed by R. Agrawal and R. Srikant in 1994 for mining frequent item sets for Boolean association rules.

The following lines state the steps in generating frequent item sets in the Apriori algorithm [11].

Let Ck be the candidate item set of size k and Lk the frequent item set of size k. The main steps of an iteration are:

• Find the frequent set Lk-1.

• Join step: Ck is generated by joining Lk-1 with itself (Cartesian product Lk-1 x Lk-1).

• Prune step (Apriori property): any item set of size k − 1 that is not frequent cannot be a subset of a frequent item set of size k, so candidates containing it are removed.

• The frequent set Lk is obtained [11].
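The join and prune steps above can be sketched as a compact level-wise search. This is a simplified illustration rather than WEKA's Apriori implementation: it returns only frequent item sets (no rules), uses a fixed minimum support, and runs on made-up records with hypothetical attribute=value items.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {item set: support} for every frequent item set."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent(candidates):
        # Keep the candidates whose support reaches the threshold.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}

    level = frequent({frozenset([item]) for t in transactions for item in t})  # L1
    result = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: unite pairs from L(k-1) whose union has exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = frequent(candidates)  # Lk
        result.update(level)
        k += 1
    return result


# Five made-up records; U, S, T abbreviate hypothetical items.
U, S, T = "unit_test=good", "assignment=good", "attendance=good"
records = [{U, S, T}, {U, S}, {U, T}, {S, T}, {U, S, T}]
frequent_sets = apriori(records, min_support=0.5)
for itemset, sup in sorted(frequent_sets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

In this toy data every single item and every pair is frequent, while the triple appears in only 2 of 5 records and is dropped at the third level.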

The parameter setting in WEKA for association rule mining is

weka.associations.Apriori -N 10 -T 0 -C 0.9 -D 0.05 -U 1.0 -M 0.1 -S -1.0 -c -1

car -- If enabled class association rules are mined instead of

(general) association rules.

classIndex -- Index of the class attribute. If set to -1, the last

attribute is taken as class attribute.

delta -- Iteratively decrease support by this factor. Reduces

support until min support is reached or required number of

rules has been generated.

6. PERFORMANCE EVALUATION

To evaluate the performance of the above methods of neural network training, different measures are available, such as accuracy, precision, recall, F-measure, and Kappa score. Here accuracy, precision, and recall are considered.

• Accuracy (percent correct)

Accuracy is the percentage of correctly classified instances.

$\text{Accuracy} = \dfrac{TP + TN}{TP + FP + TN + FN}$

• Precision

Precision is the fraction of the instances predicted as a given class that actually belong to that class.

$\text{Precision} = \dfrac{TP}{TP + FP}$

• Recall

Recall measures the ability of a prediction model to select the instances of a certain class from a data set. It is also called sensitivity, and corresponds to the true positive rate.

$\text{Recall} = \dfrac{TP}{TP + FN}$


Here TP is the number of true positives, TN the number of true negatives, FP the number of false positives, and FN the number of false negatives.
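The three measures follow directly from the four counts. The sketch below computes them for one class treated as "positive"; the labels and predictions are made up for illustration and are not results from this study.

```python
def confusion_counts(actual, predicted, positive):
    """Count TP, TN, FP, FN for one class treated as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    return tp / (tp + fn) if tp + fn else 0.0


# Made-up labels: six students, "good" is the positive class.
actual = ["good", "good", "low", "low", "good", "low"]
predicted = ["good", "low", "low", "good", "good", "low"]

tp, tn, fp, fn = confusion_counts(actual, predicted, positive="good")
print(tp, tn, fp, fn)
print(accuracy(tp, tn, fp, fn), precision(tp, fp), recall(tp, fn))
```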

7. RESULTS AND DISCUSSION

The table below shows the values obtained by various

performance evaluation parameters.

Table 2: Results of different neural network training

Columns: No. | Method of training | Learning rate | Momentum rate | Accuracy (%) | Precision | Recall

Simple Train & Test | 0.7 | 0.6 | 0.5 | 0.35

Simple Train & Test after Association Rule Mining | 0.9 | 0.8 | 0.76 | 0.62

K-Fold Cross validation of Train & Test | 0.2 | 0.3 | 0.95 | 0.91

The table lists the method of training, the learning rate, the momentum rate, and the accuracy, precision, and recall obtained at that learning rate and momentum. For each method, the learning rate and momentum that gave the best training result are reported. The values given to learning rate and momentum can range from 0.2 to 1.0. The three training methods give three different results, which makes it easier to select the better method. Accuracy indicates how accurate the training method is. Here, k-fold cross-validation gives better results than the simple MLP train-and-test method.

8. CONCLUSION

This paper presented one use of data mining in the educational field: predicting student performance. An artificial neural network was used for the prediction. K-fold cross-validation gives a more accurate result than the basic training method and training after association rule mining. Association rule mining retrieved the most important attributes that affect student performance: unit test mark, assignment mark, and attendance in class. MLP training with these attributes gave a far better result than simple training. 10-fold cross-validation is used here for the training. The data set considered is a real data set of the marks of 300 students. As future work, fuzzy logic can be applied to improve the results of student performance evaluation.

9. REFERENCES

[1] Han, J. and Kamber, M. "Data Mining: Concepts and Techniques." Morgan Kaufmann Publishers, San Francisco, 2000.

[2] Anwar, M. A. and Ahmed, N. "Knowledge Mining in Supervised and Unsupervised Assessment Data of Students' Performance." 2011 2nd International Conference on Networking and Information Technology, IPCSIT vol. 17, 2011.

[3] http://www.educationaldatamining.org/JEDM/index.php/JEDM

[4] Oladokun, V. O., Adebanjo, A. T., and Charles-Owaba, O. E. "Predicting Students' Academic Performance using Artificial Neural Network: A Case Study of an Engineering Course."

[5] Refaat, M. "Data Preparation for Data Mining Using SAS." Elsevier, 2007.

[6] Kamruzzaman, S. M. and Jehad Sarkar, A. M. "A New Data Mining Scheme Using Artificial Neural Networks." 2011.

[7] Kumar, A. "Artificial Neural Network for Data Mining."

[8] WEKA Manual.

[9] Ha, Jung-Woo. "Classification using Weka (Brain, Computation, and Neural Learning)."

[10] Kabakchieva, D. "Predicting Student Performance by Using Data Mining Methods for Classification." Sofia University "St. Kl. Ohridski", Sofia 1000.

[11] Tanna, P. and Ghodasara, Y. "Using Apriori with WEKA for Frequent Pattern Mining."

[12] Sen, B. and Ucar, E. "Evaluating the achievements of computer engineering department of distance education students with data mining methods." Procedia Technology 1, 262-267, 2012.

IJCATM : www.ijcaonline.org

Forecasting of University Students' Performance Using A Hybrid Model of Neural Networks and Fuzzy Logic

Article

Apr 2023

Artificial intelligence techniques can be applied in forecasting the academic performance of university students, with aim of detecting the factors that influence their learning process which allows instructors and university administration to take more effective actions to increase the university student's performance. Identifying the students' performance will improve the quality of education which will be through analyzing and forecasting the students' performance at the course level and degree level. This research focuses on first-year students' performance in two university-requirement courses, depending on features such as attendance, assessment marks, exams, assignments, and projects. Forecasting the students' performance in the whole degree will depend on these features; high school average, Grade Point Average (GPA) for each semester, drop courses, selected core courses in the degree, period of study, and final GPA. A hybrid Adaptive Neuro-Fuzzy Inference System (ANFIS) model was used toperform the forecasting process. In this way, based on the datasets collected from the selected courses, or the whole degree, the future results can be forecasted and suggestions can be made to carry out corrective steps to improve the final results. The experiments result of the applied models performed that ANFIS-Grid outperforms the ANFIS-Cluster, wherein each model produces the lowest error of 0.7%, where it just fails in one sample from thirteen samples, while the ANFISCluster after modification produces an error equal to 0.15%. Keywords:University Student Performance, Forecasting, Fuzzy logic, Neural Network, Adaptive Neuro-Fuzzy Inference System.

Machine Learning-Based Academic Result Prediction System

Article

Full-text available

Jan 2024

Students’ academic performance is a critical issue as it decides his/her career. It is pivotal for the educational institutes to track the performance record because it can help to enhance the standard of their quality education. Thus, the role of the academic result prediction system comes into existence which uses semester grade point average (SGPA) as a metric. The proposed work aims to create a model that can forecast the SGPA of students based on certain traits. It predicts the result in the form of SGPA of computer science students considering their past academic performance, study, and personal habits during their academic semester using different machine learning models, and to compare them based on different accuracy parameters. Some models that are widely used and are found effective in this field are regression algorithms, classification algorithms, and deep learning techniques. The results conclude that deep learning techniques are the most effective in the proposed work because of their high accuracy and performance, depending upon the attributes used in the prediction.

Anomaly Detection for Modbus over TCP in Control Systems Using Entropy and Classification-Based Analysis

Article

Full-text available

Dec 2023

This article presents a statistical approach using entropy and classification-based analysis to detect anomalies in industrial control systems traffic. Several statistical techniques have been proposed to create baselines and measure deviation to detect intrusion in enterprise networks with a centralized intrusion detection approach in mind. Looking at traffic volume alone to find anomalous deviation may not be enough—it may result in increased false positives. The near real-time communication requirements, coupled with the lack of centralized infrastructure in operations technology and limited resources of the sensor motes, require an efficient anomaly detection system characterized by these limitations. This paper presents extended results from our previous work by presenting a detailed cluster-based entropy analysis on selected network traffic features. It further extends the analysis using a classification-based approach. Our detailed entropy analysis corroborates with our earlier findings that, although some degree of anomaly may be detected using univariate and bivariate entropy analysis for Denial of Service (DOS) and Man-in-the-Middle (MITM) attacks, not much information may be obtained for the initial reconnaissance, thus preventing early stages of attack detection in the Cyber Kill Chain. Our classification-based analysis shows that, overall, the classification results of the DOS attacks were much higher than the MITM attacks using two Modbus features in addition to the three TCP/IP features. In terms of classifiers, J48 and random forest had the best classification results and can be considered comparable. For the DOS attack, no resampling with the 60–40 (training/testing split) had the best results (average accuracy of 97.87%), but for the MITM attack, the 80–20 non-attack vs. attack data with the 75–25 split (average accuracy of 82.81%) had the best results.

Um estudo comparativo da aplicação de redes neurais artificiais na previsão de geração eólica

Article

Full-text available

Apr 2023

O presente trabalho tem como objetivo comparar modelos, de redes neurais artificiais, para previsão de geração eólica. A base de dados fornecida pelo Operador Nacional do Sistema Elétrico (ONS) apresenta uma série histórica de geração de energia, do parque eólico de Icaraizinho, no Ceará, no período entre 2010 e 2020. Modelos de previsão, baseados em redes neurais LSTM (Bidirectional Long Short Term Memory), GRU (Gated Recurrent Units), CNN (Convolutional neural network) e BIGRU-CNN (Bidirectional Gated Recurrent Units - Convolutional neural network), foram implementados na linguagem python, utilizando o framework Keras. Resultados obtidos, dos quatro modelos, foram comparados por meio das métricas RSME (Root Mean Squared Error), MAPE (Mean Absolute Percent Error) e MAE (Mean Absolute Error). Verificou-se, para um horizonte de curto prazo, que o modelo híbrido BIGRU-CNN apresentou melhor desempenho.

Redes neurais aplicadas na predição do preço da soja no estado do Paraná

Article

Full-text available

Jun 2021

Atualmente, o setor agrícola enfrenta o desafio de crescer, de modo competitivo, para atender a demanda interna e manter o espaço conquistado no mercado externo. Produtores, no mercado competitivo da soja, precisam de ferramentas de previsão de preço. As previsões de preço incorporam informações cruciais no momento da comercialização da safra. Neste contexto, este trabalho tem como objetivo aplicar modelos, baseados em redes neurais artificiais, para previsão do preço da saca de soja no estado do Paraná. A base de dados, disponibilizada pelo Centro de Estudos Avançados em Economia Aplicada (CEPEA), apresenta uma série de preços mensal compreendida ente Janeiro/2000 e Agosto/2020. Modelos de previsão, baseados em Redes Neurais LSTM e BLSTM, foram implementados na linguagem Python. Os resultados obtidos, para um horizonte de curto prazo, mostram que os dois modelos de previsão fornecem estimativas confiáveis para o preço da saca de soja.

Assistance System for the Teaching of Natural Numbers to Preschool Children with the Use of Artificial Intelligence Algorithms

Article

Full-text available

Sep 2022

This research was aimed at designing an image recognition system that can help increase children’s interest in learning natural numbers between 0 and 9. The research method used was qualitative descriptive, observing early childhood learning in a face-to-face education model, especially in the learning of numbers, with additional data from literature studies. For the development of the system, the cascade method was used, consisting of three stages: identification of the population, design of the artificial intelligence architecture, and implementation of the recognition system. The method of the system sought to replicate a mechanic that simulates a game, whereby the child trains the artificial intelligence algorithm such that it recognizes the numbers that the child draws on a blackboard. The system is expected to help increase the ability of children in their interest to learn numbers and identify the meaning of quantities to help improve teaching success with a fun and engaging teaching method for children. The implementation of learning in this system is expected to make it easier for children to learn to write, read, and conceive the quantities of numbers, in addition to exploring their potential, creativity, and interest in learning, with the use of technologies.

A Theoretical framework for Harnessing Machine Learning for Digital Forensics in Online Social Networks

Conference Paper

Feb 2024

Online social networks have become a popular platform for communication and information exchange, but also a target for various cybercriminal activities. Digital forensics is essential for investigating and combating these crimes, but it faces challenges in handling the vast amount of data generated on social network platforms. This paper proposes a machine learning-based digital forensic framework for online social networks, consisting of five stages: data identification, preparation, acquisition, examination, and reporting. The framework employs machine learning algorithms such as Artificial Neural Networks, Decision Trees, and Support Vector Machines for data analysis and evidence extraction in the examination stage. The framework aims to automate data processing and analysis, enabling investigators to focus on understanding crime dynamics and reporting. The paper presents the theoretical analysis of the framework, addressing challenges in digital forensics for online social networks. It also discusses theoretical limitations and future research directions to enhance the framework’s capabilities. A proof-of-concept implementation using a real-world dataset is planned to validate its practicality in solving actual digital forensic investigations. The paper contributes to the advancement of digital forensics for a safer online environment.

Modelling of RCC Framed Structure on Sloping Ground using ANN and Random Tree techniques

Conference Paper

Full-text available

Dec 2023

Framed buildings built on hill slopes exhibit structural behaviour that differs from those built on flat ground. Because these structures are unsymmetrical in nature, they draw a high quantity of shear pressures and torsional moments, and their distribution is uneven owing to different column lengths. Because analysing complex structures takes a significant amount of time and effort. The aim of this paper is to create a model to predict seismic analysis parameters using machine learning methods and evaluate the outcomes of various techniques utilised.

International Journal of Innovative Knowledge Concepts

Research

Jun 2023

Improved Multilayer Perceptron Neural Networks Weights and Biases Based on The Grasshopper optimization Algorithm to Predict Student Performance on Ambient Learning

Conference Paper

Jun 2023

Using Apriori with WEKA for Frequent Pattern Mining

Article

Full-text available

Jun 2014

Knowledge exploration from the large set of data,generated as a result of the various data processing activities due to data mining only. Frequent Pattern Mining is a very important undertaking in data mining. Apriori approach applied to generate frequent item set generally espouse candidate generation and pruning techniques for the satisfaction of the desired objective. This paper shows how the different approaches achieve the objective of frequent mining along with the complexities required to perform the job. This paper demonstrates the use of WEKA tool for association rule mining using Apriori algorithm.

Evaluating the achievements of computer engineering department of distance education students with data mining methods

Article

Full-text available

Dec 2012

Recently, the internet technology has become an indispensable part of life, a very useful application that cannot be earlier have made it possible. One of these is distance learning technologies. Due to limitations of traditional learning-teaching methods in classroom activities and practitioners who intend to conduct training activities in the absence of the possibility of communication and interaction among learners with special education units are prepared and provided a wide range of media center through a certain method of teaching. According to a further recognition of Distance Education, although far away from each other with the student who teaches the same time (synchronous) or different time (asynchronous) communications with a tool as training system established. The aim of this study is to compare the achievements of Computer Engineering Department students in Karabük University according to criteria such as age, gender, type of high school graduation and whether the students studying in distance education or regular education using data mining techniques. Also discussing the differences of the techniques according to the results and to make suggestions for which technique would be more effective.

Predicting Students Academic Performance using Artificial Neural Network: A Case Study of an Engineering Course

Article

Full-text available

The observed poor quality of graduates of some Nigerian Universities in recent times has been partly traced to inadequacies of the National University Admission Examination System. In this study an Artificial Neural Network (ANN) model, for predicting the likely performance of a candidate being considered for admission into the university was developed and tested. Various factors that may likely influence the performance of a student were identified. Such factors as ordinary level subjects' scores and subjects' combination, matriculation examination scores, age on admission, parental background, types and location of secondary school attended and gender, among others, were then used as input variables for the ANN model. A model based on the Multilayer Perceptron Topology was developed and trained using data spanning five generations of graduates from an Engineering Department of University of Ibadan, Nigeria's first University. Test data evaluation shows that the ANN model is able to correctly predict the performance of more than 70% of prospective students.

A New Data Mining Scheme Using Artificial Neural Networks

Article

Full-text available

Dec 2011
SENSORS-BASEL

Classification is one of the data mining problems receiving enormous attention in the database community. Although artificial neural networks (ANNs) have been successfully applied in a wide range of machine learning applications, they are however often regarded as black boxes, i.e., their predictions cannot be explained. To enhance the explanation of ANNs, a novel algorithm to extract symbolic rules from ANNs has been proposed in this paper. ANN methods have not been effectively utilized for data mining tasks because how the classifications were made is not explicitly stated as symbolic rules that are suitable for verification or interpretation by human experts. With the proposed approach, concise symbolic rules with high accuracy, that are easily explainable, can be extracted from the trained ANNs. Extracted rules are comparable with other methods in terms of number of rules, average number of conditions for a rule, and the accuracy. The effectiveness of the proposed approach is clearly demonstrated by the experimental results on a set of benchmark data mining classification problems.

Knowledge Mining in Supervised and Unsupervised Assessment Data of Students’ Performance

Conference Paper

Jan 2011

Muhammad Abaidullah Anwar

Data Mining: Concepts and Techniques

Article

Jan 2006

Data Preparation for Data Mining Using SAS

Book

Jan 2007

Mamdouh Refaat

Are you a data mining analyst who spends up to 80% of your time assuring data quality and then preparing that data for developing and deploying predictive models? Do you find plenty of literature on data mining theory and concepts, but little how-to information when it comes to practical advice on developing good mining views? And are you, like most analysts, preparing the data in SAS? This book is intended to fill this gap as your source of practical recipes. It introduces a framework for the process of data preparation for data mining, and presents the detailed implementation of each step in SAS. In addition, business applications of data mining modeling require you to deal with a large number of variables, typically hundreds if not thousands. Therefore, the book devotes several chapters to the methods of data transformation and variable selection.

FEATURES
* A complete framework for the data preparation process, including implementation details for each step.
* The complete SAS implementation code, which is readily usable by professional analysts and data miners.
* A unique and comprehensive approach for the treatment of missing values, optimal binning, and cardinality reduction.
* Assumes minimal proficiency in SAS and includes a quick-start chapter on writing SAS macros.
* CD includes dozens of SAS macros plus the sample data and the program for the book's case study.

It is easy to write books that address broad topics and ideas, leaving the reader with the question "Yes, but how?" By combining a comprehensive guide to data preparation for data mining along with specific examples in SAS, Mamdouh's book is a rare find: a blend of theory and the practical at the same time. As anyone who has mined data will confess, 80% of the problem is in data preparation; Mamdouh addresses this difficult subject with strong practical techniques and methods. If you are working on an SAS data mining project, this book is a must! If you are working on any data mining project, the techniques and methods will be a guiding light! --Frank Byrum, Cormine Intelligent Data, LLC.

Using Neural Network for Data Mining

Conference Paper

Feb 2008

Dr. Ashutosh Kumar Bhatt

Neural networks have been successfully applied in a wide range of supervised and unsupervised learning applications. Neural-network methods are not commonly used for data-mining tasks, however, because they often produce incomprehensible models and require long training times. In this article we describe neural-network learning algorithms that are able to produce comprehensible models and that do not require excessive training times. Specifically, we discuss two classes of approaches for data mining with neural networks. The first type of approach, often called rule extraction, involves extracting symbolic models from trained neural networks. The second approach is to directly learn simple, easy-to-understand networks. We argue that, given the current state of the art, neural-network methods deserve a place in the tool boxes of data-mining specialists.

Classification using Weka (Brain, Computation, and Neural Learning)

Jung-Woo Ha

Jung-Woo Ha. "Classification using Weka (Brain, Computation, and Neural Learning)."
