PREDICTING UNIVERSITY DROPOUT BY USING CONVOLUTIONAL
NEURAL NETWORKS
Mauro Mezzini, Gianmarco Bonavolontà, Francesco Agrusti
Roma Tre University, Department of Education (ITALY)
Abstract
Current trends in graduation rates show that 39% of young adults on average across OECD countries
are expected to complete tertiary-type A (university level) education during their lifetime. According to
Eurostat in 2017, an average of 10.6% of young people (aged 18-24) in the EU-28 were early leavers
from education and training. Over 3 million young people in the European Union had been to university
or college but had discontinued their studies at some point in their life, according to a survey of 2016.
Therefore, the dropout level could potentially represent one of the major issues to be faced in the near
future in the European Union.
The main aim of the research is to predict, as early as possible, which kind of student is more likely to
drop out of Higher Education (HE). This information would allow one to carry out effective targeted
actions in order to limit the incidence of the phenomenon.
Today, Artificial Intelligence (AI) is being employed to replace repetitive human activities, e.g. in the
autonomous driving field or for the image classification task. In these areas AI competes with humans
with quite satisfactory results and, in the case of HE dropout, it is extremely unlikely that an expert
teacher would be able to "predict" a student's educational success based only on the data provided by
administrative offices.
The recent breakthrough on Neural Networks with the use of Convolutional Neural Networks (CNN)
architectures has become disruptive in AI. By stacking together tens or hundreds of convolutional neural
layers, a deep network structure is obtained, which has been proved very effective in producing high
accuracy models.
In this study the administrative data of approximately 6,000 students enrolled from 2009 onwards in the
Department of Education at Roma Tre University were used to train the CNNs. Then, the trained network
provides a probabilistic model that indicates, for each student, the probability of dropping out. We used
several types of state-of-the-art CNNs, and their variants, in order to build the most accurate model for
the dropout prediction. The accuracy of the obtained models ranged from 67.1% for the students at the
beginning of the first year up to 88.7% for the students at the end of the second year of their academic
career. With the use of more data, for example students’ career data, we could develop more accurate
dropout prediction models.
Keywords: university dropout; convolutional neural networks; artificial intelligence
1 INTRODUCTION
There are several variables that influence the decision of students to leave their studies at university
level [1]. This phenomenon is known as college dropout and it has been defined by Larsen and other
researchers [2] as the “withdrawal from a university degree program before it has been completed” (p.
18). This notion also includes dropout from single courses of study, but not withdrawals due to
pregnancy, illness, etc. Higher Education (HE) student dropout has several negative effects: it has
consequences on a personal level and on a family level and, from a systemic point of view, low
completion rates can lead to a skills bottleneck, with consequences on the economic and social level
that decrease competitiveness, innovation and productivity.
A comparative study on higher education dropout and completion in Europe found that study success
is considered important in 28 out of 35 participating countries [3]. Early recognition of the dropout
phenomenon is the prerequisite for reducing dropout rates: several studies highlight the importance of
monitoring students' individual and social characteristics, since these have a strong impact on students'
probability of success in HE. Therefore, a strategic goal of the Europe 2020 strategy is to reduce
university dropout, with the aim of having at least 40% of 30-34-year-olds complete higher education [3].
As reported in the literature, students generally leave Higher Education institutions (HEI) during their
first year of college [4,5], just after upper secondary school: in this period, they have to develop their
sense of responsibility and self-regulation [6,7]. Individual skills and dispositions are investigated in
several psychological and pedagogical models in relation to the early dropout phenomenon, in terms
of personality characteristics [8,9]. The impact of students' economic and social status (e.g. race or
income) and of the organizational services provided to students by HEIs (e.g. faculty-student ratios) has
also been explored in numerous studies [9,10,11,12,13].
For decades, one of the most used and discussed models has been Tinto's student integration model,
which underlines the importance of students' academic and social integration in forecasting the dropout
phenomenon [14,15,16]. Another major model is the one proposed by Bean [17], the student attrition
model, based on attitude-behaviour interactions [18]. In all these models, the relation between
students and institutions is of crucial importance in reducing dropout rates [19,20], and several variables
have been identified to improve student retention [2,21].
In Italy, due to the very high rates of university student dropout [22], several specific studies have been
conducted [23,24,25].
Relatively recent advances in neural networks (NN) have shown that Artificial Intelligence (AI) may be
able to compete with (or even surpass) human ability in the tasks of classification and recognition.
Some of the most relevant applications of AI to dropout prediction are presented below.
2 RELATED WORKS
Several research projects using data mining to predict or discover patterns of dropout have been
developed. In this section, we therefore discuss previous works that investigated university student
dropout and performance prediction using Educational Data Mining (EDM) techniques, a relatively new
approach that applies computerized methods to analyse collections of educational data that would
otherwise be impossible to examine due to their enormous volume [26,27].
Decision Tree (DT) algorithms are widely used for the classification problem of predicting university
dropout. A research project conducted at the University of Chittagong (Bangladesh) investigated whether
enrolment data alone can be used to predict the study outcome of newly enrolled students [28]. The models
were developed with the Classification And Regression Tree (CART) and Chi-squared Automatic Interaction
Detector (CHAID) algorithms and evaluated using cross-validation and misclassification errors to decide
which model outperforms the others in terms of classification accuracy.
Another research project that aimed to identify patterns of student dropout from 6870 records and 62
attributes (data of students of cohorts 2004, 2005 and 2006) belonging to socio-economic, academic,
and institutional data using DT (J48 algorithm), was conducted in Latin America and was funded by the
Ministry of Education in Colombia and counterpart funds from the University of Nariño and CESMAG
University Institution [29]. The cross-validation method was used to evaluate the quality and prediction
accuracy of the discovered patterns, with a resulting confidence greater than 80%.
Similarly, a research project conducted in India developed an improved DT based on ID3 able to
predict which university students will drop out [30]. A dataset of 240 samples with 32 variables, collected
randomly through a survey at the university, was used for this study. The performance of the two models
(ID3, and ID3 improved by using Renyi entropy) was evaluated using accuracy, precision, recall and
F-measure. The result shows an accuracy of 97.50% for the improved ID3 algorithm.
In 2018 a study presented a classification based on DT (the C4.5 algorithm) with optimized parameters to
predict university student dropout [31]. The study analysed 5288 cases of students belonging
to a Chilean public university (cohorts of students from 44 undergraduate programs in the areas
of humanities, arts, education, engineering, and health). The attributes selected for the analysis were
related to the demographic variables of the student prior to admission to university, to their economic
situation and to data on academic performance. The result is an accuracy of 87.27%.
Other researchers used specific methodologies, like CRISP-DM (Cross Industry Standard Process for
Data Mining), to develop a model that can predict, at the end of the first semester, students at risk of
dropout. The model types used by this research are DT, Artificial Neural Network (ANN) and Logistic
Regression (LR), with a dataset spanning 8 years, from 1999 to 2006 (over 25,000 students
and 39 variables for each student), and an accuracy of 81.19% for the ANN model [32].
Another study used an ANN for detecting students at risk of dropout [33]. The population consists of 810
students enrolled for the first time in a health care professions degree course at the University of Genoa
in the academic year 2008-09; the data came from administrative sources, an ad hoc survey and
telephone interviews. Other studies carried out comparative analyses of several classification methods
in order to develop models to predict student dropout.
An example is the work conducted at the Colleges of Technology of the Federal Institute of Education,
Science and Technology of Mato Grosso [34]. It presented a model using a Fuzzy-ARTMAP Neural
Network and only the enrolment data collected over a seven-year period, from 2004 to 2011. The results
show an accuracy rate of over 85%.
At the Universidade Federal do Rio de Janeiro, in Brazil, a research project was conducted whose goal
was to predict the risk of students dropping out of their undergraduate education [35]. The study compared
DT, SimpleCart (SC), Support Vector Machine (SVM), Naïve Bayes (NB), and ANN, and the dataset used
came from 14,000 students. The SVM classifier presented the highest true positive rate for all datasets
used in the experiments.
Another study, at the Budapest University of Technology and Economics [36], used data of 15,285
undergraduate students regarding both their secondary school and university performance, and employed
and evaluated several algorithms (DT, Random Forest, Gradient Boosted Trees, LR, Generalized Linear
Model, Deep Learning) to identify students at risk of dropout. The accuracy, recall, precision and AUC
(area under the curve) of the ROC were used to evaluate the models, and the results highlighted that the
best model was the one developed with Deep Learning, with an accuracy rate of 73.5%.
A similar work [37] used five classification algorithms (LR, Gaussian Naive Bayes, SVM, Random Forest
and Adaptive Boosting) to predict dropout on data from 4432 students of the degree studies in Law,
Computer Science and Mathematics of the University of Barcelona between the years 2009 and 2014. It
found that all the machine learning algorithms reached an accuracy of around 90%.
Another model to predict dropout was developed at the Instituto Tecnológico de Costa Rica using
Random Forest, SVM, ANN and LR [38]. The dataset used in this study was gathered from 16,807 students
enrolled between 2011 and 2016. The results showed that the best algorithm for classifying dropouts
was Random Forest.
The studies mentioned above use different data, algorithms, performance metrics and methodologies;
due to this heterogeneity it is not possible to say that one model is better than another, but all
the studies confirm the effectiveness of the data mining approach for analysing and predicting university
dropout. The key difference between these related studies and our approach is that we apply
Convolutional Neural Networks (CNNs), originally introduced to classify images, to the analysis of
educational data.
3 METHODOLOGY
One of the most important problems in the field of AI is the classification problem [39]. In the classification
problem we have an object, which can be an image, a sound or a written sentence, and we want to
associate to each object a class taken from a finite set of predefined classes. For example, in the
image classification problem the solution consists in associating to each image a proper class according
to some interpretation rule. In this case, a natural interpretation rule would be to label each image with
the subject it contains. Put in more mathematical terms, if we represent each object as an
$n$-dimensional vector of real numbers $x \in \mathbb{R}^n$, the solution of the classification problem consists in
finding a function $c$ that associates to each object $x$ its class $c(x)$.
A NN can in fact be viewed as a function $N$ that takes as input an $n$-dimensional vector $x$ and produces
a value $N(x)$, called a prediction on $x$. The prediction is correct when $N(x) = c(x)$ and incorrect otherwise.
To obtain a NN that produces correct predictions, the NN must undergo a training process. It
consists of feeding the NN with a set of objects, called the training set and denoted as $\{(x_i, c(x_i))\}_{i=1}^{m}$,
where $m$ is the number of elements of the training set. The class $c(x_i)$ of each object in the
training set is already known. For each object $x_i$ in the training set, the value $c(x_i)$ is compared to the
prediction $N(x_i)$ of the NN. If the prediction $N(x_i)$ differs from the class $c(x_i)$, the NN is
modified according to some optimization rule [40] in order to correct the error. This process is
repeated hundreds of times over all elements of the training set, or until no improvement in the error
rate is achieved. The error rate is the fraction of incorrect predictions out of the
total number of objects in the training set. This process is also called supervised learning, since it is
similar to the training process that is usually employed with humans or animals.
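As a minimal illustration of the error rate defined above (a sketch of our own; the array contents are illustrative and not taken from the paper's code), the fraction of incorrect predictions can be computed as follows:

```python
import numpy as np

def error_rate(predictions, classes):
    """Fraction of objects whose predicted class N(x) differs from the known class c(x)."""
    predictions = np.asarray(predictions)
    classes = np.asarray(classes)
    return float(np.mean(predictions != classes))

# Example: 2 wrong predictions out of 5 objects -> error rate 0.4
print(error_rate([1, 0, 1, 1, 0], [1, 1, 1, 0, 0]))
```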
Among the different types of NN, Convolutional Neural Networks (CNN) have gained much popularity
recently, when cutting-edge breakthroughs were obtained in the image classification task [41]. In
their work, Krizhevsky et al. [41] trained a deep convolutional model on the 1.2 million images of the
Imagenet challenge training set, with 1000 different classes, reducing the error by almost 50% with
respect to the previous state of the art in image classification. Since then, much research and many
advances on CNNs have been accomplished.
In a classical NN the input object $x$ is fed, at the beginning, to a set of neurons, called the first layer.
Every neuron $j$ receives in input a copy of the object $x$, then performs some computation using an
$n$-dimensional vector $w_j$ of weights and gives in output a real number called an activation.
The output of a first layer of $k$ neurons can therefore be seen as a $k$-dimensional vector of reals, which
can in turn be fed to a second layer of neurons. In this way layers of neurons can be stacked
together to form a more complex and powerful network. The problem is that the number of weights of
the network, and thus the memory and computational resources required, can be very high as the number of
layers and the number of neurons per layer increase. For example, consider an image whose resolution
is quite small even for outdated smartphones: its numerical encoding still results in a vector with a very
large number of components, so that with just a few neurons in the first layer the number of weights for
the first layer alone already reaches 3 million. A key difference
between a CNN and a classical NN is that not all components of the input object are fed to each neuron, but
only a portion of them [42]. In this way the number of weights of each neuron can be reduced drastically
while, at the same time, each layer can still produce a very large number of activations in output.
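To make the weight-count argument concrete, the following sketch compares the number of parameters of a small fully connected layer on a flattened image with that of a convolutional layer on the same image. It is our own illustration: the 1000x1000 RGB image size and the layer sizes are assumptions chosen for the example, and only the order of magnitude matters here.

```python
import tensorflow as tf

# Fully connected: 3 neurons, each connected to every component of a
# flattened 1000x1000x3 image -> 3 * 3,000,000 weights (plus 3 biases).
dense = tf.keras.Sequential([
    tf.keras.layers.Dense(3, input_shape=(1000 * 1000 * 3,)),
])
print(dense.count_params())  # 9000003

# Convolutional: 32 filters of size 3x3; each neuron sees only a 3x3x3
# patch of the image -> 3*3*3*32 weights (plus 32 biases).
conv = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, kernel_size=3, input_shape=(1000, 1000, 3)),
])
print(conv.count_params())  # 896
```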
We employed three different CNN architectures in order to test their effectiveness for our predictive
model. The first two architectures represent the state of the art of CNNs and performed the best, or among
the best, against industrial benchmarks as of 2017 [43, 44]. The third architecture was built by
us by making several modifications to the ResNet [43] and VGG [45] architectures.
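The paper does not detail the layer configurations of the three architectures, so the following is only a rough, hypothetical sketch of the kind of adaptation described: a small ResNet-style residual block [43] applied to a 1D encoding of the student record instead of an image. All layer sizes and names are our own illustrative choices, not the authors' actual models.

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block_1d(x, filters, kernel_size=3):
    """ResNet-style block [43] adapted to 1D input (illustrative only)."""
    shortcut = x
    y = layers.Conv1D(filters, kernel_size, padding="same", activation="relu")(x)
    y = layers.Conv1D(filters, kernel_size, padding="same")(y)
    if shortcut.shape[-1] != filters:  # project the shortcut so the channel counts match
        shortcut = layers.Conv1D(filters, 1, padding="same")(shortcut)
    return layers.Activation("relu")(layers.Add()([y, shortcut]))

def build_model(num_fields, num_classes=2):
    """Tiny residual CNN over the vector of administrative fields (hypothetical sizes)."""
    inputs = tf.keras.Input(shape=(num_fields, 1))  # one channel per encoded field value
    x = residual_block_1d(inputs, filters=32)
    x = residual_block_1d(x, filters=64)
    x = layers.GlobalAveragePooling1D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)
```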
We collected, from the administration office of Roma Tre University (R3U), a dataset of students enrolled
in the Department of Education (DE). The years of enrollment range from 2009 to 2014, for a total of
6078 students. Of these, 649 students were still active at the time we acquired the dataset (August 2018),
that is, they were still in the course of their studies, while the remaining 5429 had closed the course of
their studies, either because they graduated, because they dropped out or for other reasons explained
later. We refer to this set of students as the non-active students. The administrative rules of R3U
establish a time limit of 9 years for the completion of the studies, that is, a student can be enrolled for
up to 9 years. If a student does not graduate within this time, the course of studies of the student is
closed without graduation. We regarded these cases in the same manner as those in which the student
dropped out.
In general, each non-active student is classified in one of three different classes: Graduated,
Dropout, Other. The class Other, totaling 118 students, contains students who neither dropped out
nor graduated in the DE, for example students changing faculty within R3U or moving to another
university. The number of graduated students is 2833, while the number of students who dropped out is
2478.
We obtained, from the R3U administrative office, most of the administrative fields available for all
students. Table 1 reports the list of administrative fields that were used.
Table 1. List of administrative attributes.

List 1 (attributes that do not change during the academic career): Year of beginning of studies; Year of
birth; Gender; Country of birth; High school type; High school exit score; High school maximum exit
score; Year ending high school; Transferred from other university; CFU from other university; Faculty.

List 2 (attributes that may change from year to year): Academic year; Course code; Course name;
Course year; Family income class; Working status; Exemption from taxes; Type of exemption from
taxes; Handicap; Part time status; Part time CFU; Type of renew of enrolment.
Due to privacy reasons, some information, even though available, was not disclosed. In fact,
it was not possible to know the city of birth or the city of residence of the students. Other attributes, such
as the working status, were not collected systematically and accurately by the administrative office: in
the dataset we found only a few students with the status of worker, while it is well known that a
considerable number of students enrolled in the R3U DE already work as educators.
Note that the values of the attributes in List 2 of Table 1 may change for the same student from year to
year during his/her academic career, while the values of the attributes in List 1 do not change during
the whole academic career.
In order to construct the training set, we need to associate to each student a numeric representation.
All the domains of the dataset are converted, using an arbitrary bijective function, to a
nonnegative integer domain. For example, the domain of the field gender, which consists of the two strings
{"male", "female"}, was converted to the domain {0, 1}, where 0 corresponds to "male" and 1 to "female".
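As an illustration of this encoding step (a sketch with a made-up column, not the authors' actual code), each categorical field can be mapped to consecutive nonnegative integers, for example with pandas:

```python
import pandas as pd

students = pd.DataFrame({"gender": ["male", "female", "female", "male"]})

# Build an arbitrary bijective mapping from each distinct value to an integer code.
codes, categories = pd.factorize(students["gender"])
students["gender"] = codes           # e.g. "male" -> 0, "female" -> 1
print(dict(enumerate(categories)))   # {0: 'male', 1: 'female'}
```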
We create a table student whose schema S (that is, its field list) contains all fields of List 1 of Table 1.
For each field f of List 2, we added to S nine fields, denoted as $f_y$ where $y = 0, \dots, 8$, that is, one for each
of the possible 9 years in which the student can be enrolled. If a student ends her/his career at year $t$,
then $f_y$ is set equal to a fixed placeholder value for all $y > t$. This value, which was chosen arbitrarily,
should be considered as a NULL value and does not appear in the domain of any field in the dataset.
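A hedged sketch of how such a "wide" record can be built (the field names, the example values and the placeholder standing in for the NULL value are illustrative assumptions; the paper does not report the actual placeholder):

```python
FILL = -1        # stands in for the paper's unreported NULL placeholder
MAX_YEARS = 9    # administrative limit on the length of a career

def spread_over_years(record, field, yearly_values, career_length):
    """Spread a List 2 field over 9 per-year columns, padding the years after the career ends."""
    for y in range(MAX_YEARS):
        record[f"{field}_{y}"] = yearly_values[y] if y < career_length else FILL
    return record

record = {"gender": 1, "high_school_exit_score": 85}   # List 1 fields (illustrative)
record = spread_over_years(record, "course_year", [1, 2, 2], career_length=3)
```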
We partitioned the set of non-active students into three mutually disjoint sets. We randomly chose
4532 students to form the training set, 450 students to form the validation set and the remaining 447 to
form the test set. We explain the use of these sets in detail in the following.
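A minimal sketch of the random partition (the seed is our own choice; the sizes are those reported above):

```python
import numpy as np

rng = np.random.default_rng(seed=0)     # seed chosen only for reproducibility of the sketch
indices = rng.permutation(5429)         # one index per non-active student
train_idx = indices[:4532]              # training set
val_idx = indices[4532:4982]            # validation set (450 students)
test_idx = indices[4982:]               # test set (remaining 447 students)
```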
If we want to make a prediction on a student at the moment of enrollment, we have only the data of this
student up to year 0. In general, for a student beginning academic year $y$ we know only the data up to
year $y$. Therefore, we created 9 tables, denoted as student_y, by keeping all fields $f_t$ with $t \le y$ and
discarding all other fields. For example, the schema of table student_0 contains only the fields in List 1 of
Table 1 and, for each field $f$ of List 2 of Table 1, only $f_0$. Each table student_y, $y = 0, \dots, 8$, is
used as the training set for year $y$.
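Continuing the sketches above, each table student_y can be obtained by keeping the List 1 columns plus, for every List 2 field, only the per-year columns up to year y (the column-naming convention is our own):

```python
import pandas as pd

def project_to_year(students, year, list1_fields, list2_fields):
    """Build the student_y table: List 1 fields plus f_0 ... f_year for each List 2 field f."""
    keep = list(list1_fields)
    for field in list2_fields:
        keep += [f"{field}_{y}" for y in range(year + 1)]
    return students[keep]

# Example: student_0 keeps only the information known at enrollment time.
# student0 = project_to_year(students, 0,
#                            ["gender", "high_school_exit_score"], ["course_year"])
```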
We used these training sets to train the three architectures mentioned before, for each year up to year 3.
In fact, after year 3 the number of students who drop out becomes a tiny fraction
of the students who exit with a graduation. Furthermore, as explained below, the validation set becomes
smaller and smaller as the year of enrollment increases, making the evaluation statistics less
significant. We ran the training for 100 epochs, where an epoch denotes that the whole training set
has been fed to the CNN once. In other words, we fed the whole training set to the CNN 100 times and
then stopped because, in general, beyond 100 epochs we detected no significant improvement in the
accuracy. We selected the model which gave the best accuracy on the validation set. The validation set
in turn changes depending on the year: for year 0 all the students in the validation set are considered,
while for year $y$ only students still active in that year are considered. This reduces the validation
set as the year increases, since the number of students still active at year $y + 1$ is always less than the
number of students active in year $y$.
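A sketch of this training procedure in Keras (assuming encoded student arrays x_train, y_train, x_val, y_val and a build_model helper like the illustrative one sketched earlier in this section; the file name, optimizer and batch size are our assumptions, not reported in the paper):

```python
import tensorflow as tf

# x_train/x_val: encoded student records, shape (num_students, num_fields, 1)
# y_train/y_val: 0/1 labels (graduated vs dropped out)
model = build_model(num_fields=x_train.shape[1])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Keep only the weights that achieve the best accuracy on the validation set.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    "best_model_year0.h5", monitor="val_accuracy",
    save_best_only=True, mode="max")

model.fit(x_train, y_train,
          validation_data=(x_val, y_val),
          epochs=100, batch_size=64,
          callbacks=[checkpoint])
```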
4 RESULTS
The training of CNNs is a time-consuming task even for powerful CPUs. Therefore, we used a GPU
to run all the experiments. We used state-of-the-art open source software libraries to
implement the CNNs: all the experiments were implemented using TensorFlow with the Keras library [46, 47].
Python and MySQL were used, respectively, as the programming language and as the database system.
On an NVIDIA QUADRO P2000, each epoch took between 40 and 73 seconds to complete, depending on
the architecture used in the training. On average the training of a single model took
about 1 hour.
Table 2. Accuracy on the validation set.

year    arch. 1    arch. 2
0       67.1%      66.7%
1       78.1%      77.8%
2       86.0%      88.7%
3       83.1%      82.3%
In Table 2 we report, for each year, the best accuracy obtained by the architectures mentioned in the
previous section. We note that the accuracy of the models on the validation set is low
in years 0 and 1 but increases dramatically in years 2 and 3. This is likely due to the fact that
most of the dropouts occur between the beginning of the course of studies and the beginning of year
2 and, at the same time, the validation set contains more information about each student.
Figure 2. Graphs of the accuracy on the training and validation sets during the training process of the
architecture 1. Left: training set and validation set taken from the table student0. Right: training set and
validation set taken from the table student1.
In Fig. 2 we report the plot of the accuracy on the training set and on the validation set during the training
process. We report the data from the training of architecture 1 for year 0 and year 1; for the
other architectures the plots follow a similar pattern. Unfortunately, we can observe that for
the students at the beginning of their course of studies (i.e. at year 0) the model quickly overfits and, at
the same time, the accuracy on the validation set decreases. Furthermore, the maximum value of the
accuracy is attained in the first epochs, while at the same time the variance can be very high.
Table 3. Accuracy on the test set.

year    arch. 1    arch. 2
0       62.2%      56.4%
1       71.1%      72.0%
This is reflected in the accuracy on the test set, the results of which, for the first two years, are reported
in Table 3. We can observe a great variability due to random fluctuations in the models tested during the
early phase of training. This in turn suggests that the use of an ensemble of models would reduce the
uncertainty and increase the accuracy of the prediction process.
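A minimal sketch of the ensembling idea suggested here (our own illustration, not part of the reported experiments): average the class probabilities produced by several independently trained models and take the most probable class.

```python
import numpy as np

def ensemble_predict(models, x):
    """Average the class probabilities of several trained Keras models and pick the argmax."""
    probs = np.mean([m.predict(x, verbose=0) for m in models], axis=0)
    return np.argmax(probs, axis=1)
```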
5 CONCLUSIONS AND FUTURE RESEARCH
We employed state-of-the-art Convolutional Neural Networks to build a model which predicts whether a
student will drop out of higher studies. We implemented several models, using real data
of students of the DE at R3U enrolled between 2009 and 2014. The results obtained are quite
encouraging, given that the dataset provided was limited due to privacy constraints and that some
important information about the students was not (accurately) collected, or even considered, by the
administration office of R3U. The data we obtained from several tests confirm that the more data is
collected, the more accurate the model and the prediction can be.
Much work remains to be done. First, we can integrate the dataset with data from the academic
career, like the number of exams, test scores and so on. Then, we can incorporate in the data the fields not
included due to privacy censoring. We tested only three different architectures, but many other
CNN architectures exist in the literature (e.g. [48]). Furthermore, many parameters of these architectures
can be modified, and much could be explored in order to increase the effectiveness of the models. A
method of data augmentation [49, 50, 51] should be introduced in order to regularize the training process
and to increase the size of the training, validation and test sets, which are very small due to the scarcity of
data. Since the prediction process is not required to run in real time, we can train hundreds (perhaps
thousands) of models and make multiple predictions in order to reduce the random variation found in the
early phase of training. Clearly the system can be made finer by introducing a prediction model every
semester, or even every trimester, or it can be extended to other faculties or other types of students.
ACKNOWLEDGEMENTS
The research group is composed of the authors of the contribution, who edited the following sections:
Mauro Mezzini (§§ 3-4-5), Gianmarco Bonavolontà (§ 2), Francesco Agrusti (§ 1).
REFERENCES
[1] K.-L. Krause, ‘Serious thoughts about dropping out in first year: trends, patterns and implications
for higher education’, Studies in Learning, Evaluation, Innovation and Development, vol. 2, no.
3, pp. 55-68, 2005.
[2] M. Søgaard Larsen and Dansk Clearinghouse for Uddannelsesforskning, Dropout phenomena at
universities: what is dropout? Why does dropout occur? What can be done by the universities to
prevent or reduce it?: a systematic review. Danish Clearinghouse for Educational Research,
2013.
[3] J. J. Vossensteyn et al., Dropout and completion in higher education in Europe: main report.
European Union, 2015.
[4] L. Harvey, S. Drew, and M. Smith, ‘The first-year experience: A review of literature for the Higher
Education Academy’, York: The Higher Education Academy, 2006.
[5] M. R. Larsen, H. B. Sommersel, and M. S. Larsen, Evidence on dropout phenomena at
universities. Danish Clearinghouse for educational research Copenhagen, 2013.
[6] A. Moè and R. De Beni, ‘Strategie di autoregolazione e successo scolastico: Uno studio con
ragazzi di scuola superiore e universitari’, Psicologia dell’Educazione e della Formazione, vol. 2,
no. 1, pp. 31-44, 2000.
[7] P. R. Pintrich and A. Zusho, ‘The development of academic self-regulation: The role of cognitive
and motivational factors’, in Development of achievement motivation, Elsevier, pp. 249-284,
2002.
[8] E. Marks, ‘Student perceptions of college persistence, and their intellective, personality and
performance correlates.’, Journal of Educational Psychology, vol. 58, no. 4, p. 210, 1967.
[9] F. Pincus, ‘The false promises of community colleges: Class conflict and vocational education’,
Harvard Educational Review, vol. 50, no. 3, pp. 332-361, 1980.
[10] D. H. Kamens, ‘The college" charter" and college size: Effects on occupational choice and college
attrition’, Sociology of education, pp. 270-296, 1971.
[11] S. I. Iwai and W. D. Churchill, ‘College attrition and the financial support systems of students’,
Research in Higher Education, vol. 17, no. 2, pp. 105-113, 1982.
[12] J. O. Stampen and A. F. Cabrera, ‘The targeting and packaging of student aid and its effect on
attrition’, Economics of Education Review, vol. 7, no. 1, pp. 29-46, 1988.
[13] J. M. Braxton and E. M. Brier, ‘Melding organizational and interactional theories of student
attrition: A path analytic study’, The Review of Higher Education, vol. 13, no. 1, pp. 47-61,
1989.
[14] V. Tinto, ‘Dropout from higher education: A theoretical synthesis of recent research’, Review of
educational research, vol. 45, no. 1, pp. 89-125, 1975.
[15] V. Tinto, Leaving college: Rethinking the causes and cures of student attrition. ERIC, 1987.
[16] V. Tinto, ‘From theory to action: Exploring the institutional conditions for student retention’, in
Higher education: Handbook of theory and research, Springer, 2010, pp. 51-89.
[17] J. P. Bean, Leaving college: Rethinking the causes and cures of student attrition. Taylor &
Francis, 1988.
[18] P. M. Bentler and G. Speckart, ‘Models of attitude–behavior relations.’, Psychological review, vol.
86, no. 5, p. 452, 1979.
[19] A. F. Cabrera, M. B. Castaneda, A. Nora, and D. Hengstler, ‘The convergence between two
theories of college persistence’, The journal of higher education, vol. 63, no. 2, pp. 143-164,
1992.
[20] G. Hede and L. Wikander, ‘Interruptions and Study Delays in Legal Educations’, A Follow-Up of
the Admissions round Autumn Term, 1983.
[21] A. Siri, Predicting students’ academic dropout using artificial neural networks. Nova, 2014.
[22] ANVUR, Rapporto sullo stato del sistema universitario e della ricerca 2018. 2018.
[23] G. Moretti, M. Burgalassi, and A. Giuliani, ‘ENHANCE STUDENTS’ ENGAGEMENT TO
COUNTER DROPPING-OUT: A RESEARCH AT ROMA TRE UNIVERSITY’, presented at the
International Technology, Education and Development Conference, Valencia, Spain, 2017, pp. 305-313.
[24] M. Burgalassi, V. Biasi, R. Capobianco, and G. Moretti, ‘Il fenomeno dell’abbandono universitario
precoce. Uno studio di caso sui corsi di laurea del Dipartimento di Scienze della Formazione
dell’Università «Roma Tre»’, Giornale Italiano di Ricerca Didattica/Italian Journal of Educational
Research, vol. 17, pp. 131-152, 2016.
[25] V. Carbone and G. Piras, ‘Palomar Project: Predicting School Renouncing Dropouts, Using the
Artificial Neural Networks as a Support for Educational Policy Decisions’, Substance Use &
Misuse, vol. 33, no. 3, pp. 717-750, Jan. 1998.
[26] M. Bala and D. D. B. Ojha, “STUDY OF APPLICATIONS OF DATA MINING TECHNIQUES IN
EDUCATION,” Vol. No., no. 1, p. 10, 2012.
[27] K. R. Koedinger, S. D'Mello, E. A. McLaughlin, Z. A. Pardos, and C. P. Rose. Data mining and
education. Wiley Interdisciplinary Reviews: Cognitive Science, 6(4): 333-353, 2015.
[28] M. N. Mustafa, L. Chowdhury, and M. S. Kamal, “Students dropout prediction for intelligent
system from tertiary level in developing country,” presented at the 2012 International
Conference on Informatics, Electronics and Vision, ICIEV 2012, pp. 113-118, 2012.
[29] R. T. Pereira, A. C. Romero, and J. J. Toledo, “Extraction student dropout patterns with data
mining techniques in undergraduate programs,” presented at the IC3K 2013; KDIR 2013 - 5th
International Conference on Knowledge Discovery and Information Retrieval and KMIS 2013 -
5th International Conference on Knowledge Management and Information Sharing, Proc., pp. 136-142, 2013.
[30] S. Sivakumar, S. Venkataraman, and R. Selvaraj, ‘Predictive modeling of student dropout
indicators in educational data mining using improved decision tree’, Indian Journal of Science
and Technology, vol. 9, no. 4, pp. 1-5, 2016.
[31] P. E. Ramírez and E. E. Grandón, ‘Prediction of student dropout in a Chilean public university
through classification based on decision trees with optimized parameters’, Formacion
Universitaria, vol. 11, no. 3, pp. 3-10, 2018.
[32] Dursun Delen, ‘Predicting Student Attrition with Data Mining Methods’, Journal of College
Student Retention: Research, Theory & Practice, vol. 13, no. 1, pp. 17-35, Jan. 2011.
[33] A. Siri. Predicting Students’ Dropout at University Using Artificial Neural Networks. Italian Journal
of Sociology of Education, 7(2), 225-247, 2015.
[34] V. R. C. Martinho, C. Nunes, and C. R. Minussi, ‘An intelligent system for prediction of school
dropout risk group in higher education classroom based on artificial neural networks’, presented
at the Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI, pp. 159-166, 2013.
[35] L. M. B. Manhães, S. M. S. Da Cruz, and G. Zimbrão, ‘The impact of high dropout rates in a large
public brazilian university a quantitative approach using educational data mining’, presented at
the CSEDU 2014 - Proceedings of the 6th International Conference on Computer Supported
Education, vol. 3, pp. 124-129, 2014.
[36] M. Nagy and R. Molontay, ‘Predicting Dropout in Higher Education Based on Secondary School
Performance’, presented at the INES 2018 - IEEE 22nd International Conference on Intelligent
Engineering Systems, Proceedings, pp. 000389-000394, 2018.
[37] S. Rovira, E. Puertas, and L. Igual, ‘Data-driven system to predict academic grades and dropout’,
PLoS ONE, vol. 12, no. 2, 2017.
[38] M. Solis, T. Moreira, R. Gonzalez, T. Fernandez, and M. Hernandez, ‘Perspectives to Predict
Dropout in University Students with Machine Learning’, presented at the 2018 IEEE
International Work Conference on Bioinspired Intelligence, IWOBI 2018 - Proceedings, 2018.
[39] Y. LeCun, Y. Bengio, G. E. Hinton, “Deep learning”, Nature 521 (7553), pp. 436-444, 2015.
[40] N. Qian, “On the momentum term in gradient descent learning algorithms”, Neural Networks,
Volume 12, Issue 1, pp. 145-151, 1999.
[41] A. Krizhevsky, I. Sutskever, G. E. Hinton, “Imagenet classification with deep convolutional neural
networks”, Advances in Neural Information Processing Systems 25, pp. 1097–1105, 2012.
[42] Vincent Dumoulin, Francesco Visin, “A guide to convolution arithmetic for deep learning” CoRR,
abs/ 1603.07285, 2016.
[43] K. He, X. Zhang, S. Ren, J. Sun, “Deep residual learning for image recognition”, IEEE Conference
on Computer Vision and Pattern Recognition (CVPR), pp. 770-778, 2016.
[44] C. Szegedy, S. Ioffe, V. Vanhoucke, A. A. Alemi, “Inception-v4, inception-resnet and the impact of
residual connections on learning”, Proceedings of the Thirty-First AAAI Conference on Artificial
Intelligence, pp. 4278-4284, 2017.
[45] Simonyan, K. & Zisserman, A. “Very Deep Convolutional Networks for Large-Scale Image
Recognition”. CoRR, abs/1409.1556, 2014.
[46] M. Abadi, et al., “Tensorflow: A system for large-scale machine learning”. CoRR, abs/1605.08695,
2016.
[47] F. Chollet, et al., “Keras”, https://github.com/keras-team/keras, 2015.
[48] J. Hu, L. Shen, G. Sun, Squeeze-and-excitation networks, CoRR abs/1709.01507, 2018.
[49] J. Salamon, J. P. Bello, “Deep convolutional neural networks and data augmentation for
environmental sound classification”, IEEE Signal Processing Letters 24 (3), pp. 279-283, 2017.
[50] L. Perez, J. Wang, The effectiveness of data augmentation in image classification using deep
learning, CoRR abs/1712.04621, 2017.
[51] P. Y. Simard, D. Steinkraus and J. C. Platt, "Best practices for convolutional neural networks
applied to visual document analysis," Seventh International Conference on Document Analysis
and Recognition, pp. 958-963, 2003.
The central building block of convolutional neural networks (CNNs) is the convolution operator, which enables networks to construct informative features by fusing both spatial and channel-wise information within local receptive fields at each layer. A broad range of prior research has investigated the spatial component of this relationship, seeking to strengthen the representational power of a CNN by enhancing the quality of spatial encodings throughout its feature hierarchy. In this work, we focus instead on the channel relationship and propose a novel architectural unit, which we term the “Squeeze-and-Excitation” (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels. We show that these blocks can be stacked together to form SENet architectures that generalise extremely effectively across different datasets. We further demonstrate that SE blocks bring significant improvements in performance for existing state-of-the-art CNNs at slight additional computational cost. Squeeze-and-Excitation Networks formed the foundation of our ILSVRC 2017 classification submission which won first place and reduced the top-5 error to 2.251 percent, surpassing the winning entry of 2016 by a relative improvement of ${\sim }$ 25 percent. Models and code are available at https://github.com/hujie-frank/SENet .