CleanGame: Gamifying the Identification of Code Smells
Hoyama Maria dos Santos
Federal University of Lavras
Lavras-MG, Brazil
hoyama.santos@ufla.br
Vinicius H. S. Durelli
Federal University of São João Del Rei
São João Del Rei-MG, Brazil
durelli@ufsj.edu.br
Maurício Souza
Federal University of Minas Gerais
Belo Horizonte-MG, Brazil
mrasouza@dcc.ufmg.br
Eduardo Figueiredo
Federal University of Minas Gerais
Belo Horizonte-MG, Brazil
figueiredo@dcc.ufmg.br
Lucas Timoteo da Silva
Federal University of Lavras
Lavras-MG, Brazil
lucastimoteo@ufla.br
Rafael S. Durelli
Federal University of Lavras
Lavras-MG, Brazil
rafael.durelli@ufla.br
ABSTRACT
Refactoring is the process of transforming the internal structure
of existing code without changing its observable behavior. Many
studies have shown that refactoring increases program maintain-
ability and understandability. Due to these benefits, refactoring is
recognized as a best practice in the software development com-
munity. However, prior to refactoring activities, developers need
to look for refactoring opportunities, i.e., developers need to be
able to identify code smells, which essentially are instances of poor
design and ill-considered implementation choices that may hinder
code maintainability and understandability. Yet code smell identification is overlooked in the Computer Science curriculum.
Recently, Software Engineering educators have started exploring
gamification, which entails using game elements in non-game con-
texts, to improve instructional outcomes in educational settings.
The potential of gamification lies in supporting and motivating
students, enhancing the learning process and its outcomes. We set out to evaluate the extent to which this claim holds in the context of post-training reinforcement. To this end, we devised and
implemented CleanGame, which is a gamified tool that covers one
important aspect of the refactoring curriculum: code smell iden-
tification. We also carried out an experiment involving eighteen
participants to probe into the effectiveness of gamification in the
context of post-training reinforcement. We found that, on average, participants managed to identify twice as many code smells during learning reinforcement with the gamified approach as with a non-gamified approach. Moreover, we administered a post-experiment attitudinal survey to the participants. According to the results of this survey, most participants showed a positive attitude
towards CleanGame.
CCS CONCEPTS
• Social and professional topics → Software engineering education.
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specific permission and/or a
fee. Request permissions from permissions@acm.org.
SBES 2019, September 23–27, 2019, Salvador, Brazil
©2019 Association for Computing Machinery.
ACM ISBN 978-1-4503-7651-8/19/09...$15.00
https://doi.org/10.1145/3350768.3352490
KEYWORDS
Refactoring, gamification, code smell, Software Engineering educa-
tion, post-training reinforcement
ACM Reference Format:
Hoyama Maria dos Santos, Vinicius H. S. Durelli, Maurício Souza, Eduardo
Figueiredo, Lucas Timoteo da Silva, and Rafael S. Durelli. 2019. CleanGame:
Gamifying the Identification of Code Smells. In XXXIII Brazilian Symposium
on Software Engineering (SBES 2019), September 23–27, 2019, Salvador, Brazil.
ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3350768.3352490
1 INTRODUCTION
Many studies involving industrial scale software systems have pro-
vided evidence that the lion’s share of software development ex-
penses can be ascribed to software maintenance. Maintaining soft-
ware systems is a challenging and long-standing topic owing mostly
to the fact that modern software systems must cope with chang-
ing requirements. As a consequence, developers need to strive to
keep software systems in a condition that allows for continuous
evolution. This constant need for improving software systems has
spurred a growing interest in refactoring, which is deemed one of the main practices to improve the internal structure of evolving software systems [11]. The key idea underlying refactoring is to improve the internal structure of existing code without changing the observable behavior [11], thereby preparing the code for future modifications. When performed properly, refactoring activities improve the design of software, increasing maintainability and understandability. Accordingly, refactoring is listed as a recommended practice in the Software Engineering (SE) body of knowledge [6].
Prior to refactoring activities, developers need to look for code
smells, i.e., particular code structures that when removed through
refactoring activities lead to more readable, easy-to-understand,
and cheaper-to-modify code. However, the set of skills required to
identify code smells is acquired through training and experience.
Despite the aforementioned benefits, refactoring and code smell
identification skills have been overlooked in the Computer Science
curriculum. Even though continuous evolution (i.e., maintenance activities) accounts for more technical and financial resources than software development per se, a major share of a typical undergraduate curriculum is dedicated to development activities [16]. Practices such as refactoring are often neglected in favor of more constructive activities such as design and implementation. In effect, going through code while looking for code smells is a difficult and somewhat boring task.
A recurring challenge in SE education is engaging students in
learning activities that relate to the professional practices of SE.
Additionally, it is often challenging for SE students to contextualize how some concepts and skills will fit into or influence their future professional practices. Recently, in hopes of dealing with this challenge, the SE education community has turned to innovative pedagogical strategies such as gamification [2, 22]. Essentially, gamification entails employing game design elements in a non-game setting. In other words, gamification is centered around generating learning experiences that convey feelings and engage students as if they were playing games, but not with entertainment in mind.
We conjecture that gamification can be used to improve SE educa-
tion. More specifically, we believe that gamification can be used to
support and motivate SE students in the development of code smell
identification skills by turning a difficult and somewhat tedious
activity (e.g., going over snippets of code) into an engaging experi-
ence. There is much potential insight to be gained in exploring how
SE education can be improved by devising gamification approaches
that cover different aspects related to topics that are overlooked
in academic curricula, e.g., code smell identification concepts and
skills.
Based on the premise that gamification is well suited to engage students with code smell identification concepts, especially when used as a way to provide students with training follow-up, we set
out to explore whether gamification can have a positive impact on
post-training reinforcement in comparison with a more traditional
approach, which consists in setting up post-training reinforcement
content manually. Generally, traditional post-training to evaluate
skill-building in activities such as code smell identification entails
hands-on tasks that involve perusing source code for code smells.
Usually, in traditional post-training these tasks are supported only
by an integrated development environment (IDE), which allows for
easier code navigation. The lack of guidelines and elements to keep students engaged makes traditional post-training for code smell identification unwieldy, so a gamified approach can be employed
to mitigate these problems. To probe into the benefits provided
by a gamified environment over an IDE-driven post-training, we
developed a tool that supports post-training activities centered
around code smell identification. In the context of our tool, these
post-training activities follow a gameful design approach, i.e., they leverage gamification elements such as leaderboards and rewarding
badges. To the best of our knowledge, our tool is the first educational
platform to realize a gamified, post-training reinforcement approach
to code smell identification. To corroborate the benefits of our
gamified approach, we carried out two evaluations: an experiment
involving 18 participants and an attitudinal survey, which was
conducted after the experiment. The main contributions of our
research are threefold:
(1) We introduce CleanGame: a gamified platform for post-training reinforcement of code smell identification concepts and skills.
(2) In keeping with current evidence, we argue that a gamified environment is more effective at conveying code smell identification skills while keeping students engaged than a more traditional approach to code smell identification (i.e., IDE-driven). So we carried out an experiment to probe into the impact and soundness of gamification in supporting and engaging students during code smell identification activities.
(3) We administered an attitudinal survey to the experiment participants to get an overview of their attitudes towards CleanGame and the advantages and drawbacks of using a gamified approach to code smell identification.
The participants of our experiment confirmed that playing the game is fun, and that identifying smells as part of CleanGame is more enjoyable than doing so outside the game. On average, participants were able to identify approximately twice as many code smells using CleanGame (4.94) as using an IDE (2.39). Additionally, the best-performing participant was able to correctly identify 8 code smells out of 10 using CleanGame.
The remainder of this paper is organized as follows. Section 2
provides background on code smells and gamification. Section 3 out-
lines related work. Section 4 gives a brief description of CleanGame.
Section 5 details the experiment we carried out to evaluate CleanGame.
Section 6 discusses the results of the experiment and their impli-
cations. The quantitative results of the attitudinal survey are pre-
sented in Section 7. Section 8 discusses the threats to validity of the
study. Section 9 presents concluding remarks.
2 BACKGROUND
This section describes the theoretical foundation necessary for
understanding CleanGame (i.e., code smells and gamification).
2.1 Code Smells
Code smells, also known as bad smells or just smells [11], represent symptoms of poor design or implementation choices in the source code and are one of the most serious forms of technical debt [15]. Fowler et al. [11] described 22 smells and incorporated them into refactoring strategies to improve design quality. In addition to the smells proposed by Fowler et al. [11], there are many other code smells [7]. Nevertheless, in this paper we focus on the following code smells: (i) Large Class: classes that are trying to do too much often have a large number of instance variables; (ii) Long Method: a method that contains too many lines of code; (iii) Divergent Change: happens when a class is often changed in many different ways and for different reasons; (iv) Feature Envy: happens when a class spends more time communicating with functions or data in another class than with its own; it may occur after fields have been moved to a data class; (v) Shotgun Surgery: occurs when changing a class entails many small modifications in many different classes as well.
We selected these code smells because they are widely used in academic and industrial settings [21].
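Although CleanGame analyzes Java code, these smell definitions are language-agnostic. As an illustration (not taken from the paper's materials), the following Python sketch shows a Feature Envy instance: a method that is more interested in another class's data than in its own, typically addressed with the Move Method refactoring.

```python
class Address:
    """Plain data holder."""
    def __init__(self, street, city, zip_code):
        self.street = street
        self.city = city
        self.zip_code = zip_code


class Customer:
    def __init__(self, name, address):
        self.name = name
        self.address = address

    def shipping_label(self):
        # Feature Envy: this method only reads Address's fields, so it
        # arguably belongs on Address (fix: Move Method).
        return f"{self.address.street}, {self.address.city} {self.address.zip_code}"
```

Moving `shipping_label` onto `Address` would remove the envy, since the method would then operate on its own object's data.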
2.2 Gamification
Gamification is a relatively new term that has been used to denote the use of game elements and game-design techniques in non-gaming contexts [8]. Game elements are a set of components that compose a game [4]. In some studies, game elements are also called game attributes [4].
In the context of SE, there has been an increasing interest in
using gamification with the goal of increasing engagement and
motivation. Researchers and practitioners have started adopting
gamification in several different contexts, such as gamification of the software development life cycle [9], software process improvement initiatives [14], and also in SE education [2, 22]. However, as mentioned, very little attention has been directed towards integrating refactoring and code smell identification concepts into the Computer Science curriculum. As stated by Fraser [12], some SE activities are overlooked in academic curricula because emphasis is placed on more constructive activities such as software design and implementation.
In summary, gamification can be applied as a strategy to turn
complex or somewhat boring activities into engaging and competi-
tive activities. Thus, there is much potential insight to be gained
in exploring how SE education can be further improved by developing gamification approaches that cover different aspects related to topics that are in a way overlooked in academic curricula, e.g., refactoring and refactoring-related concepts such as code smells.
3 RELATED WORK
The proposal and use of game-related methods is a growing topic in SE education [22]. Gamification, specifically, has gained considerable attention lately [3], both in professional and educational contexts of SE, as a method to increase the motivation and engagement of subjects in the execution of SE activities.
In the professional context, Dal Sasso et al. [20] and Garcia et al. [13] propose frameworks for the gamification of SE activities. The first [20] provides a set of basic building blocks to apply gamification techniques, supported by a conceptual framework. The latter [13] proposes a complete framework for the introduction of gamification in SE environments.
In the context of SE education, there are also several proposals of using gamification to support varied knowledge areas. Akpolat and Slany [1] use weekly challenges to motivate students to apply eXtreme Programming practices to their project. The students had to compete for a “challenge cup” award. Code Defenders [19] uses gamification to create a ludic and competitive approach to promote software testing using mutation and unit tests. Bell et al. [5] expose students to software testing using a game-like environment, HALO (Highly Addictive, sociaLly Optimized) SE.
Nonetheless, to the best of our knowledge, there is no study focusing on the detection of code smells. CodeArena [10], for instance, uses gamification to motivate refactoring; however, its target users are practitioners. We believe that CleanGame is a solid contribution to the context of game-related approaches to SE education.
4 CLEANGAME
This section proposes and describes CleanGame¹, a gamified software tool aimed at teaching code smell detection. CleanGame is composed of two independent modules: Smell-related Quiz and Code Smell Identification. The goal of the first module is to allow students to learn or revisit the main concepts surrounding code smells.
To achieve this goal, the Smell-related Quiz module presents ques-
tions about code smells with multiple-choice answers. The second
module, Code Smell Identification, focuses on practical tasks of
¹ CleanGame is available at https://bit.ly/2W6xClB
identifying code smells in the source code. The current implementation of this module is integrated with PMD² to allow the creation of a list of code smells identified in Java source code. CleanGame allows users not only to access pre-defined quizzes and identification tasks, but also to create their own quizzes and tasks.
Figure 1: Smell-related Quiz module of CleanGame.
Figure 1 presents a screenshot of CleanGame. This figure shows
a quiz question with four possible answers and the user has to
choose the best option. On the right-hand side of Figure 1, we can
see several game elements used in this gamified software tool, such
as player status, score, timing, and skip questions. CleanGame also
presents a ranking of the top-10 best scores in this quiz. Therefore,
the player is able to check in real time their position in the current quiz and how far their score is from the top scores. The player's score on the current question is penalized in several situations. For instance, if the player either skips a question or takes too long to answer it, their score on that question is penalized by up to the total amount of points assigned to the given question. Code
smell identification tasks also have options that allow players to
ask for help (shown in Figure 2).
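The paper does not disclose CleanGame's exact scoring formula. As a sketch only, the penalty mechanics described above (skipping, answering too slowly, and asking for help, with the total penalty capped at the question's full value) could look like the following, where every weight is an illustrative assumption rather than the tool's actual value:

```python
def question_score(max_points, elapsed_s, time_limit_s, skipped=False, hints_used=0):
    """Hypothetical CleanGame-style scoring sketch. The penalty weights
    (20% per hint, 50% for running over time) are assumptions; only the
    capping behavior follows the text."""
    if skipped:
        return 0  # skipping forfeits all points for the question
    penalty = 0.2 * hints_used * max_points
    if elapsed_s > time_limit_s:
        penalty += 0.5 * max_points
    # The penalty is bounded by the total points assigned to the question.
    return max(0, max_points - min(penalty, max_points))
```

For example, a slow answer after two hints on a 100-point question would keep only 10 points under these assumed weights.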
It is worth mentioning that CleanGame is fully integrated with
the GitHub application programming interface (API). Therefore,
during the creation of a room in the identification module, the user needs to provide a uniform resource locator (URL) of a Java GitHub repository. The Java source code is then cloned and transformed into an abstract syntax tree (AST) in a fully automatic way to create an oracle of smell-related questions. Three help hints are available: the metrics used to detect the code smell, the refactoring aimed at addressing the code smell, and a short definition of the code smell. Asking for help also negatively impacts the points the player receives for a question.
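To illustrate the metric-based detection that underlies such an oracle (CleanGame itself delegates to PMD on Java ASTs), here is a minimal sketch of a Long Method style rule applied to Python source via the standard `ast` module; the 20-line threshold is an arbitrary assumption, not PMD's default:

```python
import ast

def long_methods(source, max_lines=20):
    """Return (name, length) pairs for functions spanning more than
    max_lines source lines -- a crude Long Method metric. The threshold
    is illustrative only."""
    smells = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                smells.append((node.name, length))
    return smells
```

A real detector would combine several metrics (e.g., statement counts and cyclomatic complexity) rather than raw line counts, but the thresholding idea is the same.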
5 EXPERIMENT SETUP
We surmise that gamification is well suited to better engage students with refactoring-related topics such as code smell identification, especially when used as a way to provide students with training follow-up. Based on this assumption, we set out to explore
whether gamification can have a positive impact on post-training
² PMD is an extensible cross-language static code analyzer, available at https://pmd.github.io/
reinforcement in comparison with a more traditional approach,
which consists in setting up post-training reinforcement content
manually. Traditional post-training to evaluate skill-building in ac-
tivities such as code smell identification entails hands-on tasks that
involve perusing source code for code smells. Generally, the tasks
in post-training are supported only by an integrated development
environment (IDE), which allows for easier code navigation [
23
].
The lack of guidelines and elements to keep students engaged makes traditional post-training for code smell identification unwieldy. We believe that a gamified approach can be employed to mitigate these problems. To probe into the benefits provided by a gamified environment over IDE-driven post-training, we developed a tool that supports post-training activities centered around code smell identification. In the context of our tool, these post-training activities follow a gameful design approach, i.e., they leverage gamification elements such as leaderboards and rewarding badges.
Figure 2: Smell-related Identification module of CleanGame.
We designed an experiment to answer the following research
question:
RQ1: Does gamification have a positive impact on how students identify code smells during post-training activities?
We surmised that students will thrive on a game-based approach
to code smell identification because there is evidence that gamifica-
tion elements such as points and leaderboards convey a sense of
competence to students and enhance intrinsic motivation, thereby
improving performance. In keeping with this evidence, we posit
that a gamified environment is more effective at conveying code
smell identification skills and keeping students engaged than a
more traditional approach to code smell identification (i.e., IDE-
driven). Therefore, RQ1 comes down to examining the impact and soundness of gamification in engaging students in code smell identification from a researcher's perspective. In the context of our experiment, we used the following proxy to measure the effectiveness of gamification: the average rate of correct answers (i.e., the amount of code smells correctly identified).
5.1 Scoping
5.1.1 Experiment Goals. We used the organization proposed by the Goal/Question/Metric (GQM) template [25] to set the goals of our experiment. Following this goal definition template, the scope of our study can be summarized as outlined below.
Analyze our gamified approach
for the purpose of evaluation
with respect to post-training effectiveness
from the point of view of the researcher
in the context of students looking for code smells.
5.1.2 Hypotheses Formulation. We framed our prediction for RQ1: our gamified post-training approach is more effective than an IDE-driven approach. As mentioned, to answer RQ1 we evaluated the effectiveness of the post-training approaches in terms of one proxy measure: the amount of correctly identified code smells (CICS). So RQ1 was turned into the following hypotheses:

Null hypothesis, H0-CICS: there is no difference between a gamified approach and an IDE-based approach for code smell identification in terms of the amount of code smells correctly identified by students during post-training.

Alternative hypothesis, H1-CICS: students are able to identify more code smells through a gamified environment than through a traditional (i.e., IDE-driven) approach.

Let μ be the average amount of correctly identified code smells, so that μ_CleanGame and μ_IDE denote the average amount of code smells correctly identified by students using CleanGame and using an IDE-based approach, respectively. Then, the aforementioned hypotheses can be formally stated as:

H0-CICS: μ_CleanGame = μ_IDE
H1-CICS: μ_CleanGame > μ_IDE
5.2 Selection of Subjects
This experiment was run using Computer Science students: more
specifically, undergraduate, master’s, and PhD students were used
as subjects in our experiment. This experiment was run at the
Federal University of Minas Gerais (UFMG). It is worth highlighting that this study can be classified as a quasi-experiment due to the lack of randomization of participants: the participating students signed up for the course. We elaborate on the ability to generalize from this specific context in Section 8.
All subjects signed a consent form prior to participating in the
experiment. All subjects already had prior experience with Java
and object oriented programming. Previous knowledge regarding
refactoring and refactoring related concepts (e.g., code smells) was
not mandatory. Note that none of the subjects had participated in the course before.
5.3 Experiment Design
This experiment has one factor with two treatments: the factor is
the post-training reinforcement approach through which the sub-
jects try to identify code smells and the treatments are CleanGame
(i.e., our gamified way of teaching and supporting the identification of code smells) and an IDE-based approach, which is a hands-on assignment using an IDE. The experience of the subjects was not used as
a blocking factor: we did not ask subjects to fill a pre-experimental
questionnaire because we decided against further stratifying our
sample into groups with similar experience levels. So, it is assumed
that the subjects in this experiment have equivalent background
and level of experience.
We used a randomized crossover design so that all subjects could be exposed to both post-training approaches. That is, all participants were assigned to use both CleanGame and the IDE-based post-training approach. Both groups went over the same Java programs and code smells. No subjects quit the experiment.
5.4 Instrumentation
In the introduction phase, subjects attended lectures (i.e., classroom-
based delivery) about refactoring and code smells. We employed
a randomized crossover design so that subjects could be exposed
to both post-training tasks, i.e., the gamified and the traditional
approach. More specifically, subjects were randomly assigned into
two groups and assigned to complete code smell identification tasks
using each approach as follows: one group performed code smell
identification using an IDE followed by code smell identification
using CleanGame; the other group performed code smell identifica-
tion with CleanGame followed by code smell identification using
an IDE. Therefore, the response is measured twice in each subject.
Following randomization, the subjects assigned to be the first
group to use CleanGame took part in a short training session in
which they were introduced to each feature of our tool. During this
training session, we had the subjects identify a few code smells
using CleanGame. The goal was to allow the subjects to familiarize
themselves with the graphical user interface (GUI) of the tool. Ad-
ditionally, throughout this training session, subjects were allowed
to ask any questions about CleanGame. No further assistance was
provided to the subjects assigned to carry out post-training tasks
using the traditional approach (i.e., IDE-driven).
Since we used a randomized crossover design, in a later stage,
the group that first took part in code smell identification tasks using
CleanGame was then assigned to carry out code smell identification
tasks using the traditional approach. In turn, the group initially
assigned to the traditional approach was introduced to CleanGame
(i.e., participated in our brief training session) and proceeded to
identify code smells using our gamified approach.
The advantages of applying each code smell identification ap-
proach as perceived by the subjects were investigated through a
post-questionnaire handed out after the experiment had been car-
ried out (i.e., wrap-up phase). Moreover, the same questionnaire
was also used to gather further information from the participants
concerning the main hindrances/inhibitors of applying both code
smell identification approaches.
6 EXPERIMENT RESULTS
In this section, we present the experimental results of the exper-
iment we carried out. First we outline some descriptive statistics,
then we present the hypothesis testing.
6.1 Descriptive Statistics
As mentioned, we employed a randomized crossover design, so
subjects in both groups were exposed to the two approaches (Sub-
section 5.4). Table 1 presents detailed results of the performance of
the eighteen participants when using CleanGame to identify code
smells. In Table 2 we summarize how the subjects performed while
identifying code smells via an IDE.
As shown in Tables 1 and 2, on average, subjects were able to identify approximately twice as many code smells using CleanGame (4.94) as using an IDE (2.39). Additionally, the best-performing subject was able to correctly identify 8 code smells out of 10 using CleanGame. As shown in Figure 3, subjects in group 1 performed
slightly better at code smell identification while using CleanGame
than when using an IDE. As for the subjects in group 2, they per-
formed significantly better when identifying code smells using
CleanGame (Figure 3). When combining the performance of both
groups with both experimental treatments, our results would seem
to indicate that CleanGame allowed participants to be more effec-
tive at code smell identification (Figure 3, boxplot on the left).
Subjects were more apt to skip code identification tasks when using an IDE. When using an IDE, subjects skipped on average 1.22 tasks; subject #5 from group 2 had the largest number of skipped questions in this group: 6 questions were skipped over. Participants who had larger numbers of skipped questions also seemed to have had difficulty with most questions: for instance, subjects #3, #5, #6, and #9 had a high ratio of incorrect answers. In contrast, while
using CleanGame, participants seldom skipped over questions (as
shown in Table 1). We surmise that participants were less likely
to skip questions while using CleanGame because of the metrics-,
refactoring-, and definition-related tips provided by the tool. Ac-
cording to the results in Table 1, the most commonly requested
type of tip was metric-related: while going over the 10 code smell
identification tasks, subjects requested on average approximately
five metric-related tips. Interestingly, refactoring-related tips were
not requested very often by the participants. Definition-related tips
were the least requested type of tip. We believe that these results
might indicate that the subjects had a good grasp of the concepts
underlying code smells but needed some sort of metric to back up
their opinions regarding whether or not they were looking at a
given code smell. Given that an IDE does not provide much sup-
port in terms of code smell identification, our results indicate that
tackling those tasks within an IDE seems to be more difficult for
most participants, thus participants seem to stop responding more
often at points where the code identification task gets complicated.
6.2 Hypothesis Testing
To test the hypotheses we formulated in Section 5.1.2 we applied a
paired Wilcoxon’s rank-sum test. As we hypothesized, according
to the results of this non-parametric test, subjects perform signif-
icantly better when using CleanGame than when using an IDE
(V=125.5,p=0.003).
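The reported statistic can be checked against the per-subject correct-answer counts in Tables 1 and 2. The following self-contained sketch implements the paired Wilcoxon signed-rank test (zero differences dropped, average ranks for ties, one-sided normal approximation for the p-value; an exact or tie-corrected computation yields a slightly different p):

```python
from math import erf, sqrt

# Correct answers per subject (Group 1 then Group 2), from Tables 1 and 2.
cleangame = [6, 6, 5, 5, 5, 4, 2, 1, 3, 8, 7, 7, 6, 6, 5, 5, 4, 4]
ide       = [3, 2, 5, 3, 4, 3, 2, 5, 2, 4, 3, 0, 1, 2, 0, 2, 0, 2]

def wilcoxon_signed_rank(x, y):
    """One-sided paired Wilcoxon signed-rank test (H1: x > y).
    Returns (V, p), where V is the sum of the ranks of the positive
    differences; zero differences are discarded and ties share ranks."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):          # average rank over the tie group
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    v = sum(r for d, r in zip(diffs, ranks) if d > 0)
    mean = n * (n + 1) / 4                  # null mean of V
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    p = 1 - 0.5 * (1 + erf((v - mean) / (sd * sqrt(2))))
    return v, p

v, p = wilcoxon_signed_rank(cleangame, ide)  # v reproduces V = 125.5
```

The same lists also reproduce the averages quoted in Section 6.1 (89/18 ≈ 4.94 for CleanGame and 43/18 ≈ 2.39 for the IDE).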
Table 1: Code smell identification performance of the two experimental groups using CleanGame.
Group 1
Subject  Correct Answers  Incorrect Answers  Skipped  Metric-related Tip  Refactoring-related Tip  Definition Tip  Average Time†
#1 6 4 0 8 2 0 476
#2 6 4 0 6 4 4 2,593
#3 5 5 0 6 3 0 1,336
#4 5 5 0 9 4 0 640
#5 5 5 0 6 3 3 674
#6 4 6 0 4 4 3 1,601
#7 2 7 1 1 1 0 1,198
#8 1 9 0 3 1 0 1,231
#9 3 7 0 3 2 1 891
Group 2
Subject  Correct Answers  Incorrect Answers  Skipped  Metric-related Tip  Refactoring-related Tip  Definition Tip  Average Time†
#1 8 2 0 8 4 0 897
#2 7 3 0 1 1 0 1,259
#3 7 3 0 1 0 0 2,349
#4 6 4 0 8 4 0 1,317
#5 6 4 0 9 4 1 993
#6 5 5 0 9 2 0 1,411
#7 5 5 0 8 2 1 1,951
#8 4 6 0 0 0 0 960
#9 4 6 0 1 1 1 943
Descriptive Statistics for both Experimental Groups
Min 1 2 0 0 0 0 476
Max 8 9 1 9 4 4 2,593
Average (Mean) 4.94 5.00 0.05 5.06 2.33 0.78 1,262.22
Standard Dev. 1.76 1.68 0.24 3.30 1.46 1.27 566.10
†Average time is indicated in seconds.
[Figure 3 comprises three boxplots of the amount of correct answers: performance using both post-training approaches (CleanGame vs. IDE), performance using CleanGame (Group 1 vs. Group 2), and performance using an IDE (Group 1 vs. Group 2).]
Figure 3: Overview of the performance of the experimental subjects in terms of properly identifying code smells using both post-training approaches.
7 ATTITUDINAL SURVEY
This section outlines the results of an attitudinal survey we con-
ducted to answer the following research questions:
RQ2: Do students have a positive attitude towards a game-based learning experience? - The effectiveness of a post-training approach is heavily influenced by the attitudes held toward how the instructional content is presented. Thus, RQ2 investigates the subjects' outlook on gamification as a post-training approach to code smell identification.
RQ3: What are the advantages and drawbacks of a gamified post-training approach? - In addition, we set out to investigate the pros and cons of gamification as a post-training reinforcement approach from the standpoint of students.
Therefore, the goal of this attitudinal survey is to gauge students’
opinions, level of satisfaction, and overall attitude towards our
gamified post-training approach.
After developing an initial draft of the survey questionnaire, we
ran a pilot test with a group of five (Computer Science) graduate
students. Our goal was to validate the questionnaire in terms of
clarity, objectiveness, and correctness. We refined the questionnaire
based on the feedback from the pilot study and created an online
version using Google Forms (https://www.google.com/forms/about/). Table 3 summarizes each question in the questionnaire. The questionnaire comprises 24 questions, divided into three parts: Q1 to Q4 are aimed at gathering background information from the participants; Q5 to Q9 are questions about code smell identification (both with and without CleanGame); and Q10 to Q24 are related to the participants' experience while using CleanGame. It is worth mentioning that questions Q10 to Q21 were
Table 2: Code smell identification performance of the two experimental groups using an IDE.

Group 1

| Subject | Correct Answers | Incorrect Answers | Skipped | Average Time† |
|---|---|---|---|---|
| #1 | 3 | 7 | 0 | 2,340 |
| #2 | 2 | 8 | 0 | 1,080 |
| #3 | 5 | 5 | 0 | 1,500 |
| #4 | 3 | 7 | 0 | 1,380 |
| #5 | 4 | 6 | 0 | 1,140 |
| #6 | 3 | 7 | 0 | 1,680 |
| #7 | 2 | 8 | 0 | 1,860 |
| #8 | 5 | 5 | 0 | 2,220 |
| #9 | 2 | 8 | 0 | 1,680 |

Group 2

| Subject | Correct Answers | Incorrect Answers | Skipped | Average Time† |
|---|---|---|---|---|
| #1 | 4 | 6 | 0 | 1,560 |
| #2 | 3 | 6 | 1 | 1,740 |
| #3 | 0 | 6 | 4 | 540 |
| #4 | 1 | 9 | 0 | 2,160 |
| #5 | 2 | 2 | 6 | 1,080 |
| #6 | 0 | 6 | 4 | 1,200 |
| #7 | 2 | 8 | 0 | 1,080 |
| #8 | 0 | 7 | 3 | 1,740 |
| #9 | 2 | 4 | 4 | 1,380 |

Descriptive Statistics for both Experimental Groups

| | Correct Answers | Incorrect Answers | Skipped | Average Time† |
|---|---|---|---|---|
| Min | 0 | 2 | 0 | 540 |
| Max | 5 | 9 | 6 | 2,340 |
| Average (Mean) | 2.39 | 6.39 | 1.22 | 1,520.00 |
| Standard Dev. | 1.54 | 1.69 | 1.99 | 464.30 |

† Average time is indicated in seconds.
adapted from MEEGA+ [18], which is a framework for evaluating serious games tailored to computing education.
The participants were asked to answer the questionnaire immediately after the end of the experiment. We made it clear to the students that questionnaire completion was optional and anonymous.
7.1 Attitudinal Survey Results
Eighteen participants completed the questionnaire. Most participants (thirteen, roughly 72.2% of our sample) are 23 to 28 years old. Also, thirteen participants (72.2%) claimed that they play games at least once a month, out of which seven (38.9%) claimed to play games on a daily basis. Only three participants (16.7%) claimed to never play games. Regarding the participants' experience with Java or object-oriented development: eleven participants (61.1%) claimed to have professional experience with either Java or object-oriented development, and seven (38.9%) claimed to have only academic experience.
Figure 4 highlights the questionnaire results regarding questions
Q5 and Q6, which ask participants about the difficulty of performing
code smell identification activities with and without the support of
CleanGame. The results would seem to suggest that the participants
found the activity more challenging to perform without CleanGame.
Figure 5 shows the answers we collected for questions Q7, Q8,
and Q9. From looking at the answers to Q7, we can see that most
participants avoided skipping questions, regardless of their difficulty level. As for Q8, only 2 participants (11.1%) affirmed that
Table 3: Questionnaire

| ID | Question | Type of answer |
|---|---|---|
| Q1 | Student level | Single choice: (a) Computer Science undergraduate student; (b) Information Systems undergraduate student; (c) Computer Science graduate student. |
| Q2 | Age | Nominal scale: (1) 17–22 years old; (2) 23–28 years old; (3) 29–34 years old; (4) 34+ years old. |
| Q3 | How often do you play (digital or non-digital) games? | Nominal scale: (1) Never; (2) Rarely; (3) Monthly; (4) Weekly; (5) Daily. |
| Q4 | What is your experience with Java or object-oriented development? | Nominal scale: (1) None; (2) Academic experience; (3) Beginner professional experience; (4) Advanced professional experience. |
| Q5 | How difficult was the execution of the code smell identification activity without CleanGame? | Nominal scale: (1) Very easy; (2) Easy; (3) Balanced; (4) Hard; (5) Very hard. |
| Q6 | How difficult was code smell identification with CleanGame? | Nominal scale: (1) Very easy; (2) Easy; (3) Balanced; (4) Hard; (5) Very hard. |
| Q7 | I skipped questions (i.e., code smell identification activities) because they were too hard. | Likert scale* |
| Q8 | I tried to answer all questions consciously (without guessing). | Likert scale* |
| Q9 | I tried to solve the challenges exhaustively before taking advantage of tips (provided by CleanGame). | Likert scale* |
| Q10 | [Challenge] CleanGame is adequately challenging without becoming boring. | Likert scale* |
| Q11 | [Satisfaction] Completing tasks in CleanGame gave me a feeling of achievement. | Likert scale* |
| Q12 | [Satisfaction] I would recommend this game to my peers. | Likert scale* |
| Q13 | [Social interaction] CleanGame promotes competition. | Likert scale* |
| Q14 | [Fun] There was an element in CleanGame that captured my attention. | Likert scale* |
| Q15 | [Focus] CleanGame kept me engaged during the execution of activities / I lost track of time / I forgot about my surroundings. | Likert scale* |
| Q16 | [Relevance] The contents of CleanGame are relevant to my interests, and it is clear how they are related to code smell identification skill acquisition. | Likert scale* |
| Q17 | [Relevance] I would like to use more tools similar to CleanGame throughout my academic formation. | Likert scale* |
| Q18 | [Relevance] I prefer to practice the concepts of code smells with CleanGame rather than with other educational methods. | Likert scale* |
| Q19 | [Learning perception] CleanGame contributed to my learning and was efficient in comparison to other activities. | Likert scale* |
| Q20 | [Learning perception] CleanGame (identification game) contributed to practicing the concepts of code smell identification. | Likert scale* |
| Q21 | [Learning perception] CleanGame (quiz) contributed to remembering code smell identification. | Likert scale* |
| Q22 | What were the positive aspects of CleanGame? | Open answer |
| Q23 | What were the negative aspects of CleanGame? | Open answer |
| Q24 | Do you have any additional comments regarding CleanGame? | Open answer |

* Likert scale: (-2) Definitely disagree; (-1) Disagree; (0) Indifferent; (1) Agree; (2) Definitely agree.
they tried to answer all questions consciously. So we conjecture
that some participants tried to guess the correct answer to some
code smell identification tasks. Finally, the participants had mixed
opinions concerning Q9: our results would seem to suggest that
some participants tried to take advantage of the tips provided by
CleanGame before tackling the code smell identification task. On
the other hand, some participants only took advantage of tips after
exhaustively trying to grasp the code smell identification task at
hand.
Figure 6 shows the answers related to the participants’ experi-
ences using CleanGame. The items are grouped according to the
[Figure rendered as a table: number of respondents (out of 18) per difficulty level, where 1 = Very easy and 5 = Very hard.]

| | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Difficulty without CleanGame | 0 | 1 | 2 | 9 | 6 |
| Difficulty with CleanGame | 0 | 2 | 8 | 5 | 3 |

Figure 4: Difficulty in performing the code smell identification without and with CleanGame.
[Figure rendered as a table: number of respondents (out of 18) per Likert category.]

| | Definitely disagree | Disagree | Indifferent | Agree | Definitely agree |
|---|---|---|---|---|---|
| Q7 | 9 | 1 | 2 | 5 | 1 |
| Q8 | 12 | 3 | 1 | 0 | 2 |
| Q9 | 3 | 2 | 6 | 5 | 2 |

Figure 5: Results for survey questions Q7, Q8, and Q9.
following factors: challenge, satisfaction, social interaction, fun, focus, relevance, and learning. For each item, the rightmost column presents the median value of the participants' responses, ranging from "Definitely disagree" (-2) to "Definitely agree" (2). No factor presented a negative median value. The factors "satisfaction" and "focus" presented median values of 0, meaning that the participants regarded these aspects with indifference or mixed opinions. For all other factors, the median values were positive. The items Q16 and Q21 had the most "Definitely agree" responses. These items are related to the adequacy of the content of CleanGame to the course and to the learning impact of the quiz on how the participants committed code smell related concepts to memory. Except for items Q11, Q12, and Q15, all other items received more than 50% of positive responses ("Agree" and "Definitely agree"). The items with the highest count of positive responses are the following: Q13 (i.e., how much CleanGame promotes competition), Q16 (i.e., adequacy of CleanGame contents), and Q21 (i.e., how much CleanGame supports the memorization of code smell related concepts), with 83.3% of positive responses each. In contrast, items Q15 (i.e., focus), Q11 (i.e., feeling of achievement), and Q12 (i.e., likeliness of recommending CleanGame to peers) received the highest number of negative responses, with 44.4%, 38.9%, and 27.8% of "Disagree" or "Definitely disagree" responses, respectively.
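The per-item medians follow from mapping the Likert labels onto the questionnaire's numeric scale. A minimal sketch for item Q11, using its response counts as recovered from Figure 6:

```python
# Median of Likert responses on the (-2..2) scale used in the questionnaire.
from statistics import median

# Q11 ([Satisfaction]) response counts as recovered from Figure 6:
# -2 = Definitely disagree, -1 = Disagree, 0 = Indifferent, 1 = Agree, 2 = Definitely agree
counts = {-2: 2, -1: 5, 0: 6, 1: 2, 2: 3}

responses = [value for value, n in counts.items() for _ in range(n)]
print(median(responses))  # 0.0, i.e., the reported median of 0 for Q11
```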
As for RQ3, positive and negative feedback from participants was captured from the answers to Q22, Q23, and Q24. To gather and synthesize such feedback, we employed an approach inspired by the coding phase of grounded theory [24]. Two researchers analyzed the responses individually and marked relevant segments with "codes" (i.e., tagging with keywords). Afterwards, the researchers compared their codes to reach a consensus and tried to group these codes into
[Figure rendered as a table: number of respondents (out of 18) per Likert category, with the per-item median on the (-2..2) scale.]

| Factor | Item | Definitely disagree | Disagree | Indifferent | Agree | Definitely agree | Median |
|---|---|---|---|---|---|---|---|
| Challenge | Q10 | 2 | 2 | 4 | 6 | 4 | 1 |
| Satisfaction | Q11 | 2 | 5 | 6 | 2 | 3 | 0 |
| Satisfaction | Q12 | 1 | 4 | 5 | 5 | 3 | 0 |
| Social interaction | Q13 | 1 | 0 | 2 | 8 | 7 | 1 |
| Fun | Q14 | 1 | 2 | 3 | 7 | 5 | 1 |
| Focus | Q15 | 3 | 5 | 6 | 1 | 3 | 0 |
| Relevance | Q16 | 0 | 0 | 3 | 7 | 8 | 1 |
| Relevance | Q17 | 0 | 0 | 4 | 7 | 7 | 1 |
| Relevance | Q18 | 0 | 2 | 4 | 6 | 6 | 1 |
| Learning perception | Q19 | 0 | 3 | 4 | 7 | 4 | 1 |
| Learning perception | Q20 | 2 | 2 | 2 | 6 | 6 | 1 |
| Learning perception | Q21 | 2 | 0 | 1 | 4 | 11 | 2 |

Figure 6: Participants' experience with CleanGame.
relevant categories. Consequently, it is possible to count the number of occurrences of codes and the number of items in each category to understand what recurring positive and negative aspects were reported by the participants. Tables 4 and 5 list the positive and negative aspects reported by the participants. The column "Code" groups recurring items observed in the responses. The column "Category" presents a broad category used to group the codes. The column "#" presents the number of times each code appeared in the responses.
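The counting step just described can be sketched as a short script (a minimal sketch; the code/category/occurrence triples are the positive aspects listed in Table 4):

```python
# Tally code occurrences per category, mirroring the counting step described
# in the text; the (code, category, occurrences) triples are from Table 4.
from collections import Counter

positive = [
    ("Ludic and interactive tool", "Design / Usability", 7),
    ("Support comprehension on code smells", "Learning", 6),
    ("Competition", "Gamification", 5),
    ("Easiness of use", "Design / Usability", 3),
    ("Tips", "Gamification", 2),
    ("Dynamic leaderboards", "Gamification", 2),
    ("Multiple choice questions", "Question structure", 2),
    ("Adequate to different profiles of students", "Learning", 1),
    ("Score system", "Gamification", 1),
    ("Question and tip visualization", "Design / Usability", 1),
    ("Motivating", "Learning", 1),
    ("Interesting for online courses", "Learning", 1),
]

per_category = Counter()
for _code, category, n in positive:
    per_category[category] += n

print(len(positive), sum(per_category.values()))  # 12 distinct codes, 32 occurrences
print(per_category.most_common())
# [('Design / Usability', 11), ('Gamification', 10), ('Learning', 9), ('Question structure', 2)]
```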
Table 4: Positive Aspects stated by the participants

| Positive Aspects | Category | # |
|---|---|---|
| Ludic and interactive tool | Design / Usability | 7 |
| Support comprehension on code smells | Learning | 6 |
| Competition | Gamification | 5 |
| Easiness of use | Design / Usability | 3 |
| Tips | Gamification | 2 |
| Dynamic leaderboards | Gamification | 2 |
| Multiple choice questions | Question structure | 2 |
| Adequate to different profiles of students | Learning | 1 |
| Score system | Gamification | 1 |
| Question and tip visualization | Design / Usability | 1 |
| Motivating | Learning | 1 |
| Interesting for online courses | Learning | 1 |
We found 32 occurrences of 12 distinct codes describing positive aspects. These codes are grouped into four categories: "Design and Usability" (11 occurrences); "Gamification" (10 occurrences); "Learning" (9 occurrences); and "Question structure" (2 occurrences). The most recurring codes were the following: "Ludic and interactive tool" (7 occurrences); "Support comprehension on code smells" (6 occurrences); and "Competition" (5 occurrences). These results provide evidence of the positive attitude of students towards the effects of CleanGame (and its gamification approach) when applied to the acquisition of code smell identification skills.
We found 28 occurrences of 15 distinct codes representing negative feedback. These codes are grouped into three categories: "Design and usability" (11 occurrences); "Business rules" (10 occurrences); and "Experiment design" (7 occurrences). Problems in the "Design and usability" group are related to the graphical user interface, how its visual elements are arranged, or the lack of a particular visual element. "Business rules" are problems related to how things work in the software. This is the most interesting feedback, because it affects the functional aspect of CleanGame. For instance, the most recurring "Business rules" codes were related to showing the correct answer as feedback for the user after picking a wrong choice, and the suggestion not to disclose the scores of all users, as doing so may lead to embarrassment or undermine the motivation of users with lower performance. Finally, the category "Experiment design" groups codes related to complaints about how the experiment was organized. For instance, two users complained about the duration of the classes that took place before the experiment. We observed that most of the negative aspects are actually opportunities for improvement and do not jeopardize the learning process using CleanGame.
Table 5: Negative Aspects stated by the participants

| Negative Aspects | Category | # |
|---|---|---|
| Interface problems | Design / Usability | 7 |
| Should show the correct answer after failure | Business rules | 4 |
| Disclosing scores of all participants | Business rules | 4 |
| Code length | Experiment design | 2 |
| Confusing scoring system | Business rules | 1 |
| Rules for losing score for using tips | Business rules | 1 |
| Should have other types of questions | Design / Usability | 1 |
| Not being able to see tips already used | Business rules | 1 |
| Experiment duration | Experiment design | 1 |
| Form of displaying earned and lost points | Design / Usability | 1 |
| Difference between the duration of the quiz and identification activities | Experiment design | 1 |
| Poor experiment instructions | Experiment design | 1 |
| Should have provided the correct answers by the end of the experiment | Experiment design | 1 |
| Unknown metrics used | Experiment design | 1 |
| Should show the quantity of questions | Design / Usability | 1 |
As for RQ2, the results of our attitudinal survey would seem to suggest that most participants showed a positive attitude towards CleanGame. Results for Q5 and Q6 indicate that the participants found it less difficult to practice code smell identification with CleanGame support. The results related to the participants' experience with CleanGame show a positive perception regarding the relevance, perceived learning, and social interaction aspects of the tool. However, there were some mixed opinions regarding focus and satisfaction. Among the positive aspects described by the participants, there were 10 mentions of gamification and 9 mentions of positive effects on learning, out of the 32 code occurrences identified. Therefore, we have positive findings about students' attitudes towards CleanGame, especially regarding the gamification strategy used in the tool and its effect on learning.
8 THREATS TO VALIDITY
As with any empirical study, this experiment has several threats to validity. In this section, we outline the main threats to four types of validity that might jeopardize our experiment: (i) internal, (ii) external, (iii) conclusion, and (iv) construct. Internal validity has to do with the confidence that can be placed in the cause-effect relationship between the treatments and the dependent variables in the experiment. External validity is concerned with generalization: whether the cause-effect relationship between the treatments and the dependent variables can be generalized outside the scope of the experiment. Conclusion validity is centered around the conclusions that can be drawn from the relationship between treatment and outcome. Finally, construct validity concerns the relationship between theory and observation: whether the treatments properly reflect the cause and whether the outcomes suitably represent the effect.
8.1 Internal Validity
We mitigated the selection bias issue by using randomization. However, since we assumed that all subjects have similar backgrounds, no blocking factor was applied to minimize the threat of possible variations in the performance of the subjects. Therefore, we cannot rule out the possibility that some variability in how the subjects performed stems from their previous knowledge and experience. Another possible threat to internal validity has to do with the files containing the code smells we used in our experiment: if we had used other files, the results could have been different. Nevertheless, we tried to mitigate this threat by selecting files with code smells that are representative of the experience level of undergraduate and graduate students alike. Specifically, we selected code smells from Landfill [17], which is a web-based platform for sharing and validating code smell datasets. To the best of our knowledge, Landfill comprises the largest publicly available collection of manually validated smells.
8.2 External Validity
The main threat to the external validity of our results is the sample: as mentioned, the experiment was carried out with students, so all subjects have academic backgrounds. Thus, the insights gained from our experiment can be generalized only to similar settings (i.e., in the context of students with similar experience). We are aware that further replication of our experiment is needed to establish more conclusive results: to increase external validity, we need to replicate our study with a larger sample. Moreover, our sample consisted solely of students. Therefore, we cannot be sure whether CleanGame would also be able to engage practitioners while helping them hone their code smell identification skills. We cannot rule out the threat that the results could have been different if practitioners had been selected as subjects. So, it may be worthwhile to replicate our experiment with a more diverse sample (including practitioners) to corroborate our findings. It is also worth pointing out that the subjects in our sample may have a higher affinity for video games, and thus better attitudes toward a gamification-based approach, than the general population.
Additionally, regarding the generalization of our findings, we are aware that we limited the scope of our experiment to Java programs only. For future studies, we intend to replicate our experiment using programs written in other programming languages. It is also worth noting that we focused on code smells found in open-source programs, hence we cannot speculate about how the results of our experiment would differ when taking into account industrial-scale software. However, we conjecture that using programs that are far too complex might hinder learning.
8.3 Conclusion Validity
The approach we used to analyze the results of our experiment
represents the main threat to the conclusions we can draw from our
study: we discussed our results by presenting descriptive statistics
and statistical hypothesis tests.
8.4 Construct Validity
We cannot rule out the possibility that the measures we employed
in our experiment may not be appropriate to quantify the effects
we set out to investigate. For instance, the number of correct answers may be neither the only nor the most important predictor
of post-training effectiveness and engagement. If the measures we
used do not reflect the target theoretical construct, the results of
our experiment might be less reliable. One way to extend our study
is to examine which other measures might be relevant for a model
of post-training effectiveness in the context of code smell identifi-
cation. Moreover, given that the experiment took place as part of a
course, in which students are graded, students (i.e., subjects) may
bias their answers in the hope of getting a better grade. To mitigate
this threat, subjects were assured that the experiment would not
have any effect on their grades.
9 CONCLUDING REMARKS
We empirically evaluated whether gamification can have a positive
impact on post-training reinforcement for code smell identification
skills and concepts. To the best of our knowledge, this is the first
study that applies gamification to engage students in code smell
identification tasks during post-training reinforcement. To evaluate
the effectiveness of gamification in this context, we used the average rate of correct answers as a proxy. According to the results of our experiment, on average, subjects managed to identify twice as many code smells during learning reinforcement with a gamified approach as with the IDE-driven approach: the results of a non-parametric test show that subjects perform significantly better when using CleanGame than when using an IDE. We interpret these findings as general support for our hypothesis that gamification can be applied to engage students in activities that tend to be somewhat tedious and complex, viz., code smell identification. Furthermore, subjects were less apt to skip code smell identification tasks when using CleanGame. We believe that this can be ascribed to the metric-, refactoring-, and definition-related tips provided by the tool.
The results of our post-experiment attitudinal survey suggest
that most participants showed a positive attitude towards CleanGame
(and its gamification strategy) as an educational support tool for
practicing code smell identification. The participants' evaluation of the tool revealed that, while "Focus" and "Satisfaction" were the lowest-rated aspects of the game (though not negatively rated), the other aspects were rated positively, especially "Relevance", "Learning Perception", and "Social Interaction".
REFERENCES
[1] B. S. Akpolat and W. Slany. 2014. Enhancing software engineering student team engagement in a high-intensity Extreme Programming course using gamification. In 27th IEEE Conference on Software Engineering Education and Training (CSEE&T). 149–153.
[2] Manal M. Alhammad and Ana M. Moreno. 2018. Gamification in Software Engineering Education: A Systematic Mapping. Journal of Systems and Software 141 (2018), 131–150.
[3] M. M. Alhammad and A. M. Moreno. 2018. Gamification in software engineering education: A systematic mapping. Journal of Systems and Software 141 (2018), 131–150.
[4] Wendy L. Bedwell, Davin Pavlas, Kyle Heyne, Elizabeth H. Lazzara, and Eduardo Salas. 2012. Toward a taxonomy linking game attributes to learning: An empirical study. Simulation & Gaming 43 (2012), 729–760.
[5] Jonathan Bell, Swapneel Sheth, and Gail Kaiser. 2011. Secret ninja testing with HALO software engineering. In Proceedings of the 4th International Workshop on Social Software Engineering (SSE). 43–47.
[6] Pierre Bourque and Richard E. Fairley. 2014. Guide to the Software Engineering Body of Knowledge (SWEBOK(R)): Version 3.0. IEEE Computer Society Press.
[7] Luis Cruz, Rui Abreu, and Jean-Noël Rouvignac. 2017. Leafactor: Improving Energy Efficiency of Android Apps via Automatic Refactoring. In 4th International Conference on Mobile Software Engineering and Systems (MOBILESoft). 205–206.
[8] Sebastian Deterding, Miguel Sicart, Lennart Nacke, Kenton O'Hara, and Dan Dixon. 2011. Gamification: using game-design elements in non-gaming contexts. In CHI '11 Extended Abstracts on Human Factors in Computing Systems. 2425–2428.
[9] Daniel J. Dubois and Giordano Tamburrelli. 2013. Understanding gamification mechanisms for software development. In Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering (ESEC/FSE). 659–662.
[10] Leonard Elezi, Sara Sali, Serge Demeyer, Alessandro Murgia, and Javier Pérez. 2016. A game of refactoring: Studying the impact of gamification in software refactoring. In Scientific Workshop Proceedings of XP2016. 1–6.
[11] Martin Fowler. 2018. Refactoring: Improving the Design of Existing Code. Addison-Wesley Professional.
[12] G. Fraser. 2017. Gamification of Software Testing. In 12th IEEE/ACM International Workshop on Automation of Software Testing (AST). 2–7.
[13] Felix Garcia, Oscar Pedreira, Mario Piattini, Ana Cerdeira-Pena, and Miguel Penabad. 2017. A framework for gamification in software engineering. Journal of Systems and Software 132 (2017), 21–40.
[14] Eduardo Herranz, Ricardo Colomo-Palacios, and Antonio de Amescua Seco. 2015. Gamiware: A Gamification Platform for Software Process Improvement. In 22nd Systems, Software and Services Process Improvement (SPI). 127–139.
[15] P. Kruchten, R. L. Nord, and I. Ozkaya. 2012. Technical Debt: From Metaphor to Theory and Practice. IEEE Software 29 (2012), 18–21.
[16] The Joint Task Force on Computing Curricula. 2015. Curriculum Guidelines for Undergraduate Degree Programs in Software Engineering. Technical Report. MISSING.
[17] F. Palomba, D. Di Nucci, M. Tufano, G. Bavota, R. Oliveto, D. Poshyvanyk, and A. De Lucia. 2015. Landfill: An Open Dataset of Code Smells with Public Evaluation. In 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories (MSR). 482–485.
[18] Giani Petri, Christiane Gresse von Wangenheim, and Adriano Ferreti Borgatto. 2017. A large-scale evaluation of a model for the evaluation of games for teaching software engineering. In 2017 IEEE/ACM 39th International Conference on Software Engineering: Software Engineering Education and Training Track (ICSE-SEET). 180–189.
[19] Jose Miguel Rojas and Gordon Fraser. 2016. Code Defenders: a mutation testing game. In 9th International Conference on Software Testing, Verification and Validation Workshops (ICSTW). 162–167.
[20] Tommaso Dal Sasso, Andrea Mocci, Michele Lanza, and Ebrisa Mastrodicasa. 2017. How to gamify software engineering. In 2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER). 261–271.
[21] Tushar Sharma and Diomidis Spinellis. 2018. A survey on software smells. Journal of Systems and Software 138 (2018), 158–173.
[22] Mauricio Ronny Almeida Souza, Lucas Veado, Renata Teles Moreira, Eduardo Figueiredo, and Heitor Costa. 2018. A Systematic Mapping Study on Game-related Methods for Software Engineering Education. Information and Software Technology 95 (2018), 201–218.
[23] D. Spinellis. 2012. Refactoring on the Cheap. IEEE Software 29 (2012), 96–95.
[24] Klaas-Jan Stol, Paul Ralph, and Brian Fitzgerald. 2016. Grounded theory in software engineering research: a critical review and guidelines. In 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE). 120–131.
[25] C. Wohlin, P. Runeson, M. Höst, M. C. Ohlsson, B. Regnell, and A. Wesslén. 2012. Experimentation in Software Engineering. Springer.