CleanGame: Gamifying the Identification of Code Smells
Hoyama Maria dos Santos
Federal University of Lavras
Lavras-MG, Brazil
hoyama.santos@ufla.br
Vinicius H. S. Durelli
Federal University of São João Del Rei
São João Del Rei-MG, Brazil
durelli@ufsj.edu.br
Maurício Souza
Federal University of Minas Gerais
Belo Horizonte-MG, Brazil
mrasouza@dcc.ufmg.br
Eduardo Figueiredo
Federal University of Minas Gerais
Belo Horizonte-MG, Brazil
figueiredo@dcc.ufmg.br
Lucas Timoteo da Silva
Federal University of Lavras
Lavras-MG, Brazil
lucastimoteo@ufla.br
Rafael S. Durelli
Federal University of Lavras
Lavras-MG, Brazil
rafael.durelli@ufla.br
ABSTRACT
Refactoring is the process of transforming the internal structure of existing code without changing its observable behavior. Many studies have shown that refactoring increases program maintainability and understandability. Due to these benefits, refactoring is recognized as a best practice in the software development community. However, prior to refactoring activities, developers need to look for refactoring opportunities, i.e., developers need to be able to identify code smells, which essentially are instances of poor design and ill-considered implementation choices that may hinder code maintainability and understandability. Code smell identification, however, is overlooked in the Computer Science curriculum. Recently, Software Engineering educators have started exploring gamification, which entails using game elements in non-game contexts, to improve instructional outcomes in educational settings. The potential of gamification lies in supporting and motivating students, enhancing the learning process and its outcomes. We set out to evaluate the extent to which this claim is valid in the context of post-training reinforcement. To this end, we devised and implemented CleanGame, a gamified tool that covers one important aspect of the refactoring curriculum: code smell identification. We also carried out an experiment involving eighteen participants to probe into the effectiveness of gamification in the context of post-training reinforcement. We found that, on average, participants managed to identify twice as many code smells during learning reinforcement with a gamified approach as with a non-gamified approach. Moreover, we administered a post-experiment attitudinal survey to the participants; according to its results, most participants showed a positive attitude towards CleanGame.
CCS CONCEPTS
• Social and professional topics → Software engineering education.
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specific permission and/or a
fee. Request permissions from permissions@acm.org.
SBES 2019, September 23–27, 2019, Salvador, Brazil
©2019 Association for Computing Machinery.
ACM ISBN 978-1-4503-7651-8/19/09...$15.00
https://doi.org/10.1145/3350768.3352490
KEYWORDS
Refactoring, gamification, code smell, Software Engineering educa-
tion, post-training reinforcement
ACM Reference Format:
Hoyama Maria dos Santos, Vinicius H. S. Durelli, Maurício Souza, Eduardo
Figueiredo, Lucas Timoteo da Silva, and Rafael S. Durelli. 2019. CleanGame:
Gamifying the Identification of Code Smells. In XXXIII Brazilian Symposium
on Software Engineering (SBES 2019), September 23–27, 2019, Salvador, Brazil.
ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3350768.3352490
1 INTRODUCTION
Many studies involving industrial scale software systems have pro-
vided evidence that the lion’s share of software development ex-
penses can be ascribed to software maintenance. Maintaining soft-
ware systems is a challenging and long-standing topic owing mostly
to the fact that modern software systems must cope with chang-
ing requirements. As a consequence, developers need to strive to
keep software systems in a condition that allows for continuous
evolution. This constant need for improving software systems has
spurred a growing interest in refactoring, which is deemed as one
of the main practices to improve the internal structure of evolving
software systems [11]. The key idea underlying refactoring is to
improve the internal structure of existing code without changing
the observable behavior [11], thereby preparing the code for future
modifications. When performed properly, refactoring activities
improve the design of software, increasing maintainability and un-
derstandability. Accordingly, refactoring is listed as a recommended
practice in the Software Engineering (SE) body of knowledge [6].
Prior to refactoring activities, developers need to look for code
smells, i.e., particular code structures that when removed through
refactoring activities lead to more readable, easy-to-understand,
and cheaper-to-modify code. However, the set of skills required to
identify code smells is acquired through training and experience.
Despite the aforementioned benefits, refactoring and code smell
identification skills have been overlooked in the Computer Science
curriculum. Even though continuous evolution (i.e., maintenance
activities) accounts for more technical and financial resources than
software development per se, a major share of a typical undergraduate
curriculum is dedicated to development activities [16].
Practices such as refactoring are often neglected in favor of more
constructive activities such as design and implementation. In effect,
going through code while looking for code smells is a difficult and
somewhat boring task.
A recurring challenge in SE education is engaging students in
learning activities that relate to the professional practices of SE.
Additionally, it is often challenging for SE students to contextualize
how some concepts and skills will fit into or influence their fu-
ture professional practices. Recently, in hopes of dealing with this
challenge, the SE education community has turned to innovative
pedagogical strategies such as gamification [2, 22]. Essentially,
gamification entails employing game design elements in a non-game
setting. In other words, gamification is centered around generating
learning experiences that convey feelings and engage students as
if they were playing games, but not with entertainment in mind.
We conjecture that gamification can be used to improve SE educa-
tion. More specifically, we believe that gamification can be used to
support and motivate SE students in the development of code smell
identification skills by turning a difficult and somewhat tedious
activity (e.g., going over snippets of code) into an engaging experi-
ence. There is much potential insight to be gained in exploring how
SE education can be improved by devising gamification approaches
that cover different aspects related to topics that are overlooked
in academic curricula, e.g., code smell identification concepts and
skills.
Based on the premise that gamification is well suited to engage
students with code smell identification concepts, especially when
used as a way to provide students with training follow-up, we set
out to explore whether gamification can have a positive impact on
post-training reinforcement in comparison with a more traditional
approach, which consists in setting up post-training reinforcement
content manually. Generally, traditional post-training to evaluate
skill-building in activities such as code smell identification entails
hands-on tasks that involve perusing source code for code smells.
Usually, in traditional post-training these tasks are supported only
by an integrated development environment (IDE), which allows for
easier code navigation. The lack of guidelines and elements to keep
students engaged makes traditional post-training for code smell
identification unwieldy, so a gamified approach can be employed
to mitigate these problems. To probe into the benefits provided
by a gamified environment over an IDE-driven post-training, we
developed a tool that supports post-training activities centered
around code smell identification. In the context of our tool, these
post-training activities follow a gameful design approach, i.e., they
leverage gamification elements such as leaderboards and rewarding
badges. To the best of our knowledge, our tool is the first educational
platform to realize a gamified, post-training reinforcement approach
to code smell identification. To corroborate the benefits of our
gamified approach, we carried out two evaluations: an experiment
involving 18 participants and an attitudinal survey, which was
conducted after the experiment. The main contributions of our
research are threefold:
(1) We introduce CleanGame: a gamified platform for post-training
reinforcement of code smell identification concepts and skills.
(2) In keeping with current evidence, we argue that a gamified
environment is more effective at conveying code smell identification
skills while keeping students engaged than a more traditional (i.e.,
IDE-driven) approach to code smell identification. Hence, we carried
out an experiment to probe into the impact and soundness of
gamification in supporting and engaging students during code smell
identification activities.
(3) We administered an attitudinal survey to the experiment
participants to get an overview of their attitudes towards CleanGame
and the advantages and drawbacks of using a gamified approach to
code smell identification.
The participants in our experiment confirmed that playing the
game is fun, and that identifying smells as part of CleanGame
is more enjoyable than doing so outside the game. On average,
participants were able to identify approximately twice as many
code smells using CleanGame (4.94) as when using an IDE (2.39).
Additionally, the best-performing participants were able to correctly
identify 8 out of 10 code smells using CleanGame.
The remainder of this paper is organized as follows. Section 2
provides background on code smells and gamification. Section 3 out-
lines related work. Section 4 gives a brief description of CleanGame.
Section 5 details the experiment we carried out to evaluate CleanGame.
Section 6 discusses the results of the experiment and their impli-
cations. The quantitative results of the attitudinal survey are pre-
sented in Section 7. Section 8 discusses the threats to validity of the
study. Section 9 presents concluding remarks.
2 BACKGROUND
This section describes the theoretical foundation necessary for
understanding CleanGame (i.e., code smells and gamification).
2.1 Code Smells
Code smells, also known as bad smells or just smells [11], are
symptoms of poor design or implementation choices in the source
code, and represent one of the most serious forms of technical
debt [15]. Fowler et al. [11] described 22 smells and incorporated
them into refactoring strategies to improve design quality. In
addition to the smells proposed by Fowler et al. [11], there are many
other code smells [7]. Nevertheless, in this paper we focus on the
following code smells:
(i) Large Class: classes that are trying to do too much often have a
large number of instance variables;
(ii) Long Method: a method that contains too many lines of code;
(iii) Divergent Change: happens when a class is often changed in
many different ways and for different reasons;
(iv) Feature Envy: happens when a class spends more time
communicating with the functions or data of another class than with
its own; it may occur after fields have been moved to a data class;
(v) Shotgun Surgery: occurs when changing a class entails making
many small modifications in many different classes as well.
We selected these code smells because they are widely used in
academic and industrial settings [21].
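To make these definitions concrete, consider the short, hypothetical Java snippet below; it is not taken from the experiment material, and the class and field names are made up for illustration. The computeTotal method exhibits Feature Envy: it is far more interested in the data of Order than in the data of its own class.

```java
// Hypothetical illustration of the Feature Envy smell.
class Order {
    double basePrice;
    double discountRate;
    double shipping;
}

class BillingService {
    // Smelly: this method reads three fields of Order to do Order's job.
    // A common fix is the Move Method refactoring, relocating the
    // computation into Order itself.
    double computeTotal(Order order) {
        double discounted = order.basePrice * (1 - order.discountRate);
        return discounted + order.shipping;
    }
}
```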
2.2 Gamification
Gamification is a relatively new term that has been used to denote
the use of game elements and game-design techniques in non-
gaming contexts [8]. Game elements are a set of components that
compose a game [4]. In some studies, game elements are also called
game attributes [4].
In the context of SE, there has been an increasing interest in
using gamification with the goal of increasing engagement and
motivation. Researchers and practitioners have started adopting
gamification in several different contexts, such as gamification of
the software development life cycle [9], software process improvement
initiatives [14], and also in SE education [2, 22]. However,
as mentioned, very little attention has been directed towards
integrating refactoring and code smell identification concepts into the
Computer Science curriculum. As stated by Fraser [12], some SE
activities are overlooked in academic curricula because emphasis is
placed on more constructive activities such as software design and
implementation.
In summary, gamification can be applied as a strategy to turn
complex or somewhat boring activities into engaging and competi-
tive activities. Thus, there is much potential insight to be gained
in exploring how SE education can be further improved by devel-
oping gamification approaches that cover different aspects related
to topics that are in a way overlooked in academic curricula, e.g.,
refactoring and refactoring-related concepts such as code smells.
3 RELATED WORK
The proposal and use of game-related methods is a growing topic
in SE education [22]. Gamification, specifically, has gained
considerable attention lately [3], both in the professional and
educational contexts of SE, as a method to increase the motivation and
engagement of subjects in the execution of SE activities.
In the professional context, Dal Sasso et al. [20] and Garcia et al.
[13] propose frameworks for the gamification of SE activities. The
former [20] provides a set of basic building blocks to apply
gamification techniques, supported by a conceptual framework. The
latter [13] proposes a complete framework for the introduction of
gamification in SE environments.
In the context of SE education, there are also several proposals that
use gamification to support varied knowledge areas. Akpolat and
Slany [1] use weekly challenges to motivate students to apply
eXtreme Programming practices in their projects; the students had
to compete for a “challenge cup” award. Code Defenders [19] uses
gamification to create a ludic and competitive approach to promote
software testing using mutation and unit tests. Bell et al. [5] expose
students to software testing using a game-like environment, HALO
(Highly Addictive, sociaLly Optimized) SE.
Nonetheless, to the best of our knowledge, there is no study
focusing on the detection of code smells. CodeArena [10], for instance,
uses gamification to motivate refactoring; however, its target users
are practitioners. We believe that CleanGame is a solid contribution
to the context of game-related approaches to SE education.
4 CLEANGAME
This section describes CleanGame¹, a gamified software tool aimed
at teaching code smell detection. CleanGame is composed of two
independent modules: Smell-related Quiz and Code Smell
Identification. The goal of the first module is to allow students to learn
or revisit the main concepts surrounding code smells.
To achieve this goal, the Smell-related Quiz module presents ques-
tions about code smells with multiple-choice answers. The second
module, Code Smell Identification, focuses on practical tasks of
¹ CleanGame is available at https://bit.ly/2W6xClB
identifying code smells in the source code. The current implementation
of this module is integrated with PMD² to allow the creation of a
list of code smells identified in Java source code. CleanGame allows
users not only to access pre-defined quizzes and identification tasks,
but also to create their own quizzes and tasks.
Figure 1: Smell-related Quiz module of CleanGame.
Figure 1 presents a screenshot of CleanGame. This figure shows
a quiz question with four possible answers, from which the user has
to choose the best option. On the right-hand side of Figure 1, we can
see several game elements used in this gamified software tool, such
as player status, score, timing, and the option to skip questions.
CleanGame also presents a ranking of the top-10 best scores in the
quiz, so the player is able to check in real time their position in the
current quiz and how far their score is from the top scores.
The player's score on the current question is penalized in several
situations: for instance, if the player either skips a question or takes
too long to answer it, their score on that question is reduced by up
to the total number of points assigned to the question. Code
smell identification tasks also have options that allow players to
ask for help (shown in Figure 2).
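The paper does not specify CleanGame's exact scoring formula; the minimal sketch below illustrates one penalty scheme consistent with the rules just described. The per-question maximum, the time decay rate, and the hint penalty are all assumptions made for illustration, not CleanGame's actual values.

```java
// Minimal sketch of a question-scoring scheme consistent with the rules
// described above; all constants are assumed, not CleanGame's actual values.
final class QuestionScore {
    static final int MAX_POINTS = 100;                // assumed points per question
    static final double POINTS_LOST_PER_SECOND = 0.5; // assumed time penalty
    static final int HINT_PENALTY = 20;               // assumed cost per hint

    static int score(boolean skipped, long secondsTaken, int hintsUsed) {
        if (skipped) {
            return 0; // skipping forfeits all points for the question
        }
        double penalty = secondsTaken * POINTS_LOST_PER_SECOND
                + hintsUsed * HINT_PENALTY;
        // A question can cost the player at most the points assigned to it.
        return (int) Math.max(0.0, MAX_POINTS - penalty);
    }
}
```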
It is worth mentioning that CleanGame is fully integrated with
the GitHub application programming interface (API). Therefore,
during the creation of a room in the identification module, the user
needs to provide a uniform resource locator (URL) of a Java GitHub
repository. The Java source code is then cloned and transformed
into an abstract syntax tree (AST) in a fully automatic way to create
an oracle of smell-related questions. Three help hints are available:
the metrics used to detect the code smell, the refactoring aimed at
addressing it, and a short definition of it. Asking for help also
negatively impacts the points the player receives for a question.
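CleanGame itself delegates smell detection to PMD, but the general repository-to-oracle pipeline can be sketched as follows. The snippet below is a hedged illustration using JGit and JavaParser (libraries assumed here for brevity, not necessarily the ones CleanGame uses), with a hypothetical repository URL and an arbitrary 30-line threshold for flagging Long Method candidates.

```java
import com.github.javaparser.StaticJavaParser;
import com.github.javaparser.ast.CompilationUnit;
import com.github.javaparser.ast.body.MethodDeclaration;
import org.eclipse.jgit.api.Git;

import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Sketch: clone a Java repository, parse each file into an AST, and flag
// Long Method candidates via a simple line-count threshold.
public class SmellOracleSketch {
    static final int LONG_METHOD_THRESHOLD = 30; // arbitrary illustrative cutoff

    public static void main(String[] args) throws Exception {
        File workDir = Files.createTempDirectory("repo").toFile();
        Git.cloneRepository()
           .setURI("https://github.com/example/example-project.git") // hypothetical
           .setDirectory(workDir)
           .call();

        try (Stream<Path> paths = Files.walk(workDir.toPath())) {
            paths.filter(p -> p.toString().endsWith(".java"))
                 .forEach(SmellOracleSketch::flagLongMethods);
        }
    }

    static void flagLongMethods(Path file) {
        try {
            CompilationUnit cu = StaticJavaParser.parse(file);
            for (MethodDeclaration md : cu.findAll(MethodDeclaration.class)) {
                md.getRange().ifPresent(range -> {
                    int lines = range.end.line - range.begin.line + 1;
                    if (lines > LONG_METHOD_THRESHOLD) {
                        System.out.printf("Long Method candidate: %s (%d lines)%n",
                                md.getNameAsString(), lines);
                    }
                });
            }
        } catch (Exception e) {
            // Skip files that fail to parse.
        }
    }
}
```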
5 EXPERIMENT SETUP
We surmise that gamification is well suited to better engage
students with refactoring-related topics such as code smell
identification, especially when used as a way to provide students with
training follow-up. Based on this assumption, we set out to explore
whether gamification can have a positive impact on post-training
² PMD is an extensible cross-language static code analyzer, available at
https://pmd.github.io/
reinforcement in comparison with a more traditional approach,
which consists in setting up post-training reinforcement content
manually. Traditional post-training to evaluate skill-building in ac-
tivities such as code smell identification entails hands-on tasks that
involve perusing source code for code smells. Generally, the tasks
in post-training are supported only by an integrated development
environment (IDE), which allows for easier code navigation [23].
The lack of guidelines and elements to keep students engaged makes
traditional post-training for code smell identification unwieldy. We
believe that a gamified approach can be employed to mitigate these
problems. To probe into the benefits provided by a gamified
environment over IDE-driven post-training, we developed
a tool that supports post-training activities centered around code
smell identification. In the context of our tool, these post-training
activities follow a gameful design approach, i.e., they leverage
gamification elements such as leaderboards and rewarding badges.
Figure 2: Smell-related Identification module of CleanGame.
We designed an experiment to answer the following research
question:
RQ1: Does gamification have a positive impact on how students
identify code smells during post-training activities?
We surmised that students will thrive on a game-based approach
to code smell identification because there is evidence that gamifica-
tion elements such as points and leaderboards convey a sense of
competence to students and enhance intrinsic motivation, thereby
improving performance. In keeping with this evidence, we posit
that a gamified environment is more effective at conveying code
smell identification skills and keeping students engaged than a
more traditional approach to code smell identification (i.e., IDE-
driven). Therefore, RQ1 comes down to examining the impact and
soundness of gamification in engaging students in code smell
identification from a researcher's perspective. In the context of our
experiment, we used the following proxy to measure the effectiveness
of gamification: the average rate of correct answers (i.e., the number
of code smells correctly identified).
5.1 Scoping
5.1.1 Experiment Goals. We used the organization proposed by the
Goal/Question/Metric (GQM) approach [25] to set the goals of our
experiment.
Following such goal definition template, the scope of our study can
be summarized as outlined below.
Analyze our gamified approach
for the purpose of evaluation
with respect to post-training effectiveness
from the point of view of the researcher
in the context of students looking for code smells.
5.1.2 Hypotheses Formulation. We framed our prediction for RQ1
as follows: our gamified post-training approach is more effective
than an IDE-driven approach. As mentioned, to answer RQ1 we
evaluated the effectiveness of the post-training approaches in terms
of one proxy measure: the number of correctly identified code smells
(CICS). RQ1 was thus turned into the following hypotheses:
Null hypothesis, $H_{0_{CICS}}$: there is no difference between a
gamified approach and an IDE-based approach for code smell
identification in terms of the number of code smells correctly
identified by students during post-training.

Alternative hypothesis, $H_{1_{CICS}}$: students are able to identify
more code smells through a gamified environment than through a
traditional (i.e., IDE-driven) approach.

Let $\mu$ be the average number of correctly identified code smells,
so that $\mu_{CleanGame}$ and $\mu_{IDE}$ denote the average number
of code smells correctly identified by students using CleanGame and
using the IDE-based approach, respectively. Then, the aforementioned
hypotheses can be formally stated as:

$$H_{0_{CICS}}: \mu_{CleanGame} = \mu_{IDE}$$

$$H_{1_{CICS}}: \mu_{CleanGame} > \mu_{IDE}$$
5.2 Selection of Subjects
This experiment was run using Computer Science students: more
specifically, undergraduate, master’s, and PhD students were used
as subjects in our experiment. This experiment was run at the
Federal University of Minas Gerais (UFMG). It is worth highlighting
that this study can be classified as a quasi-experiment due to the
lack of random selection of participants: the participating students
signed up for the course. We elaborate on the ability to generalize
from this specific context in Section 8.
All subjects signed a consent form prior to participating in the
experiment. All subjects already had prior experience with Java
and object oriented programming. Previous knowledge regarding
refactoring and refactoring-related concepts (e.g., code smells) was
not mandatory. Note that none of the subjects had participated in
the course before.
5.3 Experiment Design
This experiment has one factor with two treatments: the factor is
the post-training reinforcement approach through which the sub-
jects try to identify code smells and the treatments are CleanGame
(i.e., our gamified way of teaching and supporting the identification
of code smells) and an IDE-based approach, which is a hands-on
assignment using an IDE. The experience of the subjects was not used
as a blocking factor: we did not ask subjects to fill out a pre-experiment
questionnaire because we decided against further stratifying our
sample into groups with similar experience levels. It is therefore
assumed that the subjects in this experiment have equivalent
backgrounds and levels of experience.
We used a randomized crossover design so that all subjects could
be exposed to both post-training approaches. That is, all participants
were assigned to use both CleanGame and the IDE-based post-training
approach. Both groups went over the same Java programs
and code smells. Note that no subjects quit the experiment.
5.4 Instrumentation
In the introduction phase, subjects attended lectures (i.e., classroom-
based delivery) about refactoring and code smells. We employed
a randomized crossover design so that subjects could be exposed
to both post-training tasks, i.e., the gamified and the traditional
approach. More specifically, subjects were randomly split into
two groups and assigned to complete code smell identification tasks
using each approach as follows: one group performed code smell
identification using an IDE followed by code smell identification
using CleanGame; the other group performed code smell identifica-
tion with CleanGame followed by code smell identification using
an IDE. Therefore, the response is measured twice in each subject.
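For concreteness, a minimal sketch of this random crossover assignment is shown below. The subject labels and the use of a simple shuffle are illustrative assumptions; the paper does not describe the actual tooling used for randomization.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of the randomized crossover assignment described above: subjects
// are shuffled and split into two groups that receive the treatments in
// opposite orders. Subject labels are illustrative.
public class CrossoverAssignment {
    public static void main(String[] args) {
        List<String> subjects = new ArrayList<>();
        for (int i = 1; i <= 18; i++) {
            subjects.add("subject-" + i); // illustrative labels
        }
        Collections.shuffle(subjects);

        List<String> group1 = subjects.subList(0, subjects.size() / 2);
        List<String> group2 = subjects.subList(subjects.size() / 2, subjects.size());

        System.out.println("Group 1 (IDE first, then CleanGame): " + group1);
        System.out.println("Group 2 (CleanGame first, then IDE): " + group2);
    }
}
```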
Following randomization, the subjects assigned to be the first
group to use CleanGame took part in a short training session in
which they were introduced to each feature of our tool. During this
training session, we had the subjects identify a few code smells
using CleanGame. The goal was to allow the subjects to familiarize
themselves with the graphical user interface (GUI) of the tool. Ad-
ditionally, throughout this training session, subjects were allowed
to ask any questions about CleanGame. No further assistance was
provided to the subjects assigned to carry out post-training tasks
using the traditional approach (i.e., IDE-driven).
Since we used a randomized crossover design, in a later stage,
the group that first took part in code smell identification tasks using
CleanGame was then assigned to carry out code smell identification
tasks using the traditional approach. In turn, the group initially
assigned to the traditional approach was introduced to CleanGame
(i.e., participated in our brief training session) and proceeded to
identify code smells using our gamified approach.
The advantages of applying each code smell identification ap-
proach as perceived by the subjects were investigated through a
post-questionnaire handed out after the experiment had been car-
ried out (i.e., wrap-up phase). Moreover, the same questionnaire
was also used to gather further information from the participants
concerning the main hindrances/inhibitors of applying both code
smell identification approaches.
6 EXPERIMENT RESULTS
In this section, we present the results of the experiment we carried
out. First, we outline some descriptive statistics; then, we present
the hypothesis testing.
6.1 Descriptive Statistics
As mentioned, we employed a randomized crossover design, so
subjects in both groups were exposed to the two approaches (Sub-
section 5.4). Table 1 presents detailed results of the performance of
the eighteen participants when using CleanGame to identify code
smells. In Table 2 we summarize how the subjects performed while
identifying code smells via an IDE.
As shown in Tables 1 and 2, on average, subjects were able to
identify approximately twice as many code smells using CleanGame
(4.94) as when using an IDE (2.39). Additionally, the best-performing
subject was able to correctly identify 8 code smells out of 10 using
CleanGame. As shown in Figure 3, subjects in group 1 performed
slightly better at code smell identification while using CleanGame
than when using an IDE. As for the subjects in group 2, they per-
formed significantly better when identifying code smells using
CleanGame (Figure 3). When combining the performance of both
groups with both experimental treatments, our results would seem
to indicate that CleanGame allowed participants to be more effec-
tive at code smell identification (Figure 3, boxplot on the left).
Subjects were more apt to skip code smell identification tasks when
using an IDE: on average, subjects skipped 1.22 tasks, and subject #5
from group 2 skipped the largest number of questions in this group
(6 questions). Participants who skipped more questions also seemed
to have had difficulty with most questions: for instance, subjects #3,
#5, #6, and #9 had a high ratio of incorrect answers. In contrast, while
using CleanGame, participants seldom skipped over questions (as
shown in Table 1). We surmise that participants were less likely
to skip questions while using CleanGame because of the metrics-,
refactoring-, and definition-related tips provided by the tool. Ac-
cording to the results in Table 1, the most commonly requested
type of tip was metric-related: while going over the 10 code smell
identification tasks, subjects requested on average approximately
five metric-related tips. Interestingly, refactoring-related tips were
not requested very often by the participants. Definition-related tips
were the least requested type of tip. We believe that these results
might indicate that the subjects had a good grasp of the concepts
underlying code smells but needed some sort of metric to back up
their opinions regarding whether or not they were looking at a
given code smell. Given that an IDE does not provide much sup-
port in terms of code smell identification, our results indicate that
tackling those tasks within an IDE seems to be more difficult for
most participants; thus, participants seem to stop responding more
often at points where the code smell identification task gets complicated.
6.2 Hypothesis Testing
To test the hypotheses we formulated in Section 5.1.2, we applied a
paired Wilcoxon signed-rank test. As we hypothesized, according
to the results of this non-parametric test, subjects perform
significantly better when using CleanGame than when using an IDE
($V = 125.5$, $p = 0.003$).
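For readers who wish to reproduce this analysis, the sketch below shows how the paired signed-rank test can be computed with Apache Commons Math (an assumed dependency; the paper does not state which statistical package was used). The arrays transcribe the per-subject correct-answer counts from Tables 1 and 2, under the assumption that subject numbers match across the two tables. Note that Commons Math reports a two-sided p-value under the normal approximation, which can be halved for the one-sided alternative when the effect points in the hypothesized direction; packages also treat zero differences differently, so the result may deviate slightly from the value reported above.

```java
import org.apache.commons.math3.stat.inference.WilcoxonSignedRankTest;

// Sketch of the paired Wilcoxon signed-rank analysis. Scores are the
// per-subject correct answers from Tables 1 and 2 (groups 1 and 2 in order),
// assuming subject IDs correspond across tables.
public class SignedRankSketch {
    public static void main(String[] args) {
        double[] cleanGame = {6, 6, 5, 5, 5, 4, 2, 1, 3,   // group 1
                              8, 7, 7, 6, 6, 5, 5, 4, 4};  // group 2
        double[] ide       = {3, 2, 5, 3, 4, 3, 2, 5, 2,   // group 1
                              4, 3, 0, 1, 2, 0, 2, 0, 2};  // group 2

        WilcoxonSignedRankTest test = new WilcoxonSignedRankTest();
        double wPlus = test.wilcoxonSignedRank(cleanGame, ide);       // W+ (R's V)
        double pTwoSided = test.wilcoxonSignedRankTest(cleanGame, ide, false);

        System.out.printf("V = %.1f, two-sided p = %.4f%n", wPlus, pTwoSided);
    }
}
```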
Table 1: Code smell identification performance of the two experimental groups using CleanGame.

Group 1
Subject | Correct Answers | Incorrect Answers | Skipped | Metric-related Tips | Refactoring-related Tips | Definition Tips | Average Time
#1 | 6 | 4 | 0 | 8 | 2 | 0 | 476
#2 | 6 | 4 | 0 | 6 | 4 | 4 | 2,593
#3 | 5 | 5 | 0 | 6 | 3 | 0 | 1,336
#4 | 5 | 5 | 0 | 9 | 4 | 0 | 640
#5 | 5 | 5 | 0 | 6 | 3 | 3 | 674
#6 | 4 | 6 | 0 | 4 | 4 | 3 | 1,601
#7 | 2 | 7 | 1 | 1 | 1 | 0 | 1,198
#8 | 1 | 9 | 0 | 3 | 1 | 0 | 1,231
#9 | 3 | 7 | 0 | 3 | 2 | 1 | 891

Group 2
Subject | Correct Answers | Incorrect Answers | Skipped | Metric-related Tips | Refactoring-related Tips | Definition Tips | Average Time
#1 | 8 | 2 | 0 | 8 | 4 | 0 | 897
#2 | 7 | 3 | 0 | 1 | 1 | 0 | 1,259
#3 | 7 | 3 | 0 | 1 | 0 | 0 | 2,349
#4 | 6 | 4 | 0 | 8 | 4 | 0 | 1,317
#5 | 6 | 4 | 0 | 9 | 4 | 1 | 993
#6 | 5 | 5 | 0 | 9 | 2 | 0 | 1,411
#7 | 5 | 5 | 0 | 8 | 2 | 1 | 1,951
#8 | 4 | 6 | 0 | 0 | 0 | 0 | 960
#9 | 4 | 6 | 0 | 1 | 1 | 1 | 943

Descriptive Statistics for both Experimental Groups
Min | 1 | 2 | 0 | 0 | 0 | 0 | 476
Max | 8 | 9 | 1 | 9 | 4 | 4 | 2,593
Mean | 4.94 | 5.00 | 0.05 | 5.06 | 2.33 | 0.78 | 1,262.22
Std. Dev. | 1.76 | 1.68 | 0.24 | 3.30 | 1.46 | 1.27 | 566.10

Average time is indicated in seconds.
[Figure 3 shows three boxplots of the number of correct answers: performance using both post-training approaches (CleanGame vs. IDE), performance using CleanGame per group, and performance using an IDE per group.]

Figure 3: Overview of the performance of the experimental subjects in terms of properly identifying code smells using both post-training approaches.
7 ATTITUDINAL SURVEY
This section outlines the results of an attitudinal survey we con-
ducted to answer the following research questions:
RQ2: Do students have a positive attitude towards a game-based
learning experience? The effectiveness of a post-training approach
is heavily influenced by the attitudes held toward how the
instructional content is presented. Thus, RQ2 investigates the subjects'
outlook on gamification as a post-training approach to code smell
identification.

RQ3: What are the advantages and drawbacks of a gamified
post-training approach? In addition, we set out to investigate the pros
and cons of gamification as a post-training reinforcement approach
from the standpoint of students.

Therefore, the goal of this attitudinal survey is to gauge students'
opinions, level of satisfaction, and overall attitude towards our
gamified post-training approach.
After developing an initial draft of the survey questionnaire, we
ran a pilot test with a group of five (Computer Science) graduate
students. Our goal was to validate the questionnaire in terms of
clarity, objectiveness, and correctness. We refined the questionnaire
based on the feedback from the pilot study and created an online
version using Google Forms³. Table 3 summarizes each question
in the questionnaire. The questionnaire comprises 24 questions,
divided into three parts: Q1 to Q4 are aimed at gathering background
information from the participants; Q5 to Q9 are questions
about code smell identification (both with and without CleanGame);
and Q10 to Q24 are related to the participants' experience while using
CleanGame. It is worth mentioning that questions Q10 to Q21 were
³ https://www.google.com/forms/about/
Table 2: Code smell identification performance of the two experimental groups using an IDE.

Group 1
Subject | Correct Answers | Incorrect Answers | Skipped | Average Time
#1 | 3 | 7 | 0 | 2,340
#2 | 2 | 8 | 0 | 1,080
#3 | 5 | 5 | 0 | 1,500
#4 | 3 | 7 | 0 | 1,380
#5 | 4 | 6 | 0 | 1,140
#6 | 3 | 7 | 0 | 1,680
#7 | 2 | 8 | 0 | 1,860
#8 | 5 | 5 | 0 | 2,220
#9 | 2 | 8 | 0 | 1,680

Group 2
Subject | Correct Answers | Incorrect Answers | Skipped | Average Time
#1 | 4 | 6 | 0 | 1,560
#2 | 3 | 6 | 1 | 1,740
#3 | 0 | 6 | 4 | 540
#4 | 1 | 9 | 0 | 2,160
#5 | 2 | 2 | 6 | 1,080
#6 | 0 | 6 | 4 | 1,200
#7 | 2 | 8 | 0 | 1,080
#8 | 0 | 7 | 3 | 1,740
#9 | 2 | 4 | 4 | 1,380

Descriptive Statistics for both Experimental Groups
Min | 0 | 2 | 0 | 540
Max | 5 | 9 | 6 | 2,340
Mean | 2.39 | 6.39 | 1.22 | 1,520.00
Std. Dev. | 1.54 | 1.69 | 3.95 | 464.30

Average time is indicated in seconds.
adapted from MEEGA+ [18], which is a framework for evaluating
serious games tailored to computing education.
The participants were asked to answer the questionnaire immediately
after the experiment. We made it clear to the students
that questionnaire completion was optional and anonymous.
7.1 Attitudinal Survey Results
Eighteen participants completed the questionnaire. Most partici-
pants (thirteen participants, which represents roughly 72.2% of our
sample) are 23 to 28 years old. Also, thirteen participants (72.2%)
claimed that they play games at least once a month, out of which
seven (38.9%) claimed that they play games on a daily basis. Only
three participants (16.7%) claimed that they never play games. Regarding
the participants’ experience with Java or object oriented devel-
opment: eleven participants (61.1%) claimed having professional
experience with either Java or object oriented development, and
seven (38.9%) claimed having only academic experience.
Figure 4 highlights the questionnaire results regarding questions
Q5 and Q6, which ask participants about the difficulty of performing
code smell identification activities with and without the support of
CleanGame. The results would seem to suggest that the participants
found the activity more challenging to perform without CleanGame.
Figure 5 shows the answers we collected for questions Q7, Q8,
and Q9. From looking at the answers to Q7, we can see that most
participants avoided skipping questions, regardless of their dif-
ficulty level. As for Q8, only two participants (11.1%) affirmed that
Table 3: Questionnaire

ID | Question | Type of answer
Q1 | Student level | Single choice: (a) Computer Science undergraduate student; (b) Information Systems undergraduate student; (c) Computer Science graduate student.
Q2 | Age | Nominal scale: (1) 17–22 years old; (2) 23–28 years old; (3) 29–34 years old; (4) 34+ years old.
Q3 | How often do you play (digital or non-digital) games? | Nominal scale: (1) Never; (2) Rarely; (3) Monthly; (4) Weekly; (5) Daily.
Q4 | What is your experience with Java or object oriented development? | Nominal scale: (1) None; (2) Academic experience; (3) Beginner professional experience; (4) Advanced professional experience.
Q5 | How difficult was the execution of the code smell identification activity without CleanGame? | Nominal scale: (1) Very easy; (2) Easy; (3) Balanced; (4) Hard; (5) Very hard.
Q6 | How difficult was code smell identification with CleanGame? | Nominal scale: (1) Very easy; (2) Easy; (3) Balanced; (4) Hard; (5) Very hard.
Q7 | I skipped questions (i.e., code smell identification activities) because they were too hard. | Likert scale*
Q8 | I tried to answer all questions consciously (without guessing). | Likert scale*
Q9 | I tried to solve the challenges exhaustively before taking advantage of tips (provided by CleanGame). | Likert scale*
Q10 | [Challenge] CleanGame is adequately challenging without becoming boring. | Likert scale*
Q11 | [Satisfaction] Completing tasks in CleanGame gave me a feeling of achievement. | Likert scale*
Q12 | [Satisfaction] I would recommend this game to my peers. | Likert scale*
Q13 | [Social interaction] CleanGame promotes competition. | Likert scale*
Q14 | [Fun] There was an element in CleanGame that captured my attention. | Likert scale*
Q15 | [Focus] CleanGame kept me engaged during the execution of activities / I lost track of time / I forgot about my surroundings. | Likert scale*
Q16 | [Relevance] The contents of CleanGame are relevant to my interests, and it is clear how they relate to code smell identification skill acquisition. | Likert scale*
Q17 | [Relevance] I would like to use more tools similar to CleanGame throughout my academic formation. | Likert scale*
Q18 | [Relevance] I prefer to practice the concepts of code smells with CleanGame rather than with other educational methods. | Likert scale*
Q19 | [Learning perception] CleanGame contributed to my learning and was efficient in comparison to other activities. | Likert scale*
Q20 | [Learning perception] CleanGame (Identification Game) contributed to practicing the concepts of code smell identification. | Likert scale*
Q21 | [Learning perception] CleanGame (Quiz) contributed to remembering code smell identification concepts. | Likert scale*
Q22 | What were the positive aspects of CleanGame? | Open answer
Q23 | What were the negative aspects of CleanGame? | Open answer
Q24 | Do you have any additional comments regarding CleanGame? | Open answer

* Likert scale: (-2) Definitely disagree; (-1) Disagree; (0) Indifferent; (1) Agree; (2) Definitely agree.
they tried to answer all questions consciously. So we conjecture
that some participants tried to guess the correct answer to some
code smell identification tasks. Finally, the participants had mixed
opinions concerning Q9: our results would seem to suggest that
some participants tried to take advantage of the tips provided by
CleanGame before tackling the code smell identification task. On
the other hand, some participants only took advantage of tips after
exhaustively trying to grasp the code smell identification task at
hand.
Figure 6 shows the answers related to the participants’ experi-
ences using CleanGame. The items are grouped according to the
[Figure 4 shows a bar chart of the difficulty ratings, from 1 (very easy) to 5 (very hard), reported with and without CleanGame.]

Figure 4: Difficulty in performing code smell identification without and with CleanGame.
[Figure 5 shows a stacked bar chart of Likert responses, from "Definitely disagree" to "Definitely agree", for Q7, Q8, and Q9.]

Figure 5: Results for survey questions Q7, Q8, and Q9.
following factors: challenge, satisfaction, social interaction, fun, fo-
cus, relevance, and learning. For each item, the right most column
presents the median value of the participants’ responses, ranging
from “Definitely Disagree” (-2) to “Definitely Agree” (2). No factors
presented a negative median value. The factors “satisfaction” and
“focus” presented median values of 0, meaning that these aspects
were observed with indifference or mixed opinions by the partici-
pants. For all other factors, the median values were positive. The
items Q14 and Q19 had the most “Definitely agree” responses. These
items are related to the adequacy of the content of CleanGame to
the course, and the learning impact of the quiz on how the partici-
pants committed code smells related concepts to memory. Except
for the items Q11, Q12 and Q15, all other items received more than
50% of positive responses (“Agree” and “Definitely agree”). The
items with the highest count of positive responses are the follow-
ing: Q13 (i.e., likeliness of recommending CleanGame for other
students), Q16 (i.e., adequacy of CleanGame contents) and Q21 (i.e.,
how much CleanGame supports the memorization of code-smell-related
concepts), with 83.3% positive responses each. In contrast,
items Q15 (i.e., focus), Q11 (i.e., satisfaction), and Q12 (i.e., feeling of
achievement) received the highest number of negative responses,
with 44.4%, 38.9%, and 27.8% of "Disagree" or "Definitely disagree"
responses, respectively.
As for RQ3, positive and negative feedback from participants
was captured from the answers to Q22, Q23, and Q24. To gather and
synthesize such feedback, we employed an approach inspired by the
coding phase of grounded theory [24]. Two researchers analyzed the
responses individually and marked relevant segments with “codes”
(i.e., tagging with keywords). Afterwards, the researchers compared
their codes to reach consensus, and tried to group these codes into
Relevance
Focus
Fun
Social Interaction
Satisfaction
Challenge
Learning
Perception
1
0
0
1
1
0
1
1
1
1
2
1
Figure 6: Participants’ experience with CleanGame
relevant categories. Consequently, it is possible to count the number
of occurrences of codes and the number of items in each category
to understand what recurring positive and negative were reported
by the participants. Tables 4 and 5 list the positive and negative
aspects reported by the participants. The column “code” groups
recurring items observed in the responses. The column “category”
presents a broad category used to group the codes. The column
“occurrences” presents the number of times the each code appeared
in the responses.
Table 4: Positive aspects stated by the participants

Positive Aspect | Category | #
Ludic and interactive tool | Design / Usability | 7
Supports comprehension of code smells | Learning | 6
Competition | Gamification | 5
Ease of use | Design / Usability | 3
Tips | Gamification | 2
Dynamic leaderboards | Gamification | 2
Multiple choice questions | Question structure | 2
Adequate to different profiles of students | Learning | 1
Score system | Gamification | 1
Question and tip visualization | Design / Usability | 1
Motivating | Learning | 1
Interesting for online courses | Learning | 1
We found 32 occurrences of 12 distinct codes describing posi-
tive aspects. These codes are grouped into four categories: “Design
and Usability” (11 occurrences); “Gamification” (10 occurrences);
“Learning” (9 occurrences); and “Question structure” (2 occurrences).
The most recurring codes found were the following: “Ludic and
interactive tool” (7 occurrences); “Support comprehension on code
smells” (6 occurrences); and “Competition” (5 occurrences). These
results provide evidence of the positive attitude of students towards
the effects of CleanGame (and its gamification approach) when
applied for the acquisition of code smell identification skills.
We found 28 occurrences of 15 distinct codes representing negative
feedback. These codes are grouped into three categories: “Design
and usability” (11 occurrences); “Business rules” (10 occurrences);
and “Experiment design” (7 occurrences). Problems in the “Design
and usability” group are related to the graphical user interface, how
its visual elements are arranged, or the lack of a particular visual
element. “Business rules” are problems related to how things work
in the software. This is the most interesting feedback because
it affects the functional aspects of CleanGame. For instance, the
most recurring “Business rules” codes were related to showing the
correct answer as feedback after the user picks a wrong choice,
and the suggestion not to disclose the scores of all users, as doing
so may lead to embarrassment or undermine the motivation of
users with lower performance. Finally, the category “Experiment
design” groups codes related to complaints regarding how the
experiment was organized. For instance, two users complained about
the duration of the classes that took place before the experiment.
We observed that most of the negative aspects are actually
opportunities for improvement and do not jeopardize the learning process
using CleanGame.
Table 5: Negative aspects stated by the participants

Negative Aspect | Category | #
Interface problems | Design / Usability | 7
Should show the correct answer after failure | Business rules | 4
Disclosing scores of all participants | Business rules | 4
Code length | Experiment design | 2
Confusing scoring system | Business rules | 1
Rules for losing score for using tips | Business rules | 1
Should have other types of questions | Design / Usability | 1
Not being able to see tips already used | Business rules | 1
Experiment duration | Experiment design | 1
Form of displaying earned and lost points | Design / Usability | 1
Difference between the duration of the quiz and identification activities | Experiment design | 1
Poor experiment instructions | Experiment design | 1
Should have provided the correct answers by the end of the experiment | Experiment design | 1
Unknown metrics used | Experiment design | 1
Should show the quantity of questions | Design / Usability | 1
As for RQ2, the results of our attitudinal survey would seem
to suggest that most participants showed a positive attitude towards
CleanGame. The results of Q5 and Q6 indicate that the participants
found it less difficult to practice code smell identification
with CleanGame support. The results related to the participants'
experience with CleanGame show a positive perception regarding the
relevance, learning perception, and social interaction aspects of
the tool. However, there were some mixed opinions regarding focus
and satisfaction. Among the positive aspects described by the
participants, there were 10 mentions of gamification and 9 mentions of
positive effects on learning, out of 32 codes identified. Therefore, we
have positive findings about students' attitudes towards CleanGame,
especially regarding the gamification strategy used in the tool and
its effect on learning.
8 THREATS TO VALIDITY
As with any empirical study, this experiment has several threats
to validity. In this section we outline the main threats to four
types of validity that might jeopardize our experiment: (i) internal,
(ii) external, (iii) conclusion, and (iv) construct. Internal validity
has to do with the confidence that can be placed in the cause-effect
relationship between the treatments and the dependent variables in
the experiment. External validity is concerned with generalization:
whether the cause-effect relationship between the treatments and
the dependent variables can be generalized outside the scope of the
experiment. Conclusion validity is centered around the conclusions
that can be drawn from the relationship between treatment and
outcome. Finally, construct validity is about the relationship
between theory and observation: whether the treatments properly
reflect the cause and the outcomes suitably represent the effect.
8.1 Internal Validity
We mitigated the selection bias issue by using randomization. How-
ever, since we assumed that all subjects have similar background,
no blocking factor was applied to minimize the threat of possible
variations in the performance of the subjects. Therefore, we cannot
rule out the possibility that some variability in how the subjects
performed stems from their previous knowledge and experience.
Another possible threat to the internal validity has to do with the
files containing the code smells we used in our experiment: if we
had used other files, the results could have been different. Never-
theless, we tried to mitigate this threat by selecting files with code
smells that are representative of the experience level of undergraduate
and graduate students alike. Specifically, we selected code smells
from Landfill [17], which is a web-based platform for sharing and
validating code smell datasets. To the best of our knowledge, Land-
fill comprises the largest publicly available collection of manually
validated smells.
8.2 External Validity
The main threat to the external validity of our results is the sam-
ple: as mentioned, the experiment was conducted with students, so all
subjects have academic backgrounds. Thus, the insights gained
from our experiment can be generalized only to similar settings
(i.e., in the context of students with similar experience). We are
aware that further replication of our experiment is needed to es-
tablish more conclusive results: to increase external validity, we
need to replicate our study with a larger sample. Moreover, our
sample consisted solely of students. Therefore, we cannot be sure
whether CleanGame would also be able to engage practitioners
while helping them hone their code smell identification skills. We
cannot rule out the threat that the results could have been different
if practitioners were selected as subjects. So, it may be worthwhile
to replicate our experiment with a more diverse sample (including
practitioners) to corroborate our findings. It is also worth pointing
out that the subjects in our sample may have a higher affinity for
video games, and thus better attitudes toward a gamification-based
approach than the general population.
Additionally, regarding the generalization of our findings, we
are aware that we limited the scope of our experiment to only Java
programs. For future studies, we intend to replicate our experiment
using programs written in other programming languages. It is also
worth noting that we focused on code smells found in open-source
programs, hence we cannot speculate about how the results of our
experiment would be different when taking into account industrial-
scale software. However, we conjecture that using programs that
are overly complex might hinder learning.
8.3 Conclusion Validity
The approach we used to analyze the results of our experiment
represents the main threat to the conclusions we can draw from our
study: we discussed our results by presenting descriptive statistics
and a statistical hypothesis test. Given our small sample (eighteen
subjects), the statistical power of this analysis is limited.
8.4 Construct Validity
We cannot rule out the possibility that the measures we employed
in our experiment may not be appropriate to quantify the effects
we set out to investigate. For instance, the number of correct
answers may be neither the only nor the most important predictor
of post-training effectiveness and engagement.
used do not reflect the target theoretical construct, the results of
our experiment might be less reliable. One way to extend our study
is to examine which other measures might be relevant for a model
of post-training effectiveness in the context of code smell identifi-
cation. Moreover, given that the experiment took place as part of a
course, in which students are graded, students (i.e., subjects) may
bias their answers in hope of getting a better grade. To mitigate
this threat, subjects were assured that the experiment would not
have any effect on their grades.
9 CONCLUDING REMARKS
We empirically evaluated whether gamification can have a positive
impact on post-training reinforcement for code smell identification
skills and concepts. To the best of our knowledge, this is the first
study that applies gamification to engage students in code smell
identification tasks during post-training reinforcement. To evaluate
the effectiveness of gamification in this context, we used the average
rate of correct answers as a proxy. According to the results of
our experiment, on average, subjects managed to identify twice as
much code smells during learning reinforcement with a gamified
approach in comparison to the IDE-driven approach: the results of a
non-parametric test show that subjects perform significantly better
when using CleanGame than when using an IDE. We interpret these
findings as general support for our hypothesis that gamification can
be applied to engage students in activities that tend to be somewhat
tedious and complex, viz., code smell identification. Furthermore,
subjects were less apt to skip code identification tasks when using
CleanGame. We believe that this can be ascribed to the metrics-,
refactoring-, and definition-related tips provided by the tool.
The results of our post-experiment attitudinal survey suggest
that most participants showed a positive attitude towards CleanGame
(and its gamification strategy) as an educational support tool for
practicing code smell identification. The participants’ evaluation of
the tool revealed that, while “Focus” and “Satisfaction”
were the lowest rated aspects of the game (though not negatively
rated), the other aspects were rated positively, especially “Relevance”,
“Learning Perception”, and “Social Interaction”.
REFERENCES
[1] B. S. Akpolat and W. Slany. 2014. Enhancing software engineering student team engagement in a high-intensity Extreme Programming course using gamification. In 27th IEEE Conference on Software Engineering Education and Training (CSEE&T). 149–153.
[2] Manal M. Alhammad and Ana M. Moreno. 2018. Gamification in software engineering education: A systematic mapping. Journal of Systems and Software 141 (2018), 131–150.
[3] M. M. Alhammad and A. M. Moreno. 2018. Gamification in software engineering education: A systematic mapping. Journal of Systems and Software 141 (2018), 131–150.
[4] Wendy L. Bedwell, Davin Pavlas, Kyle Heyne, Elizabeth H. Lazzara, and Eduardo Salas. 2012. Toward a taxonomy linking game attributes to learning: An empirical study. Simulation & Gaming 43 (2012), 729–760.
[5] Jonathan Bell, Swapneel Sheth, and Gail Kaiser. 2011. Secret ninja testing with HALO software engineering. In Proceedings of the 4th International Workshop on Social Software Engineering (SSE). 43–47.
[6] Pierre Bourque and Richard E. Fairley. 2014. Guide to the Software Engineering Body of Knowledge (SWEBOK(R)): Version 3.0. IEEE Computer Society Press.
[7] Luis Cruz, Rui Abreu, and Jean-Noël Rouvignac. 2017. Leafactor: Improving energy efficiency of Android apps via automatic refactoring. In 4th International Conference on Mobile Software Engineering and Systems (MOBILESoft). 205–206.
[8] Sebastian Deterding, Miguel Sicart, Lennart Nacke, Kenton O'Hara, and Dan Dixon. 2011. Gamification: Using game-design elements in non-gaming contexts. In CHI '11 Extended Abstracts on Human Factors in Computing Systems. 2425–2428.
[9] Daniel J. Dubois and Giordano Tamburrelli. 2013. Understanding gamification mechanisms for software development. In Proceedings of the 2013 Joint Meeting on Foundations of Software Engineering (ESEC/FSE). 659–662.
[10] Leonard Elezi, Sara Sali, Serge Demeyer, Alessandro Murgia, and Javier Pérez. 2016. A game of refactoring: Studying the impact of gamification in software refactoring. In Scientific Workshop Proceedings of XP2016. 1–6.
[11] Martin Fowler. 2018. Refactoring: Improving the Design of Existing Code. Addison-Wesley Professional.
[12] G. Fraser. 2017. Gamification of software testing. In 12th IEEE/ACM International Workshop on Automation of Software Testing (AST). 2–7.
[13] Felix Garcia, Oscar Pedreira, Mario Piattini, Ana Cerdeira-Pena, and Miguel Penabad. 2017. A framework for gamification in software engineering. Journal of Systems and Software 132 (2017), 21–40.
[14] Eduardo Herranz, Ricardo Colomo-Palacios, and Antonio de Amescua Seco. 2015. Gamiware: A gamification platform for software process improvement. In 22nd Conference on Systems, Software and Services Process Improvement (SPI). 127–139.
[15] P. Kruchten, R. L. Nord, and I. Ozkaya. 2012. Technical debt: From metaphor to theory and practice. IEEE Software 29 (2012), 18–21.
[16] The Joint Task Force on Computing Curricula. 2015. Curriculum Guidelines for Undergraduate Degree Programs in Software Engineering. Technical Report. ACM and IEEE Computer Society.
[17] F. Palomba, D. Di Nucci, M. Tufano, G. Bavota, R. Oliveto, D. Poshyvanyk, and A. De Lucia. 2015. Landfill: An open dataset of code smells with public evaluation. In 12th IEEE/ACM Working Conference on Mining Software Repositories (MSR). 482–485.
[18] Giani Petri, Christiane Gresse von Wangenheim, and Adriano Ferreti Borgatto. 2017. A large-scale evaluation of a model for the evaluation of games for teaching software engineering. In 39th IEEE/ACM International Conference on Software Engineering: Software Engineering Education and Training Track (ICSE-SEET). 180–189.
[19] Jose Miguel Rojas and Gordon Fraser. 2016. Code Defenders: A mutation testing game. In 9th IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW). 162–167.
[20] Tommaso Dal Sasso, Andrea Mocci, Michele Lanza, and Ebrisa Mastrodicasa. 2017. How to gamify software engineering. In 24th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). 261–271.
[21] Tushar Sharma and Diomidis Spinellis. 2018. A survey on software smells. Journal of Systems and Software 138 (2018), 158–173.
[22] Mauricio Ronny Almeida Souza, Lucas Veado, Renata Teles Moreira, Eduardo Figueiredo, and Heitor Costa. 2018. A systematic mapping study on game-related methods for software engineering education. Information and Software Technology 95 (2018), 201–218.
[23] D. Spinellis. 2012. Refactoring on the cheap. IEEE Software 29 (2012), 96–95.
[24] Klaas-Jan Stol, Paul Ralph, and Brian Fitzgerald. 2016. Grounded theory in software engineering research: A critical review and guidelines. In 38th IEEE/ACM International Conference on Software Engineering (ICSE). 120–131.
[25] C. Wohlin, P. Runeson, M. Höst, M. C. Ohlsson, B. Regnell, and A. Wesslén. 2012. Experimentation in Software Engineering. Springer.