ArticlePDF Available

Investigating the robustness of locators in template-based Web application testing using a GUI change classification model

Authors:
The Journal of Systems and Software 210 (2024) 111932
Available online 18 December 2023
0164-1212/© 2023 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-
nc-nd/4.0/).
Contents lists available at ScienceDirect
The Journal of Systems & Software
journal homepage: www.elsevier.com/locate/jss
Investigating the robustness of locators in template-based Web application
testing using a GUI change classification model
Marco De Luca,Anna Rita Fasolino,Porfirio Tramontana
Department of Electrical Engineering and Information Technology (DIETI), University Federico II of Naples, Via Claudio, 21, Naples, 80125, Italy
ARTICLE INFO
Keywords:
Capture & Replay
GUI-based testing
Regression testing
Web application testing
E2E testing
Locators robustness
GUI change classification
ABSTRACT
GUI-based test-cases generated by Capture and Replay tools suffer from the well-known fragility problem:
they may break even if small layout changes are operated in a Web application, without modifying the app
functionality. An approach based on the automatic injection of HTML tag attributes named hooks in the
source code of Web templates has been recently proposed to solve this problem. Such hooks allow the unique
identification of GUI items to be located during test case execution. This technique showed its effectiveness
in a preliminary validation study, where it allowed to significantly reduce the number of test case locator
breakages in regression testing of student-made Web applications. This paper presents a further validation
study where we compared the robustness of hook-based test cases against state-of-the-art and state-of-the-
practice techniques for locating GUI objects. We proposed a three dimensional model for classifying different
types of layout changes and used it to define a benchmark of realistic changes. Thanks to the model, we
systematically compared the robustness of test cases generated by different techniques with respect to specific
types of changes and studied the relationship between fragility issues and types of changes in different test
case generation techniques.
1. Introduction
According to Banerjee et al. (2013), GUI testing is system testing of
a software that has a graphical-user interface (GUI) front-end. Because
system testing entails that the entire software system, including the
user interface, be tested as a whole, during GUI testing test cases
are developed and executed on the software by exercising the GUI’s
widgets (e.g., text boxes and clickable buttons). Capture and Re-
play (C&R) techniques have been widely used in industry since thirty
years to perform GUI testing without requiring advanced testing/pro-
gramming skills (Hammontree et al.,1992). A tester can exploit a
C&R tool to automatically generate GUI-based test scripts starting
from real sequences of user interactions on the GUI of the appli-
cation under test, including mouse clicks, keyboard entries, naviga-
tion commands, etc. These sequences are recorded by the tool and
automatically translated into executable test scripts. C&R techniques
showed their usability in exploratory testing (Kaner,2008) and the
capability to achieve good effectiveness results in mobile application
testing (Di Martino et al.,2020). They are very popular especially in
regression testing activities, where the captured test cases can be used
to re-test a modified application. Capture and Replay (C&R) tools are
also commonly used in End-to-End (E2E) testing of Web applications,
Editor: Prof. Raffaela Mirandola.
Corresponding author.
E-mail addresses: marco.deluca2@unina.it (M. De Luca), fasolino@unina.it (A.R. Fasolino), ptramont@unina.it (P. Tramontana).
where the application is tested as a whole from the perspective of the
end-user (Ricca et al.,2019) for different types of testing, including
robustness, responsiveness, accessibility, compatibility, etc.
However, in spite of their notable advantages, C&R techniques
present some known limitations. One of them consists in the incom-
pleteness of the generated test cases, because not all the application
behaviors can be solicited by GUI based test cases (Rafi et al.,2012).
Generated test cases may also be affected by fragility issues, since
they may fail (and cease to be applicable) even if small changes are
operated in the GUI, without modifications of the functionality of the
app (Coppola et al.,2019). Such test case failures are usually called
‘‘test breakages’’ (Hammoudi et al.,2016) in order to distinguish them
from application failures that occur when tests operate correctly and
reveal faults in the system under test. For example, we may consider
a layout change consisting in moving an HTML tag in another point
of the HTML page. This change may break a test case that located that
tag by an XPath expression referencing the previous tag position within
the HTML tree. Analogously, any change of a HTML tag value attribute
(e.g. a change of the class attribute) may break any test case based on
a XPath involving that attribute value.
https://doi.org/10.1016/j.jss.2023.111932
Received 15 March 2023; Received in revised form 1 October 2023; Accepted 8 December 2023
The Journal of Systems & Software 210 (2024) 111932
2
M. De Luca et al.
Test breakages are a cause of inefficiency in regression testing
processes, since extra time is needed for repairing the broken tests or
substituting them with up-dated ones.
The problem of Web test fragility is well-known in the literature,
where several techniques have been proposed either to generate more
robust test cases (Yandrapally et al.,2014;Leotta et al.,2014;Pirzadeh
and Shanian,2014;Leotta et al.,2016;Kirinuki et al.,2019) or for
repairing broken tests (Choudhary et al.,2011;Hammoudi et al.,
2016;Leotta et al.,2015). A third type of approaches described in the
literature is intended to improve the testability of the Web applications
at the origin (Bajaj et al.,2015b,a) by Web page design or refactoring
techniques that enable the definition of more robust locators and test
cases.
In our previous work (Fasolino and Tramontana,2022), we pro-
posed an approach of the third type that targets the category of
template-based Web applications and exploits the concept of hook-
based locators. Template-based applications use Web templates for
implementing the well-known Model-View separation principle for
designing GUI based applications (Parr,2004). Each template statically
includes the source code of the target Web pages, i.e. the Views that
will be dynamically generated at run-time. The technique that we
proposed to improve the robustness of test cases is based on ‘‘Hook-
based’’ locators that exploit an HTML tag attribute, the so-called hook,
to allow the unique identification of each distinct tag of a Web page.
The point of novelty of our approach is that it artificially introduces
such hook attributes in the Web page templates that are the origin of
each rendered web page. In this way, when a new page is generated, its
tags will already include the hook attributes. Hook attributes will allow
the definition of a new type of DOM-based locators, which are likely to
be more resilient to Web page changes involving layout properties.
In Fasolino and Tramontana (2022) we performed a validation
experiment that preliminarily showed the ability of hook-based locators
to reduce the number of test case breakages and the effort of the regres-
sion testing activities. The study involved three small Web applications
which were iteratively developed by university students of an Advanced
Software Engineering course. The regression test cases developed using
the proposed approach turned out more robust than test cases based
on the default locators generated by the state-of-the-practice Katalon
Recorder tool.1We observed indeed that 16.6% of test cases broke by
fragility issues when using Katalon Recorder locators, whereas none of
them broke when using Hook-based locators.
These experimental results, though encouraging, were limited to the
context of Web applications developed ad hoc by students, which may
not be representative of the user interface complexity of real template-
based Web applications. Another limitation of the study was that the
set of layout changes implemented by the students were limited in
number, since only a few releases (less than 10) were developed and
only some of them included layout changes. Finally, the study only
compared hook-based locators against the ones provided by the Katalon
Recorder tool, whereas further locators among the ones described in the
literature are worth considering.
In this paper we present a new study we performed to further
validate our technique. The aims of this study were (1) to evaluate
the effectiveness of our approach with respect to GUI layout changes
which may occur in real web applications and (2) to compare it against
locators already presented in the literature and used in the practice.
To carry out this study we defined a novel classification model of
GUI changes that may affect the layout of a Web application. The model
distinguishes changes on the basis of the type of GUI item or property
involved in the change (i.e. HTML tag, tag attribute, tag attribute value,
text included between tags, template tag), the type of change (i.e. GUI
item insertion or removal, property modification, position change) and
the relative position of the changed GUI item with respect to the
1Katalon Recorder, https://katalon.com/katalon-recorder- ide/.
one directly involved in the test case. Then, we considered two open-
source applications obtained from GitHub repositories and injected a
benchmark of 192 exemplar changes of 55 different categories in their
code, to modify the layout of these Web applications. The benchmark
of changes was defined according to the proposed GUI change model.
Eventually, we systematically compared the robustness of existing types
of locators with respect to the different implemented changes. This
study allowed us to investigate the robustness of existing locators and
showed their points of strength and weakness with respect to typical
Web layout changes.
While this paper shares with the previous publication (Fasolino
and Tramontana,2022) the proposed hook-based locator technique, it
provides the following three additional research contributions. First,
it proposes a novel classification model of Web application layout
changes that can be used to systematically generate a benchmark of
changes for experimental studies. Second, it presents a validation study
that systematically compares the robustness of the hook-based test
cases proposed in Fasolino and Tramontana (2022) against test cases
based on state-of-the-art and state-of-the-practice locators. Third, we
make available on a GitHub repository all the experimental materials
we developed to carry out this study, for replication aims.
The remainder of the paper is structured as follows. Section 2pro-
vides background information about template-based web applications
and different types of locator proposed in the literature and used in
practice. Section 3presents the technique proposed for the generation
of hook-based locators in template-based Web Applications. In Sec-
tion 4we present the proposed GUI change classification model, while
in Section 5we illustrate the validation study we performed. In Sec-
tion 6related work is presented while Section 7discusses conclusions
and future works.
2. Background
In this Section we report background information about Web apps
based on templates and about the types of locators implemented in
existing C&R tools or described in the literature. Moreover we discuss
the problem of Web apps test case fragility.
2.1. Web apps based on web templates
Unlike the technologies for the server-side development of web
applications used in the past (such as Java Servlet, or JSP, etc.) (Parr,
2004), most modern solutions are today inspired by the principle of
separation between business logic, presentation logic and logic of data,
which allows to obtain applications that are more easily developed
and maintainable. Currently, many web development frameworks are
based on the well-known MVC pattern (or its variants, such as MVVM)
that was created precisely to implement this principle. Many modern
frameworks also adopt the technology of Web templates and tem-
plate engines (Parr,2004) that allow to manage the presentation logic
separately from data and business logic.
A template indeed focuses on how to present the data, and other
components can focus on what data to present. Template files consist
of prewritten markup and template tag blocks where data are inserted.
They are created by developers and then processed by the template
engines that take in tokenized strings and produce rendered strings with
values in place of the tokens as output.
In the following, we report some details about an example Web
application developed using templates. The application has been built
using the Angular framework,2which is an MVC development frame-
work using a JavaScript template engine for generating the Views to
be rendered in the client browser. The application is very simple and
2Angular, https://angular.io/.
The Journal of Systems & Software 210 (2024) 111932
3
M. De Luca et al.
Fig. 1. Screenshot of the example Web application showing the ‘Make Coffee’ task in ‘To Do’ status.
Fig. 2. Screenshot of the example Web application showing the ‘Make Coffee’ task in ‘Done’ status.
it allows to manage user tasks by providing features to show the task
list, to assign them to different users, and to manage the task status.
Fig. 1 shows the app View reporting the ‘Make Coffee’ task assigned
to ‘John Doe’ that is in the ‘ToDo’ status. Fig. 2 represents the View
that is obtained after that the user clicks on the ‘Mark as Done’ action
button: in this latter View the status of the ‘Make Coffee’ task is ‘Done’
and its text is struck through. Listings 1and 2respectively show
an excerpt of the HTML source code of the starting page of the Web
application and of the HTML template including the task table.
Listing 1: Code excerpt of the Starting HTML Page of the Example
Web Application
1<html>
2...
3<body ng -a pp = " my App " >
4<div cl as s= " c ont ai ner " >
5<ui-v ie w ></ui - vi ew >
6</ div >
7...
8</ body>
9</ html>
Listing 2: Code excerpt of a Template of the Example Web
Application
1<ui-v ie w >
2...
3<table>
4...
5<tb od y >
6<tr n g- re pea t= " ta ski nc trl . tas ks " >
7...
8<td >
9<sp an c la ss = " l abe l -d efa ul t " ng -i f= " !
ta sk .do ne " > To D o</ span>
10 <sp an c la ss = " l abe l -s ucc es s " ng -i f= " ! !
task.done " > Done</span >
11 </ td >
12 <td >
13 <div cl as s= " for m -gr oup " >
14 <bu tt on type = " b utt on " ng- if = " !! ta sk.
do ne " ng -c lic k= " task. do ne= fa lse " >
&# x27 0E ; Mark as To Do </ b ut ton >
15 <bu tt on type = " b utt on " ng- if = " !t ask .
do ne " ng -c lic k= " task. do ne= tr ue " >&#
x2 71 4 ; Mark as Done </button>
16 <bu tt on type = " b utt on " ng- if = " !t ask .
do ne " u i - sr ef = " .e di t( { ta s kI nd ex :
$i nde x} ) " >&# x2 70E ; E dit </ bu tt on >
17 <bu tt on type = " bu tto n ">& #x2718 ; Rem ove <
/button>
18 </ div >
19 </ td >
20 </ tr >
21 </ tbo dy >
22 </ tab le >
23 </ ui - view>
Observe that line 5 of Listing 1acts as a reference pointing to
the template definition reported from the line 1 of Listing 2. Listing
2contains the template of the HTML of the table shown in Figs. 1 and
2. This code includes basic HTML tags (such as td, div, button) and
Angular keywords (given by tag attributes starting with the ng prefix)
that are responsible for the different possible appearances of the table.
For example, the two action buttons ‘Mark as Done’ and ‘Mark as To Do’
can be shown alternatively, on the basis of the values of the boolean
task.done variable.
2.2. Locators in C&R tools
C&R tools record the user interactions with the GUI of the ap-
plication under test and translate them into executable test scripts.
Afterwards, such interactions can be replayed in order to replicate the
recorded execution. In the case of Web applications, such interactions
are captured and replayed through a Web browser.
The fundamental problem faced by C&R tools is the generation of lo-
cators able to uniquely identify the items of the Web page on which the
interactions have to be replayed. On the basis of the solution adopted to
solve this problem (Ardito et al.,2019), these tools can be classified in
‘‘Coordinate-based’’, which identify GUI items by registering the exact
screen coordinates at which the interactions have to be performed,
‘‘Visual tools’’ that identify the GUI items through image matching
algorithms like in Sikuli (Yeh et al.,2009), or ‘‘Layout-based’’ ones that
implement locators that directly point to specific items of an HTML
document.
The Journal of Systems & Software 210 (2024) 111932
4
M. De Luca et al.
In Layout-based C&R tools, the most common techniques to identify
Web elements in a page are DOM-based (Chapman and Evans,2011)
and locators are usually expressed as XPath expressions (Leotta et al.,
2014). XPath expressions allow pointing to different parts of an HTML
document, providing a flexible way of navigating through the DOM of
the document. They are usually preferred to other technologies for their
high expressiveness and for their general applicability (Leotta et al.,
2016).
The most popular layout based C&R tools for Web application test-
ing are those offered by the Selenium3and Katalon4projects. Katalon
Recorder is the free C&R tool offered by the Katalon Project, which is
built on top of the Selenium open source automation framework. It is
able to propose different locators for identifying the involved widgets.
An interesting characteristic is that it can also be freely extended by
customized locator generation techniques. In this way, the tester can
choose the locator he prefers among a range of possible alternatives.
In the following we present the characteristics of different types
of locator expressed in XPath: we consider Absolute and Relative
locators, locators generated by ROBULA, one of the state of the art
techniques (Leotta et al.,2014), and some of the ones generated by
Katalon Recorder and Selenium, two of the state-of-the-practice tools.
We will illustrate these locators with reference to a pair of common
examples. We will show the different types of locators pointing to (1)
the ‘‘Mark as Done’’ action button of the ‘‘Make Coffee’’ task assigned
to ‘‘John Doe’’ in the Web page reported in Fig. 2, and (2) to the
‘‘Done’’ label shown in the row corresponding to the ‘‘Make Coffee’’ task
reported in Fig. 2.
2.2.1. Absolute locators
An absolute locator can be represented by a XPath expression
including references to all the items traversed by a path from the root
(i.e., the HTML root tag) to the target element to be located.
When any traversed node presents siblings in the HTML tree, index
values can be used to select the correct node among the siblings. The
absolute locators for the ‘‘Mark as Done’’ button and for the ‘‘Done’’
label of the previous example are reported in Listing 3.
Listing 3: Examples of Absolute locators
1/html / bo dy / di v /ui-view /div/ ui - vie w/ui-v ie w/
ta bl e / tb od y /t r [2 ]/ t d [4 ]/ d iv / bu t to n [1 ]
2/html / bo dy / di v /ui-view /div/ ui - vie w/ui-v ie w/
ta bl e / tb od y /t r [2 ]/ t d [3 ]/ s pa n [1 ]
As it can be seen, absolute locators are very detailed and are able to
uniquely point to a specific tag of HTML pages. Of course, any change of
the items referenced by the XPath expression may break these locators.
2.2.2. Relative locators
A relative locator specifies a path that does not start from the root
node but includes references to the target tag, its parent, or its ancestors
and/or attributes or text of these tags that allow to uniquely identify
the target element.
Two possible relative locators for the same ‘‘Mark as Done’’ button
and for the ‘‘Done’’ label of the previous Figures are shown in Listing
4. The former locator contains references to the div parent tag of the
button to be located and to some of its ancestors, whereas the latter
one includes only a reference to the exact value of the text contained
in the label.
Listing 4: Examples of Relative locators
1// div [@ cla ss = 'container '] / ui -v iew / div / ui -
vi ew / ui - vi ew / t ab le / t bo dy / tr [ 2] / td [ 4] / di v
/ bu tt on [ 1]
2// sp an [n orm al ize - sp ac e( ) = 'Done ']
3Selenium, https://www.selenium.dev/.
4Katalon Recorder, https://www.katalon.com/katalon-recorder- ide/.
A problem with relative locators is that they may become am-
biguous when additional Web elements satisfying the same XPath
expression are inserted into the Web page.
2.2.3. ROBULA locators
ROBULA (Leotta et al.,2014) is a technique to build locators
by following a top-down approach that starts from the most general
locator (i.e., ‘‘//*’’, matching all the elements in the document) and
specializes it via transformation steps. The transformation steps start
by considering the target tag and adding information about the tag
type and attributes into the locator. If a unique locator is obtained,
then it is returned by the algorithm otherwise the algorithm repeats
the transformations for the parent tag. In the worst case, the algorithm
returns an absolute locator. The ROBULA algorithm can be considered
as an optimal trade-off between absolute and relative locator.
ROBULA expressions for the ‘‘Mark as Done’’ button and for the
‘‘Done’’ label are the ones in Listing 5.
Of course, they may break if some of the referenced tag changes and
also if the position of the items in the table changes.
Listing 5: Examples of ROBULA locators
1// tr [ 2] / td [ 4] / di v / bu tt on [ 1]
2// tr [ 2] / td [ 3] / sp an [ 1]
2.2.4. Katalon recorder locators
Katalon Recorder provides several locator generation techniques.
The locators provided by Katalon Recorder try to pursue the same
objectives as those of ROBULA: they are able to uniquely identify the
target element with an expression containing a very limited number of
references. In order to obtain this objective they may include references
to the target attribute, to its position, to the text it includes and
references to the neighbor tags.
Katalon Recorder expressions suggested for the ‘‘Mark as Done’’
button and for the ‘‘Done’’ label are reported in Listing 6.
Listing 6: Examples of Katalon Recorder locators
1xp at h =( ./ /* [ no rm al iz e - sp ac e (t ex t () ) a nd
no rm al iz e - s pa ce ( .) = 'T o Do '] ) [ 2]/
fo ll o wi ng : : bu tt on [1 ]}
2xp at h =/ / */ t ex t( ) [n o rm al iz e - sp ac e (. ) = 'Done '] /
pa re nt : :*
The first expression locates the button by referring to the preceding
item including the ‘‘To Do’’ text, whereas the second one individuates
the tag including the ‘‘Done’’ text avoiding to specify the tag type.
The first locator appears fragile to changes of the relative positions
of the elements into the page, whereas the second one is not fragile
with respect to HTML elements but it can break if the text ‘‘Done’’ is
translated or if it is added in another part of the page.
2.2.5. Selenium locators
Selenium is one of the most commonly used tools for recording
GUI-based test cases via C&R. Similarly to Katalon Recorder, it offers
a large set of possible locators for each considered GUI item by using
different techniques. In particular, Selenium provides a set of locators
consisting in XPath expressions that use tag, attributes, attribute values,
included text and other properties to uniquely locate a GUI item. Of
course, not all locators are available for each GUI item (e.g. XPath
locators including the href attribute can be only applied on anchors).
In addition, Selenium provides a set of locators based on CSS style
properties (i.e. on the values of the class attributes).
Selenium expressions suggested for the ‘‘Mark as Done’’ button and
for the ‘‘Done’’ label are reported in Listing 7. The first expression
locates the button by referring to the position of the button in the
Web page, whereas the second one is based on the existence of the text
‘‘Done’’ in a span tag.
The Journal of Systems & Software 210 (2024) 111932
5
M. De Luca et al.
Listing 7: Examples of Selenium locators
1xp at h =( // b ut to n [ @t yp e = 'button '] ) [ 2]
2xp at h =/ / sp an [ c on ta i ns ( . , 'Done ') ]
The first locator appears fragile, for example, to changes in the order
of the buttons, whereas the second one is fragile with respect to changes
of the container tag (e.g. if span is substituted by div) or in text changes
(i.e. abbreviations or translations).
Examples of the CSS-based locators for the same two GUI items
are reported in Listing 8. Of course, the potential robustness of these
locators is based on the internal design of the Web application: if each
GUI item has its unique style that is never changed, these locators may
not break.
Listing 8: Examples of Selenium locators based on CSS
1cs s =. btn - s uc ce ss
2css=. label
3. Hook-based locators and the technique for generating them
AHook can be defined as an artificial attribute inserted into a HTML
tag. A Hook-based locator is an XPath query including references to hook
attributes. Hook-based locators have been proposed in Fasolino and
Tramontana (2022) as an alternative category of locators, applicable
to Web applications developed using the template technology. These
locators require that a hook attribute, characterized by the property
that its name is uniquely defined in the context of the Web application,
is associated to each tag of each template and each static HTML page
of a Web application. Hook-based locators are thus a particular type
of XPath queries including references just to the hooks of the widgets
to be pointed out and possibly to selected hooks of templates and
containers including it. Hook attributes can be injected in the HTML
tags of the Web pages by means of an automated technique, named
Hook Injection, and the Hook-based locators can be generated by a
Hook-based locator generation technique that will both be presented in
the following subsections.
3.1. Hook injection technique
This technique aims at injecting two different types of hook in the
tags belonging to each template and each static HTML page of a Web
application. The former type of hook is associated with the root tag
of each of them. In this case, the hook is defined by a constant string
made by the prefix ‘‘x-test-tpl-’’ and a variable suffix made by a positive
integer number. The suffix is uniquely defined for each root tag of the
web application components using a consecutive integer numbering.
The latter type is associated with each template HTML tag and is
defined by a constant string made by the prefix ‘‘x-test-hook-’’ and a
variable suffix made by a positive integer number. The suffix part is
also uniquely defined for each tag of the web app components using
a consecutive integer numbering. The hook attributes do not affect
the normal behavior of the Web application and do not conflict with
possible HTML id or name attributes already assigned to the tags.
Listing 9shows an excerpt of the modified code of the starting page
reported in Listing 1that is obtained after the template hook injection.
Analogously, Listing 10 shows the excerpt of the refactored code of the
example template reported in Listing 2, where hooks have been injected
for each HTML tag and in the template definition tag (at line 1).
Listing 9: Code Excerpt of the Starting HTML Page Generated after
Hook Injection
1<html x -t est - tpl - 1 >
2...
3<body ng - app = " m yA pp " x -t est - ho ok -7 >
4<div cl as s= " c on ta in er " x - tes t -h ook -8 >
5<u i- v iew x -t est - ho ok - 9> </ ui - vie w >
6</ div >
7...
8</ body>
9</ html>
Listing 10: Code Excerpt of the Refactored Example Template
1<u i- v iew x -t est - tp l -6 2>
2...
3<ta bl e x -t est - ho ok - 65 >
4...
5<tb od y x -t est - ho ok - 72 >
6<tr n g- r epe at = " t as ki nc tr l. t as ks " x - tes t -
ho ok -7 3>
7...
8<td x - tes t -h ook - 76 >
9<sp an c la ss = " l abe l -d efa ul t " ng -i f= " !
ta sk . don e " x - tes t -h ook -7 7 >T o Do < /
span>
10 <sp an c la ss = " l abe l -s ucc es s " ng -i f= " ! !
ta sk . don e " x - tes t -h ook -7 8 >D on e< /
span>
11 </ td >
12 <td x - tes t -h ook - 79 >
13 <div cl as s= " f orm - g ro up " x - tes t -h oo k -80 >
14 <bu tt on type = " b utt on " ng -i f= " ! !t ask .
do ne " ng -c lic k= " task. do ne= fa lse " x
- tes t -h ook - 82 > &# x2 70 E; Ma rk a s To
Do </b utton >
15 <bu tt on type = " b utt on " ng -i f= " ! ta sk .
do ne " ng -c lic k= " task. do ne= tr ue " x-
test - ho ok -8 1>& #x2 71 4; Mark as Done
</ but to n >
16 <bu tt on type = " b utt on " ng -i f= " ! ta sk .
do ne " u i -s re f = " . ed it ( { ta sk In d ex :
$i nd ex }) " x - tes t -h ook -8 3 >& #x 27 0E ;
Ed it < /bu tt on >
17 <bu tt on type = " bu tto n " x -te st -ho ok -84 >
&# x27 18 ; Rem ov e </ b ut to n >
18 </ div >
19 </ td >
20 </ tr >
21 </ tbo dy >
22 </ tab le >
23 </ ui - view>
The injection technique is applicable not only to templates not pre-
senting previously injected hooks, but also to the ones already including
hooks and needing to be refactored after a maintenance intervention.
In the latter case, the technique is able to preserve existing hooks. The
HookInjection procedure is described in Algorithm 1. The first loop in
the upper part of the procedure searches for the maximum integer suffix
number (𝑚𝑎𝑥) used by the already existing template and tag hooks.
Thus, in the latter lines of the procedure the 𝑛𝑒𝑤𝑇 𝑒𝑚𝑝𝑙𝑎𝑡𝑒𝐻 𝑜𝑜𝑘(𝑚𝑎𝑥)
function is used to create a new hook for each tag not already associated
with its hook. This function will use increasing values in the suffix part
of the hooks. Analogously, the 𝑛𝑒𝑤𝑇 𝑎𝑔𝐻 𝑜𝑜𝑘(𝑚𝑎𝑥)function is used to
create a new hook for each generic tag not already associated with a
hook. This function will also use increasing values in the suffix part of
the hooks.
3.2. Hook-based locator generation
To locate the widgets of a given Web page, the HL (hook-based)
locators will consist of XPath expressions including references to the
hook of the target widget and possibly to selected hooks of templates
and containers including it. The reason for also including template and
container hooks in the XPath query is for guaranteeing the unique
The Journal of Systems & Software 210 (2024) 111932
6
M. De Luca et al.
ALGORITHM 1: Hook Injection Algorithm
1Procedure HookInjection(in : webApplication)
2𝑚𝑎𝑥 0;
3foreach 𝑡𝑤𝑒𝑏𝐴𝑝𝑝𝑙𝑖𝑐𝑎𝑡𝑖𝑜𝑛.𝑡𝑒𝑚𝑝𝑙𝑎𝑡𝑒𝑠 do
4if 𝑒𝑥𝑖𝑠𝑡𝑠(𝑡.ℎ𝑜𝑜𝑘)then
5if 𝑡.ℎ𝑜𝑜𝑘.𝑛𝑢𝑚𝑏𝑒𝑟 > 𝑚𝑎𝑥 then
6𝑚𝑎𝑥 𝑡.ℎ𝑜𝑜𝑘.𝑛𝑢𝑚𝑏𝑒𝑟 ;
7end
8end
9foreach 𝑡𝑎𝑔 𝑡.𝑡𝑎𝑔𝑠 do
10 if 𝑒𝑥𝑖𝑠𝑡𝑠(𝑡𝑎𝑔.ℎ𝑜𝑜𝑘)then
11 if 𝑡𝑎𝑔.ℎ𝑜𝑜𝑘.𝑛𝑢𝑚𝑏𝑒𝑟 > 𝑚𝑎𝑥 then
12 𝑚𝑎𝑥 𝑡𝑎𝑔.ℎ𝑜𝑜𝑘.𝑛𝑢𝑚𝑏𝑒𝑟 ;
13 end
14 end
15 end
16 end
17 foreach 𝑡𝑤𝑒𝑏𝐴𝑝𝑝𝑙𝑖𝑐𝑎𝑡𝑖𝑜𝑛.𝑡𝑒𝑚𝑝𝑙𝑎𝑡𝑒𝑠 do
18 if !𝑒𝑥𝑖𝑠𝑡𝑠(𝑡.ℎ𝑜𝑜𝑘)then
19 𝑚𝑎𝑥 𝑚𝑎𝑥 + 1 ;
20 𝑡.ℎ𝑜𝑜𝑘 𝑛𝑒𝑤𝑇 𝑒𝑚𝑝𝑙𝑎𝑡𝑒𝐻 𝑜𝑜𝑘(𝑚𝑎𝑥);
21 end
22 foreach 𝑡𝑎𝑔 𝑡.𝑡𝑎𝑔𝑠 do
23 if !𝑒𝑥𝑖𝑠𝑡𝑠(𝑡𝑎𝑔.ℎ𝑜𝑜𝑘)then
24 𝑚𝑎𝑥 𝑚𝑎𝑥 + 1 ;
25 𝑡𝑎𝑔.ℎ𝑜𝑜𝑘 𝑛𝑒𝑤𝑇 𝑎𝑔𝐻 𝑜𝑜𝑘(𝑚𝑎𝑥);
26 end
27 end
28 end
identification of the widget to be located in all those cases in which the
template engine dynamically includes multiple instances of the same
template in a given page, or multiple instances of the same widget in a
given container. For example, the ng-repeat attribute shown at line 6 of
Listing 10 will produce several instances of the tr tag having the same
hook in the generated pages.
The technique for generating Hook-based locators traverses the
page the widget belongs to, in order to find the hooks that will be
included in the XPath query. Heuristic rules are used to select the hooks
to be included. Algorithm 2shows the Hook-based Locator Generation
strategy.
ALGORITHM 2: Hook-based Locator Generation Strategy
1Procedure LocatorGeneration(in: e, out:pathLocator)
2𝑝𝑎𝑡ℎ𝐿𝑜𝑐𝑎𝑡𝑜𝑟 "" ;
3while !𝑒.𝑖𝑠𝑅𝑜𝑜𝑡() do
4if 𝑝𝑎𝑡ℎ𝐿𝑜𝑐𝑎𝑡𝑜𝑟 ==
""||𝑒.ℎ𝑎𝑠𝑆𝑖𝑏𝑙𝑖𝑛𝑔 𝑠()||𝑒.𝑖𝑠𝑇 𝑒𝑚𝑝𝑙𝑎𝑡𝑒()||𝑒.𝑖𝑛𝑐𝑙𝑢𝑑 𝑒𝑠𝑇 𝑒𝑚𝑝𝑙𝑎𝑡𝑒𝑠() then
5if 𝑒.ℎ𝑎𝑠𝑆𝑖𝑏𝑙𝑖𝑛𝑔 𝑠() then
6𝑝𝑎𝑡ℎ𝐸𝑙𝑒𝑚𝑒𝑛𝑡
𝑖𝑛𝑑𝑒𝑥𝑒𝑑 𝐸𝑙𝑒𝑚𝑒𝑛𝑡𝐿𝑜𝑐𝑎𝑡𝑜𝑟(𝑒.ℎ𝑜𝑜𝑘, 𝑒.𝑓 𝑖𝑛𝑑𝐼 𝑛𝑑𝑒𝑥());
7end
8else
9𝑝𝑎𝑡ℎ𝐸𝑙𝑒𝑚𝑒𝑛𝑡 𝑒𝑙𝑒𝑚𝑒𝑛𝑡𝐿𝑜𝑐 𝑎𝑡𝑜𝑟(𝑒.ℎ𝑜𝑜𝑘);
10 end
11 𝑝𝑎𝑡ℎ𝐿𝑜𝑐𝑎𝑡𝑜𝑟 𝑝𝑎𝑡ℎ𝐸 𝑙𝑒𝑚𝑒𝑛𝑡 +𝑝𝑎𝑡ℎ𝐿𝑜𝑐𝑎𝑡𝑜𝑟 ;
12 end
13 𝑒𝑒.𝑠𝑢𝑝𝑒𝑟 ;
14 end
The algorithm takes as input the HTML element efor which the
locator has to be generated and returns as output the pathLocator XPath
expression. It iteratively builds the pathLocator string by backward
navigating the hierarchy of the input element efrom its tag to the root
of the page the element ebelongs to. The string is built from right to
left, starting from the hook associated with the input element e. During
the navigation, the hooks of encountered elements that are considered
relevant for the generation of the locator are added to the XPath query.
Relevant hooks are defined on the basis of heuristic rules.
At the first iteration, the pathLocator is empty and the hook asso-
ciated with the element to be located will be added to the pathLo-
cator. At each successive iteration, the algorithm evaluates if one of
the following three conditions is satisfied by the currently analyzed
element:
If the element has some siblings having the same hook (by the
hasSiblings function), the findIndex function evaluates the index
number of that element. Therefore, the indexedElementLocator
function generates a new pathElement to be added to the pathLo-
cator that both takes into account the hook of the element and
its index.
In the case the analyzed item is a template, a pathElement includ-
ing the hook associated with that template is added to the pathLo-
cator. This rule is intended to solve hook homonymy problems
which may derive from the integration in the project of homonym
hooks from different templates of different projects. The tem-
plate items are recognized thanks to the isTemplate function that
evaluates the hook attribute name.
In case the analyzed item is an HTML element including a tem-
plate (such as the ui-view tag on Line 5 of Listing 9), a query
referring the hook of this tag is added to the locator string in
order to distinguish between the different template instances.
The includesTemplates function recognizes these occurrences by
observing the hooks of the element and of its sons.
The proposed algorithms for hook injection and locator genera-
tion were implemented in the tool-chain presented in Fasolino and
Tramontana (2022).
3.3. Example
With respect to the examples reported in Section 2, the obtained
XPath expressions for the ‘‘Mark as Done’’ button and for the ‘‘Done’’
label are shown in Listing 11.
By analyzing the first XPath expression from the right, the first
element is //*[@x-test-hook-81] that refers to the ‘‘Mark as Done’’
button using the hook defined at Line 15 of Listing 10.
The successive element in the expression is the //*[@x-test-hook-
73][2] one that refers to the tr tag on Line 6 of Listing 10. The ng-repeat
Angular directive on that line may cause the instantiation of multiple
rows of the table, all having the same //*[@x-test-hook-73] hook.
For this reason, the Hook Locator generation algorithm individuates
that the row to be located in this case is the second one and adds
the corresponding indexed expression //*[@x-test-hook-73][2] to the
resulting XPath.
Listing 11: Examples of Hook-based locators
1// *[ @x - tes t -t pl - 1]/ /* [ @x -t est - ho ok - 9] // *[ @x -
te st - tpl - 62 ]/ /* [@ x -t est - ho ok - 73 ][ 2] // *[
@x - tes t -h ook - 81]
2// *[ @x - tes t -t pl - 1]/ /* [ @x -t est - ho ok - 9] // *[ @x -
te st - tpl - 40 ]/ /* [@ x -t est - ho ok - 49 ]/ /* [@ x-
te st - tpl - 62 ]/ /* [@ x -t est - ho ok - 73 ][ 2] // *[
@x - tes t -h ook - 78]
The third element from the right is the //*[@x-test-tpl-62] that
refers to the template including the button. The corresponding hook
is the one reported at line 1 of Listing 10. The fourth element of the
expression is //*[@x-test-hook-9] that refers to the ui-view tag included
at line 5 of the starting page, as reported in Listing 9. This expression
refers to the tag causing the inclusion of the ui-view template defined
in Listing 10. Finally, the leftmost element refers to the html tag of
the starting page reported on Line 1 of Listing 9. The second XPath
expression can be analyzed in the same way. We can observe that it
refers to a different template included in the same page (identified as
//*[@x-test-tpl-40]).
The Journal of Systems & Software 210 (2024) 111932
7
M. De Luca et al.
Fig. 3. An example of iterative development process using Hook-based locators.
3.4. Using Hook-based locators in End-to-End regression testing processes
In an End-to-End (E2E) testing process, the application is tested as a
whole from the perspective of the end-user. C&R tools allow to record
E2E test cases and automatically generate test scripts that can also be
re-executed in regression testing. The hook-based technique presented
in this Section provides an alternative type of locator with respect to
the ones implemented by state-of-the-practice tools, like Selenium and
Katalon Recorder.
In our previous work (Fasolino and Tramontana,2022) we pre-
sented a possible implementation of an iterative development process
where the E2E Regression testing activity exploits hook-based locators.
For clarity, we report in Fig. 3 the description of a generic iteration
of this process that includes five either manual or automatic activities
(decorated by different icons in Fig. 3: a hand for manual activities and
a pair of gear wheels for automatic activities).
The process starts when a new version of the application has been
developed (at the end of a generic Development Iteration). This version
may implement either new Web application features and/or correct
faults introduced in the previous iterations. Three sequential automatic
activities are therefore executed: Hook Injection,E2E Regression Testing
and Test Result Evaluation. The Hook Injection automatically injects
unique hooks in the HTML tags, when needed, according to Algo-
rithm 1. The E2E Regression Testing consists of the execution of all the
E2E regression test cases inherited from previous application versions.
These test cases are stored in the Test Case Repository. In the Test Result
Evaluation activity, test cases are classified in three different categories
according to the obtained results: (1) passed, (2) failed (i.e. test cases
successfully executed but with one or more assertions that fail), or (3)
broken test cases (i.e. test cases that did not successfully execute, due
to a crash of the test code) and are stored in the Test Case Repository as
well.
The successive activity (New Test Cases Recording) consists in record-
ing new E2E test cases, either needed for covering the additional
features of the application, or for substituting broken test cases. This
activity is supported by a Capture and Replay tool implementing the
hook-based locator generation strategy described by Algorithm 2. The
recorded test cases will be pushed in the Test Case repository. At the end
of this activity, a new development iteration will start. This example
of process iteration shows how the hook injection step is the only
additional activity needed before the execution of a traditional E2E
regression testing process.
4. Layout change model for template-based Web applications
Hammoudi et al. (2016) proposed a taxonomy of proximal causes of
locator breakages based on the empirical observation of test breakages
Table 1
Change classification model.
Dimension Possible values
Types of Object Tag, Tag Attribute, Tag Attribute Value, Text, Template Tag
Types of Change Removal, Modification, Insertion, Position Change
Relationships Parent–Child, Ancestor–Descendant, Sibling, Includes
in real Web applications. In their taxonomy they differentiated between
breakages due to modifications in the source code of the applications
(e.g. modifications in attributes, tags, texts), breakages due to the
application behavior (e.g. JavaScript exceptions), and breakages due
to browser behavior (e.g. page reload, user session timeout, pop-up
windows). The first type of causes was the predominant one, covering
73.62% of observed test breakages, and in the rest of the paper we will
focus on them.
Since that taxonomy reports generic modifications in attributes,
tags, and textual contents that can cause locator breakages, we decided
to develop a more detailed classification of changes that may cause such
breakages. Our classification considers three dimensions for character-
izing a layout change of an element of an HTML file: the first dimension
specifies the Type of Object involved in the change (that may be a tag, a
tag attribute, a tag attribute value, a text, or a template tag). The second
dimension specifies the Type of Change, that may consist in either a
Removal, a Modification of an existing Object, the Insertion of a new
Object or the Position Change of an existing object to another point of
the file. The latter dimension consists of the Relationship between the
modified Object and the Object that is involved in the test case and
needs to be detected. Possible relationships include the ‘‘parent–child’’,
‘‘ancestor–descendant’’, and ‘‘sibling’’ relationship between Objects. A
fourth relationship is the ‘‘includes’’ one between the template that
defines the Object to be detected and the same Object. Of course,
changes may also involve the object itself. Table 1 summarizes the three
dimensions we use to classify changes that potentially cause breakages
and the possible values of each dimension.
It is easy to observe that, while the majority of combinations of
these values are representative of feasible changes, a minority of them
will be meaningless. For example, if we consider a ‘‘Tag Attribute Value’’
Object, it will be possible to define a ‘‘Removal’’ or a ‘‘Modification’’ type
of Change, whereas the ‘‘Insertion’’ is not applicable (attributes are not
structured data), and the ‘‘Position Change’’ is irrelevant for attributes.
Table 2 presents meaningful combinations of types of Object and
types of Change derivable from the Model and some clarifying ex-
amples of them. Each row reports the type of Object involved in the
change, a Change Id and its Description, and an example of Original Code
before implementing that change, and eventually the Modified Code. In
the Table, the hChange Id corresponds to changes that are specific to
The Journal of Systems & Software 210 (2024) 111932
8
M. De Luca et al.
Table 2
Example of changes for different Types of Object and Types of Change included in the classification model.
Object type Change ID Change type description Original Code Modified Code
a Attribute value
modification
< 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 > < 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑛𝑒𝑤𝑉 𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 >
Attribute b Attribute removal < 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 > < 𝑡𝑎𝑔 > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 >
c Attribute identifier
modification
< 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 > < 𝑡𝑎𝑔 𝑛𝑒𝑤𝐴𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 >
Text d Text content
modification
< 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 > < 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑛𝑒𝑤𝑇 𝑒𝑥𝑡 < 𝑡𝑎𝑔 >
e Text content removal < 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 > < 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙 𝑢𝑒’’ >< 𝑡𝑎𝑔 >
f HTML tag movement
(within a container)
< 𝑑𝑖𝑣 >
< 𝑡𝑎𝑔 >< 𝑡𝑎𝑔 >
< 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 >
<𝑑𝑖𝑣 >
< 𝑑𝑖𝑣 >
< 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 >
< 𝑡𝑎𝑔 >< 𝑡𝑎𝑔 >
<𝑑𝑖𝑣 >
g HTML tag movement
(in any point of the
HTML tree)
< 𝑑𝑖𝑣 >
< 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 >
< 𝑡𝑎𝑔 >< 𝑡𝑎𝑔 >
<𝑑𝑖𝑣 >
< 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 >
< 𝑑𝑖𝑣 >
< 𝑡𝑎𝑔 >< 𝑡𝑎𝑔 >
<𝑑𝑖𝑣 >
Tag h HTML tag movement
(between two
templates)
< 𝑡𝑒𝑚𝑝𝑙𝑎𝑡𝑒1>
< 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 >
<𝑡𝑒𝑚𝑝𝑙𝑎𝑡𝑒1>
< 𝑡𝑒𝑚𝑝𝑙𝑎𝑡𝑒2>
<𝑡𝑒𝑚𝑝𝑙𝑎𝑡𝑒2>
< 𝑡𝑒𝑚𝑝𝑙𝑎𝑡𝑒1>
<𝑡𝑒𝑚𝑝𝑙𝑎𝑡𝑒1>
< 𝑡𝑒𝑚𝑝𝑙𝑎𝑡𝑒2>
< 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 >
<𝑡𝑒𝑚𝑝𝑙𝑎𝑡𝑒2>
i HTML tag removal < 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 > 𝑡𝑒𝑥𝑡
j HTML tag type
modification
< 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 > < 𝑛𝑒𝑤𝑇 𝑎𝑔
𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑛𝑒𝑤𝑇 𝑎𝑔 >
k HTML tag insertion < 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑣𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 > < 𝑡𝑎𝑔 >< 𝑡𝑎𝑔 >
< 𝑡𝑎𝑔 𝑎𝑡𝑡𝑟 =‘‘𝑛𝑒𝑤𝑉 𝑎𝑙𝑢𝑒’’ > 𝑡𝑒𝑥𝑡 < 𝑡𝑎𝑔 >
template-based Web applications, whereas the other ones may also be
applied to traditional HTML pages.
Regarding the Relationship dimension of the classification model,
it can be used to define changes that involve objects having different
types of relationship with the tag to be pointed to. As an example, the
change may involve the item itself, an ancestor, the parent, a sibling of
the item, or the Template including it. In the remainder of the paper,
we will say that the relationship is of type (𝛼), if the change involves
the item to be located itself, it is of type (𝛽) if it involves the parent of
the item, of type (𝛾) if it involves an ancestor, of type (𝛿) if it regards
a sibling, and of type (𝜖) if it regards the template the item belongs to.
Listing 12 reports an excerpt of HTML code to show some examples
of relationship values. With respect to this Listing, if we consider the
anchor tag < 𝑎 𝑜𝑛𝑐𝑙 𝑖𝑐𝑘 =‘‘𝑔𝑜()’’ >at line 4 as the target of a test case,
depending on the item that will be changed, we will have a different
value in the Relationship dimension. As an example, if the change
regards the same anchor tag, the Relationship is of type (𝛼), whereas
a change regarding the < 𝑑𝑖𝑣 > at line 3, will be characterized by a
relationship of type (𝛽). The Listing also illustrates examples of the
other three types of relationship.
Listing 12: An Excerpt of a HTML Code Showing Tag Classification
with respect to the Target Tag (in line 4)
1<t emp la te > <!- - 𝜖(t em pla te ) -- >
2<p > <!- - 𝛾(a nc est or ) -- >
3<di v > <!- - 𝛽(parent) -->
4<a o nc li ck = " g o () " > < !- - 𝛼(t ar get t ag ) -->
5<a h re f =" \ # to p " > <!- - 𝛿( sib lin g) - ->
6</d iv >
7</p >
8</template>
This model can be used to select a subset of relevant types of change
we may be interested to implement in a subject Web application.
Consequently, different benchmarks of implemented changes could be
defined, depending on the chosen types of model changes.
5. Experiment
In this section we present an experiment we performed to investi-
gate the capability of Hook-based locators to prevent regression test
breakages. To this aim, we considered the Layout change classifica-
tion model proposed in Section 4to define a benchmark of exemplar
changes in Web application layout, implemented these changes in real
Web applications and evaluated the fragility of regression tests with
respect to the modified applications. We also compared the fragility of
test cases based on the proposed Hook-based locators against the ones
based on state-of-the-art and state-of-the-practice locators.
5.1. Research questions
Our study investigates two research questions.
RQ1 How does the robustness of test cases using Hook-based locators
vary when different types of Layout change are implemented in
a Web application?
RQ2 How does the robustness of test cases using Hook-based locators
compare to the robustness of test cases using other types of
locators, when different types of Layout change are implemented
in a Web application?
With the first research questions we aim at evaluating to what extent
test cases based on Hook-based locators are resilient to different types
of common layout changes of the Web application. The second one,
instead, aims at comparing the resilience of these test cases against the
one of analogous test cases exploiting diverse types of locators.
5.2. Experimental objects
The experiment involved two open-source template-based Web ap-
plications implemented by the Angular framework, named A1 and A2
hereafter. We selected two applications as a sample of real applications
to be artificially modified for introducing a systematic set of typical GUI
changes which may break locators.
A1 is a small application that allows to manage a contact list.
It offers functionalities for adding a new contact, showing the list
of contacts, and saving the list in a local database. The application
The Journal of Systems & Software 210 (2024) 111932
9
M. De Luca et al.
includes 3 templates and a static HTML page. It has been forked from
a tutorial example published on GitHub.5
A2 is an application that provides an Angular-based clone of the
Spotify web client. The source code of the application is available on
GitHub6where it counts about 300 forks and more than 2K stars. This
application, like the real Spotify client, provides multiple functionali-
ties, such as the possibility to search for an artist or to display the most
popular songs. It is composed of 9 HTML pages and 47 templates.
5.3. Experimented locators
In the experiment we considered 7 types of locators, namely Ab-
solute, Relative, ROBULA, Katalon, Hook-Based, besides two locators
provided by Selenium (respectively, XPath based and CSS based).
The first five locators were implemented by Katalon Recorder that
provides the option of adding ‘‘Extension Scripts’’ that allows to imple-
ment new user-defined locators besides the ones offered by the tool.7
The implementation of these locators has been made available, together
with all the experimental materials in a GitHub repository.8
As to the Absolute and Relative locators, we implemented them
according to the definitions provided in the Background Section. As
to ROBULA locators, we implemented this technique on the basis of
the ROBULA algorithm described in Leotta et al. (2014). As to Katalon
locators, we decided to always consider the first one among the ones
natively proposed by the tool (version 5.9.09) for the following reason.
During the test case recording step, the tool provides a number of
alternative locators associated with the action being recorded. The
number, type and order of offered alternative locators is variable,
depending on the specific type of object. In addition, Katalon does not
explicitly label the type of the proposed locator. As a consequence, to
avoid non-determinisms and assure the repeatibility of our experiment,
we decided to always select the first locator proposed by the tool.
As regards Selenium, we have used the version 3.17.2.10 We chose
to consider two types of locator, the former being based on XPath and
the latter being based on CSS style attributes. Both types of locator
were presented in the Background Section. Since Selenium usually
provides more XPath-based locators, we had to select one of them.
We preferred the XPath-based locators referring to specific attributes
and values (when possible), rather than the ones referring on text and
tag hierarchy, since we considered them more promising based on
our experience as testers. In the following we will refer to the former
Selenium locator as ‘‘S’’ and the latter one as ‘‘C’’.
5.4. Experimental procedure
The experimental procedure we followed included 5 steps. Fig. 4
describes the flow of these steps and the produced artefacts. Automated
steps are reported as rounded boxes with red background while manual
steps are shown as rounded boxes with blue background, and the
produced artefacts are represented as document-shaped boxes. The
automated steps were implemented using GitHub Actions in YAML
scripts, while the artefacts were stored in a GitHub repository.
5https://github.com/bbachi/angular-java- example.
6https://github.com/trungk18/angular-spotify.
7Katalon Recorder Extensions Scripts, https://docs.katalon.com/docs/plug
ins-and-add-ons/katalon-recorder-extension/get-your-job-done/extend-katalon
-recorder/extension-scripts-aka-user-extensions.js-for-custom-locator-builders-
and-actions-in-katalon-recorder.
8Experimental Material, https://github.com/reverse-unina/LocatorsFragilit
yExperimentation.
9Katalon Recorder v.5.9.0, https://chrome.google.com/webstore/detail/ka
talon-recorder-selenium/ljdobmomdgdljniojadhoplhkpialdid.
10 Selenium IDE v. 3.17.2, https://chrome.google.com/webstore/detail/sele
nium-ide/mooikfkahbdckldjjndioackbalphokd.
Table 3
Test case summary.
#Actions #Assertions #Locators #Pages
Test case for A1 6 2 8 1
Test case for A2 8 1 9 6
In the first step we automatically injected hooks in the source code
of the considered applications using the automated Hook-Injector com-
ponent presented in Fasolino and Tramontana (2022). This component
is able to operate on the source code of template-based Web applica-
tions implemented by several different frameworks such as Angular,11
Freemarker,12 Twig13 and Smarty.14 Of course, the tool can operate also
on pure HTML pages. We injected 56 hooks in A1 and 234 in A2,
obtaining two updated versions of the original applications. The effort
required by the component to execute this step was actually negligible,
being of 4 ms and 9 ms, respectively, for A1 and A2.
Therefore, for both applications, we manually recorded a test case
consisting of a sequence of actions (such as click, fill-in a text field, etc.)
to be performed on different Web page items and assertions to check
the test results. In the first application (A1) we recorded a test case
that inserts a new contact in the contact list and visualizes the updated
list. In the second application (A2), the recorded test case includes the
authentication of the user (via login and password), the search for an
artist by name, the selection of one of its songs from the returned song
list and the navigation to the Web page related to the album including
the selected song.
We recorded seven different versions of the same test case, each of
them using one of the different types of locator reported in Section 5.3.
Test cases were recorded using Katalon Recorder and Selenium. Table 3
reports summary data about each test case including the number of
actions, number of assertions, number of locators, and number of Web
pages traversed by the test case.
We also had to design and implement a set of modified app versions.
To this aim we defined a benchmark of exemplar changes for the web
applications under test, according to the change classification model
reported in Section 4. Our benchmark had to cover at least once all the
types of change reported in Table 2 and possible Relationships defined
in the change model.
Each new modified version of the Web application had to include
a single change. As to the implementation of the changes, we con-
veniently chose some GUI item associated to locators included in the
recorded test case and applied them the types of change compatible to
that specific type of object. As an example, for a ‘‘tag attribute’’ type of
object, we could define 3 types of changes (namely a, b, c in Table 2).
In another type of change, we modified one of the neighbors of the
item to be located by the test case. As an example, with respect to the
previous ‘‘tag attribute’’ example, we also applied the three changes
to an attribute included in the father, an ancestor, and a sibling of the
item to be located, as well as to an attribute of the template including
the item to be located.
Tables 4 and 5show a summary of the changes we applied. Table 4
reports, for each application under test, the ID of the application,
the total number of different versions and the number of different
versions subdivided with respect to the 11 different types of changes
(a .. k). Table 5, instead, shows the number of different versions
classified with respect to the 5 different categories of involved items
relationships (𝛼.. 𝜖). The different count values in the tables are due
to the non-applicability of some specific types of changes on the two
applications.
11 Angular, https://angular.io/.
12 Apache Freemarker, https://freemarker.apache.org/.
13 Twig PHP Template Engine, https://twig.symfony.com/.
14 Smarty PHP Template Engine, https://www.smarty.net/.
The Journal of Systems & Software 210 (2024) 111932
10
M. De Luca et al.
Fig. 4. Overview of the experimental procedure. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Table 4
Implemented changes with respect to change types.
Web # Versions Count of versions wrt the change types
application a b c d e f g h i j k
A1 100 12 12 12 6 6 9 8 9 9 9 8
A2 92 10 10 10 4 4 10 10 9 8 9 8
The modified versions of the Web pages/templates were manually
implemented and stored in two GitHub repositories that we made
available1516 and that also report the results of the test executions.
As to the step of Test Case Execution, we exploited the GitHub
Actions through the definition of a YAML script that allowed us to
setup the same test case execution environment for each test case. The
average times needed for executing each test case was 15 s, for the A1
app, and 17 s for the A2 app. As it can be seen, these execution times
were sensibly longer than the times required by the hook-injection step
that we reported at the beginning of this Section.
Finally, in the Test Case Results Classification step, we manually
analyzed the test case results in order to obtain passed test cases and
test cases failed for fragility issues.
15 A1 repository, https://github.com/reverse-unina/A1- ContactList.
16 A2 repository, https://github.com/reverse-unina/A2- Spotify.
Table 5
Implemented changes with respect to the involved items relationships.
Web # Versions Count of versions wrt the involved items relationships
application 𝛼 𝛽 𝛾 𝛿 𝜖
A1 100 25 18 22 22 13
A2 92 18 17 17 22 18
5.5. Variables and metrics
The independent variable of the study is the type of locator loc used
by test cases. The loc considered values are:
A - Absolute locator
R - Relative locator
RO - Locator generated using the ROBULA technique
K - Locator provided by Katalon
H - Hook-based locator
S - XPath-based Locator provided by Selenium
C - Locator provided by Selenium based on CSS style attributes
The dependent variable we consider is the number of test cases broken
for fragility issues. In order to evaluate this metric we need to distinguish
between test cases broken due (1) to obsolescence or (2) to fragility of
the locators. The obsolescence of a test case indicates that the test is no
more applicable and has to be abandoned, whereas in the other cases
the breakage of the test case was unexpected. For example, if a feature
is removed from the Web page (e.g. its enabling button is removed),
The Journal of Systems & Software 210 (2024) 111932
11
M. De Luca et al.
Table 6
Test case execution results for the two AUTs.
Test case A1 : Contact list A2 : Spotify
execution Locator Locator
results A R RO K S C H A R RO K S C H
Passed 58 86 83 74 77 74 88 47 49 64 59 76 69 79
Failed for Obsolescence 7 7 7 7 7 7 7 8 8 8 8 8 8 8
Failed for Fragility 35 7 10 19 16 19 5 37 35 20 25 8 15 5
Total 100 100 100 100 100 100 100 92 92 92 92 92 92 92
then the test case has to be considered obsolete because the test case is
no more executable. On the other hand, if other types of layout changes
occur (e.g. the enabling button is moved to a different position or its
text is changed), then the test case should remain executable. Of course,
the obsolescence of a test case does not depend on the locator types
and has to be manually assessed.
5.6. Experimental results
Table 6 reports, for each of the applications under test, the results
of the execution of the test cases based on the 7 types of locators (A,
R,RO,K,S,C, and H). Each row of the Table shows, for each type
of locator, the cumulative number of passed test cases (Passed), those
that failed due to obsolescence (Failed for Obsolescence), and those that
failed due to locator fragility (Failed for Fragility).
As the Table shows, for the A1 application the test cases that failed
the most for fragility issues were the ones based on Absolute locators
(35 failures), whereas the test cases with Hook-based locators proved
to be the most robust, failing only 5 time out of 100.
Analogous results were observed with respect to the second appli-
cation, A2, where the most robust test cases were always the ones
based on hook locators, which failed for fragility only 5 times out of 92
executed test cases. The least robust test cases were the ones based on
Absolute and Relative locators (with 37 and 35 failures, each), followed
by the ones based on Katalon, ROBULA and Selenium locators.
In order to comprehend to what extent the robustness of test cases
based on different locators varied with the types of change imple-
mented in the Web applications, we analyzed in detail the observed
test breakages. Table 7 reports two Fragility Maps that associate each
considered type of change with the type of locators that caused the
breakage of a test case for fragility issues at least once, respectively in
A1 (in Table 7a) and A2 (in Table 7b). Each table cell corresponds to
a specific type of change: the row identifies the type of Layout change
(using the same Change IDs presented in Table 2), while the column
identifies the Relationship between the modified element and the target
tag (using the notation shown in Section 4). Each cell contains one or
more IDs, corresponding to any of the 7 considered types of locators
(A, R, RO, K, S, C, H) for which at least a failure was observed. A dash
(–) in a cell indicates a ‘‘Don’t Care’’ case with respect to the fragility.
In other terms, we did not detect fragility issues of all locators in that
case, either because it was not possible to apply that type of change,
or because all the test cases failed due to obsolescence. Eventually, a
symbol indicates that all the test cases were robust with respect to
that type of modification in the context of the executed test cases. A
detailed list with all the executed test cases and their execution results
is reported in the experimental material.17
In order to provide an answer to RQ1, we focused on the results
achieved by test cases adopting the Hook-based locators. As Table 6
shows, there were only 5 hook-based test cases in A1 and 5 in A2
that failed by fragility (against overall 167 passed test cases). As the
Fragility Maps show, all these breakages were due to changes involving
templates. In details, a total of 6 breakages (4 in A1 and 2 in A2) were
17 Experimental Material, https://github.com/reverse-unina/LocatorsFragilit
yExperimentation.
Table 7
Maps of the test fragility issues for A1 and A2.
(a) Fragility Map for A1
Layout Relationship with the target tag
change 𝛼 𝛽 𝛾 𝛿 𝜖
a R,RO,C C S,C
b R,RO,S,C C
c R,RO,C C
d R,K K
e R,K K K
f A,RO,K,S,C A,C A,S,C A,K A,K,S,C
g A,K,C A A,K,S,C K
h A,K,S,H A,S A,H A,S,H
iA,S,C A,S,H K A
j A,R,RO,K,S A,S A A
k A,S,C A A,RO,K,S A
(b) Fragility Map for A2
Layout Relationship with the target tag
change 𝛼 𝛽 𝛾 𝛿 𝜖
a RO,K,C K,S,C C K R,K
b RO,C C C R
c RO,C C C R
d R,K R,K
e R,K R,K
f A,R,RO,K A,R,RO,K A,R,RO,K A,R,RO,K A,R,RO,K,H
g A,R,RO,K,S,C A,R,RO,K,S A,K,H A A,R,K
h A,R,RO,K,H A,H
i A,R,RO,K A,R,RO,H A,R,K,C
j A,R,RO,K,S A,K A A,R
k A,R,RO,K A,R,RO A,R,RO,K A,R
Legenda:
Lowercase letters from a to j on first column: layout change types
Uppercase letters from 𝛼to 𝜖on header row: relationships with the target tag
A, R, RO, K, S, C, H: Absolute, Relative, ROBULA, Katalon, Selenium, Selenium CSS,
Hook-based locators
–: inapplicable changes or test cases failed by obsolescence
: no breakages at all.
due to changes of the position of HTML tags between two templates
(change type h), 2 breakages (one for each application) were due to
the removal of a template container (change type i) and 2 breakages in
A2 were caused by a position change of the template tag (change type
g). This result was not surprising, because Hook-based locators always
depend on template references. We point out that also Absolute loca-
tors were fragile to these type of changes, whereas Relative, ROBULA,
Katalon Recorder and both the Selenium locators often did not break
because they do not always have references to template ancestors.
On the basis of the observed results, we could answer the RQ1
research question as follows:
With respect to the considered layout changes, test cases using Hook-
based locators can be considered robust to all layout changes, except
the ones consisting in moving a tag between two templates, or template
changes.
In order to answer to RQ2, we analyzed the breakages of test cases
based on all the 7 considered locators and compared them. We ranked
the different types of locator on the basis of the growing number of
The Journal of Systems & Software 210 (2024) 111932
12
M. De Luca et al.
observed test breakages. After the Hook-based locators that ranked first,
the second most robust ones were the X-Path based ones generated
by Selenium. Selenium is probably the best tool in the current
state-of-practice, thanks to the multiplicity of supported techniques for
generating locators.
Only 24 test cases based on XPath locators proposed by Selenium
failed, whereas 153 did not break after the implementation of the lay-
out changes. Selenium supports many effective strategies for generating
robust locators: its locators exploit id or href attributes when available
and tends to prefer the inclusion in its XPath expressions of properties
of the target tag or of its closer tags. For this reason, locators proposed
by Selenium failed especially in those cases in which the modifications
involved the value of attributes such as id or href, or in which such
attributes were not present.
We could conclude that:
Selenium XPath locators were robust to most implemented changes,
but they resulted fragile when specific attributes are changed (e.g. id
or href) or with changes of the type or position of the target tag.
The third most robust type of locator resulted the one generated by
ROBULA, one of the state-of-the-art techniques (Leotta et al.,2014).
In our experiments, 30 test cases using ROBULA locators failed for
fragility issues, whereas 147 test cases passed. Most of the failures (17
out of 30) concerned changes to the target tag (relationship type 𝛼),
while the other failures concerned the parent tag (5 out of 30), sibling
tags (4 out of 30) or ancestor tags (3 out of 30). Only in a single case
the test case using ROBULA locators failed for a template change. This
result was not surprising since the ROBULA algorithm behaves similar
to the algorithms included in Selenium and produces compact XPath
expressions based on properties of the target tag or of its closer tags.
Regarding the considered change types, we observed that ROBULA
locators failed for each possible change except those involving text
changes (i.e. change types dand e), because text is not considered by
ROBULA locators. We could conclude that:
ROBULA locators are generally fragile to any type of change, espe-
cially to those involving the target tag or its nearest neighbors, but are
robust to changes involving texts included in tags.
The fourth most robust technique in our experiment was the CSS-
based one proposed by Selenium. These locators contain references
to the style of the target tag and possibly the parent tag or other
ancestors. In the experiment, 34 test cases using CSS-based locators
broke after layout changes, whereas 143 remained valid. Most locator
breakages concerned changes in the style of the target tag (i.e. the
class attribute) or of one of its neighbors. In further cases where the
target tag had no class attribute, layout changes involving the target
tag type or movements of this tag within the HTML tree also caused
test breakages.
We could conclude that:
CSS-based Selenium locators were fragile mostly with respect to
changes involving the web page style, whereas they generally resulted
robust when style information used in the page were not involved in
the changes.
The fifth ranked category of locators was the one provided by
Katalon Recorder that caused 44 test breakages against 133 passed test
cases. We had some test breakages for each change type except for
the ones involving just attributes (change types a,band c). This result
can be explained by observing that the locators suggested by Katalon
Recorder usually include predicates related to the text, to the tag type,
and to the neighbors of the target tag but not based on attribute values.
On the other hand we have observed test breakages for tags having all
the considered relationships types 𝛼,𝛽,𝛾,𝛿,𝜖. On the basis of these
data, we could conclude that:
The locator suggested by Katalon Recorder represents the most robust
alternative only when the change involves only attributes.
Finally, we analyzed the performance of the two basic techniques
for locator generation, i.e. Relative and Absolute, that ranked second-
last and last, respectively. As to Relative locators, we observed overall
42 test case breakages, against 135 passed test cases. More in details,
Relative locators caused 7 breakages in A1, against 86 passed test
cases, and 35 test breakages in A2, against 49 passed test cases. The
worse performance of Relative locators in A2 was due to the fact
that this latter application has a more complex user interface, with
many dynamically generated HTML elements having no distinctive
properties. For this reason, the generated XPath expressions contained
many references to different tags of the HTML tree, that made them
more fragile.
There was at least a test breakage for each type of change and
for each type of relationship. With respect to both applications, the
majority of the test cases with Relative locators broke after a change in
the target tag (relationship 𝛼): with respect to 37 changes of type 𝛼that
did not cause obsolescence, there were 20 broken test cases against 17
passed ones. Of course, each XPath expression included a reference to
the target tag, thus the changes on that target caused location breakages
in the majority of test cases. On the basis of these data, we could
conclude that:
Relative locators may be fragile to any type of change, in particular
to the ones involving the target tag and its nearest neighbors.
Absolute locators represent the easiest solution to the problem of
identifying the elements of a test case, but proved to be the most fragile
in our experiment. We observed the breakage of 72 test cases against
105 passed test cases. More in details, since Absolute locators are built
without taking into account attributes and text, they resulted very
fragile with all changes regarding tags and templates (change types f,
g,h,i,j,k). By narrowing our analysis to these six types of change,
we observed 72 test breakages against only 23 passed test cases.
Breakages were observed for each type of considered relationship. We
can conclude that:
Absolute locators are fragile with respect to almost all the considered
types of change, except those regarding only text and attributes.
In conclusion, with respect to the research question RQ2, the ex-
periment showed that none of the considered locator types is robust to
any type of change. Hook-based locators were however the most robust,
even if they are exposed to fragility issue in case of changes impacting
templates. In the latter cases, the other types of locators resulted more
robust, with the exception of the Absolute ones.
If we consider the state-of-the-practice C&R tools Katalon and Sele-
nium, Selenium locators showed to be the most robust. The locators
suggested by Katalon Recorder, another state-of-the practice tools, per-
formed worse than the Hook-based, ROBULA and Selenium locators,
but they showed to be a valid alternative for changes involving only
tag attributes.
The locators produced by the ROBULA technique represent another
valid alternative, even if we observed fragility issues with respect to
any type of change, in particular when the tags directly involved in the
test case are modified. Relative locators were generally not very reliable
but have the advantage of being generally the most readable ones, by
including a few references centered on the element to be identified and
The Journal of Systems & Software 210 (2024) 111932
13
M. De Luca et al.
on its closest neighbors. Absolute locators proved to be the most fragile
solution in all conditions.
5.7. Threats to validity
In this subsection we discuss possible threats to the validity of our
study, according to the classification proposed in Wohlin et al. (2012).
5.7.1. Threats to internal validity
In order to execute all the test cases under the same conditions, we
run all of them in an execution environment consisting of a container
offered by GitHub and configured by a YAML script that defined the use
of the same Ubuntu version (22.04.2) and the same headless version of
Chrome (110.0.5481.77). A delay of one second was added between
each step of each test case in order to reduce the risk of inconsistent
test results due to timing issues. Each test case has been executed twice
and we have always obtained the same results.
5.7.2. Threats to construct validity
Implementations of Absolute, Relative and ROBULA locator gener-
ators were not available in the context of Katalon Recorder so we had
to implement and test them before the execution of the experiment.
Regarding the Absolute and Relative locators, we implemented genera-
tors consistently with their definitions provided in Section 2. Regarding
ROBULA, we implemented it by referring to the algorithm reported
in Leotta et al. (2014). As for Katalon Recorder, we used the locators
provided by the tool. Since Katalon Recorder suggests several types of
locator at each request, we decided to always use the first suggested
one, in order to avoid non determinism in our experiment. Selenium
offers several possible locators for each object (the number and type of
different locators vary with the object to be located). As explained in
Section 5.4 we have selected the most promising locators based on our
experience as testers. As a consequence, we cannot exclude that other
testers might choose different locators. All the locator implementations
have been made available in the experimental material for replication
purposes.18
In order to distinguish between test cases broken by fragility or by
obsolescence, two authors carefully analyzed the modified versions of
the Web applications, in order to evaluate if the functionality involved
in the test case could still be successfully executed or if it was obsolete
after that layout change.
5.7.3. Threats to conclusion validity
The reported conclusions are based on the subset of considered
changes. In order to consider a systematic and realistic set of changes,
we have proposed a change model that extends the one previously
presented in Hammoudi et al. (2016), which was based on the empirical
observation of changes causing test breakages in Web applications. In
addition, our extension takes into account specific types of changes that
may occur in template-based Web applications.
To make a fair comparison, we tried to define an equal number
of changes for each of the considered typologies. However, we imple-
mented fewer changes in some cases because some of them were not
always applicable in the considered web applications. In addition, the
considered frequency of application changes may not be representative
of real change occurrences. In future work, we intend to define a pool
of changes with controlled frequencies of change types, on the basis of
real change repository mining.
18 Experimental Material: Implemented Locators, https://github.com/reverse
-unina/LocatorsFragilityExperimentation/tree/main/LocatorsImplementation.
5.7.4. Threats to external validity
The first threat to the generalization of our conclusions is due to
the small number of considered template-based Web applications. This
limitation was essentially due to the effort required to systematically
design and implement the benchmark of about 100 GUI changes per
application which might break locators. However, even if limited,
our sample is representative of real template-based Web applications,
since they were selected among popular open-source applications from
GitHub.
Another threat depends on the limited number of test cases in-
volved in the study. However, our test cases were defined in order
to include several different locators and to involve different pages of
the applications under test. In this way, we are confident that the test
cases include a representative sample of real interactions with web
application widgets which may be involved in realistic web page layout
changes.
6. Related work
The problem of locators fragility has been originally studied for
Web context extraction (Myllymaki and Jackson,2002;Kowalkiewicz
et al.,2006;Montoto et al.,2011) and Web application wrapping
and migration problems (Di Lucca et al.,2007). A broad discussion
about the causes of test case fragility and the techniques and strategies
proposed in the literature to solve them at different levels is presented
by Ricca et al. in 2019 (Ricca et al.,2019). It is possible to summarize
that the problem of the fragility of the locators has been faced by three
distinct approaches: (1) by generating robust locators, (2) by repairing
broken locators and (3) by refactoring the Web pages to enable the
generation of more robust locators. In the next subsections we will
discuss the main contributions in literature regarding any of these three
approaches.
Techniques for the generation of robust locators. The first family of
techniques aims at improve the process of generation of locators by
using heuristics generation techniques on the basis of the analysis of the
DOM of the Web pages observed during the Web application execution.
Thummalapenta et al. (2012,2013) proposed a tool called ATA (Au-
tomating Test Automation) that helped the tester in the development
of change-resilient test scripts, and in their maintenance and repairing.
In 2014, Yandrapally et al. (2014) proposed a technique to generate
robust XPath queries by focusing on attributes that are supposed to
be invariant during the application evolution, such as labels or IDs. A
similar approach is the one proposed by Pirzadeh and Shanian (2014).
They presented a test development framework able to help testers
in producing resilient tests, i.e. test cases independent on the internal
structure of applications, so that a change in the structure will not break
them. Their approach is based on analyses of both structure and textual
contents of HTML pages. More recently, Kirinuki et al. in 2019 (Kirinuki
et al.,2019) presented COLOR (COrrect LOcator Recommender), a
technique for the automatic generation of locators which takes into
account various properties (i.e., attributes, texts, images, and positions)
and their changes between two Web Application releases. In 2021,
Nguyen et al. (2021) have proposed another heuristic approach that
also considers the neighbors of the Web elements to be located in the
DOM hierarchy.
The most relevant contribution to this field in the last years is
presented in the series of papers based on the ROBULA tool and on its
evolution, starting from 2013. In order to reduce the costs for finding
and repairing broken test cases, Leotta et al. proposed in 2014 ROB-
ULA (ROBUst Locator Algorithm) (Leotta et al.,2014), an algorithm
and a tool for the generation of robust XPath locators. In 2016 they
improved their technique by proposing Robula+ (Leotta et al.,2016),
another heuristic technique that overcome some of the limitations of
Robula by improving the refinement algorithm prioritizing the use of
some attributes that demonstrated their stability on the basis of the
The Journal of Systems & Software 210 (2024) 111932
14
M. De Luca et al.
previous experiments. The most recent contribution of this series is
represented by Sidereal (Leotta et al.,2021), a more complex adaptive
approach based on the minimization of a Fragility Coefficient. All these
approaches have been tested on real open source Web applications and
have brought the number of broken locators between two consecutive
releases down to about 10% in the case of Sidereal (Leotta et al.,2021).
Techniques for repairing broken test cases or locators. An alternative
approach with respect to the proposal of robust locators is the one of
automatically repairs broken locators when a test breakage occurs.
The idea to automatically repair broken test cases of interactive
applications was initially proposed in the context of GUI testing of
desktop applications by Memon (2008). The first relevant approach
in the context of Web testing is the one of Choudhary et al. (2011)
that in 2011 proposed the WATER (Web Application TEest Repair)
approach to repair locators. Their approach is based on the intuition
that a broken locator can be repaired by choosing another locator
that is similar to the original one. The similarity between different
locators is measured by means of the Levenshtein distance between
the corresponding XPath queries. This approach is not more effective
when large updates of the Web Application’s layout occur. In such a
case, the locator may not be correctly repaired even by using WATER.
In 2016, Hammoudi et al. (2016) proposed an incremental test repair
approach called WATERFALL based on an iterative application of the
WATER algorithm.
In 2015 Leotta et al. (2015) extended the ROBULA approach previ-
ously described by implementing an algorithm based on multi-locators,
i.e. an algorithm that propose a set of different promising XPath queries
by applying different heuristics, thus alternative queries can be auto-
matically used if the selected query is not more usable (Leotta et al.,
2015). In 2018, Stocco et al. (2018b,a) have proposed an approach
and a tool to repair DOM-based locators by means of a visual-based
approach that recognizes where is the Web element that should be
pointed by the locator and propose new DOM-based locators for the
new position of the Web element. More recently, Aldalur and Díaz
(2017) and Aldalur et al. (2020) have proposed algorithms to regen-
erate broken locators on the basis of the analysis of alternative locators
from the previous releases of the Web application under test.
Techniques for the preventive improvement of the testability. The third
approach consists in carrying out preventive maintenance interventions
on the code of the applications under test, in order to subsequently
be able to generate robust locators. Few papers in literature proposed
this type of solution. The most relevant are the ones based on the tool
LED (Live Editor for DOM) of Bajaj et al. (2015b,a). This tool helps a
Web developer to find out which are the possible locators (generated
with the support of C&R tools such as Selenium) that can be associated
with the elements of the Web application GUI while browsing. The Web
Developer can add identifiers to the Web application source code in
order to limit the use of less robust locators. This approach has the
advantage of being applicable to a large variety of technologies for the
development of Web applications, but leaves the burden of modifying
the source code to the developer. This approach can be more difficult
to apply in the case of Rich Internet Applications for which it is more
difficult to trace the elements of the DOM to the Javascript code that
generates them. With respect to this approach, our proposal presents a
complete automation of the activities of modification of the source code
of the application under test, applicable on most of the frameworks on
which LED is applicable.
At the best of our knowledge, our approach is the first one com-
pletely supporting the automatic injection of unique identifiers in the
source code of the Web applications that are able to support the
generation of robust locators.
7. Conclusions
In this paper we presented an experiment where we systematically
compared the fragility of seven different types of locator used by C&R
testing techniques. To conduct the experiment, we defined a benchmark
of Web page layout changes using a change classification model that we
proposed.
In the study we focused on template-based Web applications,
to whom the hook-based locator technique applies. We injected 192
different changes on two open source template-based Web applications
based on Angular. We studied the robustness of test cases based on
the seven considered types of locator, as the Web page layout changed
according to our benchmark. The experiment confirmed the validity
of our solution based on hook-based locators, which produced less
breakages than all the considered techniques. It also allowed us to study
how the fragility of different types of locators varied with different
types of GUI layout changes.
This experiment also provided interesting data for exploring the
cost-effectiveness of our hook-injection technique. With respect to other
types of locators, the only additional step required by our technique
consists in injecting hooks in the source code of the subject application
before recording a test case. The injection needs to be repeated each
time a new version of the app is developed. However, as the experiment
showed, the impact of this step on an End-to-End regression testing
process is irrelevant. Thanks to our automatic Hook-Injector software
component, indeed, the injection step requires an execution time negli-
gible with respect to the time required by the overall regression testing
process. On the basis of these preliminary experimental data, we are
confident about the acceptability and practicability of the approach in
E2E regression testing processes. In future work we intend to extend our
study and investigate its acceptability also in industrial CI/CD contexts.
A remark to be made is that the hook-based locator technique is only
applicable to web applications implemented by a template technology,
like Angular, whereas the other ones have a more general applicability.
In future work, we plan to extend our experimentation by consider-
ing further applications and other state-of-the-art locators, such as the
ones proposed by ROBULA+ or other state-of-the-practice solutions. We
also intend to mine real change-sets in open source web application
repositories, in order to extend our change model and to support the
definition of more realistic benchmarks of changes. In order to carry
out further experiments, we also intend to develop a technique for
the automatic injection of layout changes on the basis of the proposed
change classification model.
Eventually, we plan to develop an optimized technique for locator
generation that is able to predict the most promising locators on the
basis of the experimental results of our studies.
CRediT authorship contribution statement
Marco De Luca: Software, Validation, Writing review & editing.
Anna Rita Fasolino: Conceptualization, Methodology, Investigation,
Writing original draft, Writing review & editing. Porfirio Tra-
montana: Conceptualization, Methodology, Investigation, Writing
original draft, Writing review & editing.
Declaration of competing interest
The authors declare that they have no known competing finan-
cial interests or personal relationships that could have appeared to
influence the work reported in this paper.
Data availability
Data will be made available on request.
The Journal of Systems & Software 210 (2024) 111932
15
M. De Luca et al.
References
Aldalur, I., Díaz, O., 2017. Addressing web locator fragility: a case for browser
extensions. In: Campos, J.C., Nunes, N., Campos, P., Calvary, G., Nichols, J.,
Martinie, C., Silva, J.L. (Eds.), Proceedings of the ACM SIGCHI Symposium on
Engineering Interactive Computing Systems, EICS 2017, Lisbon, Portugal, June
26-29, 2017. ACM, pp. 45–50. http://dx.doi.org/10.1145/3102113.3102124.
Aldalur, I., Larrinaga, F., Perez, A., 2020. ABLA: an algorithm for repairing structure-
based locators through attribute annotations. In: Huang, Z., Beek, W., Wang, H.,
Zhou, R., Zhang, Y. (Eds.), Web Information Systems Engineering - WISE 2020 -
21st International Conference, Amsterdam, the Netherlands, October 20-24, 2020,
Proceedings, Part II. In: Lecture Notes in Computer Science, 12343, Springer, pp.
101–113. http://dx.doi.org/10.1007/978-3- 030-62008- 0_7.
Ardito, L., Coppola, R., Morisio, M., Torchiano, M., 2019. Espresso vs. EyeAutomate:
An experiment for the comparison of two generations of android GUI testing.
In: Proceedings of the Evaluation and Assessment on Software Engineering. EASE
’19, Association for Computing Machinery, New York, NY, USA, pp. 13–22. http:
//dx.doi.org/10.1145/3319008.3319022.
Bajaj, K., Pattabiraman, K., Mesbah, A., 2015a. LED: tool for synthesizing web
element locators. In: Cohen, M.B., Grunske, L., Whalen, M. (Eds.), 30th IEEE/ACM
International Conference on Automated Software Engineering, ASE 2015, Lincoln,
NE, USA, November 9-13, 2015. IEEE Computer Society, pp. 848–851. http://dx.
doi.org/10.1109/ASE.2015.110.
Bajaj, K., Pattabiraman, K., Mesbah, A., 2015b. Synthesizing web element locators.
In: Proceedings of the 30th IEEE/ACM International Conference on Automated
Software Engineering. ASE ’15, IEEE Press, pp. 331–341. http://dx.doi.org/10.
1109/ASE.2015.23.
Banerjee, I., Nguyen, B.N., Garousi, V., Memon, A.M., 2013. Graphical user interface
(GUI) testing: Systematic mapping and repository. Inf. Softw. Technol. 55 (10),
1679–1694. http://dx.doi.org/10.1016/j.infsof.2013.03.004.
Chapman, P., Evans, D., 2011. Automated black-box detection of side-channel vul-
nerabilities in web applications. In: Proceedings of the 18th ACM Conference
on Computer and Communications Security. CCS ’11, Association for Computing
Machinery, New York, NY, USA, pp. 263–274. http://dx.doi.org/10.1145/2046707.
2046737.
Choudhary, S.R., Zhao, D., Versee, H., Orso, A., 2011. WATER: Web application test
repair. In: Proceedings of the First International Workshop on End-To-End Test
Script Engineering. ETSE ’11, Association for Computing Machinery, New York,
NY, USA, pp. 24–29. http://dx.doi.org/10.1145/2002931.2002935.
Coppola, R., Morisio, M., Torchiano, M., Ardito, L., 2019. Scripted GUI testing of
android open-source apps: evolution of test code and fragility causes. Empir. Softw.
Eng. http://dx.doi.org/10.1007/s10664-019- 09722-9.
Di Lucca, G.A., Fasolino, A.R., Tramontana, P., 2007. Web pages classification using
concept analysis. In: 23rd IEEE International Conference on Software Maintenance
(ICSM 2007), October 2-5, 2007, Paris, France. IEEE Computer Society, pp.
385–394. http://dx.doi.org/10.1109/ICSM.2007.4362651.
Di Martino, S., Fasolino, A., Tramontana, P., Starace, L., 2020. Comparing the
effectiveness of capture and replay against automatic input generation for android
graphical user interface testing. Softw. Test. Verif. Reliab. 31, http://dx.doi.org/
10.1002/stvr.1754.
Fasolino, A.R., Tramontana, P., 2022. Towards the generation of robust E2E test
cases in template-based web applications. In: 2022 48th Euromicro Conference
on Software Engineering and Advanced Applications (SEAA). pp. 104–111. http:
//dx.doi.org/10.1109/SEAA56994.2022.00024.
Hammontree, M.L., Hendrickson, J.J., Hensley, B.W., 1992. Integrated data capture and
analysis tools for research and testing on graphical user interfaces. In: Proceedings
of the SIGCHI Conference on Human Factors in Computing Systems. CHI ’92,
Association for Computing Machinery, New York, NY, USA, pp. 431–432. http:
//dx.doi.org/10.1145/142750.142886.
Hammoudi, M., Rothermel, G., Stocco, A., 2016. WATERFALL: An incremental approach
for repairing record-replay tests of web applications. In: Proceedings of the
2016 24th ACM SIGSOFT International Symposium on Foundations of Software
Engineering. In: FSE 2016, Association for Computing Machinery, New York, NY,
USA, pp. 751–762. http://dx.doi.org/10.1145/2950290.2950294.
Hammoudi, M., Rothermel, G., Tonella, P., 2016. Why do record/replay tests of web
applications break? In: 2016 IEEE International Conference on Software Testing,
Verification and Validation (ICST). pp. 180–190. http://dx.doi.org/10.1109/ICST.
2016.16.
Kaner, C., 2008. A tutorial in exploratory testing. URL https://www.kaner.com/pdfs/
QAIExploring.pdf.
Kirinuki, H., Tanno, H., Natsukawa, K., 2019. COLOR: correct locator recommender
for broken test scripts using various clues in web application. In: Wang, X.,
Lo, D., Shihab, E. (Eds.), 26th IEEE International Conference on Software Analysis,
Evolution and Reengineering, SANER 2019, Hangzhou, China, February 24-27,
2019. IEEE, pp. 310–320. http://dx.doi.org/10.1109/SANER.2019.8667976.
Kowalkiewicz, M., Orlowska, M.E., Kaczmarek, T., Abramowicz, W., 2006. Robust web
content extraction. In: Carr, L., Roure, D.D., Iyengar, A., Goble, C.A., Dahlin, M.
(Eds.), Proceedings of the 15th International Conference on World Wide Web,
WWW 2006, Edinburgh, Scotland, UK, May 23-26, 2006. ACM, pp. 887–888.
http://dx.doi.org/10.1145/1135777.1135928.
Leotta, M., Ricca, F., Tonella, P., 2021. Sidereal: Statistical adaptive generation of robust
locators for web testing. Softw. Test. Verif. Reliab. 31 (3), http://dx.doi.org/10.
1002/stvr.1767.
Leotta, M., Stocco, A., Ricca, F., Tonella, P., 2014. Reducing web test cases aging by
means of robust xpath locators. In: 25th IEEE International Symposium on Software
Reliability Engineering Workshops, ISSRE Workshops, Naples, Italy, November 3-
6, 2014. IEEE Computer Society, pp. 449–454. http://dx.doi.org/10.1109/ISSREW.
2014.17.
Leotta, M., Stocco, A., Ricca, F., Tonella, P., 2015. Using multi-locators to increase
the robustness of web test cases. In: 2015 IEEE 8th International Conference on
Software Testing, Verification and Validation (ICST). pp. 1–10. http://dx.doi.org/
10.1109/ICST.2015.7102611.
Leotta, M., Stocco, A., Ricca, F., Tonella, P., 2016. Robula+: An algorithm for generating
robust xpath locators for web testing. J. Softw.: Evol. Process 28 (3), 177–204.
http://dx.doi.org/10.1002/smr.1771.
Memon, A., 2008. Automatically repairing event sequence-based GUI test suites for
regression testing. ACM Trans. Softw. Eng. Methodol. 18 (2), http://dx.doi.org/10.
1145/1416563.1416564.
Montoto, P., Pan, A., Raposo, J., Bellas, F., López, J., 2011. Automated browsing in
AJAX websites. Data Knowl. Eng. 70 (3), 269–283. http://dx.doi.org/10.1016/j.
datak.2010.12.001.
Myllymaki, J., Jackson, J., 2002. Robust web data extraction with XML path expres-
sions. IBM Res. Rep. URL https://dominoweb.draco.res.ibm.com/reports/RJ10245.
pdf.
Nguyen, V., To, T., Diep, G., 2021. Generating and selecting resilient and maintainable
locators for web automated testing. Softw. Test. Verif. Reliab. 31 (3), http://dx.
doi.org/10.1002/stvr.1760.
Parr, T.J., 2004. Enforcing strict model-view separation in template engines. In:
Proceedings of the 13th International Conference on World Wide Web. WWW
’04, Association for Computing Machinery, New York, NY, USA, pp. 224–233.
http://dx.doi.org/10.1145/988672.988703.
Pirzadeh, H., Shanian, S., 2014. Resilient user interface level tests. In: Crnkovic, I.,
Chechik, M., Grünbacher, P. (Eds.), ACM/IEEE International Conference on Auto-
mated Software Engineering, ASE ’14, Vasteras, Sweden - September 15 - 19, 2014.
ACM, pp. 683–688. http://dx.doi.org/10.1145/2642937.2642954.
Rafi, D.M., Moses, K.R.K., Petersen, K., Mäntylä, M.V., 2012. Benefits and limitations
of automated software testing: Systematic literature review and practitioner survey.
In: 2012 7th International Workshop on Automation of Software Test (AST). pp.
36–42. http://dx.doi.org/10.1109/IWAST.2012.6228988.
Ricca, F., Leotta, M., Stocco, A., 2019. Chapter three - three open problems in the
context of E2E web testing and a vision: NEONATE. Adv. Comput. 113, 89–133.
http://dx.doi.org/10.1016/bs.adcom.2018.10.005.
Stocco, A., Yandrapally, R., Mesbah, A., 2018a. Vista: web test repair using computer
vision. In: Leavens, G.T., Garcia, A., Pasareanu, C.S. (Eds.), Proceedings of the 2018
ACM Joint Meeting on European Software Engineering Conference and Symposium
on the Foundations of Software Engineering, ESEC/SIGSOFT FSE 2018, Lake Buena
Vista, FL, USA, November 04-09, 2018. ACM, pp. 876–879. http://dx.doi.org/10.
1145/3236024.3264592.
Stocco, A., Yandrapally, R., Mesbah, A., 2018b. Visual web test repair. In: Leavens, G.T.,
Garcia, A., Pasareanu, C.S. (Eds.), Proceedings of the 2018 ACM Joint Meeting on
European Software Engineering Conference and Symposium on the Foundations
of Software Engineering, ESEC/SIGSOFT FSE 2018, Lake Buena Vista, FL, USA,
November 04-09, 2018. ACM, pp. 503–514. http://dx.doi.org/10.1145/3236024.
3236063.
Thummalapenta, S., Devaki, P., Sinha, S., Chandra, S., Gnanasundaram, S., Na-
garaj, D.D., Kumar, S., Kumar, S., 2013. Efficient and change-resilient test
automation: An industrial case study. In: 2013 35th International Conference
on Software Engineering (ICSE). pp. 1002–1011. http://dx.doi.org/10.1109/ICSE.
2013.6606650.
Thummalapenta, S., Singhania, N., Devaki, P., Sinha, S., Chandra, S., Das, A.K.,
Mangipudi, S., 2012. Efficiently scripting change-resilient tests. In: Tracz, W., Ro-
billard, M.P., Bultan, T. (Eds.), 20th ACM SIGSOFT Symposium on the Foundations
of Software Engineering (FSE-20), SIGSOFT/FSE’12, Cary, NC, USA - November 11
- 16, 2012. ACM, p. 41. http://dx.doi.org/10.1145/2393596.2393643.
Wohlin, C., Runeson, P., Höst, M., Ohlsson, M.C., Regnell, B., 2012. Experimentation
in software engineering. Springer, http://dx.doi.org/10.1007/978-3- 642-29044- 2.
Yandrapally, R., Thummalapenta, S., Sinha, S., Chandra, S., 2014. Robust test automa-
tion using contextual clues. In: Proceedings of the 2014 International Symposium
on Software Testing and Analysis. In: ISSTA 2014, Association for Computing
Machinery, New York, NY, USA, pp. 304–314. http://dx.doi.org/10.1145/2610384.
2610390.
Yeh, T., Chang, T.-H., Miller, R.C., 2009. Sikuli: Using GUI screenshots for search and
automation. In: Proceedings of the 22nd Annual ACM Symposium on User Interface
Software and Technology. UIST ’09, Association for Computing Machinery, New
York, NY, USA, pp. 183–192. http://dx.doi.org/10.1145/1622176.1622213.
Marco De Luca is currently a Ph.D. Student at the Department of Electrical Engineering
and Information Technology of the University of Napoli Federico II, Italy. His main
interests regard software processes, software quality, testing techniques.
The Journal of Systems & Software 210 (2024) 111932
16
M. De Luca et al.
Anna Rita Fasolino is an Associate Professor at the Department of Electrical Engi-
neering and Information Technology of the University of Napoli Federico II, Italy. She
was previously an Assistant Professor at the University of Bari, Italy. She received her
M.S. Laurea degree in Electronic Engineering and a Ph.D. in Electronic and Computer
Engineering at the University of Naples Federico II.
Her research interests are in the area of software engineering with a focus
on software testing, mobile app testing, reverse engineering, Web engineering, and
embedded software engineering. In such fields she developed and participated in
numerous R&D projects and co-authored more than 100 articles in peer-reviewed
international journals, books, and proceedings of conferences and workshops. She has
won distinguished paper awards at ICSM 2002 and ICSM 2012 for her work on Web
testing and Automated Mobile app GUI testing. Anna Rita serves as a member of
the program committee for several conferences in software engineering and is co-
organizer of special issues and workshops related to testing of event-based software. She
co-chaired the Doctoral Symposium at the 10th IEEE International Conference on
Software Testing, Verification and Validation (ICST 2017). Anna Rita is an academic
editor of the Journal of Systems and Software, of PeerJ Computer Science open access
journal, and of the Computers Journal MDPI.
Porfirio Tramontana is an Associate Professor at the Department of Electrical Engi-
neering and Information Technology of the University of Napoli Federico II, Italy. His
main research interests fall mainly in the field of software engineering and include au-
tomation of reverse engineering, reuse, reengineering, migration, maintenance models,
testing, quality assessment, semantic interoperability, in particular in the contexts of
Web applications, Web services and mobile applications. He has served on numerous
editorial committees of international conferences and journals, in the roles of reviewer,
program committee member, program chair and associate editor.
ResearchGate has not been able to resolve any citations for this publication.
Chapter
Full-text available
The growth of the web has been unstoppable in the last decade, which leads to an increasing demand for extracting information from it. Apart from the need to extract information, this growth also has brought the necessity to adapt web pages to user requirements, create annotations or test web applications. Due to the evolution of web pages, the complexity of the implementation of these techniques has increased. Being able to test, annotate, adapt and extract information from web pages correctly and efficiently has become a primary task. In order to perform all these tasks, it is mandatory to have the best mechanisms to effectively and unequivocally locate the desired elements throughout the web page life cycle, especially when a web page evolves. Different mechanisms are used to find web nodes. These mechanisms, called locators, are prone to fail over time owing to changes on websites. Many authors improve life expectancy of locators developing algorithms that use different types of locators. Some others have created algorithms that regenerate locators by saving extra information from the previous structure of the website. These algorithms extend the useful life of locators but their computational and storage cost is much higher. To avoid these problems, we have designed an algorithm that employs an attribute system embedded in the HTML code. The algorithm is able to regenerate the locators based on these attributes every time a single change takes place in a given element attribute. The evaluation of the proposal shows a much lower computational cost than in previous works.
Conference Paper
Full-text available
The growth of the web has been unstoppable in the last decade, which leads to an increasing demand for extracting information from it. Apart from the need to extract information, this growth also has brought the necessity to adapt web pages to user requirements, create annotations or test web applications. Due to the evolution of web pages, the complexity of the implementation of these techniques has increased. Being able to test, annotate, adapt and extract information from web pages correctly and efficiently has become a primary task. In order to perform all these tasks, it is mandatory to have the best mechanisms to effectively and unequivocally locate the desired elements throughout the web page life cycle, especially when a web page evolves. Different mechanisms are used to find web nodes. These mechanisms, called locators, are prone to fail over time owing to changes on websites. Many authors improve life expectancy of locators developing algorithms that use different types of locators. Some others have created algorithms that regenerate locators by saving extra information from the previous structure of the website. These algorithms extend the useful life of locators but their computational and storage cost is much higher. To avoid these problems, we have designed an algorithm that employs an attribute system embedded in the HTML code. The algorithm is able to regenerate the locators based on these attributes every time a single change takes place in a given element attribute. The evaluation of the proposal shows a much lower computational cost than in previous works.
Article
Full-text available
Evidence from empirical studies suggests that mobile applications are not thoroughly tested as their desktop counterparts. In particular, GUI testing is generally limited. Like web-based applications, mobile apps suffer from GUI testing fragility, i.e., GUI test classes failing or needing interventions because of modifications in the AUT or in its GUI arrangement and definition. The objective of our study is to examine the diffusion of test classes created with a set of popular GUI Automation Frameworks for Android apps, the amount of changes required to keep test classes up to date, and the amount of code churn in existing test suites, along with the underlying modifications in the AUT that caused such modifications. We defined 12 metrics to characterize the evolution of test classes and test methods, and a taxonomy of 28 possible causes for changes to test code. To perform our experiments, we selected six widely used open-source GUI Automation Frameworks for Android apps. We evaluated the diffusion of the tools by mining the GitHub repositories featuring them, and computed our set of metrics on the projects. Applying the Grounded Theory technique, we then manually analyzed diff files of test classes written with the selected tools, to build from the ground up a taxonomy of causes for modifications of test code. We found that none of the considered GUI automation frameworks achieved a major diffusion among open-source Android projects available on GitHub. For projects featuring tests created with the selected frameworks, we found that test suites had to be modified often – specifically, about 8% of developers’ modified LOCs belonged to test code and that a relevant portion (around 50% on average) of those modifications were induced by modifications in GUI definition and arrangement. Test code written with GUI automation fromeworks proved to need significant interventions during the lifespan of a typical Android open-source project. This can be seen as an obstacle for developers to adopt this kind of test automation. The evaluations and measurements of the maintainance needed by test code wrtitten with GUI automation frameworks, and the taxonomy of modification causes, can serve as a benchmark for developers, and the basis for the formulation of actionable guidelines and the development of automated tools to help mitigating the issue.
Article
By ensuring adequate functional coverage, End‐to‐End (E2E) testing is a key enabling factor of continuous integration. This is even more true for web applications, where automated E2E testing is the only way to exercise the full stack used to create a modern application. The test code used for web testing usually relies on DOM locators, often expressed as XPath expressions, to identify the web elements and to extract the data checked in assertions. When applications evolve, the most dominant cost for the evolution of test code is due to broken locators, which fail to locate the target element in the novel versions and must be repaired. In this paper, we formulate the robust XPath locator generation problem as a graph exploration problem, instead of relying on ad‐hoc heuristics as the one implemented by the state of the art tool robula+. Our approach is based on a statistical adaptive algorithm implemented by the tool sidereal, which outperforms robula+'s heuristics in terms of robustness by learning the potential fragility of HTML properties from previous versions of the application under test. sidereal was applied to six applications and to a total of 611 locators and was compared against two baseline algorithms, robula+ and Montoto. The adoption of sidereal results in a significant reduction of the number of broken locators (respectively ‐55% and ‐70%). The time for generating such robust locators was deemed acceptable being in the order of hundredths of second. When web applications evolve the most dominant cost for the evolution of test code, used for automated web testing, is often due to broken locators, which fail to locate the target element in the novel version. In this paper, we propose Sidereal, a statistical adaptive locator generator based on weights learnt on the history of the target application which outperforms state‐of‐the‐art Robula+ generator in terms of robustness.
Article
Web user interface (UI) test automation strategies have been dominated by programmable and record–playback approaches. Of these, record–playback allows creating automation tests easily and reduces the cost of test generation. However, this approach increases the cost of test maintenance due to its unstable generated locators for identifying UI objects during playback. In this paper, we propose a new approach to generating and selecting resilient and maintainable locators. Our approach consists of two parts, a new XPath construction method and selecting the best XPath to locate the target element. Our XPath construction method relies on semantic structures of Web pages to locate the target element using its neighbors. We conducted an experiment on 15 popular websites. The results show that our approach outperforms the state‐of‐the‐practice/art Selenium IDE and Robula+ in locating target elements by effectively avoiding wrong locators. It also produces more readable XPaths (hence more maintainable tests) than do these approaches. This paper presents a new approach to generating and selecting resilient and maintainable locators for Web UI test automation. The approach consists of a method to construct locators of a Web element based on its neighbors and an algorithm to select resilient locators. Our experiment using 2,293 UI elements of 15 popular websites shows the proposed approach outperforming the state‐of‐the‐practice/art Selenium IDE and Robula+ in reducing false positives in locating target elements during automation test execution.
Article
Exploratory testing and fully automated testing tools represent two viable and cheap alternatives to traditional test-case-based approaches for graphical user interface (GUI) testing of Android apps. The former can be executed by capture and replay tools that directly translate execution scenarios registered by testers in test cases, without requiring preliminary test-case design and advanced programming/testing skills. The latter tools are able to test Android GUIs without tester intervention. Even if these two strategies are widely employed, to the best of our knowledge, no empirical investigation has been performed to compare their performance and obtain useful insights for a project manager to establish an effective testing strategy. In this paper, we present two experiments we carried out to compare the effectiveness of exploratory testing approaches using a capture and replay tool (Robotium Recorder) against three freely available automatic testing tools (AndroidRipper, Sapienz, and Google Robo). The first experiment involved 20 computer engineering students who were asked to record testing executions, under strict temporal limits and no access to the source code. Results were slightly better than those of fully automated tools, but not in a conclusive way. In the second experiment, the same students were asked to improve the achieved testing coverage by exploiting the source code and the coverage obtained in the previous tests, without strict temporal constraints. The results of this second experiment showed that students outperformed the automated tools especially for long/complex execution scenarios. The obtained findings provide useful indications for deciding testing strategies that combine manual exploratory testing and automated testing.
Conference Paper
Context: Different approaches exist for automated GUI testing of Android applications, each with its peculiarities, advantages, and drawbacks. The most common are either based on the structure of the GUI or use visual recognition. Goal: In this paper, we present an empirical evaluation of two different GUI testing techniques with the use for each of a representative tool: (1) Visual GUI testing, with the use of EyeAutomate, and (2) Layout-based GUI testing, with the use of Espresso. Method: We conducted an experiment with a population of 78 graduate students. The participants of the study were asked to create the same test suite for a popular, open-source Android app (Omni-Notes) with both the tools, and to answer a survey about their preference to the one or the other, and the perceived difficulties when developing the test scripts. Results: By analyzing the outcomes of the delivered test suites (in terms of number of test scripts delivered and ratio of working ones) and the answers to the survey, we found that the participants showed similar productivity with both the tools, but the test suites developed with EyeAutomate were of higher quality (in terms of correctly working test scripts). The participants expressed a slight preference towards the EyeAutomate testing tool, reflecting a general complexity of Layout-based techniques -- represented by Espresso -- and some obstacles that may make the identification of components of the GUI quite a long and laborious task. Conclusions: The evidence we collected can provide useful hints for researchers aiming at making GUI testing techniques for mobile applications more usable and effective.
Chapter
Web applications are critical assets of our society and thus assuring their quality is of undeniable importance. Despite the advances in software testing, the ever-increasing technological complexity of these applications makes it difficult to prevent errors. In this work, we provide a thorough description of the three open problems hindering web test automation: fragility problem, strong coupling and low cohesion problem, and incompleteness problem. We conjecture that a major breakthrough in test automation is needed, because the problems are closely correlated, and hence need to be attacked together rather than separately. To this aim, we describe NEONATE, a novel integrated testing environment specifically designed to empower the web tester. Our utmost purpose is to make the research community aware of the existence of the three problems and their correlation, so that more research effort can be directed in providing solutions and tools to advance the state of the art of web test automation.