Preprint
What’s in a User? Towards Personalising Transparency For
Music Recommender Interfaces
Martijn Millecamp
Department of Computer Science, KU Leuven
Leuven, Belgium
martijn.millecamp@cs.kuleuven.be
Nyi Nyi Htun
Department of Computer Science, KU Leuven
Leuven, Belgium
nyinyi.htun@cs.kuleuven.be
Cristina Conati
Department of Computer Science, UBC
Vancouver, Canada
conati@cs.ubc.ca
Katrien Verbert
Department of Computer Science, KU Leuven
Leuven, Belgium
katrien.verbert@cs.kuleuven.be
ABSTRACT
We have become increasingly reliant on recommender systems to help us make decisions in our daily lives. As such, it is becoming essential to explain to users how these systems reason, to enable them to correct system assumptions and to trust the system. The advantages of explaining the recommendation process have been shown by a vast amount of research. Additionally, previous studies showed that personality affects users' attitudes, tastes and information processing. However, it is still unclear whether personality has an impact on the way users process and perceive explanations. In this paper, we report the results of a study that investigated how personal characteristics relate to the perception and the gaze pattern of a music recommender interface in the presence and absence of explanations. We investigated differences with respect to Need For Cognition, Musical Sophistication and the Big Five personality traits. The results show empirical evidence of differences related to Musical Sophistication and Openness in both perception and gaze pattern. We found that users with a high Musical Sophistication and a low Openness score benefit the most from explanations.
CCS CONCEPTS
• Human-centered computing → User studies; Information visualization; User models; User interface design; Visualization design and evaluation methods; • Social and professional topics → User characteristics; • Information systems → Personalization; Recommender systems.
KEYWORDS
recommender system; explanations; personal characteristics; music;
user characteristics; Big Five; Openness; Musical Sophistication
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
UMAP ’20, July 14–17, 2020, Genoa, Italy
©2020 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-6861-2/20/07. . . $15.00
https://doi.org/10.1145/3340631.3394844
ACM Reference Format:
Martijn Millecamp, Nyi Nyi Htun, Cristina Conati, and Katrien Verbert.
2020. What’s in a User? Towards Personalising Transparency For Music
Recommender Interfaces. In Proceedings of the 28th ACM Conference on
User Modeling, Adaptation and Personalization (UMAP ’20), July 14–17, 2020,
Genoa, Italy. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3340631.3394844
1 INTRODUCTION
Recommender systems (RS) have permeated our society to the extent that they influence most of our daily activities [1, 64]. For instance, RS help us choose which music we listen to, which things we buy or even which jobs we apply for [9, 53]. To make these systems more effective, researchers and practitioners are becoming increasingly aware that the effectiveness of RS goes beyond the accuracy of recommendation algorithms [29, 48]. Additionally, it has been shown that increasing trust is one of the key factors to increase effectiveness [44, 67].
One of the elements that inuences trust is the ability to under-
stand the internal reasoning of the system so users do not have
to follow recommendations in blind faith [
61
]. A popular way to
support such understanding of the internal reasoning is to enable
users to reason about the RS by providing explanations [
44
,
67
],
and then enable them to control and correct system assumptions
where needed [67].
Additionally, despite increasing interest in incorporating personality in RS [66], there are only a few studies that investigate the influence of personal characteristics on the perception of explanations in RS [53, 57]. Moreover, although eye tracking is an established way to evaluate user interfaces [17, 33], there have been only a few studies that investigate the gaze patterns of users in recommender systems [60].
In this paper, we address these gaps by researching the influence of several personal characteristics on the perception of a music recommender interface. More specifically, we conducted a within-subject study (N=30) in which users worked with and rated two versions of a music RS, one with and one without explanations. During the experiment, we used an eye tracker to collect gaze data.
The objective of the study was to answer the following research question: In which way do personal characteristics influence the perception of explanations? The specific personal characteristics we investigated are Musical Sophistication, Need For Cognition, and
the Big Five personality traits. These characteristics are discussed in
detail in Section 2, together with the motivation to research them.
The results of a random intercept mixed model analysis show empirical evidence that in a music RS, certain perceptions such as Use Intention, Novelty and Decision Support depend on 1) the presence of explanations and 2) personal characteristics such as Musical Sophistication and Openness. The analysis of the eye-tracking data shows empirical evidence that personal characteristics influence the way users look at recommendations in the presence or absence of explanations.
The main contribution of this paper is threefold:
First, we identify the influence of personal characteristics on explanations in recommender systems.
Second, we provide empirical evidence on dependencies between personal characteristics and the perception of RS interfaces in the presence and absence of explanations.
Third, we provide empirical evidence on dependencies between personal characteristics and gaze patterns in RS interfaces in the presence and absence of explanations.
2 RELATED WORK
2.1 Explainable Recommendations
With the increasing popularity of RS in entertainment and online shopping services, there have been calls to design transparent systems that promote users' trust in recommendations [30]. Thus, previous work in the RS domain has begun to focus on providing explanations of why the system recommends certain items [29].
In their survey, Tintarev and Masthoff [67] found that the types of explanation provided by existing RS fall into three categories: content-based explanation (e.g. "We have recommended X because you liked Y"), collaborative-based explanation (e.g. "People who liked X also liked Y") and preference-based explanation (e.g. "Your interests suggest that you would like X"). A number of researchers have also looked into visual ways of providing explanations. Interactive visualisations, for instance, have an advantage as they further allow users to directly manipulate recommender components in order to steer the way recommendations are presented [35].
O'Donovan et al. [59] designed a movie RS, PeerChooser, to provide users with a visual explanation of the recommendation process and the ability to steer recommendations by manipulating the weights of the recommender algorithm.
In the music recommender domain, Jin et al. [36] designed a Spotify-based music RS and investigated the effects of personal characteristics (PC) on different levels of controllability. However, the work of Jin et al. focused only on providing a user-steerable interface rather than explanations. In more similar research, Millecamp et al. [53] used visualisations such as bar charts and scatter plots to explain the attributes of recommended songs in a music RS. Our approach is similar to that of Millecamp et al. [53] as we also used bar charts, but there are some differences, which are explained further in the Interface section.
The work of both Jin et al. [36] and Millecamp et al. [53] also found that characteristics of individual users (e.g. cognitive ability, experience, etc.) can in fact influence the perception of interface components in a music RS. To gain a better understanding of the reasons for this influence, as well as potential design implications for supporting personalised transparency, we used a comprehensive list of PC and eye-tracking measurements. In the following sections, we discuss these in detail.
2.2 Personal Characteristics
A number of previous works in HCI [15, 16, 65, 68] have studied the influence of PC on visualisations and interactive systems. In the context of RS, personality-based RS are a growing area of study [66]. However, most of this research focuses on incorporating personality in the algorithm to provide better recommendations [58], to solve the cold-start problem [32], or to find a balance between similarity and diversity [13, 66]; only a limited amount of research focuses on the effect of personality on transparency. Naveed et al. [57] investigated the effect of cognitive style on different forms of explanation and found that intuitive thinkers rely more on explanations in complex situations. In the context of music RS, Millecamp et al. [53] investigated the effect of personality on the perception of explanations, but in contrast to this work they did not investigate the effect of the Big Five personality traits, and they did not use eye tracking as it was an online study. In the following sub-sections, we describe each of the PCs we investigated in detail.
2.2.1 Need for Cognition. Multiple studies have already shown that cognitive style can impact the perception of RS. Tong et al. [70] showed that users with a more rational cognitive style are more willing to use a recommender system. Other studies [53, 55] showed that there is an interaction effect between the presence of explanations and cognitive style in the music domain.
A well-established questionnaire to measure rational thinking style is the 18-item questionnaire of Cacioppo et al. [8]. This questionnaire measures Need for Cognition (NFC), a measure of the tendency of an individual to engage in and enjoy effortful cognitive activities [8]. Previous studies have shown that users with a high NFC enjoy solving complex problems [4] and are more likely to seek out and elaborate on information [50]. Conversely, low-NFC users are less motivated to study a message in depth [51].
2.2.2 Musical Sophistication. Various studies have shown the influence of users' experience when interacting with RS. Experience can, for example, determine a user's choice of interaction methods [39]. In the music recommender domain, Kamehkhosh and Jannach [38] discovered that users' familiarity with a recommended song influences their choice, i.e. users tend to like a recommendation when they already know the song. The Goldsmiths Musical Sophistication Index (Gold-MSI) 1 is regarded as an effective way to measure the music expertise of users, and has shown a strong correlation with individuals' music preference [56], listening behaviour [21] and interaction with visual elements in recommender interfaces [53, 54]. Therefore, we used the Gold-MSI to measure the Musical Sophistication (MS) of participants.
2.2.3 Big Five personality traits. Today, the Big Five personality model [26] is one of the most favoured scientific structural representations of personality attributes [63]. The taxonomy describes a personality along five different factors: Openness, Conscientiousness, Extraversion, Agreeableness and Neuroticism [26]. The definitions of these different factors are summarised in Table 1.
1 https://www.gold.ac.uk/music-mind-brain/gold-msi/ June 2019
In the eld of RS, this taxonomy has been used by Tkalcic et al.
[
69
] to calculate the user similarity for collaborative ltering RS.
Additionally, in the eld of music RS, it has been shown that the
Big Five personality traits can have an inuencing factor for users’
preferences in [
14
,
22
,
23
]. Moreover, Berkovsky et al. [
6
] were
able to predict the personality of a user in this taxonomy based on
eye tracking features. In this paper, we used the 44-item Big Five
Inventory to measure these traits [27, 37].
Trait Denition
Extraversion
summarises traits related to activity
and energy, dominance, sociability,
expressiveness and positive emotions.
Agreeableness
contrasts a pro social orientation
towards others with antagonism and
includes traits such as altruism,
tendermindedness, trust and modesty.
Conscientiousness
describes socially prescribed impulse
control that facilitates task- and
goal-directed behavior.
Neuroticism
contrasts emotional stability with
a broad range of negative aects, including
anxiety, sadness, irritability
and nervous tension.
Openness
describes the breadth, depth, and
complexity of an individual’s mental
and experiential life.
Table 1: Denitions of the Big Five personality traits accord-
ing to [5]
2.3 Eye Tracking for User Evaluations
In HCI, eye tracking has been predominantly used either for usability evaluation or as an input device to interact with user interfaces [17, 33], and in the domain of RS, eye tracking has predominantly been used for the same purposes. For instance, Giordano et al. [24] proposed a RS that captures a user's interests through an eye tracker rather than traditional keyboard and mouse inputs, which require more effort. Evaluation results showed that this system enhanced the navigation experience of users. Additionally, Chen and Pu [11, 12] studied the effectiveness of recommender interface layouts in influencing users' decision-making strategies by analysing their eye movements. They repeatedly found particular layouts that can significantly attract users' attention. This study showed that analysing eye-tracking data has the potential to improve a RS interface by gaining a deeper insight into the behaviour of end users. This is the reason that in our work we analysed eye-tracking data to gain a better understanding of the way users perceive explanations in a RS interface, which, to the best of our knowledge, has not been researched before.
To analyse the eye-tracking data, Goldberg and Helfman [25] compiled a list of different metrics that can be defined from eye-tracking measures. One of the most widely used metrics is the percentage and the number of fixations within each area of interest (AOI), as this gives an overall summary of visual attention within spatial areas.
Additionally, Duchowski [18] proposed more advanced eye movement analyses such as gaze transition entropy to compare scan paths and fixation sequences. The concept of gaze transition entropy was first introduced by Ellis and Stark [20] as a practical eye-tracking measure that enables the quantitative comparison of sequential gaze patterns. They showed that small entropy values reflect dependencies between fixations, whereas large entropy values suggest a random scanning pattern. Since the introduction of entropy, other studies have shown that this metric is a viable measure for comparing scan paths and fixation sequences [19]. Thus, in this study, we analysed the eye-tracking data by counting fixations on different AOIs and by calculating the transition and stationary entropy. In Section 4.3, detailed explanations of these metrics are provided.
3 INTERFACE
To investigate the eect of explanations, we designed two dierent
versions of a music recommender system, one with explanations
(Exp) and one without (Base). In the following paragraphs, we will
explain in detail the features of both interfaces.
Both interfaces are built on top of a music RS which uses the
Spotify API to recommend songs. This API generates up to 50 rec-
ommendations and takes as input a certain artist and a list preferred
ranges for several audio features 2.
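As an illustration, the slider ranges can be mapped onto the query parameters of this endpoint as sketched below; build_params is our own hypothetical helper, while the seed_artists, limit and min_/max_ parameter names follow the Spotify Web API reference in the footnote:

```python
# Sketch: map an artist and the users' slider ranges onto query parameters
# for Spotify's GET /v1/recommendations endpoint. build_params is a
# hypothetical helper, not part of the study's released code.
def build_params(artist_id, feature_ranges, limit=50):
    """feature_ranges maps an audio feature to its (min, max) slider
    values, e.g. {"energy": (0.2, 0.8)}."""
    params = {"seed_artists": artist_id, "limit": limit}
    for feature, (low, high) in feature_ranges.items():
        params["min_" + feature] = low
        params["max_" + feature] = high
    return params

params = build_params("ARTIST_ID", {"energy": (0.2, 0.8)})
# The dict can then be sent with any HTTP client, together with an OAuth
# bearer token, to https://api.spotify.com/v1/recommendations
```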
As shown in part A of Figure 1, the first step for users is to search for any artist they like. Selected artists are represented with their name, picture and a set of audio features, as presented in part B of Figure 1. The representation of the audio features, shown in part C of this figure, is created from the minimum and maximum values of each audio feature over the 5 most popular songs of the artist. As shown in part D of Figure 1, users can use sliders to steer the recommendation process. For each audio feature, users can adjust their preference by altering both the minimum and the maximum value. If they needed a reminder of the meaning of a feature, they could hover over the ? to see the definition of the audio feature. To avoid users forgetting the task, we repeated the task instructions in the centre of the interface, as seen in part E of Figure 1. Below this task, a total of 12 recommendations were given, as shown in part F of Figure 1. All recommendations were presented with the cover of the album (part G) and the possibility to like or dislike a song (part I). Once users liked or disliked a recommendation, it appeared in the list on the right side of the interface, as shown in part K of Figure 1. Liked songs were added automatically to the playlist; disliked songs were prevented from appearing again as recommendations. In both interfaces, a play button appeared when the user hovered over the cover of a song (part J), enabling participants to listen to a preview of the song 3.
2 https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/, https://developer.spotify.com/documentation/web-api/reference/browse/get-recommendations/
3 https://developer.spotify.com/documentation/web-api/reference/tracks/get-track/
Figure 1: The Exp interface with the different parts highlighted in orange. A: Search box, B: Artist, C: Attributes of the artist, D: Preference of the user, E: Task, F: Recommendations, G: Cover of a song, H: Explanations, I: (Dis)like buttons, J: Play button, K: List of (dis)liked songs
To explain why a song is recommended, we followed a similar approach to Millecamp et al. [53, 55], as we also explained recommendations by showing the preference of the user and the value of the song. However, the way the explanations are visualised is different: in this study, the explanations are always visible rather than hidden behind a button, to facilitate the eye tracking. As shown in part H of Figure 1, explanations show the minimum and the maximum value the user selected for a particular audio feature as a bar in the same colour as the slider of that feature. On top of that bar, the exact value of the song is shown by a white line. This part is only shown in Exp and not in Base.
4 METHOD
To investigate the impact of personal characteristics on the perception of RS interfaces in the presence or absence of explanations, we conducted a lab study (N=30) with a within-subject design to evaluate two different RS interfaces. The participants, the study procedure as well as the measurements are described in detail below.
4.1 Participants
All participants in this study were recruited through flyers, e-mailing lists or social media. A total of 32 users participated, but due to technical issues we removed 2 of them, resulting in 30 valid participants (9 female). The study took on average 60 minutes to complete, and participants were rewarded with a movie ticket. The distribution of the participants across the user characteristics is listed in Table 2.
Personal Characteristic Possible Range Mean Score (SD)
Age 18-65 24 (2.47)
Musical sophistication 18-126 60.47 (16.46)
Need for cognition 0-100 66.48 (13.46)
Extraversion 0-100 54.88 (20.64)
Agreeableness 0-100 71.39 (13.17)
Conscientiousness 0-100 62.50 (18.66)
Neuroticism 0-100 45.83 (20.94)
Openness 0-100 56.08 (7.03)
Table 2: An overview of personal characteristics measured,
together with their highest and lowest possible scores and
summary statistics for the scores of the participants
4.2 Study Procedure
The experiment started with an initial phase in which users filled in an informed consent form and a questionnaire to gather information regarding their personal characteristics. After finishing this initial phase, the preparation phase started, consisting of 1) calibration with the Tobii 4C eye tracker 4, 2) an explanation of the audio features used by the RS, and 3) the selection of relevant features by participants.
4 Using the Tobii Core Software
The dierent audio features (acousticness, danceability, energy,
instrumentalness, popularity, tempo and valence) were selected
based on their popularity in a similar study [
54
]. To make sure that
users understood these features, they were given as much time as
they needed to understand individual features. For each feature, a
textual denition was provided together with three example songs
which users could listen to. The rst example song represents a
low value for the given feature, the second a medium value and
the last one a high value
5
. Users could take as much time as they
needed to understand the features. Once they were ready, they were
instructed to select between three and six features from the initial
set of seven features that were most important to them and would
be used in their RS interface. Next, participants started working
with two versions of the RS, one with and one without explanations
were presented. For each of the interfaces, they had to complete
three phases:
Users were asked to explore the interface until they understood
all of the functionalities of the interface to avoid an order and a
learning eect. Before users could continue, they needed to search
for at least one artist and like or dislike at least one song. For the
experiment task, users had to create a playlist of ve songs for
either a sports, a fun or a relaxing activity which are three of the
most common situations of which people look for music [
28
]. After
each experimental task, users were asked to ll out a post-task
questionnaire on the experience with the version of the system
they just used - detailed in Section 4.3.
4.3 Measurements
4.3.1 User experience. We measured user experience via eight subjective system aspects: novelty, recommendation effectiveness, choice satisfaction (CS), confidence, decision support (DS), trust, understanding and use intention [40, 42]. To measure these aspects, we asked users to rate 17 questions on a 5-point Likert scale. These questions are based on previous research [7, 10, 41, 53, 55, 60] and are presented in Table 3.
4.3.2 Gaze paern. In order to measure the gaze pattern of the
users, we used a Tobii 4C remote eye-tracker installed on a 27"
display, with a sampling rate of 90 Hz. Fixations were detected by
an implementation of the popular ID-T algorithm [
62
] with a disper-
sion threshold of 1
°
and a duration threshold of 100ms [
18
,
46
,
47
,
62
].
After identifying the xations, we categorised them in dierent
AOIs. As Figure 1 shows, the interface contains three major parts
in which we are interested and which we dened as AOI: the artists
(part B), the preference of the user (part D) and the recommenda-
tions (part F). For Exp, we identied an additional AOI, namely the
explanations of the recommendation (part H). For both the analy-
sis of xations on recommendations and on explanations, we rst
dened an AOI for each recommendation/explanation individually
after which we aggregated all of these areas.
To obtain a metric for distribution of visual attention, we anal-
ysed for each AOI the percentage of xations as suggested by Gold-
berg and Helfman [25].
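As an illustration of dispersion-based fixation detection, a simplified sketch of the I-DT idea with the thresholds above (not the actual implementation used in the study) could look as follows:

```python
# Simplified I-DT sketch: samples are (t_ms, x_deg, y_deg) gaze points
# in degrees of visual angle; thresholds follow the study (1 deg, 100 ms).
def dispersion(window):
    """Dispersion of a window: (max x - min x) + (max y - min y)."""
    xs = [s[1] for s in window]
    ys = [s[2] for s in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def idt_fixations(samples, dispersion_deg=1.0, min_dur_ms=100.0):
    """Return fixations as (t_start, t_end, centroid_x, centroid_y)."""
    fixations = []
    i, n = 0, len(samples)
    while i < n:
        j = i
        # Grow the window until it covers the minimum duration.
        while j < n and samples[j][0] - samples[i][0] < min_dur_ms:
            j += 1
        if j >= n:
            break
        if dispersion(samples[i:j + 1]) <= dispersion_deg:
            # Extend the window while dispersion stays under threshold.
            while j + 1 < n and dispersion(samples[i:j + 2]) <= dispersion_deg:
                j += 1
            xs = [s[1] for s in samples[i:j + 1]]
            ys = [s[2] for s in samples[i:j + 1]]
            fixations.append((samples[i][0], samples[j][0],
                              sum(xs) / len(xs), sum(ys) / len(ys)))
            i = j + 1
        else:
            i += 1
    return fixations
```

At a 90 Hz sampling rate (about 11 ms per sample), a run of samples within 1° lasting at least 100 ms is collapsed into one fixation at its centroid.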
5 https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/
Metric: Question(s)
Recommender Effectiveness:
The songs recommended to me match my interests.
The recommender helped me find good songs for my playlist.
Perceived Novelty:
The recommender system helped me discover new songs.
I knew most of the songs already.*
Good Understanding:
I understood why the songs were recommended to me.
The songs recommended to me had similar attributes to my preference.
Use Intentions:
I will use this recommender system again.
Choice Satisfaction:
Overall, I am satisfied with the system.
I think I chose the best song from the options.
I would recommend the chosen song to others.
I am satisfied with the song I have chosen.
Trust:
I trust the system to suggest good songs.
I prefer to look for myself than trusting this system.*
Confidence:
I am confident about the playlist I have created.
Decision Support:
The information provided for the recommended songs is sufficient for me to make a decision for my playlist. (Q1)
Overall, I found it difficult to decide which song(s) to select.* (Q2)
The provided information helped me to decide quickly. (Q3)
Table 3: An overview of the 5-point Likert scale post-task questionnaire designed to capture user perception. Questions with a star are reverse scored.
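Folding the starred, reverse-scored items into a metric score on this 5-point scale can be sketched as follows (the helper names are our own, not the paper's code):

```python
def score_item(raw, reverse=False, low=1, high=5):
    """Score one 5-point Likert response; starred items flip the scale,
    so a raw 5 on a reverse-scored item counts as 1."""
    return (low + high) - raw if reverse else raw

def scale_score(responses):
    """responses: list of (raw_answer, is_reverse_scored) pairs belonging
    to one metric, e.g. Decision Support's Q1-Q3."""
    scored = [score_item(raw, rev) for raw, rev in responses]
    return sum(scored) / len(scored)

# Decision Support example: Q1 = 4, Q2 (reverse scored) = 2, Q3 = 5
print(scale_score([(4, False), (2, True), (5, False)]))  # mean of 4, 4, 5
```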
To be able to compare the scan paths and fixation sequences, we computed both the gaze transition entropy and the stationary entropy following the implementation of Krejtz et al. [43]. Gaze transition entropy refers to the element of surprise in a gaze transition: maximum transition entropy is reached when the distribution of transitions is uniform for each AOI, and thus the transitions are independent of each other [43]. Stationary entropy is defined as the entropy of the stationary distribution, which refers to the distribution of fixations among the AOIs. A high stationary entropy means a more equally distributed visual attention, whereas a low entropy means that fixations tend to be concentrated on certain AOIs.
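These two measures can be sketched directly from the definitions above (a pure-Python illustration; the AOI labels and helper name are ours, and we use the empirical fixation proportions as the stationary distribution, in the spirit of Krejtz et al. [43]):

```python
import math
from collections import Counter

def gaze_entropies(aoi_sequence):
    """Return (stationary, transition) entropy in bits for a sequence
    of AOI labels, one label per fixation."""
    n = len(aoi_sequence)
    pi = {a: c / n for a, c in Counter(aoi_sequence).items()}  # proportions
    stationary = -sum(p * math.log2(p) for p in pi.values())

    transitions = Counter(zip(aoi_sequence, aoi_sequence[1:]))
    outgoing = Counter()                 # transitions leaving each AOI
    for (src, _dst), c in transitions.items():
        outgoing[src] += c
    transition = 0.0
    for (src, _dst), c in transitions.items():
        p = c / outgoing[src]            # row-normalised transition prob.
        transition -= pi[src] * p * math.log2(p)
    return stationary, transition

# Strict alternation between two AOIs: attention is evenly split
# (stationary entropy = 1 bit) but every transition is fully
# predictable (transition entropy = 0).
print(gaze_entropies(["rec", "exp", "rec", "exp"]))
```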
5 RESULTS
5.1 Perception
5.1.1 Interaction Eects. To answer the question in which way
personal characteristics inuence the perception of explanations,
we used random intercept models to study the relation between
the experience of the user on the one hand and various PC on the
other, and how that relation is aected by the interface. All models
included as xed eects the main eect of the PC, the main eect of
interface, and the interaction between both. Furthermore, a random
participant eect was added to account for the variability across
the participants [3, 47, 71]. Resulting in following model:
Experience PC inter f ace +(1|I d)(1)
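The paper does not state which statistical software was used; as a minimal sketch, a random intercept model of this form can be fitted on synthetic data with Python's statsmodels (the column names Id, pc, interface and experience are our own placeholders):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic long-format data: one row per participant x interface
# condition (0 = Base, 1 = Exp); "pc" stands in for a standardised
# personal-characteristic score such as Openness.
rng = np.random.default_rng(42)
rows = []
for pid in range(30):
    pc = rng.normal()
    for interface in (0, 1):
        experience = (3.0 + 0.3 * pc + 0.5 * interface
                      + 0.4 * pc * interface + rng.normal(scale=0.5))
        rows.append({"Id": pid, "pc": pc, "interface": interface,
                     "experience": experience})
data = pd.DataFrame(rows)

# Random-intercept model: experience ~ pc * interface + (1 | Id)
model = smf.mixedlm("experience ~ pc * interface", data, groups=data["Id"])
result = model.fit()
print(result.params["pc:interface"])   # estimated interaction effect
```

In R, the equivalent lme4-style formula would be `experience ~ pc * interface + (1 | Id)`.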
To calculate p-values and effect sizes, we used the Satterthwaite method for denominator degrees of freedom and F-statistics [2, 45, 49]. To account for family-wise error, the resulting p-values were adjusted using the Benjamini-Hochberg procedure [31]. The result of this analysis is shown in Table 4 and described in detail below.
The results are ordered by eect size, which are all between high
Preprint
(a) Interaction eect of Open-
ness with interface on Use In-
tention.
(b) Interaction eect of Open-
ness with interface on Novelty.
(c) Interaction eect of Musi-
cal Sophistication with inter-
face on Decision Support.
(d) Detailed analysis of the
questions on Decision Support
for high Musical Sophistica-
tion users. * means reverse
scored
Figure 2: Analysis of the interaction eects
and medium
6
. To visualise the direction of the interaction eects,
we performed a median split on the PC to divide users into a group
with a high and a low score.
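The Benjamini-Hochberg step-up adjustment mentioned above can be sketched compactly (our own minimal implementation, not the study's code):

```python
def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values in the original order:
    p_(k) * m / k for the k-th smallest of m p-values, made monotone
    from the largest rank downwards."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):        # walk from largest p to smallest
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
# ~ [0.02, 0.04, 0.04, 0.02] up to floating-point rounding
```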
DV Eect F(1,28) p r
Use Intention Interface * Openness 5.0101 .033 .58
Decision Support Interface * MS 17.553 .000 .56
Novelty Interface * Openness 4.9980 .034 .42
Table 4: Signicant interaction eects
5.1.2 Use Intention on Openness. As shown in Table 4, there is a significant interaction effect of Openness with interface on Use Intention. This indicates that the intention to use the interface again depends on the Openness score of the user and on the interface. Figure 2a shows the direction of this interaction effect based on a median split of Openness (median = 55). Post-hoc comparisons show that users with a low Openness score report a higher Use Intention in Exp than in Base (t=-2.071, p=.047).
6 We report statistical significance at the .05 level and report effect size as high for r > .5, medium for r > .3, and low otherwise.
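Such a post-hoc comparison (a median split on the PC, then a paired comparison of the two interface conditions within one group) could be sketched with SciPy on hypothetical data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Hypothetical per-participant data: one PC score plus a rating of the
# same aspect (e.g. Use Intention) in each interface condition.
openness = rng.uniform(30, 80, size=30)
base_scores = rng.normal(3.0, 0.6, size=30)
exp_scores = (base_scores
              + np.where(openness < np.median(openness), 0.5, 0.0)
              + rng.normal(0.0, 0.3, size=30))

low = openness < np.median(openness)                  # median split
t_stat, p_val = stats.ttest_rel(base_scores[low], exp_scores[low])
print(t_stat, p_val)   # a negative t means Exp was rated higher
```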
5.1.3 Decision Support on Musical Sophistication. As shown in Table 4, there is a significant interaction effect of Musical Sophistication with interface on Decision Support. Figure 2c shows the direction of this interaction effect based on a median split of MS (median = 64).
Post-hoc comparisons show that users with a high MS score indicated receiving a significantly higher Decision Support from Exp than from Base (t=-3.084, p=.005). Additionally, within the Exp interface itself, users with a high MS score indicated receiving a significantly higher Decision Support than those with a low MS score (t=2.44, p=.018). To get a better understanding of this result, we show in Figure 2d a box plot of the results of each of the three questions from the Decision Support metric for high-MS users. These questions are listed in Table 3 and discussed in Section 4.3.
This figure shows that for all three questions both the median and the mean are higher in Exp than in Base. The median and the mean are indicated as a green line and a + respectively. This means that for high-MS users the information in Exp was more sufficient for making a decision, the information helped them decide more quickly, and they found it easier to select a song in Exp.
5.1.4 Novelty on Openness. As shown in Table 4, there is a significant interaction effect of Openness with interface on Novelty. This indicates that the novelty of the songs users retrieved depends on the Openness score of the user and on the interface. Figure 2b shows the direction of this interaction effect based on a median split of Openness (median = 55). Post-hoc comparisons do not show significant differences, but the figure hints that low Openness users found more novel songs in Exp than in Base, with the opposite for high Openness users.
5.2 Eye tracking
As a follow up, we investigated whether there are dierences in
gaze pattern in both interfaces between users with a dierent MS
and Openness score.
5.2.1 Fixations on recommendations. To verify in which way PC
inuence the way users look at the interfaces of Exp and Base,
we investigated the percentage of xations on the three AOIs that
are common in the two interfaces. For each PC, we performed a
median split after which we compared for each group the dierence
in percentage of xations on the AOI between Exp and Base. This
test revealed two signicant dierences.
In Figure 3a, we see that the percentage of xations in Base is
higher than Exp for users with a low MS. A t-test revealed that this
dierence is signicant (t=2.1972, p=.036). Figure 3b shows that the
percentage of xation in Exp is higher than Base for users with a
high Openness. A t-test revealed that this dierence is signicant
(t=-2.184, p=.035).
5.2.2 Entropy. To be able to compare the sequential gaze patterns of different groups of users, we calculated the gaze transition entropy and the stationary entropy in both Base and Exp. As there is an additional AOI in Exp, it is not meaningful to compare entropy between Base and Exp.
Figure 3: Analysis of the fixations for Openness and Musical Sophistication in Exp and Base. (a) Relative number of fixations on recommendations for different levels of Musical Sophistication. (b) Relative number of fixations on recommendations for different levels of Openness.
We again performed a median split and tested for differences between the groups with high and low scores. This revealed a significant difference in transition entropy between the two levels of MS (F(1,28) = 4.556, p=.042) in Exp, as shown in Figure 4a.
A higher transition entropy means a more equal distribution of transitions between the different AOIs, while a lower transition entropy means a more careful viewing of AOIs [43]. Our results indicate that high MS users switched more regularly between the different AOIs, while low MS users stayed more focused on particular AOIs. To investigate this further, we show the transition matrices for the low and high MS groups as heatmaps in Figures 4b and 4c. In these figures, we can see that the transition matrices are very similar and that the biggest difference between the two groups was the transition after looking at their preferences. Specifically, high MS users spread their focus more equally among the AOIs than low MS users.
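Both entropy measures can be computed from an AOI transition-count matrix following the definitions of Krejtz et al. [43]. The sketch below is our own implementation of those definitions; the function name and normalisation details are our choices, not the paper's:

```python
import numpy as np

def gaze_entropies(transitions):
    """Stationary and gaze transition entropy of an AOI transition-count
    matrix, following Krejtz et al. [43]. transitions[i, j] counts
    transitions from AOI i to AOI j."""
    counts = np.asarray(transitions, dtype=float)
    # Row-normalise into a first-order Markov transition matrix.
    P = counts / counts.sum(axis=1, keepdims=True)
    # Stationary distribution: left eigenvector of P for eigenvalue 1.
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    pi = pi / pi.sum()
    # Guard zero probabilities so 0 * log2(0) contributes 0.
    logP = np.where(P > 0, np.log2(np.where(P > 0, P, 1.0)), 0.0)
    H_t = -np.sum(pi[:, None] * P * logP)                      # transition entropy
    H_s = -np.sum(pi * np.log2(np.where(pi > 0, pi, 1.0)))     # stationary entropy
    return H_s, H_t

# Perfectly uniform switching between two AOIs: both entropies are 1 bit.
print(gaze_entropies([[1, 1], [1, 1]]))
```

More focused viewing of individual AOIs drives the transition entropy down, matching the interpretation above: high MS users (more uniform switching) show a higher transition entropy than low MS users.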
6 DISCUSSION
In this section, we provide answers to the research question by discussing the effects of different personal characteristics on the perception of explanations and on the gaze patterns in a recommender system. The implications of these results are then discussed in Section 7.
6.1 Musical Sophistication
Our results indicate that explanations are particularly interesting for users with a high MS score. Figure 2c shows that high MS users felt more supported in making a decision if the RS provides explanations. In our analysis of the gaze data, we also found that high MS users had a more equal distribution of transitions among the different AOIs than low MS users. This means that the gaze pattern of high MS users was less predictable and that they did not need to view the different AOIs as carefully as low MS users.
Figure 4: Analysis of the entropy in Exp for Musical Sophistication. (a) Transition entropy in Exp for different levels of Musical Sophistication. (b) Heatmap of the transition matrix in Exp for low Musical Sophistication. (c) Heatmap of the transition matrix in Exp for high Musical Sophistication.
The reason for this may be that high MS users were better capable of using the explanations to steer the RS. Previous studies have already shown that high MS users are more capable of leveraging different control components to explore songs in a RS [34]. Tintarev and Masthoff [67] also indicated that explanations and control are closely related, as explanations should be part of a cycle in which the user understands what is going on and can correct the system where needed. As such, it makes sense that high MS users benefited the most from the explanations, as they have a better ability to use the information in the explanations to correct the system.
6.2 Openness
As discussed in the results section, Figure 2a and Figure 2b show that the Openness score and the presence of explanations have a significant impact on Use Intention and Novelty, respectively.
From Figure 2a we learn that users with a low Openness score indicated a higher Use Intention for Exp than for Base, but from Figure 2b we learn that they found fewer novel songs with Exp. These two findings seem contradictory, but previous work has shown that this is not necessarily the case. For instance, Chen et al. [14] found some positive correlation between Openness and diversity. As such, users with a low Openness score may have a lower need to find novel songs.
However, users with a high Openness score did not seem to share this preference for Exp. This is illustrated by the facts that (i) they fixated more on the recommendations in Exp than in Base (despite the presence of the additional explanations), (ii) there is no significant difference in Use Intention, and (iii) there is only a small trend towards finding more novel songs in Exp than in Base. The reason for this might be that the information in the explanations in their current form did not enable users with a high Openness score to look for more novel songs, which is why only those with a low Openness score preferred Exp over Base. Results of our previous lab study showed that users would use explanations in the current form mostly when they were looking for a specific kind of familiar music, and thus not for exploring new songs.
6.3 Need for Cognition
In contrast to similar previous studies [52, 53], we did not find an effect of NFC on the perception of explanations. A reason for this difference might be that in this study the explanations were always visible, whereas in both previous studies they had to be explicitly activated by the users. Thus, it is possible that it is the explicit activation of explanations that brings out the differences between low and high NFC users, which is an interesting avenue of investigation for future work.
7 DESIGN IMPLICATIONS
With this study, we have found empirical evidence that personal characteristics influence the way users perceive and the way users look at a music recommender system in the presence and absence of explanations. Based on these results, we propose two design guidelines for music recommender system interfaces.
The first guideline is to implement transparency in combination with control, especially for expert users, to enable them to steer the recommendations in a more efficient way. For novice users, transparency should support them in learning how they can steer the recommendation process.
The second guideline is to implement transparency in such a way that, depending on users' need for diversity, they can steer the recommender system to receive the right level of novel songs. For low Openness users, explanations should support steering the recommendations towards a specific kind of music. For high Openness users, explanations should also support steering the recommendations towards exploring more novel songs.
Additionally, we showed that there are significant differences between the gaze patterns of users with different personal characteristics. This means that eye tracking in a RS interface can be used for user modeling. This is in line with the work of Berkovsky et al. [6], who showed that it is possible to predict a range of personality traits based on gaze data of dedicated stimuli. Our results show that there is potential to extend this work to predict personality traits based on gaze data in a recommender system interface.
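A starting point for such an extension could look as follows with scikit-learn. Everything here (the feature set, the labels, and the model choice) is an illustrative assumption, with random placeholder data standing in for real gaze features:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Hypothetical per-participant gaze features: fixation percentage on the
# recommendation AOI, transition entropy, stationary entropy.
X = rng.random((30, 3))
# Hypothetical binary labels, e.g. from a median split on Openness.
y = rng.integers(0, 2, size=30)

# A simple off-the-shelf classifier as a baseline; Berkovsky et al. [6]
# used richer physiological responses to dedicated stimuli.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print(f"mean CV accuracy: {scores.mean():.2f}")
```

With real gaze features in place of the random data, cross-validated accuracy above the majority-class baseline would indicate that the trait is predictable from in-interface gaze behaviour alone.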
8 CONCLUSION
In this paper, we investigated the effects of personal characteristics on user perception of a transparent music recommender system. To be specific, the personal characteristics we investigated were Need for Cognition, Musical Sophistication and the Big Five personality traits.
We presented the results of a within-subject user study (N=30) with one interface with and one interface without explanations, called Exp and Base respectively. With this study, we extend existing work on personality-based recommender systems by showing the effects of personal characteristics on the perception of transparency and on gaze patterns in a recommender system.
Our results show empirical evidence that not all users perceive transparency in a similar way. More specifically, we found significant interaction effects of Openness and Musical Sophistication. Additionally, we also showed the effects of Musical Sophistication and Openness on the way users look at recommendations, which shows the potential of using eye tracking to predict personality traits. Based on these results, we suggest implementing explanations in combination with control in such a way that they support or teach users to steer the recommendation process depending on their level of Musical Sophistication. Additionally, we suggest implementing explanations in such a way that they support both looking for a specific kind of music and exploring new music.
ACKNOWLEDGMENTS
Part of this research has been supported by the KU Leuven Research
Council (grant agreement C24/16/017).
REFERENCES
[1] Oscar Alvarado and Annika Waern. 2018. Towards algorithmic experience: Initial efforts for social media contexts. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. ACM, 286.
[2] Kamil Barton. 2013. Package 'MuMIn'. Version 1 (2013), 18.
[3] Douglas Bates, Martin Mächler, Ben Bolker, and Steve Walker. 2014. Fitting linear mixed-effects models using lme4. arXiv preprint arXiv:1406.5823 (2014).
[4] Rajeev Batra and Douglas M Stayman. 1990. The role of mood in advertising effectiveness. Journal of Consumer Research 17, 2 (1990), 203–214.
[5] Veronica Benet-Martinez and Oliver P John. 1998. Los Cinco Grandes across cultures and ethnic groups: Multitrait-multimethod analyses of the Big Five in Spanish and English. Journal of Personality and Social Psychology 75, 3 (1998), 729.
[6] Shlomo Berkovsky, Ronnie Taib, Irena Koprinska, Eileen Wang, Yucheng Zeng, Jingjie Li, and Sabina Kleitman. 2019. Detecting Personality Traits Using Eye-Tracking Data. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. ACM, 221.
[7] Andrea Bunt, Joanna McGrenere, and Cristina Conati. 2007. Understanding the utility of rationale in a mixed-initiative system for GUI customization. In International Conference on User Modeling. Springer, 147–156.
[8] John T Cacioppo, Richard E Petty, and Chuan Feng Kao. 1984. The efficient assessment of need for cognition. Journal of Personality Assessment 48, 3 (1984), 306–307.
[9] Sven Charleer, Francisco Gutiérrez Hernández, and Katrien Verbert. 2018. Supporting job mediator and job seeker through an actionable dashboard. In Proceedings of the 24th IUI Conference on Intelligent User Interfaces. ACM.
[10] Li Chen and Pearl Pu. 2005. Trust building in recommender agents. In Proceedings of the Workshop on Web Personalization, Recommender Systems and Intelligent User Interfaces at the 2nd International Conference on E-Business and Telecommunication Networks. Citeseer, 135–145.
[11] Li Chen and Pearl Pu. 2010. Eye-tracking study of user behavior in recommender interfaces. In International Conference on User Modeling, Adaptation, and Personalization. Springer, 375–380.
[12] Li Chen and Pearl Pu. 2011. Users' eye gaze pattern in organization-based recommender interfaces. In Proceedings of the 16th International Conference on Intelligent User Interfaces. ACM, 311–314.
[13] Li Chen, Wen Wu, and Liang He. 2013. How personality influences users' needs for recommendation diversity?. In CHI'13 Extended Abstracts on Human Factors in Computing Systems. ACM, 829–834.
[14] Pei-I Chen, Jen-Yu Liu, and Yi-Hsuan Yang. 2015. Personal factors in music preference and similarity: User study on the role of personality traits. In Proc. Int. Symp. Computer Music Multidisciplinary Research (CMMR).
[15] Cristina Conati, Giuseppe Carenini, Enamul Hoque, Ben Steichen, and Dereck Toker. 2014. Evaluating the impact of user characteristics and different layouts on an interactive visualization for decision making. In Computer Graphics Forum, Vol. 33. Wiley Online Library, 371–380.
[16] Cristina Conati, Giuseppe Carenini, Dereck Toker, and Sébastien Lallé. 2015. Towards user-adaptive information visualization. In Proc. of AAAI '15. AAAI Press, 4100–4106.
[17] Cristina Conati and Christina Merten. 2007. Eye-tracking for user modeling in exploratory learning environments: An empirical evaluation. Knowledge-Based Systems 20, 6 (2007), 557–574.
[18] Andrew T Duchowski. 2007. Eye tracking methodology. Theory and Practice 328, 614 (2007), 2–3.
[19] Islam Akef Ebeid, Nilavra Bhattacharya, and Jacek Gwizdka. 2018. Evaluating The Efficacy of Real-time Gaze Transition Entropy. 1, 1 (2018), 0–8. https://doi.org/10.13140/RG.2.2.36376.03846/1
[20] Stephen R Ellis and Lawrence Stark. 1986. Statistical dependency in visual scanning. Human Factors 28, 4 (1986), 421–438.
[21] Bruce Ferwerda and Mark Graus. 2018. Predicting Musical Sophistication from Music Listening Behaviors: A Preliminary Study. arXiv preprint arXiv:1808.07314 (2018).
[22] Bruce Ferwerda, Marko Tkalcic, and Markus Schedl. 2017. Personality Traits and Music Genres: What Do People Prefer to Listen To?. In Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization. ACM, 285–288.
[23] Bruce Ferwerda, Emily Yang, Markus Schedl, and Marko Tkalcic. 2015. Personality traits predict music taxonomy preferences. In Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems. ACM, 2241–2246.
[24] Daniela Giordano, Isaak Kavasidis, Carmelo Pino, and Concetto Spampinato. 2012. Content based recommender system by using eye gaze data. In Proceedings of the Symposium on Eye Tracking Research and Applications. ACM, 369–372.
[25] Joseph H Goldberg and Jonathan I Helfman. 2010. Comparing information graphics: a critical look at eye tracking. In Proceedings of the 3rd BELIV'10 Workshop: BEyond time and errors: novel evaLuation methods for Information Visualization. ACM, 71–78.
[26] Lewis R Goldberg. 1990. An alternative "description of personality": the Big-Five factor structure. Journal of Personality and Social Psychology 59, 6 (1990), 1216.
[27] Samuel D Gosling, Peter J Rentfrow, and William B Swann Jr. 2003. A very brief measure of the Big-Five personality domains. Journal of Research in Personality 37, 6 (2003), 504–528.
[28] Fabian Greb, Wolff Schlotz, and Jochen Steffens. 2018. Personal and situational influences on the functions of music listening. Psychology of Music 46, 6 (2018), 763–794.
[29] Chen He, Denis Parra, and Katrien Verbert. 2016. Interactive recommender systems: A survey of the state of the art and future research challenges and opportunities. Expert Systems with Applications 56 (2016), 9–27.
[30] Jonathan L Herlocker, Joseph A Konstan, and John Riedl. 2000. Explaining collaborative filtering recommendations. In Proceedings of the 2000 ACM Conference on Computer Supported Cooperative Work. ACM, 241–250.
[31] Yosef Hochberg and Yoav Benjamini. 1990. More powerful procedures for multiple significance testing. Statistics in Medicine 9, 7 (1990), 811–818.
[32] Rong Hu and Pearl Pu. 2011. Enhancing collaborative filtering systems with personality information. In Proceedings of the Fifth ACM Conference on Recommender Systems. ACM, 197–204.
[33] Robert JK Jacob and Keith S Karn. 2003. Eye tracking in human-computer interaction and usability research: Ready to deliver the promises. In The Mind's Eye. Elsevier, 573–605.
[34] Y Jin. 2019. Mixed-initiative Recommender Systems: Towards a Next Generation of Recommender Systems through User Involvement. (2019).
[35] Yucheng Jin, Karsten Seipp, Erik Duval, and Katrien Verbert. 2016. Go with the flow: effects of transparency and user control on targeted advertising using flow charts. In Proc. of AVI '16. ACM, 68–75.
[36] Yucheng Jin, Nava Tintarev, and Katrien Verbert. 2018. Effects of personal characteristics on music recommender systems with different levels of controllability. In Proceedings of the 12th ACM Conference on Recommender Systems. ACM, 13–21.
[37] Oliver P John, Eileen M Donahue, and Robert L Kentle. 1991. The Big Five Inventory—Versions 4a and 54.
[38] Iman Kamehkhosh and Dietmar Jannach. 2017. User perception of next-track music recommendations. In Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization. ACM, 113–121.
[39] Bart P Knijnenburg, Niels JM Reijmer, and Martijn C Willemsen. 2011. Each to his own: how different users call for different interaction methods in recommender systems. In Proceedings of the Fifth ACM Conference on Recommender Systems. ACM, 141–148.
[40] Bart P Knijnenburg and Martijn C Willemsen. 2015. Evaluating recommender systems with user experiments. In Recommender Systems Handbook. Springer, 309–352.
[41] Bart P Knijnenburg, Martijn C Willemsen, Zeno Gantner, Hakan Soncu, and Chris Newell. 2012. Explaining the user experience of recommender systems. User Modeling and User-Adapted Interaction 22, 4-5 (2012), 441–504.
[42] Bart P Knijnenburg, Martijn C Willemsen, and Alfred Kobsa. 2011. A pragmatic procedure to support the user-centric evaluation of recommender systems. In Proceedings of the Fifth ACM Conference on Recommender Systems. ACM, 321–324.
[43] Krzysztof Krejtz, Andrew Duchowski, Tomasz Szmidt, Izabela Krejtz, Fernando González Perilli, Ana Pires, Anna Vilaro, and Natalia Villalobos. 2015. Gaze transition entropy. ACM Transactions on Applied Perception (TAP) 13, 1 (2015), 4.
[44] Johannes Kunkel, Tim Donkers, Lisa Michael, Catalin-Mihai Barbu, and Jürgen Ziegler. 2019. Let Me Explain: Impact of Personal and Impersonal Explanations on Trust in Recommender Systems. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. ACM, 487.
[45] Alexandra Kuznetsova, Per B. Brockhoff, and Rune H. B. Christensen. 2017. lmerTest Package: Tests in Linear Mixed Effects Models. Journal of Statistical Software 82, 13 (2017), 1–26. https://doi.org/10.18637/jss.v082.i13
[46] Tiffany CK Kwok, Peter Kiefer, Victor R Schinazi, Benjamin Adams, and Martin Raubal. 2019. Gaze-Guided Narratives: Adapting Audio Guide Content to Gaze in Virtual and Real Environments. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. ACM, 491.
[47] Sébastien Lallé, Dereck Toker, and Cristina Conati. 2019. Gaze-Driven Adaptive Interventions for Magazine-Style Narrative Visualizations. arXiv preprint arXiv:1909.01379 (2019).
[48] Benedikt Loepp, Catalin-Mihai Barbu, and Jürgen Ziegler. 2016. Interactive Recommending: Framework, State of Research and Future Challenges. In EnCHIReS@EICS. 3–13.
[49] Steven G Luke. 2017. Evaluating significance in linear mixed-effects models in R. Behavior Research Methods 49, 4 (2017), 1494–1502.
[50] David Luna and Laura A Peracchio. 2002. "Where there is a will…": Motivation as a moderator of language processing by bilingual consumers. Psychology & Marketing 19, 7-8 (2002), 573–593.
[51] Brett AS Martin, Bodo Lang, and Stephanie Wong. 2003. Conclusion explicitness in advertising: The moderating role of need for cognition (NFC) and argument quality (AQ) on persuasion. Journal of Advertising 32, 4 (2003), 57–66.
[52] Martijn Millecamp, Robin Haveneers, and Katrien Verbert. 2020. Cogito ergo quid? The Effect of Cognitive Style in a Transparent Mobile Music Recommender System. In Proceedings of the 28th Conference on User Modeling, Adaptation and Personalization.
[53] Martijn Millecamp, Nyi Nyi Htun, Cristina Conati, and Katrien Verbert. 2019. To explain or not to explain: the effects of personal characteristics when explaining music recommendations. In IUI. 397–407.
[54] Martijn Millecamp, Nyi Nyi Htun, Yucheng Jin, and Katrien Verbert. 2018. Controlling Spotify recommendations: effects of personal characteristics on music recommender user interfaces. In Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization. ACM, 101–109.
[55] Martijn Millecamp, Sidra Naveed, Katrien Verbert, and Jürgen Ziegler. 2019. To Explain or Not to Explain: the Effects of Personal Characteristics When Explaining Feature-based Recommendations in Different Domains. In CEUR Workshop Proceedings. CEUR.
[56] Daniel Müllensiefen, Bruno Gingras, Jason Musil, and Lauren Stewart. 2014. The musicality of non-musicians: an index for assessing musical sophistication in the general population. PloS One 9, 2 (2014), e89642.
[57] Sidra Naveed, Tim Donkers, and Jürgen Ziegler. 2018. Argumentation-Based Explanations in Recommender Systems: Conceptual Framework and Empirical Results. In Adjunct Publication of the 26th Conference on User Modeling, Adaptation and Personalization. ACM, 293–298.
[58] Maria Augusta Silveira Netto Nunes. 2008. Recommender systems based on personality traits. Ph.D. Dissertation.
[59] John O'Donovan, Barry Smyth, Brynjar Gretarsson, Svetlin Bostandjiev, and Tobias Höllerer. 2008. PeerChooser: visual interactive recommendation. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 1085–1088.
[60] Pearl Pu, Li Chen, and Rong Hu. 2011. A user-centric evaluation framework for recommender systems. In Proceedings of the Fifth ACM Conference on Recommender Systems. ACM, 157–164.
[61] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. Why should I trust you?: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1135–1144.
[62] Dario D Salvucci and Joseph H Goldberg. 2000. Identifying fixations and saccades in eye-tracking protocols. In Proceedings of the 2000 Symposium on Eye Tracking Research & Applications. ACM, 71–78.
[63] Gerard Saucier. 2009. Recurrent personality dimensions in inclusive lexical studies: Indications for a Big Six structure. Journal of Personality 77, 5 (2009), 1577–1614.
[64] Martin Schuessler and Philipp Weiß. 2019. Minimalistic Explanations: Capturing the Essence of Decisions. arXiv preprint arXiv:1905.02994 (2019).
[65] Nava Tintarev. 2017. Presenting Diversity Aware Recommendations: Making Challenging News Acceptable. In Proc. of FATREC '17.
[66] Nava Tintarev, Matt Dennis, and Judith Masthoff. 2013. Adapting recommendation diversity to openness to experience: A study of human behaviour. In International Conference on User Modeling, Adaptation, and Personalization. Springer, 190–202.
[67] Nava Tintarev and Judith Masthoff. 2007. A survey of explanations in recommender systems. In 2007 IEEE 23rd International Conference on Data Engineering Workshop. IEEE, 801–810.
[68] Nava Tintarev and Judith Masthoff. 2016. Effects of Individual Differences in Working Memory on Plan Presentational Choices. Frontiers in Psychology 7 (2016).
[69] Marko Tkalcic, Matevz Kunaver, Jurij Tasic, and Andrej Košir. 2009. Personality based user similarity measure for a collaborative recommender system. In Proceedings of the 5th Workshop on Emotion in Human-Computer Interaction—Real World Challenges. 30–37.
[70] Stephanie Tom Tong, Elena F Corriero, Robert G Matheny, and Jeffrey T Hancock. 2018. Online Daters' Willingness to Use Recommender Technology for Mate Selection Decisions. In Proceedings of the 5th Joint Workshop on Interfaces and Human Decision Making for Recommender Systems co-located with ACM Conference on Recommender Systems (RecSys 2018). ACM, 45–52.
[71] Bodo Winter. 2013. A very basic tutorial for performing linear mixed effects analyses. arXiv preprint arXiv:1308.5499 (2013).
... In a recent survey of evaluation methods for recommendation explanations, Chen et al. identied 118 studies, of which 55 contained an online evaluation study [11]. Of these studies, only 7 were case-control studies where one of the experimental conditions corresponded to the absence of explanations [14,30,37,38,42,56,66]. These studies investigated a majority of the seven possible aims of recommendation explanations enumerated by Tintarev and Mastho [51]: ...
... Ooge et al. investigated eects of explanations on recommended mathematics exercises [42], where they increased trust and transparency in the recommender system. In contrast, explanations can also have a negative impact on a recommender system's persuasiveness [37], eectiveness [38,56] and eciency [38]. ...
... Ooge et al. investigated eects of explanations on recommended mathematics exercises [42], where they increased trust and transparency in the recommender system. In contrast, explanations can also have a negative impact on a recommender system's persuasiveness [37], eectiveness [38,56] and eciency [38]. ...
Preprint
Full-text available
Recommender systems typically operate within a single domain, for example, recommending books based on users' reading habits. If such data is unavailable, it may be possible to make cross-domain recommendations and recommend books based on user preferences from another domain, such as movies. However, despite considerable research on cross-domain recommendations, no studies have investigated their impact on users’ behavioural intentions or system perceptions compared to single-domain recommendations. Similarly, while single-domain explanations have been shown to improve users' perceptions of recommendations, there are no comparable studies for the cross-domain case. In this article, we present a between-subject study (N=237) of users’ behavioural intentions and perceptions of book recommendations. The study was designed to disentangle the effects of whether recommendations were single- or cross-domain from whether explanations were present or not. Our results show that cross-domain recommendations have lower trust and interest than single-domain recommendations, regardless of their quality. While these negative effects can be ameliorated by cross-domain explanations, they are still perceived as inferior to single-domain recommendations without explanations. Last, we show that explanations decrease interest in the single-domain case, but increase perceived transparency and scrutability in both single- and cross-domain recommendations. Our findings offer valuable insights into the impact of recommendation provenance on user experience and could inform the future development of cross-domain recommender systems.
... These taxonomies are Openness to Experience, Conscientiousness, Extraversion, Agreeableness, and Neuroticism [18]. Some works focus on using the OCEAN taxonomies to improve the results provided by Internet browsers [19] or incorporating these taxonomies into an AEH (Adaptive Educational Hypermedia) system [20]. Furthermore, the work of Cong et al. [21] defines several behaviours that indicate how the user interface should be adapted to fit the user's personality. ...
... Regarding the user's salary and age, the system offers various ranges that users can choose from, ensuring that the specific value is not disclosed. For example, the system may provide age ranges like [0-10], [11][12][13][14][15][16][17][18][19][20], [21][22][23][24][25][26][27][28][29][30] and so on. Other private information such as gender, education level, nationality, language, or disabilities are encoded with an internal system code, preventing specific information from being obtained. ...
Article
Full-text available
The way end-users interact with a system plays a crucial role in the high acceptance of software. Related to this, the concept of Intelligent User Interfaces has emerged as a solution to learn from user interactions with the system and adapt interfaces to the user’s characteristics and preferences. However, existing approaches to designing intelligent user interfaces are limited by their user models, which are not capable of representing each and every user characteristic valid for any context. This work aims to address this limitation by presenting a user model that can abstractly represent a wide set of user characteristics in any context of interaction. The model is based on a synthesis of previous works that have proposed specific user models. After the analysis of these works, a more sophisticated user model has been defined, including some required characteristics not existing in previous works. This model has been validated with 62 real end-users who have expressed the users’ characteristics that they consider as relevant to adapt the interaction. The results show that most of these characteristics can be represented by the proposed user model. This user model is the first step towards creating intelligent user interfaces that can adapt interactions to users with similar characteristics and preferences in similar contexts.
... Previous studies have shown the need for personalization by providing non-personalized explanations to users and analyzing their perceptions based on different traits [16,31,41,45,54]. Some XAI research has begun to address this need, such as designing personalized explanations for a music recommender system [38], tailored to traits identified as relevant in prior studies [41,42]. However, these personalized explanations were only tested on users with higher or lower levels of the targeted traits rather than in real-time interactions. ...
... Research also shows that personality affects preferences for movie recommendations and corresponding explanations [10]. In the context of a music recommender system, the need for cognition (a trait that measures one's appreciation for effortful cognitive activities [14]) [41] as well as musical sophistication and openness (one of the personality traits in the Big-Five Factor Model [21], which measures the breadth and complexity of an individual's mental and experiential life) [42] have a significant impact on explanation effectiveness. ...
Conference Paper
Full-text available
We explore eXplainable AI (XAI) to enhance user experience and understand the value of explanations in AI-driven pedagogical decisions within an Intelligent Pedagogical Agent (IPA). Our real-time and personalized explanations cater to students’ attitudes to promote learning. In our empirical study, we evaluate the effectiveness of personalized explanations by comparing three versions of the IPA: (1) personalized explanations and suggestions, (2) suggestions but no explanations, and (3) no suggestions. Our results show the IPA with personalized explanations significantly improves students’ learning outcomes compared to the other versions.
... Different users have different information needs based on their cognitive style (Millecamp et al., 2020), the level of importance and urgency of the issue (Mohseni et al., 2021), and their knowledge of algorithms (Cheng et al., 2019). In this regard, some users may require more comprehensive or detailed explanations, while others may prefer brief, simplified ones. ...
... Moreover, eye tracking may be considered as a useful alternative. Being less disruptive, it has become more popular in recent years, e.g., in studies on recommender interfaces and recommendation presentation [42,43], critiquing [44,45], and effects of personal characteristics [46]. However, also in these cases, participants' behavior was observed only in relation to the recommendation component, largely ignoring its surroundings. ...
Conference Paper
Full-text available
Thus far, in most of the user experiments conducted in the area of recommender systems, the respective system is considered as an isolated component, i.e., participants can only interact with the recommender that is under investigation. This fails to recognize the situation of users in real-world settings, where the recommender usually represents only one part of a greater system, with many other options for users to find suitable items than using the mechanisms that are part of the recommender, e.g., liking, rating, or critiquing. For example, in current web applications, users can often choose from a wide range of decision aids, from text-based search over faceted filtering to intelligent conversational agents. This variety of methods, which may equally support users in their decision making, raises the question of whether the current practice in recommender evaluation is sufficient to fully capture the user experience. In this position paper, we discuss the need to take a broader perspective in future evaluations of recommender systems, and raise awareness for evaluation methods which we think may help to achieve this goal, but have not yet gained the attention they deserve.
Article
Full-text available
This paper summarizes an ongoing multi-year project aiming to uncover knowledge and techniques for devising intelligent environments for user-adaptive visualizations. We ran three studies designed to investigate the impact of user and task characteristics on user performance and satisfaction in different visualization contexts. Eye-tracking data collected in each study was analyzed to uncover possible interactions between user/task characteristics and gaze behavior during visualization processing. Finally, we investigated user models that can assess user characteristics relevant for adaptation from eye tracking data.
Conference Paper
Trust in a Recommender System (RS) is crucial for its overall success. However, it remains underexplored whether users trust personal recommendation sources (i.e. other humans) more than impersonal sources (i.e. conventional RS), and, if they do, whether the perceived quality of the explanations provided accounts for the difference. We conducted an empirical study in which we compared these two sources of recommendations and explanations. Human advisors were asked to explain the movies they recommended in short texts, while the RS created explanations based on item similarity. Our experiment comprised two rounds of recommending. Over both rounds, the quality of the explanations provided by the human advisors was rated higher than the quality of the system's explanations. Moreover, explanation quality significantly influenced perceived recommendation quality as well as trust in the recommendation source. Consequently, we suggest that RS should provide richer explanations in order to increase their perceived recommendation quality and trustworthiness.
Conference Paper
Personality is an established domain of research in psychology, and individual differences in various traits are linked to a variety of real-life outcomes and behaviours. Personality detection is an intricate task that typically requires humans to fill out lengthy questionnaires assessing specific personality traits. The outcomes of this, however, may be unreliable or biased if the respondents do not fully understand or are not willing to honestly answer the questions. To this end, we propose a framework for objective personality detection that leverages humans' physiological responses to external stimuli. We exemplify and evaluate the framework in a case study, where we expose subjects to affective image and video stimuli, and capture their physiological responses using a commercial-grade eye-tracking sensor. These responses are then processed and fed into a classifier capable of accurately predicting a range of personality traits. Our work yields notably high predictive accuracy, suggesting the applicability of the proposed framework for robust personality detection.
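The framework described in this abstract follows a generic pipeline: capture physiological responses to stimuli, summarise them into features, and feed the features to a classifier. The sketch below illustrates that structure under loose assumptions; the pupil-diameter features and the nearest-centroid classifier are illustrative choices, not the paper's actual method.

```python
import math
from statistics import mean, stdev

def extract_features(pupil_samples):
    """Summarise one pupil-diameter trace (mm per sample) into a small
    feature vector: mean level, variability, and range.
    (Illustrative features; the paper's actual feature set may differ.)"""
    return [mean(pupil_samples),
            stdev(pupil_samples),
            max(pupil_samples) - min(pupil_samples)]

class NearestCentroid:
    """Minimal nearest-centroid classifier: one centroid per trait label,
    prediction by Euclidean distance to the closest centroid."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, lbl in zip(X, y) if lbl == label]
            # Centroid = per-dimension mean of all training vectors with this label
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict(self, x):
        return min(self.centroids,
                   key=lambda lbl: math.dist(x, self.centroids[lbl]))
```

A usage sketch: `NearestCentroid().fit(feature_vectors, trait_labels).predict(extract_features(new_trace))`, where trait labels could be, e.g., "high"/"low" bins of a Big Five score. The point is the signal → features → prediction structure, not the specific learner.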
Conference Paper
Exploring a city panorama from a vantage point is a popular tourist activity. Typical audio guides that support this activity are limited by their lack of responsiveness to user behavior and by the difficulty of matching audio descriptions to the panorama. These limitations can inhibit the acquisition of information and negatively affect user experience. This paper proposes Gaze-Guided Narratives as a novel interaction concept that helps tourists find specific features in the panorama (gaze guidance) while adapting the audio content to what has been previously looked at (content adaptation). Results from a controlled study in a virtual environment (n=60) revealed that a system featuring both gaze guidance and content adaptation obtained better user experience, lower cognitive load, and led to better performance in a mapping task compared to a classic audio guide. A second study with tourists situated at a vantage point (n=16) further demonstrated the feasibility of this approach in the real world.
Technical Report
We conducted an eye-tracking study in which 32 participants viewed distorted images of famous artwork and landmarks. We computed the gaze transition entropy and the stationary distribution entropy for their eye movements both offline (post-process) and online (in real time). We hypothesized that the entropy of participants who recognized the images would differ from that of those who did not. We also hypothesized that even though the online values of the gaze transition entropy and the online stationary distribution entropy can differ, they should produce the same statistical models. Our results show that we could not recreate the results of [Krejtz et al. 2015] based only on the Mona Lisa stimulus. Our experimental design was more rigorous, using different stimuli per participant, so we plan to analyze the data at different levels and with repeated measures in order to draw more grounded interpretations of an individual's transition patterns. We were also unable to validate the online approach for computing gaze transition entropy, since our statistical analysis did not align with [Krejtz et al. 2015]; however, we plan to revise our implementation and re-analyze the data.
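The two entropy measures named in this abstract can be computed from a sequence of AOI (area-of-interest) labels, one per fixation, in viewing order. A minimal sketch following the Krejtz et al. (2015) formulation, with one simplifying assumption: the stationary distribution is approximated by the empirical proportion of fixations in each AOI rather than derived from the transition matrix.

```python
import math
from collections import Counter

def gaze_entropies(aoi_sequence):
    """Return (transition entropy H_t, stationary entropy H_s) in bits
    for a list of AOI labels, one per fixation, in temporal order."""
    n = len(aoi_sequence)
    # Stationary distribution pi: share of fixations landing in each AOI
    # (empirical approximation; Krejtz et al. derive pi from the chain).
    pi = {a: c / n for a, c in Counter(aoi_sequence).items()}
    # First-order transition counts between consecutive fixations.
    trans = Counter(zip(aoi_sequence, aoi_sequence[1:]))
    out_totals = Counter(aoi_sequence[:-1])  # transitions leaving each AOI
    # H_t = -sum_i pi_i * sum_j p_ij * log2(p_ij)
    H_t = 0.0
    for (a, b), count in trans.items():
        p_ab = count / out_totals[a]
        H_t -= pi[a] * p_ab * math.log2(p_ab)
    # H_s = -sum_i pi_i * log2(pi_i)
    H_s = -sum(p * math.log2(p) for p in pi.values())
    return H_t, H_s
```

For example, a strictly alternating scanpath such as `["A", "B", "A", "B"]` has fully predictable transitions (H_t = 0) but maximal stationary entropy over two AOIs (H_s = 1 bit); low H_t with high H_s indicates structured scanning spread over many regions.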
Conference Paper
Recommender systems have been increasingly used in online services that we consume daily, such as Facebook, Netflix, YouTube, and Spotify. However, these systems are often presented to users as a "black box", i.e. the rationale for providing individual recommendations remains unexplained to users. In recent years, various attempts have been made to address this black-box issue by providing textual explanations or interactive visualisations that enable users to explore the provenance of recommendations. Among other things, results demonstrated benefits in terms of precision and user satisfaction. Previous research had also indicated that personal characteristics such as domain knowledge, trust propensity and persistence may play an important role in such perceived benefits. Yet, to date, little is known about the effects of personal characteristics on explaining recommendations. To address this gap, we developed a music recommender system with explanations and conducted an online study using a within-subject design. We captured various personal characteristics of participants and administered both qualitative and quantitative evaluation methods. Results indicate that personal characteristics have a significant influence on the interaction with and perception of recommender systems, and that this influence changes when explanations are added. For people with a low need for cognition, explained recommendations are the most beneficial. For people with a high need for cognition, we observed that explanations could create a lack of confidence. Based on these results, we present some design implications for explaining recommendations.
Conference Paper
The use of complex machine learning models can make systems opaque to users. Machine learning research proposes the use of post-hoc explanations. However, it is unclear whether they give users insights into otherwise uninterpretable models. One minimalistic way of explaining image classifications by a deep neural network is to show only the areas that were decisive for the assignment of a label. In a pilot study, 20 participants looked at 14 such explanations generated either by a human or by the LIME algorithm. For explanations of correct decisions, they identified the explained object with significantly higher accuracy (75.64 % vs. 18.52 %). We argue that this shows that explanations can be very minimalistic while retaining the essence of a decision, but the decision-making context that can be conveyed in this manner is limited. Finally, we found that explanations are unique to the explainer, and human-generated explanations were assigned 79 % higher trust ratings. As a starting point for further studies, this work shares our first insights into quality criteria of post-hoc explanations.
Conference Paper
Job mediation services can assist job seekers in finding suitable employment through a personalised approach. Consultation or mediation sessions, supported by the job seeker's personal profile data, help job mediators understand their personal situation and requests. Prediction and recommendation systems can directly provide job seekers with possible job vacancies. However, incorrect or unrealistic suggestions and bad interpretations can result in poor decisions or demotivation of the job seeker. This paper explores how an interactive dashboard visualising prediction and recommendation output can help support the dialogue between job mediator and job seeker, by increasing "explainability" and providing mediators with control over the information that is shown to job seekers.