BURST OF THE FILTER BUBBLE?
Accepted Manuscript
For the published version of this paper, please refer to:
Haim, Mario, Graefe, Andreas, & Brosius, Hans-Bernd (2017). Burst of the Filter Bubble? Effects of
personalization on the diversity of Google News. Digital Journalism.
DOI: 10.1080/21670811.2017.1338145.
Abstract
In offering personalized content geared toward users’ individual interests, recommender systems
are assumed to reduce news diversity and thus lead to partial information blindness (i.e., filter bubbles).
We conducted two exploratory studies to test the effect of both implicit and explicit personalization on the
content and source diversity of Google News. Except for small effects of implicit personalization on
content diversity, we found no support for the filter-bubble hypothesis. We did, however, find a general
bias in that Google News over-represents certain news outlets and under-represents other, highly
frequented, news outlets. The results add to a growing body of evidence, which suggests that concerns
about algorithmic filter bubbles in the context of online news might be exaggerated.
Keywords: online news, personalization, Filter Bubble, diversity, experiment
Burst of the Filter Bubble? Effects of Personalization on the Diversity of Google News
News consumption has changed dramatically over the past years, with an ever-increasing share of
news consumed online. For example, based on a survey of 53,000 news consumers in 26 countries, the
Reuters Institute Digital News Report found that 23% of respondents use online (digital) channels as their
main news source. Another 44% reported that they consider digital and traditional sources equally
(Newman, Fletcher, Levy, & Nielsen, 2016). When consuming news online, roughly 40% of respondents
discover news through search engines, approximately 33% via social network sites, and about one-in-
eight through news aggregators (Newman et al., 2016, p. 93). That is, a large, and ever-increasing,
share of news consumers rely on algorithmically curated environments in which algorithms automatically
select personalized news based on information about individual news consumers. This personalization
could be explicit, based on information deliberately provided by the user, or implicit, based on
information collected from observing that user’s online behavior (Thurman & Schifferes, 2012;
Zuiderveen Borgesius et al., 2016). In either case, the personalized news selection is compiled
automatically; it is not the result of a human editor’s deliberate choices.
Hence, while online news consumption increases, human editors’ sovereignty over diversity has
been decreasing. Diversity is commonly regarded as a key principle of news quality (McQuail, 1992) in
ensuring a well-informed public (Strömbäck, 2005). Diversity can be understood as either source
or content diversity. Source diversity refers to both a news outlet’s inclusion of a multitude of
informational sources as well as a news article’s inclusion of a wide variety of mentioned persons.
Content diversity aims at providing news consumers with a wide range of different fields of interest as
well as a full selection of perspectives on a given topic (Voakes, Kapfer, Kurpius, & Chern, 1996).
Scholars have raised concerns as to whether algorithms value diversity as a key feature of news
quality (Pasquale, 2015). Theoretical concepts such as Pariser’s (2011) filter-bubble hypothesis suggest
that, instead of ensuring diversity, algorithms aim at maximizing economic gain by increasing media
consumption. According to this rationale, algorithms filter out information that is assumed to be of little
interest to individual users while presenting content that users are more likely to consume. For
example, users who have a history of consuming a lot of sports news will receive even more sports news,
presumably at the cost of other topics (e.g., political news).
The present study empirically tests this rationale for the case of the news aggregator Google
News. While personalization also affects news diversity in other digital news environments (e.g., social
network sites, search engines), news aggregators solely focus on the distribution of news. As such, their
underlying algorithms are geared toward news consumption rather than social interaction or the
identification of search query patterns. In particular, we study how online news personalization, both
implicitly and explicitly, affects content and source diversity. In the following, we first discuss relevant
literature on news diversity and algorithmic personalization. Then, we present results from two
explorative empirical studies, one using explicit and the other using implicit personalization, before
drawing overall conclusions in a general discussion.
News Diversity
Mass media, and particularly the multitude and balance of its news, should enable citizens to act
in a well-informed manner, especially when making democratic decisions (e.g., in elections; Westerståhl, 1983).
Among other things, balance of news includes diversity of content, sources, or perspectives, and thus aims
at providing citizens with a broad variety of information. News diversity hence represents a means toward a
broadly informed public and is one of the “fundamental principles underlying evaluations of the
performance of mass media systems” (Napoli, 1999, p. 7). As such, news diversity follows ideals of
deliberative politics and is seen as one of the key dimensions of news quality within any democratic
society around the world (Entman & Wildman, 1992; McQuail, 1992; Porto, 2007; Strömbäck, 2005).
News diversity is a multi-faceted construct that is subject to various interpretations. First,
diversity can aim at a multitude of sources. Such source diversity describes the pluralism of quoted
actors’ affiliations or status positions (Voakes et al., 1996). More source-diverse news thus includes
information from political, economic, non-governmental, and any other affected sources. Yet, source
diversity may also depict the variety of news outlets which are included in a recipient’s news diet,
especially in the context of news aggregators (Thurman, 2011). Second, diversity can refer to the variety
of covered topics. This so-called content diversity oftentimes relates to the mere appearance of topics in
their most basic form, such as “public affairs” or “baseball.” Yet, it may also include the multitude of
perspectives, which ideally represent a democracy’s political spectrum of opinions (Entman & Wildman,
1992). For example, a news outlet may deliberately exclude aspects that are contrary to its own political
view. A content-diverse news outlet, however, should include all available aspects of a given topic. Third,
diversity can relate to a variation of viewpoints (i.e., framing). This viewpoint diversity subsumes
available frames on a given topic and is thus clearly the most demanding type of diversity, both for
journalists to produce and for researchers to measure (Baden & Springer, 2015). However, at the same
time, it provides the most reliable measure in terms of news’ normative requirements toward a
“marketplace of ideas” (Strömbäck, 2005, p. 338).
Source, content, and viewpoint diversity represent constructs for measuring news diversity
(McDonald & Dimmick, 2003). In addition, theoretical scholars have offered normative guidelines for
how to establish news diversity, which largely depend on a given media system. For example, media
systems that aim for horizontal news diversity require the different media outlets to ensure diversity
collectively. In comparison, vertical news diversity requires each single media outlet to provide a
satisfactory multitude of news (e.g., Hellman, 2001). Empirical evidence further suggests that diversity
also depends on other factors such as political trends (van Hoof, Jacobi, Ruigrok, & van Atteveldt, 2014),
the size of media outlets (Voakes et al., 1996), or an individual user’s selective exposure (Napoli, 1999).
The effects of online news consumption on diversity remain largely unclear. On the one hand, the
amount of available news has increased dramatically as boundaries for the production and distribution of
news have decreased (i.e., “information overload”; Eppler & Mengis, 2004). Therefore, a much bigger
diversity should be possible (Carlson, 2007). On the other hand, the available information in its entirety
overwhelms users, who need to rely on filters that reduce complexity and thus provide individual
representations of news diversity (Napoli, 1999). In recent years, algorithms have increasingly taken over
these tasks.
Algorithmic Personalization
Automatically filtered selections of online information (e.g., news) should help internet users to
overcome the overwhelming amount of available information (Carlson, 2007). Such filter algorithms are
often referred to as recommender algorithms, since they recommend personalized content based on
information about individual users. Recommender algorithms typically use information about
users’ interests, preferences, and surf behavior as well as contextual information (e.g., time, location) to
derive optimally tailored results based on various forms of statistical clustering (for an overview, see
Oechslein & Hess, 2013).
The type of information used for personalization depends on an individual platform’s goals and
requirements. For example, Thurman and Schifferes (2012; also see Zuiderveen Borgesius et al., 2016)
distinguish between explicit and implicit personalization. While explicit personalization requires users to
proactively reveal their preferences, implicit personalization is based on observations of an individual
user’s online behavior. In practice, combinations of explicit and implicit personalization are possible as
well.
Algorithms also evaluate how well the filtered results match a user’s needs. For example, an
algorithm might interpret a given user’s click or follow-up action (e.g., a comment or Like) on a
recommended item as an accurate match. Yet, such evaluation processes carry the risk of self-
reinforcement and reduced diversity, which may ultimately lead to partial information blindness. This
rationale has become widely known as the “filter bubble” (Pariser, 2011). Similar theoretical constructs
aim at the increasing chance of like-minded contacts (“echo chambers”; Sunstein, 2009) and limited
public spheres (“sphericules”; Gitlin, 1998). The latter in particular refers to the normative fear of
unknowingly missing information, which prevents individuals from being properly informed and
rational democratic citizens. As such, public-sphere theories can be said to be primarily concerned with
decreasing viewpoint diversity rather than a decreased diversity of either sources or content. That said,
source diversity in particular, in horizontally diverse media systems, goes along with viewpoint diversity,
since different media outlets depict different political perspectives.
Empirical evidence on the existence of filter-bubble effects, especially in the context of news, is
limited. One study found small effects in that Facebook users see a higher-than-average share of posts from
politically like-minded users (Bakshy, Messing, & Adamic, 2015). Yet, the study has faced several
methodological criticisms, such as building upon self-reported political orientation (Pariser, 2015). Apart
from social network sites, personalization effects have been looked at within search engines, revealing
almost no (e.g., Flaxman, Goel, & Rao, 2016; Haim, Arendt, & Scherr, 2017) or only minor (e.g., Feuz,
Fuller, & Stalder, 2011; Hannak et al., 2013) effects of partial information blindness.
The aim of this study is to contribute further evidence by investigating how both implicit and
explicit personalization of an online news aggregator affects both content and source diversity. To the
best of our knowledge, this is the first study to analyze the effects of personalization on news diversity for
news aggregators. For this, we focus on Google News (https://news.google.com/), one of the most-visited
online news aggregators (Newman et al., 2016, p. 12). Google News claims to present headlines which
are selected by computer algorithms “based on your past activity on Google” (Google, 2017a) while at the
same time “working to make sure that [the front page of Google News] reflects a diversity of articles and
sources” (Google, 2017c). Our analyses focus on the German version of Google News. This seemed to be
a reasonable choice, given that we used German IP addresses and media user typologies. That said, we
expect that the results generalize to other countries, since Google gives no reason to expect
differences in the underlying algorithms: its “goal is to offer Google News to all of our users
throughout the world,” with the only difference being that “[e]ach edition is specifically tailored with news for
that audience” (Google, 2017b). As with comparable products and providers, we do not know which
parameters drive personalized outcomes, since the algorithms underlying online news aggregators are
“black boxes” (Pasquale, 2015). Therefore, we can only analyze the effects of personalization on news
diversity based on input-output analyses, for example, by varying a user’s surf behavior or preferences
(i.e., input) and comparing the resulting news offer (i.e., output). For this, we conducted two explorative
studies to control for explicit (Study 1) and implicit (Study 2) personalization.
Study 1
Google News allows users to explicitly select the types of news they are interested in. That is,
users can explicitly personalize their account toward their preferences by specifying the topics they want
to read about more (or less). The goal of this study was to analyze how explicit personalization of Google
News affects the diversity of the presented news articles.
Method
We created three different Google accounts, each of which was personalized for one of the major
topics as suggested by the platform: politics, sports, and entertainment. The preferences for the
politics account were set to “always” show political news but “rarely” include sports or entertainment
news. The sports account was set to “always” show sports news but “rarely” include political news or
entertainment. Similarly, the entertainment account “always” preferred entertainment but “rarely”
political or sports news.
For each account, we stored the articles from the Google News start page once a day (i.e., at 8
p.m.) for the six-day period from May 27 to June 2, 2014. In addition, we stored the front
pages of a neutral account (i.e., Google News without personalization) as well as those from the popular
online news outlets Bild.de and Spiegel Online. The two news outlets are not entirely comparable to a
Google News account given their journalistic arrangement of political articles at the top of the page. Yet,
by ignoring the order of presented articles, and due to the topical generality of the two outlets, their diversity
can serve as a baseline against which to compare the personalized accounts.
Three research assistants read each article and assigned it to one of eight topic categories: politics,
economics, sports, culture, science, lifestyle, miscellaneous, and service. A reliability check on 16
randomly selected articles resulted in a joint-probability of agreement of 78%. For the Google News
accounts, the research assistants additionally coded the source of each article.
Results
The six-day period resulted in a total of 972 news articles across all six news outlets (i.e., four
Google News accounts, Bild.de, and Spiegel Online). The number of articles per account was evenly
distributed: 137 for the politics version, 133 for the sports version, 132 for the entertainment version, and
123 for the neutral (unpersonalized) version. For Bild.de (189) and Spiegel Online (258), the numbers of
articles were larger.
Figure 1 shows the share of articles per topic and news outlet. The results indicate that the explicit
personalization of Google News worked. As expected, for each of the three topics (politics, sports, and
entertainment) the respective personalized account provided a higher share of articles than the remaining
three accounts. For example, 52% of the articles in the politics account were political news, whereas the
share of political news in the remaining accounts ranged from 37% to 39%. The sports version showed
17% of sports news compared to 9% to 12% in the other accounts. Finally, 33% in the entertainment
version were from the preferred category, compared to 19% to 28% in the other versions.
The comparison to the traditional news outlets shows some interesting results. First, the share of
political news is much higher in all four Google News accounts compared to both Bild.de (11%) and
Spiegel Online (26%). Second, both traditional news outlets focus a considerable share of their coverage
on entertainment and other topics.
Figure 1: Percentage of articles per news outlet and topic category
Figure 2 shows the source diversity for the three personalized accounts as well as the
unpersonalized Google News account, measured as the percentage of articles per news outlet. Source
diversity did not significantly vary between the four accounts. Yet, the eleven news outlets that each
account for at least three percent of the total number of articles are responsible for 86% of all
articles at Google News. The top outlets, such as Focus Online (24%) and Die Welt (13%), make up
particularly large shares, which indicates a biased selection of news sources. That is, surprisingly,
these top sources do not represent news outlets with outstanding reach in Germany. For example, in June
of 2014, Focus Online only ranked 11th among Germany’s most-visited news websites (IVW, 2017), Die
Welt ranked even lower (17th). Conversely, usually well-frequented outlets, such as Bild.de, T-Online, RTL,
or Stern.de, seem underrepresented within Google News. Yet, while Google News is under no obligation
to mirror the media system, this result is interesting in that both Focus Online and Die
Welt are rather conservative outlets and are both known for aggressive clickbait headlines and search
engine optimization. The remaining 14% of articles originate from other news outlets.
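The source-concentration measure described above can be sketched as follows; the source labels are hypothetical stand-ins for the stored front pages, not the study’s sample:

```python
from collections import Counter

def outlet_concentration(sources, threshold=0.03):
    """Per-outlet shares, and the combined share of all outlets that
    each account for at least `threshold` of the stored articles."""
    counts = Counter(sources)
    total = len(sources)
    shares = {outlet: n / total for outlet, n in counts.items()}
    major = {o: s for o, s in shares.items() if s >= threshold}
    return major, sum(major.values())

# Hypothetical source labels, one per stored article:
sources = (["focus"] * 24 + ["welt"] * 13 + ["spiegel"] * 10
           + ["outlet%d" % i for i in range(53)])
major, combined = outlet_concentration(sources)
print(major)     # outlets with a share of at least 3%
print(combined)  # share of all articles covered by those outlets
```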
Figure 2: Source diversity of the four Google News accounts
Study 2
Implicit personalization builds upon statistical deductions from observed user and usage data.
Hence, in order for Google News to be able to provide a personalized news offer, it first needs to have the
chance to observe a given user. We thus followed an explorative agent-based testing approach, in which
we modeled various users and their online behavior to study the effects of implicit personalization
qualitatively.
Method
We modeled the online behavior of four virtual agents along various dimensions. Characteristics of the
four virtual agents were derived from two representative German media user typologies (ARD/ZDF,
2015; Sinus Markt- und Sozialforschung, 2015). Each agent stands for an archetypal life situation and
pattern of media use:
Agent A. An elderly female conservative widow,
Agent B. A bourgeois father in his fifties,
Agent C. A forty-year-old job-oriented male single, and
Agent D. A wealthy 30-year-old female marketing manager and early adopter.
For each agent, a new (virtual) computer was set up which was only used for the purpose of this
experiment. All computers shared the same IP address, which was thus held constant. That is, while the identities
suggest different life situations and locations, the computers on which the virtual agents operated were in
the same location in order to minimize IP-address influences. For each agent, we created both a Facebook
and a Google+ account with information about the agents’ age, gender, living situation, education, job,
relationship status as well as favorite books, sports, music, and movies, all according to the agents’
media-user typology characteristics shown in Table 1.
The study consisted of two phases. In accordance with similar studies (Haim et al., 2017; Hannak
et al., 2013) the initial training phase lasted for one week. Throughout this week, we repeatedly (1)
searched Google for five agent-specific search terms and clicked the top three results, (2) expressed
approval (through Likes or +1’s) for various products’ and artists’ profile pages on both Facebook and
Google+ (five each), (3) put five individually favorable products into the shopping cart on Amazon, and
(4) navigated through approximately ten articles on individually favorable media outlets. For example, the
elderly female conservative widow liked religious art, wanted to buy a heating blanket, and used local
news outlets. This training process was repeated every day over the course of one week in June of 2014.
After this week, the test phase required all agents to subsequently enter three ambiguous terms into Google
News that were newsworthy at that time. The search terms were queried in German: taxation
(“Steuer”), allowing for a variety of article topics; Germany (“Deutschland”), allowing for a variety of
focuses, especially due to the ongoing soccer world cup; and Alstom, a French producer of trains largely
in use in Germany, which at the time of the study was under consideration for either being sold to a
foreign competitor or acquiring another company, thus affecting large numbers of German employees and
thereby allowing for a variety of viewpoints.
For each of the three search terms, we stored the first ten result pages. Since each result page
shows ten articles, we thus collected 300 articles (i.e., 3 search terms × 10 result pages × 10 articles) per
agent. In addition, we created a fifth (virtual) computer to serve as a control group. This account received
no training and was used only for the test phase. It thus did not allow for observation and implicit
personalization.
Results
First, we counted each agent’s exclusive entries as compared to the control group. For example,
agent A (i.e., the elderly female conservative widow) saw an article from the conservative newspaper Die
Welt, which was not shown to the control-group agent. Instead, an article from the local news outlet
Augsburger Allgemeine was shown. Second, we estimated rank-order differences relative to the control
group based on a qualitative analysis of the result lists. Exclusive articles affect diversity directly. In
comparison, rank-order differences inhibit diversity indirectly, as news consumers are known to use only
top-ranked results from the first few result pages (Pan et al., 2007).
Table 1: Overview of the employed agents and their personalized results

                            Agent A           Agent B          Agent C          Agent D
Age                         69                57               40               30
Gender                      female            male             male             female
Living situation            alone             w/ wife & kid    alone            w/ partner
Education                   sec. school       sec. school      B.Sc. (IT)       M.Sc. (eco.)
Job                         retired           senior tiler     software dev.    head of PR
Relationship status         widow             married          single           partnership
Political leaning           strongly cons.    conservative     liberal          strongly lib.
Interests                   health, religion  DIY, nature      culture, tech    sports, fashion
Media repertoire            local news,       boulevard, TV    national news,   streaming,
                            print & TV                         online & PBS     magazines
Taxation
  rank-order deviation      minor             weak             weak             weak
  first page with deviation 6                 4                4                6
  exclusive articles        4                 -                -                3
Germany
  rank-order deviation      minor             major            minor            minor
  first page with deviation 2                 1                1                2
  exclusive articles        4                 4                3                3
Alstom
  rank-order deviation      major             weak             weak             major
  first page with deviation 6                 9                7                6
  exclusive articles        6                 0                0                3
Overall, we found only minor differences across the four accounts. For any given search query,
the four agents saw almost the exact same 100 articles (Table 1). The highest deviations were six
exclusive articles for agent A on topic Alstom and four exclusives in various situations (e.g., agent B on
topic Germany). However, these exclusives appeared on less-prominent result pages (i.e., pages six
through ten), which was a common finding for all four agents.
In sum, we found only 30 cases of exclusive articles out of 1200 comparisons, a share of only
2.5%. In other words, either there was almost no personalization, or our training did not work. That said, those 30
articles that were identified as exclusive suggest that the training did indeed work. For example, agent A
(i.e., the elderly female conservative widow) missed some articles from economy outlets but was instead
presented with articles from more general news outlets.
Similarly, the vast majority of articles were shown at the exact same position as for the control-
group agent. That is, the untrained account revealed more or less the same results as each of the four
trained accounts. We refrain from a numerical quantification of rank-order differences. The reason is that
such a quantitative analysis would mislead the reader, since deviations on early result pages can cause
deviations on subsequent result pages, which would bias the results toward the first occurrence of rank-
order deviations. Instead, we provide qualitative estimates of “weak”, “minor”, and “major” deviations
which relate to the rough number of affected articles, namely up to ten, between ten and twenty, and more
than twenty, respectively. In cases where the rank order between a trained and the control-group agent
differed, the differences seemed to stem from updates inside individual articles. In other words, the differences may
have occurred as a result of small time delays in the daily storage of the results page. As indicated by a
timestamp next to an article’s news outlet, Google News seems to prefer newer, and more recently
updated, items. This rationale finds support in the fact that the topics of Alstom and Germany showed
more deviation (the former due to its selling process, the latter due to the ongoing soccer world
championship) than the topic of taxation, which did not reveal many rank-order differences.
Discussion
The two explorative studies provide empirical evidence on how both explicit and implicit forms
of personalization within an online news aggregator (Google News) affect both content and source
diversity. We found only minor effects of personalization on content diversity. While explicit
personalization slightly affected content diversity in that users saw more articles for their preferred topics,
implicit personalization based on manipulations of user behavior did not affect content diversity.
Furthermore, neither type of personalization had any effect on source diversity.
That said, we found a bias in that Google News over-represented certain news outlets and under-
represented other, highly frequented, news outlets. Given the over-represented outlets’ conservative
nature, this bias can be troubling, especially in terms of viewpoint diversity. We can only speculate on the
reasons for this result. First, the over-represented outlets are known to put quite some effort into search-
engine optimization. While this would cast Google News in a rather simplistic light, it seems plausible
from a technological point of view as the outlets offer highly up-to-date information in a machine-
readable manner with all relevant keywords mentioned. Second, Google News may punish outlets with
paywalls, as they diminish a user’s browsing experience. Third, the algorithms may have difficulty
identifying a story’s main topic based on images rather than text and may thus decrease the weight of
image-heavy reporting (e.g., Bild.de). Fourth, some German publishers (e.g., Axel Springer) have been
fighting a legal case regarding Google’s right to include snippets of articles on its platform without
owning the articles’ intellectual property. It is possible that Google News punished those publishers
by partly excluding their content (e.g., Bild.de).
Overall, our findings suggest that the filter-bubble phenomenon may be overestimated in the case
of algorithmic personalization within Google News. In the case of explicit personalization, the share of
political news was higher than on Spiegel Online, a major source for political news in Germany, even for
those users who explicitly stated that they rarely want to see political news. In other words, while
personalization effects were visible (which provides support for the applicability of our method), the
results did not blind out essential shares of information (which the filter-bubble hypothesis would
suggest).
Our study is subject to various limitations. First, we focused on only one news aggregator and are
thereby empirically bound to conclusions about Google News rather than on general digital news
personalization. Second, although we compared different types of both news diversity and
personalization, our setting does not allow for checking the underlying assumption that personalization
within algorithmically curated environments narrows diversity to a stronger extent than, for example,
individual selective exposure throughout daily newspaper consumption. Third, our agent-based testing
approach may have affected the validity of the results. While our approach controls for various influences
that are difficult to hold constant in real life, one disadvantage is that the method produces a highly
artificial environment, in which other possible influences are left out. This may of course affect the
outcome. For example, since Google bundles a user’s account for all of its services, it seems likely that
personalization not only builds upon a user’s surf behavior inside Google News and Google+, but also
inside YouTube or other services. While we tried to account for that in the second study, we cannot
determine whether our selection of actions throughout the training phase was sufficient for an adequate
and common degree of personalization.
Our results are mostly in line with those from similar studies, which strengthens confidence in their
validity. While some studies indeed report minor personalization effects (e.g., in the context of search
engines; Feuz et al., 2011; Hannak et al., 2013), all related studies conclude that “the magnitude of the
effects is relatively modest” (Flaxman et al., 2016, p. 318; also see Cozza, Hoang, Petrocchi, &
Spognardi, 2016). Based on a discussion of the relevant literature, Haim and colleagues (2017, p. 257,
emphasis in original) conclude that “no study can support the claim that Google blinds out (i.e., censors)
specific information.”
Despite the consistency of these findings, there is reason to remain on guard, as even minor
changes in the underlying algorithms could affect the results. In addition, while the empirical evidence for
both search engines and news aggregators is similar, evidence from social network sites is still sparse.
Furthermore, every study, including our own, can only depict a small point in time and is subject to
changes of the object under investigation. It seems thus necessary for the platforms under investigation
(e.g., Google, Facebook) to be more transparent with changes inside algorithms which affect the news
selections users are presented with (Diakopoulos, 2014; Diakopoulos & Koliska, 2016). This is especially
relevant in the context of online news and news diversity, which is of major importance to a society’s
democratic process. Ideas for improving this situation include guidelines on algorithmic decision-making
(Ananny, 2016, p. 108), public committees (Saurwein, Just, & Latzer, 2015, p. 41), or algorithms’
ombudspeople (Diakopoulos & Koliska, 2016, p. 12) who can provide information and apply necessary
adjustments upon inquiry.
References
Ananny, M. (2016). Toward an ethics of algorithms: Convening, observation, probability, and timeliness. Science, Technology & Human Values, 41(1), 93–117. https://doi.org/10.1177/0162243915606523
ARD/ZDF. (2015). MedienNutzerTypologie. Retrieved from http://www.ard-zdf-mnt.de/
Baden, C., & Springer, N. (2015). Conceptualizing viewpoint diversity in news discourse. Journalism.
https://doi.org/10.1177/1464884915605028
Bakshy, E., Messing, S., & Adamic, L. (2015). Exposure to ideologically diverse news and opinion on Facebook. Science, 348(6239), 1130–1132. https://doi.org/10.1126/science.aaa1160
Carlson, M. (2007). Order versus access: News search engines and the challenge to traditional journalistic roles. Media, Culture & Society, 29(6), 1014–1030. https://doi.org/10.1177/0163443707084346
Cozza, V., Hoang, V. T., Petrocchi, M., & Spognardi, A. (2016). Experimental measures of news personalization in Google News. In S. Casteleyn, P. Dolog, & C. Pautasso (Eds.), Current Trends in Web Engineering. Proceedings of the 2nd International Workshop on Mining the Social Web (pp. 93–104). Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-46963-8_8
Diakopoulos, N. (2014). Algorithmic accountability: Journalistic investigation of computational power structures. Digital Journalism, 3, 398–415. https://doi.org/10.1080/21670811.2014.976411
Diakopoulos, N., & Koliska, M. (2016). Algorithmic transparency in the news media. Digital Journalism, 1–20. https://doi.org/10.1080/21670811.2016.1208053
Entman, R. M., & Wildman, S. S. (1992). Reconciling economic and non-economic perspectives on media policy: Transcending the “marketplace of ideas.” Journal of Communication, 42(1), 5–19. https://doi.org/10.1111/j.1460-2466.1992.tb00765.x
Eppler, M. J., & Mengis, J. (2004). The concept of information overload: A review of literature from organization science, accounting, marketing, MIS, and related disciplines. The Information Society, 20(5), 325–344. https://doi.org/10.1080/01972240490507974
Feuz, M., Fuller, M., & Stalder, F. (2011). Personal web searching in the age of semantic capitalism:
Diagnosing the mechanisms of personalisation. First Monday, 16(2).
https://doi.org/10.5210/fm.v16i2.3344
Flaxman, S. R., Goel, S., & Rao, J. M. (2016). Filter bubbles, echo chambers, and online news consumption. Public Opinion Quarterly, 80(S1), 298–320. https://doi.org/10.1093/poq/nfw006
Gitlin, T. (1998). Public spheres or public sphericules. In T. Liebes & J. Curran (Eds.), Media, ritual and identity (pp. 168–174). London: Routledge.
Google. (2017a). How Google News results are selected. Retrieved March 23, 2017, from
https://support.google.com/news/answer/106259?hl=en
Google. (2017b). Languages and regions. Retrieved March 23, 2017, from
https://support.google.com/news/publisher/answer/40237?hl=en
Google. (2017c). Stories on the front page. Retrieved March 23, 2017, from
https://support.google.com/news/publisher/answer/94000?hl=en&ref_topic=2492117
Haim, M., Arendt, F., & Scherr, S. (2017). Abyss or shelter? On Google’s role when googling for suicide. Health Communication, 32(2), 253–258. https://doi.org/10.1080/10410236.2015.1113484
Hannak, A., Sapiezynski, P., Molavi Kakhki, A., Krishnamurthy, B., Lazer, D., Mislove, A., & Wilson, C. (2013). Measuring personalization of web search. In Proceedings of the 22nd international conference on World Wide Web (pp. 527–538). Rio de Janeiro: International World Wide Web Conferences Steering Committee.
Hellman, H. (2001). Diversity – An end in itself? Developing a multi-measure methodology of television programme variety studies. European Journal of Communication, 16(2), 181–208. https://doi.org/10.1177/0267323101016002003
IVW. (2017). Online-Nutzungsdaten. Retrieved from http://ausweisung.ivw-online.de
McDonald, D. G., & Dimmick, J. (2003). The conceptualization and measurement of diversity. Communication Research, 30(1), 60–79. https://doi.org/10.1177/0093650202239026
McQuail, D. (1992). Media performance: Mass communication and the public interest. Thousand Oaks,
CA: Sage.
Napoli, P. M. (1999). Deconstructing the diversity principle. Journal of Communication, 49(4), 7–34. https://doi.org/10.1111/j.1460-2466.1999.tb02815.x
Newman, N., Fletcher, R., Levy, D. A. L., & Nielsen, R. K. (2016). Digital news report 2016. Oxford: Reuters Institute for the Study of Journalism. Retrieved from http://reutersinstitute.politics.ox.ac.uk/sites/default/files/Digital-News-Report-2016.pdf
Oechslein, O., & Hess, T. (2013). Incorporating social networking information in recommender systems: The development of a classification framework. In Proceedings of the 26th Bled eConference (pp. 287–298). Bled, Slovenia. Retrieved from http://aisel.aisnet.org/bled2013/19
Pan, B., Hembrooke, H., Joachims, T., Lorigo, L., Gay, G., & Granka, L. (2007). In Google we trust: Users’ decisions on rank, position, and relevance. Journal of Computer-Mediated Communication, 12(3), 801–823. https://doi.org/10.1111/j.1083-6101.2007.00351.x
Pariser, E. (2011). The Filter Bubble: How the new personalized web is changing what we read and how
we think. New York, NY: Penguin.
Pariser, E. (2015, May 7). Did Facebook’s big study kill my Filter Bubble thesis? Retrieved May 26, 2015, from https://medium.com/backchannel/facebook-published-a-big-new-study-on-the-filter-bubble-here-s-what-it-says-ef31a292da95
Pasquale, F. (2015). The black box society: The secret algorithms that control money and information.
Cambridge, MA: Harvard University Press.
Porto, M. P. (2007). Frame diversity and citizen competence: Towards a critical approach to news quality. Critical Studies in Media Communication, 24(4), 303–321. https://doi.org/10.1080/07393180701560864
Saurwein, F., Just, N., & Latzer, M. (2015). Governance of algorithms: Options and limitations. Info, 17(6), 35–49. https://doi.org/10.1108/info-05-2015-0025
Sinus Markt- und Sozialforschung. (2015). Die Sinus-Milieus. INTEGRAL Wien. Retrieved from
http://www.sinus-institut.de
Strömbäck, J. (2005). In search of a standard: Four models of democracy and their normative implications for journalism. Journalism Studies, 6(3), 331–345. https://doi.org/10.1080/14616700500131950
Sunstein, C. R. (2009). Republic.com 2.0. Princeton, N.J.: Princeton University Press.
Thurman, N. (2011). Making “The Daily Me”: Technology, economics and habit in the mainstream assimilation of personalized news. Journalism, 12(4), 395–415. https://doi.org/10.1177/1464884910388228
Thurman, N., & Schifferes, S. (2012). The future of personalization at news websites. Journalism Studies, 13(5–6), 775–790. https://doi.org/10.1080/1461670X.2012.664341
van Hoof, A. M., Jacobi, C., Ruigrok, N., & van Atteveldt, W. (2014). Diverse politics, diverse news coverage? A longitudinal study of diversity in Dutch political news during two decades of election campaigns. European Journal of Communication, 29(6), 668–686. https://doi.org/10.1177/0267323114545712
Voakes, P. S., Kapfer, J., Kurpius, D., & Chern, D. S.-Y. (1996). Diversity in the news: A conceptual and methodological framework. Journalism & Mass Communication Quarterly, 73(3), 582–593. https://doi.org/10.1177/107769909607300306
Westerståhl, J. (1983). Objective news reporting: General premises. Communication Research, 10(3), 403–424. https://doi.org/10.1177/009365083010003007
Zuiderveen Borgesius, F. J., Trilling, D., Möller, J., Balázs, B., de Vreese, C. H., & Helberger, N. (2016). Should we worry about filter bubbles? Internet Policy Review, 5(1), 1–16. https://doi.org/10.14763/2016.1.401