BURST OF THE FILTER BUBBLE?
Accepted Manuscript
For the published version of this paper, please refer to:
Haim, Mario, Graefe, Andreas, & Brosius, Hans-Bernd (2017). Burst of the Filter Bubble? Effects of
personalization on the diversity of Google News. Digital Journalism.
DOI: 10.1080/21670811.2017.1338145.
Abstract
In offering personalized content geared toward users’ individual interests, recommender systems
are assumed to reduce news diversity and thus lead to partial information blindness (i.e., filter bubbles).
We conducted two exploratory studies to test the effect of both implicit and explicit personalization on the
content and source diversity of Google News. Except for small effects of implicit personalization on
content diversity, we found no support for the filter-bubble hypothesis. We did, however, find a general
bias in that Google News over-represents certain news outlets and under-represents other, highly
frequented, news outlets. The results add to a growing body of evidence, which suggests that concerns
about algorithmic filter bubbles in the context of online news might be exaggerated.
Keywords: online news, personalization, Filter Bubble, diversity, experiment
Burst of the Filter Bubble? Effects of Personalization on the Diversity of Google News
News consumption has changed dramatically over the past years, with an ever-increasing share of
news consumed online. For example, based on a survey of 53,000 news consumers in 26 countries, the
Reuters Institute Digital News Report found that 23% of respondents use online (digital) channels as their
main news source. Another 44% reported that they consider digital and traditional sources equally
(Newman, Fletcher, Levy, & Nielsen, 2016). When consuming news online, roughly 40% of respondents
discover news through search engines, approximately 33% via social network sites, and about one-in-
eight through news aggregators (Newman et al., 2016, p. 93). That is, a large, and ever-increasing,
share of news consumers rely on algorithmically curated environments in which algorithms automatically
select personalized news based on information about individual news consumers. This personalization
could be explicit, based on information deliberately provided by the user, or implicit, based on
information collected from observing that user’s online behavior (Thurman & Schifferes, 2012;
Zuiderveen Borgesius et al., 2016). In either case, the personalized news selection is compiled
automatically; it is not the result of a human editor’s deliberate choices.
Hence, while online news consumption increases, human editors’ sovereignty over diversity has
been decreasing. Diversity is commonly regarded as a key principle of news quality (McQuail, 1992) in
ensuring a well-informed public (Strömbäck, 2005). Diversity can be understood as either source
or content diversity. Source diversity refers to both a news outlet’s inclusion of a multitude of
informational sources as well as a news article’s inclusion of a wide variety of mentioned persons.
Content diversity aims at providing news consumers with a wide range of different fields of interest as
well as a full selection of perspectives on a given topic (Voakes, Kapfer, Kurpius, & Chern, 1996).
Scholars have raised concerns as to whether algorithms value diversity as a key feature of news
quality (Pasquale, 2015). Theoretical concepts such as Pariser’s (2011) filter-bubble hypothesis suggest
that, instead of ensuring diversity, algorithms aim at maximizing economic gain by increasing media
consumption. According to this rationale, algorithms filter out information that is assumed to be of little
interest to individual users while presenting content that users are more likely to consume. For
example, users who have a history of consuming a lot of sports news will receive even more sports news,
presumably at the cost of other topics (e.g., political news).
The present study empirically tests this rationale for the case of the news aggregator Google
News. While personalization also affects news diversity in other digital news environments (e.g., social
network sites, search engines), news aggregators solely focus on the distribution of news. As such, their
underlying algorithms are geared toward news consumption rather than social interaction or the
identification of search query patterns. In particular, we study how online news personalization, both
implicitly and explicitly, affects content and source diversity. In the following, we first discuss relevant
literature on news diversity and algorithmic personalization. Then, we present results from two
explorative empirical studies, one using explicit and the other using implicit personalization, before
drawing overall conclusions in a general discussion.
News Diversity
Mass media, and particularly the multitude and balance of its news, should enable citizens to act
in a well-informed manner, especially when making democratic decisions (e.g., in elections; Westerståhl, 1983).
Among other things, balance of news includes diversity of content, sources, or perspectives, and thus aims
at providing citizens with a broad variety of information. News diversity hence represents a means toward a
broadly informed public and is one of the “fundamental principles underlying evaluations of the
performance of mass media systems” (Napoli, 1999, p. 7). As such, news diversity follows ideals of
deliberative politics and is seen as one of the key dimensions of news quality within any democratic
society around the world (Entman & Wildman, 1992; McQuail, 1992; Porto, 2007; Strömbäck, 2005).
News diversity is a multi-faceted construct that is subject to various interpretations. First,
diversity can aim at a multitude of sources. Such source diversity describes the pluralism of quoted
actors’ affiliations or status positions (Voakes et al., 1996). More source-diverse news thus includes
information from political, economic, non-governmental, and any other affected sources. Yet, source
diversity may also depict the variety of news outlets which are included in a recipient’s news diet,
especially in the context of news aggregators (Thurman, 2011). Second, diversity can refer to the variety
of covered topics. This so-called content diversity oftentimes relates to the mere appearance of topics in
their most basic form, such as “public affairs” or “baseball.” Yet, it may also include the multitude of
perspectives, which ideally represent a democracy’s political spectrum of opinions (Entman & Wildman,
1992). For example, a news outlet may deliberately exclude aspects that are contrary to its own political
view. A content-diverse news outlet, however, should include all available aspects of a given topic. Third,
diversity can relate to a variation of viewpoints (i.e., framing). This viewpoint diversity subsumes
available frames on a given topic and is thus clearly the most demanding type of diversity, both for
journalists to produce and for researchers to measure (Baden & Springer, 2015). However, at the same
time, it provides the most reliable measure in terms of news’ normative requirements toward a
“marketplace of ideas” (Strömbäck, 2005, p. 338).
Source, content, and viewpoint diversity represent constructs for measuring news diversity
(McDonald & Dimmick, 2003). In addition, theoretical scholars have offered normative guidelines for
how to establish news diversity, which largely depend on a given media system. For example, media
systems that aim for horizontal news diversity require the different media outlets to ensure diversity
collectively. In comparison, vertical news diversity requires each single media outlet to provide a
satisfactory multitude of news (e.g., Hellman, 2001). Empirical evidence further suggests that diversity
also depends on other factors such as political trends (van Hoof, Jacobi, Ruigrok, & van Atteveldt, 2014),
the size of media outlets (Voakes et al., 1996), or an individual user’s selective exposure (Napoli, 1999).
The effects of online news consumption on diversity remain largely unclear. On the one hand, the
amount of available news has increased dramatically as boundaries for the production and distribution of
news have decreased (i.e., “information overload”; Eppler & Mengis, 2004). Therefore, a much bigger
diversity should be possible (Carlson, 2007). On the other hand, the available information in its entirety
overwhelms users, who need to rely on filters that reduce complexity and thus provide individual
representations of news diversity (Napoli, 1999). In recent years, algorithms have increasingly taken over
these tasks.
Algorithmic Personalization
Automatically filtered selections of online information (e.g., news) should help internet users to
overcome the overwhelming amount of available information (Carlson, 2007). Such filter algorithms are
often referred to as recommender algorithms, since they recommend personalized content based on
information about individual users. Recommender algorithms typically use information about
users’ interests, preferences, and surf behavior as well as contextual information (e.g., time, location) to
derive optimally tailored results based on various forms of statistical clustering (for an overview, see
Oechslein & Hess, 2013).
The type of information used for personalization depends on an individual platform’s goals and
requirements. For example, Thurman and Schifferes (2012; also see Zuiderveen Borgesius et al., 2016)
distinguish between explicit and implicit personalization. While explicit personalization requires users to
proactively reveal their preferences, implicit personalization is based on observations of an individual
user’s online behavior. In practice, combinations of explicit and implicit personalization are possible as
well.
Algorithms also evaluate how well the filtered results match a user’s needs. For example, an
algorithm might interpret a given user’s click or follow-up action (e.g., a comment or Like) on a
recommended item as an accurate match. Yet, such evaluation processes carry the risk of self-
reinforcement and reduced diversity, which may ultimately lead to partial information blindness. This
rationale has become widely known as the “filter bubble” (Pariser, 2011). Similar theoretical constructs
aim at the increasing chance of like-minded contacts (“echo chambers”; Sunstein, 2009) and limited
public spheres (“sphericules”; Gitlin, 1998). The latter in particular refers to the normative fear of
unknowingly missing information, which prevents individuals from being properly informed and
rational democratic citizens. As such, public-sphere theories can be said to be primarily concerned with
decreasing viewpoint diversity rather than a decreased diversity of either sources or content. That said,
source diversity in particular, in horizontally diverse media systems, goes along with viewpoint diversity,
since different media outlets depict different political perspectives.
Empirical evidence on the existence of filter-bubble effects, especially in the context of news, is
limited. One study found small effects in that Facebook users see a higher-than-average share of posts from
politically like-minded users (Bakshy, Messing, & Adamic, 2015). Yet, the study has faced several
methodological criticisms, such as building upon self-reported political orientation (Pariser, 2015). Apart
from social network sites, personalization effects have been looked at within search engines, revealing
almost no (e.g., Flaxman, Goel, & Rao, 2016; Haim, Arendt, & Scherr, 2017) or only minor (e.g., Feuz,
Fuller, & Stalder, 2011; Hannak et al., 2013) effects of partial information blindness.
The aim of this study is to contribute further evidence by investigating how both implicit and
explicit personalization of an online news aggregator affects both content and source diversity. To the
best of our knowledge, this is the first study to analyze the effects of personalization on news diversity for
news aggregators. For this, we focus on Google News (https://news.google.com/), one of the most-visited
online news aggregators (Newman et al., 2016, p. 12). Google News claims to present headlines which
are selected by computer algorithms “based on your past activity on Google” (Google, 2017a) while at the
same time “working to make sure that [the front page of Google News] reflects a diversity of articles and
sources” (Google, 2017c). Our analyses focus on the German version of Google News. This seemed to be
a reasonable choice, given that we used German IP addresses and media user typologies. That said, we
expect that the results generalize to other countries, since Google gives no reason to expect
differences in the underlying algorithms: its “goal is to offer Google News to all of our users
throughout the world,” with the only difference being that “[e]ach edition is specifically tailored with news for
that audience” (Google, 2017b). As with comparable products and providers, we do not know which
parameters drive personalized outcomes, since the algorithms underlying online news aggregators are
“black boxes” (Pasquale, 2015). Therefore, we can only analyze the effects of personalization on news
diversity based on input-output analyses, for example, by varying a user’s surf behavior or preferences
(i.e., input) and comparing the resulting news offer (i.e., output). For this, we conducted two explorative
studies to control for explicit (Study 1) and implicit (Study 2) personalization.
Study 1
Google News allows users to explicitly select the types of news they are interested in. That is,
users can explicitly personalize their account toward their preferences by specifying the topics they want
to read about more (or less). The goal of this study was to analyze how explicit personalization of Google
News affects the diversity of the presented news articles.
Method
We created three different Google accounts, each of which was personalized for one of the major
topics as suggested by the platform: politics, sports, and entertainment. The preferences for the
politics account were set to “always” show political news but “rarely” include sports or entertainment
news. The sports account was set to “always” show sports news but “rarely” include political news or
entertainment. Similarly, the entertainment account “always” preferred entertainment but “rarely”
political or sports news.
For each account, we stored the articles from the Google News start page once a day (i.e., at 8
p.m.) for the six-day period from May 27 to June 2, 2014. In addition, we stored the front
pages of a neutral account (i.e., Google News without personalization) as well as those from the popular
online news outlets Bild.de and Spiegel Online. The two news outlets are not entirely comparable to a
Google News account given their journalistic arrangement of political articles at the top of the page. Yet,
by ignoring the order of presented articles, and due to the topical generality of the two outlets, their diversity
can serve as a baseline against which to compare the personalized accounts.
Three research assistants read each article and assigned it to one of eight topic categories: politics,
economics, sports, culture, science, lifestyle, miscellaneous, and service. A reliability check on 16
randomly selected articles resulted in a joint-probability of agreement of 78%. For the Google News
accounts, the research assistants additionally coded the source of each article.
Results
The six-day period resulted in a total of 972 news articles across all six news outlets (i.e., four
Google News accounts, Bild.de, and Spiegel Online). The number of articles per account was evenly
distributed: 137 for the politics version, 133 for the sports version, 132 for the entertainment version, and
123 for the neutral (unpersonalized) version. For Bild.de (189) and Spiegel Online (258), the numbers of
articles were larger.
Figure 1 shows the share of articles per topic and news outlet. The results indicate that the explicit
personalization of Google News worked. As expected, for each of the three topics (politics, sports, and
entertainment) the respective personalized account provided a higher share of articles than the remaining
three accounts. For example, 52% of the articles in the politics account were political news, whereas the
share of political news in the remaining accounts ranged from 37% to 39%. The sports version showed
17% of sports news compared to 9% to 12% in the other accounts. Finally, 33% in the entertainment
version were from the preferred category, compared to 19% to 28% in the other versions.
The comparison to the traditional news outlets shows some interesting results. First, the share of
political news is much higher in all four Google News accounts compared to both Bild.de (11%) and
Spiegel Online (26%). Second, both traditional news outlets focus a considerable share of their coverage
on entertainment and other topics.
Figure 1: Percentage of articles per news outlet and topic category
Figure 2 shows the source diversity for the three personalized accounts as well as the
unpersonalized Google News account, measured as the percentage of articles per news outlet. Source
diversity did not significantly vary between the four accounts. Yet, the eleven news outlets that each
account for at least three percent of the total number of articles are responsible for 86% of all
articles at Google News. The top outlets, such as Focus Online (24%) and Die Welt (13%), make up
particularly large shares, which indicates a biased selection of news sources. That is, surprisingly,
these top sources do not represent news outlets with outstanding reach in Germany. For example, in June
of 2014, Focus Online only ranked 11th among Germany’s most-visited news websites (IVW, 2017), Die
Welt ranked even lower (17th). Conversely, usually well-frequented outlets, such as Bild.de, T-Online, RTL,
or Stern.de, seem underrepresented within Google News. Yet, while Google News is under no obligation
to mirror the media system, this result is interesting in that both Focus Online and Die
Welt are rather conservative outlets and are both known for aggressive clickbait headlines and search
engine optimization. The remaining 14% of articles originate from other news outlets.
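The source-concentration measure described above can be sketched as follows; the source labels are hypothetical stand-ins for the stored front pages, not the study’s sample:

```python
from collections import Counter

def outlet_concentration(sources, threshold=0.03):
    """Per-outlet shares, and the combined share of all outlets that
    each account for at least `threshold` of the stored articles."""
    counts = Counter(sources)
    total = len(sources)
    shares = {outlet: n / total for outlet, n in counts.items()}
    major = {o: s for o, s in shares.items() if s >= threshold}
    return major, sum(major.values())

# Hypothetical source labels, one per stored article:
sources = (["focus"] * 24 + ["welt"] * 13 + ["spiegel"] * 10
           + ["outlet%d" % i for i in range(53)])
major, combined = outlet_concentration(sources)
print(major)     # outlets with a share of at least 3%
print(combined)  # share of all articles covered by those outlets
```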
Figure 2: Source diversity of the four Google News accounts
Study 2
Implicit personalization builds upon statistical deductions from observed user and usage data.
Hence, in order for Google News to be able to provide a personalized news offer, it first needs to have the
chance to observe a given user. We thus followed an explorative agent-based testing approach, in which
we modeled various users and their online behavior to study the effects of implicit personalization
qualitatively.
Method
We modeled the online behavior of four virtual agents along various dimensions. Characteristics of the
four virtual agents were derived from two representative German media user typologies (ARD/ZDF,
2015; Sinus Markt- und Sozialforschung, 2015). Each agent stands for an archetypal life situation and
pattern of media use:
Agent A. An elderly female conservative widow,
Agent B. A bourgeois father in his fifties,
Agent C. A forty-year-old job-oriented male single, and
Agent D. A wealthy 30-year-old female marketing manager and early adopter.
For each agent, a new (virtual) computer was set up which was only used for the purpose of this
experiment. All computers shared the same IP address, which was thus held constant. That is, while the identities
suggest different life situations and locations, the computers on which the virtual agents operated were in
the same location in order to minimize IP-address influences. For each agent, we created both a Facebook
and a Google+ account with information about the agents’ age, gender, living situation, education, job,
relationship status as well as favorite books, sports, music, and movies, all according to the agents’
media-user typology characteristics shown in Table 1.
The study consisted of two phases. In accordance with similar studies (Haim et al., 2017; Hannak
et al., 2013) the initial training phase lasted for one week. Throughout this week, we repeatedly (1)
searched Google for five agent-specific search terms and clicked the top three results, (2) expressed
approval (through Likes or +1’s) for various products’ and artists’ profile pages on both Facebook and
Google+ (five each), (3) put five individually favorable products into the shopping cart on Amazon, and
(4) navigated through approximately ten articles on individually favorable media outlets. For example, the
elderly female conservative widow liked religious art, wanted to buy a heating blanket, and used local
news outlets. This training process was repeated every day over the course of one week in June of 2014.
After this week, the test phase required all agents to subsequently enter three ambiguous terms into Google
News that were newsworthy at that time. The search terms were queried in German: taxation
(“Steuer”), allowing for a variety of article topics; Germany (“Deutschland”), allowing for a variety of
focuses, especially due to the ongoing soccer world cup; and Alstom, a French producer of trains largely
in use in Germany, which at the time of the study was under consideration for either being sold to a
foreign competitor or acquiring another company, thus affecting large numbers of German employees and
thereby allowing for a variety of viewpoints.
For each of the three search terms, we stored the first ten result pages. Since each result page
shows ten articles, we thus collected 300 articles (i.e., 3 search terms × 10 result pages × 10 articles) per
agent. In addition, we created a fifth (virtual) computer to serve as a control group. This account received
no training and was used only for the test phase. It thus did not allow for observation and implicit
personalization.
Results
First, we counted each agent’s exclusive entries as compared to the control group. For example,
agent A (i.e., the elderly female conservative widow) saw an article from the conservative newspaper Die
Welt, which was not shown to the control-group agent. Instead, an article from the local news outlet
Augsburger Allgemeine was shown. Second, we estimated rank-order differences relative to the control
group based on a qualitative analysis of the result lists. Exclusive articles affect diversity directly. In
comparison, rank-order differences inhibit diversity indirectly, as news consumers are known to use only
top-ranked results from the first few result pages (Pan et al., 2007).
Table 1: Overview of the employed agents and their personalized results

                            Agent A           Agent B          Agent C          Agent D
Age                         69                57               40               30
Gender                      female            male             male             female
Living situation            alone             w/ wife & kid    alone            w/ partner
Education                   sec. school       sec. school      B.Sc. (IT)       M.Sc. (eco.)
Job                         retired           senior tiler     software dev.    head of PR
Relationship status         widow             married          single           partnership
Political leaning           strongly cons.    conservative     liberal          strongly lib.
Interests                   health, religion  DIY, nature      culture, tech    sports, fashion
Media repertoire            local news,       boulevard, TV    national news,   streaming,
                            print & TV                         online & PBS     magazines
Taxation
  rank-order deviation      minor             weak             weak             weak
  first page with deviation 6                 4                4                6
  exclusive articles        4                 -                -                3
Germany
  rank-order deviation      minor             major            minor            minor
  first page with deviation 2                 1                1                2
  exclusive articles        4                 4                3                3
Alstom
  rank-order deviation      major             weak             weak             major
  first page with deviation 6                 9                7                6
  exclusive articles        6                 0                0                3
Overall, we found only minor differences across the four accounts. For any given search query,
the four agents saw almost the exact same 100 articles (Table 1). The highest deviations were six
exclusive articles for agent A on topic Alstom and four exclusives in various situations (e.g., agent B on
topic Germany). However, these exclusives appeared on less-prominent result pages (i.e., pages six
through ten), which was a common finding for all four agents.
In sum, we found only 30 cases of exclusive articles out of 1200 comparisons, a share of only
2.5%. In other words, either there was almost no personalization, or our training did not work. That said, those 30
articles that were identified as exclusive suggest that the training did indeed work. For example, agent A
(i.e., the elderly female conservative widow) missed some articles from economy outlets but was instead
presented with articles from more general news outlets.
Similarly, the vast majority of articles were shown at the exact same position as for the control-
group agent. That is, the untrained account revealed more or less the same results as each of the four
trained accounts. We refrain from a numerical quantification of rank-order differences. The reason is that
such a quantitative analysis would mislead the reader, since deviations on early result pages can cause
deviations on subsequent result pages, which would bias the results toward the first occurrence of rank-
order deviations. Instead, we provide qualitative estimates of “weak”, “minor”, and “major” deviations
which relate to the rough number of affected articles, namely up to ten, between ten and twenty, and more
than twenty, respectively. In cases where the rank order between a trained and the control-group agent
differed, the differences seemed to stem from updates inside individual articles. In other words, the differences may
have occurred as a result of small time delays in the daily storage of the results page. As indicated by a
timestamp next to an article’s news outlet, Google News seems to prefer newer, and more recently
updated, items. This rationale finds support in the fact that the topics of Alstom and Germany showed
more deviation (the former due to its selling process, the latter due to the ongoing soccer world
championship) than the topic of taxation, which did not reveal many rank-order differences.
Discussion
The two explorative studies provide empirical evidence on how both explicit and implicit forms
of personalization within an online news aggregator (Google News) affect both content and source
diversity. We found only minor effects of personalization on content diversity. While explicit
personalization slightly affected content diversity in that users saw more articles for their preferred topics,
implicit personalization based on manipulations of user behavior did not affect content diversity.
Furthermore, neither type of personalization had any effect on source diversity.
That said, we found a bias in that Google News over-represented certain news outlets and under-
represented other, highly frequented, news outlets. Given the over-represented outlets’ conservative
nature, this bias can be troubling, especially in terms of viewpoint diversity. We can only speculate on the
reasons for this result. First, the over-represented outlets are known to put quite some effort into search-
engine optimization. While this would cast Google News in a rather simplistic light, it seems plausible
from a technological point of view as the outlets offer highly up-to-date information in a machine-
readable manner with all relevant keywords mentioned. Second, Google News may punish outlets with
paywalls, as they diminish a user’s browsing experience. Third, the algorithms may have difficulty
identifying a story’s main topic based on images rather than text and may thus decrease the weight of
image-heavy reporting (e.g., Bild.de). Fourth, some German publishers (e.g., Axel Springer) have been
fighting a legal case regarding Google’s right to include snippets of articles on its platform without
owning the articles’ intellectual property. It is possible that Google News punished those publishers
by partly excluding their content (e.g., Bild.de).
Overall, our findings suggest that the filter-bubble phenomenon may be overestimated in the case
of algorithmic personalization within Google News. In the case of explicit personalization, the share of
political news was higher than on Spiegel Online, a major source for political news in Germany, even for
those users who explicitly stated that they rarely want to see political news. In other words, while
personalization effects were visible (which provides support for the applicability of our method), the
results did not blind out essential shares of information (which the filter-bubble hypothesis would
suggest).
Our study is subject to various limitations. First, we focused on only one news aggregator and are
thereby empirically bound to conclusions about Google News rather than on general digital news
personalization. Second, although we compared different types of both news diversity and
personalization, our setting does not allow for checking the underlying assumption that personalization
within algorithmically curated environments narrows diversity to a stronger extent than, for example,
individual selective exposure throughout daily newspaper consumption. Third, our agent-based testing
approach may have affected the validity of the results. While our approach controls for various influences
that are difficult to hold constant in real life, one disadvantage is that the method produces a highly
artificial environment, in which other possible influences are left out. This may of course affect the
outcome. For example, since Google bundles a user’s account for all of its services, it seems likely that
personalization not only builds upon a user’s surf behavior inside Google News and Google+, but also
inside YouTube or other services. While we tried to account for that in the second study, we cannot
determine whether our selection of actions throughout the training phase was sufficient for an adequate
and common degree of personalization.
Our results are mostly in line with those from similar studies, which strengthens confidence in their
validity. While some studies indeed report minor personalization effects (e.g., in the context of search
engines; Feuz et al., 2011; Hannak et al., 2013), all related studies conclude that “the magnitude of the
effects is relatively modest” (Flaxman et al., 2016, p. 318; also see Cozza, Hoang, Petrocchi, &
Spognardi, 2016). Based on a discussion of the relevant literature, Haim and colleagues (2017, p. 257,
emphasis in original) conclude that “no study can support the claim that Google blinds out (i.e., censors)
specific information.”
Despite the consistency of these findings, there is reason to remain on guard, as even minor
changes in the underlying algorithms could affect the results. In addition, while the empirical evidence for
both search engines and news aggregators is similar, evidence from social network sites is still sparse.
Furthermore, every study, including our own, can only depict a small point in time and is subject to
changes of the object under investigation. It seems thus necessary for the platforms under investigation
(e.g., Google, Facebook) to be more transparent with changes inside algorithms which affect the news
selections users are presented with (Diakopoulos, 2014; Diakopoulos & Koliska, 2016). This is especially
relevant in the context of online news and news diversity, which is of major importance to a society’s
democratic process. Ideas for improving this situation include guidelines on algorithmic decision-making
(Ananny, 2016, p. 108), public committees (Saurwein, Just, & Latzer, 2015, p. 41), or algorithms’
ombudspeople (Diakopoulos & Koliska, 2016, p. 12) who can provide information and apply necessary
adjustments upon inquiry.
References
Ananny, M. (2016). Toward an ethics of algorithms: Convening, observation, probability, and timeliness. Science, Technology & Human Values, 41(1), 93–117. https://doi.org/10.1177/0162243915606523
ARD/ZDF. (2015). MedienNutzerTypologie. Retrieved from http://www.ard-zdf-mnt.de/
Baden, C., & Springer, N. (2015). Conceptualizing viewpoint diversity in news discourse. Journalism.
https://doi.org/10.1177/1464884915605028
Bakshy, E., Messing, S., & Adamic, L. (2015). Exposure to ideologically diverse news and opinion on Facebook. Science, 348(6239), 1130–1132. https://doi.org/10.1126/science.aaa1160
Carlson, M. (2007). Order versus access: News search engines and the challenge to traditional journalistic roles. Media, Culture & Society, 29(6), 1014–1030. https://doi.org/10.1177/0163443707084346
Cozza, V., Hoang, V. T., Petrocchi, M., & Spognardi, A. (2016). Experimental measures of news personalization in Google News. In S. Casteleyn, P. Dolog, & C. Pautasso (Eds.), Current Trends in Web Engineering. Proceedings of the 2nd International Workshop on Mining the Social Web (pp. 93–104). Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-46963-8_8
Diakopoulos, N. (2014). Algorithmic accountability: Journalistic investigation of computational power structures. Digital Journalism, 3, 398–415. https://doi.org/10.1080/21670811.2014.976411
Diakopoulos, N., & Koliska, M. (2016). Algorithmic transparency in the news media. Digital Journalism, 1–20. https://doi.org/10.1080/21670811.2016.1208053
Entman, R. M., & Wildman, S. S. (1992). Reconciling economic and non-economic perspectives on media policy: Transcending the “marketplace of ideas.” Journal of Communication, 42(1), 5–19. https://doi.org/10.1111/j.1460-2466.1992.tb00765.x
Eppler, M. J., & Mengis, J. (2004). The concept of information overload: A review of literature from organization science, accounting, marketing, MIS, and related disciplines. The Information Society, 20(5), 325–344. https://doi.org/10.1080/01972240490507974
Feuz, M., Fuller, M., & Stalder, F. (2011). Personal web searching in the age of semantic capitalism:
Diagnosing the mechanisms of personalisation. First Monday, 16(2).
https://doi.org/10.5210/fm.v16i2.3344
Flaxman, S. R., Goel, S., & Rao, J. M. (2016). Filter bubbles, echo chambers, and online news consumption. Public Opinion Quarterly, 80(S1), 298–320. https://doi.org/10.1093/poq/nfw006
Gitlin, T. (1998). Public spheres or public sphericules. In T. Liebes & J. Curran (Eds.), Media, ritual and identity (pp. 168–174). London: Routledge.
Google. (2017a). How Google News results are selected. Retrieved March 23, 2017, from
https://support.google.com/news/answer/106259?hl=en
Google. (2017b). Languages and regions. Retrieved March 23, 2017, from
https://support.google.com/news/publisher/answer/40237?hl=en
Google. (2017c). Stories on the front page. Retrieved March 23, 2017, from
https://support.google.com/news/publisher/answer/94000?hl=en&ref_topic=2492117
Haim, M., Arendt, F., & Scherr, S. (2017). Abyss or shelter? On Google’s role when googling for suicide. Health Communication, 32(2), 253–258. https://doi.org/10.1080/10410236.2015.1113484
Hannak, A., Sapiezynski, P., Molavi Kakhki, A., Krishnamurthy, B., Lazer, D., Mislove, A., & Wilson, C. (2013). Measuring personalization of web search. In Proceedings of the 22nd international conference on World Wide Web (pp. 527–538). Rio de Janeiro: International World Wide Web Conferences Steering Committee.
Hellman, H. (2001). Diversity – An end in itself? Developing a multi-measure methodology of television programme variety studies. European Journal of Communication, 16(2), 181–208. https://doi.org/10.1177/0267323101016002003
IVW. (2017). Online-Nutzungsdaten. Retrieved from http://ausweisung.ivw-online.de
McDonald, D. G., & Dimmick, J. (2003). The conceptualization and measurement of diversity. Communication Research, 30(1), 60–79. https://doi.org/10.1177/0093650202239026
McQuail, D. (1992). Media performance: Mass communication and the public interest. Thousand Oaks,
CA: Sage.
Napoli, P. M. (1999). Deconstructing the diversity principle. Journal of Communication, 49(4), 7–34. https://doi.org/10.1111/j.1460-2466.1999.tb02815.x
Newman, N., Fletcher, R., Levy, D. A. L., & Nielsen, R. K. (2016). Digital news report 2016. Oxford: Reuters Institute for the Study of Journalism. Retrieved from http://reutersinstitute.politics.ox.ac.uk/sites/default/files/Digital-News-Report-2016.pdf
Oechslein, O., & Hess, T. (2013). Incorporating social networking information in recommender systems: The development of a classification framework. In Proceedings of the 26th Bled eConference (pp. 287–298). Bled, Slovenia. Retrieved from http://aisel.aisnet.org/bled2013/19
Pan, B., Hembrooke, H., Joachims, T., Lorigo, L., Gay, G., & Granka, L. (2007). In Google we trust: Users’ decisions on rank, position, and relevance. Journal of Computer-Mediated Communication, 12(3), 801–823. https://doi.org/10.1111/j.1083-6101.2007.00351.x
Pariser, E. (2011). The Filter Bubble: How the new personalized web is changing what we read and how
we think. New York, NY: Penguin.
Pariser, E. (2015, May 7). Did Facebook’s big study kill my Filter Bubble thesis? Retrieved May 26, 2015, from https://medium.com/backchannel/facebook-published-a-big-new-study-on-the-filter-bubble-here-s-what-it-says-ef31a292da95
Pasquale, F. (2015). The black box society: The secret algorithms that control money and information.
Cambridge, MA: Harvard University Press.
Porto, M. P. (2007). Frame diversity and citizen competence: Towards a critical approach to news quality. Critical Studies in Media Communication, 24(4), 303–321. https://doi.org/10.1080/07393180701560864
Saurwein, F., Just, N., & Latzer, M. (2015). Governance of algorithms: Options and limitations. Info, 17(6), 35–49. https://doi.org/10.1108/info-05-2015-0025
Sinus Markt- und Sozialforschung. (2015). Die Sinus-Milieus. INTEGRAL Wien. Retrieved from
http://www.sinus-institut.de
Strömbäck, J. (2005). In search of a standard: Four models of democracy and their normative implications for journalism. Journalism Studies, 6(3), 331–345. https://doi.org/10.1080/14616700500131950
Sunstein, C. R. (2009). Republic.com 2.0. Princeton, N.J.: Princeton University Press.
Thurman, N. (2011). Making “The Daily Me”: Technology, economics and habit in the mainstream assimilation of personalized news. Journalism, 12(4), 395–415. https://doi.org/10.1177/1464884910388228
Thurman, N., & Schifferes, S. (2012). The future of personalization at news websites. Journalism Studies, 13(5–6), 775–790. https://doi.org/10.1080/1461670X.2012.664341
van Hoof, A. M., Jacobi, C., Ruigrok, N., & van Atteveldt, W. (2014). Diverse politics, diverse news coverage? A longitudinal study of diversity in Dutch political news during two decades of election campaigns. European Journal of Communication, 29(6), 668–686. https://doi.org/10.1177/0267323114545712
Voakes, P. S., Kapfer, J., Kurpius, D., & Chern, D. S.-Y. (1996). Diversity in the news: A conceptual and methodological framework. Journalism & Mass Communication Quarterly, 73(3), 582–593. https://doi.org/10.1177/107769909607300306
Westerståhl, J. (1983). Objective news reporting: General premises. Communication Research, 10(3), 403–424. https://doi.org/10.1177/009365083010003007
Zuiderveen Borgesius, F. J., Trilling, D., Möller, J., Balázs, B., de Vreese, C. H., & Helberger, N. (2016). Should we worry about filter bubbles? Internet Policy Review, 5(1), 1–16. https://doi.org/10.14763/2016.1.401