Conceptualizing Visual Analytic Interventions
for Content Moderation
Sahaj Vaidya, Jie Cai, Soumyadeep Basu, Azadeh Naderi, Donghee Yvette Wohn and Aritra Dasgupta
Abstract
—Modern social media platforms like Twitch, YouTube, etc., embody an open space for content creation and consumption.
However, an unintended consequence of such content democratization is the proliferation of toxicity and abuse that content creators
get subjected to. Commercial and volunteer content moderators play an indispensable role in identifying bad actors and minimizing the
scale and degree of harmful content. Moderation tasks are often laborious, complex, and even if semi-automated, they involve high-
consequence human decisions that affect the safety and popular perception of the platforms. In this paper, through an interdisciplinary
collaboration among researchers from social science, human-computer interaction, and visualization, we present a systematic
understanding of how visual analytics can help in human-in-the-loop content moderation. We contribute a characterization of the
data-driven problems and needs for proactive moderation and present a mapping between the needs and visual analytic tasks
through a task abstraction framework. We discuss how the task abstraction framework can be used for transparent moderation,
design interventions for moderators’ well-being, and ultimately, for creating futuristic human-machine interfaces for data-driven content
moderation.
Index Terms—Content Moderation, Social Media, Task Abstractions, Real-time Decision-Making
1 INTRODUCTION
Content moderation has emerged as a major challenge confronting the
safety and acceptance of modern social media platforms, like Facebook,
Twitter, YouTube, Twitch, etc. Companies are increasingly allocating
valuable resources, in terms of building automated models [10,23, 24]
and training or hiring human moderators [43, 47] to deal with the
growing menace of negativity and toxicity online. Data-driven ap-
proaches, like those based on machine learning, have become necessary
for automatically detecting content that violates community guidelines.
However, these approaches remain opaque, unaccountable, and poorly
understood [19]. Additionally, automated moderation is not sufficient
due to the inherent complexity and ambiguity of moderation tasks [44].
In this paper, through interdisciplinary collaboration among researchers
from social science, human-computer interaction, and visualization,
we study human-in-the-loop content moderation processes through the
lens of visual analytics. We analyze how visual analytic interventions
can empower content moderators with greater data-driven awareness
about who to monitor, what kind of messages need attention, and how
to ensure transparent implementation of rules and policies (Figure 1).
While the term “content” can be broadly interpreted, we focus
our discussion on moderation activities in platforms that involve syn-
chronous communication among users of live-streaming platforms like
Twitch, YouTube, Discord, Clubhouse, etc. For moderators, the real-
time interactions and the need to make consequential decisions with
very limited lead time can often lead to high cognitive load [7] and take
an emotional toll [49]. The conventional understanding is that modera-
tion of online conversations in live-streaming platforms is inherently
reactive, where moderators see and then react to content generated by
users, typically by removing it. However, a significant portion of
work performed by volunteer moderators is social and communicative
in nature [47]: moderation decisions need to be transparently com-
municated to the users and there is a high consequence for decisions
Sahaj Vaidya is with NJIT. E-mail: ssv47@njit.edu.
Jie Cai is with NJIT. Email: jie.cai@njit.edu.
Soumyadeep Basu is with NJIT. Email: sb2356@njit.edu.
Azadeh Nadari is with NJIT. Email: azadeh.nadari7@gmail.com.
Donghee Yvette Wohn is with NJIT. Email: donghee.y.wohn@njit.edu.
Aritra Dasgupta is with NJIT. E-mail: aritra.dasgupta@njit.edu.
Manuscript received xx xxx. 201x; accepted xx xxx. 201x. Date of Publication
xx xxx. 201x; date of current version xx xxx. 201x. For information on
obtaining reprints of this article, please send e-mail to: reprints@ieee.org.
Digital Object Identifier: xx.xxxx/TVCG.201x.xxxxxxx
that can be perceived as unfair or incorrect. A shared vision among
researchers in content moderation and visualization, who are co-authors
of this paper, is that access to visual analytic techniques has a transfor-
mative potential on moderation activities in live-streaming platforms.
Visual analytics tools and interfaces will allow moderators to summa-
rize conversations, interpret and reason about why automated methods
might have flagged certain messages, and ultimately, engage in a more
proactive, data-driven moderation process.
To realize this vision, in this paper, we discuss the results from our
six-month-long collaborative effort towards distilling the data-driven
problems and corresponding visual analytic interventions for proac-
tive content moderation. Following Munzner’s nested model [34],
we first analyze the content moderation goals and the associated data
abstraction. Next, we contribute a visual analytic task abstraction frame-
work for mapping the problems and challenges to concrete moderators’
decision-making tasks. We also discuss the applications and implica-
tions of our framework for future research on data-driven, human-in-
the-loop content moderation processes.
2 PROBLEM CHARACTERIZATION
Grimmelmann [20] defines moderation as “the governance mechanisms
that structure participation in a community to facilitate cooperation and
prevent abuse.” With the proliferation of online communities, human moderators are vastly outnumbered by the volume of user-generated content and the negativity it carries, a pressing concern given that content creation is growing at an exponential pace and is a core element of many major informational and social platforms today. To reduce online
negativity, commercial platforms apply many techniques to filter abu-
sive language, such as improving algorithms and applying automation
tools [10,23]. Though these automated tools can identify new instances
of negativity such as harassment and hate speech with pattern matching,
violators constantly seek ways to circumvent the algorithms and evade the tools with lexical variants [9]. To supplement algorithmic moderation,
platforms also rely on human moderators to remove flagged content or
review instances in context-sensitive situations.
2.1 Moderation Goals and Challenges
The moderation process involves how human moderators govern both content and community members, and how the standards for that governance are developed. Research on how human moderators handle offensive content and users falls into two main threads: proactive prevention mechanisms and reactive punishment mechanisms. One thread focuses on proactively preventing offensive behaviors via norm-setting, such as setting a good example in the chatroom to influence other viewers in live streaming chat [7, 46], or engaging in rule development [47].
Fig. 1. Mapping between content moderation goals and visual analytic tasks. Scenarios illustrating how moderators can leverage the expressive power of visualizations for making offline and real-time decisions about the who, what, and how dimensions of data-driven moderation. Panels (a), (b), and (c) pair message-, user-, and rule-centered scenarios with the corresponding data entities (Messages, Users, Rules), visual analytic tasks (M1–M2, U1–U2, R1–R2), and moderation goals (G1–G4).
Another thread of research focuses on reactively removing content
and punishing users, such as deleting content and banning users [11]
and explaining and communicating rules to violators [7]. This thread
of research also explores how moderators collaborate with automated
tools [6, 23], and how to use computational approaches to automat-
ically detect and filter harmful content [3]. Drawing on empirical research about content moderation and Grimmelmann's goal of creating a productive, open, and accessible online community [20], we summarize the moderation goals as follows:
G1: Get rid of harmful messages/comments and users while, at the same time, curating valuable information in the community [11].
G2: Retain newcomers and foster the community via interaction and engagement [7, 47].
G3: Distinguish between good and bad actors and punish the latter, but avoid excessive punishment of unintentional or first-time violators [4, 8].
G4: Develop and clarify the moderation guidelines and maintain the transparency of moderation [5, 26].
Much of the existing research explores how to achieve these goals
with varied socio-technical configurations. Our focus is to explore
the influence of data-driven methods on human moderators' decision-making processes. We address questions such as: How do moderators use aggregated information about users to guide their moderation actions? How can visualization help moderators facilitate moderation in context-sensitive situations?
2.2 Data Abstraction
To address the goals and research questions, we first describe the specific data entities that can be considered the building blocks of algorithmic moderation tools and that can be used to develop human-in-the-loop moderation tools. The moderation process comprises three main data entities: Messages, User Profiles, and Rules.
Messages (M): Messages encode the response of the users towards
the actual content and their interactions with other users of a channel.
Moderators can leverage text-based analysis of messages to analyze
and monitor the conversations on the channel. This monitoring of
chat helps to flag messages and detect violations of established rules
or signals of abusive content. Human moderators have to accurately identify negative behaviors in time and rapidly take appropriate actions to prevent the spread of abusive content. In live streaming environments such as Twitch [7], this is cognitively demanding because a large volume of messages is posted in a short span of time, making it difficult for moderators to make timely decisions. Platforms often employ crowd-
sourced moderation strategies in the form of flagging tools that allow
users to express concerns about potentially offensive content and report
them to moderators [28]. This strategy does not perform effectively in
the context of real-time moderation because of the time gap between
reporting bad content and reviewing it [49].
User Profiles (U): Users of social media platforms are central to the
moderation process. The goal is to encourage user participation in
online communities by providing them value-based content. Moder-
ators can characterize the users based on their engagement in online
activities. On the other hand, moderators can also punish those users
who do not abide by the norms. Data such as message histories and replies are not accessible to ordinary users but can be accessed by moderators. Online communities do not share user information across micro-communities that have their own customized community guidelines (one user's history in one micro-community cannot be seen by moderators from another). For live voice moderation, such as Discord voice chat and Clubhouse, it is even more challenging to collect voice information for moderators to make decisions [25]. The history of a user's prior be-
havior is obtained from archival data and does not change dynamically
with time.
Rules (R): Rules define the code of conduct regarding a user's online
behavior. Moderators make data-driven decisions by matching user profiles against the rules set for a particular stream. The severity of punishment varies based on the user profile and the importance of the rule [5]. A key challenge faced by moderators is to go over real-time messages and fine-tune their mental model for applying chat rules by assessing the severity of each violation [6]. This exhausting and tricky process of picking out problematic chat messages from among all the others becomes even more demanding when a myriad of messages must be scrutinized
promptly. Similar to user profiles, rules defining online behavior are
mostly static and do not evolve in real-time.
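To make these entity definitions concrete, the following minimal sketch, written in Python under our own assumptions (the field names are illustrative and not prescribed by any platform or by this paper), shows one way the three data entities could be structured in a human-in-the-loop moderation tool.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

@dataclass
class Message:
    """A chat message: a user's response to channel content and to other users."""
    user_id: str
    channel_id: str
    text: str
    timestamp: datetime
    flagged: bool = False                      # set by automated filters or viewer reports

@dataclass
class UserProfile:
    """Archival data about a user's prior behavior within one micro-community."""
    user_id: str
    account_created: datetime
    prior_violations: List[str] = field(default_factory=list)
    messages_sent: int = 0

@dataclass
class Rule:
    """A code-of-conduct rule with a severity weight used when grading sanctions."""
    rule_id: str
    description: str
    keywords: List[str] = field(default_factory=list)   # simple lexical triggers (illustrative)
    severity: int = 1                                    # higher values imply harsher sanctions
```

Separating messages (which arrive in real time) from user profiles and rules (which are largely static) mirrors the distinction drawn above between dynamic and archival data.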
3 VISUAL ANALYTIC TASK ABSTRACTION
In this section, we map the moderation goals to entity-level visual ana-
lytic tasks, focusing on message analysis (M1, M2), user profiling (U1,
U2), and rule building (R1, R2). We discuss the role of analytical
methods and visualization for addressing moderation goals using examples from the visual analytics literature (a detailed list is included in the supplemental material), and also highlight key gaps and challenges.
Fig. 2. Examples of techniques for visualizing conversations. (a) ConToVi uses a circular plot to identify shifts in conversation topics for navigating online discussions [17]; (b) Park et al. describe a user-centric, scatter-plot-based design for selecting flagged comments with the help of comment analytic scores, which can detect only a small set of messages because of keyword limitations [41]; (c) Seebacher et al. use violin plots to display relevant conversational dynamics while fading out non-relevant ones [45].
3.1 Message Analysis Tasks
M1: Reasoning about Violations: The real-time nature of streaming data requires the moderator to keep pace with processing and analyzing the continuous data. This task aims to achieve the goal of filtering out abusive messages and providing users with quality content (G1). The task of determining violations involves two components: monitoring messages to identify anomalies and identifying message attributes. If we look at the two scenarios in Figure 1a, monitoring helps to flag messages based on their content, while identifying message characteristics is another way to detect patterns in the chat streams. Annotating and deleting spam messages [36, 46] through user intervention can help moderators recognize message signatures that should be flagged when they recur, as in the sketch below.
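As a rough illustration of the monitoring component of M1, and not a method proposed in this paper, the sketch below flags an incoming message when two simple attributes suggest a possible violation: a hit against a placeholder banned-term list or an unusually high per-user message rate. The term list, thresholds, and function name are assumptions made for the example.

```python
import re
from collections import defaultdict, deque
from time import time

# Illustrative assumptions: a placeholder lexicon and rate thresholds.
BANNED_TERMS = {"badword1", "badword2"}
RATE_WINDOW_SEC = 10     # sliding window length in seconds
RATE_LIMIT = 5           # messages per window before a user is flagged

recent = defaultdict(deque)   # user_id -> timestamps of that user's recent messages

def flag_message(user_id, text, now=None):
    """Return the reasons (if any) why this message should be surfaced to a moderator."""
    now = time() if now is None else now
    reasons = []

    # Attribute 1: lexical match against a banned-term list.
    tokens = set(re.findall(r"\w+", text.lower()))
    if tokens & BANNED_TERMS:
        reasons.append("banned-term")

    # Attribute 2: message-rate anomaly (possible spam or chat flooding).
    timestamps = recent[user_id]
    timestamps.append(now)
    while timestamps and now - timestamps[0] > RATE_WINDOW_SEC:
        timestamps.popleft()
    if len(timestamps) > RATE_LIMIT:
        reasons.append("rate-anomaly")

    return reasons
```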
The dynamic nature of streaming data makes it difficult to analyze
the chats for offensive content and make timely decisions. Therefore,
platforms are increasingly turning to automated systems to detect abu-
sive content within a shorter duration [40]. When moderators engage in the task of monitoring messages, one way they can overcome the hurdle of information overload is by leveraging visual analytics methods to review contextual information. Several visual
analytic approaches provide support to analyze real-time content us-
ing interactivity for anomaly detection in the message streams [1,30].
The Sedimentation View [17] shown in Figure 2a is an example of
representing only the relevant pieces of communication from the entire
conversation. T-Cal [18] is a timeline-based approach that highlights
areas with high information density. This provides a visual cue to the
moderator to monitor those highlighted regions closely.
The challenge in visualizing the dynamics of chat streams lies in automatically identifying appropriate cues from the message dynamics. However, incorporating this information into an automated tool introduces the risk of inaccurate results. Additionally, because of the many nuances of vocabulary and language, automated content moderation struggles to derive contextual insights from the messages.
M2: Summarizing Real-time Conversations: Communication via stream chat involves interactions among multiple users and generates a large volume of messages. Because of this information density, simplification is required. The topic summaries identified through this task help moderators set the tone of the conversation and maintain the regulations that provide a positive atmosphere for online discussion, potentially offering insights for fostering the community via interaction and engagement (G2).
Generating a summary of conversations involves two components:
text summarization and topic identification. Automatic summarization
of messages in a channel is valuable to the moderators but it has certain
limitations. The summarization of conversation necessitates addressing
the trade-off between information loss (e.g., leaving out potentially
relevant information) and abstraction of key topical patterns so that
harmful content can be quickly detected [33]. As chats are dynamic,
it is possible to have multiple topics being discussed simultaneously
and changing with time. Visual analytics interfaces (Figure 2c) can
help identify the shifts in conversation topics for navigating online
discussions. Approaches like trains of thoughts [48] and conversation
clusters [2] group messages of the same theme together. These ap-
proaches can allow moderators to have a better understanding of the
topics of conversation.
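A minimal sketch of the topic-identification component is shown below, assuming scikit-learn is available; the TF-IDF plus k-means pipeline is a generic illustration, not the approach of the systems cited above.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

def summarize_topics(messages, n_topics=3, top_k=5):
    """Group chat messages into rough themes and return the top keywords per theme."""
    vectorizer = TfidfVectorizer(stop_words="english")
    X = vectorizer.fit_transform(messages)                 # one TF-IDF row per message
    model = KMeans(n_clusters=n_topics, n_init=10, random_state=0).fit(X)
    terms = vectorizer.get_feature_names_out()
    topics = []
    for center in model.cluster_centers_:                  # highest-weighted terms per cluster
        top = center.argsort()[::-1][:top_k]
        topics.append([terms[i] for i in top])
    return topics   # one keyword list per conversational theme
```

In a moderation interface, the returned keyword lists could label clusters of a dynamic conversation view so that moderators see at a glance which themes dominate the chat.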
Using a visual analytics system to explore conversations by topic helps extract relevant linguistic features from the chat.
With all the approaches discussed above, scalability and adaptation of
visualizations to changes in dynamic conversation streams [13] remain
a challenge. This challenge needs to be handled by assessing the
perceptual limitations of the alternative designs in communicating the
number, frequency, and degree of changes in conversation streams.
3.2 User Profiling Tasks
U1: Ranking user profiles using prior history: This task aims to analyze data about users' past online behavior, which includes analyzing users' historical data and ranking users based on their profiles. Studying users' online behavior helps moderators identify the types of users they need to pay special attention to (G3). Consequently, this task can help moderators foster a healthy community of users and retain their participation (G2).
The collection of users' historical data involves studying their characteristics, interests, ratings, usage patterns, and chat logs to recognize behavioral patterns. It facilitates the understanding of their online behavior on different social media platforms, reveals insights into characteristics of their communication, and extracts relevant information about the conversations and topics under discussion. Scoring profiles based on recently opened accounts and user activities [6] helps moderators calibrate punishment based on the context and weight of the violation. This further helps to determine the type of punishment for the user when situations such as the one described in Figure 1b arise. An example is the work by Oliva et al., which ranks user profiles based on their toxicity level [39]. This can be useful for a moderator who needs to closely monitor users with high profile toxicity scores.
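A hedged sketch of how such a ranking could be computed from archival data follows; the scoring weights, field names, and the new-account penalty are illustrative assumptions rather than values taken from the cited work.

```python
from datetime import datetime, timezone

def profile_risk_score(profile):
    """Combine simple signals from a user's history into one ranking score (illustrative weights)."""
    now = datetime.now(timezone.utc)
    account_age_days = (now - profile["account_created"]).days   # expects a timezone-aware datetime
    new_account_penalty = 1.0 if account_age_days < 7 else 0.0   # assumption: new accounts carry more risk
    violation_weight = 2.0 * len(profile["prior_violations"])    # assumption: each past violation adds 2
    toxicity = profile.get("mean_toxicity", 0.0)                 # e.g., averaged output of a toxicity classifier
    return violation_weight + new_account_penalty + toxicity

def rank_profiles(profiles):
    """Order profiles so that the users most in need of attention surface first."""
    return sorted(profiles, key=profile_risk_score, reverse=True)
```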
For the methods described above, manual tagging and examination of chat logs remain a challenge. This task is often limited by the ability of automated programs to process large amounts of archival user data and by the algorithm used for ranking.
U2: Reasoning about audience sentiment: Research in NLP has investigated the problem of sentiment analysis, in which sentiment is generally classified as positive, negative, or neutral at the granularity of words. The task of utilizing users' sentiments serves the purpose of determining the level of toxicity. This task can potentially help moderators better understand the intended meaning of each message and avoid excessive punishment (G3). As moderators have expressed, it is sometimes challenging to distinguish between a joke and a serious violation; making use of user sentiments helps avoid such misjudgments.
Moderators often face challenges when detecting abusive content
from online communications. They try to mitigate the problem by
implementing refined filters [22]. But these systems often fail due to
a lack of correlation between the semantic space and user sentiments.
Several authors have proposed solutions for semi-automatic detection
of toxicity. For example, the interface CommentIQ in Figure 2b en-
ables flagging of messages based on keywords [41]. However, this
approach can detect only a small set of messages because of keyword
limitations [37]. Nobata et al. [38] trained a machine learning model to
identify hate speech using a custom-built lexicon. Some other papers
used bag-of-words methods to detect cyber-bullying [16, 42] on social media platforms. All these lexicon-based approaches share the drawback of a limited vocabulary. Chatzakou et al. [12] considered senti-
ment as an input to their neural network but did not discuss the impact
on user perception. Visual analytic techniques can enable moderators
to draw inferences based on group sentiment within their audiences,
where groups can be defined based on interests, behavior, etc.
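As an illustration of group-level sentiment aggregation (assuming the vaderSentiment package; any sentence-level sentiment model would serve), the sketch below averages per-message sentiment scores within audience groups that are defined externally, for example by interest or behavior.

```python
from collections import defaultdict
from statistics import mean

# Assumes the vaderSentiment package; any sentence-level sentiment model would do.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

def group_sentiment(messages):
    """Aggregate per-message sentiment into a mean score per audience group.

    Each message is assumed to be a dict like {"group": "subscribers", "text": "..."},
    where the grouping (by interest, behavior, etc.) is decided elsewhere.
    """
    scores_by_group = defaultdict(list)
    for message in messages:
        compound = analyzer.polarity_scores(message["text"])["compound"]  # -1 (negative) .. +1 (positive)
        scores_by_group[message["group"]].append(compound)
    return {group: mean(scores) for group, scores in scores_by_group.items()}
```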
3.3 Rule-Building Tasks
R1: Augmenting the Rule Book: Rules are made to educate the plat-
form users about norms for expected behavior. These rules include
respecting others in the community, following the guidelines made by
the community, etc. The task of augmenting the set of rules includes
building rules and modifying rules based on users' behavior. This task aligns with G4 at the broad level, helping community moderators understand how rules match with violations and add community-specific rules based on the streamer's requirements.
Modifying the rules can be grounded in assessing users’ relative
standing in the community. This includes analyzing the history of
past rule-breaking cases and the severity of the rules that have been
broken [5,7, 49]. Using the set of rules allows the creation of automatic
filters that remove the unwanted content by comparing it with existing
rules. Such filters can be leveraged by visual analytics techniques. For
example, the chat circles and vertical line approach described in [15]
can be used to visualize user messages based on rules. Since rules are posted from time to time, it is important to visualize user involvement before and after rules are posted, as in the per-participant distribution of messages before and after a chatbot posts rules shown in [27]. However, a shortcoming of these methods is that they cannot detect dynamic reactions to the rules and thus can hinder real-time filtering and decision-making.
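A minimal sketch of such a rule-matching filter appears below; the regular-expression representation of rules and the example rule are assumptions made for illustration, since actual platforms encode rules very differently.

```python
import re
from dataclasses import dataclass
from typing import List

@dataclass
class ChatRule:
    rule_id: str
    description: str
    pattern: str      # regular expression capturing the rule's lexical triggers (illustrative)
    severity: int     # used later when grading sanctions

def apply_rules(text: str, rules: List[ChatRule]) -> List[ChatRule]:
    """Return the rules a message appears to violate, most severe first."""
    hits = [r for r in rules if re.search(r.pattern, text, flags=re.IGNORECASE)]
    return sorted(hits, key=lambda r: r.severity, reverse=True)

# Hypothetical community-specific rule added by a moderator.
rules = [ChatRule("R-03", "No personal attacks on the streamer",
                  r"\b(trash streamer|you suck)\b", severity=2)]
print([r.rule_id for r in apply_rules("this trash streamer again", rules)])   # ['R-03']
```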
R2: Determining Rule Efficacy: The task of determining the effectiveness of rules fulfills the purpose of developing moderation guidelines that maintain the transparency of content moderation (G4), by comparing existing rules with violations and identifying which rules are effective and what is missing. Here, rule-based techniques help moderators detect abusive content and filter out those messages. This helps moderators revise the guidelines and regulate situations like the one in Figure 1c. The task comprises inspecting rule accuracy and categorizing rules based on severity.
Most of these rules are designed manually. Kontostathis et al. pro-
posed a rule-based system to automatically detect harmful messages
in relay chat [29] using an existing set of rules. Developing a visual
approach that helps moderators to directly examine the effectiveness
of rules facilitates the moderation process. Visualizing users' numerical profile scores and rule-breaking severity scores will help moderators understand the similarities and differences between “good” and “bad” rules. This will be beneficial both for popular channels crowded with users and for newer channels where the moderator lacks prior experience. Maintaining and modifying rules is a time-consuming process, and it cannot guarantee that a given set of rules works under all circumstances. For example, a message may contain conflicting keywords in an appropriate context yet be marked as offensive based on the rules. In other cases, a message containing abusive content may still be accepted and marked
as appropriate. Visual analytic interventions can help detect and fill
these gaps by enabling provenance-based retrieval and validation of
rules.
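One way to make rule efficacy measurable is sketched below, under the assumption (ours, not the paper's) that moderator decisions are logged alongside the rules each automatic filter fired; the log schema is hypothetical.

```python
def rule_efficacy(log):
    """Summarize how well each rule aligns with moderator decisions.

    Each log entry is assumed to look like (hypothetical schema):
      {"fired_rules": ["R-03"], "removed_by_moderator": True}
    """
    per_rule = {}
    for entry in log:
        removed = entry["removed_by_moderator"]
        for rule_id in entry["fired_rules"]:
            counts = per_rule.setdefault(rule_id, {"agree": 0, "disagree": 0})
            counts["agree" if removed else "disagree"] += 1

    report = {
        rule_id: {"precision": c["agree"] / (c["agree"] + c["disagree"])}
        for rule_id, c in per_rule.items()
    }
    # Removals no rule anticipated: a visible gap in the rule book.
    report["uncovered_removals"] = sum(
        1 for entry in log if entry["removed_by_moderator"] and not entry["fired_rules"]
    )
    return report
```

Visualizing per-rule precision alongside the count of uncovered removals would let moderators spot both over-zealous rules and missing ones at a glance.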
4 APPLICATIONS OF TASK ABSTRACTION FRAMEWORK
In this section, we discuss how our task abstraction framework can be applied in practice to address open problems in visualization design and human-machine interface development.
Ensuring moderation transparency: Using the visual analytic tasks, moderators can examine the rules and criteria through the lens of transparency. Many content moderation systems on social media sites are black-box in nature; users have to figure out on their own why content was removed [35]. This lack of transparency can create barriers to user engagement for volunteer moderators, who need to proactively communicate guidelines and the consequences of actions to users. In
such a high-consequence setting, tasks like M1, R1, U2 can allow
moderators to achieve a balance between preserving the safety of their
communities and mitigating the effects of negative responses to their
corrective actions with transparent communication. Visual analytic
interventions can help achieve this balance using evidence-based com-
munication to explain moderation actions between moderators and
platform users.
Facilitating social and communicative moderation: Though automated moderation tools can potentially detect signals of violation within a large volume of streaming text, moderators are still irreplaceable, because ultimately a moderation process is about social communication. To foster and grow online communities, volunteer moderators play multiple roles with social and communicative attributes [7, 49], roles that relate to tasks U1 and U2. Our framework can guide designers to de-
velop visualization tools to meet the needs of different communities of
volunteer and commercial content moderators. For example, volunteer
moderators have more flexible guidelines for their communities while
commercial moderators have to follow the universal platform policy.
This implies that volunteer moderators are in greater need of tools for
mining users’ behavior (M1, U2) and adapting their rules (R1) accord-
ingly. On the other hand, commercial moderators can benefit from rule
evaluation tasks (R2) for data-driven validation of their policies.
Designing for moderators’ well-being: Along with reducing the cognitive load of moderators, realizing tasks like M1 and M2 enables
exploration of the visualization design space for addressing psycho-
logical implications of content moderation. Decision-making about
negative content often leads to psychological and emotional distress.
Though reducing distress is not the primary goal of moderation, it can
be embedded in the visualization design space. While not many studies
focus on designing for moderation in live-streaming environments, vi-
sualization design strategies that optimize emotional impact [21] can
reduce moderators’ exposure to problematic content and can work as
interventions to mitigate distress [14, 31].
Instantiating human-machine moderation interfaces: Mainstream moderation tools list violators and violations with limited explanations,
and more importantly, lack proactive moderation capabilities. Our
task abstraction framework can be applied for instantiating human-
machine collaboration interfaces, where human and machine efforts
are complementary, leading to optimal task performance as a team [32].
Moderators can ground their exploration process based on facets of
interest (person, topic, region, flagged content, etc.), flag particular
users or sensitive topics, while a machine learning model can be trained to learn from their interactions and suggest corrective actions.
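A minimal sketch of such an interaction-learning loop is given below, assuming scikit-learn; the feedback schema and function names are hypothetical, and a deployed system would need far richer features and safeguards.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

vectorizer = HashingVectorizer(n_features=2**16, alternate_sign=False)
model = SGDClassifier(loss="log_loss")   # incremental learner, so it can be updated mid-stream
CLASSES = [0, 1]                         # 0 = leave the message, 1 = suggest a corrective action

def record_moderator_feedback(text, took_action):
    """Update the model each time a moderator acts on (or deliberately ignores) a message."""
    X = vectorizer.transform([text])
    model.partial_fit(X, [int(took_action)], classes=CLASSES)

def suggest_action(text):
    """Probability that a moderator would act on this message, used to triage the queue.

    Meaningful only after the model has seen feedback for both classes.
    """
    X = vectorizer.transform([text])
    return float(model.predict_proba(X)[0, 1])
```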
5 CONCLUSION AND FUTURE WORK
Our work introduces a visual analytic task abstraction framework for
addressing data-driven problems in proactive content moderation. We
discuss the implications of the visual analytics framework for influenc-
ing the future of transparent and communicative moderation practices.
As a next step, we plan to realize our proposed visual analytic tasks
within existing content moderation workflows. We will conduct em-
pirical studies to evaluate how visual analytic interventions and the
resulting human-machine interfaces help reduce the cognitive load and
emotional toll of content moderators.
6 ACKNOWLEDGEMENT
This work was funded by the National Science Foundation (award
number 1928627).
REFERENCES
[1]
F. Abel, C. Hauff, G.-J. Houben, R. Stronkman, and K. Tao. Twitci-
dent: fighting fire with information from social web streams. In Proc.
International conference on world wide web, pp. 305–308, 2012.
[2]
T. Bergstrom and K. Karahalios. Conversation clusters: grouping conver-
sation topics through human-computer dialog. In Proc. Conference on
Human Factors in Computing Systems (CHI), pp. 2349–2352, 2009.
[3]
L. Blackwell, J. Dimond, S. Schoenebeck, and C. Lampe. Classifica-
tion and its consequences for online harassment: Design insights from
heartmob. Proc. ACM Hum.-Comput. Interact., 1(CSCW):1–19, 2017.
[4]
L. Blackwell, M. Handel, S. T. Roberts, A. Bruckman, and K. Voll. Understanding “bad actors” online. In Extended Abstracts, Conference on Human Factors in Computing Systems, pp. 1–7, 2018.
[5]
J. Cai, C. Guanlao, and D. Y. Wohn. Understanding rules in live streaming
micro communities on twitch. In Proceedings of ACM International
Conference on Interactive Media Experiences (IMX’21), 2021.
[6]
J. Cai and D. Y. Wohn. Categorizing live streaming moderation tools: An
analysis of twitch. International Journal of Interactive Communication
Systems and Technologies (IJICST), 9(2):36–50, 2019.
[7]
J. Cai, D. Y. Wohn, and M. Almoqbel. Moderation visibility: Mapping the
strategies of volunteer moderators in live streaming micro communities.
In Proc. ACM Conference on Interactive Media Experiences (IMX), 2021.
[8]
J. Cai and D. Yvette Wohn. After Violation But Before Sanction: Un-
derstanding Volunteer Moderators’ Profiling Processes Toward Violators
in Live Streaming Communities. Proc. ACM Hum.-Comput. Interact.,
5(CSCW2):25, 2021. doi: 10.1145/3479554
[9]
S. Chancellor, J. A. Pater, T. Clear, et al. #thyghgapp: Instagram content
moderation and lexical variation in pro-eating disorder communities. In
Proc. Computer-Supported Cooperative Work & Social Computing, pp.
1201–1213, 2016.
[10]
E. Chandrasekharan, C. Gandhi, M. W. Mustelier, and E. Gilbert. Cross-
mod: A cross-community learning-based system to assist reddit modera-
tors. Proc. ACM Hum.-Comput. Interact., 3(CSCW):1–30, 2019.
[11]
E. Chandrasekharan, U. Pavalanathan, A. Srinivasan, et al. You can’t stay
here: The efficacy of reddit’s 2015 ban examined through hate speech.
Proc. ACM on Human-Computer Interaction, 1(CSCW):1–22, 2017.
[12]
D. Chatzakou, N. Kourtellis, J. Blackburn, et al. Mean birds: Detecting
aggression and bullying on twitter. In Proc. 2017 ACM on web science
conference, pp. 13–22, 2017.
[13]
A. Dasgupta, D. L. Arendt, L. R. Franklin, et al. Human factors in
streaming data analysis: Challenges and opportunities for information
visualization. In CGF, vol. 37, pp. 254–272. Wiley, 2018.
[14]
L. Derick, G. Sedrakyan, P. J. Munoz-Merino, et al. Evaluating emotion
visualizations using affectvis, an affect-aware dashboard for students.
Journal of Research in Innovative Teaching & Learning, 2017.
[15]
J. Donath, K. Karahalios, and F. Viegas. Visualizing conversation. Journal
of computer-mediated communication, 4(4):JCMC442, 1999.
[16]
M. Ebrahimi. Automatic identification of online predators in chat logs by
anomaly detection and deep learning. PhD thesis, Concordia Univ., 2016.
[17]
M. El-Assady, V. Gold, C. Acevedo, C. Collins, and D. Keim. Contovi:
Multi-party conversation exploration using topic-space views. In Computer
Graphics Forum, vol. 35, pp. 431–440. Wiley Online Library, 2016.
[18]
S. Fu, J. Zhao, H. F. Cheng, H. Zhu, and J. Marlow. T-cal: Understanding
team conversational data with calendar-based visualization. In In Proc.
Conference on Human Factors in Computing Systems, pp. 1–13, 2018.
[19]
R. Gorwa, R. Binns, and C. Katzenbach. Algorithmic content moder-
ation: Technical and political challenges in the automation of platform
governance. Big Data & Society, 7(1), 2020.
[20]
J. Grimmelmann. The virtues of moderation. Yale JL & Tech., 17:42,
2015.
[21]
L. Harrison, R. Chang, and A. Lu. Exploring the impact of emotion
on visual judgement. In In Proc. IEEE Conference on Visual Analytics
Science and Technology (VAST), pp. 227–228. IEEE, 2012.
[22]
H. Hosseini, S. Kannan, B. Zhang, and R. Poovendran. Deceiving
google’s perspective api built for detecting toxic comments. arXiv preprint
arXiv:1702.08138, 2017.
[23]
S. Jhaver, I. Birman, E. Gilbert, and A. Bruckman. Human-machine
collaboration for content regulation: The case of reddit automoderator.
ACM Transactions on Computer-Human Interaction, 26(5):1–35, 2019.
[24]
S. Jhaver, S. Ghoshal, A. Bruckman, and E. Gilbert. Online harassment
and content moderation: The case of blocklists. ACM Transactions on
Computer-Human Interaction (TOCHI), 25(2):1–33, 2018.
[25]
J. A. Jiang, C. Kiene, S. Middler, J. R. Brubaker, and C. Fiesler. Modera-
tion challenges in voice-based online communities on discord. Proceedings
of the ACM on Human-Computer Interaction, 3(CSCW):1–23, 2019.
[26]
P. Juneja, D. Rama Subramanian, and T. Mitra. Through the looking glass:
Study of transparency in reddit’s moderation practices. Proceedings of the
ACM on Human-Computer Interaction, 4(GROUP):1–35, 2020.
[27]
S. Kim, J. Eun, C. Oh, et al. Bot in the bunch: Facilitating group chat
discussion by improving efficiency and participation with a chatbot. In
Proc. Conference on Human Factors in Computing Systems, pp. 1–13,
2020.
[28]
K. Klonick. The new governors: The people, rules, and processes govern-
ing online speech. Harv. L. Rev., 131:1598, 2017.
[29]
A. Kontostathis. ChatCoder: Toward the tracking and categorization of internet predators. In Proceedings of the Text Mining Workshop 2009, held in conjunction with the Ninth SIAM International Conference on Data Mining (SDM 2009), Sparks, NV, May 2009. Citeseer, 2009.
[30]
M. Krstajic, E. Bertini, and D. Keim. Cloudlines: Compact display of
event episodes in multiple time-series. IEEE transactions on visualization
and computer graphics, 17(12):2432–2439, 2011.
[31]
J. Liem, C. Perin, and J. Wood. Structure and empathy in visual data
storytelling: Evaluating their influence on attitude. In Computer Graphics
Forum, vol. 39, pp. 277–289. Wiley Online Library, 2020.
[32]
C. Lyn Paul, L. M. Blaha, et al. Opportunities and challenges for human-
machine teaming in cybersecurity operations. In In Proc. Human Factors
and Ergonomics Society, vol. 63, pp. 442–446. SAGE Publications, 2019.
[33]
A. Marcus, M. S. Bernstein, O. Badar, et al. Twitinfo: aggregating and
visualizing microblogs for event exploration. In Proc. CHI conference on
Human factors in computing systems, pp. 227–236, 2011.
[34]
T. Munzner. A nested model for visualization design and validation. IEEE
Transactions on Visualization and Computer Graphics, 15(6):921–928,
2009.
[35]
S. Myers West. Censored, suspended, shadowbanned: User interpretations
of content moderation on social media platforms. New Media & Society,
20(11):4366–4383, 2018.
[36]
T. K. Naab, A. Kalch, and T. G. Meitz. Flagging uncivil user comments:
Effects of intervention information, type of victim, and response comments
on bystander behavior. New Media & Society, 20(2):777–795, 2018.
[37]
F. Å. Nielsen. A new ANEW: Evaluation of a word list for sentiment analysis in microblogs. arXiv preprint arXiv:1103.2903, 2011.
[38]
C. Nobata, J. Tetreault, A. Thomas, et al. Abusive language detection in
online user content. In Proc. WWW, pp. 145–153, 2016.
[39]
T. D. Oliva, D. M. Antonialli, and A. Gomes. Fighting hate speech,
silencing drag queens? artificial intelligence in content moderation and
risks to lgbtq voices online. Sexuality & Culture, 25(2):700–732, 2021.
[40]
E. Papegnies, V. Labatut, R. Dufour, and G. Linares. Graph-based features
for automatic online abuse detection. In International conference on
statistical language and speech processing, pp. 70–81. Springer, 2017.
[41]
D. Park, S. Sachar, N. Diakopoulos, and N. Elmqvist. Supporting comment
moderators in identifying high quality online news comments. In Proc.
Conference on Human Factors in Computing Systems, pp. 1114–1125,
2016.
[42]
K. Reynolds, A. Kontostathis, and L. Edwards. Using machine learning to
detect cyberbullying. In Conference on Machine learning and applications,
vol. 2, pp. 241–244. IEEE, 2011.
[43]
S. T. Roberts. Commercial content moderation: Digital laborers’ dirty
work. 2016.
[44]
S. T. Roberts. Digital detritus: 'Error' and the logic of opacity in social media content moderation. First Monday, 2018.
[45]
D. Seebacher, M. T. Fischer, R. Sevastjanova, D. A. Keim, and M. El-
Assady. Visual analytics of conversational dynamics. arXiv preprint
arXiv:2105.04897, 2021.
[46]
J. Seering, R. Kraut, and L. Dabbish. Shaping pro and anti-social behavior
on twitch through moderation and example-setting. In Proceedings of
the 2017 ACM conference on computer supported cooperative work and
social computing, pp. 111–125, 2017.
[47]
J. Seering, T. Wang, J. Yoon, and G. Kaufman. Moderator engagement and
community development in the age of algorithms. New Media & Society,
21(7):1417–1443, 2019.
[48]
D. Shahaf, C. Guestrin, and E. Horvitz. Trains of thought: Generating
information maps. In Proc. WWW, pp. 899–908, 2012.
[49]
D. Y. Wohn. Volunteer moderators in twitch micro communities: How they
get involved, the roles they play, and the emotional labor they experience.
In Proc. Conference on human factors in computing systems (CHI), pp.
1–13, 2019.
In this paper, we introduce a novel sociotechnical moderation system for Reddit called Crossmod. Through formative interviews with 11 active moderators from 10 different subreddits, we learned about the limitations of currently available automated tools, and how a new system could extend their capabilities. Developed out of these interviews, Crossmod makes its decisions based on cross-community learning---an approach that leverages a large corpus of previous moderator decisions via an ensemble of classifiers. Finally, we deployed Crossmod in a controlled environment, simulating real-time conversations from two large subreddits with over 10M subscribers each. To evaluate Crossmod's moderation recommendations, 4 moderators reviewed comments scored by Crossmod that had been drawn randomly from existing threads. Crossmod achieved an overall accuracy of 86% when detecting comments that would be removed by moderators, with high recall (over 87.5%). Additionally, moderators reported that they would have removed 95.3% of the comments flagged by Crossmod; however, 98.3% of these comments were still online at the time of this writing (i.e., not removed by the current moderation system). To the best of our knowledge, Crossmod is the first open source, AI-backed sociotechnical moderation system to be designed using participatory methods.