Figure 2 - uploaded by Manuel Burghardt
Visualization of frequent user names (left side) and frequently used words (right side) for a collection of tweets that contain the keyword Xbox.

Source publication
Conference Paper
Full-text available
Social media services like Twitter churn out user-generated content in vast amounts. The massive availability of this kind of data demands new forms of analysis and visualization to make it accessible and interpretable. In this article, we introduce Twista, an application that can be used to create tailored tweet collections according to specific...

Context in source publication

Context 1
... that was retweeted most frequently, as each retweet obtained via the Streaming API has a reference to the original tweet and its retweet count. Twista also counts and visualizes the most frequent users (i.e. people who publish a tweet) as well as the most frequent clients (e.g. Twitter Web Client, Twitter for iPhone, Twitter for Android, TweetDeck, etc.) that are used to publish the tweet. On the content level, the tool also provides information about the most frequently used words, hashtags and frequently quoted URLs, i.e. links to external web resources. In order to visualize these frequencies, we chose a bubble chart layout (cf. fig. 2): larger bubbles indicate a higher frequency. In addition, the name and the numerical frequency are displayed in each bubble. For the most frequent words used in a tweet collection, we chose a word cloud layout, as the number of distinct words is very large and would have resulted in a large number of bubbles. In addition, we filter out stop words, i.e. the most common function words from German, English and Spanish, before we visualize the results. All tweets obtained via the Streaming API contain a time stamp as well as a UTC (coordinated universal time) offset that can be used to derive the local time of the tweet creation. This can be used to sort tweets temporally or classify them as daytime or nighttime tweets (cf. fig. ...
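
The two processing steps described in this excerpt, counting words after stop word removal and deriving a local time from the UTC offset, can be illustrated with a minimal Python sketch. This is not Twista's actual implementation: the function names, the tiny stop word set, the simple tokenizer and the 06:00 to 18:00 daytime window are all assumptions made for illustration only.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Tiny illustrative stop word set (assumption). The excerpt only says that
# German, English and Spanish function words are filtered; the real lists
# are not reproduced here.
STOP_WORDS = {"the", "a", "and", "is", "der", "die", "und", "el", "la", "y"}


def word_frequencies(tweets, stop_words=STOP_WORDS):
    """Count lower-cased word tokens across tweets, skipping stop words."""
    counts = Counter()
    for text in tweets:
        # Simplistic tokenizer (assumption): runs of word characters only,
        # so hashtags lose their '#' and URLs are split into pieces.
        for token in re.findall(r"\w+", text.lower()):
            if token not in stop_words:
                counts[token] += 1
    return counts


def local_time(created_at_utc, utc_offset_seconds):
    """Shift a naive UTC time stamp by the tweet's UTC offset in seconds."""
    return created_at_utc + timedelta(seconds=utc_offset_seconds)


def day_or_night(local_dt, day_start=6, day_end=18):
    """Label a local time; the 06:00-18:00 window is a hypothetical choice."""
    return "daytime" if day_start <= local_dt.hour < day_end else "nighttime"


tweets = ["The Xbox is great", "Xbox and the new games", "El Xbox y la consola"]
freqs = word_frequencies(tweets)
print(freqs.most_common(3))

# A tweet created at 23:30 UTC with a +1 hour offset is a local nighttime tweet.
created = datetime(2015, 1, 1, 23, 30)
print(day_or_night(local_time(created, 3600)))
```

The resulting counts could then be mapped to bubble sizes or word cloud font sizes; how Twista performs that mapping is not described in the excerpt.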

Citations

... Burghardt (2015) purposes existing tools to collect and analyze Twitter data and to conduct a study on new media. It is described that some API tools, for example, Twista (Spanner et al, 2015), Tworpus (Bazo et al, 2013), and Tweet Archivist (Williams, et al, 2011) to save, export and visualize the Tweets data/corpus. These tools are categorized into 'streaming API tools' and 'search API tools'. ...
Article
Full-text available
The current study investigates the writing or register style of 0.2 million tweets on the topic of online education during Covid-19 in July and August 2020. This pandemic delivered a massive jolt not only to the global economy, health, and business, but also to our educational system. According to the most recent UNESCO report, 1.3 billion students worldwide have been unable to attend schools or universities since March 2020 (McCarthy, 2020). As a result, online education has been regarded as grasping at straws. The study intends to highlight frequently used language structure and frequently used lexis from individual users' lexicons via tweets. Furthermore, a deliberately used stock of vocabulary items may be useful in scrutinizing and measuring learners' attitudes toward this pandemic around the world. Style was defined by Biber and Conrad (2009) as "frequent language forms used by speakers" and "an art of effective and forceful communication." As a result, the most frequent words/lemmas were extracted from the entire corpus first, and then a reduced vocabulary with statistical measures was rigorously studied. Furthermore, some other morpho-syntactic linguistic/grammatical features were investigated using the MAT tagger, a statistical computational tool for determining the text type.
... Users may also add a Tweet to their list of favorites, which resembles the bookmarking mechanism of web browsers. What exactly users are trying to achieve or to communicate when they retweet (boyd et al. 2010) or favorite (Meier et al. 2014) a Tweet has been Image taken from the Twista analysis and visualization tool for Tweets (Spanner et al. 2015). ...
... 500 million Tweets per day). Twista (Spanner et al. 2015) is an example of a tool that uses the Streaming API to collect Tweets that match pre-defined criteria, e.g. hashtags or keywords, for a specified period. Once the specified crawling period is over, the user gets notified via email that their collection is ready for download. ...
... Parties interested in using the tool should contact the developers directly (cf. Spanner et al. 2015). An example corpus with corresponding analyses is available at http://bit.ly/1xephsf. ...
Article
Full-text available
The microblogging service Twitter provides vast amounts of user-generated language data. This article gives an overview of related work that has been conducted on Twitter so far. The anatomy of a Twitter message is described and typical uses of the Twitter platform are discussed. The Twitter Application Programming Interface (API) is introduced in a generic, non-technical way to provide a basic understanding of existing opportunities but also limitations when working with Twitter data. A basic classification system is proposed for existing tools that can be used for collecting and analyzing Twitter data, and some exemplary tools are introduced for each category. Then, a more comprehensive workflow for conducting studies with Twitter data is presented, which comprises the following steps: crawling, annotation, analysis and visualization. Finally, the generic workflow is illustrated by describing an example study from the context of social TV research. At the end of the article, the main issues concerning tools and methods for the analysis of Twitter data are briefly addressed.