Variable identification and approaches to validate fake news
By Dr Milan Dordevic and Dr Fadi Safieddine
9.1. Introduction:
This chapter looks at the variables involved in the spread of fake news to better understand the factors
that contribute to its success. The issue remains open-ended: no complete solution has been developed,
and several researchers have failed to reach a clear approach because academics and industry do not
consider the larger environment in which news on social media thrives. The variables involved in the
dissemination of misinformation can be grouped into four categories: Human factors, Interaction factors,
Platform factors, and Content factors. Human factors include the attraction to rumours and the tendency
to seek information that resonates with and affirms one’s beliefs. Interaction factors include the different
ways of engaging with social media posts, the verification of information, and the likelihood of taking any
number of actions that either promote or demote news online. Platform factors include platform algorithms
and the platform tools employed online. Finally, Content factors relate to research showing that fake news
posts tend to have a distinctive linguistic style, multimedia content, and sourcing pattern that could help
identify them early on. These factors rarely operate in isolation; rather, it is their combination that explains
the complexity of understanding how fake news posts take on a life of their own.
Following this review, this chapter presents a step-by-step process for verifying news posted online
by looking at the features identified above. The chapter provides examples and a practical
element for readers to cross-reference fake news.
9.2. The human factor: The study of rumours
The human factor can vary, as demonstrated in chapter 4 when we looked at the psychology of
fake news. The key variables to take into account when considering human factors include the
attraction to rumours, the tendency to favour information that confirms one’s beliefs, the likelihood
of taking any number of actions, and the willingness to invest time and effort in verifying information.
Some of these factors have been extensively studied and modelled, while others remain new fields
of study. For a long time, the study of rumours has been linked to epidemiological modelling.
Epidemiology is the study of how disease spreads in a given community, and it provides perhaps one
of the most extensively studied frameworks that can be applied to fake news. More than 50 years ago,
Daley and Kendall (1964) explained the analogy between the spreading of infectious disease and
the dissemination of rumours, examining the spreading of a rumour from the standpoint of mathematical
epidemiology. Researchers highlighted that a mathematical model for the spreading of rumours
could be created in several different ways, depending on the growth and decay of the spreading
process. However, the environment in which rumours operate in brick-and-mortar spaces does not
necessarily match the virtual world of social media. Still, these models and the methods derived
from mathematical epidemiology can provide valuable insights.
Bettencourt et al. (2006) applied several epidemiologically inspired population models to the
spread of rumours. They found that properly adapted epidemic models do a good job of fitting the
empirical data. The standard benefit of modelling epidemics is the capability to approximate
average population parameters quantitatively. These variables include individual contact rates,
duration of the infectious period, and incubation time. In most cases, these parameters also
generalise to the dissemination of fake news online, although one can argue that incubation time is
not necessarily relevant to rumours or the spread of fake news.
In the process of creating epidemiological models and approximating their parameters for the
spread of ideas (their case study traced the spread of Feynman diagrams), the authors proposed several
reasons why the dissemination of news does not match the spread of disease completely.
One interesting aspect of the spread of news is the inadequacy (or irrelevance) of the recovered
state, i.e. the state a person reaches after being exposed to fake news and then rejecting it.
Much fake news may never be forgotten at all, and thus some users’ exposure to fake news can
go on indefinitely if they are never confronted with a debunking correction. Furthermore, Bettencourt et al.
(2006) developed a model called SEIZ, which stands for Susceptible, Exposed, Infected, Sceptic.
The SEIZ model introduces an Exposed state “E”: users in this state take some time before they
begin to believe the fake news (i.e. become Infected, “I”).
A model for the dissemination of fake news inspired by epidemic models, shown in Figure 9.1, is an
adaptation of a flowchart presented in Bettencourt et al. (2006). A user starts in the susceptible
class, can then be exposed to the fake news, incubate it, and become a member of the
infected class. An individual might instead move into the Sceptics class. Part of the population
may ultimately recover, meaning that they will not propagate the fake news again. Users can also exit
any class, thus reducing the total population.
Figure 9.1 A model for the dissemination of fake news inspired by epidemic models
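To make the flow in Figure 9.1 concrete, below is a minimal sketch of a SEIZ-style compartmental model integrated with SciPy. The formulation follows one common reading of the SEIZ transitions described above, and every parameter value is an illustrative assumption rather than a value calibrated by Bettencourt et al. (2006).

```python
# Minimal sketch of a SEIZ-style compartmental model (illustrative only; not the
# calibrated model of Bettencourt et al. 2006).
# S: susceptible, E: exposed (heard but undecided), I: infected (believes),
# Z: sceptic. All parameter values below are assumptions for illustration.
import numpy as np
from scipy.integrate import solve_ivp

N = 10_000          # total population reachable by the post
beta, b = 0.9, 0.4  # contact rates with believers (I) and sceptics (Z)
p, l = 0.3, 0.5     # probability of believing / rejecting on first contact
rho, eps = 0.2, 0.1 # E->I via contact with I, and E->I by self-incubation

def seiz(t, y):
    S, E, I, Z = y
    dS = -beta * S * I / N - b * S * Z / N
    dE = (1 - p) * beta * S * I / N + (1 - l) * b * S * Z / N \
         - rho * E * I / N - eps * E
    dI = p * beta * S * I / N + rho * E * I / N + eps * E
    dZ = l * b * S * Z / N
    return [dS, dE, dI, dZ]

sol = solve_ivp(seiz, (0, 60), [N - 1, 0, 1, 0], dense_output=True)
t = np.linspace(0, 60, 200)
S, E, I, Z = sol.sol(t)
print(f"peak believers: {I.max():.0f} at day {t[I.argmax()]:.1f}")
```

Varying the sceptic contact rate b in this toy model shows how quickly an active sceptic population suppresses the infected curve.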
On a similar footing, Jin et al. (2013) showed how true news and fake news propagating over
social media can be modelled by epidemiologically derived population models. They
demonstrated that the SEIZ model is accurate in capturing some of the key information associated with the
spread of a variety of fake news. The other widely used epidemiological model for describing
information diffusion is the Susceptible, Infected, Recovered (SIR) model (Newman, 2002). Newman
demonstrated that a large class of SIR models of epidemic disease can be solved exactly
on social networks of various kinds using a mapping to percolation theory. The findings delivered
analytical expressions for the size of an outbreak and the epidemic threshold at which a rumour goes
viral.
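For the network-based view, the following is a small Monte Carlo sketch of SIR-style spreading on a random contact network using networkx. It is an illustrative simulation, not Newman's analytical percolation solution, and the transmission and recovery probabilities are arbitrary assumed values.

```python
# Illustrative Monte Carlo SIR spread on a random network (parameters and the
# network itself are assumptions for demonstration, not Newman's exact solution).
import random
import networkx as nx

random.seed(1)
G = nx.erdos_renyi_graph(n=2_000, p=0.004)   # stand-in contact network
beta, gamma = 0.15, 0.10                      # per-step transmit / recover probabilities

state = {n: "S" for n in G}                   # S = susceptible, I = infected, R = recovered
patient_zero = random.choice(list(G))
state[patient_zero] = "I"

for step in range(100):
    infected = [n for n, s in state.items() if s == "I"]
    if not infected:
        break
    for n in infected:
        for nb in G.neighbors(n):
            if state[nb] == "S" and random.random() < beta:
                state[nb] = "I"               # neighbour hears and believes the rumour
        if random.random() < gamma:
            state[n] = "R"                    # node stops propagating

reached = sum(1 for s in state.values() if s != "S")
print(f"final outbreak size: {reached} of {G.number_of_nodes()} nodes")
```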
Nekovee et al. (2007) presented a generic model of rumour spreading in complex networks, in which
the researchers used a mathematical system that transitions from one state to another
according to certain probabilistic rules, otherwise known as a Markov chain, to model the spread.
The model was used to derive deterministic equations for the rumour dynamics on complex networks.
Their results demonstrated the existence of a critical threshold below which a rumour is
unable to spread through the network. While this model assumed the network to be static, real social
media networks are highly dynamic, with users constantly breaking old connections in favour of new ones.
This suggests that modelling the spread of fake news in a dynamic network would be very challenging.
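The threshold idea can be made tangible with a short calculation. Assuming the familiar heterogeneous mean-field form of such thresholds, the ratio of the mean degree to the mean squared degree (used here as an illustrative stand-in rather than a reproduction of Nekovee et al.'s exact expression), one can estimate it from the degree sequence of a network:

```python
# Illustrative estimate of a degree-based spreading threshold from a network.
# Assumes the heterogeneous mean-field form <k>/<k^2>; treat this as a sketch,
# not a reproduction of Nekovee et al.'s exact derivation.
import networkx as nx

G = nx.barabasi_albert_graph(n=5_000, m=3)     # heavy-tailed stand-in for a social graph
degrees = [d for _, d in G.degree()]
k_mean = sum(degrees) / len(degrees)
k2_mean = sum(d * d for d in degrees) / len(degrees)
threshold = k_mean / k2_mean
print(f"<k> = {k_mean:.2f}, <k^2> = {k2_mean:.2f}, estimated threshold = {threshold:.4f}")
```

The heavier the tail of the degree distribution, the smaller this estimate becomes, which illustrates why rumours spread so easily on follower networks dominated by a few highly connected accounts.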
In another experiment, Wang, Yang, and Wang (2014) used an anti-rumour broadcast to overwhelm
the rumour transmission. The authors proposed a susceptible-infected (SI) model to analyse the
dynamics of rumour sharing versus anti-rumour dissemination on homogeneous and heterogeneous
networks. “Ignorant” referred to those individuals who had not yet heard the rumour or the anti-rumour.
Ignorant nodes can become either infected or vaccinated: infected nodes carry the rumour and are
willing to share it to infect other ignorant nodes, while vaccinated nodes are those immunised by
the anti-rumour messaging, and they can further transmit the anti-rumour to ignorant or infected nodes.
When contacted by a vaccinated node, an infected node can discard the rumour with a given probability
and change from infected to vaccinated. The results of their research assumed a constant curing rate
of infected nodes, a constant immunity conferred by the anti-rumour, and a constant rumour-spreading
rate. In practice, these parameters can differ from individual to individual, which remains an
interesting open problem.
The impact of influential nodes is one of the central problems in social network analysis.
Numerous models can simulate how ideas, rumours, fake news, diseases, innovations, and so on
disseminate over a network. One such model is the susceptible/infected/susceptible (SIS) model
of Kimura, Saito, and Motoda (2009). This SIS simulation model allows nodes to be activated
multiple times, reflecting the reality that people can be bombarded by a rumour repeatedly and
eventually relent to or believe it. This multiple-activation aspect increases the
computational complexity compared to the SIR model discussed earlier.
Budak, Agrawal, and El Abbadi (2011) conducted a study on the dissemination of
misinformation in a social network. The authors investigated solutions for social media where a “bad”
information campaign diffuses across a network. The key factor considered is the influential
individuals who can potentially start a counter-campaign with a mission to minimise the effect of
the bad campaign. However, counter-influencers have their limitations. The authors call this the
eventual influence limitation problem, characterising it as a key challenge, and explore diverse
features of the problem such as the effect of starting the limiting campaign early or late.
Additionally, they studied the more realistic problem of influence limitation in the presence
of missing information and proposed an optimisation method to select the nodes that are probably
infected by the bad campaign.
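The general shape of this optimisation can be sketched as a greedy seed selection over repeated cascade simulations. The rules below (both campaigns spread as simple independent cascades, a node keeps whichever campaign reaches it first, and the candidate pool is restricted to the highest-degree nodes) are simplifying assumptions for illustration, not the authors' multi-campaign model.

```python
# Simplified sketch of greedily selecting counter-campaign seeds.
# Both campaigns spread as independent cascades; a node keeps whichever
# campaign reaches it first. All rules and parameters are illustrative assumptions.
import random
import networkx as nx

random.seed(7)
G = nx.barabasi_albert_graph(400, 3)
P_BAD, P_GOOD = 0.12, 0.12           # per-edge activation probabilities
BAD_SEEDS = {0, 1}                    # where the misinformation starts

def simulate(bad_seeds, good_seeds):
    """Return the number of nodes that end up adopting the bad campaign."""
    label = {}
    frontier = [(n, "bad") for n in bad_seeds] + [(n, "good") for n in good_seeds]
    for n, camp in frontier:
        label.setdefault(n, camp)
    while frontier:
        nxt = []
        for n, camp in frontier:
            p = P_BAD if camp == "bad" else P_GOOD
            for nb in G.neighbors(n):
                if nb not in label and random.random() < p:
                    label[nb] = camp
                    nxt.append((nb, camp))
        frontier = nxt
    return sum(1 for c in label.values() if c == "bad")

def avg_bad(good_seeds, runs=50):
    return sum(simulate(BAD_SEEDS, good_seeds) for _ in range(runs)) / runs

# Greedy selection of k counter-influencer seeds from a high-degree candidate pool.
chosen, k = set(), 3
candidates = sorted(G.nodes, key=G.degree, reverse=True)[:30]
for _ in range(k):
    best = min((c for c in candidates if c not in chosen),
               key=lambda c: avg_bad(chosen | {c}))
    chosen.add(best)
print("counter-campaign seeds:", chosen, "avg bad reach:", avg_bad(chosen))
```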
Table 9.1 shows a summary of the variables identified by the authors above.
Table 9.1 Variables linked to Human Factors
| Variables | Explanation | Notation | Sources |
| --- | --- | --- | --- |
| Contact Rate | The number of people who come into contact with the fake news. | CR | Bettencourt et al. (2006); Kimura, Saito, and Motoda (2009) |
| Susceptible | Individuals who are likely to believe fake news. | Su | Bettencourt et al. (2006) |
| Exposed | The time it takes users from first exposure to fake news before they move to the next stage of believing it. | E | Bettencourt et al. (2006) |
| Infected | The rate or number of people who believe the fake news. | I | Bettencourt et al. (2006) |
| Sceptic | The rate or number of people who will question the fake news. | Z | Bettencourt et al. (2006) |
| Recovered | The rate or number of people who may have believed the fake news but later debunked it. | R | Jin et al. (2013) |
| Total population | The maximum population that a given fake news item can reach. | N | Wang, Yang, and Wang (2014) |
| Edges | The connections between individuals as fake news travels. | E | Wang, Yang, and Wang (2014) |
| Vertices | Individuals exposed as fake news travels; they can be in different states (ignorant, infected, or vaccinated), corresponding to the earlier variables S, I, and R. | V | Wang, Yang, and Wang (2014) |
| Influential Node | A node with a large following and strong influence over several nodes, which can therefore propagate the spread of fake news; such nodes can be characterised as a ‘bad campaign’. | IN | Kimura, Saito, and Motoda (2009); Budak, Agrawal, and El Abbadi (2011) |
| Counter Influence Node | A node with a large following and strong influence over several nodes, but which uses its influence to counter fake news. | INc | Budak, Agrawal, and El Abbadi (2011) |
| Threshold | The rate or number of reposts/retweets needed for a post to go viral. | Th | Nekovee et al. (2007) |
9.3. Interaction Factors:
The actions a user can take on social media inevitably depend on what tools the social media
platform has to offer. There are, however, several common actions shared across platforms that
propagate the spread of posts, news, and fake news. Dordevic et al. (2016),
when modelling the propagation of misinformation, demonstrated a proof-of-concept and identified
the variables involved in the travel of information and misinformation. The authors showed that
combating fake news online is influenced by the following variables: the authentication rate, the
rate of passing on information, the average cross-wire rate, the success rate of same-level
communication, and the reverse validation rate. But perhaps the biggest influencers in the spread of
fake news are newsgroups, cyber-bots, and user influencers. Safieddine, Dordevic, and Pourghomi (2017),
in studying the impact of social media newsgroups on the dissemination of misinformation, confirmed
the impact of what was identified earlier as the role of influencers. The simulations showed how
authentication methods could reduce the spread of misinformation on social media. The three-
dimensional simulations, combined with graph theory, additionally assisted in demonstrating
the variables governing the way rumours disseminate online, and how this could be minimised by
authenticating news prior to sharing. The results showed that social newsgroups have an
important impact on the explosion of rumours as well as on fighting fake news online.
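As a toy illustration of how these interaction variables pull against each other, the following simulation contrasts a sharing rate with authentication and reverse-validation rates. The update rules and the numbers are assumptions made purely for illustration; they are not the Biolayout simulations reported by Dordevic et al. (2016) or Safieddine, Dordevic, and Pourghomi (2017).

```python
# Toy illustration of how sharing, authentication, and reverse validation
# interact. Rules and rates are illustrative assumptions only.
import random

random.seed(3)
N_USERS = 5_000
SHARE, AUTHENTICATE, REVERSE_VALIDATE = 0.30, 0.10, 0.40

exposed, sharers, debunkers = {0}, {0}, set()
for _ in range(30):                              # simulation steps
    new_exposed = set()
    for u in list(sharers):
        for _ in range(5):                       # each active sharer exposes a few random contacts
            v = random.randrange(N_USERS)
            if v in exposed:
                continue
            new_exposed.add(v)
            if random.random() < AUTHENTICATE:
                debunkers.add(v)                 # checked the post, will not share
            elif random.random() < SHARE:
                sharers.add(v)
        if random.random() < REVERSE_VALIDATE * AUTHENTICATE:
            sharers.discard(u)                   # sharer later checks and deletes the post
    exposed |= new_exposed

print(f"exposed: {len(exposed)}, still sharing: {len(sharers)}, debunked: {len(debunkers)}")
```

Raising the authentication rate in this sketch shrinks both the exposed and the actively sharing populations, which is the qualitative effect the simulations above report.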
These variables characterise the interaction between users and posts, and they are explained in Table 9.2.
Table 9.2 Variables linked to Interaction Factors
| Variables | Explanation | Sources |
| --- | --- | --- |
| Sharing | Rate of sharing. This can vary from one post to another. | Dordevic et al. (2016) |
| Passing on information | Rate of commenting, liking, and a variety of actions that inadvertently give a post publicity and trend. | Dordevic et al. (2016) |
| Authentication | Rate of people who will take the time to check the validity of a post. | Dordevic et al. (2016) |
| Crosswire | The rate at which information crosses the same user multiple times. | Dordevic et al. (2016) |
| Same Level (Cluster) Communication | Rate at which someone takes an active role to communicate or alert other users to the authenticity of a post. | Dordevic et al. (2016) |
| Reverse Validation | Rate at which users who shared a post may delete it upon realising it is fake. | Dordevic et al. (2016) |
| Newsgroup | Defined as a group with over 1,000 followers with tribal tendencies. | Safieddine, Dordevic, and Pourghomi (2017) |
9.4. Content Factors:
Castillo, Mendoza, and Poblete (2011) used a decision-tree-based model to investigate news
content characteristics (NC). The authors demonstrated that news posts with time-sensitive subjects
spread more widely than other conversations and can thus be separated automatically.
Among other features, trending news tends to include a link to a resource on the internet and to
have a much deeper propagation tree. The research therefore demonstrated that the level of social
network credibility of a trending news post can be measured based on its source and propagation
tree. For example, factual news is broadcast by authors who have previously posted a high
number of news posts, originates at a single user or a few users in the network, and has many re-posts,
whereas the same cannot be claimed for fake news. To demonstrate their approach, the
authors analysed microblog postings related to trending topics and were able to categorise them as
credible or not credible. The categorisations were based on extracts of the post and features of
users’ posting and sharing behaviour. Their results showed a significant difference in the way
news is broadcast, and thus they were able to classify posts automatically as likely fake or not.
Fake news detection can also be calibrated through a Support Vector Machine (SVM), a machine
learning algorithm that facilitates the organisation of data into different categories
(Erşahin et al., 2017). SVMs function by training on data already organised into different
categories. The aim of the SVM algorithm is to categorise data as rumour or non-rumour
and to maximise the margin between these two classes. The benefits of the SVM
approach are that it tends to be accurate and performs well on datasets that are smaller
and more concise. There are difficulties with using SVMs for larger datasets, as they entail a long
training time. The approach can also be less effective when confronted with distorted datasets and
overlapping classes.
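A minimal sketch of such an SVM text classifier, built with scikit-learn and TF-IDF features, is shown below. The tiny in-line dataset and the RBF kernel choice are illustrative assumptions; a real detector would be trained on a labelled corpus of the kind described in this section.

```python
# Minimal sketch of an SVM rumour/non-rumour text classifier (illustrative;
# the in-line examples stand in for a real labelled dataset).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

posts = [
    "BREAKING: miracle cure doctors don't want you to know about!!!",
    "Shocking photo PROVES the moon landing was staged, share now",
    "Official statement released by the ministry of health this morning",
    "Reuters reports quarterly inflation figures in line with forecasts",
]
labels = ["rumour", "rumour", "non-rumour", "non-rumour"]

# TF-IDF of word unigrams and bigrams feeding an RBF-kernel SVM.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), SVC(kernel="rbf"))
clf.fit(posts, labels)

new_post = "Shocking cure the government is hiding, share before it's deleted"
print(clf.predict([new_post])[0])
```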
One of the first SVM-based fake news detection models used a Radial Basis Function (RBF) kernel
(Yang et al., 2012). The authors collected a set of rumour-related posts from Sina Weibo,
China's leading microblogging service provider, and composed an extensive collection of microblogs
that were known to be fake news. The authors then studied a set of features extracted from the
microblogs and trained a classifier to automatically detect fake news from a mixed collection of
false and true information. Moreover, the authors measured the impact of the features on
classification performance. The tests used the content-based and account-based features
separately to make the classification and proved to be highly effective.
Kwon, Cha, and Jung (2017) identified user, linguistic, network, and temporal features of rumours
and compared how these attributes categorise rumours over time. The authors studied the major
differences between rumours and non-rumours. Their contribution helps explain the
dissemination patterns of rumours over time and tracks the variations in forecasting power across
rumour characteristics. To help identify fake news, the model proposed using machine learning to
continually identify dissimilarities between rumour and non-rumour posts. Two new algorithms were
proposed for rumour classification. The statistical analysis performed found that structural and
temporal characteristics differentiate rumours from non-rumours over a long-term window.
The second SVM model, presented by Ma et al. (2015), utilised a Dynamic Series-Time Structure
(DSTS) to model the variation of news characteristics. DSTS captures how various social-context
features vary over time. An event is represented as a collection of microblogs related to a specific
topic. To keep the number of features manageable, the authors transformed the continuous time
stream of microblogs into fixed time intervals. Moreover, the authors introduced a method to
discretise the time stream into timestamps, and then an algorithm for capturing the variation of
features. When investigating the performance of the DSTS model on early detection of fake news,
the results showed that their SVM model with the time series of features improved fake news
detection over the whole lifecycle of events as well as at the early stage of dissemination.
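The core preprocessing step of such time-series approaches, turning a continuous stream of posts about one event into feature vectors over fixed time intervals, can be sketched as follows. The interval length and the two toy features computed here are simplifications rather than Ma et al.'s (2015) full feature set.

```python
# Sketch of turning a continuous post stream into fixed-interval feature
# vectors, in the spirit of time-series models such as DSTS (simplified).
from collections import defaultdict
from datetime import datetime, timedelta

# (timestamp, contains_question_mark) pairs for one event -- toy data.
posts = [
    (datetime(2019, 6, 1, 9, 5), True),
    (datetime(2019, 6, 1, 9, 40), False),
    (datetime(2019, 6, 1, 11, 2), True),
    (datetime(2019, 6, 1, 13, 30), True),
]

INTERVAL = timedelta(hours=1)
start = min(t for t, _ in posts)

buckets = defaultdict(list)
for t, has_q in posts:
    idx = int((t - start) / INTERVAL)          # which fixed interval the post falls in
    buckets[idx].append(has_q)

n_intervals = max(buckets) + 1
series = [(len(buckets.get(i, [])),                                   # post volume per interval
           sum(buckets.get(i, [])) / max(len(buckets.get(i, [])), 1)) # questioning ratio per interval
          for i in range(n_intervals)]
print(series)   # one (volume, questioning-ratio) tuple per hour
```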
Zhao, Resnick, and Mei (2015) presented a decision-tree-based ranking approach for detecting
rumours through Enquiry Phrases (EP). The method focuses on searching for signal tweets that
contain such phrases and then clustering similar posts together; related posts that do not contain
these simple phrases are then collected into the same clusters. The authors found a few phrases
that appear to be used in a way that supports early detection of rumours. Critical phrases such as
“What?”, “Is this true?”, and “Really?” show up in posts quite early in the dissemination. Moreover,
the authors showed in testing that tweets asking verification questions or offering corrections to
controversial statements are a very important indicator of rumours early in their life cycle.
Furthermore, the authors identified several statistical variables, some proving more useful than
others. The key variables identified are linked to the source and include the average tweet length,
the percentage of signal tweets, and the average numbers of words, URLs, and mentions per tweet in
the cluster. There is no indication that any one of these variables on its own correlates strongly
with fake news; rather, it is their combination that matters. Since these are linked to evaluating
the source, they can be folded into the source-evaluation variable.
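A lightweight version of the enquiry-phrase signal can be implemented as a simple pattern match over the replies to a post, as sketched below. The phrase list follows the examples quoted above, and the flagging threshold is an arbitrary assumption.

```python
# Sketch of flagging a post whose replies contain verification questions
# ("enquiry phrases"), following the examples quoted above. The phrase list
# and the threshold are illustrative.
import re

ENQUIRY = re.compile(r"\b(is (this|that) true|really\?|what\?|source\?|fake\?)", re.I)

def enquiry_ratio(replies):
    """Fraction of replies that question the post's veracity."""
    if not replies:
        return 0.0
    return sum(bool(ENQUIRY.search(r)) for r in replies) / len(replies)

replies = [
    "Is this true?? I can't find it anywhere else",
    "Really? source?",
    "wow unbelievable",
    "sharing with my family",
]
ratio = enquiry_ratio(replies)
print(f"enquiry ratio = {ratio:.2f}", "-> flag for review" if ratio > 0.2 else "")
```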
Liu and Wu (2018) proposed a baseline model for early detection of fake news on social media
by classifying news propagation paths using recurrent and convolutional networks. The
model contains four major components. The first component is propagation path construction and
transformation, the very first step being to identify the users involved in propagating the
news. The second component is the Gated Recurrent Unit (GRU) propagation path
representation, which offers a vector representation for each transformed propagation path. The third
component is the Convolutional Neural Network (CNN) propagation path representation; CNNs
are machine learning models typically used to analyse visual imagery, and here they draw out
local patterns along the propagation path. The last component of the model is the propagation
path classification. Testing results on different datasets demonstrated that the proposed model
is effective and efficient at detecting fake news. The model relies on common user
characteristics rather than complex features such as linguistic or structural features.
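The general shape of such a propagation-path classifier can be sketched in PyTorch. Each path is treated as a fixed-length sequence of user-feature vectors; the feature dimension, sequence length, and the way the GRU and CNN branches are fused below are assumptions for illustration rather than the exact architecture of Liu and Wu (2018).

```python
# Sketch of a propagation-path classifier combining a GRU branch (global,
# sequential view) and a 1-D CNN branch (local view) over user-feature
# sequences. Dimensions and the fusion step are illustrative assumptions.
import torch
import torch.nn as nn

N_FEATURES, PATH_LEN = 8, 20      # user features per hop, hops per path (assumed)

class PathClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.gru = nn.GRU(N_FEATURES, 32, batch_first=True)
        self.cnn = nn.Sequential(
            nn.Conv1d(N_FEATURES, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),
        )
        self.head = nn.Linear(32 + 32, 2)            # fake vs genuine logits

    def forward(self, x):                            # x: (batch, PATH_LEN, N_FEATURES)
        _, h = self.gru(x)                            # h: (1, batch, 32)
        g = h.squeeze(0)                              # global/sequential summary
        c = self.cnn(x.transpose(1, 2)).squeeze(-1)   # local patterns along the path
        return self.head(torch.cat([g, c], dim=1))

model = PathClassifier()
paths = torch.randn(4, PATH_LEN, N_FEATURES)          # stand-in batch of 4 propagation paths
print(model(paths).shape)                             # torch.Size([4, 2])
```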
These variables characterise the Content factors and are summarised in Table 9.3.
Table 9.3 Variables linked to Content factors
| Variables | Explanation | Sources |
| --- | --- | --- |
| Time Sensitive | Time-sensitive subjects have a bigger influence on the spread of news. | Castillo, Mendoza, and Poblete (2011) |
| Reference Source | Rating of the source for reliability based on the historical track record of reposting of their news posts and the originating organisation. | Castillo, Mendoza, and Poblete (2011) |
| Fake News Dataset | Linguistic analysis that compares a given post with a dataset of posts categorised as fake news; factors could lean towards indicating fake news. | Erşahin et al. (2017); Yang et al. (2012); Kwon, Cha, and Jung (2017) |
| Dynamic Timestamp | Setting a timestamp on posts and tracking their timeline allows better estimation of whether a news post is factual or fake. | Ma et al. (2015) |
| Enquiry Phrases | Analysis of the comments made by users on the post, allowing early detection of possible fake news. | Zhao, Resnick, and Mei (2015) |
| Propagation Path Analysis | Use of CNNs to evaluate the path representation of news and assess whether it follows trends similar to factual or fake news. | Liu and Wu (2018) |
9.5. Platform Factors:
There has not been much in the literature on identifying platform variables and their impact on
combating the propagation of fake news. Yet platform factors that influence the propagation of fake
news are essential to understanding the success of fake news online. Some of these factors were
already covered under Interaction Factors, in terms of how users engage with social media posts.
Here we focus on the behind-the-scenes factors discussed in chapters 7 and 8: namely, the platform
algorithm that promotes topics and subjects that users want to see and, by doing so, develops an
unhealthy environment in which one views only one-sided arguments on many day-to-day issues.
The next element is the degree of exposure associated with a user’s activities: when users react to,
like, view, or comment on a given post, this is inadvertently displayed to that user’s friends and
followers. Some platforms, such as Facebook, have a very high level of activity exposure, compared to
Instagram, which exposes significantly less activity. The control that these platforms give users over
this feature also varies. Another challenging variable is the time it takes the social platform to
label or remove fake news. Several of the earlier variables, such as Time Sensitive (TS), Dynamic
Timestamp (DT), Contact Rate (CR), and Threshold (Th), are all linked to the duration a fake news
post is left to propagate without active challenge from the platform. Finally, there is platform policy
on a variety of rules, including advertising and the promotion of political campaigns. None of these
variables was specifically examined in the literature and, as such, they remain to be explored.
Table 9.4 Variables linked to platform factors
| Variables | Explanation | Notation |
| --- | --- | --- |
| Filter Bubble | The degree to which a platform’s algorithm encourages material that agrees with the user’s views. | (F) |
| Activity Exposure | The rate at which a platform actively exposes users’ activities. | (AE) |
| Platform Policy | Platform policy on a variety of rules, including advertising and the promotion of political campaigns. | (PP) |
Put together with the first three sets of variables, one can see the challenges platforms face in
controlling the spread of fake news.
9.6. Approach to identifying Fake News
While we wait for the social media platforms, governments, researchers, and technology to catch
up, there are a few clues and tools available for users to identify fake news. We derived many of
these suggestions from the variables identified in this chapter.
From the analysis of the Human factors, one of the first clues is to approach news posted on social
media with healthy scepticism, regardless of how many times one receives the same message.
A social media post has several elements that can be explored individually:
- Source and text content: A post with no source, or an unfamiliar or questionable source, will
need to be investigated further. Where the source is an established media outlet, most probably
one with roots in traditional media, one could check whether the news has been shared by other,
more established outlets.
Figure 9.2 News validations by source and text
Google News search is one platform that lists multiple sources for a single news article,
allowing a quick and easy way to verify multiple sources at the same time. Bing News and
Yahoo News provide similar features, but one must search to identify the sources. Where
multiple sources are not found, one must explore the content. Take the example of the news
post shared in June 2019 suggesting that the US had banned Nigerians from applying for
student visas (see Figure 9.2). The first clue is that the source, “DoubleAMedia.Com.NG”, does
not appear to be the name of a reliable news outlet. A source search quickly reveals
that the news is not carried by any established media. In fact, by the time we ran a second
search in early July, several fact-checking websites had debunked the post. This step is
linked to the Reference Source (RS) factor, Authentication rate (A), and Recovery (R) once the
post is identified as fake. A small code sketch after this list illustrates this cross-referencing step.
- Images: Many social media news posts use visuals such as images to provide credibility
for a news post. Several browsers offer the option to right-click on an image and search the
internet for similar images; the technical term is Reverse Image Search. Google Chrome
and Internet Explorer both offer this via a right-click: Google’s option is called ‘Search
Google for Image’, while Internet Explorer offers ‘Ask Cortana about this image’, Cortana
being Microsoft’s digital assistant. Take the example of a post claiming to show an astronaut
secretly taking marijuana on board, see Figure 9.3. A reverse image search reveals that the
image is fake, photoshopped from another image of an astronaut taking Easter eggs on board.
There are two clues in this case: the original image from which the fake was manipulated, and
the fact that the reverse image search did not show any established media outlet confirming the
story; quite the contrary, the only sources appear to be social media posts and fact-checkers,
which is sufficient to question the authenticity of the story. This step is also linked to the
Reference Source (RS) factor, Authentication rate (A), and Recovery (R) once the post is
identified as fake.
Figure 9.3 News validations by reverse image search
- Check the date of the post: One of the most common clues to fake news is that the post
will either have no date or a date that mismatches the original source of an image or content.
In chapter one, we presented the case of the image of a brother and sister shared in response
to the Nepal earthquake of 2015; a reverse image search shows the image dates from 2007, which
is sufficient to debunk the claim. This approach links to the Dynamic Timestamp (DT) variable.
- Reporting Fake News: Many of the social media platforms now have the option to report
a post as fake news, and one should take a more active approach to reporting such posts. In
many cases, the process takes only two or three clicks, and you are not expected to write
anything beyond bringing the post to the attention of the moderators. Stopping the post from
spreading may hinder it from reaching the Threshold (Th).
- Actively Share: Where the news content is questionable, raise your concern in the
comments; this will be picked up by other users. Where your research shows the content
to be fake, mention that in the comments. Where a news post is debunked by fact-checkers,
share the debunking. The activity of sharing is more about raising awareness of fake news and
less about whether the fake news itself is of interest to you, your friends, and family.
Another reason to actively share fact-checks is that it provides these groups
with momentum to expand their base. This could have an impact on the algorithm but will
most definitely have an impact on Same Level (Cluster) communication (SL) and possibly
Reverse Validation (RV).
- Do not share: Do not share time-sensitive information (TS) until it has been independently
verified by multiple media outlets. Even if you are questioning a post, sharing it opens new
opportunities for it to spread, with an impact on the Sharing variable (S), and may reinforce the
news for people who have already heard it from other sources, with an impact on the Contact Rate
(CR).
- Following influencers: One may think the recommendation here is to follow positive
influencers only. That may be true: beyond established political parties and celebrities,
it is recommended to follow established media groups and fact-checking groups. But if we
take the research by McNair (2017), we should also follow influencers we disagree
with, and even tribalists. Rather than attempting to share rational counter-posts, the
suggestion is to share satirical posts to counter their arguments. That would be taking an
active role in combating fake news.
For example, it would be wise to follow NASA but also to follow societies that support flat-earth
conspiracy theories, as well as groups that satirise flat-earth groups. Where you come
across satirical posts, such as Figure 9.4, you should actively share them with friends, family,
and flat-earth support groups. The worst that can happen is that some tribalists will
block you. However, this will have an impact in breaking the Filter Bubble (F), where
people cannot see other views; and where more people respond to and report fake news,
it could also affect the dynamics of closed newsgroups (NG).
Figure 9.4 Satirical meme depicting eclipse of a flat earth on the moon.
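As a small practical aid for the source-checking step in the first bullet above, the sketch below queries the public Google News RSS search feed for a claim and lists which outlets carry it. The feed URL format and the parsing are simplified and may change over time; the claim text is only an example.

```python
# Sketch: cross-referencing a claim against Google News search results via the
# public RSS feed (https://news.google.com/rss/search?q=...). Parsing is kept
# minimal; the query string is an example and the feed layout may change.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

claim = "US bans Nigerians from applying for student visas"
url = "https://news.google.com/rss/search?q=" + urllib.parse.quote(claim)

with urllib.request.urlopen(url, timeout=10) as resp:
    root = ET.fromstring(resp.read())

sources = []
for item in root.iter("item"):
    title = item.findtext("title", default="")
    source = item.findtext("source", default="(unknown source)")
    sources.append((source, title))

if not sources:
    print("No outlet found carrying this claim - treat it with caution.")
for source, title in sources[:10]:
    print(f"{source}: {title}")
```

If no established outlet appears in the results, that is the same signal described in the worked example of Figure 9.2: the claim warrants further checking before sharing.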
Other variables are outside our control: platform policies (PP), the Counter Influence Node (INc),
how far the news spreads (N), automated or semi-automated authentication tools linked to an
increased authentication rate (A), and analysis of the post’s language based on a dataset (FN-DS).
9.7. Conclusion:
One of the biggest challenges in the coming years is the construction of an efficient and effective
model for capturing fake news on social media in real time while being careful not to be accused of
media censorship.
Additional research can focus on the development of an algorithmic formula for forecasting the
dissemination of fake news, with a view to programming the first fully functional internet browser
capable of running live authentication.
We believe the first step towards an efficient model for identifying fake news starts with a better
understanding of the environment, factors, and variables in which fake news operates. We believe
this chapter has covered a significant portion of that challenge.
References:
Bettencourt, L. M., Cintrón-Arias, A., Kaiser, D. I., & Castillo-Chávez, C. (2006). The power of a good
idea: Quantitative modeling of the spread of ideas from epidemiological models. Physica A:
Statistical Mechanics and its Applications, 364, 513-536.
Budak, C., Agrawal, D., & El Abbadi, A. (2011, March). Limiting the spread of misinformation in social
networks. In Proceedings of the 20th international conference on World wide web (pp. 665-674).
ACM.
Castillo, C., Mendoza, M., & Poblete, B. (2011, March). Information credibility on twitter. In Proceedings of
the 20th international conference on World wide web (pp. 675-684). ACM.
Conroy, N. J., Rubin, V. L., & Chen, Y. (2015). Automatic deception detection: Methods for finding fake
news. Proceedings of the Association for Information Science and Technology, 52(1), 1-4.
Daley, D. J., & Kendall, D. G. (1964). Epidemics and rumours. Nature, 204(4963), 1118.
Dordevic, M., Safieddine, F., Masri, W., & Pourghomi, P. (2016, September). Combating Misinformation
Online: Identification of Variables and Proof-of-Concept Study. In Conference on e-Business, e-
Services and e-Society (pp. 442-454). Springer, Cham.
Erşahin, B., Aktaş, Ö., Kılınç, D., & Akyol, C. (2017, October). Twitter fake account detection. In 2017
International Conference on Computer Science and Engineering (UBMK) (pp. 388-392). IEEE.
Jin, F., Dougherty, E., Saraf, P., Cao, Y., & Ramakrishnan, N. (2013, August). Epidemiological modeling
of news and rumors on twitter. In Proceedings of the 7th Workshop on Social Network Mining and
Analysis (p. 8). ACM.
Kimura, M., Saito, K., & Motoda, H. (2009, June). Efficient estimation of influence functions for SIS model
on social networks. In Twenty-First International Joint Conference on Artificial Intelligence.
Kwon, S., Cha, M., & Jung, K. (2017). Rumor detection over varying time windows. PloS one, 12(1),
e0168344.
Liu, Y., & Wu, Y. F. B. (2018, April). Early detection of fake news on social media through propagation
path classification with recurrent and convolutional networks. In Thirty-Second AAAI Conference on
Artificial Intelligence.
Ma, J., Gao, W., Mitra, P., Kwon, S., Jansen, B. J., Wong, K. F., & Cha, M. (2016, July). Detecting rumors
from microblogs with recurrent neural networks. In Proceedings of the Twenty-Fifth International
Joint Conference on Artificial Intelligence (pp. 3818-3824).
Ma, J., Gao, W., Wei, Z., Lu, Y., & Wong, K. F. (2015, October). Detect rumors using time series of social
context information on microblogging websites. In Proceedings of the 24th ACM International on
Conference on Information and Knowledge Management (pp. 1751-1754). ACM.
Nekovee, M., Moreno, Y., Bianconi, G., & Marsili, M. (2007). Theory of rumour spreading in complex
social networks. Physica A: Statistical Mechanics and its Applications, 374(1), 457-470.
Newman, M. E. (2002). Spread of epidemic disease on networks. Physical review E, 66(1), 016128.
McNair, B. (2017). Fake news: Falsehood, fabrication and fantasy in journalism. Routledge.
Safieddine, F., Dordevic, M., & Pourghomi, P. (2017, July). Spread of misinformation online: Simulation
impact of social media newsgroups. In 2017 IEEE Computing Conference (pp. 899-906).
Wang, Y. Q., Yang, X. Y., & Wang, J. (2014). A rumor spreading model with control mechanism on social
networks. Chinese Journal of Physics, 52(2), 816-829.
Yang, F., Liu, Y., Yu, X., & Yang, M. (2012, August). Automatic detection of rumor on Sina Weibo.
In Proceedings of the ACM SIGKDD Workshop on Mining Data Semantics (p. 13). ACM.
Zhao, Z., Resnick, P., & Mei, Q. (2015, May). Enquiring minds: Early detection of rumors in social media
from enquiry posts. In Proceedings of the 24th International Conference on World Wide Web (pp.
1395-1405). International World Wide Web Conferences Steering Committee.