Measuring behavioral trust in social networks

Abstract

Trust is an important yet complex and little understood aspect of the dyadic relationship between two entities. Trust plays an important role in the formation of coalitions in social networks and in determining how high-value information flows through the network. We present algorithmically quantifiable measures of trust based on communication behavior. We propose that trust results in likely communication behaviors which are statistically different from random communications; detecting these trust-like behaviors allows us to develop a quantitative measure of who trusts whom in the network. We develop algorithms to efficiently compute such behavioral trust and validate these measures on the Twitter network.
Measuring Behavioral Trust in Social Networks
Sibel Adali, Robert Escriva, Mark K. Goldberg, Mykola Hayvanovych, Malik Magdon-Ismail,
Boleslaw K. Szymanski, William A. Wallace and Gregory T. Williams
Abstract—Trust is an important yet complex and little under-
stood dyadic relation among actors in a social network. There are
many dimensions to trust; trust plays an important role in the
formation of coalitions in social networks, in assessing quality
and credibility of information as well as in determining how
information flows through the network.
In this paper, we present algorithmically quantifiable measures
of trust which can be determined from the communication behav-
ior of the actors in a social communication network. The basis for
our study is a proposition that trust results in likely communication
behavior patterns which are statistically different from random
communication in a network. Detecting the statistically significant
realizations of this trust-like behavior allows us to develop a
quantitative measure of who-trusts-whom relation in the network.
Since our measure of trust is based on quantifiable behavior,
we call it behavioral trust. We develop algorithms to efficiently
compute behavioral trust and we validate these measures on the
Twitter network.
I. INTRODUCTION
Trust is an important aspect of the relationship
between two entities. The trust landscape of
a social network (who trusts whom) plays an
important role in the intelligence and security
domain. Trust forms a basis for formation of
coalitions (strong communities are formed by
entities which “trust” each other); it can serve
to identify influential nodes in a network; and,
it determines how information will flow in a
social network: whether nodes will believe information
they receive and whether they choose to transmit it
to some other node. The reverse is also true:
communities can induce greater trust among the
members; continued information flow between
members can enhance the trust relationship be-
tween them.
Trust is a complex relationship. In general,
when we are deciding whether or not to trust
a person, we are all influenced by a host of
factors, such as: 1) Our own predisposition to
trust, which is linked to our psychology, which
itself was influenced by various events over
our lifetime; these events can be completely
unrelated to the person we are deciding to trust
or not trust. 2) Our relationship and past ex-
periences with the person and with his or her
friends, including rumors and gossip. 3) Our
opinions of actions and decisions the person
has made in the past. Thus, the problem of
estimating trust in social networks is a very
interesting and challenging one, because it is
not yet well understood or defined. To be able
to capture and/or quantify trust, we must focus
on some specific properties of trust, which may
have to be simplified, so that these properties
may be captured algorithmically. In this paper,
we aim to quantitatively measure dyadic trust
(trust between two entities) based on observed
communication behaviors in social networks –
we call this behavioral trust. A useful analogy
to keep in mind is the saying “imitation is the
best form of flattery” – imitation is a behavior
which is indicative of some dyadic relationship.
A typical social network consists of actors (in-
dividuals) and some form of communication be-
tween them, which could be phone calls, emails,
blog posts, etc. Increasingly, many social
relationships take place predominantly in
the form of electronic communications. People
meet, form trust relationships, and participate in
activities without any face-to-face contact. As
a result, the interactions between individuals in
the social network are a good indicator of their
social relationships. An
aspect of trust is based on the notion of embed-
dedness [1] which shows that the interactions
between individuals form a basis from which
a trust relationship may grow. Sometimes these
interactions may not require trust. However,
they establish a relationship that can be used to
build trust. Various characteristics of these
relationships, such as the balance in participation
and the persistence of communications, may
signal the formation of a trusting relationship.
The social mechanisms with which people form
trusting relationships in online communities are a
fairly new topic with a lot of unknowns. In this
paper, we study a number of social behaviors
that take place in this space: conversations and
propagation of information from one person to
another. We develop statistical measures based
on the timing and sequence of communications,
not the textual content. We give efficient al-
gorithms for computing our measures, making
them scalable to social networks on millions of
nodes. We show that these behaviors correlate
strongly with each other in terms of the indi-
viduals involved and the communities formed.
We also show that they correlate with actual
forwarding behavior indicative of trust. These
results give us a new set of behavioral measures
that can be used to measure existence, emer-
gence or dissolution of trusting relationships in
social networks.
Related Work. There has been work done on
trust in computer science as well as in social
science. In [2], Beth et al. present a method
for valuation of trustworthiness in open net-
works. In [3], Buskens proposes explanations
for the emergence of trust in social
networks when actors can label others as
untrustworthy, and when actors are informed
regularly about trustworthy behavior of others.
Abdul-Rahman and Hailes [4] and Aberer and
Despotovic [5] study reputation based trust and
trust management. Abdul-Rahman and Hailes
present a model in which agents tune their
measures of trust based on observed reputations,
and Aberer and Despotovic discuss a
trust model that is grounded in real-world social
trust characteristics and based on a reputation
mechanism, or word of mouth. Their
proposed model allows agents to decide which
other agents' opinions they trust more and allows
agents to progressively tune their understanding
of another agent's subjective recommendations.
In [5], Aberer and Despotovic present scalable
algorithms that require no central control
and allow for estimating trust by computing
an agent's reputation from its interactions with
other agents. In [6], Gray, Seigneur, Chen and
Jensen develop trust-based security mechanisms
using small world concepts to optimize forma-
tion and propagation of trust among entities in
a massive, networked infrastructure of diverse
units. They summarize that, in a very large mo-
bile ad hoc network, trust, risk, and recommen-
dations can be propagated through relatively
short paths connecting entities. In [7], Kuter
and Golbeck describe a different approach for
estimating trust in various computing systems.
They give an explicit probabilistic interpretation
for confidence in social networks. They describe
SUNNY, a new trust inference algorithm that
uses a probabilistic sampling technique to quan-
tify confidence and trust. SUNNY computes an
estimate of trust based on only those informa-
tion sources with high confidence estimates.
All the methods proposed above use semantic
information in some way and/or focus on a static
snapshot of a social network, which does not
capture all of the communication behavior and
dynamics. Conversely, we study the problem
of behavioral trust purely from the observed
communication statistics, using no semantic in-
formation. We give measures of behavioral trust
which apply to dynamic, streaming communica-
tion networks, for example the Twitter network.
We adopt the notion of interpersonal trust as
proposed in [8] by Wallace et al., which treats
trust as a social tie between a trustor and a
trustee [9]. Trust develops as part of an emo-
tional relationship between a pair of people akin
to the concepts of emotional and relational trust
[10], [11].
II. BEHAVIORAL TRUST
Let us formally define the problem now. The
input is the communication dynamics of a social
network, specified by a set of communication 3-tuples,
$\langle \text{sender}, \text{receiver}, \text{time} \rangle$;
note that we do not use communication content,
only the sender-receiver-time data. The output
is a behavioral trust graph $T$ induced from these
inputs. The nodes in this graph are the senders
and receivers. The edges are weighted, and the
edge weight $w_{ij}$ is the strength of the trust
relationship from node $i$ to node $j$ (trust can
generally be an asymmetric, directed relationship).
The basis for this work is the observation that
trust between two nodes A and B will result in
certain typical behaviors. These behaviors are
not only an expression of trust, but can also
facilitate the development of further trust. The
simplest such behavior is just conversation. Two
people who trust each other are likely to con-
verse; in addition, continued conversation can
lead to an enhancing of their trust relationship.
Note that such behavioral expressions are not
guaranteed expressions of trust. It is possible
to have a conversation with someone who you
do not trust; it is also possible to trust someone
but not converse with them. Thus, such
behavioral expressions of trust are better
viewed as noisy indicators. The more often they
occur, the more likely it is that a trust relationship
exists or will develop. Further, since
our measures are statistical, they ignore some of
the contextual aspects of trust. For example, you
trust your doctor for medical advice and your
accountant for tax advice. From the behavioral
point of view, you would converse with both
your doctor and your accountant; however, these are
distinct forms of trust. The contextual aspect
could be added back through the notion of “trust
communities”, but our present goal is simply to
measure whether there is a trust relationship
between two entities A and B.
It is also possible to measure distrust through
typical behaviors expressed by distrust. For ex-
ample, the seeking of a second opinion is a
measure of distrust. For the scope of this present
work, we focus on measuring dyadic trust. We
will focus on two particular behaviors as an ex-
pression of trust: conversation and propagation.
Specifically, if two nodes converse, then they are
more likely to trust each other. If one node prop-
agates information from another then it suggests
that the propagator trusts the information.
[Figure: two panels. Conversation, between A and B: A and B trust each other. Propagation, from A through B to X and Y: B trusts A.]
Our goal is to develop algorithmic measures
of conversation and propagation, and validate
these as measures of trust in the Twitter net-
work.
A. Conversational Trust
We postulate that the longer and more bal-
anced a conversation is between two nodes, the
more likely it is that they have a trust relation-
ship; in addition, the more conversations there
are between such a pair of nodes, the more
tightly connected they are. The basic task is to
first identify when two nodes are conversing.
Let A and B be a pair of users, and let
$M = \{t_1, t_2, \ldots, t_k\}$ be a sorted list of times
when a message was exchanged between A
and B. We define the average time between
messages, $\tau = (t_k - t_1)/k$. We would like
to construct, from the message set $M$, a set
of disjoint conversations. To do this, we say
that two consecutive messages $t_i, t_{i+1}$ are in the
same conversation if $t_{i+1} - t_i \le S \cdot \tau$ ($S$ is a user
defined “smoothing” factor). A straightforward
algorithm can be used to construct the set of
conversations $C = \{C_1, \ldots, C_\ell\}$ in a single
pass through $M$ using the following observation.
Suppose we are working on conversation
$C = \{t_{i_1}, \ldots, t_{i_c}\}$; if $t_{i_c+1} - t_{i_c} < S \cdot \tau$, then
we add $t_{i_c+1}$ to the conversation $C$, otherwise
we start a new conversation. We only used
conversations of size at least 2 in our experiments,
in which case $C$ may not be a complete partition
of $M$.
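As an illustration, the single-pass construction can be sketched in Python as follows. This is a minimal sketch rather than the implementation used in the experiments: the function name `split_conversations` and its parameters are hypothetical, and a gap exactly equal to S·τ is treated as continuing the conversation (a strict comparison differs only on exact ties).

```python
from typing import List


def split_conversations(times: List[float], S: float = 4.0,
                        min_size: int = 2) -> List[List[float]]:
    """Split sorted message times for one pair of users into conversations.

    Hypothetical helper for illustration; `times` must already be sorted.
    """
    k = len(times)
    if k == 0:
        return []
    # Average time between messages, tau = (t_k - t_1) / k.
    tau = (times[-1] - times[0]) / k
    conversations: List[List[float]] = []
    current = [times[0]]
    for prev, nxt in zip(times, times[1:]):
        if nxt - prev <= S * tau:
            current.append(nxt)        # small gap: same conversation
        else:
            conversations.append(current)
            current = [nxt]            # large gap: start a new conversation
    conversations.append(current)
    # Keep only conversations with at least min_size messages.
    return [c for c in conversations if len(c) >= min_size]
```

With a small smoothing factor the message list breaks into several short conversations and isolated messages are dropped; with a large one, all messages fall into a single conversation.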
The measure of conversational trust will be
based on the conversations in C, obeying the
following properties:
- Longer conversations imply more trust.
- More conversations imply more trust.
- Balanced participation by A and B implies more trust.
Note that one could add other requirements. For
example, if people who did trust each other
stop keeping in touch, their trust will likely
deteriorate over time, i.e., more widely spaced
conversations imply less trust. However, the
above three properties are a good starting point.
We define the conversational trust $T_c(A, B)$ as
follows:
$$T_c(A, B) = \sum_{i=1}^{\ell} \|C_i\| \cdot H(C_i),$$
where $H(C_i)$ is a measure of the balance in
the conversation. We use the entropy function
to measure balance:
$$H(C_i) = -p \log p - (1 - p) \log(1 - p),$$
where $p = p(C_i)$ is the fraction of messages in the
conversation $C_i$ that were sent by A. One can
verify that many, long and balanced conversations
lead to high trust as measured by $T_c$. Given
the stream of communications, we construct the
conversation trust graph, $T_c(V, E_c)$, where the
weight between a pair of agents $\{A, B\}$ is
$T_c(A, B)$; we normalize so that the maximum
weight is 1 and only keep edges with weight
at least 0.01 (this choice is arbitrary, and leads
to roughly the same order of edges as in the
propagation trust graph as we describe below).
The complexity of the algorithms for computing
conversational trust is $O(|D| \log |D|)$, where
$|D|$ is the size of the communication stream.
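A short Python sketch may help make the conversational trust score concrete. It is illustrative only and rests on assumptions not stated in the text: each conversation is represented as the list of its message senders, the size of a conversation is taken to be its number of messages, and the natural logarithm is used (the base only rescales the scores, which are normalized afterwards anyway). The names `balance_entropy` and `conversational_trust` are hypothetical.

```python
import math
from typing import Sequence


def balance_entropy(p: float) -> float:
    """Binary entropy of the fraction p; zero for one-sided conversations."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def conversational_trust(conversations: Sequence[Sequence[str]],
                         a: str) -> float:
    """Sum over conversations of (number of messages) * (balance entropy).

    Each conversation is the sequence of sender names of its messages,
    all exchanged between user `a` and one other user.
    """
    total = 0.0
    for senders in conversations:
        if not senders:
            continue  # skip empty conversations to avoid division by zero
        p = sum(1 for s in senders if s == a) / len(senders)
        total += len(senders) * balance_entropy(p)
    return total
```

Because the entropy is symmetric in p and 1 - p, the score is the same whichever of the two users is passed as `a`, which matches the undirected nature of the conversation trust graph.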
B. Propagation Trust
Our second measure of trust is based on the
propagation of information. If a person A sends
a message to person B and if B within some
time interval δ propagates the message to some
third person X, this is indicative of trust. If
B propagates information from A often, then
we propose that B must be trusting A. As
with conversational trust, propagation trust is
measured using only statistical communication
data without semantic information. Each time B
propagates information from A, it may be to a
different person; each such propagation signifies
trust in A even though it may be to different
people. Note that this measure of trust (unlike
the conversational trust measure) is directed. It
is possible for B to be propagating information
from A but not vice versa.
We now describe how to get the propagation
trust graph $T_p = (V, E_p)$. We need to discuss
how to construct the directed edge $B \to A$,
which means that B trusts A. We begin with two
sorted time lists of messages incoming to B, and
messages sent by B. We wish to associate pairs
of messages (one from each list) as propagations.
Based on communication statistics alone,
we cannot definitely determine which messages
from B are propagating; however, we can identify
“potential propagations”. Specifically, we
say that a message $m_1$ received by B was
potentially propagated by a message $m_2$ sent by
B if their times are close enough to satisfy the
propagation constraint:
$$\tau_{\min} \le t_{m_2} - t_{m_1} \le \tau_{\max}.$$
So we would like to find the maximum number
of potential propagations by B, and in particular,
the number of A's messages which B
potentially propagated. To do this, we need to
match messages incoming to B with messages
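outgoing from B; such matches are the potential
propagations, as illustrated below.
[Figure: messages x → B received at times $t_1, t_2, \ldots, t_n$ (left) matched to messages B → y sent at times $s_1, s_2, \ldots, s_m$ (right).]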
The first step is to find the maximum number
of potential propagations; this corresponds to
finding a maximum sized matching, where each
match satisfies the propagation constraint. This
matching problem can be solved efficiently in
linear time [12]. A subset of messages in this
maximum matching will be from A; these message
pairs are the ones we take as B’s propagations
of information from A. We only consider
as a valid propagation the pairs (A, B) for which
there was a statistically significant number of
propagations, as compared to a random communication
data stream with the same in- and
out-degree distributions, as in [12].
Notice that in the matching illustrated above,
none of the links cross. This corresponds to a
causality constraint, namely that if B propagated
two messages received at times
$t_1 < t_2$, the times of the propagations must
also satisfy this ordering. One can show that
some maximum matching satisfies this con-
straint. Given that the maximum matching can
be computed in linear time, the entire algo-
rithm to find propagations (after sorting mes-
sage times) takes O(|D|log |D|).
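To illustrate the matching step, the following Python sketch uses a greedy two-pointer scan over the two sorted time lists. It is offered as one plausible linear-time approach under the propagation constraint, not necessarily the algorithm of [12]; the function name `match_propagations` and its signature are hypothetical.

```python
from typing import List, Tuple


def match_propagations(incoming: List[float], outgoing: List[float],
                       tau_min: float, tau_max: float) -> List[Tuple[int, int]]:
    """Greedily pair incoming with outgoing message times.

    Both lists must be sorted. A pair (i, j) is allowed when
    tau_min <= outgoing[j] - incoming[i] <= tau_max. Returned pairs are
    increasing in both indices, so no two matches cross.
    """
    pairs: List[Tuple[int, int]] = []
    i = j = 0
    while i < len(incoming) and j < len(outgoing):
        delay = outgoing[j] - incoming[i]
        if delay < tau_min:
            # Too early for this incoming message and for all later ones.
            j += 1
        elif delay > tau_max:
            # Too late; every later outgoing message is later still.
            i += 1
        else:
            pairs.append((i, j))
            i += 1
            j += 1
    return pairs
```

Each step advances at least one pointer, so the scan is linear in the total number of messages once the lists are sorted. Since both indices only move forward, the causality (non-crossing) constraint holds by construction.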
Given the valid propagations (A, B), define
the quantities: $m_{AB}$, the number of messages A
sent to B; $\mathrm{prop}_B$, the number of propagations
by B (the size of the matching above); $\mathrm{prop}_{AB}$,
the number of messages A sent to B that were
propagated (the subset of the matching containing
messages from A). We consider two intuitive
ways to measure the directed trust weight
$T_p(B, A)$ from B to A:
$$\text{(i)}\; T_p(B, A) = \frac{\mathrm{prop}_{AB}}{\mathrm{prop}_B}; \qquad \text{(ii)}\; T_p(B, A) = \frac{\mathrm{prop}_{AB}}{m_{AB}}.$$
The first measure captures how much of B's
propagation energy is spent propagating messages
from A; the second captures the fraction
of A’s messages B considers worthy of propagating.
We have tried both in our experiments,
and they yield similar results. We only report
the results of (i). In extremely heterogeneous
networks, these two measures could capture different
aspects of trust; however, in homogeneous
networks they behave similarly.
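The two weightings can be computed side by side from the output of a matching. The sketch below is illustrative: it assumes the incoming messages of B are given as a list of sender names and the matching as a list of indices into that list, and the name `propagation_weights` is hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def propagation_weights(incoming_senders: List[str],
                        matched_incoming: List[int]) -> Dict[str, Tuple[float, float]]:
    """Return {A: (weight_i, weight_ii)} for one receiver B.

    incoming_senders[i] is the sender of B's i-th incoming message;
    matched_incoming lists the indices of incoming messages that were
    matched to an outgoing message (the potential propagations).
    """
    m = Counter(incoming_senders)                                  # m_AB
    prop = Counter(incoming_senders[i] for i in matched_incoming)  # prop_AB
    prop_b = len(matched_incoming)                                 # prop_B
    # Only senders with at least one propagation appear, so prop_b > 0 here.
    return {a: (prop[a] / prop_b, prop[a] / m[a]) for a in prop}
```

Weight (i) sums to 1 over all senders A for a fixed B, while weight (ii) is bounded by 1 separately for each sender.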
Next we discuss the Twitter data followed by
experiments to study and validate the conversa-
tion and propagation trust measures.
III. TWITTER DATA
Twitter is a popular online free service that
enables you to broadcast short messages to
your friends or “followers”, or engage in di-
rected conversations with specific individuals.
“Tweets” are text-based posts of up to 140 char-
acters displayed on the author’s profile page that
are delivered to the author’s subscribers (follow-
ers). Senders can restrict delivery to those in
their circle of friends or, by default, allow open
access.
We constructed a dataset by collecting the
publicly available communications between
tweeters. We reduced it into our standard input
format (sender, receiver, time). The dataset
consists of more than 2 million distinct users,
of which about 1,910,000 are senders (not all of
the users are active). There are about 230,000
public directed messages (tweets) per day.
Twitter lets users conveniently and
explicitly indicate that they are propagating a
message through the notion of a retweet. When
we gather retweets, we only gather the information
about the original sender of the message
and the person who retweeted it. There are
two types of retweeting, directed and broadcast:
a directed retweet is sent to a particular receiver,
and a broadcast retweet goes to all followers
of the retweeter. Short of interviewing people
and asking who they trust, a retweet (a true
propagation) is the next best construct within
Twitter for users to explicitly indicate trust in
another user. Thus, retweeting gives us a way to
validate our behavioral trust measures.
IV. EXPERIMENTS ON TWITTER DATA
We first ran some experiments to compare
the conversation and propagation trust graphs.
In many aspects, they are similar. We then
used Twitter retweets to validate our measures
of trust, and we show that our measures fare
better than random and prominence based null
hypotheses.
A. Computing Conversation and Propagation Trust Graphs
We used messages over a 10 week period,
containing 15,563,120 directed messages and
34,178,314 broadcast messages. We use only
directed messages to identify conversations for
the conversation trust graph Tc; for the prop-
agation trust graph Tp, we use directed and
broadcast messages (broadcasts are only used
for outgoing messages).
We built a random graph model for the Twit-
ter data to determine how many propagations
are a significant number. We found that over
$M = 1000$ random data sets, 4 propagations
of the form $A \to B \to x$ never happened,
which (using standard Chernoff bounds) gives a
greater than 99% p-value at the 95% confidence
level that 4 propagations in the Twitter data
would not happen under the null hypothesis
that Twitter is a random graph without dyadic
relationship structure. We now summarize some
of the properties of the computed trust graphs,
and how they relate to each other.
              Tc                      Tp
Parameters    Smoothing par. S = 4    τmin = 1; τmax = 120 (min)
Edges         202,058 undir. edges    323,820 dir. edges

Node set overlap
      Tc              Tp
Tc    82,947          69,203 (83%)
Tp    69,203 (70%)    99,534

Edge set overlap
      Tc              Tp
Tc    202,058         173,638 (86%)
Tp    173,638 (70%)   323,820
We treat the undirected edges in Tc as two
directed edges for purposes of comparing edge
sets. We note that there is significant similarity
between Tc and Tp, which is significantly above
random considering that there are over 2 million
users in our data. This says that the type of
relationship the two trust graphs are capturing
is similar.
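For concreteness, an edge-set comparison of this kind can be sketched in Python as below. The helper names `as_directed` and `edge_overlap` are hypothetical, and the sketch only shows the set arithmetic; it does not reproduce the counts reported above.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]


def as_directed(undirected: Iterable[Edge]) -> Set[Edge]:
    """Expand each undirected edge {a, b} into (a, b) and (b, a)."""
    out: Set[Edge] = set()
    for a, b in undirected:
        out.add((a, b))
        out.add((b, a))
    return out


def edge_overlap(first: Set[Edge], second: Set[Edge]) -> Tuple[int, float, float]:
    """Return (#common edges, fraction of first, fraction of second)."""
    common = first & second
    frac_first = len(common) / len(first) if first else 0.0
    frac_second = len(common) / len(second) if second else 0.0
    return len(common), frac_first, frac_second
```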
B. Trust Based Communities in Tc and Tp
Trust is the foundation of communities, and
it should be possible to discover communities
in the Twitter network by identifying clusters
such that there is high trust within the cluster.
This can be done by defining a cluster density
in terms of the trust-weights on the edges, and
then using local optimality together with iter-
ative search to identify clusters (see [13]). For
simplicity, we treat the graphs as having undi-
rected edges for clustering, though the directed
clustering method could also be used. Some
basic statistics of the communities are shown
below.
      # of Groups   Max. Group Size   Avg. Group Size
Tc    82947         280               7.06
Tp    81340         316               8.17
Again, notice that the two trust-graphs give sim-
ilar results, having roughly the same number of
communities, as well as a very similar average
community size. Indeed this similarity can be
more quantitatively measured by comparing the
sets of clusters arising from Tc versus Tp. To do
this we use the best match method in [14]. The
best match method takes every cluster arising
from Tc and compares it with the best match
cluster from Tp, and vice versa. The similar-
ity between the two sets of clusters is then
the average best match similarity. We can also
consider the similarity between the Tc-clusters
and a random set of clusters with the same size
distribution as the Tp-clusters; this serves as a
null distribution for determining whether the
observed similarity is significant. We compare
the set of trust based communities to 1,000 dif-
ferent random sets of clusters to get an average
similarity. The results are shown below.
          Tc      Tp      Random
Tc        1.00    0.79    0.42
Tp        0.79    1.00
Random    0.42            1.00
We see that the trust-based communities coming
from Tc and Tp have a similarity larger than
would be expected for random sets of this same
size distribution. This is a further indication
that both the conversational and propagation trust
graphs are capturing a similar dyadic relation-
ship.
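The best match comparison can be sketched as follows. The text does not spell out the similarity function or the exact averaging used in [14], so this sketch makes two labeled assumptions: cluster similarity is the Jaccard index, and the final score is the mean of the two one-way averages. The function names are hypothetical.

```python
from typing import List, Set


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard index of two clusters (assumed similarity; see lead-in)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def best_match_similarity(xs: List[Set[int]], ys: List[Set[int]]) -> float:
    """Average best-match similarity between two non-empty cluster sets."""

    def one_way(src: List[Set[int]], dst: List[Set[int]]) -> float:
        # For every cluster in src, take its best match in dst.
        return sum(max(jaccard(c, d) for d in dst) for c in src) / len(src)

    # Assumption: symmetrize by averaging the two directions.
    return 0.5 * (one_way(xs, ys) + one_way(ys, xs))
```

A null distribution in the spirit of the text could then be obtained by calling `best_match_similarity` against many random cluster sets with the same size distribution and averaging the results.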
The main goal of this section is to study some
of the properties of the conversation and propagation
trust graphs. In particular, we aim to establish
that, though they measure different behaviors,
both these behaviors result in establishing
similar relationships between nodes, both at a
local edge and node level, as well as on a
collective level as seen through the lens of trust-
based communities. Thus, both measures seem
to be capturing at least some part of the same
phenomenon. We would like to now provide
some evidence that this phenomenon is indeed
trust.
C. Validating Tc and Tp Using Retweets
A retweet is a definite propagation; we make
the assumption that when a user propagates
information from some other user, there must
be some element of trust between the two users.
Thus, we take a retweet of the form
$$A \to B \xrightarrow{\text{retweet}} x$$
as a proxy for directed trust $B \to A$ ($x$ could
be an individual or group of individuals, e.g.
followers); thus, we may consider directed as
well as broadcast retweets. A broadcast propagation
is not as significant a trust indicator as
a directed propagation, since a directed retweet
indicates that the user has carefully processed
the information and deemed it appropriate to
forward to some specific friend. Thus, we consider
the broadcast retweets as less significant
measures of trust than directed retweets. We
therefore build the retweet-trust graph Tr as
follows. If there is at least one directed retweet
$A \to B \to x$, then the directed edge $B \to A$
exists in Tr; if there are at least two broadcast
retweets by a node B of two different messages
from A, then the directed edge $B \to A$ exists in
Tr. The choices of 1 for the number of directed
retweets and 2 for the number of broadcast
retweets to indicate trust are somewhat
arbitrary and chosen for illustration. For
our 10 weeks of Twitter data, Tr had 90,057
nodes and 103,279 directed edges. About 20%
of the node set in Tr overlapped with the node
sets of Tc and Tp (recall that the node sets of Tc
and Tp are very similar).
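The construction rule for the retweet-trust graph can be sketched in Python as follows. The record layout is an assumption made for illustration: each retweet is a tuple (original sender, retweeter, message id, directed flag), where the message id is a hypothetical field used only to tell different messages apart. The function name `retweet_trust_edges` is likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (original sender A, retweeter B, message id, is_directed); assumed layout.
Retweet = Tuple[str, str, str, bool]


def retweet_trust_edges(retweets: Iterable[Retweet],
                        min_directed: int = 1,
                        min_broadcast: int = 2) -> Set[Tuple[str, str]]:
    """Return directed edges (B, A), meaning that B trusts A."""
    directed_count: Dict[Tuple[str, str], int] = defaultdict(int)
    broadcast_msgs: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for a, b, msg_id, is_directed in retweets:
        if is_directed:
            directed_count[(b, a)] += 1
        else:
            # Count distinct messages, not repeated retweets of one message.
            broadcast_msgs[(b, a)].add(msg_id)
    edges = {e for e, n in directed_count.items() if n >= min_directed}
    edges |= {e for e, ids in broadcast_msgs.items() if len(ids) >= min_broadcast}
    return edges
```

The two thresholds are exposed as parameters because, as noted above, the values 1 and 2 are somewhat arbitrary.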
Our main experimental result is that the be-
havioral trust graphs do indeed represent trust
(at least as captured by retweets). Every edge in
the behavioral trust graphs Tc and Tp represents
a trust relationship. If the retweet graph is our
proxy for trust, we should therefore expect that
every edge in the behavioral trust graphs should
be present in the retweet graph. In fact the frac-
tion of behavioral trust edges which are present
in the retweet graph is a measure of how well
the behavioral trust is capturing “retweet” trust,
which in turn is a proxy for trust. These results
are shown in the table below.
Conversational Trust vs. Retweets
           Fraction of edges in Tr
Tc         11.6 %
Trandom    2.5 %
Tdegree    2.7 %
About 12% of the edges in Tc are also present in
the retweet graph. To understand whether this
is significant, we consider two alternate null
models for building “trust” graphs. The first is
just a random model. So we select a set of
nodes randomly; the number of nodes we select
is exactly the number of nodes in Tc. We now
consider all the communications incident with
this random set of nodes to construct the random
trust graph Trandom. As can be seen above, only
2.5% of these edges of Trandom are present in
the retweet graph. Another plausible null model
for trust is the prominence model. Thus, one
might hypothesize that nodes which send many
messages (i.e. nodes with high communication
degree) might be trusted nodes. Indeed this is
the type of hypothesis consistent with preferen-
tial attachment type models. So, we construct
the high degree graph Tdegree in a similar way to
the random graph. Instead of selecting random
nodes, we select the highest degree nodes (the
same number as are present in Tc), and the
communications incident with these nodes are
the edges. As we see above, the high degree
nodes are no more trusted (with respect to the
edges appearing in the retweet graph) than the
random set of nodes. A similar picture arises in
the propagation trust graph Tp.
Propagation Trust vs. Retweets
           Fraction of edges in Tr
Tp         14.4 %
Trandom    3 %
Tdegree    2.9 %
We conclude that the fraction of edges in Tc
or Tp which appear in the retweet graph is
significant when compared to random nodes or
the prominent nodes (as measured by commu-
nication degree). This means that behavioral
trust links are capturing something more sophis-
ticated than simply links to prominent nodes.
Several low degree nodes are also picked. This
is to be expected as trust is not a phenomenon
restricted to voluminous users. The surprising
thing is that prominent nodes do not yield better
performance than random nodes, and impor-
tantly, the behavioral trust measure performs
more than 4 times better than random.
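The validation procedure can be summarized in a short Python sketch. It is a sketch under stated assumptions rather than the experimental code: messages are (sender, receiver, time) tuples, an edge of a null graph is taken to be the (sender, receiver) pair of a communication incident with a selected node, and communication degree is taken to be the number of messages sent. All function names are hypothetical.

```python
import random
from collections import Counter
from typing import List, Set, Tuple

Message = Tuple[str, str, float]   # (sender, receiver, time)
Edge = Tuple[str, str]


def fraction_in(edges: Set[Edge], reference: Set[Edge]) -> float:
    """Fraction of `edges` that also appear in `reference` (e.g. Tr)."""
    return len(edges & reference) / len(edges) if edges else 0.0


def incident_edges(messages: List[Message], nodes: Set[str]) -> Set[Edge]:
    """Edges from all communications touching the selected nodes."""
    return {(s, r) for s, r, _ in messages if s in nodes or r in nodes}


def random_null(messages: List[Message], n: int, seed: int = 0) -> Set[Edge]:
    """Null graph built from n nodes chosen uniformly at random."""
    users = sorted({u for s, r, _ in messages for u in (s, r)})
    chosen = set(random.Random(seed).sample(users, n))
    return incident_edges(messages, chosen)


def degree_null(messages: List[Message], n: int) -> Set[Edge]:
    """Null graph built from the n nodes that sent the most messages."""
    sent = Counter(s for s, _, _ in messages)
    top = {u for u, _ in sent.most_common(n)}
    return incident_edges(messages, top)
```

Comparing `fraction_in` for a behavioral trust graph against the same quantity for `random_null` and `degree_null`, with n set to the number of nodes in the behavioral graph, mirrors the comparison in the tables above.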
V. CONCLUSIONS
The main contribution of this paper is to
present measurable behavioral metrics for trust.
In this way we can quantify dyadic trust (a
highly complex relationship) through observ-
able communication behavior in social net-
works. In particular, our behavioral trust mea-
sures require only the communication traffic
stream (sender, receiver, time), and do not
look at the semantic content of the messages. We
have used Twitter data to illustrate our meth-
ods, which can be applied to very dynamic
social communication networks. We were able
to use retweet data available from Twitter to val-
idate our measures of behavioral trust because
retweets are explicit propagations of informa-
tion which indicate a trust in the information.
Our results indicate that our behavioral trust
measures correlate well with retweets (signif-
icantly better than a random null hypothesis),
and better than a simple measure of trust based
on prominence. The surprising result is that
prominence based trust does not fare better than
random.
We emphasize that our measures of trust do
not have access to retweet data, and so are
applicable to general social networks where all
one can observe are communications. The ad-
vantage of only using statistical communication
data (as opposed to semantic data) is that our
algorithms are scalable to larger networks (the
Twitter data we analyzed contained 2 million
nodes). These results are preliminary in the
sense that there is a lot more information in
the behavioral trust graphs than is presented
here, and so there are many directions for future
work:
1. The conversation graph Tc can be thresholded
at higher values to yield a much larger graph
than the propagation graph Tp. It would be
interesting to study the behavior of Tc and its
relationship to Tp as we increase this threshold.
We believe this relationship is interesting
because we hypothesize that conversation is a
beginning of a trust relationship and informa-
tion propagation relies on a pre-existing trust
relationship. Thus, we expect conversation
trust to precede propagation trust. Hence, it
would be very interesting to study how, in
the real data, edges in the conversation trust
graph Tc transition from low to high weight,
and perhaps eventually into propagation trust
edges. If this were indeed observed, it would
verify the hypothesis.
2. The intersection of the conversation and
propagation graphs $T_c \cap T_p$ would also be
interesting to study, as it provides a more
stringent measure of trust – not only is there
conversation but also propagation.
3. The advantage of statistical algorithms is
that they are efficient, but they ignore much
information. For example after building the
statistical propagation trust graph, we have
a set of candidate edges. We may now fil-
ter these edges using semantic analysis of
content to see which edges correspond to
real propagations of information. Thus, we
would be identifying the “retweets” through
semantic information – this is important for
networks where the retweet functionality is
not available.
4. Trust is a contextual relationship. In our trust
graphs, all the trust relationships are homo-
geneous. In reality, a node may trust one set
of nodes in one context (e.g. medical advice)
and another set in another context (e.g. movie
advice). Semantic analysis of the statistical
behavioral trust graphs could add the context
to behavioral trust.
5. Efficient algorithms for statistically analyz-
ing the values of messages along different
dimensions can considerably enhance the be-
havioral trust measures (see for example [15]
for methods to estimate value of messages).
Specifically, if a conversation contains high
value content, it is probably a better indica-
tor of trust. Similarly, if a propagation is a
propagation of high value information, it is
probably an indication of a stronger trust re-
lationship. Thus, value analysis of messages
could considerably enhance the behavioral
trust measures.
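As a rough illustration of directions 1 and 2 above, the sketch below intersects two trust graphs and measures how often conversation trust appears before propagation trust on shared edges. The dictionary representation, the function names, the integer time stamps, and the use of the minimum weight for the intersection are all assumptions made for this example; they are not part of the algorithms developed in this paper.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]          # directed edge (truster, trustee)
TrustGraph = Dict[Edge, float]  # edge -> trust weight


def intersect_trust_graphs(t_c: TrustGraph, t_p: TrustGraph) -> TrustGraph:
    """Keep only edges present in both the conversation and propagation graphs.

    Using the minimum of the two weights is an illustrative assumption.
    """
    return {e: min(w, t_p[e]) for e, w in t_c.items() if e in t_p}


def fraction_conversation_first(first_c: Dict[Edge, int],
                                first_p: Dict[Edge, int]) -> float:
    """Among edges seen in both graphs, return the fraction whose
    conversation trust first appeared strictly before propagation trust.

    first_c and first_p map an edge to the (hypothetical) time step at
    which it first entered the respective trust graph.
    """
    shared = [e for e in first_c if e in first_p]
    if not shared:
        return 0.0
    earlier = sum(1 for e in shared if first_c[e] < first_p[e])
    return earlier / len(shared)


if __name__ == "__main__":
    # Toy graphs with made-up weights, for illustration only.
    t_c = {("a", "b"): 0.9, ("b", "c"): 0.4, ("c", "a"): 0.7}
    t_p = {("a", "b"): 0.5, ("c", "a"): 0.8, ("d", "a"): 0.6}
    print(intersect_trust_graphs(t_c, t_p))

    first_c = {("a", "b"): 1, ("c", "a"): 5}
    first_p = {("a", "b"): 3, ("c", "a"): 2}
    print(fraction_conversation_first(first_c, first_p))
```

A fraction close to 1 on real data would be consistent with the hypothesis that conversation trust precedes propagation trust; a proper study would also need a significance test against a random baseline.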
ACKNOWLEDGMENTS
This material is based upon work partially
supported by the U.S. National Science Foun-
dation (NSF) under Grant Nos. IIS-0621303,
IIS-0522672, IIS-0324947, CNS-0323324, and
IIS-0634875; by the U.S. Office of Naval
Research (ONR) Contract N00014-06-1-0466
and by the U.S. Department of Homeland
Security (DHS) through the Center for Dy-
namic Data Analysis for Homeland Secu-
rity administered through ONR grant number
N00014-07-1-0150 to Rutgers University. This
research is continuing through participation in
the Network Science Collaborative Technol-
ogy Alliance sponsored by the U.S. Army Re-
search Laboratory under Agreement Number
W911NF-09-2-0053.
The content of this paper does not necessarily
reflect the position or policy of the U.S.
Government, and no official endorsement should be
inferred or implied.
REFERENCES
[1] M. Granovetter, “Economic action and social structure: The prob-
lem of embeddedness,” American Journal of Sociology, vol. 91,
pp. 481–510, 1985.
[2] T. Beth, M. Borcherding, and B. Klein, “Valuation of trust in
open networks,” in Proceedings of ESORICS, 1994.
[3] V. Buskens, Social Networks and Trust. The Netherlands:
Kluwer Academic Publishers, 2002.
[4] A. Abdul-Rahman and S. Hailes, “Supporting trust in virtual
communities,” in Proceedings of the 33rd Hawaii International
Conference on System Sciences, 2000.
[5] K. Aberer and Z. Despotovic, “Managing trust in a peer-2-peer
information system,” in Proceedings of the Tenth International
Conference on Information and Knowledge Management
(CIKM01), 2001, pp. 310–317.
[6] E. Gray, J.-M. Seigneur, Y. Chen, and C. Jensen, “Trust propa-
gation in small worlds,” in Proceedings of the First International
Conference on Trust Management, 2003.
[7] U. Kuter and J. Golbeck, “Sunny: A new algorithm for trust infer-
ence in social networks using probabilistic confidence models,”
in AAAI, 2007, pp. 1377–1382.
[8] K. Kelton, K. R. Fleischmann, and W. A. Wallace, “Trust in
digital information,” J. Amer. Society for Information Science and
Technology, vol. 59, pp. 363–374, 2008.
[9] R. C. Mayer, F. Schoorman, and J. Davis, “An integrative model
of organizational trust,” Academy of Management Review, 1995.
[10] J. Lewis and A. Weigert, “Trust as a social reality,” Social Forces,
1985.
[11] D. Rousseau, S. Sitnik, R. Burt, and C. Camerer, “Not so
different after all: A cross-discipline view of trust,” Academy of
Management Review, 1998.
[12] J. Baumes, M. Goldberg, M. Hayvanovych, M. Magdon-Ismail,
W. Wallace, and M. Zaki, “Finding hidden group structure in a
stream of communications,” Intelligence and Security Informatics
(ISI), 2006.
[13] J. Baumes, M. Goldberg, and M. Magdon-Ismail, “Efficient
identification of overlapping communities,” IEEE International
Conference on Intelligence and Security Informatics (ISI), pp.
27–36, May 19–20, 2005.
[14] M. Goldberg, M. Hayvanovych, and M. Magdon-Ismail, “Mea-
suring similarity between sets of overlapping clusters,” submitted
to AAAI 2010.
[15] Y. Zhou, K. Fleischmann, and W. Wallace, “Automatic text
analysis of values in the Enron email dataset: Clustering a social
network using the value patterns of actors,” in Proceedings of
the 43rd Hawaii International Conference on System Sciences,
Kauai, HI.