Event-based social networks: linking the online and offline social worlds

August 2012

DOI:10.1145/2339530.2339693

Authors:

Xingjie Liu

Pennsylvania State University

Yuanyuan Tian

Event-based Social Networks: Linking the Online and Offline Social Worlds

Xingjie Liu*, Qi He†, Yuanyuan Tian†, Wang-Chien Lee*, John McPherson†, Jiawei Han‡

*The Pennsylvania State University, †IBM Almaden Research Center, ‡University of Illinois at Urbana-Champaign

*{xzl106, wlee}@cse.psu.edu, †{heq, ytian, jmcphers}@us.ibm.com, ‡hanj@cs.uiuc.edu

ABSTRACT

Newly emerged event-based online social services, such as Meetup and Plancast, have experienced increased popularity and rapid growth. From these services, we observed a new type of social network: the event-based social network (EBSN). An EBSN contains not only online social interactions, as in conventional online social networks, but also valuable offline social interactions captured in offline activities. By analyzing real data collected from Meetup, we investigated EBSN properties and discovered many unique and interesting characteristics, such as heavy-tailed degree distributions and strong locality of social interactions.

We subsequently studied the heterogeneous nature (co-existence of both online and offline social interactions) of EBSNs on two challenging problems: community detection and information flow. We found that communities detected in EBSNs are more cohesive than those in other types of social networks (e.g. location-based social networks). In the context of information flow, we studied the event recommendation problem. By experimenting with various information diffusion patterns, we found that a community-based diffusion model that takes both online and offline interactions into account provides the best prediction power.

This paper is the first research to study EBSNs at scale and paves the way for future studies on this new type of social network. A sample dataset of this study can be downloaded from http://www.largenetwork.org/ebsn.

Categories and Subject Descriptors

H.3.4 [Information Storage and Retrieval]: Systems and Software - Information networks

General Terms

Algorithms, Experimentation.

Keywords

Event based Social Networks, Social Network Analysis, Social Event Recommendation, Online and Offline Social Behaviors, Heterogeneous Network

KDD’12, August 12–16, 2012, Beijing, China.

1. INTRODUCTION

Newly emerged event-based online social services, such as Meetup (www.meetup.com), Plancast (www.plancast.com), Yahoo! Upcoming (upcoming.yahoo.com) and Eventbrite (www.eventbrite.com), have provided convenient online platforms for people to create, distribute and organize social events. On these web services, people may propose social events ranging from informal get-togethers (e.g. movie night and dining out) to formal activities (e.g. technical conferences and business meetings). In addition to supporting typical online social networking facilities (e.g. sharing comments and photos), these event-based services also promote face-to-face offline social interactions. To date, many of these services have attracted a huge number of users and have been experiencing rapid business growth. For example, Meetup has 9.5 million active users, creating 280,000 social events every month; Plancast has over 100,000 registered users and over 230,000 visits per month.

Figure 1: Event-based Social Network Examples. Panels: the Meetup service (users, events, social groups) and its event-based social network; the Plancast service (users, events, following links) and its event-based social network.

As these event-based services continue to expand, we identify a new type of social network emerging from them: the event-based social network (EBSN). Like conventional online social networks, EBSNs provide an online virtual world where users exchange thoughts and share experiences. What distinguishes EBSNs from conventional social networks is that EBSNs also capture the face-to-face social interactions of users who participate in events in the offline physical world. Fig. 1 depicts two example EBSNs from Meetup and Plancast. In Meetup, users may share comments, photos and event plans with members of the same online social groups (e.g. “bay area photographers”, “Nevada county walkers”). In Plancast, users may directly “follow” others’ event calendars. Bi-directional co-memberships of online social groups in Meetup or uni-directional subscriptions in Plancast ultimately constitute an online social network, represented as the dashed lines on the right side of Fig. 1. Meanwhile, in both cases, users’ co-participation in the same events yields their offline social connections. These connections collectively form an offline social network, denoted as dotted lines in Fig. 1. The online and offline social interactions jointly define an EBSN.

Recent location-based online social networking services, such as Foursquare (foursquare.com) and Gowalla (gowalla.com), represent another type of popular social network, called a location-based social network (LBSN). They are somewhat similar to EBSNs, as they capture online social interactions as well as offline location checkins. However, unlike offline social events, which involve a group of people interacting socially, location checkins in LBSNs mostly represent individual behaviors, i.e. a particular user was at a specific location at a specific time. Although adjacent checkins were treated in [5] as one cause of social tie creation, it is estimated that adjacent checkins have only a 24% chance of leading to a new friendship in Gowalla. Therefore, in this paper, we only compare EBSNs against the online social networks in LBSNs.

To the best of our knowledge, this paper is the first work to identify an event-based social network as a co-existence of both online and offline social interactions, and to comprehensively study its properties. Our study revealed many aspects of EBSNs that are significantly different from conventional social networks. As our analysis will show, social events present very regular temporal and spatial patterns. In addition, both online and offline social interactions in EBSNs are extremely local. For example, we found that 70.65% of Meetup online friends and 84.61% of Meetup offline friends live within 10 miles of each other. To our surprise, the degree distributions of the Meetup EBSN do not follow the usual power-law distribution, but are more heavy-tailed than a power law. Furthermore, we found that the online and offline social interactions in an EBSN are positively correlated, implying a synergistic relationship between the two parts.

Community structure detection is a very useful approach for analyzing social networks. However, to correctly detect communities in an EBSN, one has to consider both online and offline social interactions. In this paper, we employ an extended Fiedler method to incorporate this heterogeneity during the community detection process. Through experiments, we demonstrate the advantage of this method over other approaches. We also observed that the detected communities in the Meetup EBSN are more cohesive than those of the Gowalla LBSN.

To further investigate information flow over EBSNs, we also study the problem of event participation recommendation. Due to the short lifetime of an event, the event participation recommendation problem differs significantly from the usual recommendation problem for movies or places. A recommendation for an event is only valid after the event is created and before the event starts. This leads to a cold-start problem. In this paper, we design a number of diffusion patterns that capture the information flow over the heterogeneous EBSNs. Through experiments, we demonstrate that the diffusion pattern that takes the community structures into account yields the best prediction power.

The rest of this paper is organized as follows. We describe the related work in Section 2 and formally define EBSNs in Section 3. We examine the properties of EBSNs in Section 4 and further investigate the community structures in Section 5. In Section 6, we tackle the event participation prediction problem to study the information flow over EBSNs. Finally, we conclude the paper in Section 7.

2. RELATED WORK

Offline social interactions in the physical world have always been important in sociology [9]. One line of work studies the origin of social relationships. In [12], Feld proposed a focus theory in which individuals organize their social interactions around foci, such as workplaces, families, etc., whereas [20, 16, 3] utilized affiliation to explain the construction of social connections. Chapter 4 of [11] provides a nice summary of these topics. Under the above theories, social events can be viewed as one type of focus or affiliation that creates the social interactions between participants.

Thanks to the popularity of event-based social network services, such as Meetup and Plancast, we are now able to get our hands on large-scale social data with rich information on both online activities and offline social events. In [24], Sander and Seminar attended 40 social events in Meetup and concluded that participants in Meetup social events have social structures instead of just strangers meeting strangers.

Similar to event-based social networks, location-based social networks also contain “online” social interactions and “offline” checkin information. Although adjacent location checkins may indicate implicit social interactions and social ties [5], checkins are usually sporadic [21] and largely represent individual behaviors. The geographical features of users were also examined to infer social ties in [7, 26]. In comparison to these works, the “offline” information (social events) studied in this paper contains not only location, but also time and the people involved.

3. EVENT-BASED SOCIAL NETWORKS

In this section, motivated by popular event-based social services, we define event-based social networks and describe how to construct the networks from collected datasets.

3.1 Event-based social services

As various online social networking services become prevalent, a new type of event-based social service has emerged. These web services help users create social event proposals, disseminate the proposals to related people, and keep track of all participants. To foster efficient communication and sharing, these event-based services also provide online social networking platforms to connect users with others who have similar interests. Below, we describe two examples of such event-based social services: Meetup and Plancast.

Meetup is an online social event service that helps people publish and participate in social events. On Meetup, a user creates a social event by specifying when, where and what the event is. The event creator then controls whether the event is made available to selected users or to the public. Other users may express their intent to join the event by RSVP (“yes”, “no” or “maybe”) online. To facilitate online interactions, meetup.com also allows users to form social groups (e.g. “bay area single moms”, “Nevada county walkers”) to share comments, photos and event plans.

Similar to meetup.com, Plancast is another web service that helps users create and organize events online. Users also RSVP to express their intent to join social events. In contrast to Meetup, which adopts social groups to connect users online, Plancast allows users to “follow” others’ social event calendars to establish online connections.

Meetup                          Gowalla
# Users         5,153,886       # Users         565,642
# Events        5,183,840       # Locations     2,838,143
# RSVPs         42,733,136      # Checkins      36,804,656
# Groups        97,587          # Social links  2,431,625
# Memberships   10,704,068

Table 1: Dataset Statistics

3.2 Event-based Social Networks Definition

Based on the event-based social services described above, we formulate a new type of social network, called an event-based social network (EBSN).

Like any social network, EBSNs capture social interactions among users. However, unlike other social networks, EBSNs incorporate two forms of social interactions: online social interactions and offline social interactions.

Online social interactions. In EBSNs, users can interact with each other online without the need for physical contact. For example, people can share thoughts and experiences with those in the same social group in Meetup. In Plancast, user comments and event plans are pushed to those who “follow” the user.

Offline social interactions. Social events play a major role in EBSNs. In a social event, people physically get together at a specific time and location, and do something together. Therefore, the social events in EBSNs represent the offline social interactions among event participants.

Definition: Formally, we define an EBSN as a heterogeneous network $G = \langle U, A^{on}, A^{off} \rangle$, where $U$ represents the set of users (vertices) with $|U| = n$, $A^{on}$ stands for the set of online social interactions (arcs), and $A^{off}$ denotes the set of offline social interactions (arcs). The online social interactions of an EBSN form an online social network $G^{on} = \langle U, A^{on} \rangle$, and the offline interactions of an EBSN compose an offline social network $G^{off} = \langle U, A^{off} \rangle$.

Note that the online social network or the offline social network of an EBSN can be either directed or undirected. For simplicity, we only focus on undirected online and offline networks in this paper.

The online social network [1, 18] or the offline social network [2, 22] alone is not new and has been studied extensively before. But the co-existence of both is what makes EBSNs special. As shown later in this paper, these two forms of social networks in EBSNs are intertwined but also have their own distinct characteristics.

3.3 Representative Datasets Description

To effectively study EBSNs and explore their unique properties against related LBSNs, we collected data from the popular event-based web service Meetup and the popular location-based social service Gowalla. In this section, we introduce the basic dataset statistics, as well as how the EBSN and LBSN are constructed from these datasets.

Meetup EBSN. We crawled meetup.com from Oct 2011 to Jan 2012. The collected data statistics are shown in Table 1. With the Meetup dataset, the online social network of the EBSN is constructed by capturing the co-membership of online social groups: users $u_i$ and $u_j$ are connected in the online social network $G^{on}$ if they are members of the same social group. Let $g_k$ denote a group with $|g_k|$ members; then $(u_i, u_j) \in A^{on}$ if and only if $\exists g_k$ such that $u_i \in g_k$ and $u_j \in g_k$. We consider users of a smaller group more closely connected than those of a larger group. Therefore, we adopt an approach similar to that in [19] to define the edge weights:

$$w^{on}_{i,j} = \sum_{\forall g_k :\, u_i \in g_k \wedge u_j \in g_k} \frac{1}{|g_k|}. \quad (1)$$

The offline social network of the EBSN, $G^{off}$, is constructed in a similar way based on the co-participation in social events: users $u_i$ and $u_j$ are connected if they co-participated in the same social event. If we use $e_k$ to represent a social event with $|e_k|$ participants, and $u_i \in e_k$ to denote the fact that $u_i$ participated in $e_k$, then the weight of the offline social interaction between $u_i$ and $u_j$ is defined as

$$w^{off}_{i,j} = \sum_{\forall e_k :\, u_i \in e_k \wedge u_j \in e_k} \frac{1}{|e_k|}. \quad (2)$$
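The construction behind Eqs. 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes each shared group or event of size s contributes 1/s to the pair's weight, and the function name `co_membership_weights` and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def co_membership_weights(collections):
    """Return {(u, v): weight}; each shared collection of size s adds 1/s.

    `collections` maps a group (or event) id to the set of its members.
    Pairs are stored with u < v because the network is undirected.
    """
    weights = defaultdict(float)
    for members in collections.values():
        size = len(members)
        if size < 2:
            continue  # a singleton creates no pairwise interaction
        for u, v in combinations(sorted(members), 2):
            weights[(u, v)] += 1.0 / size
    return dict(weights)


# Hypothetical toy data: two online groups and one offline event.
groups = {"g1": {"ann", "bob"}, "g2": {"ann", "bob", "cat", "dan"}}
events = {"e1": {"bob", "cat", "dan"}}

w_on = co_membership_weights(groups)   # online weights (groups)
w_off = co_membership_weights(events)  # offline weights (events)
print(w_on[("ann", "bob")])   # contributions 1/2 (g1) and 1/4 (g2)
print(w_off[("bob", "cat")])  # single contribution 1/3 (e1)
```

Enumerating all pairs is quadratic in the size of each group or event, so a crawl with very large groups would likely need sampling or a sparse-matrix formulation instead.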

Gowalla LBSN. Gowalla is a popular online location-based social networking service that allows individual users to “check in” at their current locations (with comments and photos) and share the checkins with their friends. Gowalla requires users to explicitly specify their friends. Users need to mutually accept each other as friends to establish an online social link.

We crawled Gowalla from Sep 2011 to Nov 2011 and collected a subset of the users’ online social networks and place checkins. The total numbers of users and locations are also summarized in Table 1. As discussed before, although this LBSN provides offline location checkins, these checkins cannot directly form an offline social network. Thus, the Gowalla LBSN only has an online social network in this study.

4. PROPERTIES OF EBSNS

In this section, we analyze the Meetup dataset to highlight the unique properties of EBSNs. As social events play a central role in EBSNs, we first study those properties specifically associated with social events. Then, we examine the network properties of EBSNs.

4.1 Social Events

Social events provide a platform for users to get together physically. A social event is characterized by two major features: event time and event location. First, we observe that social events exhibit regular temporal patterns. Fig. 2 depicts the social event time pattern on a weekly scale. On every weekday there is a small spike around 2pm in the afternoon, followed by a higher spike at 8pm in the evening. On weekends, events are distributed relatively evenly throughout the day.

Figure 2: Social event time histogram over every hour of one week. (Axes: event start time by hour, Mon through Sun, versus count; annotated peaks: 2PM, 8PM, 11AM, 2PM, 8PM.)
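A weekly histogram of this kind can be produced by bucketing event start times into (weekday, hour) slots. The sketch below is only an illustration under stated assumptions: the start times are hypothetical and are taken to be already expressed in each event's local time zone.

```python
from collections import Counter
from datetime import datetime


def hour_of_week_histogram(start_times):
    """Count events per (weekday, hour) slot; weekday 0 is Monday."""
    return Counter((t.weekday(), t.hour) for t in start_times)


# Hypothetical local start times.
starts = [
    datetime(2011, 11, 7, 20, 0),    # Monday, 8pm
    datetime(2011, 11, 14, 20, 30),  # Monday a week later, 8pm slot
    datetime(2011, 11, 8, 14, 0),    # Tuesday, 2pm
    datetime(2011, 11, 12, 11, 0),   # Saturday, 11am
]
hist = hour_of_week_histogram(starts)
peak_slot, peak_count = hist.most_common(1)[0]
```

Folding all weeks onto a single 7 x 24 grid is what makes recurring weekday peaks visible even when individual weeks are noisy.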

Figure 3: Social event geographical histogram. Each bar represents the number of social events in 100 square miles.

We also observe that social events are mainly located in urban areas. Fig. 3 depicts a US event geographical histogram with 100 square miles as the geographical unit.

4.2 Event and Group Participation

To understand the basic network properties of the Meetup EBSN, we first need to study event participation and group membership in Meetup. As shown in Fig. 4(a), most events are small with just a few participants, but big events with a large number of participants (the heavy tail) do exist in a non-trivial quantity. Similarly, Fig. 4(b) shows that large groups do have a significant presence. We examine how well these two distributions fit a power-law curve using the Kolmogorov-Smirnov test [6]. This approach estimates the following 3 parameters:

• $x_{min}$: the best fitted cutoff value, so that only values larger than $x_{min}$ fit a power-law distribution;

• $\hat{\alpha}$: the slope of the best fitted power-law distribution, so that values larger than $x_{min}$ follow the distribution $x^{-\hat{\alpha}}$;

• p-value: the statistical significance of the goodness of the power-law fit (a p-value larger than 0.1 suggests a significantly good fit).

Figure 4: Histogram of the number of participants per event and number of members per group. (a) # participants per event: normalized frequency versus # participants per event, fitted xmin = 250, fitted slope = 3.46. (b) # members per group: normalized frequency versus # members per social group, fitted xmin = 1045, fitted slope = 3.28.

By estimating the above parameters, we find that the event size follows a power-law distribution with high statistical significance (p-value 0.357) only above $x_{min} = 250$. Similarly, the number of members per group follows a power-law distribution with $\hat{\alpha} = 3.28$ only when the number of members is greater than 1045, and the fit is not significant (p-value 0.088). These two results suggest that although most events and social groups are small, large events and large groups do have a significant presence in the Meetup dataset.
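The fitting procedure can be illustrated with a short sketch. It is an approximation under stated assumptions, not the exact procedure of [6]: it uses the continuous maximum-likelihood estimator for the exponent (count data such as event sizes would normally call for a discrete variant), it selects the cutoff from a small hypothetical candidate list, and it omits the bootstrap step that produces the p-value.

```python
import math
import random


def fit_alpha(values, xmin):
    """Continuous maximum-likelihood estimate of the exponent for x >= xmin."""
    tail = [x for x in values if x >= xmin]
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)


def ks_distance(values, xmin, alpha):
    """Largest gap between the empirical and fitted CDFs over the tail."""
    tail = sorted(x for x in values if x >= xmin)
    n = len(tail)
    worst = 0.0
    for i, x in enumerate(tail):
        model = 1.0 - (x / xmin) ** (1.0 - alpha)
        # Compare with the empirical CDF just before and just after x.
        worst = max(worst, abs(model - i / n), abs(model - (i + 1) / n))
    return worst


def best_xmin(values, candidates):
    """Pick the candidate cutoff whose fitted tail has the smallest KS distance."""
    scored = []
    for xmin in candidates:
        alpha = fit_alpha(values, xmin)
        scored.append((ks_distance(values, xmin, alpha), xmin, alpha))
    return min(scored)


# Synthetic check: draw from a known power law (alpha = 2.5, xmin = 1) by
# inverse-transform sampling, then try to recover the exponent.
rng = random.Random(0)
samples = [(1.0 - rng.random()) ** (-1.0 / 1.5) for _ in range(5000)]
alpha_hat = fit_alpha(samples, 1.0)
distance = ks_distance(samples, 1.0, alpha_hat)
ks, xmin, alpha = best_xmin(samples, [1.0, 2.0, 5.0])
```

A small KS distance only says the tail is compatible with a power law above the chosen cutoff; the p-values quoted in the text additionally require comparing that distance against fits to synthetic power-law samples.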

4.3 Network Properties

Now we study the network properties of the Meetup EBSN by comparing it against the Gowalla LBSN. Table 2 lists some network properties of the Meetup EBSN online social network $G^{on}$, offline social network $G^{off}$, and combined network $G$, as well as the Gowalla LBSN social network. First, it can be clearly seen that the EBSN online social network is much denser than the EBSN offline social network (larger strongly connected component (SCC), higher clustering coefficient and lower average degree of separation). This is because a user connects to more people online than in actual social events. Second, all three EBSN social networks ($G^{on}$, $G^{off}$ and $G$) are much denser than the Gowalla LBSN, because Meetup users interact with each other by joining the same social groups or participating in the same social events, whereas Gowalla users have to mutually establish friendships to get connected.

                          Meetup EBSN                        Gowalla LBSN
                          G^on        G^off      G
Mean Degree               1,786.1     140.7      1,560.6     10.64
Median Degree             623         40         463         3
SCC Ratio                 0.999       0.993      0.997       0.987
Clustering Coef.          0.438       0.267      0.429       0.137
Degree Separation         3.00        4.25       3.07        4.47
Degree Fitted xmin        3,765       536        7,490       47
Degree Fitted α̂           2.49        2.53       2.50        2.53
Degree Fitting p-value    0.000       0.000      0.000       0.124

Table 2: Network statistics comparison between EBSN and LBSN.

To dig deeper into the network properties of the EBSN, we first study the degree distributions in Fig. 5. Again, we apply the Kolmogorov-Smirnov statistic to examine whether these distributions fit a power-law distribution. The estimated parameters are listed at the bottom of Table 2. While the Gowalla LBSN conforms to the power-law distribution, all three forms of the EBSN are more heavy-tailed than a power law. This heavy-tail phenomenon in the Meetup EBSN is correlated with the significant presence of big events and big social groups found in Section 4.2.

Figure 5: Degree distribution comparison between EBSN and LBSN.

Next, we analyze the correlation between each user’s online interactions and offline interactions. Applying the Pearson correlation, we observe a positive correlation between online and offline degrees (0.368) as well as between online and offline clustering coefficients (0.393). This implies that the online social network and the offline social network work together synergistically in the Meetup EBSN: each has a positive effect on the other.
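For reference, the Pearson coefficient between two per-user measurements can be computed as below. The degree lists are hypothetical and only show how the per-user online and offline values would be paired; they are not taken from the Meetup data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical degrees of the same five users in the two networks.
online_degree = [12, 40, 7, 95, 33]
offline_degree = [3, 9, 4, 10, 2]
r = pearson(online_degree, offline_degree)
```

The coefficient captures linear association only, so with heavy-tailed degrees a handful of very high-degree users can dominate its value.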

4.4 Locality of Social Interactions

Figure 6: Localities of Meetup EBSN and Gowalla LBSN. (a) locality of events: CDF of the distance (miles) from a user’s home to the event location (Meetup) or checkin location (Gowalla). (b) locality of friends: CDF of the geographical distance (miles) between friends’ homes for the Meetup EBSN online ($G^{on}$), offline ($G^{off}$) and full ($G$) networks and the Gowalla LBSN.

In the following, we further analyze the geographic aspects of social interactions. In Fig. 6(a), we examine the distance from a Meetup event location or a Gowalla checkin location to the user’s home location [4, 5]. As illustrated by this figure, although both events and checkins tend to be local to users’ home locations, the probability of event participation in Meetup decreases more dramatically as the distance increases. As observed, 81.93% of the events a user participates in are within 10 miles of his/her home location. This indicates that people’s social activities are much more location-constrained than place checkins. This is because people’s checkins are usually sporadic [21] and largely represent individual behaviors. Social events, which need all participants to meet at the same spot, must in most cases be located close to all the participants.

Next we compare the distances between friends’ home locations in the Meetup EBSN against those in the Gowalla LBSN. As depicted in Fig. 6(b), friends in Meetup, whether in the online, offline, or combined social network, are geographically much closer to each other than friends in the Gowalla LBSN. This is because both the online and offline social networks in the Meetup EBSN revolve around social events, which require participants to physically get together at the same location. In comparison, it is perfectly fine and usual for a Gowalla user to share a location checkin when he/she visits some new places. Not surprisingly, offline friends in the Meetup EBSN tend to live closer to each other than online friends: 84.61% of offline friends live within 10 miles of each other.
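A locality statistic such as "share of friend pairs living within 10 miles" can be computed from home coordinates with the haversine great-circle distance. The sketch below assumes homes are given as (latitude, longitude) pairs in degrees; the coordinates, names and the helper `fraction_within` are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def fraction_within(edges, homes, radius_miles=10.0):
    """Share of friend pairs whose homes are at most radius_miles apart."""
    close = sum(
        1 for u, v in edges
        if haversine_miles(*homes[u], *homes[v]) <= radius_miles
    )
    return close / len(edges)


# Hypothetical homes: two in the San Francisco Bay Area, one in New York.
homes = {
    "ann": (37.77, -122.42),
    "bob": (37.80, -122.27),
    "cat": (40.71, -74.01),
}
edges = [("ann", "bob"), ("ann", "cat")]
share = fraction_within(edges, homes)
```

Evaluating `fraction_within` over a range of radii gives the empirical CDF of the kind plotted in Fig. 6(b).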

5. EBSNS COMMUNITY STRUCTURE

In this section, we investigate the community structures of EBSNs. Due to the heterogeneity of EBSNs, communities are defined by both online and offline interactions. (Although a group or an event in Meetup somewhat captures the behaviors of a set of users either online or offline, it is the combination of online and offline interactions that defines a community in EBSNs.) As a result, previous community detection algorithms on homogeneous networks do not directly apply to EBSNs. Thus, we employ an extended Fiedler method to detect communities in EBSNs and compare it against the previous approaches. We also use the Gowalla LBSN as a comparison to further study the unique features of the Meetup EBSN.

5.1 Clustering on Homogeneous Networks

For homogeneous social networks like the online or offline network of an EBSN, we use the popular Fiedler method offered by the Graclus tool [10] to partition networks. The partitioned clusters are treated as user communities. Let $A$ denote the adjacency matrix of a network. The popular Normalized Cut (NCut) [27] shown in Eq. 3 is applied as the graph partition objective function for each binary cut.

$$\min_{y} \frac{y^T (D - A)\, y}{y^T D\, y}, \quad \text{subject to } y^T D \mathbf{1} = 0,\ y \neq 0. \quad (3)$$

In Eq. 3, $D$ is the diagonal matrix in which each diagonal value is the sum of the corresponding row ($D_{ii} = \sum_j A_{ij}$), $L = D - A$ is the Laplacian matrix, $y$ is the column vector with $y_i \in \{1, -b\}$, and $b$ is some data-dependent constant. The column vector $y$ represents the graph cutting result of the current binary cut, since all nodes with $y_i = 1$ are clustered into one cluster and the other nodes with $y_i = -b$ are clustered into another cluster. If $y$ is relaxed to take on real values, Eq. 3 is equivalent to solving the generalized eigenvalue system $Ly = \lambda D y$, where $y$ is the Fiedler vector corresponding to the second smallest eigenvalue.
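To make the role of the discrete indicator vector concrete, the sketch below evaluates the normalized cut of a given bipartition in two ways: directly from the cut weight and the two volumes, and as the quotient of Eq. 3 with $y_i \in \{1, -b\}$, taking $b$ as the ratio of the two volumes (the standard choice that satisfies $y^T D \mathbf{1} = 0$; it is an assumption here, since the text leaves $b$ unspecified). The toy graph and helper names are hypothetical, and no eigen-solver is involved.

```python
def make_graph(edges):
    """Build a symmetric dict-of-dicts adjacency from (u, v, weight) triples."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def ncut_cost(adj, part_a):
    """Normalized cut: cut/vol(A) + cut/vol(B) for the bipartition (A, rest)."""
    part_a = set(part_a)
    cut = vol_a = vol_b = 0.0
    for u, nbrs in adj.items():
        for v, w in nbrs.items():
            if u in part_a:
                vol_a += w
                if v not in part_a:
                    cut += w
            else:
                vol_b += w
    return cut / vol_a + cut / vol_b


def rayleigh_quotient(adj, part_a):
    """y^T (D - A) y / y^T D y for y_i = 1 on part_a and -b elsewhere."""
    part_a = set(part_a)
    degree = {u: sum(nbrs.values()) for u, nbrs in adj.items()}
    vol_a = sum(d for u, d in degree.items() if u in part_a)
    vol_b = sum(d for u, d in degree.items() if u not in part_a)
    b = vol_a / vol_b  # this choice makes y^T D 1 = 0
    y = {u: (1.0 if u in part_a else -b) for u in adj}
    # y^T (D - A) y is the weighted sum of squared differences over edges;
    # the factor 0.5 corrects for visiting each undirected edge twice.
    numer = 0.5 * sum(
        w * (y[u] - y[v]) ** 2 for u, nbrs in adj.items() for v, w in nbrs.items()
    )
    denom = sum(degree[u] * y[u] ** 2 for u in adj)
    return numer / denom


# Two triangles joined by one weak edge.
graph = make_graph([
    ("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
    ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),
    ("c", "d", 0.1),
])
good = ncut_cost(graph, {"a", "b", "c"})  # cuts only the weak bridge
bad = ncut_cost(graph, {"a"})             # isolates a single node
```

The two functions agree on any bipartition, which is why relaxing $y$ to real values turns the combinatorial search into the eigenvalue problem $Ly = \lambda D y$ described above.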

5.2 Clustering on Heterogeneous EBSNs

5.2.1 Baseline 1: Linear Combination

Given an EBSN $G$, we have two separate but correlated networks $G^{on} = \langle U, A^{on} \rangle$ and $G^{off} = \langle U, A^{off} \rangle$. Both $G^{on}$ and $G^{off}$ share the same user set $U$. As a result, the clustering process should consider the correlation between $G^{on}$ and $G^{off}$. The simplest way to leverage both online and offline social interactions is to combine them linearly:

$$A = \gamma \cdot A^{on} + (1 - \gamma) \cdot A^{off}. \quad (4)$$

Here $A$ defines a linearly combined adjacency matrix with a weighting parameter $\gamma$ to differentiate the two types of interactions. We name this naive method LinearComb and use it as a baseline for comparison. The major problem of LinearComb is that after the linear combination, the social interaction type information is lost in the new matrix $A$.
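The linear combination of Eq. 4 is easy to state on sparse edge-weight maps. The sketch below is an illustration with hypothetical weights and a hypothetical function name `linear_comb`; edges missing from one network are treated as weight 0 there.

```python
def linear_comb(w_on, w_off, gamma):
    """Combine two edge-weight maps as gamma * on + (1 - gamma) * off."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    combined = {}
    for edge in set(w_on) | set(w_off):
        combined[edge] = (gamma * w_on.get(edge, 0.0)
                          + (1.0 - gamma) * w_off.get(edge, 0.0))
    return combined


# Hypothetical online and offline weights on undirected edges.
w_on = {("ann", "bob"): 0.75, ("ann", "cat"): 0.25}
w_off = {("ann", "bob"): 0.5, ("bob", "cat"): 1.0}
w = linear_comb(w_on, w_off, gamma=0.5)
```

In this toy example the combined weight of ("bob", "cat") no longer reveals that the tie is purely offline, which is the loss of interaction-type information noted above.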

5.2.2 Baseline 2: Generalized SVD

As another baseline, we utilize Generalized Singular Value Decomposition (GSVD) to incorporate online and offline social interactions in the clustering process by following Theorem 5.1.

Theorem 5.1. Given two EBSN social interaction matrices $A^{on} \in \mathbb{R}^{n \times n}$ and $A^{off} \in \mathbb{R}^{n \times n}$, there exist unitary matrices $\mu, \nu \in \mathbb{R}^{n \times n}$, an invertible matrix $Y \in \mathbb{R}^{n \times n}$ and rectangular diagonal matrices $\Sigma^{on}$ and $\Sigma^{off}$ such that:

$$A^{on} = \mu \Sigma^{on} Y^T, \quad A^{off} = \nu \Sigma^{off} Y^T.$$

The proof of Theorem 5.1 can be found in [14]. In Theorem 5.1, the singular vectors of matrix $Y$ (from the second column onwards) collectively offer a consistent clustering of users by leveraging both online and offline social interactions. In this method, the singular vectors of the 2nd to $m$-th smallest singular values are used as $(m-1)$-dimensional indicator vectors for users. Then, a classic K-means algorithm is run on this space to generate user communities. We name this method GSVD.

One shortcoming of GSVD is that, as $Y$ is not a unitary matrix, the values in its different column vectors vary widely in range. Therefore, the partitioning information embedded in $Y$ cannot be simply differentiated by sign, as in the classic SVD. In experiments, we also found that the performance of GSVD is rather sensitive to the choice of similarity measure on the singular vectors of $Y$. After many comparisons, we chose the city block similarity measure for GSVD.

Algorithm 1: HeteroClu

Input: EBSN $G = \langle U, A^{on}, A^{off} \rangle$, # clusters $K$
Output: User cluster set $C$

Initialize $C = \{C_1, C_2, \ldots, C_n\}$, where each $C_i = \{u_i\}$;
Initialize normalized weights $\bar{w}_{i,j} \leftarrow (\sum_{u_p \in C_i, u_q \in C_j} w_{p,q}) / (|C_i| \cdot |C_j|)$ for connected $C_i$, $C_j$;
while $|C| > M$ do /* bottom-up cluster */
    Find the largest $\bar{w}_{i,j}$;
    Merge $C_i$ and $C_j$, update related normalized weights;
while $|C| < K$ do /* top-down partition */
    Binary cut all M clusters following the objective Eq. 5;
    if $C_i$ is the cluster with the minimum cut cost then
        delete $C_i$ from $C$;
        Add the split parts of $C_i$ into $C$;
return $C$

5.2.3 Extended Fiedler Method

We now propose an algorithm that clusters online and offline interactions at the same time. This algorithm employs the following objective function based on the normalized cut (Eq. 3):

$$\min_{y}\ \alpha \frac{y^T (D^{on} - A^{on})\, y}{y^T D^{on} y} + (1 - \alpha) \frac{y^T (D^{off} - A^{off})\, y}{y^T D^{off} y}, \quad (5)$$

$$\text{subject to } y^T D^{on} \mathbf{1} = 0,\ y^T D^{off} \mathbf{1} = 0,\ y \neq 0.$$

The above objective function contains two parts; each part alone is a normalized cut objective function on the individual online or offline social network, but the linear combination of both defines a global optimization over the heterogeneous EBSN. The coupling factor $\alpha$ is used to weigh the importance of each network. Note that each part is a normalized value between 0 and 1. Therefore, the size of the individual online or offline network is not captured in Eq. 5. A naive way to assign the importance of the two parts is to set $\alpha = 0.5$. However, since the online and offline networks have different network densities, we set $\alpha$ as

$$\alpha = \frac{sum(A^{on})}{sum(A^{on}) + sum(A^{off})}.$$
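As a rough illustration of how the coupling factor and the two-part objective interact, the sketch below scores a given bipartition by the $\alpha$-weighted sum of its normalized cuts in the online and offline networks. This is a discrete reading of Eq. 5 adopted here for illustration, not the authors' solver; the edge weights and function names are hypothetical, and each side of the partition is assumed to touch at least one edge in each network.

```python
def coupling_factor(w_on, w_off):
    """alpha = sum(A_on) / (sum(A_on) + sum(A_off)).

    Summing each undirected edge once instead of twice leaves the ratio unchanged.
    """
    total_on = sum(w_on.values())
    total_off = sum(w_off.values())
    return total_on / (total_on + total_off)


def ncut(weights, part_a):
    """Normalized cut of (part_a, rest) for an undirected edge-weight map."""
    part_a = set(part_a)
    cut = vol_a = vol_b = 0.0
    for (u, v), w in weights.items():
        for node in (u, v):  # an edge adds w to the volume of each endpoint's side
            if node in part_a:
                vol_a += w
            else:
                vol_b += w
        if (u in part_a) != (v in part_a):
            cut += w
    if vol_a == 0.0 or vol_b == 0.0:
        raise ValueError("each side must touch at least one edge")
    return cut / vol_a + cut / vol_b


def hetero_cost(w_on, w_off, part_a):
    """alpha * NCut_on + (1 - alpha) * NCut_off for one bipartition."""
    alpha = coupling_factor(w_on, w_off)
    return alpha * ncut(w_on, part_a) + (1.0 - alpha) * ncut(w_off, part_a)


# Hypothetical networks over the same six users.
w_on = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0,
        ("d", "e"): 1.0, ("e", "f"): 1.0, ("d", "f"): 1.0,
        ("c", "d"): 0.2}
w_off = {("a", "b"): 1.0, ("b", "c"): 1.0,
         ("d", "e"): 1.0, ("e", "f"): 1.0,
         ("c", "d"): 0.5}
alpha = coupling_factor(w_on, w_off)
good = hetero_cost(w_on, w_off, {"a", "b", "c"})
bad = hetero_cost(w_on, w_off, {"a"})
```

With the density-based $\alpha$, the denser online network carries more weight in the combined cost than it would under the naive setting of 0.5.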

Objective functions similar to Eq. 5 have been used in the high-order co-clustering problem on multiple types of heterogeneous objects [13]. Solving the new objective function (Eq. 5) is non-trivial, as it represents a typical quadratic fractional programming problem. In [13], a similar function was first approximated as a quadratically constrained quadratic programming problem by fixing the two denominators of the function as constants. Then, standard semi-definite programming was applied to compute $y$ efficiently.

In this paper, we use a heuristic algorithm shown in Al-

gorithm 1 to solve the clustering problem with the objective

function deﬁned in Eq. 5. This algorithm ﬁrst employs a

bottom-up clustering algorithm on the linear combination

of online and oﬄine social networks as deﬁned in Eq. 4, to

generate M (M << K) giant loose clusters in a bottom-up

fashion. This step deﬁnes a local greedy merge procedure.

Then it uses the top-down recursive binary cut procedure

to cut large clusters to smaller ones until K clusters are

achieved. This step deﬁnes a global recursive cut procedure.
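The bottom-up phase described above can be sketched as a greedy merge on a combined weight matrix. This is a minimal, unoptimized sketch under stated assumptions: the matrix `W` stands for the linearly combined weights, all pairwise cluster weights are recomputed on every pass rather than updated incrementally, and the function names and toy data are hypothetical. The top-down binary cut phase is not shown.

```python
def normalized_weight(W, Ci, Cj):
    """Total weight between two clusters divided by |Ci| * |Cj|."""
    total = sum(W[p][q] for p in Ci for q in Cj)
    return total / (len(Ci) * len(Cj))

def bottom_up_merge(W, M):
    """Merge the most strongly connected pair of clusters until M remain."""
    clusters = [[i] for i in range(len(W))]  # start from singletons
    while len(clusters) > M:
        best, best_pair = 0.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                w = normalized_weight(W, clusters[a], clusters[b])
                if w > best:
                    best, best_pair = w, (a, b)
        if best_pair is None:  # no connected pair is left to merge
            break
        a, b = best_pair
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters

# Toy symmetric weights (made up): users 0-1-2 are tightly linked, as are 3-4.
W = [
    [0, 3, 0, 0, 0],
    [3, 0, 2, 0, 0],
    [0, 2, 0, 0.5, 0],
    [0, 0, 0.5, 0, 4],
    [0, 0, 0, 4, 0],
]
giant_clusters = bottom_up_merge(W, M=2)
print(sorted(sorted(c) for c in giant_clusters))
```

Dividing by the product of the cluster sizes keeps a large cluster from absorbing everything simply because it has more edges.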

[Figure 7 plot: Davies-Bouldin index versus # clusters (K), in two panels. Legend: Online EBSN Partition, Offline EBSN Partition, EBSN LinearComb, EBSN GSVD, EBSN HeteroClu, Online LBSN (Gowalla). DB index values marked in the grey rectangles: 2.93, 2.53, 2.02, 1.80, 2.20, 1.98.]

Figure 7: Community detection performance. The score inside the grey rectangle is the DB index under the optimal K based on the “knee” method.

5.3 Community Structure Evaluation

5.3.1 Evaluation Settings

To measure the quality of user communities, we use the

collected user tags as the external ground truth of latent

community semantics. 78,158 unique user tags were collected from Meetup and treated as the Meetup tag space $T$ with $|T| = m$. For each user $u_i$, we built a binary user-tag vector $\mathbf{u}_i = \{t_1, t_2, \ldots, t_m\}$ where $t_j = 1$ if $u_i$ selects the tag $t_j$; otherwise $t_j = 0$. After normalization, the similarity between two users $u_i$ and $u_j$ is measured by the cosine similarity $\mathbf{u}_i \cdot \mathbf{u}_j$. There are no user tags available in Gowalla. Instead, we aggregated all location tags of a user’s checkins to build the user-tag vector, in which $t_j$ is the number of checkins associated with tag $t_j$ of user $u_i$. In total, 680 unique tags were collected in Gowalla.
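The construction of normalized user-tag vectors can be illustrated with a small sketch. The tag names and users below are invented, and unit-length (L2) scaling is assumed as the normalization, so that the dot product of two vectors equals their cosine similarity.

```python
import math

def tag_vector(user_tags, tag_space):
    """Binary user-tag vector over an ordered tag space, scaled to unit length."""
    raw = [1.0 if tag in user_tags else 0.0 for tag in tag_space]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw] if norm > 0 else raw

def cosine(u, v):
    """Dot product, which equals the cosine similarity for unit-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical tag space and users.
tag_space = ["hiking", "python", "jazz", "yoga"]
alice = tag_vector({"hiking", "python"}, tag_space)
bob = tag_vector({"python", "jazz"}, tag_space)
carol = tag_vector({"yoga"}, tag_space)

print(cosine(alice, bob))    # one shared tag out of two each
print(cosine(alice, carol))  # no shared tag
```

For the check-in based variant, the same code applies if the 0/1 entries are replaced by per-tag check-in counts before scaling.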

The standard Davies-Bouldin (DB) index [8] was used to

measure the cohesiveness of communities, which is given by

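$$DB = \frac{1}{K} \sum_{k=1}^{K} \max_{k \neq j} \left( \frac{2 - \sigma_k - \sigma_j}{1 - \mathbf{c}_k \cdot \mathbf{c}_j} \right), \qquad (6)$$

where $K$ is the number of communities, $\mathbf{c}_k = \frac{1}{|C_k|} \sum_{u_i \in C_k} \mathbf{u}_i$ is the centroid vector of cluster $C_k$ after renormalization, and $\sigma_k = \frac{1}{|C_k|} \sum_{u_i \in C_k} \mathbf{u}_i \cdot \mathbf{c}_k$ is the average similarity of users in cluster $C_k$ to their centroid. A smaller DB index value indicates a more cohesive community.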
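A minimal sketch of this cosine-based DB index follows, assuming every user vector already has unit length and that no two centroids coincide (otherwise the denominator would be zero). The function names and the two-dimensional toy clusters are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def unit(v):
    """Rescale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def db_index(clusters):
    """clusters: list of clusters, each a list of unit-length user vectors."""
    centroids, sigmas = [], []
    for members in clusters:
        dim = len(members[0])
        mean = [sum(u[d] for u in members) / len(members) for d in range(dim)]
        c = unit(mean)  # renormalized centroid
        centroids.append(c)
        # average similarity of the members to their centroid
        sigmas.append(sum(dot(u, c) for u in members) / len(members))
    K = len(clusters)
    total = 0.0
    for k in range(K):
        # worst (largest) ratio of within-cluster spread to between-cluster distance
        total += max(
            (2 - sigmas[k] - sigmas[j]) / (1 - dot(centroids[k], centroids[j]))
            for j in range(K) if j != k
        )
    return total / K

# Toy clusters (made up): identical members versus a looser first cluster.
tight = [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [0.0, 1.0]]]
loose = [[[1.0, 0.0], [0.6, 0.8]], [[0.0, 1.0], [0.0, 1.0]]]
print(db_index(tight), db_index(loose))
```

The looser clustering receives the larger value, which matches the reading that a smaller index means more cohesive communities.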

5.3.2 Results

Determining the optimal K for a clustering has been an

open problem for decades. For a fair comparison on vari-

ous approaches and datasets, we used a simple yet popular

method that identiﬁes the “knee” [15] in the plot of DB in-

dex vs. K to determine the optimal K for each clustering

ﬁrst; and then compare the corresponding DB index under

the optimal K. The DB index value corresponding to the

“knee” can be seen as the best clustering performance that

one method can achieve.
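The exact knee criterion of [15] is not spelled out here, so the sketch below swaps in one common heuristic: after rescaling both axes to [0, 1], it picks the point that lies farthest from the straight line joining the first and last points of the curve. The function name and the DB index values are made up for illustration and are not taken from Fig. 7.

```python
def knee_index(ks, scores):
    """Index of the point farthest from the chord between the curve's endpoints."""
    span_k = ks[-1] - ks[0]
    lo, hi = min(scores), max(scores)
    span_s = (hi - lo) or 1.0  # avoid division by zero on a flat curve
    xs = [(k - ks[0]) / span_k for k in ks]
    ys = [(s - lo) / span_s for s in scores]
    slope = ys[-1] - ys[0]  # x runs from 0 to 1, so this is the chord's slope
    # The vertical gap to the chord is proportional to the perpendicular distance.
    gaps = [abs(ys[0] + slope * x - y) for x, y in zip(xs, ys)]
    return max(range(len(ks)), key=gaps.__getitem__)

# Hypothetical DB index curve: a steep drop followed by a plateau.
ks = [1000, 2000, 3000, 4000, 5000]
scores = [4.0, 2.2, 2.0, 1.9, 1.85]
best = knee_index(ks, scores)
print(ks[best], scores[best])
```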

Fig. 7 compares the best DB index of each method based

on the “knee” method. Note that since the DB index av-

erages over all the worst separated clustering pairs, it is

possible that the DB index has a value greater than 2.

As shown in Figure 7, the communities for the Meetup

EBSN are more cohesive than those for Gowalla LBSN.

One interesting finding is that users in online Meetup EBSN communities are more cohesive than users in offline Meetup EBSN communities (by 0.33), indicating that users tend to have more similar interests if they belong to the same groups, compared to those who participated in similar events. However, the combination of online and offline interactions does

play an important role in the clustering process, as three

methods LinearCom, GSVD and HeteroClu outperformed

individual networks. The LinearCom is only slightly better

than individual networks (by 0.18) but worse than HeteroClu

(by 0.22), indicating that a simple linear combination can-

not diﬀerentiate heterogeneous types of social interactions

eﬀectively. The GSVD has almost the same performance as

LinearCom, suggesting that after relaxing the constraint on

the unitary matrix of SVD decomposition, the generalized

SVD lost some disambiguation power on clustering. Lastly,

HeteroClu performs best in all comparisons. It is the only method whose best DB index (around 1.8) falls clearly below 2, indicating that its worst pairs of clusters were reasonably separated.

6. EBSNS INFORMATION FLOW

In this section, we study how information ﬂows over this

unique network structure. A good scenario that can be used

to examine the information ﬂow on EBSNs is the problem

of recommending users to participate in social events only

based on the topological structure of EBSNs. With this

application, we can study how information ﬂows from one

user to the online/oﬄine friends and how the information

ﬂow pathways latently drive the social event participation

process.

Unlike classic movie/book recommendations, event participation recommendation is more challenging due to the short lifetime of social events. An event is non-existent until its creation time $t_c$, and after the start time $t_s$ of an event, participation recommendation becomes meaningless. Due to the very limited history of an event from time $t_c$ to $t_s$, event participation recommendation suffers heavily from the cold-start problem.

We now formally define the event participation problem as follows: given an event $e$, at time $t$ ($t_c < t < t_s$), the task is to predict users who will RSVP “yes” to event $e$ between $t$ and $t_s$. The EBSN built upon the collective data before $t$ will serve as the network structure, and all the users who responded “yes” to $e$ between $t_c$ and $t$ are the positive training examples for the prediction, denoted as set $S$.
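The problem setup can be made concrete with a short sketch that splits the “yes” RSVPs of one event at the prediction time. The data layout (a list of user and timestamp pairs), the function name, and the treatment of a response exactly at the prediction time as already observed are all assumptions for illustration.

```python
def split_rsvps(yes_rsvps, t):
    """Split one event's "yes" RSVPs into the seed set S and the users to predict.

    yes_rsvps: list of (user, timestamp) pairs, all between creation and start.
    t: prediction time; responses at or before t are treated as already observed.
    """
    seeds = {user for user, ts in yes_rsvps if ts <= t}
    targets = {user for user, ts in yes_rsvps if ts > t} - seeds
    return seeds, targets

# Hypothetical event: the creator counts as the first "yes" at creation time 0.
yes_rsvps = [("creator", 0), ("u1", 5), ("u2", 12), ("u3", 30)]
seeds, targets = split_rsvps(yes_rsvps, t=10)
print(sorted(seeds), sorted(targets))
```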

6.1 Event-Centric Diffusion

Not to deviate from our goal of studying the information

ﬂow over the EBSNs’ unique network structures, we only

rely on the topological structure of EBSNs and the already

responded users for event participation prediction.

6.1.1 Basic Event-Centric Diffusion

We design a simple yet efficient event-centric diffusion model for the problem. We define $f_i \geq 0$ as the initial score of node $u_i$, where only users in set $S$ (the set of users who have already RSVPed “yes”) have $f > 0$ and the rest of the users have $f = 0$. For simplicity, we initialize $f = 1/|S|$ for users in $S$. We use the column vector $\mathbf{v}_k = \{v_{k,1}, v_{k,2}, \ldots, v_{k,n}\}$ to represent the probabilities that users have been visited after the $k$-th diffusion step, and $\mathbf{v}_0 = \mathbf{f}$.

The basic event-centric diffusion, named DIF, can be expressed as $\mathbf{v}_{k+1} = D \cdot \mathbf{v}_k$, where $D$ defines the non-symmetric information transition matrix of a network for time $t$. Each element in $D$ is defined as $d_{ij} = w_{ij} / \sum_{l} w_{il}$. If we run the model on the heterogeneous EBSN, we can use the linearly combined adjacency matrix (Eq. 4). $d_{ij}$ is the empirical probability of information flow from user $u_i$ to user $u_j$. Clearly, $d_{ij} \neq d_{ji}$. If $u_i$ has a larger degree than $u_j$, the influence of $u_i$ on $u_j$ is less than that of $u_j$ on $u_i$.

Footnote: For simplicity, the event creator is treated as the first user with RSVP “yes”.

[Figure 8 diagram: (1) single channel, (2) cascaded channels, (3) paralleled channels.]

Figure 8: Typical EBSN information flow patterns.

This basic diffusion model is event-centric because $\mathbf{v}_k$ represents personalized probabilities only corresponding to the current event $e$. A similar diffusion method has also been studied by [17] for link prediction. Because this diffusion process does not converge to the stationary distribution of information flow, a self-loop on every node is necessary; otherwise the information would quickly spread far away from the seed users. The self-loop weight follows the same definitions as Eq. 1 and Eq. 2.
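One diffusion step with self-loops can be sketched as follows. Several details here are assumptions rather than the paper's definitions: each row of the weight matrix is normalized to sum to one, the self-loop weight is a free parameter `self_weight` standing in for the weights given by Eq. 1 and Eq. 2, and the update is applied literally as a matrix-vector product. Under a column-normalized reading, the matrix would be transposed.

```python
def transition_matrix(W, self_weight=1.0):
    """Row-normalized transition matrix with a self-loop added on every node."""
    n = len(W)
    D = []
    for i in range(n):
        row = list(W[i])
        row[i] += self_weight  # self-loop keeps part of the score in place
        total = sum(row)
        D.append([x / total for x in row])
    return D

def initial_scores(n, seeds):
    """f = 1/|S| for users who already RSVPed "yes", 0 for everyone else."""
    return [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]

def diffuse(D, v, steps=1):
    """Apply v <- D . v for the given number of steps."""
    for _ in range(steps):
        v = [sum(D[i][j] * v[j] for j in range(len(v))) for i in range(len(D))]
    return v

# Toy network (made up): a path 0 - 1 - 2 with unit weights; user 0 is the seed.
W = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
D = transition_matrix(W)
v0 = initial_scores(3, seeds={0})
v1 = diffuse(D, v0)
print(v1)
```

After one step the direct neighbour of the seed receives a positive score while the user two hops away still has none, so ranking non-seed users by score surfaces close contacts first.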

6.1.2 Diffusion over EBSNs

An EBSN contains both online and offline social interactions, but the basic diffusion model DIF does not take this heterogeneity into account. Accommodating different forms of social interactions, there exist at least three information flow patterns, as shown in Figure 8. The online and offline social networks $G^{on}$ and $G^{off}$ of an EBSN define two kinds of channels for the flow of information. Figure 8(1) depicts the basic diffusion model DIF over a single channel exclusively, whereas Figure 8(2) defines a cascade model, abbreviated as DIF-cascade, in which information interchangeably flows from one channel to the other. The simplest cascade diffusion model can be defined as $\mathbf{v}_{k+1} = D^{cas} \cdot \mathbf{v}_k$, where $D^{cas}$ is a cascaded transition matrix for time $t$, and $D^{cas} = D^{on} \cdot D^{off}$ or $D^{off} \cdot D^{on}$. Finally, in Figure 8(3), information flows over two channels concurrently. We call this model DIF-parallel. The simplest parallel diffusion model is $\mathbf{v}_{k+1} = D^{par} \cdot \mathbf{v}_k$, where $D^{par}$ defines a linearly combined transition matrix for time $t$, and $D^{par} = \gamma D^{on} + (1 - \gamma) D^{off}$. The parameter $\gamma$ is used to measure the importance of each type of social interaction. It plays the same role as $\gamma$ in Eq. 4. Thus, DIF-parallel is equivalent to DIF on the linearly combined adjacency matrix (Eq. 4). Undoubtedly, there are more complex information diffusion processes (e.g., a mixture of DIF-cascade and DIF-parallel), but we leave them for future work.
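The cascaded and parallel combinations of two transition matrices can be sketched in a few lines. The helper names and the 2-by-2 toy matrices are assumptions for illustration; the matrices stand for already-normalized online and offline transition matrices.

```python
def matmul(A, B):
    """Plain matrix product of two lists of lists."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

def cascade(D_first, D_second):
    """Cascaded channels: the product of the two transition matrices."""
    return matmul(D_first, D_second)

def parallel(D_on, D_off, gamma):
    """Paralleled channels: gamma * D_on + (1 - gamma) * D_off, entry by entry."""
    return [
        [gamma * a + (1 - gamma) * b for a, b in zip(row_on, row_off)]
        for row_on, row_off in zip(D_on, D_off)
    ]

# Toy row-stochastic matrices (made up).
D_on = [[0.5, 0.5], [0.2, 0.8]]
D_off = [[1.0, 0.0], [0.5, 0.5]]

on_then_off = cascade(D_on, D_off)
off_then_on = cascade(D_off, D_on)
mixed = parallel(D_on, D_off, gamma=0.5)
print(on_then_off, off_then_on, mixed)
```

The two cascade orders generally give different matrices, which is why both orders appear as separate variants, whereas the parallel form is symmetric in the two channels up to the choice of $\gamma$.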

6.1.3 Community-Based Diffusion

Information is often circulated more rapidly inside its own

community, especially for those small-scale local communi-

ties. As a result, we design a community-based diﬀusion

model in which information tends to, but is not restricted

to, ﬂow within the scope of its own community.

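Specifically, in this model, $\mathbf{v}_{k+1} = D^{com} \cdot \mathbf{v}_k$, where $D^{com}$ defines the community-based information transition matrix. Each element of $D^{com}$ is defined as

$$d^{com}_{ij} = \frac{1}{N} \begin{cases} (1 - \beta) \, w_{ij} & \text{if } u_j \notin C(u_i) \\ \beta \, w_{ij} & \text{if } u_j \in C(u_i) \end{cases}$$

where $C(u_i)$ is the community of $u_i$, $\beta$ is a parameter used to control the weight of information flows inside its community versus outside, and $N$ is the normalization factor so that $\sum_j d^{com}_{ij} = 1$. We name this model DIF-com.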
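The community-based reweighting can be sketched as follows, assuming each user belongs to exactly one community and that rows are normalized to sum to one. The function name, the `community` label list, and the toy network are hypothetical.

```python
def community_transition(W, community, beta):
    """Reweight edges by community membership, then normalize each row.

    W: weight matrix (list of lists); community[i]: community label of user i;
    beta: share given to edges that stay inside the user's own community.
    """
    n = len(W)
    D = []
    for i in range(n):
        row = [
            (beta if community[j] == community[i] else 1 - beta) * W[i][j]
            for j in range(n)
        ]
        N = sum(row)  # normalization factor for this row
        D.append([x / N for x in row] if N > 0 else row)
    return D

# Toy network (made up): a triangle where users 0 and 1 share a community.
W = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
community = [0, 0, 1]
D_com = community_transition(W, community, beta=0.8)
print(D_com)
```

With $\beta = 0.8$, user 0 passes most of its score to user 1 in the same community, while user 2, who has no community peer, splits its score evenly between the other two.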

Since DIF-com only adjusts the weights of edges on top

of the basic DIF model (can be seen as a combination with

DIF), it can be further combined with other complex diﬀu-

sion models, including DIF-cascade and DIF-parallel. The

names of the two combinations are DIF-com-cascade and

DIF-com-parallel, respectively. Note that DIF-com on G

based on the linearly combined adjacency matrix (Eq. 4) is

equivalent to DIF-com-parallel.

6.2 Information Flow Evaluation

6.2.1 Experimental Settings

As discussed before, event participation recommendation suffers from a typical cold-start problem. When an event is

created, except for the creator, it is unknown to all the other

users. To simplify the problem, we treat the event creator as

the ﬁrst user who responded “yes” to the event. In evalua-

tion, we can start the recommendation process immediately

after the event creation, or wait for a while until there are a

few responded users. We ﬁrst focus on the latter case: given

a testing event, we set the ﬁrst k responded participants as

the seed users, where k is randomly determined. The former

case is a much harder problem and is examined at the end

of the evaluation.

We split the Meetup data into two sequential parts (cut around Mar 2011). The first part of the data (on or before Mar 2011, about 80%) is used for training and the second part (after Mar 2011, about 20%) is used for testing. Given a testing event, we recommend the top 5, 10, 20, 50, 100, 200, 400, and 800 users to it, respectively. We choose to recommend a large number of users because 1) in practice event organizers often broadly advertise their events to the public; and 2) we want to see the long-term trend of such a recommendation system. For the recommended top N users, we compute recall to evaluate the performance. Recall is defined as the percentage of users who would respond “yes” to the testing event that are covered by the top N recommendations. Finally, we average the recall for all testing events under the same top N.
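The recall measure can be written down as a short sketch. It assumes that seed users are excluded from the ranking, that `attendees` holds the non-seed users who later responded “yes”, and that this set is non-empty; the names and scores are invented.

```python
def recall_at_n(scores, seeds, attendees, n):
    """Fraction of later "yes" respondents found among the top-n ranked users."""
    ranked = sorted(
        (user for user in scores if user not in seeds),
        key=lambda user: scores[user],
        reverse=True,
    )
    top = set(ranked[:n])
    return len(top & attendees) / len(attendees)

def mean_recall(per_event_recalls):
    """Average recall over all testing events for one value of n."""
    return sum(per_event_recalls) / len(per_event_recalls)

# Hypothetical diffusion scores for one testing event.
scores = {"a": 0.9, "b": 0.5, "c": 0.4, "d": 0.1}
seeds = {"a"}
attendees = {"b", "d"}

print(recall_at_n(scores, seeds, attendees, n=2))
print(recall_at_n(scores, seeds, attendees, n=3))
```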

6.2.2 Compare Event-Centric Diffusion Models with Classic Baselines

There are two popular baselines found in the prior art

that can be eﬃciently applied to such an event participation

recommendation problem. One is Collaborative Filtering

(CF) [25], and the other is the random walk model [23].

Note that due to the extremely short life time of events, most

supervised recommendation (link prediction) methods suﬀer

from severe sparsity of labeled data. As a result, they do not

apply to the event participation recommendation problem.

For the baseline CF, the users who ever participated in

similar groups or events in the Meetup training data are

recommendation candidates. They are then ranked by their

Jaccard similarities to the responded users. The Jaccard

similarity between two users is simply based on their past

group or event participation count vectors.
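How the Jaccard similarity is computed on count vectors is not detailed here. The sketch below assumes the weighted (min over max) generalization of Jaccard similarity and assumes that a candidate is scored by its average similarity to the responded users; both choices, along with the names and data, are illustrative assumptions.

```python
def weighted_jaccard(a, b):
    """Weighted Jaccard similarity of two sparse count vectors (dicts)."""
    keys = set(a) | set(b)
    overlap = sum(min(a.get(k, 0), b.get(k, 0)) for k in keys)
    union = sum(max(a.get(k, 0), b.get(k, 0)) for k in keys)
    return overlap / union if union else 0.0

def rank_candidates(history, seeds):
    """Rank non-seed users by average similarity to the users who responded."""
    scores = {}
    for user, vec in history.items():
        if user in seeds:
            continue
        sims = [weighted_jaccard(vec, history[s]) for s in seeds]
        scores[user] = sum(sims) / len(sims)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical participation counts per group (group id -> number of events attended).
history = {
    "ann": {"g1": 3, "g2": 1},
    "bob": {"g1": 2, "g2": 1},
    "cat": {"g3": 5},
    "dan": {"g1": 1, "g3": 1},
}
ranking = rank_candidates(history, seeds={"ann"})
print(ranking)
```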

[Figure 9 plots: recall versus top N (5 to 800) for DIF, DIF-com, RWR (0), RWR (0.15), RWR (0.3), and RWR (0.6), on (a) the online EBSN and (b) the offline EBSN.]

Figure 9: Prediction on individual EBSNs.

For the baseline random walk model, we applied the random walk with restart (RWR) model. In the RWR baseline, there is a certain chance (probability β) with which the information will flow back to the starting users at each step of information flow. By setting various β, we have various RWR baselines with names like RWR (0.3). When β = 0, RWR downgrades to the basic random walk model.
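A random walk with restart can be sketched with a simple power iteration. This is a generic formulation under stated assumptions, not necessarily the exact baseline configuration: the walker moves to a neighbour with probability proportional to the edge weight, restarts uniformly over the seed users with probability β, and the iteration runs for a fixed number of steps. Names and data are hypothetical.

```python
def rwr(W, seeds, beta, steps=20):
    """Visit probabilities after a fixed number of random-walk-with-restart steps."""
    n = len(W)
    deg = [sum(row) for row in W]
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(steps):
        nxt = [beta * p0[j] for j in range(n)]  # restart mass goes back to the seeds
        for i in range(n):
            if deg[i] == 0:
                continue  # isolated node: nothing to pass on
            for j in range(n):
                nxt[j] += (1 - beta) * p[i] * W[i][j] / deg[i]
        p = nxt
    return p

# Toy network (made up): a path 0 - 1 - 2 with unit weights; user 0 is the seed.
W = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
one_step = rwr(W, seeds={0}, beta=0.3, steps=1)
long_run = rwr(W, seeds={0}, beta=0.3)
print(one_step, long_run)
```

Setting `beta=0` removes the restart term, which gives the basic random walk.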

As both CF and RWR were initially designed for homogeneous networks, we compared them with the basic event-centric diffusion models on individual $G^{on}$ and $G^{off}$ in Fig. 9. Among all diffusion models on $G^{on}$ in Fig. 9(a) and $G^{off}$ in Fig. 9(b), DIF-com outperforms DIF and CF, and RWR models perform the worst. By soft-restricting information

ﬂow in the same user communities, DIF-com can guarantee

most closely related friends are recommended. The weight-

ing strategies of DIF and CF diﬀer only slightly, thus they

yield similar prediction results. The poor performance of

RWR indicates that identiﬁed network hubs are not rele-

vant to the testing event. By raising return probabilities of

RWR, the prediction performance does not improve much

even with β as high as 0.6. In addition, by comparing

Fig. 9(a) and Fig. 9(b), we ﬁnd the oﬄine EBSN has better

prediction power when N is small but online EBSN gradu-

ally catches up and even surpasses the oﬄine EBSN as N

grows large. This is because oﬄine social interactions are

able to capture closely related friends who are very likely to

participate in the same events, but the recommended users

tend to be regulars to similar events. In comparison, online

social interaction can introduce non-regulars to the events

and increase the coverage of the recommendation.

6.2.3 Compare Various Diffusion Patterns on EBSNs

In the previous section, we showed that DIF-com has the

best recommendation performance for individual online and

oﬄine social networks of an EBSN. As discussed in Sec-

tion 6.1.3, DIF-com actually represents one kind of diﬀusion

pattern on a whole EBSN (equivalent to DIF-com-parallel

based on the linearly combined adjacency matrix (Eq. 4)).

It is thus interesting to further compare various diﬀusion

models we discussed in Section 6.1.2 on the whole EBSNs

(with both online and oﬄine social interactions). All diﬀu-

sion models can be enhanced by communities since DIF-com

has been shown to outperform the rest of the methods in the

previous section. For a fair comparison, we use communi-

ties detected by Algorithm 1 for all methods. The detailed

comparisons are given by Fig. 10. Fig. 10(a) compares three

diﬀusion models over the heterogeneous EBSNs against indi-

vidual online/oﬄine networks. Only the paralleled diﬀusion

model outperforms the online or oﬄine only model. This

means that the joint presence of online and oﬄine social

interactions can improve the prediction performance. The

reason that cascade diﬀusions are worse is because values

are diffused twice to those far away users. Similarly, in Fig. 10(b), we see that repeating the parallel diffusion model also deteriorates the performance.

[Figure 10 plots: recall versus top N (5 to 800). Panel (a) EBSN diffusion patterns: DIF-com Online, DIF-com Offline, DIF-com-parallel, DIF-com-cascade (On->Off), DIF-com-cascade (Off->On). Panel (b) EBSN recursive diffusion: DIF-com-parallel, DIF-com-parallel Twice, DIF-com-parallel 3 Times.]

Figure 10: Prediction on the heterogeneous EBSNs.

[Figure 11 plot: recall versus top N (5 to 800) for DIF-com Online, DIF-com Offline, and DIF-com-parallel, each with and without cold start.]

Figure 11: Comparison to cold-start scenarios.

6.2.4 Examine the Effect of Cold-Start

In this section, we examine how the cold-start phenomenon hurts the recommendation performance. It is well accepted that as the number of responded users decreases, the recommendation performance gets worse. We verify this well-known conjecture using Fig. 11. In Fig. 11, the prediction performances for the cold-start cases (the event creator is the only seed for an event) are slightly worse than the random-start cases. However, the recalls achieved by

diﬀusion from a single user are still fairly good, indicating

that using diﬀusion to predict event participation on EBSNs

is satisfactory even on the extreme cold start cases.

7. CONCLUSION

In this paper, we have identiﬁed and formally deﬁned a

new type of social network, EBSN. By using the Meetup

dataset, we studied the unique features of EBSNs includ-

ing basic network properties, community structures and in-

formation ﬂow over EBSNs. Our research revealed many

aspects of EBSNs that are significantly different from conventional social networks and LBSNs. We hope this paper paves the way for future studies on this interesting type of social network.

Acknowledgements

We would like to thank Jon Kleinberg for helping us nail

down the background of the problem, Bin Gao for his ex-

planation on the related work [13] and Jiang Bian and Mao

Ye for their valuable discussions.

8. REFERENCES

[1] Y. Ahn, S. Han, H. Kwak, S. Moon, and H. Jeong. Analysis

of topological characteristics of huge online social

networking services. In WWW, 2007.

[2] P. S. Bearman, J. Moody, and K. Stovel. Chains of

Aﬀection: The Structure of Adolescent Romantic and

Sexual Networks. American Journal of Sociology, 2004.

[3] C. Borgs, J. Chayes, J. Ding, and B. Lucier. The

hitchhiker’s guide to aﬃliation networks: A game-theoretic

approach. arXiv:1008.1516v1, 2010.

[4] Z. Cheng, J. Caverlee, K. Lee, and D. Sui. Exploring

millions of footprints in location sharing services. In

ICWSM, 2011.

[5] E. Cho, S. A. Myers, and J. Leskovec. Friendship and

mobility: user movement in location-based social networks.

In KDD, 2011.

[6] A. Clauset, C. Shalizi, and M. Newman. Power-law distributions in empirical data. arXiv preprint arXiv:0706.1062, 2007.

[7] D. Crandall, L. Backstrom, D. Cosley, S. Suri,

D. Huttenlocher, and J. Kleinberg. Inferring social ties

from geographic coincidences. PNAS, 2010.

[8] D. Davies and D. Bouldin. A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1979.

[9] I. de Sola Pool and M. Kochen. Contacts and influence. Social Networks, 1979.

[10] I. Dhillon, Y. Guan, and B. Kulis. Kernel k-means: spectral

clustering and normalized cuts. In KDD, 2004.

[11] D. Easley and J. Kleinberg. Networks, Crowds, and

Markets: Reasoning About a Highly Connected World.

Cambridge University Press, 2010.

[12] S. L. Feld. The focused organization of social ties.

American Journal of Sociology, 1981.

[13] B. Gao, T.-Y. Liu, X. Zheng, Q.-S. Cheng, and W.-Y. Ma. Consistent bipartite graph co-partitioning for star-structured high-order heterogeneous data co-clustering. In KDD, 2005.

[14] G. Golub and C. Van Loan. Matrix Computations. Johns Hopkins Univ. Press, 1996.

[15] A. K. Jain and R. C. Dubes. Algorithms for Clustering Data. Prentice-Hall advanced reference series, 1988.

[16] S. Lattanzi and D. Sivakumar. Aﬃliation networks. In

STOC, 2009.

[17] R. N. Lichtenwalter, J. T. Lussier, and N. V. Chawla. New

perspectives and methods in link prediction. In KDD, 2010.

[18] A. Mislove, M. Marcon, K. Gummadi, P. Druschel, and

B. Bhattacharjee. Measurement and analysis of online

social networks. In SIGCOMM, 2007.

[19] M. Newman. Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Physical Review E, 2001.

[20] M. E. J. Newman, D. J. Watts, and S. H. Strogatz. Random graph models of social networks. Proceedings of the National Academy of Sciences, 2002.

[21] A. Noulas, S. Scellato, C. Mascolo, and M. Pontil. An

empirical study of geographic user activity patterns in

foursquare. In ICWSM, 2011.

[22] J. F. Padgett and C. K. Ansell. Robust Action and the Rise

of the Medici, 1400-1434. The American Journal of

Sociology, 1993.

[23] L. Page, S. Brin, R. Motwani, and T. Winograd. The

pagerank citation ranking: Bringing order to the web. 1999.

[24] T. Sander and S. Seminar. E-associations? using

technology to connect citizens: The case of meetup.com. In

Annual Meeting of the American Political Science

Association, 2005.

[25] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Item-based collaborative filtering recommendation algorithms. In WWW, 2001.

[26] S. Scellato, A. Noulas, and C. Mascolo. Exploiting place

features in link prediction on location-based social

networks. In KDD, 2011.

[27] J. Shi and J. Malik. Normalized cuts and image

segmentation. TPAMI, 2000.

MFM: A Multiple-Features Model for Leisure Event Recommendation in Geotagged Social Networks

Article

Full-text available

Dec 2023

Event-based social networks (EBSNs) are rich in information about users and leisure events. The willingness of users to participate in leisure events is influenced by many factors such as event time, location, content, organizer, and social relationship factors of users. Event recommendation systems in EBSNs can help leisure event organizers to accurately find users who want to participate in events. However, to address the existing cold-start problems and improve the accuracy of event recommendations, we propose a multiple-feature-based leisure event recommendation model (MFM). We introduce the user’s social contacts into the user preference features and construct a user feature space by integrating the features of the user preferences for events and organizers and preferences of the user’s closest friends. Moreover, considering the behavioral differences between active and inactive users, we extracted the respective features and trained the feature weight models. Finally, the experimental results showed that in comparison with the baseline models, the precision of the MFM is higher by at least 7.9%.

DFGR: Diversity and Fairness Awareness of Group Recommendation in an Event-based Social Network

Article

Full-text available

Aug 2023
NEURAL PROCESS LETT

Yuan Liang

An event-based social network is a new type of social network that combines online and offline networks, and one of its important problems is recommending suitable activities to users. However, the current research seldom considers balancing the accuracy, diversity and fairness of group activity recommendations. To solve this problem, we propose a group activity recommendation approach that considers fairness and diversity perception. Firstly, we calculate activity similarity based on the context and construct an activity similarity graph. We define the weighted coverage on the similarity graph as a submodular function and transform the problem of fair and diverse group activity recommendation into maximizing the weighted coverage on the similarity graph while considering accuracy, fairness, and diversity. Secondly, we employ a greedy algorithm to find an approximate solution that maximizes the weighted coverage with an approximation ratio. Finally, we conducted experiments on two real datasets and demonstrate the superiority of our method compared to existing approaches. Specifically, in the domain of diversity-based recommendation algorithms, our method achieves a remarkable 0.02% increase in recall rate. Furthermore, in the domain of fairness-based recommendation algorithms, our proposed method outperforms the latest approach by 0.05% in terms of overall metrics. These results highlight the effectiveness of our method in achieving a better balance among accuracy, fairness, and diversity.

Joint knowledge graph approach for event participant prediction with social media retweeting

Article

Full-text available

Nov 2023
KNOWL INF SYST

Organized event is an important form of human activity. Nowadays, many digital platforms offer organized events on the Internet, allowing users to be organizers or participants. For such platforms, it is beneficial to predict potential event participants. Existing work on this problem tends to borrow recommendation techniques. However, compared to e-commerce items and purchases, events and participation are usually of a much smaller frequency, and the data may be insufficient to learn an accurate prediction model. In this paper, we propose to utilize social media retweeting activity to enhance the learning of event participant prediction models. We create a joint knowledge graph to bridge the social media and the target domain, assuming that event descriptions and tweets are written in the same language. Furthermore, we propose a learning model that utilize retweeting information for the target domain prediction more effectively. We conduct comprehensive experiments in two scenarios with real-world data. In each scenario, we set up training data of different sizes, as well as warm and cold test cases. The evaluation results show that our approach consistently outperforms several baseline models in both warm and cold tests.

Exploring on Role of Location in Intelligent News Recommendation From Data Analysis Perspective

Article

Full-text available

Jan 2024
INFORM SCIENCES

Location factor of recommender systems has been extensively studied in the past decade. However, there is no research thoroughly analyzing location’s role in news recommendation. In this paper, a comprehensive exploration on role of location in news recommendation is presented. First of all, based on analysis of real news datasets, we find that news recommendation differs from spatial item recommendation. Location affects news consumption behaviors of users with two-fold aspects including geographic feature and semantic feature. Regarding geographic feature, location influences news recommendation according to region rather than latitude-longitude level. Furthermore, interesting news topics are also impacted by semantic feature of location. Semantic feature may play a more positive role than geographic feature. The novel findings consistently manifest that, as non-spatial items, news differ from spatial items in that location influences users' selection in terms of different pattern and degree. In summary, geographic and semantic features influence reading preference through mapping locations into special topics. Changing of location topics leads to varying of reading preference. The news datasets in this paper belong to check in data. NewsREEL dataset is from a company, and it is provided by German researcher. The location data in Twitter dataset is also check in data. NetEase news dataset are collected from NetEase news websites, and the type of location data is city or region.

A self-attention model with contrastive learning for online group recommendation in event-based social networks

Article

Full-text available

Dec 2023
J SUPERCOMPUT

Recently, there has been a surge in the popularity of online groups on event-based social networks (EBSNs) like Meetup and Douban Event. These groups cater to individuals who share common interests, provide comments, and engage in various activities. Our research focuses on online group recommendations, based on which users can conveniently join groups and participate in offline events organized by the groups. Traditional group recommendation methods do not work well in addressing this problem because they lack the ability to deal with the challenges posed by dynamic user interests, sparse supervision signals, and heterogeneous networks simultaneously. The self-attention model with contrastive learning for online group recommendation (SCL4GR) presented in this study exploits user-group sequential data, online and offline networks in a unified framework to predict user preferences for groups. First, a graph encoder is used to capture the high-order social interaction between users. Then, the pattern of dynamic interests is captured by sequence model Transformer. Furthermore, the contrastive learning is employed to derive self-supervision signals from both online and offline networks. We conduct experiments on three real-world datasets. Experimental results show that our SCL4GR consistently outperforms state-of-the-art methods for online group recommendation in EBSNs.

Attentive Implicit Relation Embedding for Event Recommendation in Event-Based Social Network

Article

Feb 2024

Yuan Liang

Organized Event Participant Prediction Enhanced by Social Media Retweeting Data

Conference Paper

Oct 2023

Opinion Diffusion and Analysis on Social Networks

Chapter

Jun 2018

Opinion Diffusion and Analysis on Social Networks

Chapter

Oct 2017

Discovery of User Groups Densely Connecting Virtual and Physical Worlds in Event-Based Social Networks

Article

Full-text available

Jan 2023

An essential task of the event-based social network (EBSN) platform is to recommend events to user groups. Usually, users are more willing to participate in events and interest groups with their friends, forming a particularly closely connected user group. However, such groups do not explicitly exist in EBSN. Therefore, studying how to discover groups composed of users who frequently participate in events and interest groups in EBSN has essential theoretical and practical significance. This article proposes the problem of discovering maximum k fully connected user groups. To address this issue, this article designs and implements three algorithms: a search algorithm based on Max-miner (MMBS), a search algorithm based on two vectors (TVBS) and enumeration tree, and a divide-and-conquer parallel search algorithm (DCPS). The authors conducted experiments on real datasets. The comparison of experimental results of these three algorithms on datasets from different cities shows that the DCPS algorithm and TVBS algorithm significantly accelerate their computational time when the minimum support rate is low. The time consumption of DCPS algorithm can reach one tenth or even lower than that of MMBS algorithm.

Chains of Affection: The Structure of Adolescent Romantic and Sexual Networks1

Article

Full-text available

Jul 2004

This article describes the structure of the adolescent romantic and sexual network in a population of over 800 adolescents residing in a midsized town in the midwestern United States. Precise images and measures of network structure are derived from reports of re- lationships that occurred over a period of 18 months between 1993 and 1995. The study offers a comparison of the structural charac- teristics of the observed network to simulated networks conditioned on the distribution of ties; the observed structure reveals networks characterized by longer contact chains and fewer cycles than ex- pected. This article identifies the micromechanisms that generate networks with structural features similar to the observed network. Implications for disease transmission dynamics and social policy are explored.

A Cluster Separation Measure

Article

May 1979

A measure is presented which indicates the similarity of clusters which are assumed to have a data density which is a decreasing function of distance from a vector characteristic of the cluster. The measure can be used to infer the appropriateness of data partitions and can therefore be used to compare relative appropriateness of various divisions of the data. The measure does not depend on either the number of clusters analyzed nor the method of partitioning of the data and can be used to guide a cluster seeking algorithm.

An empirical study of geographic user activity patterns in foursquare

Article

Jan 2011

The pagerank citation ranking: Bringing order to the web

Article

Jan 1999

L. Page

Exploring millions of footprints in location sharing services

Article

Jan 2011

Z. Cheng

Algorithms for Clustering Data: Prentice Hall

Article

Jan 1988

The Focused Organization of Social Ties

Article

Mar 1981

Scott L. Feld

Sociologists since Simmel have been interested in social circles as essential features of friendship networks. Although network analysis has been increasingly used to uncover patterns among social relationships, theoretical explanations of these patterns have been inadequate. This paper presents a theory of the social organization of friendship ties. The approach is based upon Homans's concepts of activities, interactions, and sentiments and upon the concept of extra-network foci organizing social activities and interaction. The theory is contrasted with Heider's balance theory. Implications for transitivity, network bridges, and density of personal networks are discussed and presented as propositions. The focus theory is shown to help explain patterns of friendships in the 1965-66 Detroit Area Study. This paper is intended as a step toward the development of integrated theory to explain interrelationships between networks and other aspects of social structure. Implications for data analysis are discussed. Sociologists have long recognized the importance of patterns in networks of relations that connect individuals with each other. Simmel (1955) described modern society as consisting of loosely connected social circles of relationships. Granovetter (1973) has indicated the general significance of these social circles for communication, community organization, and social conflict. Various studies have supported this picture of the essential patterns in social networks, including Moreno's sociometry (1953), Milgram's "small world" experiments (1967), and Kadushin's observations (1966). Unfortunately, the study of social networks has often been carried out without concern for the origins in the larger social context. Most network analysis ends with description and labeling of patterns; and when explanations of patterns are offered, they frequently rely upon inherent tendencies within networks to become consistent, balanced, or transitive. 
As a consequence of such atheoretical and/or self-contained network theoretical approaches, data are collected and data analysis techniques are devised for

Robust Action and the Rise of the Medici

Article

Nov 1992

Van Loan: Matrix Computations

Article

G. H. Golub

Robust Action and the Rise of the Medici, 1400-1434

Article

May 1993

We analyze the centralization of political parties and elite networks that underlay the birth of the Renaissance state in Florence. Class revolt and fiscal crisis were the ultimate causes of elite consolidation, but Medicean political control was produced by means of network disjunctures within the elite, which the Medici alone spanned. Cosimo de' Medici's multivocal identity as sphinx harnessed the power available in these network holes and resolved the contradiction between judge and boss inherent in all organizations. Methodologically, we argue that to understand state formation one must penetrate beneath the veneer of formal institutions, groups, and goals down to the relational substrata of peoples' actual lives. Ambiguity and heterogeneity, not planning and self-interest, are the raw materials of which powerful states and persons are constructed.
