
Complex Network Analysis of a Tourism Content Sharing Network

September 2015


DOI:10.1109/SYNASC.2015.67

Conference: 2015 17th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)

Authors:

Alex Becheru

University of Craiova

Costin Badica

University of Craiova


Complex Network Analysis of a Tourism Content Sharing Network

Alex Becheru
University of Craiova
Email: becheru@gmail.com

Costin Bădică
University of Craiova
Email: cbadica@software.ucv.ro

Mihăiță Antonie
University of Craiova
Email: mihai.antonie@gmail.com

Abstract—This paper presents the results of analysing a tourism information web-site (AmFostAcolo.ro) using Complex Network (CN) methods. The work complements a previous paper, in which we discussed data extraction and its modelling as a complex network. Properties of the resulting network, its communities and its vertices are examined in order to extract useful information and detect social phenomena. Temporal analysis methods are employed to examine the evolution of the web-site. The results show the natural development of the web-site and the usefulness of CN analysis methods in this scenario.

I. INTRODUCTION

Over the last decade, great interest has been shown in applying information and communication technology (ICT) to the development of Smart Tourism business [1], here understood as “the application of information and communication technologies to the tourism sector”¹. Tourists want advanced ICT services for knowledge and information management that help them take informed decisions matching their preferences, with less effort and in shorter time. Tourism companies seek to improve the quality of their services, as well as their image, based on the feedback gathered from their customers. Many sources of tourist information can be found, e.g. web-sites and social media [2], [3]. These sources can be defined as content sharing networks that allow users to share and make use of the information provided. In this paper we consider as content the reviews and/or comments expressed in natural language, stating users' opinions and experiences regarding tourism entities.

As data source we chose one of the most popular tourism information web-sites in Romania, AmFostAcolo². Registered users are able to interact with each other through: (i) echoes, as well as answers to echoes, posted in relation to certain reviews or comments, and (ii) asking questions and giving answers to questions about a certain tourism entity.

The focus of this paper is to analyse the content sharing network extracted from the above mentioned web-site. The extraction and processing workflow has been discussed and defined in a previous paper [4]. We thus hope to obtain valuable information and to assess the usefulness of Complex Network analysis methods in this context.

¹ http://www.smarttourism.org/

² http://www.amfostacolo.ro

Fig. 1. Exemplification of the users that share information regarding a particular tourism entity.

We explicitly use the term user of the web-site and not customer of a tourism entity, as a user sharing information is not necessarily a customer. Let us imagine a scenario in which user A was a customer of hotel Ah, located near hotel Bh. User A can share a valuable piece of information about Bh without being its customer: Bh has a better view of the sea than hotel Ah. For a better understanding of the differences between a web-site user and a customer of a tourism entity, see Figure 1.

Throughout this paper we refer to tourism entities as areas of tourism interest. There is no limitation on the size of the area: it can be a country or a specific hotel. There is also no limitation on the type of the tourism entity: it can be a hotel, an aqua park, a forest, etc.

This paper is structured as follows. The next section reviews background information about complex networks and mentions related work. The third section provides an insight into the data extraction and processing workflow. The fourth section details the experiments performed and the results obtained. Conclusions and future work are presented in the last section.

II. BACKGROUND AND RELATED WORK

In order to understand complex interconnected systems, a new field of research emerged: Network Science (NS) or Complex Networks Analysis (CNA). This new research field builds on Graph Theory and Computer Science. NS investigates non-trivial features of graph problems that usually are not addressed by lattice theory or random graphs. Understanding such non-trivial features is of high interest, as they frequently occur in real world problems. The complexity of real world networks comes from the modelling and evaluation of overlapping and interdependent phenomena that are neither purely regular nor purely random. Complexity may also come from the sheer size of the network itself.

Two important papers stand as the building blocks of Network Science. Paul Erdős and Alfréd Rényi wrote about random graphs in 1959 [5]. In 1973, Mark Granovetter discovered the strength of weak ties [6]. A graph usually consists of a number of subgraphs; nodes inside these subgraphs are tightly connected among themselves and loosely connected (weak ties) with other subgraphs. One may think that those weak ties are not relevant, but without them the graph of subgraphs would not exist. CNA emerged at the beginning of the 1990s as a result of the progress in applied computational sciences. The most important factor, however, was the access to data describing real world networks, made possible by the emergence of the World Wide Web and by the explosion of interest in detailed mapping across many sciences, especially biology and economics.

NS can be used in many application domains. For example, internet companies like Google and Facebook are practically built on complex networks. In medicine, the spread of diseases is now studied with the help of CNA [7]. Security forces map the networks of acquaintances of wanted individuals, maps which could lead to alternative ways to reach them. Saddam Hussein was captured using methods from NS [8]. Large oil companies use a branch of CNA known as Organisational Network Analysis to enhance the flow of information exchange within the companies [9].

Our literature research on Smart Tourism found a generous number of papers on this subject. In particular, much work is aimed at developing recommender systems for tourism that exploit recent advances in ICT, including mobile computing [10], [11], [12], semantic technologies [13], [14] and geo-tagged information [15]. A recent survey of tourism recommender systems is presented in [16].

Two other papers focus on the analysis of data extracted from AmFostAcolo [17], [18]. Paper [17] presents results of sentiment analysis relating the tourist opinion holder to the review content. Paper [18] explores complex network representations of tourist reviews for extracting lexical and quantitative features of the review text. Other works employing network analysis computational methods in the tourism domain are [19] and [20].

III. DATA EXTRACTION, PROCESSING WORKFLOW AND NETWORK CREATION

This section should also be viewed as a reiteration of previous work, as the work presented here was described in a previous paper [4].

Web-site users are able to interact on AmFostAcolo in at least two ways. Firstly, users can ask questions and give answers related to a particular tourism entity. This facility is useful whenever users would like to ask something not necessarily related to a specific impression. For example, a user can ask a question related to the tourism entity of “Cheile Sohodolului”. Secondly, users can post tourist impressions related to a certain place, so each place becomes a container of impressions or reviews written by the users registered at AmFostAcolo who visited the place and shared their impressions on the web-site. Each post can trigger discussions about the place in question. Other users can post echoes to this post. Most often these echoes are questions requiring clarifications or more details about the post. Questions in turn can trigger answers and so on.

Fig. 2. Data extraction and preprocessing workflow.

The task of data extraction was achieved using the data preprocessing workflow described in Figure 2. This task consists of the following activities:

• Data extraction from AmFostAcolo and its conversion to XML. This was achieved by developing a customized tool based on the jsoup library for HTML processing³.

• Data storage into a relational database developed using MySQL⁴. This was achieved by developing a customized tool based on the JAXB⁵ library for binding XML data with their Java implementation.

• Data export to CSV format, to facilitate further processing using other tools.

Our main interest in AmFostAcolo data relates to users, their questions related to a tourism entity, as well as the answers to these questions. For this purpose we defined an XML schema for representing this data. Data was extracted from AmFostAcolo using Web scraping and then saved into a set of target XML files.

The XML schema of the extracted data is presented in Figure 3. The root node represents a user, so there is one such file for each user registered with AmFostAcolo. For each user we record the set of questions posted by this user, as well as their answers, using qapost elements, each containing one question as well as one answer element. For each answer we must also record the user that posted it, captured using the qauthor element. Using this simple representation we are able to capture the interactions established between the site users.

In our network representation users are vertices, and links represent interactions between users. To be more explicit, let us imagine that user A posts a question on the web-site and user B answers that question. This interaction is represented as a directed link having B as source vertex and A as target vertex. In order to understand the evolution of the interaction we also captured the time of the answer, so that we can further conduct temporal analysis.

³ http://jsoup.org/

⁴ https://www.mysql.com/

⁵ https://jaxb.java.net/

Fig. 3. XML schema of extracted data.

Fig. 4. AmFostAcolo question answer interaction network. The vertices' diameters are proportional to their PageRank coefficient. The colour of a vertex depicts its community, as determined with the modularity algorithm [21]. Links are coloured according to the colour of their source vertex.

In constructing the network we took into consideration all the web-site's users and their respective interactions. By applying the above mentioned method for network creation we obtained a directed network with 8017 vertices and 25666 links. See Figure 4 for a visual representation of the network.
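The link construction rule described above can be illustrated with a minimal sketch in plain Python. The record layout (asker, answerer, answer date), the sample values and the helper name build_network are assumptions made for illustration; the actual columns of the exported CSV are not given here. Taking the link weight to be the number of answers is also an assumption.

```python
from collections import defaultdict

# Hypothetical answer records: (asker, answerer, ISO date of the answer).
# The real CSV layout is not specified in the paper; this is illustrative only.
answers = [
    ("A", "B", "2012-05-03"),
    ("A", "B", "2013-07-19"),
    ("A", "C", "2013-08-01"),
    ("C", "B", "2014-02-11"),
]


def build_network(records):
    """Return {(source, target): [answer dates]} with links answerer -> asker."""
    links = defaultdict(list)
    for asker, answerer, answered_on in records:
        # The link points from the user who answers to the user who asked.
        links[(answerer, asker)].append(answered_on)
    return dict(links)


links = build_network(answers)
vertices = {user for link in links for user in link}
# Assumed weighting: one unit of weight per answer on the same link.
weights = {link: len(dates) for link, dates in links.items()}

print(len(vertices), "vertices,", len(links), "links")
print(weights[("B", "A")])
```

Keeping the answer dates on each link is what later allows the network to be cut into yearly snapshots.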

IV. EXPERIMENTS AND RESULTS

The experiments presented in this section were conducted using the following tools. Gephi is an interactive visualisation and exploration platform for all kinds of networks and complex systems, dynamic and hierarchical graphs [22]. NetworkX is a Python package for exploration and analysis of networks and network algorithms [23].

A. Questions to be answered

The purpose of this paper is to analyse the extracted tourism information sharing network. We hope to better understand how information is exchanged in such an environment. A key interest is to observe the development process of such a network. We can separate our experiments into 2 categories:

• Experiments that analyse topological aspects of the network. These analyses are conducted in depth using 3 levels of granularity: entire network, community and vertex. The presence of diverse sociological phenomena is tested.

• Experiments that regard the evolution of the network through time. Methods mentioned in the previous category are used, but considering the temporal scale.

The following questions should be answered in the subsequent pages:

1) Can we categorise the network as a complex network type? How does this influence the web-site's community of users?

2) Are any sociological phenomena present? What is their influence on the content sharing network?

3) Is the community of users stable and resilient to changes?

4) How did the network form over time?

5) Is the community expanding or contracting?

6) Is there any evidence that a review regarding a tourism entity, either positive or negative, may have significant influence on the community?

7) Can we determine trends of user interest, seasonal or country wise?

B. Network topological aspects

We start by calculating basic metrics for the entire network [24], see Table I. On average a user responds to questions asked by 3 other users, with an average of almost 6 answers (see rows 1 and 2 of the table). The longest shortest path between two vertices (the diameter) is 16, a small value compared to the size of the network (25666 links). This, together with the small average path length, 5.042, is an important sign that the small world phenomenon [25] may be present in our network. The presence of such a phenomenon may be due to the presence of hubs, vertices with a large number of connections that interconnect various parts of a network. In order to detect the presence of hubs we computed the degree distribution of the vertices. As can be seen in Figure 5, the distribution is scale-free, a type widely encountered in the natural world [26]. We found the same distribution when we took into consideration the PageRank [27] coefficient of each vertex. These findings support the presence of hubs, i.e. a few nodes with large degree.
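As a quick check of the metrics discussed above, the following sketch computes the average degree from the vertex and link counts and derives the diameter and average path length of a small toy graph by breadth-first search. The function names and the toy graph are illustrative assumptions; the sketch does not reproduce the Gephi computation, and it measures path lengths only over ordered pairs that are actually reachable.

```python
from collections import deque


def average_degree(num_vertices, num_links):
    # In a directed graph each link adds one out-degree and one in-degree,
    # so the mean out-degree (and mean in-degree) is links / vertices.
    return num_links / num_vertices


def shortest_path_stats(adjacency):
    """Return (diameter, average path length) over reachable ordered pairs."""
    longest, total, pairs = 0, 0, 0
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            node = queue.popleft()
            for neighbour in adjacency.get(node, ()):
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        for target, d in dist.items():
            if target != source:
                longest = max(longest, d)
                total += d
                pairs += 1
    return longest, (total / pairs if pairs else 0.0)


# 25666 links over 8017 vertices gives the Avg. Degree reported in Table I.
print(round(average_degree(8017, 25666), 3))

# Toy directed chain a -> b -> c -> d (illustrative only).
toy = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
print(shortest_path_stats(toy))
```

The reported average degree of 3.201 is consistent with dividing the number of links by the number of vertices.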

Another way to test the presence of the small world phenomenon is to conduct a diffusion experiment; we would expect the diffusion to spread to a significant part of the network. For the diffusion experiment we chose as start vertex a node in the periphery of the graph with out-degree 8 (double the average). The diffusion is set to lose 70% of its strength at each step. Only the vertices up to neighbour of neighbour can further forward the diffusion; the rest can only receive. As can be seen in Figure 6, the diffusion spread to a significant part of the network, although the start vertex is positioned in the periphery of the graph. Given all of the above, we can now state that the small world phenomenon is present in our network. In our context this means that information travels fast and with ease in our tourism information sharing network. Thus a review, either positive or negative, could get noticed by a significant part of the network.
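One possible reading of the diffusion rules is sketched below; it is not the procedure actually run for Figure 6. The assumptions are that the signal travels along out-links, that each vertex keeps the first strength it receives, and that "up to neighbour of neighbour" means vertices at most two hops from the start may forward. The function name diffuse and its parameters are hypothetical.

```python
def diffuse(adjacency, start, loss=0.7, forward_hops=2):
    """Spread a signal from `start`; return {vertex: strength received}."""
    strength = {start: 1.0}
    frontier = [start]
    hop = 0
    # Vertices at distance 0..forward_hops may forward; later ones only receive.
    while frontier and hop <= forward_hops:
        next_frontier = []
        for node in frontier:
            for neighbour in adjacency.get(node, ()):
                if neighbour not in strength:
                    # Each step loses `loss` of the sender's strength.
                    strength[neighbour] = strength[node] * (1.0 - loss)
                    next_frontier.append(neighbour)
        frontier = next_frontier
        hop += 1
    return strength


# Toy directed chain (illustrative only): a -> b -> c -> d -> e
chain = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"], "e": []}
reached = diffuse(chain, "a")
print(sorted(reached))
```

Under these assumptions the signal can reach vertices at most three hops from the start, so a wide spread from a peripheral start vertex requires short paths into the well connected part of the network.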

The presence of hubs may indicate the presence of another social phenomenon, preferential attachment [28]: a set of processes in which some quantity, typically some form of wealth or credit, is distributed among a number of individuals or objects according to how much they already have, so that those who are already wealthy receive more than those who are not. In our context this would mean that a post from a very active user (one who responded to many questions in the past) weighs more (as in being trusted) than an answer from a typical user.

We continue our analysis by narrowing the granularity from the entire network to communities. For community detection we use the modularity algorithm [21]. The value of the modularity lies in the range [−1/2, 1). It is positive if the number of edges within groups exceeds the number expected on the basis of chance. The value of the modularity coefficient for our network is 0.48, thus we can argue that social communities are present.
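To make the modularity coefficient concrete, the sketch below evaluates it for a given partition of a small undirected, unweighted toy graph, using the standard form Q = Σ_c [L_c/m − (d_c/2m)²], where L_c is the number of edges inside community c, d_c the sum of degrees in c and m the total number of edges. It only scores a partition; it does not implement the community detection algorithm of [21], and the directed, weighted network analysed here would need the corresponding generalisation. The function name and the toy graph are illustrative assumptions.

```python
def modularity(edges, community_of):
    """Modularity Q of an undirected, unweighted graph for a given partition."""
    m = len(edges)
    internal = {}    # edges with both ends inside community c
    degree_sum = {}  # sum of vertex degrees in community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy graph: two triangles joined by a single edge (illustrative only).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
partition = {1: "x", 2: "x", 3: "x", 4: "y", 5: "y", 6: "y"}
q = modularity(edges, partition)
print(round(q, 3))
```

For this toy graph Q equals 5/14, a positive value because each triangle holds more internal edges than would be expected by chance.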

The number of communities detected is 146, but this value is skewed by many small scattered communities (not connected with the giant component of the network). By keeping only the communities that form the giant component we obtained 25 large communities, see Figure 7. The average inter-community degree is 47.2 (the maximum is 48), which shows that the communities are very well interconnected. For a better understanding of the inter-community connectivity see the histogram of the degree distribution in Figure 8.

From a visual point of view our network appears to be of core-periphery type [29], as there is a central conglomerate of nodes which appears to be very well connected while the outer parts of the network are more scattered. To verify this we had to determine and analyse the core. The core should be more densely connected than the rest of the network and, to some extent, more compact. To extract the core we eliminated from the network the nodes that have a degree less than 30 (about 10 times the average). Thus we obtained a core network having 300 vertices (3.74% of the total) and 2080 links (12% of the total). The results of the metrics analysis can be seen in Table II.
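The degree-based core extraction can be sketched as a single filtering pass. The assumptions here are that "degree" means in-degree plus out-degree and that the filter is applied once rather than repeated until stable (as a k-core decomposition would do); the function name extract_core and the toy links are hypothetical.

```python
def extract_core(links, min_degree):
    """Keep vertices whose total degree (in + out) is at least `min_degree`."""
    degree = {}
    for source, target in links:
        degree[source] = degree.get(source, 0) + 1
        degree[target] = degree.get(target, 0) + 1
    core_vertices = {v for v, d in degree.items() if d >= min_degree}
    # Only links with both endpoints in the core survive.
    core_links = [
        (s, t) for s, t in links if s in core_vertices and t in core_vertices
    ]
    return core_vertices, core_links


# Toy directed links (illustrative only); the threshold of 3 stands in for 30.
toy_links = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "a"), ("b", "c"), ("d", "a")]
core_vertices, core_links = extract_core(toy_links, min_degree=3)
print(sorted(core_vertices), len(core_links))
```

In the toy example vertex d has a single link and is dropped together with that link, leaving a small, densely connected core.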

As we expected, the core is far more connected than the rest of the network. The average degree is more than 3 times larger, while the average weighted degree is almost 5 times larger. Even the diameter is reduced to almost half the previous value. See Table I for the values of the entire network metrics (the previous values). The modularity coefficient has a lower value, as it is harder to determine communities in such a well connected environment. This is consistent with the smaller number of communities. The average clustering coefficient is more than 4 times larger, while the average path length has shrunk considerably. All these results support the existence of a core-periphery structure in our network.
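The comparison between the core and the entire network can be verified directly from the values in Table I and Table II. The short sketch below only divides the reported numbers; the dictionary keys are illustrative names.

```python
# Values copied from Table I (entire network) and Table II (core).
whole = {
    "avg_degree": 3.201,
    "avg_weighted_degree": 5.881,
    "diameter": 16,
    "avg_clustering": 0.015,
    "avg_path_length": 5.042,
}
core = {
    "avg_degree": 10.267,
    "avg_weighted_degree": 28.477,
    "diameter": 9,
    "avg_clustering": 0.068,
    "avg_path_length": 3.262,
}

# Ratio of each core metric to the corresponding whole-network metric.
ratios = {name: core[name] / whole[name] for name in whole}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

The ratios come out at roughly 3.2 for the average degree, 4.8 for the average weighted degree, 4.5 for the average clustering coefficient, 0.56 for the diameter and 0.65 for the average path length.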

The core-periphery type of complex network is very resilient to node removals, especially from the core. This means that our user community is stable, and only a catastrophic event would lead to a separation of the giant component. The core acts as a diffusion enhancer: once information gets there it is distributed fast to all parts of the network. To become a member of the core a user has to be very active, so in a sense the community is meritocratic.

At the granularity level of vertices we were most interested in analysing geographical points of interest. The majority of users have stated their location, usually cities or counties. The large majority of questions on the web-site refer to a tourism entity, and each tourism entity can be pinpointed to a location. A location may be a country, a city or a specific address. Thus we were able to construct a network where vertices represent locations. A directed link from vertex A to vertex B was constructed if a user from location A answered a question regarding location B.
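A minimal sketch of this aggregation from users to locations follows. The input structures (a mapping from user to declared location and a list of answer events), the sample values and the variable names are assumptions for illustration; users without a stated location are simply skipped.

```python
from collections import Counter

# Hypothetical inputs: the location each user declared, and answer events
# given as (answering user, location the question refers to).
user_location = {"u1": "Bucharest", "u2": "Bucharest", "u3": "Sibiu"}
answer_events = [
    ("u1", "Greece"),
    ("u2", "Greece"),
    ("u3", "Bulgaria"),
    ("u1", "Bulgaria"),
    ("u9", "Turkey"),  # u9 declared no location, so this event is dropped
]

# Weighted link (location A -> location B): number of answers given by users
# from location A to questions regarding location B.
location_links = Counter(
    (user_location[user], destination)
    for user, destination in answer_events
    if user in user_location
)

answers_from = Counter()   # total link weight leaving each answering location
answers_about = Counter()  # total link weight arriving at each destination
for (origin, destination), weight in location_links.items():
    answers_from[origin] += weight
    answers_about[destination] += weight

print(answers_from.most_common(5))
print(answers_about.most_common(5))
```

Ranking the two totals gives a top list of answering locations and a top list of destinations of interest.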

Table III depicts the top 5 locations from where questions are answered and the top 5 countries of interest for the web-site users. The ranking was made based on weighted in-degree and weighted out-degree, respectively. Although much more work has to be done regarding geographical analysis, the current results are correlated with the Romanian tourism market⁶.

Fig. 5. Degree distribution of vertices.

Metric                                                        Value
Avg. Degree                                                   3.201
Avg. Weighted degree                                          5.881
Diameter                                                      16
Modularity coefficient                                        0.48
Number of Communities detected by the modularity algorithm    146
Avg. Clustering coefficient                                   0.015
Avg. Path length                                              5.042

TABLE I. TABLE CONTAINING BASIC GRAPH METRICS OF THE ENTIRE NETWORK.

Fig. 7. The inter-community graph. Each vertex represents a community; communities can be distinguished through their colour. The vertices' diameter is proportional to their respective PageRank coefficient. Only the communities that have inter-community degree larger than 0 are considered.

C. Temporal analysis

As mentioned before, in order to conduct the temporal analysis we had to capture the time of link formation. We decided to take a snapshot of the network on January 1st of each year from 2010 to 2015. The results can be seen in Figure 9. The network grows significantly from the inside out (like a tree), which we regard as natural growth. The growth process seems to follow these steps:

1) from the current core, start developing to the outside, see years 2010 and 2014.

2) develop connections with the outskirts until they become a part of the core (as connected as the core), see years from 2011 to 2013 and 2015.

3) return to step 1.

⁶ http://vacantalamare.stirileprotv.ro/stiri/romanii-nu-mai-prefera-bulgaria-topul-destinatiilor-de-vacanta-alese-in-aceasta-vara.html, last visited July 20th 2015

Fig. 8. Degree distribution of communities. Only the communities that have inter-community degree larger than 0 are considered.

Metric                                                        Value
Avg. Degree                                                   10.267
Avg. Weighted degree                                          28.477
Diameter                                                      9
Modularity coefficient                                        0.299
Number of Communities detected by the modularity algorithm    10
Avg. Clustering coefficient                                   0.068
Avg. Path length                                              3.262

TABLE II. TABLE CONTAINING BASIC GRAPH METRICS OF THE CORE.

Fig. 10. Degree distribution of communities. Only the communities that have inter-community degree larger than 0 are considered.

Further, we wanted to analyse the development pace of the network: is it slowing down or speeding up? As can be seen in Figure 10, the pace of link formation has been constant, with a significant rise between 2010 and 2012.
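The yearly snapshots and the pace of link formation can be sketched as follows. The link records, their dates and the helper name snapshot are illustrative assumptions; a link is taken to belong to a snapshot if it was formed strictly before January 1st of that year.

```python
from datetime import date

# Hypothetical timed links: (source, target, date the answer was posted).
timed_links = [
    ("B", "A", date(2009, 6, 1)),
    ("C", "A", date(2011, 3, 15)),
    ("B", "C", date(2013, 11, 2)),
    ("D", "B", date(2014, 12, 31)),
]


def snapshot(links, cutoff):
    """Links that already existed strictly before `cutoff`."""
    return [(s, t) for s, t, created in links if created < cutoff]


# Network size on January 1st of each year from 2010 to 2015.
sizes = {
    year: len(snapshot(timed_links, date(year, 1, 1)))
    for year in range(2010, 2016)
}
# Links formed between consecutive snapshots (the pace of link formation).
new_per_year = {year: sizes[year] - sizes[year - 1] for year in range(2011, 2016)}

print(sizes)
print(new_per_year)
```

Comparing consecutive snapshot sizes separates the cumulative growth of the network from the number of links formed in each period.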

V. CONCLUSION AND FUTURE WORK

In conclusion, we can argue that thanks to Complex Network analysis methods we could answer all the questions raised in Section IV-A. This shows the usefulness of such methods in the presented case study.

Fig. 6. AmFostAcolo question answer interaction network with diffusion experiment. The vertices' diameters are proportional to their PageRank coefficient [27]. The colour of a vertex depicts whether the diffusion has reached it (red) or not (grey). Links are coloured according to the colour of their source vertex. The arrow points to the start vertex of the diffusion.

Top locations for question answering   Weighted In-Degree   Top country destinations   Weighted Out-Degree
Bucharest                              1126                 Greece                     865
Giurgiu                                276                  Bulgaria                   685
Brasov                                 259                  Turkey                     394
Iasi                                   242                  Romania                    227
Sibiu                                  198                  Egypt                      168

TABLE III. TABLE STATING TOP 5 DEPARTURE LOCATIONS AND DESTINATIONS.

We were able to state that the complex network type is core-periphery, and to describe its influence, especially on the increased resilience of the network. We found significant evidence for the presence of the small world and preferential attachment phenomena. We showed that the network developed in a “natural” way, from the inside out. We have also shown that the rate of link creation is on a stable positive slope, thus the community is still in the process of expansion. We have shown that a review, either positive or negative, can be perceived very fast by a significant part of the network, due to the type of the network and to the phenomena just mentioned. We were able to determine the countries which are of most interest to the users of the web-site.

We acknowledge that further work is needed. We plan to expand our data source to several web-sites. We also plan to conduct a more detailed analysis of the communities. The geographical analysis has to be further expanded, and the seasonal visitation trends have to be discovered and analysed.

ACKNOWLEDGEMENT

This work was supported by the strategic grant POSDRU/159/1.5/2/133255, Project ID 133225 (2014), co-financed by the European Social Fund within the Sectorial Operational Program Human Resources Development 2007-2013.

Fig. 9. Time lapse of the evolution of the network as captured on the first day of each year. From left to right and top to bottom the respective years are: 2010, 2011, 2012, 2013, 2014, 2015.

REFERENCES

[1] G. Büyüközkan and B. Ergün, “Intelligent system applications in electronic tourism,” Expert Systems with Applications, vol. 38, no. 6, pp. 6586–6598, 2011.

[2] U. Gretzel, H. Werthner, C. Koo, and C. Lamsfus, “Conceptual foundations for understanding smart tourism ecosystems,” Computers in Human Behavior, 2015.

[3] E. No and J. K. Kim, “Comparing the attributes of online tourism information sources,” Computers in Human Behavior, 2015.

[4] M. Antonie, C. Bădică, and A. Becheru, “Towards social data analytics for smart tourism: A network science perspective,” accepted at RUMOUR-2015, Workshop on Social Media and the Web of Linked Data, 18 July 2015, Sibiu, Romania.

[5] A. Rényi and P. Erdős, “On random graphs,” Publicationes Mathematicae, vol. 6, no. 290-297, p. 5, 1959.

[6] M. S. Granovetter, “The strength of weak ties,” American Journal of Sociology, pp. 1360–1380, 1973.

[7] A.-L. Barabási, N. Gulbahce, and J. Loscalzo, “Network medicine: a network-based approach to human disease,” Nature Reviews Genetics, vol. 12, no. 1, pp. 56–68, 2011.

[8] C. Wilson, “Searching for Saddam: Why social network analysis hasn't led us to Osama bin Laden,” Slate (February 26, 2010), 2010.

[9] R. L. Cross, J. Singer, S. Colella, R. J. Thomas, and Y. Silverstone, The organizational network fieldbook: Best practices, techniques and exercises to drive organizational innovation and performance. John Wiley & Sons, 2010.

[10] W.-S. Yang and S.-Y. Hwang, “itravel: A recommender system in mobile peer-to-peer environment,” Journal of Systems and Software, vol. 86, no. 1, pp. 12–20, 2013.

[11] M. Rodriguez-Sanchez, J. Martinez-Romo, S. Borromeo, and J. Hernandez-Tamames, “Gat: Platform for automatic context-aware mobile services for m-tourism,” Expert Systems with Applications, vol. 40, no. 10, pp. 4154–4163, 2013.

[12] D. Gavalas, C. Konstantopoulos, K. Mastakas, and G. Pantziou, “Mobile recommender systems in tourism,” Journal of Network and Computer Applications, vol. 39, pp. 319–333, 2014.

[13] M. Al-Hassan, H. Lu, and J. Lu, “A semantic enhanced hybrid recommendation approach: A case study of e-government tourism service recommendation system,” Decision Support Systems, vol. 72, pp. 97–109, 2015.

[14] Á. García-Crespo, J. L. López-Cuadrado, R. Colomo-Palacios, I. González-Carrasco, and B. Ruiz-Mezcua, “Sem-fit: A semantic based expert system to provide recommendations in the tourism domain,” Expert Systems with Applications, vol. 38, no. 10, pp. 13310–13319, 2011.

[15] K. Jiang, H. Yin, P. Wang, and N. Yu, “Learning from contextual information of geo-tagged web photos to rank personalized tourism attractions,” Neurocomputing, vol. 119, pp. 17–25, 2013.

[16] J. Borràs, A. Moreno, and A. Valls, “Intelligent tourism recommender systems: A survey,” Expert Systems with Applications, vol. 41, no. 16, pp. 7370–7389, 2014.

[17] M. Colhon, C. Bădică, and A. Şendre, “Relating the opinion holder and the review accuracy in sentiment analysis of tourist reviews,” in Knowledge Science, Engineering and Management. Springer, 2014, pp. 246–257.

[18] A. Becheru, C. Bădică, and M. Colhon, “Tourist review analytics using complex networks.”

[19] R. Baggio, N. Scott, and C. Cooper, “Network science: A review focused on tourism,” Annals of Tourism Research, vol. 37, no. 3, pp. 802–827, 2010.

[20] D. F. Nettleton, “Data mining of social networks represented as graphs,” Computer Science Review, vol. 7, pp. 1–34, 2013.

[21] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast unfolding of communities in large networks,” Journal of Statistical Mechanics: Theory and Experiment, vol. 2008, no. 10, p. P10008, 2008.

[22] M. Bastian, S. Heymann, and M. Jacomy, “Gephi: An open source software for exploring and manipulating networks,” 2009. [Online]. Available: http://www.aaai.org/ocs/index.php/ICWSM/09/paper/view/154

[23] D. A. Schult and P. Swart, “Exploring network structure, dynamics, and function using networkx,” in Proceedings of the 7th Python in Science Conferences (SciPy 2008), vol. 2008, 2008, pp. 11–16.

[24] P. Boldi and S. Vigna, “Axioms for centrality,” Internet Mathematics, vol. 10, no. 3-4, pp. 222–262, 2014.

[25] S. Milgram, “The small world problem,” Psychology Today, vol. 2, no. 1, pp. 60–67, 1967.

[26] A.-L. Barabási et al., “Scale-free networks: a decade and beyond,” Science, vol. 325, no. 5939, p. 412, 2009.

[27] L. Page, S. Brin, R. Motwani, and T. Winograd, “The pagerank citation ranking: bringing order to the web.” 1999.

[28] M. E. Newman, “Clustering and preferential attachment in growing networks,” Physical Review E, vol. 64, no. 2, p. 025102, 2001.

[29] D. A. Hojman and A. Szeidl, “Core and periphery in networks,” Journal of Economic Theory, vol. 139, no. 1, pp. 295–309, 2008.

ResearchGate has not been able to resolve any citations for this publication.

Gephi: An Open Source Software for Exploring and Manipulating Networks

Article

Full-text available

Mar 2009

Gephi is an open source software for graph and network analysis. It uses a 3D render engine to display large networks in real-time and to speed up the exploration. A flexible and multi-task architecture brings new possibilities to work with complex data sets and produce valuable visual results. We present several key features of Gephi in the context of interactive exploration and interpretation of networks. It provides easy and broad access to network data and allows for spatializing, filtering, navigating, manipulating and clustering. Finally, by presenting dynamic features of Gephi, we highlight key aspects of dynamic network visualization.

Tourist review analytics using complex networks

Conference Paper

Full-text available

Sep 2015

A number of techniques for Natural Language Processing (shortly, NLP) based on graph representations were developed. Usually they target a specific NLP task, such as: text summarisation, syntactic parsing, word sense disambiguation, ontology construction, sentiment and subjectivity analysis, or text clustering. In this paper we explore complex network representation of tourist reviews for extracting lexical and quantitative features of the review text. The most important contribution of our proposal consists of defining a new method for keywords extraction using Complex Network ranking metrics.

Relating the Opinion Holder and the Review Accuracy in Sentiment Analysis of Tourist Reviews

Conference Paper

Full-text available

Oct 2014

In this paper we propose a sentiment classification method for the categorization of tourist reviews according to the sentiment expressed. We also give the results of the application of our sentiment analysis method on a real data set extracted from the AmFostAcolo tourist review Web site. In our analysis we were focused on investigating the relation between the opinion holder and the accuracy of the review sentiment with the review score. Based on our initial experimental results we concluded that specific characteristics of the opinion holder, like for example his or her reputation, might relate to the accuracy of the opinions expressed in his or her reviews.

Conceptual foundations for understanding smart tourism ecosystems

Article

Apr 2015
COMPUT HUM BEHAV

Using digital ecosystems and smart business networks as conceptual building blocks, this paper defines, describes and illustrates the idea of a smart tourism ecosystem (STE). It further draws on conceptualizations of smart technologies, smart cities and smart tourism to envision new ways in which value is created, exchanged and consumed in the STE. Technologies essential to the functioning of an STE are described and it is argued that data emerging from these technologies are the driver for new business models, interaction paradigms and even new species. Critical questions regarding the need for regulatory intervention and innovative research are raised.

A Semantic Enhanced Hybrid Recommendation Approach: a Case Study of E-government Tourism Service Recommendation System

Article

Feb 2015
DECIS SUPPORT SYST

Recommender systems are effectively used as a personalized information filtering technology to automatically predict and identify a set of interesting items on behalf of users according to their personal needs and preferences. The Collaborative Filtering (CF) approach is commonly used in the context of recommender systems; however, obtaining better prediction accuracy and overcoming the main limitations of the standard CF recommendation algorithms, such as sparsity and cold-start item problems, remain a significant challenge. Recent developments in personalization and recommendation techniques support the use of semantic enhanced hybrid recommender systems, which combine ontology-based semantic similarity measures with other recommendation approaches to improve the quality of recommendations. Consequently, this paper presents the effectiveness of utilizing semantic knowledge of items to enhance the recommendation quality. It proposes a new Inferential Ontology-based Semantic Similarity (IOBSS) measure to evaluate semantic similarity between items in a specific domain of interest by taking into account their explicit hierarchical relationships, shared attributes and implicit relationships. The paper further proposes a hybrid semantic enhanced recommendation approach by combining the new IOBSS measure and the standard item-based CF approach. A set of experiments with promising results validates the effectiveness of the proposed hybrid approach, using a case study of the Australian e-government tourism services.

The PageRank citation ranking: Bringing order to the web

Article

Jan 1999

L. Page

Clustering and preferential attachment in growing networks

Article

Jan 2001

M. E. J. Newman

Towards Social Data Analytics for Smart Tourism: A Network Science Perspective

Conference Paper

Apr 2016

In this paper we present our preliminary results on collecting, processing and visualizing relations between the user comments posted on Smart Tourism Web sites. The focus of this paper is on investigating the user interactions generated by questions and answers containing the users’ impressions and opinions about the attractions offered by various tourism destinations. We propose a prototype system based on a conceptual data model and a data processing workflow that make it possible to capture, analyze and query the implicit social network determined by the relations between user comments, using specialized software tools for graph databases and complex network analytics.

Comparing the attributes of online tourism information sources

Article

Mar 2015
COMPUT HUM BEHAV

Ongoing developments in information technology (IT), particularly with respect to the Internet, have led to changes in the way tourism-related information is distributed. These changes have affected the planning and consumption patterns of tourists before and during their trips. Since the majority of tourists retrieve information from multiple information sources on the web, it is essential to define the differences in these sources and identify the distinct characteristics or properties of each source in order to understand the needs and tendencies of tourists. This study classifies online tourism information sources into four types: blogs, public websites, company websites, and social media websites. Five website attributes are identified: accessibility, security, information-trust, interaction, and personalization. This study uses data from 61 participants. Each participant answered all of the questions for four different information sources. This study then conducts an analysis of variance (ANOVA) test and a multiple comparison Scheffé test to verify differences between groups. Based on these five attributes, the results of this multiple comparison show that the overall mean values are relatively high in personal blogs, while security is the dominant attribute for public websites. The mean values of all five attributes were relatively lower in SNSes compared to the other sources.

Intelligent tourism recommender systems: A survey

Article

Nov 2014
EXPERT SYST APPL
