
Using Big Data to study the link between human mobility and socio-economic development

October 2015

DOI:10.1109/BigData.2015.7363835

Conference: IEEE International Conference on Big Data
At: Santa Clara, CA, USA

Authors:

Luca Pappalardo

Italian National Research Council

Dino Pedreschi

Università di Pisa

Zbigniew Smoreda

Orange Labs, Paris, France

Fosca Giannotti

Italian National Research Council

Big Data offer nowadays the potential capability of creating a digital nervous system of our society, enabling the measurement, monitoring and prediction of relevant aspects of socioeconomic phenomena in quasi real time. This potential has fueled, in the last few years, a growing interest around the usage of Big Data to support official statistics in the measurement of individual and collective economic well-being. In this work we study the relations between human mobility patterns and socioeconomic development. Starting from nationwide mobile phone data we extract a measure of mobility volume and a measure of mobility diversity for each individual. We then aggregate the mobility measures at municipality level and investigate the correlations with external socioeconomic indicators independently surveyed by an official statistics institute. We find three main results. First, aggregated human mobility patterns are correlated with these socioeconomic indicators. Second, the diversity of mobility, defined in terms of entropy of the individual users' trajectories, exhibits the strongest correlation with the external socioeconomic indicators. Third, the volume of mobility and the diversity of mobility show opposite correlations with the socioeconomic indicators. Our results, validated against a null model, open an interesting perspective to study human behavior through Big Data by means of new statistical indicators that quantify and possibly "nowcast" the socioeconomic development of our society.

The correlation between aggregated mobility diversity and aggregated mobility volume. We split the municipalities into ten equal-sized groups according to the deciles of the measures on the x axis. For each group, we compute the mean and the standard deviation of the measures on the y axis and plot them through the black error bars. ρ indicates the Pearson correlation coefficient between the two measures (p-value < 0.001). We observe a negative correlation suggesting that high mobility entropy is linked to low mobility volume, and vice versa.


Using Big Data to study the link between

human mobility and socio-economic development

Luca Pappalardo

Department of Computer Science

University of Pisa, Italy

Email: lpappalardo@di.unipi.it

Dino Pedreschi

Department of Computer Science

University of Pisa, Italy

Email: pedre@di.unipi.it

Zbigniew Smoreda

SENSE

Orange Lab, France

Email: zbigniew.smoreda@orange.com

Fosca Giannotti

Institute of Information Science and Technologies

National Research Council (CNR), Italy

Email: fosca.giannotti@isti.cnr.it


I. INTRODUCTION

The Big Data originating from the digital breadcrumbs of

human activities, sensed as a by-product of the ICT systems

we use every day, allow us to scrutinize the ground truth of

individual and collective behavior at an unprecedented detail

[39]. Multiple dimensions of our social life have Big Data

proxies nowadays. Our social relationships leave traces in the

network of our phone or email contacts, in the friendship

links of our favorite social networking site. Our shopping

patterns leave traces in the transaction records of our pur-

chases. Our movements leave traces in the records of our

mobile phone calls, in the GPS tracks of our on-board navi-

gation systems. Sensing Big Data at a societal scale has the

potential of providing a powerful social microscope, which

can help us understand many complex and hidden socio-economic phenomena. Such a challenge clearly requires high-level analytics, modeling and reasoning across all the social dimensions above, an activity that is often referred to as

“social mining”: the task of making sense of Big Data by

extracting meaningful information from large, messy and noisy

data [17]. In recent years, also stimulated by national ofﬁcial

statistics institutes and the United Nations [41], researchers

from different disciplines have started to use Big Data and

social mining to support ofﬁcial statistics in the measurement

of individual and collective well-being [8][37]. The majority

of works in literature focus on the analysis of mobile phone

data to study the relations between communications patterns

and well-being [11][4]. In this paper we analyze Big Data from

mobile phones to study the link between individuals’ mobility

patterns and the socio-economic development of cities. We try

to answer the following intriguing question: Can we monitor

and possibly predict the socio-economic development of cities

just by observing human movements of their residents through

the lens of Big Data? The answer to this fascinating question,

as we show in this paper, is related to the concepts of mobility

volume and mobility diversity.

We know that bio-diversity is crucial to the health of natural

ecosystems and for the balance, or well-being, of plant and

animal species that inhabit them. Diversity is a key concept

also for the social ecosystems: from Francis Galton who

showed that the diversity of opinion in a crowd is essential to

answer difﬁcult questions [15] to more recent works showing

that the diversity of social contacts is associated to socio-

economic indicators of well-being [11][19][4], social diversity

has proven to be essential in many contexts [38][23]. In this

paper we argue that diversity is a key concept also for the

mobility ecosystem, and that the volume and the diversity of

mobility patterns have high predictive power with respect to

the socio-economic development of cities.

Starting from large-scale mobile phone data we quantify

the relations between human mobility and socio-economic de-

velopment in France using municipality-level ofﬁcial statistics

as external comparison measurements. We ﬁrst deﬁne two

individual measures over mobile phone data which describe

two aspects of individual mobility behavior: the volume of

mobility, i.e. the characteristic traveled distance of an individ-

ual [18], and the diversity of mobility, i.e. the diversiﬁcation

of movements of an individual over her locations [36]. Each

individual measure is computed for each of the several million

users in our dataset based on their locations and calls as

recorded in the mobile phone data. We then aggregate the

two individual measures at the level of French municipalities

and explore the correlations between the aggregated measures

and external indicators covering different aspects of socio-

economic development: wealth, employment, education and

deprivation. We ﬁnd that both mobility measures correlate

with the external socio-economic indicators, and in particular

the measure of mobility diversity shows much stronger cor-

relations. We validate our results against a null model which

produces zero correlations allowing us to reject the hypothesis

that our results occurred by chance. Finally, we observe that

at municipality level mobility volume and mobility diversity

show negative correlations and opposite correlations with the

socio-economic indicators, suggesting that they play different

roles in the socio-economic development of cities.

The importance of our ﬁndings is twofold. On one side,

we show that mobility diversity and mobility volume are key

concepts for the well-being of our cities that can be used

to understand deeply the complexity of our interconnected

society. On the other side, our results reveal the high potential

of Big Data in providing representative, relatively inexpensive

and readily available measures as proxies of socio-economic

development and well-being. New statistical indicators can

be deﬁned to describe the well-being of a territory, in order

to support ofﬁcial statistics when such measurements are not

possible using traditional censuses and surveys [41][27].

The paper is organized as follows. Section II reviews the

main works in the study of Big Data for measuring devel-

opment and well-being. Section III and Section IV present

respectively the mobile phone data and the socio-economic

indicators we use in our study. Section V introduces the

measures of mobility and explains how to compute them on

the mobile phone data. In Section VI we show the main results

of our study and discuss them in Section VII. Finally, Section

VIII concludes the paper discussing open lines of new research.

II. RELATED WORK

Big Data offer nowadays the potential capability of creating

a digital nervous system of our society, enabling the measure-

ment, monitoring and prediction of relevant aspects of human

behavior [17]. For example the availability of massive digital

traces of human whereabouts, such as GPS traces from private

vehicles and mobile phone data, has offered novel insights

on the quantitative patterns characterizing human mobility

[6][18][16]. Studies from different disciplines document a

stunning heterogeneity of human travel patterns as measured

by the so-called radius of gyration [18][28], and at the same

time observe a high degree of predictability as measured by

the mobility entropy [36][12]. The patterns of human mobility

have been used to build generative models of individual human

mobility [21][29], generative models to describe human migra-

tion ﬂows [34], methods for proﬁling individuals according

to their recurrent and total mobility patterns [29], methods

to discover geographic borders according to recurrent trips of

private vehicles [33], methods to predict the formation of social

ties [7][40], and classiﬁcation models to predict the kind of

activity associated to individuals’ trips on the only basis of

the observed displacements [22][20][32].

The last few years have also witnessed a growing interest

around the usage of Big Data to support ofﬁcial statistics in the

measurement of individual and collective well-being [8][37].

Even the United Nations, in two recent reports, stimulate the

usage of Big Data to investigate the patterns of phenomena

relative to people’s health and well-being [41][27]. The vast

majority of works in the context of Big Data for ofﬁcial statis-

tics are based on the analysis of mobile phone data, the so-

called CDR (Call Detail Records) of calling and texting activity

of users. Mobile phone data, indeed, guarantee the repeatability

of experiments on different countries and geographical scales

since they can be retrieved nowadays in every country due to

their worldwide diffusion [3]. A set of recent works use mobile

phone data as a proxy for socio-demographic variables. Deville

et al., for example, show how the ubiquity of mobile phone

data can be exploited to provide accurate and detailed maps

of population distribution over national scales and any time

period [10]. Brea et al. study the structure of the social graph

of mobile phone users of Mexico and propose an algorithm for

the prediction of the age of mobile phone users [5]. Another

recent work uses mobile phone data to study inter-city mobility

and develop a methodology to detect the fraction of residents,

commuters and visitors within each city [14].

A lot of effort has been put in recent years on the usage

of mobile phone data to study the relationships between

human behavior and collective socio-economic development.

The seminal work by Eagle et al. analyzes a nationwide

mobile phone dataset and shows that, in the UK, regional

communication diversity is positively associated to a socio-

economic ranking [11]. Gutierrez et al. address the issue

of mapping poverty with mobile phone data through the

analysis of airtime credit purchases in Ivory Coast [19].

Blumenstock shows preliminary evidence of a relationship

between individual wealth and the history of mobile phone

transactions [4]. Decuyper et al. use mobile phone data to study

food security indicators ﬁnding a strong correlation between

the consumption of vegetables rich in vitamins and airtime

purchase [9]. Frias-Martinez et al. analyze the relationship

between human mobility and the socio-economic status of

urban zones, presenting which mobility indicators correlate

best with socio-economic levels and building a model to

predict the socio-economic level from mobile phone traces

[13]. Lotero et al. analyze the architecture of urban mobility

networks in two Latin-American cities from the multiplex per-

spective. They discover that the socio-economic characteristics

of the population have an extraordinary impact in the layer

organization of these multiplex systems [24]. Amini et al. use

mobile phone data to compare human mobility patterns of

a developing country (Ivory Coast) and a developed country

(Portugal). They show that cultural diversity in developing

regions can present challenges to mobility models deﬁned in

less culturally diverse regions [1]. Smith-Clarke et al. analyze

the aggregated mobile phone data of two developing countries

and extract features that are strongly correlated with poverty

indexes derived from census data [35].

Other recent works use different types of mobility data

to show that Big Data on human movements can be used

to support ofﬁcial statistics and understand people’s purchase

needs. Pennacchioli et al., for example, provide empirical

evidence of the inﬂuence of purchase needs on human mobility,

analyzing the purchases of an Italian supermarket chain to

show a range effect of products: the more sophisticated the

needs they satisfy, the more the customers are willing to travel

[30]. Marchetti et al. perform a study on a regional level

analyzing GPS tracks from cars in Tuscany to extract measures

of human mobility at province and municipality level, ﬁnding a

strong correlation between the mobility measures and a poverty

index independently surveyed by the Italian ofﬁcial statistics

institute [25].

III. MOBILE PHONE DATA

Mobile phones are nowadays very common technological devices carried by individuals in their daily routine, offering a good proxy to study the patterns of human mobility. In our study, we exploit the access to a dataset of Call Detail Records (CDR) gathered by the Orange mobile phone operator, recording 200 million calls made during 45 days (from 2007/09/01 to 2007/10/15) by 20 million anonymized users in France. CDRs collect geographical, temporal and interaction information on mobile phone use and show a great potential to empirically investigate human dynamics on a society-wide scale [18][2][26]. Each time an individual makes a call or sends a text message, the mobile phone operator registers the connection between the caller and the callee, the time of the phone activity and the phone tower communicating with the served phone, which allows the user's time-resolved trajectory to be reconstructed [18]. Table I shows the format of CDR data and tower location data in our dataset. To protect the users' private information, all the users are anonymized by replacing their identifiers with hashed values. The item "timestamp" records the exact time of the phone activity, while "tower" is the identifier of the wireless tower that serves the caller's call or text message. The item "mode" simply distinguishes calls from text messages.

(a)
caller    callee    timestamp          tower   mode
4F80460   4F80331   2007/09/10 23:34   36      call
2B01359   9H80125   2007/10/10 01:12   38      SMS
2B19935   6W1199    2007/10/10 01:43   38      call

(b)
tower   latitude   longitude
36      49.54      3.64
37      48.28      1.258
38      48.22      -1.52

TABLE I. THE FORMATS OF CDR DATA (A) AND TOWER DATA (B).

To focus on individuals with reliable statistics we carry out a preprocessing step. We select only users with a call frequency f = N/45 > 0.5, where N is the number of calls made by the user and 45 days is the length of our period of observation. In other words, we delete all the users with fewer than one call every two days on average over the observation period. The resulting dataset contains the mobility trajectories of 6 million active users.
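The frequency filter described above can be illustrated with a minimal Python sketch. It is not the code used in the study: the function name `active_users` and the representation of CDR rows as dictionaries with a "caller" key are assumptions made for illustration only.

```python
from collections import Counter

OBSERVATION_DAYS = 45    # length of the observation period stated in the text
MIN_CALLS_PER_DAY = 0.5  # threshold f > 0.5, i.e. more than one call every two days


def active_users(cdr_rows, days=OBSERVATION_DAYS, min_rate=MIN_CALLS_PER_DAY):
    """Return the set of callers whose call frequency N / days exceeds min_rate.

    cdr_rows is an iterable of dicts with a "caller" key (hypothetical layout).
    """
    calls_per_user = Counter(row["caller"] for row in cdr_rows)
    return {user for user, n in calls_per_user.items() if n / days > min_rate}


# Toy data: user "A" makes 30 calls in 45 days, user "B" only 10.
rows = [{"caller": "A"}] * 30 + [{"caller": "B"}] * 10
kept = active_users(rows)
print(kept)
```

With the toy data, only user "A" passes the filter, since 30/45 exceeds 0.5 while 10/45 does not.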

IV. SOCIO-ECONOMIC DATA

As external socio-economic indicators, we use a dataset

provided by the French National Institute of Statistics and

Economic Studies (INSEE) about socio-economic indicators

in 2007 for all the French municipalities with more than

1,000 ofﬁcial residents. We collect data about four aspects

of socio-economic development: (i) per capita income, the

mean income in a given municipality; (ii) education rate, the

fraction of residents of a municipality with primary education

only; (iii) unemployment rate, the ratio between unemployed individuals and all the residents of a municipality; (iv)

deprivation index, constructed by selecting among variables

reﬂecting individual experience of deprivation and combining

them into a single score by a linear combination with speciﬁc

choices for coefficients [31] (see footnote 1):

deprivation = 0.11 × Overcrowding
            + 0.34 × No access to electric heating
            + 0.55 × Non-owner
            + 0.47 × Unemployment
            + 0.23 × Foreign nationality
            + 0.52 × No access to a car
            + 0.37 × Unskilled worker-farm worker
            + 0.45 × Household with 6+ persons
            + 0.19 × Low level of education
            + 0.41 × Single-parent household.

Preliminary validation showed a high association between

the French deprivation index and both income and education

values in French municipalities, partly supporting its ability

to measure socio-economic status [31]. Figure 1 shows the

distribution of the four socio-economic indicators across the

French municipalities.
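As an illustration, the linear combination defining the deprivation index can be written as a short Python function. This is only a sketch: the weights are those listed in the formula, whereas the dictionary keys and the scaling of the input variables (for example, whether they are standardized beforehand) are assumptions, since the text defers to [31] for the procedure.

```python
# Weights copied from the deprivation formula; the key names are hypothetical labels.
WEIGHTS = {
    "overcrowding": 0.11,
    "no_electric_heating": 0.34,
    "non_owner": 0.55,
    "unemployment": 0.47,
    "foreign_nationality": 0.23,
    "no_car": 0.52,
    "unskilled_or_farm_worker": 0.37,
    "household_6_plus": 0.45,
    "low_education": 0.19,
    "single_parent_household": 0.41,
}


def deprivation_index(variables):
    """Weighted sum of the ten component variables of one municipality."""
    missing = WEIGHTS.keys() - variables.keys()
    if missing:
        raise KeyError(f"missing variables: {sorted(missing)}")
    return sum(weight * variables[name] for name, weight in WEIGHTS.items())


# Toy input: every component set to 1, so the score equals the sum of the weights.
score = deprivation_index({name: 1.0 for name in WEIGHTS})
print(round(score, 2))
```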

V. MEASURING HUMAN MOBILITY

Starting from the trajectories of an individual we consider

two aspects of individual mobility: the volume of mobility, i.e.

how large the typical distance traveled by an individual is, and

the diversity of mobility, i.e. how the trips of an individual are

distributed over the locations visited. The radius of gyration rg

is a measure of mobility volume and indicates the characteristic

distance traveled by an individual [18][28][29]. It characterizes

the spatial spread of the phone towers visited by an individual

u from her center of mass (i.e. the weighted mean point of the phone towers visited by an individual), defined as:

\[ r_g(u) = \sqrt{\frac{1}{N} \sum_{i \in L} n_i \, (\mathbf{r}_i - \mathbf{r}_{cm})^2} \tag{1} \]

where $L$ is the set of phone towers visited by the individual, $n_i$ is the individual's visitation frequency of phone tower $i$, $N = \sum_{i \in L} n_i$ is the sum of all the single frequencies, and $\mathbf{r}_i$ and $\mathbf{r}_{cm}$ are the vectors of coordinates of phone tower $i$ and of the center of mass, respectively. To clarify the concept, let us consider Figure 2, which displays the radius of gyration of two individuals in our dataset. User A travels between

1. The variables used to compute the deprivation index (Overcrowding, No

access to electric heating, etc.) refer to socio-economic status in 2007. We used

the procedure in [31] to compute the deprivation index on these variables.

Fig. 1. The distribution of socio-economic variables across the French municipalities. (a) Distribution of logarithm of per capita income; (b) distribution

of education rate; (c) distribution of unemployment rate; (d) distribution of deprivation index. We observe that all the distributions show clear peaks highlighting

the presence of typical socio-economic values across the French municipalities.

locations that are close to each other, resulting in a low radius

of gyration $r_g(A)$. In contrast, user B has a large radius of

gyration since the locations she visits are far apart from each

other. Figure 3a shows the distribution of radius of gyration

across the individuals in our dataset. The distribution is well

approximated by a heavy-tailed distribution, indicating a large

variability of the radii, a conﬁrmation of previous results on

both GSM data [18] and GPS data [28].
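The definition of the radius of gyration in Equation (1) can be made concrete with a small Python sketch. It assumes that tower positions are already expressed in planar coordinates (for example kilometres in a projected reference system); towers given as latitude and longitude, as in Table I, would first need a projection or a great-circle distance, which is not shown here. The function name and the input layout are illustrative assumptions.

```python
import math


def radius_of_gyration(visits):
    """Radius of gyration of one user, following Equation (1).

    visits maps a tower position (x, y), in planar coordinates, to the
    visitation frequency n_i of that tower.
    """
    total = sum(visits.values())  # N = sum of all n_i
    # Center of mass: frequency-weighted mean of the tower positions.
    cx = sum(n * x for (x, _), n in visits.items()) / total
    cy = sum(n * y for (_, y), n in visits.items()) / total
    # Frequency-weighted squared distances from the center of mass.
    squared = sum(n * ((x - cx) ** 2 + (y - cy) ** 2) for (x, y), n in visits.items())
    return math.sqrt(squared / total)


# Toy user: two towers 2 units apart, visited equally often.
print(radius_of_gyration({(0.0, 0.0): 1, (2.0, 0.0): 1}))
```

A user who only ever appears at a single tower has a radius of gyration of zero, while spreading visits over distant towers increases it.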

[Figure 2, panels (a) and (b). Legend: home location; center of mass; radius of gyration.]

Fig. 2. The radius of gyration of two users in our dataset. The figure shows the spatial distribution of phone towers (circles). The size of the circles is proportional to their visitation frequency; the red location indicates the most frequent location L1 (the location where the user makes the highest number of calls during nighttime). The cross indicates the position of the center of mass, and the black dashed line indicates the radius of gyration. User A has a small radius of gyration because she travels between locations that are close to each other. User B has a high radius of gyration because the locations she visits are far apart from each other.

We measure the mobility diversity of an individual u by using the Shannon entropy [36]:

\[ S(u) = -\frac{\sum_{e \in E} p(e) \log p(e)}{\log N} \tag{2} \]

where $e = (a, b)$ represents a trip between an origin phone tower and a destination phone tower, $E$ is the set of all the possible origin-destination pairs, $p(e)$ is the probability of observing a movement between phone towers $a$ and $b$, and $N$ is the total number of trajectories of individual $u$. Mobility

entropy is high when an individual performs many different

trips from a variety of origins and destinations; it is low when

Fig. 3. The distributions of radius of gyration and mobility entropy of

individuals in our dataset. (a) Distribution of radius of gyration. We observe

a heavy-tail distribution indicating a large variability of radius of gyration

across the population. (b) Distribution of mobility entropy, denoting a high

mean degree of unpredictability of human mobility patterns.

she performs a small number of recurring trips. To clarify

the concept let us consider Figure 4 which shows a network

visualization of the mobility entropy of two individuals in

our dataset. In the ﬁgure nodes represent phone towers, edges

represent trips between two phone towers, and the size of edges

is proportional to the number of trips performed on the edge.

User X has low mobility entropy since she distributes her trips on a few preferred edges. Conversely, user Y has high mobility entropy because she distributes her trips across many equal-sized edges. Mobility entropy also quantifies the possibility of predicting an individual's future whereabouts. Individuals with a very regular movement pattern have a mobility entropy close to zero and their whereabouts are rather predictable (the case of user X). Conversely, individuals with a high mobility entropy are less predictable (the case of user Y). Figure 3b shows the distribution of mobility entropy across the users in our dataset, and indicates a high mean degree of predictability of individual human mobility patterns [36].
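The mobility entropy of Equation (2) can be sketched in Python as follows. The sketch rests on assumptions that the text does not spell out: a trip is taken to be a pair of consecutive, distinct towers in the user's time-ordered sequence, p(e) is estimated as the relative frequency of trip e, and N is the number of such trips. The function name is illustrative.

```python
import math
from collections import Counter


def mobility_entropy(towers):
    """Normalized Shannon entropy of a user's trips, following Equation (2).

    towers is the time-ordered list of tower identifiers observed for the user.
    """
    # Assumption: a trip is a move between two consecutive, distinct towers.
    trips = [(a, b) for a, b in zip(towers, towers[1:]) if a != b]
    n = len(trips)
    if n < 2:
        return 0.0  # log N is zero or undefined; treat as fully regular
    counts = Counter(trips)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(n)


# A commuter alternating between two towers versus a user who never repeats a trip.
print(mobility_entropy([1, 2, 1, 2, 1]))
print(mobility_entropy([1, 2, 3, 4]))
```

The commuter concentrates her trips on two edges and obtains a lower value than the user whose trips are all different, in line with the contrast between users X and Y.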

The most frequented location $L_1(u)$ is the place where an individual u is found with the highest probability when stationary, most likely her home. In Figure 2 the red circles indicate $L_1(A)$ and $L_1(B)$, i.e. the phone towers where users A and B make the highest number of calls during the period of observation.

Fig. 4. The mobility entropy of two users in our dataset. Nodes represent phone towers, edges represent trips between two phone towers, the size of nodes indicates the number of calls of the user managed by the phone tower, and the size of edges indicates the number of trips performed by the user on the edge. User X has low mobility entropy because she distributes the trips on a few large preferred edges. User Y has high mobility entropy because she distributes the trips across many equal-sized edges.

VI. CORRELATION ANALYSIS

We compute the two mobility measures for each individual on the CDR data. Due to the size of the dataset, we use the MapReduce paradigm implemented by Hadoop to distribute the computation across a cluster of coordinated nodes and reduce the computation time. We then aggregate the individual measures at the municipality level through a two-step process: (i) we assign to each user u a home location $L_1(u)$, i.e. the phone tower where the user performs the highest number of calls during nighttime (from 10 pm to 7 am) [31]; (ii) based on these home locations, we assign each user to the corresponding municipality with standard Geographic Information Systems techniques. We aggregate radius of gyration and mobility entropy at municipality level by taking the mean, median and standard deviation across the population of users assigned to that municipality. We obtain a set of 5,100 municipalities, each with its associated aggregated indicators.
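The two-step aggregation can be illustrated with a single-machine Python sketch (the study itself uses MapReduce on Hadoop). Several elements are assumptions: the function names, the representation of calls as (datetime, tower) pairs, the use of the population standard deviation, and a plain dictionary `user_municipality` standing in for the Geographic Information Systems step that maps a home tower to a municipality.

```python
from collections import Counter, defaultdict
from datetime import datetime
from statistics import mean, median, pstdev


def is_night(ts):
    """True for timestamps between 10 pm and 7 am."""
    return ts.hour >= 22 or ts.hour < 7


def home_tower(calls):
    """Tower with the most nighttime calls; None if the user never calls at night.

    calls is a list of (datetime, tower) pairs for one user.
    """
    night = Counter(tower for ts, tower in calls if is_night(ts))
    return night.most_common(1)[0][0] if night else None


def aggregate(user_values, user_municipality):
    """Mean, median and standard deviation of a measure per municipality."""
    groups = defaultdict(list)
    for user, value in user_values.items():
        municipality = user_municipality.get(user)
        if municipality is not None:
            groups[municipality].append(value)
    return {
        m: {"mean": mean(v), "median": median(v), "std": pstdev(v)}
        for m, v in groups.items()
    }


calls = [
    (datetime(2007, 9, 10, 23, 34), 36),
    (datetime(2007, 9, 11, 2, 0), 36),
    (datetime(2007, 9, 11, 12, 0), 38),
    (datetime(2007, 9, 11, 13, 0), 38),
    (datetime(2007, 9, 11, 14, 0), 38),
]
print(home_tower(calls))
print(aggregate({"u1": 1.0, "u2": 3.0}, {"u1": "M", "u2": "M"}))
```

In the toy example the user calls more often from tower 38 overall, but her home tower is 36 because only nighttime calls are counted.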

We investigate the correlations between the aggregated

mobility measures and the four external socio-economic in-

dicators presented in Section IV. Table II summarizes the

correlation between the aggregated mobility measures and the

socio-economic indicators. Three main results emerge. First, mobility diversity is a better predictor of socio-economic development than mobility volume (Figure 5 and Table II): it has much stronger correlations than mobility volume regardless of the type of aggregation. Second, per capita income, primary education rate and deprivation index show stronger correlations with the mobility measures than the unemployment rate. Third, mobility diversity and mobility volume show opposite correlations with the socio-economic indicators: where the correlation is positive for mobility diversity, the same correlation is negative for mobility volume, and vice versa. Figure 6 provides another way to observe the relations between mobility diversity and socio-economic development. We split the municipalities into deciles based on the values of deprivation index, and for each decile we compute the distribution of mobility entropy at municipality level. We observe that as the deciles of the economic values increase, both the mean and the variance of the distribution change, consistent with the plot in Figure 5a.
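The two analyses used in this section, the Pearson correlation between an aggregated mobility measure and an indicator, and the decile-based summary, can be sketched in Python as follows. The function names are illustrative, the data are synthetic, and the sketch does not compute p-values.

```python
import math
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def decile_profile(xs, ys, groups=10):
    """Sort items by x, split them into nearly equal-sized groups and
    return the (mean, standard deviation) of y within each group.

    Requires at least `groups` items so that no group is empty.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    profile = []
    for g in range(groups):
        chunk = order[g * len(order) // groups:(g + 1) * len(order) // groups]
        values = [ys[i] for i in chunk]
        profile.append((mean(values), pstdev(values)))
    return profile


# Synthetic example: an indicator that decreases linearly with the measure.
xs = list(range(20))
ys = [-2 * x for x in xs]
print(pearson(xs, ys))
print(decile_profile(xs, ys)[0])
```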

In order to test the signiﬁcance of the correlations ob-

served on the empirical data, we compare our ﬁndings with

the results produced by a null model where we randomly

distribute the users over the French municipalities. We ﬁrst

extract uniformly N users from the dataset and assign them to a random municipality with a population of N users. We

then aggregate the individual diversity measures of the users

assigned to the same municipality. We repeat the process 100

times and take the mean of the aggregated values of each

municipality produced in the 100 experiments. The outcomes

of the null model have zero correlations with all the socio-

economic indicators, allowing us to reject the hypothesis that

our results occurred by chance.
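One possible reading of the null model is sketched below in Python. It is an interpretation rather than the procedure used in the study: users are randomly permuted and dealt out to municipalities so that each municipality keeps its original number of users, the measure is averaged within each municipality, and the result is averaged over the runs. The function name, the seed and the input layout are assumptions.

```python
import random
from statistics import mean


def null_model(user_values, municipality_sizes, runs=100, seed=0):
    """Mean aggregated value per municipality under random user assignment.

    user_values maps user -> individual measure; municipality_sizes maps
    municipality -> number of users to draw (each size must be positive).
    """
    users = list(user_values)
    if sum(municipality_sizes.values()) > len(users):
        raise ValueError("municipality sizes exceed the number of users")
    rng = random.Random(seed)
    totals = {m: 0.0 for m in municipality_sizes}
    for _ in range(runs):
        rng.shuffle(users)  # random assignment without replacement
        start = 0
        for m, size in municipality_sizes.items():
            sample = users[start:start + size]
            start += size
            totals[m] += mean(user_values[u] for u in sample)
    return {m: total / runs for m, total in totals.items()}


# Toy population: entropy values unrelated to where users live.
values = {f"u{i}": (i % 10) / 10 for i in range(1000)}
baseline = null_model(values, {"A": 300, "B": 700})
print(baseline)
```

Because the assignment ignores where users actually live, the aggregated values of all municipalities fluctuate around the population mean and carry no information about the socio-economic indicators.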

measure      DI      PCI     PER     UR
mean S      -0.43    0.49   -0.49   -0.17
mean rg      0.01   -0.25    0.01   -0.04
median S    -0.43    0.48   -0.47   -0.17
median rg    0.16   -0.21    0.47   -0.1
std S        0.20   -0.26    0.27    0.11
std rg       0.01    0.28   -0.21    0.13

TABLE II. CORRELATIONS BETWEEN AGGREGATED MOBILITY MEASURES AND SOCIO-ECONOMIC INDICATORS (DI: DEPRIVATION INDEX; PCI: PER CAPITA INCOME; PER: PRIMARY EDUCATION RATE; UR: UNEMPLOYMENT RATE).

VII. DISCUSSION OF THE RESULTS

The most remarkable result in our study is the observation

that human mobility, and mobility diversity in particular, is

associated with socio-economic indicators on a municipality

scale. To be speciﬁc, on a municipality level mobility entropy

is positively correlated with per capita income and negatively

correlated with deprivation index, primary education rate and

unemployment rate (Figure 5). Generalizing our empirical ﬁnd-

ings, we state that a greater diversiﬁcation of human mobility is

linked to a higher overall wealth, to a more educated territory

and to a lower level of deprivation. Remarkably, a

systematic variation of the mobility entropy distribution exists

across geographical units deﬁned on socio-economic indicators

(Figure 6), delineating subpopulations where a different distri-

bution of entropy emerges based on the occurrence of socio-

economic indicators. This is an important ﬁnding when com-

pared to Song et al. [36], a seminal work on the predictability

of human mobility, which states that mobility entropy is very

stable across different subpopulations delineated by personal

characteristics like gender or age group. The contrast between

our ﬁndings and the result of Song et al. suggests that socio-

economic situations on a city scale are more related to in-

dividual mobility than individual demographic characteristics.

The observed variation also suggests a relation between socio-

economic development and predictability: people residing in

more developed and richer territories show a higher mobility

entropy and hence more unpredictable mobility patterns.

Although the relations between mobility diversity and

socio-economic indicators appear clearly, it is difﬁcult to

formulate a hypothesis to explain their connections. Without

a doubt, the relation between socio-economic indicators and

Fig. 5. The correlations between human mobility measures and socio-economic indicators: (a) mobility entropy vs deprivation index; (b) mobility entropy

vs logarithm of per capita income; (c) mobility entropy vs education rate; (d) mobility entropy vs unemployment rate; (e) radius of gyration vs deprivation

index; (f) radius of gyration vs logarithm of per capita income; (g) radius of gyration vs education rate; (h) radius of gyration vs unemployment rate. We split

the municipalities into ten equal-sized groups according to the deciles of the measures on the x axis. For each group, we compute the mean and the standard

deviation of the measures on the y axis and plot them through the black error bars. ρ indicates the Pearson correlation coefficient between the two measures (in all the cases the p-value < 0.001). We observe that mobility entropy has stronger correlations with socio-economic indicators than radius of gyration.

mobility diversity is bidirectional. It might be that a well-developed territory provides a wide range of activities, an advanced network of public transportation, a higher availability and diversification of jobs, and other elements that foster mobility diversity. Equally, it might be that a higher mobility diversification of individuals leads to higher economic well-being, as it could nourish the economy, establish economic opportunities and facilitate flows of people and goods. Interpretations of the relation between mobility diversity and

socio-economic development are not directly derivable from

the empirical results and should therefore be combined with

more thorough theoretical insights.

Another interesting result is that mobility volume and mobility diversity show opposite correlations, i.e. high values of aggregated mobility volume correspond to low socio-economic development, while high values of aggregated mobility diversity correspond to high socio-economic development (Figure 5). Assuming that human mobility is driven by people’s daily activities, a possible explanation is that people living in well-developed municipalities have a wide range of activities available, resulting in high mobility diversity. In contrast, people living in less developed municipalities, such as municipalities in the countryside, are forced to travel in search of activities that cannot be found in their own municipality, resulting in a large mobility volume. To investigate this hypothesis we compute the correlation between the aggregated mobility diversity and the aggregated mobility volume. We find a negative correlation (ρ = −0.38), which is consistent with this explanation: at municipality scale, high mobility diversity is linked to low mobility volume (Figure 7). We plan to investigate this aspect in more depth in order to understand the reason for this correlation.
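The comparison described above, a Pearson correlation between two aggregated measures together with a decile-based summary like the one used in the figures, can be sketched as follows. The data are synthetic and generated only for illustration. The helper names `pearson` and `decile_summary`, the noise level and the use of the population standard deviation are assumptions of this sketch, not values or choices taken from the study.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def decile_summary(x, y, groups=10):
    """Sort municipalities by x, split them into equal-sized groups and
    return (mean of x, mean of y, std of y) for each group."""
    pairs = sorted(zip(x, y))
    n = len(pairs)
    summary = []
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        xs = [p[0] for p in chunk]
        ys = [p[1] for p in chunk]
        summary.append((statistics.fmean(xs),
                        statistics.fmean(ys),
                        statistics.pstdev(ys)))
    return summary


# Synthetic municipalities (illustrative only): volume decreases with diversity.
rng = random.Random(0)
diversity = [rng.uniform(0.2, 0.9) for _ in range(1000)]
volume = [30.0 - 20.0 * d + rng.gauss(0.0, 5.0) for d in diversity]

rho = pearson(diversity, volume)
bins = decile_summary(diversity, volume)

print(f"rho = {rho:.2f}")
for mean_x, mean_y, std_y in bins:
    print(f"{mean_x:.2f}  {mean_y:6.2f} +/- {std_y:.2f}")
```

Each printed row corresponds to one error bar in a decile plot: the group mean on the x axis, and the mean and standard deviation of the other measure on the y axis.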

VIII. CONCLUSION

In this paper we investigate the relationships between human mobility patterns and socio-economic development in French municipalities. Starting from nation-wide mobile phone data we extract two mobility measures for each individual: radius of gyration, the characteristic distance traveled by an individual, and mobility entropy, the diversification of movements over her locations. We then aggregate the individual mobility measures at municipality level by taking the mean, the median and the variance across the population of users assigned to each municipality. Finally, we compare the aggregated mobility measures with external socio-economic indicators measuring education level, unemployment rate, income and deprivation. We find that both mobility measures are correlated with the socio-economic indicators, and that mobility entropy shows the strongest correlations. We validate our results against a null model which produces zero correlations, allowing us to reject the hypothesis that our findings occurred by chance. Building on these results, we plan to extend our study in three directions.
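As a rough illustration of the first two steps summarized above, the sketch below computes a radius of gyration per user and aggregates it per municipality with the mean, the median and the variance. It uses the standard definition of radius of gyration, the root mean square distance of the observed positions from their center of mass. The planar coordinates in kilometers, the population variance, the user identifiers and the user-to-municipality mapping are all assumptions made for the example.

```python
import math
import statistics
from collections import defaultdict


def radius_of_gyration(points):
    """Root mean square distance of the positions from their center of mass.

    Assumption: points are planar (x_km, y_km) pairs, one per observation.
    """
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    return math.sqrt(sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in points) / n)


def aggregate_by_municipality(values, home):
    """Aggregate a per-user measure over the users assigned to each municipality.

    values: user -> measure; home: user -> municipality (hypothetical mapping).
    """
    groups = defaultdict(list)
    for user, value in values.items():
        groups[home[user]].append(value)
    return {
        municipality: {
            "mean": statistics.fmean(v),
            "median": statistics.median(v),
            "variance": statistics.pvariance(v),  # population variance (assumption)
        }
        for municipality, v in groups.items()
    }


# Toy observations for three hypothetical users.
positions = {
    "u1": [(0.0, 0.0), (6.0, 8.0)],
    "u2": [(0.0, 0.0), (0.0, 0.0)],
    "u3": [(0.0, 0.0), (0.0, 2.0)],
}
home_municipality = {"u1": "A", "u2": "A", "u3": "B"}

rg_by_user = {user: radius_of_gyration(pts) for user, pts in positions.items()}
aggregated = aggregate_by_municipality(rg_by_user, home_municipality)
print(rg_by_user)
print(aggregated)
```

The same aggregation function can be applied unchanged to any other per-user measure, for example a mobility entropy value per user.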

First, since mobile phone data also provide information about social interactions, it would be interesting to extract measures capturing the social behavior of individuals. The seminal work by Eagle et al. showed that social diversity is a good proxy for the socio-economic development of territories [11]. It would be interesting to compare the correlations produced by social diversity and mobility diversity in order to understand and quantify the different roles they play in the socio-economic development of a territory. Is mobility diversity a better proxy for socio-economic development than social diversity?

Fig. 6. The distributions of mobility entropy in the different deciles of deprivation index. We split the municipalities into ten equal-sized groups according to the deciles of deprivation index. For each group, we plot the distribution of mobility entropy. The blue dashed curve represents a fit of the distribution, the red dashed line represents the mean of the distribution. We observe a systematic variation of both mean and variance of the distribution of mobility entropy across the deciles defined by deprivation index.

Fig. 7. The correlation between aggregated mobility diversity and aggregated mobility volume. We split the municipalities into ten equal-sized groups according to the deciles of the measures on the x axis. For each group, we compute the mean and the standard deviation of the measures on the y axis and plot them as black error bars. ρ indicates the Pearson correlation coefficient between the two measures (p-value < 0.001). We observe a negative correlation suggesting that high mobility entropy is linked to low mobility volume, and vice versa.

Second, to learn more about the relationship between the aggregated mobility measures and the socio-economic indicators it would be useful to implement and validate predictive models. Such models can aim to predict the actual value of socio-economic development of a territory, e.g. with regression models, or its class, i.e. the level of socio-economic development of a given geographic unit, as done by classification models. If we find that the accuracy and the prediction errors of the models do not depend on the training and test sets selected, we would have a further confirmation that mobility measures extracted from Big Data give a real possibility to continuously monitor the socio-economic development of territories and provide policy makers with an important tool for decision making.
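One simple way to check whether prediction errors depend on the chosen training and test sets is repeated random hold-out validation. The sketch below fits a one-variable least-squares line on synthetic data and reports the spread of the test error over several random splits. The data, the linear form, the 30% test fraction and all function names are assumptions of this illustration, not a model proposed in this work.

```python
import math
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def rmse(x, y, slope, intercept):
    """Root mean squared error of the fitted line on (x, y)."""
    squared = [(slope * a + intercept - b) ** 2 for a, b in zip(x, y)]
    return math.sqrt(statistics.fmean(squared))


def repeated_holdout(x, y, repeats=20, test_fraction=0.3, seed=0):
    """Test RMSE of the fitted line over several random train/test splits."""
    rng = random.Random(seed)
    idx = list(range(len(x)))
    errors = []
    for _ in range(repeats):
        rng.shuffle(idx)
        cut = int(len(idx) * test_fraction)
        test, train = idx[:cut], idx[cut:]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        errors.append(rmse([x[i] for i in test], [y[i] for i in test],
                           slope, intercept))
    return errors


# Synthetic municipalities (illustrative only): an indicator that
# decreases linearly with aggregated mobility entropy, plus noise.
rng = random.Random(42)
entropy = [rng.uniform(0.2, 0.9) for _ in range(500)]
indicator = [2.0 - 4.0 * e + rng.gauss(0.0, 0.5) for e in entropy]

errors = repeated_holdout(entropy, indicator)
print(f"mean RMSE = {statistics.fmean(errors):.3f}, "
      f"std over splits = {statistics.pstdev(errors):.3f}")
```

A small standard deviation of the error across splits, relative to its mean, would indicate that the result does not hinge on one particular partition of the municipalities. A classification variant would follow the same scheme with a class label and an accuracy score in place of the numeric indicator and the RMSE.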

Third, we plan to investigate the relation between human mobility patterns and socio-economic development from a multidimensional perspective, by including many other indicators to understand which aspects of socio-economic development best correlate with the proposed mobility measures. The new indicators will allow us to refine our study of the relation between mobility measures extracted from Big Data and the socio-economic development of territories. In the meantime, experiences like ours may contribute to shaping the discussion on how to measure some aspects of well-being with Big Data that are available everywhere on earth. If we learn how to use such a resource, we have the potential to create a digital nervous system in support of a generalized, sustainable development of our societies.

ACKNOWLEDGMENT

The authors would like to thank Orange for providing the CDR data, and Giovanni Lima and Pierpaolo Paolini for the contributions developed during their master’s theses. We are

grateful to Carole Pornet and colleagues for providing the

socio-economic indicators and for computing the deprivation

index for the French municipalities. We also thank Maarten

Vanhoof and Lorenzo Gabrielli for the insightful discussions.

This work has been partially funded by projects: Cimplex

(grant agreement 641191), PETRA (grant agreement 609042),

SoBigData RI (grant agreement 654024).

REFERENCES

[1] A. Amini, K. Kung, C. Kang, S. Sobolevsky, and C. Ratti. The impact of social segregation on human mobility in developing and urbanized regions. EPJ Data Science, 3, 2014.

[2] A.-L. Barabási. The origin of bursts and heavy tails in human dynamics. Nature, 435:207–211, 2005.

[3] V. D. Blondel, A. Decuyper, and G. Krings. A survey of results on mobile phone datasets analysis, 2015. arXiv:1502.03406.

[4] J. Blumenstock. Calling for better measurement: Estimating an individual’s wealth and well-being. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD’14. ACM, 2014.

[5] J. Brea, J. Burroni, M. Minnoni, and C. Sarraute. Harnessing mobile phone social network topology to infer users demographic attributes. In Proceedings of the 8th Workshop on Social Network Mining and Analysis, SNAKDD’14. ACM, 2014.

[6] D. Brockmann, L. Hufnagel, and T. Geisel. The scaling laws of human travel. Nature, 439:462, 2006.

[7] E. Cho, S. A. Myers, and J. Leskovec. Friendship and mobility: user movement in location-based social networks. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD’11, pages 1082–1090. ACM, 2011.

[8] P. J. H. Daas, M. J. Puts, and B. Buelens. Big data and official statistics. In The 2013 New Techniques and Technologies for Statistics conference, 2013.

[9] A. Decuyper, A. Rutherford, A. Wadhwa, J. Bauer, G. Krings, T. Gutierrez, V. D. Blondel, and M. A. Luengo-Oroz. Estimating food consumption and poverty indices with mobile phone data. CoRR, abs/1412.2595, 2014.

[10] P. Deville, C. Linard, S. Martin, M. Gilbert, F. R. Stevens, A. E. Gaughan, V. D. Blondel, and A. J. Tatem. Dynamic population mapping using mobile phone data. Proceedings of the National Academy of Sciences (PNAS), 111(45):15888–15893, 2014.

[11] N. Eagle, M. Macy, and R. Claxton. Network Diversity and Economic Development. Science, 328(5981):1029–1031, May 2010.

[12] N. Eagle and A. S. Pentland. Eigenbehaviors: identifying structure in routine. Behavioral Ecology and Sociobiology, 63(7):1057–1066, 2009.

[13] V. Frias-Martinez, V. Soto, J. Virseda, and E. Frias-Martinez. Can cell phone traces measure social development? In Third Conference on the Analysis of Mobile Phone Datasets, NetMob, 2013.

[14] B. Furletti, L. Gabrielli, F. Giannotti, L. Milli, M. Nanni, D. Pedreschi, R. Vivio, and G. Garofalo. Use of mobile phone data to estimate mobility flows. Measuring urban population and inter-city mobility using big data in an integrated approach. In 47th SIS Scientific Meeting of the Italian Statistical Society, Cagliari, June 2014.

[15] F. Galton. Vox populi. Nature, 75(7), 1907.

[16] F. Giannotti, M. Nanni, D. Pedreschi, F. Pinelli, C. Renso, S. Rinzivillo, and R. Trasarti. Unveiling the complexity of human mobility by querying and mining massive trajectory data. The VLDB Journal, 20(5):695–719, 2011.

[17] F. Giannotti, D. Pedreschi, A. Pentland, P. Lukowicz, D. Kossmann, J. L. Crowley, and D. Helbing. A planetary nervous system for social mining and collective awareness. EPJ Special Topics, 214:49–75, 2014.

[18] M. C. González, C. A. Hidalgo, and A.-L. Barabási. Understanding individual human mobility patterns. Nature, 453(7196):779–782, June 2008.

[19] T. Gutierrez, G. Krings, and V. D. Blondel. Evaluating socio-economic state of a country analyzing airtime credit and mobile phone datasets. CoRR, abs/1309.4496, 2013.

[20] S. Jiang, J. Ferreira Jr., and M. González. Clustering daily patterns of human activities in the city. Data Mining and Knowledge Discovery, 25:478–510, 2012.

[21] D. Karamshuk, C. Boldrini, M. Conti, and A. Passarella. Human mobility models for opportunistic networks. IEEE Communications Magazine, 49(12):157–165, 2011.

[22] L. Liao, D. J. Patterson, D. Fox, and H. Kautz. Learning and inferring transportation routines. Artif. Intell., 171(5-6):311–331, Apr. 2007.

[23] J. Lorenz, H. Rauhut, F. Schweitzer, and D. Helbing. How social influence can undermine the wisdom of crowd effect. Proceedings of the National Academy of Sciences (PNAS), 108(22), 2011.

[24] L. Lotero, A. Cardillo, R. Hurtado, and J. Gomez-Gardenes. Several multiplexes in the same city: The role of socioeconomic differences in urban mobility. Available at SSRN 2507816, 2014.

[25] S. Marchetti, C. Giusti, M. Pratesi, N. Salvati, F. Giannotti, D. Pedreschi, S. Rinzivillo, L. Pappalardo, and L. Gabrielli. Small area model-based estimators using big data sources. Journal of Official Statistics, 31(2), 2015.

[26] J. Onnela, J. Saramaki, J. Hyvonen, G. Szabo, D. Lazer, K. Kaski, J. Kertesz, and A. L. Barabasi. Structure and tie strengths in mobile communication networks. Proceedings of the National Academy of Sciences (PNAS), 104(18):7332–7336, 2007.

[27] Indicators and a monitoring framework for the sustainable development goals: Launching a data revolution for the SDGs. A report by the Leadership Council of the Sustainable Development Solutions Network, 20 March 2015.

[28] L. Pappalardo, S. Rinzivillo, Z. Qu, D. Pedreschi, and F. Giannotti. Understanding the patterns of car travel. EPJ Special Topics, 215(1):61–73, 2013.

[29] L. Pappalardo, F. Simini, S. Rinzivillo, D. Pedreschi, F. Giannotti, and A.-L. Barabási. Returners and explorers dichotomy in human mobility. Nature Communications, 6(8166), 2015.

[30] D. Pennacchioli, M. Coscia, S. Rinzivillo, D. Pedreschi, and F. Giannotti. Explaining the product range effect in purchase data. In IEEE International Conference on Big Data, pages 648–656, 2013.

[31] C. Pornet, C. Delpierre, O. Dejardin, P. Grosclaude, L. Launay, L. Guittet, T. Lang, and G. Launoy. Construction of an adaptable European transnational ecological deprivation index: the French version. Journal of Epidemiology and Community Health, 66(11):982–9, 2012.

[32] S. Rinzivillo, L. Gabrielli, M. Nanni, L. Pappalardo, D. Pedreschi, and F. Giannotti. The purpose of motion: Learning activities from individual mobility networks. In Proceedings of International Conference on Data Science and Advanced Analytics, DSAA’14, 2014.

[33] S. Rinzivillo, S. Mainardi, F. Pezzoni, M. Coscia, D. Pedreschi, and F. Giannotti. Discovering the geographical borders of human mobility. Künstliche Intelligenz, 26(3):253–260, 2012.

[34] F. Simini, M. C. González, A. Maritan, and A.-L. Barabási. A universal model for mobility and migration patterns. Nature, 484(7392):96–100, 2012.

[35] C. Smith-Clarke, A. Mashhadi, and L. Capra. Poverty on the cheap: Estimating poverty maps using aggregated mobile communication networks. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 511–520. ACM, 2014.

[36] C. Song, Z. Qu, N. Blumm, and A.-L. Barabási. Limits of predictability in human mobility. Science, 327(5968):1018–1021, 2010.

[37] P. Struijs and P. J. H. Daas. Quality approaches to big data in official statistics. In European conference on Quality in Official Statistics, 2014.

[38] J. Surowiecki. The Wisdom of Crowds: Why the Many Are Smarter than the Few and How Collective Wisdom Shapes Business, Economies, Societies, and Nations. Doubleday Books, New York, 2004.

[39] Data, data, everywhere. The Economist, 25 February 2010.

[40] D. Wang, D. Pedreschi, C. Song, F. Giannotti, and A.-L. Barabási. Human mobility, social ties, and link prediction. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD’11, pages 1100–1108, New York, NY, USA, 2011. ACM.

[41] A world that counts: mobilizing the data revolution for sustainable development. A report by the United Nations Secretary-General’s Independent Expert Advisory Group on a Data Revolution for Sustainable Development (IEAG), November 2014.

COVID-19 is linked to changes in the time–space dimension of human mobility

Article

Full-text available

Jul 2023
Nat. Hum. Behav.

Socio-economic constructs and urban topology are crucial drivers of human mobility patterns. During the coronavirus disease 2019 pandemic, these patterns were reshaped in their components: the spatial dimension represented by the daily travelled distance, and the temporal dimension expressed as the synchronization time of commuting routines. Here, leveraging location-based data from de-identified mobile phone users, we observed that, during lockdowns restrictions, the decrease of spatial mobility is interwoven with the emergence of asynchronous mobility dynamics. The lifting of restriction in urban mobility allowed a faster recovery of the spatial dimension compared with the temporal one. Moreover, the recovery in mobility was different depending on urbanization levels and economic stratification. In rural and low-income areas, the spatial mobility dimension suffered a more considerable disruption when compared with urbanized and high-income areas. In contrast, the temporal dimension was more affected in urbanized and high-income areas than in rural and low-income areas.

Coupling human mobility and social relationships to predict individual socioeconomic status: A graph neural network approach

Article

Jun 2024

Understanding individual's socioeconomic status (SES) can provide supporting information for designing political and economic policies. Acquiring large‐scale economic survey data is time‐consuming and laborious. The widespread mobile phone data, which can reflect human mobility and social network characteristics, has become a low‐cost data source for researchers to infer SES. However, previous studies often oversimplify human mobility features and social network features extracted from mobile phone data into general statistical features, resulting in discounting some important temporal and relational information. Therefore, we propose a comprehensive framework for individual SES prediction that effectively utilizes a combination of human mobility and social relationships. In this framework, Word2Vec module extracts human mobility features from mobile phone positioning data, and graph neural network (GNN) module GraphSAGE captures social network characteristics constructed from call detail records. We evaluated the effectiveness of our proposed approach by training the model with real‐world data in Beijing. According to the experimental results, our proposed hybrid approach outperformed the other methods evidently, demonstrating that human mobility and social links are complementary in the characterization of SES. Coupling human mobility and social links can further deepen our understanding of cities' economic geography.

Human mobility reshaped? Deciphering the impacts of the Covid-19 pandemic on activity patterns, spatial habits, and schedule habits

Article

Full-text available

Mar 2024

Despite the historically documented regularity in human mobility patterns, the relaxation of spatial and temporal constraints, brought by the widespread adoption of telecommuting and e-commerce during the COVID-19 pandemic, as well as a growing desire for flexible work arrangements in a post-pandemic work, indicates a potential reshaping of these patterns. In this paper, we investigate the multifaceted impacts of relaxed spatio-temporal constraints on human mobility, using well-established metrics from the travel behavior literature. Further, we introduce a novel metric for schedule regularity, accounting for specific day-of-week characteristics that previous approaches overlooked. Building on the large body of literature on the impacts of COVID-19 on human mobility, we make use of passively tracked Point of Interest (POI) data for approximately 21,700 smartphone users in the US, and analyze data between January 2020 and September 2022 to answer two key questions: (1) has the COVID-19 pandemic and its associated relaxation of spatio-temporal activity patterns reshaped the different aspects of human mobility, and (2) have we achieved a state of stable post-pandemic “new normal”? We hypothesize that the relaxation of the spatiotemporal constraints around key activities will result in people exhibiting less regular schedules. Findings reveal a complex landscape: while some mobility indicators have reverted to pre-pandemic norms, such as trip frequency and travel distance, others, notably at-home dwell-time, persist at altered levels, suggesting a recalibration rather than a return to past behaviors. Most notably, our analysis reveals a paradox: despite the documented large-scale shift towards flexible work arrangements, schedule habits have strengthened rather than relaxed, defying our initial hypotheses and highlighting a desire for regularity. 
The study’s results contribute to a deeper understanding of the post-pandemic “new normal”, offering key insights on how multiple facets of travel behavior were reshaped, if at all, by the COVID-19 pandemic, and will help inform transportation planning in a post-pandemic world.

Effectiveness of Heatwave Emergency Alert Messages through Analysis of Floating Population

Article

Nov 2023

Youjeong Hong

Heatwave emergency alert messages (EAM) not only provide objective information on heatwave occurrences but also include behavioral guidelines, such as avoiding outdoor activities or going out to respond to heatwaves and minimize damage. To investigate the EAM's effectiveness, we analyzed the changes in the floating population during 2021 using Seoul dong-unit and hourly location-based mobile big data. We also examined what socioeconomic and physical environmental characteristics could cause differences in the effectiveness, focusing on the fact that EAM may have different effects depending on regional factors. The findings revealed a notable reduction in the overall floating population of Seoul, ranging from 0.1 to 3.4%. When examined by region, an average decrease of 2.4 to 5.6% was observed, indicating the different effects of EAM by region. Further analysis of regional characteristics highlighted that areas with low EAM effectiveness were characterized by a lower ratio of senior residents, higher employment concentration, and higher accessibility to public transportation. Additionally, the number of household members, gender distribution, average age, and presence of green spaces can contribute to the gap in the effectiveness of EAM between regions.

How enlightened self-interest guided global vaccine sharing benefits all: A modeling study

Article

Full-text available

Dec 2023

Background Despite consensus that vaccines play an important role in combatting the global spread of infectious diseases, vaccine inequity is still a prevalent issue due to a deep-seated mentality of self-priority. We aimed to evaluate the existence and possible outcomes of a more equitable global vaccine distribution and explore a concrete incentive mechanism that promotes vaccine equity. Methods We designed a metapopulation epidemiological model that simultaneously considers global vaccine distribution and human mobility, which we then calibrated by the number of infections and real-world vaccination records during the coronavirus disease 2019 (COVID-19) pandemic from March 2020 to July 2021. We explored the possibility of the enlightened self-interest incentive mechanism, which comprises improving one’s own epidemic outcomes by sharing vaccines with other countries, by evaluating the number of infections and deaths under various vaccine sharing strategies using the proposed model. To understand how these strategies affect the national interests, we distinguished imported from local cases for further cost-benefit analyses that rationalise the enlightened self-interest incentive mechanism behind vaccine sharing. Results The proposed model accurately reproduces the real-world cumulative infections for both global and regional epidemics (R²>0.990), which can support the following evaluations of different vaccine sharing strategies: High-income countries can reduce 16.7 (95% confidence interval (CI) = 8.4-24.9, P < 0.001) million infection cases and 82.0 (95% CI = 76.6-87.4, P < 0.001) thousand deaths on average by more actively sharing vaccines in an enlightened self-interest manner, where the reduced internationally imported cases outweigh the threat from increased local infections. 
Such vaccine sharing strategies can also reduce 4.3 (95% CI = 1.2-7.5, P < 0.01) million infections and 7.0 (95% CI = 5.7-8.3, P < 0.001) thousand deaths in middle- and low-income countries, effectively benefiting the whole global population. Lastly, the more equitable vaccine distribution could help largely reduce the global mobility reduction needed for pandemic control. Conclusions The incentive mechanism of enlightened self-interest we explored here could motivate vaccine equity by realigning the national interest to more equitable vaccine distributions. The positive results could promote multilateral collaborations in global vaccine redistribution and reconcile conflicted national interests, which could in turn benefit the global population.

Assessing Human Mobility and Its Climatic and Socioeconomic Factors for Sustainable Development in Sub-Saharan Africa

Article

Full-text available

Jul 2023

Promoting human mobility and reducing inequality among countries are the Sustainable Development Goals’ (SDGs) targets. However, measuring human mobility, assessing its heterogeneity and changes, and exploring associated mechanisms and context effects are still key challenges, especially for developing countries. This study attempts to review the concept of human mobility with complex thinking, assess human mobility across forty countries in Sub-Saharan Africa (SSA), and examine the effect of climatic and socioeconomic factors. Based on the coined definition of human mobility, international migration and cross-border trips are taken to assess human mobility in terms of permanent migration and temporary moves. The forty SSA countries are hence classified into four mobility groups. Regression models are performed to identify key determinants and estimate their effects on mobility. The results reveal that seven of these forty countries had a high mobility, whereas most experienced a decline in permanent migration. Lesotho, Cabo Verde, and Namibia presented high temporary moves, while Eritrea, Rwanda, Equatorial Guinea, and Liberia had a high permanent migration. Climatic and socioeconomic conditions demonstrated significant effects on mobility but were different for temporary moves and permanent migration. Wet extremes reduced mobility, whereas extreme temperature variations had positive effects. Dry extremes promoted permanent migration but inhibited temporary moves. Economic wealth and political instability promoted permanent migration, while the young population counteracted temporary moves. Food insecurity and migrant networks stimulated human mobility. The analysis emphasises the interest in analysing human mobility for risk reduction and sustainability management at the multi-county level.

Digitalization era: Investigating the spatial interplay between cyber human activity and economy with a hierarchical framework

Article

Apr 2024
APPL GEOGR

The development of Information and Communication Technology has shifted human activity from offline to online, and promoted digital economy. These changes challenge traditional methodologies relying on physical human activity as a micro-level reflection of the macro-level economy. To address this, a hierarchical framework is proposed to characterize cyber human activity, incorporating activity diversity, size, and preference. Then, Ordinary Least Squares and Geographically Weighted Regression models are used to examine the spatial interplay between cyber human activity and township-level economy. K-Medoids clustering is further applied to coefficients in cyber GWR model to reveal mechanisms influencing local economies. Taking diverse geospatial data in Jilin Province, China as an example, the result indicates that the significant extreme values of cyber human activity have divided Jilin Province into four subregions, closely aligning with local economic segmentation. Moreover, the local economy can be better reflected by cyber human activity rather than physical ones, especially in highly digitized regions. Furthermore, the regions with similar influence mechanisms cluster geographically. These clusters are categorized into three types, namely, dynamics, robust, and balanced, representing different mechanisms influencing the local economy. Practical and theoretical implications are discussed, including using cyber human activity in assessing the economy and implementing adaptable economic policies at township level.

Inferring Carbon Emissions from Human Mobility Data

Preprint

Full-text available

Nov 2023

Accurate identification, quantification, and continual monitoring of carbon emissions constitute pivotal elements for proactive climate interventions. Conventional methodologies like direct measurement, including point-source assessments or remote sensing, often face challenges related to high costs or limited accuracy. Especially in low- and middle-income nations experiencing escalating emissions, carbon monitoring heavily relies on the administrative capabilities of local governments, which frequently lack adequate monitoring infrastructures. Addressing this predicament, our study introduces a computational framework to forecast CO 2 emissions by leveraging comprehensive observable human activity data from third-party sources. Our findings elucidate a robust correlation between multi-origin CO 2 emissions and human mobility ( r = 0.89). Notably, machine learning models adeptly predict these emissions by integrating characteristics extracted from temporally aggregated, anonymized mobility networks (R ² ≈1.0). We demonstrate that the model effectively captures the notable reduction in CO 2 emissions during the COVID-19 lockdown, with both human mobility and CO 2 emissions in China decreasing by 56.97% and 32.45%, respectively. The prediction accuracy remains high for countries with varying social economic development, such as the U.S., Italy and Mexico. This study presents an inexpensive, real-time, and robust method of quantifying CO 2 emissions on a large scale with high precision, and it could facilitate tailored CO 2 emission reduction strategies, grounded in robust scientific evidence derived from the dynamics of human mobility.

Research on fine classification of urban rail transit passengers based on two-step clustering-FCM fusion algorithm

Conference Paper

Dec 2023

Associations between socio‐demographic factors and change in mobility due to COVID‐19 restrictions in Ontario, Canada using geographically weighted regression

Article

Sep 2023

Transportation research has shown that socio‐demographic factors impact people's mobility patterns. During the COVID‐19 pandemic, some of these effects have changed in accordance with changing mobility needs adapting to the pandemic, including restrictions on in‐person gatherings, closure of in‐person businesses, and working from home. We investigate two gaps in current knowledge in this area of transportation research: to what extent the associations between socio‐demographic factors and mobility metrics have changed, and how these associations vary across geographic space. We used aggregate deidentified cell tower location data to measure two mobility metrics—movement time and radius of gyration—and socio‐demographic data from the 2016 Canadian Census to model these associations across Ontario, Canada in 2020 using a linear model and a geographically weighted regression model. We find that certain associations between socio‐demographics and mobility have changed from what we previously observed before the pandemic, and we can see the variation of these associations across space. These findings will improve our understanding of how socio‐demographic factors affect mobility patterns in different communities and demonstrate the importance of measuring these associations at a more fine‐grained level using models that consider spatial variation to best reflect the nature of these associations.

Returners and explorers dichotomy in human mobility

Article

Full-text available

Sep 2015

The availability of massive digital traces of human whereabouts has offered a series of novel insights on the quantitative patterns characterizing human mobility. In particular, numerous recent studies have lead to an unexpected consensus: the considerable variability in the characteristic travelled distance of individuals coexists with a high degree of predictability of their future locations. Here we shed light on this surprising coexistence by systematically investigating the impact of recurrent mobility on the characteristic distance travelled by individuals. Using both mobile phone and GPS data, we discover the existence of two distinct classes of individuals: returners and explorers. As existing models of human mobility cannot explain the existence of these two classes, we develop more realistic models able to capture the empirical findings. Finally, we show that returners and explorers play a distinct quantifiable role in spreading phenomena and that a correlation exists between their mobility patterns and social interactions.

Big Data as a Source for Official Statistics

Article

Full-text available

Jun 2015
J Offic Stat

More and more data are being produced by an increasing number of electronic devices physically surrounding us and on the internet. The large amount of data and the high frequency at which they are produced have resulted in the introduction of the term ‘Big Data’. Because these data reflect many different aspects of our daily lives and because of their abundance and availability, Big Data sources are very interesting from an official statistics point of view. This article discusses the exploration of both opportunities and challenges for official statistics associated with the application of Big Data. Experiences gained with analyses of large amounts of Dutch traffic loop detection records and Dutch social media messages are described to illustrate the topics characteristic of the statistical analysis and use of Big Data.

Small Area Model-Based Estimators Using Big Data Sources

Article

Full-text available

Jun 2016
J Offic Stat

The timely, accurate monitoring of social indicators, such as poverty or inequality, on a fine-grained spatial and temporal scale is a crucial tool for understanding social phenomena and policymaking, but poses a great challenge to official statistics. This article argues that an interdisciplinary approach, combining the body of statistical research in small area estimation with the body of research in social data mining based on Big Data, can provide novel means to tackle this problem successfully. Big Data derived from the digital crumbs that humans leave behind in their daily activities are in fact providing ever more accurate proxies of social life. Social data mining from these data, coupled with advanced model-based techniques for fine-grained estimates, has the potential to provide a novel microscope through which to view and understand social complexity. This article suggests three ways to use Big Data together with small area estimation techniques, and shows how Big Data has the potential to mirror aspects of well-being and other socioeconomic phenomena.

The impact of social segregation on human mobility in developing and industrialized regions

Article

Full-text available

Dec 2014

This study leverages mobile phone data to analyze human mobility patterns in a developing nation, especially in comparison to those of a more industrialized nation. Developing regions, such as the Ivory Coast, are marked by a number of factors that may influence mobility, such as less infrastructural coverage and maturity, fewer economic resources and less stability, and in some cases, more cultural and language-based diversity. By comparing mobile phone data collected from the Ivory Coast to similar data collected in Portugal, we are able to highlight both qualitative and quantitative differences in mobility patterns - such as differences in likelihood to travel, as well as in the time required to travel - that are relevant to policy, infrastructure, and economic development. Our study illustrates how cultural and linguistic diversity in developing regions (such as Ivory Coast) can present challenges to mobility models that perform well and were conceptualized in less culturally diverse regions. Finally, we address these challenges by proposing novel techniques to assess the strength of borders in a regional partitioning scheme and to quantify the impact of border strength on mobility model accuracy.

A survey of results on mobile phone datasets analysis

Article

Full-text available

Feb 2015

In this paper, we review some advances made recently in the study of mobile phone datasets. This area of research emerged a decade ago, with the increasing availability of large-scale anonymized datasets, and has grown into a stand-alone topic. We will survey the contributions made so far on the social networks that can be constructed with such data, the study of personal mobility, geographical partitioning, urban planning, and help towards development as well as security and privacy issues.

Estimating Food Consumption and Poverty Indices with Mobile Phone Data

Article

Full-text available

Nov 2014

Recent studies have shown the value of mobile phone data to tackle problems related to economic development and humanitarian action. In this research, we assess the suitability of indicators derived from mobile phone data as a proxy for food security indicators. We compare the measures extracted from call detail records and airtime credit purchases to the results of a nationwide household survey conducted at the same time. Results show high correlations (> .8) between mobile phone data derived indicators and several relevant food security variables such as expenditure on food or vegetable consumption. This correspondence suggests that, in the future, proxies derived from mobile phone data could be used to provide valuable up-to-date operational information on food security throughout low and middle income countries.

Dynamic population mapping using mobile phone data

Article

Full-text available

Oct 2014
P NATL ACAD SCI USA

Significance: Knowing where people are is critical for accurate impact assessments and intervention planning, particularly those focused on population health, food security, climate change, conflicts, and natural disasters. This study demonstrates how data collected by mobile phone network operators can cost-effectively provide accurate and detailed maps of population distribution over national scales and any time period while guaranteeing phone users’ privacy. The methods outlined may be applied to estimate human population densities in low-income countries where data on population distributions may be scarce, outdated, and unreliable, or to estimate temporal variations in population density. The work highlights how facilitating access to anonymized mobile phone data might enable fast and cheap production of population maps in emergency and data-scarce situations.

How social influence can undermine the wisdom of crowd effect

Article

Jan 2010

Several Multiplexes in the Same City: The Role of Socioeconomic Differences in Urban Mobility

Chapter

Feb 2016

In this work we analyze the architecture of real urban mobility networks from the multiplex perspective. In particular, based on empirical data about the mobility patterns in the cities of Bogotá and Medellín, each city is represented by six multiplex networks, each one representing the origin-destination trips performed by a subset of the population corresponding to a particular socioeconomic status. The nodes of each multiplex are the different urban locations whereas links represent the existence of a trip from one node (origin) to another (destination). On the other hand, the different layers of each multiplex correspond to the different existing transportation modes. By exploiting the characterization of multiplex transportation networks combining different transportation modes, we aim at characterizing the mobility patterns of each subset of the population. Our results show that the socioeconomic characteristics of the population have an extraordinary impact on the layer organization of these multiplex systems.

Harnessing Mobile Phone Social Network Topology to Infer Users Demographic Attributes

Conference Paper

Aug 2014

We study the structure of the social graph of mobile phone users in the country of Mexico, with a focus on demographic attributes of the users (more specifically the users' age). We examine assortativity patterns in the graph, and observe a strong age homophily in the communication preferences. We propose a graph-based algorithm for the prediction of the age of mobile phone users. The algorithm exploits the topology of the mobile phone network, together with a subset of known users' ages (seeds), to infer the age of remaining users. We provide the details of the methodology, and show experimental results on a network GT with more than 70 million users. By carefully examining the topological relations of the seeds to the rest of the nodes in GT, we find topological metrics which have a direct influence on the performance of the algorithm. In particular we characterize subsets of users for which the accuracy of the algorithm is 62% when predicting between 4 age categories (whereas a pure random guess would yield an accuracy of 25%). We also show that we can use the probabilistic information computed by the algorithm to further increase its inference power to 72% on a significant subset of users.

Recommended publications

Urban hourly water demand prediction using human mobility data

An analytical framework to nowcast well-being using mobile phone data

The impact of biases in mobile phone ownership on estimates of human mobility

Influence of social relations on human mobility and sociality: a study of social ties in a cellular...