
Using Big Data to study the link between human mobility and socio-economic development

Luca Pappalardo
Department of Computer Science
University of Pisa, Italy
Email: lpappalardo@di.unipi.it
Dino Pedreschi
Department of Computer Science
University of Pisa, Italy
Email: pedre@di.unipi.it
Zbigniew Smoreda
SENSE
Orange Lab, France
Email: zbigniew.smoreda@orange.com
Fosca Giannotti
Institute of Information Science and Technologies
National Research Council (CNR), Italy
Email: fosca.giannotti@isti.cnr.it
Abstract—Big Data offer nowadays the potential capability
of creating a digital nervous system of our society, enabling the
measurement, monitoring and prediction of relevant aspects of
socio-economic phenomena in quasi real time. This potential has
fueled, in the last few years, a growing interest around the usage
of Big Data to support official statistics in the measurement of
individual and collective economic well-being. In this work we
study the relations between human mobility patterns and socio-
economic development. Starting from nation-wide mobile phone
data we extract a measure of mobility volume and a measure
of mobility diversity for each individual. We then aggregate the
mobility measures at municipality level and investigate the cor-
relations with external socio-economic indicators independently
surveyed by an official statistics institute. We find three main
results. First, aggregated human mobility patterns are correlated
with these socio-economic indicators. Second, the diversity of
mobility, defined in terms of entropy of the individual users’
trajectories, exhibits the strongest correlation with the external
socio-economic indicators. Third, the volume of mobility and the
diversity of mobility show opposite correlations with the socio-
economic indicators. Our results, validated against a null model,
open an interesting perspective to study human behavior through
Big Data by means of new statistical indicators that quantify and
possibly “nowcast” the socio-economic development of our society.
I. INTRODUCTION
The Big Data originating from the digital breadcrumbs of
human activities, sensed as a by-product of the ICT systems
we use every day, allow us to scrutinize the ground truth of
individual and collective behavior at an unprecedented detail
[39]. Multiple dimensions of our social life have Big Data
proxies nowadays. Our social relationships leave traces in the
network of our phone or email contacts, in the friendship
links of our favorite social networking site. Our shopping
patterns leave traces in the transaction records of our pur-
chases. Our movements leave traces in the records of our
mobile phone calls, in the GPS tracks of our on-board navi-
gation systems. Sensing Big Data at a societal scale has the
potential of providing a powerful social microscope, which
can help us understand many complex and hidden socio-economic
phenomena. Such a challenge clearly requires high-level
analytics, modeling and reasoning across all the social
dimensions above, an activity that is often referred to as
“social mining”: the task of making sense of Big Data by
extracting meaningful information from large, messy and noisy
data [17]. In recent years, also stimulated by national official
statistics institutes and the United Nations [41], researchers
from different disciplines have started to use Big Data and
social mining to support official statistics in the measurement
of individual and collective well-being [8][37]. Most works
in the literature analyze mobile phone data to study the
relations between communication patterns and well-being
[11][4]. In this paper we analyze Big Data from
mobile phones to study the link between individuals’ mobility
patterns and the socio-economic development of cities. We try
to answer the following intriguing question: Can we monitor
and possibly predict the socio-economic development of cities
just by observing human movements of their residents through
the lens of Big Data? The answer to this fascinating question,
as we show in this paper, is related to the concepts of mobility
volume and mobility diversity.
We know that bio-diversity is crucial to the health of natural
ecosystems and for the balance, or well-being, of plant and
animal species that inhabit them. Diversity is a key concept
also for the social ecosystems: from Francis Galton who
showed that the diversity of opinion in a crowd is essential to
answer difficult questions [15] to more recent works showing
that the diversity of social contacts is associated with
socio-economic indicators of well-being [11][19][4], social diversity
has proven to be essential in many contexts [38][23]. In this
paper we argue that diversity is a key concept also for the
mobility ecosystem, and that the volume and the diversity of
mobility patterns have high predictive power with respect to
the socio-economic development of cities.
Starting from large-scale mobile phone data we quantify
the relations between human mobility and socio-economic de-
velopment in France using municipality-level official statistics
as external comparison measurements. We first define two
individual measures over mobile phone data which describe
two aspects of individual mobility behavior: the volume of
mobility, i.e. the characteristic traveled distance of an individ-
ual [18], and the diversity of mobility, i.e. the diversification
of movements of an individual over her locations [36]. Each
individual measure is computed for each of the several million
users in our dataset based on their locations and calls as
recorded in the mobile phone data. We then aggregate the
two individual measures at the level of French municipalities
and explore the correlations between the aggregated measures
and external indicators covering different aspects of socio-
economic development: wealth, employment, education and
deprivation. We find that both mobility measures correlate
with the external socio-economic indicators, and in particular
the measure of mobility diversity shows much stronger cor-
relations. We validate our results against a null model which
produces zero correlations allowing us to reject the hypothesis
that our results occurred by chance. Finally, we observe that
at municipality level mobility volume and mobility diversity
are negatively correlated with each other and show opposite
correlations with the socio-economic indicators, suggesting
that they play different roles in the socio-economic
development of cities.
The importance of our findings is twofold. On one side,
we show that mobility diversity and mobility volume are key
concepts for the well-being of our cities that can be used
to understand deeply the complexity of our interconnected
society. On the other side, our results reveal the high potential
of Big Data in providing representative, relatively inexpensive
and readily available measures as proxies of socio-economic
development and well-being. New statistical indicators can
be defined to describe the well-being of a territory, in order
to support official statistics when such measurements are not
possible using traditional censuses and surveys [41][27].
The paper is organized as follows. Section II reviews the
main works in the study of Big Data for measuring development
and well-being. Section III and Section IV present
respectively the mobile phone data and the socio-economic
indicators we use in our study. Section V introduces the
measures of mobility and explains how to compute them on
the mobile phone data. In Section VI we show the main results
of our study and discuss them in Section VII. Finally, Section
VIII concludes the paper discussing open lines of new research.
II. RELATED WORK
Big Data offer nowadays the potential capability of creating
a digital nervous system of our society, enabling the measure-
ment, monitoring and prediction of relevant aspects of human
behavior [17]. For example the availability of massive digital
traces of human whereabouts, such as GPS traces from private
vehicles and mobile phone data, has offered novel insights
on the quantitative patterns characterizing human mobility
[6][18][16]. Studies from different disciplines document a
stunning heterogeneity of human travel patterns as measured
by the so-called radius of gyration [18][28], and at the same
time observe a high degree of predictability as measured by
the mobility entropy [36][12]. The patterns of human mobility
have been used to build generative models of individual human
mobility [21][29], generative models to describe human migra-
tion flows [34], methods for profiling individuals according
to their recurrent and total mobility patterns [29], methods
to discover geographic borders according to recurrent trips of
private vehicles [33], methods to predict the formation of social
ties [7][40], and classification models to predict the kind of
activity associated with individuals’ trips solely on the basis of
the observed displacements [22][20][32].
The last few years have also witnessed a growing interest
around the usage of Big Data to support official statistics in the
measurement of individual and collective well-being [8][37].
Even the United Nations, in two recent reports, stimulate the
usage of Big Data to investigate the patterns of phenomena
relative to people’s health and well-being [41][27]. The vast
majority of works in the context of Big Data for official statis-
tics are based on the analysis of mobile phone data, the so-
called CDR (Call Detail Records) of calling and texting activity
of users. Mobile phone data, indeed, guarantee the repeatability
of experiments on different countries and geographical scales
since they can be retrieved nowadays in every country due to
their worldwide diffusion [3]. A set of recent works use mobile
phone data as a proxy for socio-demographic variables. Deville
et al., for example, show how the ubiquity of mobile phone
data can be exploited to provide accurate and detailed maps
of population distribution over national scales and any time
period [10]. Brea et al. study the structure of the social graph
of mobile phone users of Mexico and propose an algorithm for
the prediction of the age of mobile phone users [5]. Another
recent work uses mobile phone data to study inter-city mobility
and develop a methodology to detect the fraction of residents,
commuters and visitors within each city [14].
Much effort has been devoted in recent years to the use
of mobile phone data to study the relationships between
human behavior and collective socio-economic development.
The seminal work by Eagle et al. analyzes a nationwide
mobile phone dataset and shows that, in the UK, regional
communication diversity is positively associated with a
socio-economic ranking [11]. Gutierrez et al. address the issue
of mapping poverty with mobile phone data through the
analysis of airtime credit purchases in Ivory Coast [19].
Blumenstock shows preliminary evidence of a relationship
between individual wealth and the history of mobile phone
transactions [4]. Decuyper et al. use mobile phone data to study
food security indicators finding a strong correlation between
the consumption of vegetables rich in vitamins and airtime
purchase [9]. Frias-Martinez et al. analyze the relationship
between human mobility and the socio-economic status of
urban zones, presenting which mobility indicators correlate
best with socio-economic levels and building a model to
predict the socio-economic level from mobile phone traces
[13]. Lotero et al. analyze the architecture of urban mobility
networks in two Latin-American cities from the multiplex per-
spective. They discover that the socio-economic characteristics
of the population have an extraordinary impact in the layer
organization of these multiplex systems [24]. Amini et al. use
mobile phone data to compare human mobility patterns of
a developing country (Ivory Coast) and a developed country
(Portugal). They show that cultural diversity in developing
regions can present challenges to mobility models defined in
less culturally diverse regions [1]. Smith-Clarke et al. analyze
the aggregated mobile phone data of two developing countries
and extract features that are strongly correlated with poverty
indexes derived from census data [35].
Other recent works use different types of mobility data
to show that Big Data on human movements can be used
to support official statistics and understand people’s purchase
needs. Pennacchioli et al., for example, provide empirical
evidence of the influence of purchase needs on human mobility,
analyzing the purchases of an Italian supermarket chain to
show a range effect of products: the more sophisticated the
needs they satisfy, the more the customers are willing to travel
[30]. Marchetti et al. perform a study on a regional level
analyzing GPS tracks from cars in Tuscany to extract measures
of human mobility at province and municipality level, finding a
strong correlation between the mobility measures and a poverty
index independently surveyed by the Italian official statistics
institute [25].
III. MOBILE PHONE DATA
Mobile phones are nowadays very common technological
devices carried by individuals in their daily routine, offering
a good proxy to study the patterns of human mobility. In
our study, we exploit the access to a dataset of Call Detail
Records (CDR) gathered by Orange mobile phone opera-
tor, recording 200 million calls made during 45 days (from
2007/09/01 to 2007/10/15) by 20 million anonymized users in
France. CDRs collect geographical, temporal and interaction
information on mobile phone use and show a great potential
to empirically investigate human dynamics on a society wide
scale [18][2][26]. Each time an individual makes a call or
sends a text message, the mobile phone operator registers the
connection between the caller and the callee, the time of the
phone activity and the phone tower communicating with the
served phone, allowing us to reconstruct the user’s time-resolved
trajectory [18]. Table I shows the format of CDR data and
tower location data in our dataset. To protect users’ private
information, all the users are anonymized by replacing their
identifiers with hashed values. The item “timestamp” records
the exact time of the phone activity, while “tower” is the
identifier of the wireless tower that is serving the caller’s
call or text message. The item “mode” is simply used
to distinguish between calls and text messages.
(a)
caller    callee    timestamp           tower   mode
4F80460   4F80331   2007/09/10 23:34    36      call
2B01359   9H80125   2007/10/10 01:12    38      SMS
2B19935   6W1199    2007/10/10 01:43    38      call
...       ...       ...                 ...     ...

(b)
tower   latitude   longitude
36      49.54      3.64
37      48.28      1.258
38      48.22      -1.52
...     ...        ...

TABLE I. THE FORMATS OF CDR DATA (A) AND TOWER DATA (B).
To focus on individuals with reliable statistics we carry
out a preprocessing step. We select only users with a call
frequency f = N/45 > 0.5, where N is the number of calls
made by the user and 45 days is the length of our period of
observation; that is, we delete all the users with fewer than
one call every two days (on average over the observation
period). The resulting dataset contains the mobility
trajectories of 6 million active users.
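
The activity filter described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes CDR records are available in memory as tuples in the column order of Table I(a), and it counts every record in which a user appears as caller (whether text messages count toward N is an assumption). The function name `active_users` is hypothetical.

```python
from collections import Counter

OBSERVATION_DAYS = 45   # length of the observation period stated in the text
MIN_FREQUENCY = 0.5     # threshold on f = N / 45 (one call every two days)


def active_users(records, days=OBSERVATION_DAYS, min_frequency=MIN_FREQUENCY):
    """Return the callers whose frequency f = N / days exceeds min_frequency.

    records: iterable of (caller, callee, timestamp, tower, mode) tuples.
    Assumption: every record counts toward N, regardless of mode.
    """
    calls_per_user = Counter(caller for caller, *_ in records)
    return {user for user, n in calls_per_user.items() if n / days > min_frequency}


# Toy data: u1 has 30 records (f = 0.67), u2 has 10 records (f = 0.22).
toy_records = (
    [("u1", "u2", "2007/09/10 23:34", 36, "call")] * 30
    + [("u2", "u1", "2007/10/10 01:12", 38, "SMS")] * 10
)
kept = active_users(toy_records)
```

On the full dataset the same counting step would be distributed rather than run in memory; the sketch only makes the threshold explicit.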
IV. SOCIO-ECONOMIC DATA
As external socio-economic indicators, we use a dataset
provided by the French National Institute of Statistics and
Economic Studies (INSEE) about socio-economic indicators
in 2007 for all the French municipalities with more than
1,000 official residents. We collect data about four aspects
of socio-economic development: (i) per capita income, the
mean income in a given municipality; (ii) education rate, the
fraction of residents of a municipality with primary education
only; (iii) unemployment rate, the ratio between unemployed
individuals and all the residents of a municipality; (iv)
deprivation index, constructed by selecting among variables
reflecting individual experience of deprivation and combining
them into a single score by a linear combination with specific
choices for coefficients [31] (see footnote 1):
deprivation = 0.11 × Overcrowding
            + 0.34 × No access to electric heating
            + 0.55 × Non-owner
            + 0.47 × Unemployment
            + 0.23 × Foreign nationality
            + 0.52 × No access to a car
            + 0.37 × Unskilled worker-farm worker
            + 0.45 × Household with 6+ persons
            + 0.19 × Low level of education
            + 0.41 × Single-parent household.
Preliminary validation showed a high association between
the French deprivation index and both income and education
values in French municipalities, partly supporting its ability
to measure socio-economic status [31]. Figure 1 shows the
distribution of the four socio-economic indicators across the
French municipalities.
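
The linear combination that defines the deprivation index can be written as a short function. This is a sketch only: the coefficients are those listed above, while the dictionary keys are hypothetical names chosen for illustration, and the input variables are assumed to be already prepared (scaled) as required by the procedure in [31].

```python
# Coefficients as listed in the text; the key names are illustrative assumptions.
DEPRIVATION_WEIGHTS = {
    "overcrowding": 0.11,
    "no_electric_heating": 0.34,
    "non_owner": 0.55,
    "unemployment": 0.47,
    "foreign_nationality": 0.23,
    "no_car": 0.52,
    "unskilled_or_farm_worker": 0.37,
    "household_6_plus": 0.45,
    "low_education": 0.19,
    "single_parent_household": 0.41,
}


def deprivation_index(variables):
    """Weighted sum of the ten deprivation variables of one municipality.

    variables: dict mapping each key of DEPRIVATION_WEIGHTS to a number.
    A missing key raises KeyError, so incomplete inputs are not silently scored.
    """
    return sum(weight * variables[name] for name, weight in DEPRIVATION_WEIGHTS.items())


# Toy municipality where only the "non_owner" variable is non-zero.
toy = {name: 0.0 for name in DEPRIVATION_WEIGHTS}
toy["non_owner"] = 1.0
score = deprivation_index(toy)
```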
V. MEASURING HUMAN MOBILITY
Starting from the trajectories of an individual we consider
two aspects of individual mobility: the volume of mobility, i.e.
how large the typical distance traveled by an individual is, and
the diversity of mobility, i.e. how the trips of an individual are
distributed over the locations visited. The radius of gyration $r_g$
is a measure of mobility volume and indicates the characteristic
distance traveled by an individual [18][28][29]. It characterizes
the spatial spread of the phone towers visited by an individual
u from her center of mass (i.e. the weighted mean point of the
phone towers visited by an individual), defined as:

\[ r_g(u) = \sqrt{\frac{1}{N} \sum_{i \in L} n_i (r_i - r_{cm})^2} \tag{1} \]

where $L$ is the set of phone towers visited by the individual,
$n_i$ is the individual’s visitation frequency of phone tower $i$,
$N = \sum_{i \in L} n_i$ is the sum of all the single frequencies, $r_i$
and $r_{cm}$ are the vectors of coordinates of phone tower $i$
and center of mass respectively. To clarify the concept, let
us consider Figure 2 which displays the radius of gyration
of two individuals in our dataset. User A travels between
locations that are close to each other, resulting in a low radius
of gyration $r_g(A)$. In contrast, user B has a large radius of
gyration since the locations she visits are far apart from each
other. Figure 3a shows the distribution of radius of gyration
across the individuals in our dataset. The distribution is well
approximated by a heavy tail distribution indicating a large
variability of the radii, a confirmation of previous results on
both GSM data [18] and GPS data [28].

Footnote 1: The variables used to compute the deprivation index (Overcrowding, No
access to electric heating, etc.) refer to socio-economic status in 2007. We used
the procedure in [31] to compute the deprivation index on these variables.

Fig. 1. The distribution of socio-economic variables across the French municipalities. (a) Distribution of logarithm of per capita income; (b) distribution
of education rate; (c) distribution of unemployment rate; (d) distribution of deprivation index. We observe that all the distributions show clear peaks highlighting
the presence of typical socio-economic values across the French municipalities.
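
The radius of gyration of Eq. (1) can be sketched for a single user as below. This is an illustrative implementation under stated assumptions, not the authors' code: the center of mass is taken as the visit-weighted mean of latitude and longitude, and distances are measured with the haversine formula on a spherical Earth. The names `haversine_km` and `radius_of_gyration` are hypothetical.

```python
import math
from collections import Counter

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(p, q):
    """Great-circle distance in km between two (latitude, longitude) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def radius_of_gyration(visits, tower_coords):
    """Radius of gyration (km) of one user.

    visits: list of tower identifiers, one entry per phone record.
    tower_coords: dict tower identifier -> (latitude, longitude).
    """
    counts = Counter(visits)          # n_i, visitation frequency of tower i
    total = sum(counts.values())      # N, sum of all frequencies
    # Center of mass: visit-weighted mean of the coordinates (an assumption).
    center = (
        sum(n * tower_coords[t][0] for t, n in counts.items()) / total,
        sum(n * tower_coords[t][1] for t, n in counts.items()) / total,
    )
    squared = sum(n * haversine_km(tower_coords[t], center) ** 2
                  for t, n in counts.items())
    return math.sqrt(squared / total)


# Towers 36 and 38 from Table I(b), visited twice and once respectively.
coords = {36: (49.54, 3.64), 38: (48.22, -1.52)}
rg = radius_of_gyration([36, 36, 38], coords)
```

A user who only ever appears at one tower gets a radius of zero, which matches the intuition behind user A in Figure 2.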
[Figure 2: two panels, (a) user A and (b) user B; legend: home location, center of mass, radius of gyration]
Fig. 2. The radius of gyration of two users in our dataset. The figure
shows the spatial distribution of phone towers (circles). The size of circles is
proportional to their visitation frequency; the red location indicates the most
frequent location $L_1$ (the location where the user makes the highest number
of calls during nighttime). The cross indicates the position of the center of
mass; the black dashed line indicates the radius of gyration. User A has a
small radius of gyration because she travels between locations that are close
to each other. User B has a high radius of gyration because the locations she
visits are far apart from each other.
We measure the mobility diversity of an individual u by
using the Shannon entropy [36]:

\[ S(u) = -\frac{\sum_{e \in E} p(e) \log p(e)}{\log N} \tag{2} \]

where $e = (a, b)$ represents a trip between an origin phone
tower and a destination phone tower, $E$ is the set of all the
possible origin-destination pairs, $p(e)$ is the probability of
observing a movement between phone towers $a$ and $b$, and
$N$ is the total number of trajectories of individual u. Mobility
entropy is high when an individual performs many different
trips from a variety of origins and destinations; it is low when
she performs a small number of recurring trips. To clarify
the concept let us consider Figure 4 which shows a network
visualization of the mobility entropy of two individuals in
our dataset. In the figure nodes represent phone towers, edges
represent trips between two phone towers, and the size of edges
is proportional to the number of trips performed on the edge.
User X has low mobility entropy since she distributes her trips
on a few preferred edges. Conversely user Y has high mobility
entropy because she distributes her trips across many
equal-sized edges. Mobility entropy also quantifies the possibility
to predict an individual’s future whereabouts. Individuals having
a very regular movement pattern possess a mobility entropy
close to zero and their whereabouts are rather predictable (the
case of user X). Conversely, individuals with a high mobility
entropy are less predictable (the case of user Y). Figure 3b
shows the distribution of mobility entropy across the users in
our dataset, and indicates a high mean degree of predictability
of individual human mobility patterns [36].

Fig. 3. The distributions of radius of gyration and mobility entropy of
individuals in our dataset. (a) Distribution of radius of gyration. We observe
a heavy-tail distribution indicating a large variability of radius of gyration
across the population. (b) Distribution of mobility entropy, denoting a high
mean degree of unpredictability of human mobility patterns.
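
The mobility entropy of Eq. (2) can be sketched for one user as follows. The sketch assumes that a trip is any pair of consecutive records served by different towers, that p(e) is estimated by the relative frequency of each origin-destination pair, and that N is the number of trips; the paper does not spell out these details, so they are assumptions, and both function names are hypothetical.

```python
import math
from collections import Counter


def trips_from_towers(tower_sequence):
    """Origin-destination pairs from a time-ordered sequence of towers.

    Assumption: consecutive records at the same tower are not a trip.
    """
    return [(a, b) for a, b in zip(tower_sequence, tower_sequence[1:]) if a != b]


def mobility_entropy(trips):
    """Shannon entropy of the trip distribution, normalized by log N.

    trips: list of (origin, destination) pairs; N is len(trips).
    Returns 0.0 when there are fewer than two trips, since log N would be 0.
    """
    n = len(trips)
    if n < 2:
        return 0.0
    counts = Counter(trips)
    # -sum p log p, written as sum p log(1/p) with p = c / n.
    entropy = sum((c / n) * math.log(n / c) for c in counts.values())
    return entropy / math.log(n)


# A user who repeats one trip (like user X) and one who never repeats (like user Y).
regular = mobility_entropy([(36, 38)] * 4)
diverse = mobility_entropy([(36, 37), (37, 38), (38, 36), (36, 38)])
```

With this normalization the value lies between 0 (all trips on one edge) and 1 (every trip on a different edge).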
The most frequented location $L_1(u)$ is the place where
an individual u is found with the highest probability when
stationary, most likely her home. In Figure 2 the red circles
indicate $L_1(A)$ and $L_1(B)$, i.e. the phone towers where users
A and B make the highest number of calls during the period
of observation.
[Figure 4: two panels, (a) user X and (b) user Y]
Fig. 4. The mobility entropy of two users in our dataset. Nodes represent
phone towers, edges represent trips between two phone towers, the size of
nodes indicates the number of calls of the user managed by the phone tower,
the size of edges indicates the number of trips performed by the user on the
edge. User X has low mobility entropy because she distributes the trips on
a few large preferred edges. User Y has high mobility entropy because she
distributes the trips across many equal-sized edges.
VI. CORRELATION ANALYSIS
We compute the two mobility measures for each individual
on the CDR data. Due to the size of the dataset, we use the
MapReduce paradigm implemented by Hadoop to distribute
the computation across a cluster of coordinated nodes and
reduce the time of computation. We then aggregate the
individual measures at the municipality level through a
two-step process: (i) we assign to each user u a home location
$L_1(u)$, i.e. the phone tower where the user performs the
highest number of calls during nighttime (from 10 pm to
7 am) [31]; (ii) based on these home locations, we assign
each user to the corresponding municipality with standard
Geographic Information Systems techniques. We aggregate
radius of gyration and mobility entropy at municipality level by
taking the mean, median and standard deviation values across
the population of users assigned to that municipality. We obtain
a set of 5,100 municipalities, each with the two associated
aggregated indicators.
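
The two-step aggregation can be sketched on a single machine as follows. The sketch makes several assumptions that are not in the paper: timestamps follow the format shown in Table I(a), a precomputed dictionary from tower to municipality stands in for the GIS assignment step, and the population standard deviation is used. All function names are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime
from statistics import mean, median, pstdev


def parse_timestamp(text):
    """Parse a timestamp written as in Table I(a), e.g. '2007/09/10 23:34'."""
    return datetime.strptime(text, "%Y/%m/%d %H:%M")


def is_night(ts):
    """True between 10 pm and 7 am, the nighttime window given in the text."""
    return ts.hour >= 22 or ts.hour < 7


def home_tower(records):
    """Tower with the most nighttime records, or None if there are none.

    records: list of (datetime, tower) pairs for one user.
    """
    night_calls = Counter(tower for ts, tower in records if is_night(ts))
    if not night_calls:
        return None
    return night_calls.most_common(1)[0][0]


def aggregate_by_municipality(user_values, user_home, tower_municipality):
    """Mean, median and standard deviation of a measure per municipality.

    user_values: dict user -> individual measure (entropy or radius of gyration).
    user_home: dict user -> home tower.
    tower_municipality: dict tower -> municipality (stand-in for the GIS step).
    Users without a home tower or with an unmapped tower are skipped.
    """
    groups = defaultdict(list)
    for user, value in user_values.items():
        tower = user_home.get(user)
        if tower in tower_municipality:
            groups[tower_municipality[tower]].append(value)
    return {
        m: {"mean": mean(v), "median": median(v), "std": pstdev(v)}
        for m, v in groups.items()
    }
```

In a MapReduce setting the grouping step would correspond to keying each user's value by municipality in the map phase and computing the three statistics in the reduce phase.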
We investigate the correlations between the aggregated
mobility measures and the four external socio-economic
indicators presented in Section IV. Table II summarizes the
correlation between the aggregated mobility measures and the
socio-economic indicators. Three main results emerge. First,
mobility diversity is a better predictor for socio-economic
development than mobility volume (Figure 5 and Table II).
Mobility diversity indeed has much stronger correlations than
mobility volume regardless of the type of aggregation (Table
II). Second, per capita income, primary education rate and
deprivation index show stronger correlations with the mobility
measures than the unemployment rate. Third, mobility
diversity and mobility volume show opposite correlations with the
socio-economic indicators: where the correlation is positive
for mobility diversity, the same correlation is negative for
mobility volume, and vice versa. Figure 6 provides another
way to observe the relations between mobility diversity and
socio-economic development. We split the municipalities into
deciles based on the values of deprivation index, and for
each decile we compute the distribution of mobility entropy
at municipality level. We observe that as the deciles of the
economic values increase both the mean and the variance of
the distribution change, consistent with the plots of Figure
5a.
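
The two operations used in this analysis, the Pearson correlation and the split into deciles, can be sketched as below. The function names are hypothetical and the equal-sized grouping by sorted rank is an assumption about how the deciles are formed.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def decile_groups(x, y, n_groups=10):
    """Sort municipalities by x and return the y values of each group.

    The municipalities are split into n_groups groups of (nearly) equal size
    according to their rank on x, e.g. x = deprivation index, y = mean entropy.
    """
    pairs = sorted(zip(x, y))
    size = len(pairs) / n_groups
    return [
        [value for _, value in pairs[round(i * size):round((i + 1) * size)]]
        for i in range(n_groups)
    ]


# Toy example with 20 municipalities where y decreases as x increases.
x = list(range(20))
y = [1.0 - 0.03 * v for v in x]
rho = pearson(x, y)
groups = decile_groups(x, y)
```

The mean and standard deviation of each group are what the error bars of Figure 5 and the per-decile distributions of Figure 6 summarize.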
In order to test the significance of the correlations ob-
served on the empirical data, we compare our findings with
the results produced by a null model where we randomly
distribute the users over the French municipalities. We first
extract uniformly N users from the dataset and assign them
to a random municipality with a population of N users. We
then aggregate the individual diversity measures of the users
assigned to the same municipality. We repeat the process 100
times and take the mean of the aggregated values of each
municipality produced in the 100 experiments. The outcomes
of the null model have zero correlations with all the socio-
economic indicators, allowing us to reject the hypothesis that
our results occurred by chance.
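
One way to read the null model is as a random reassignment of users to municipalities that preserves municipality sizes. The sketch below follows that reading and is not the authors' code: it assumes sampling without replacement within each repetition, aggregation by the mean, and a fixed random seed for reproducibility. The function name `null_model` is hypothetical.

```python
import random
from statistics import mean


def null_model(user_values, municipality_sizes, repetitions=100, seed=0):
    """Average aggregated value per municipality under random user assignment.

    user_values: list of individual measures (e.g. mobility entropy).
    municipality_sizes: dict municipality -> number of users (each > 0).
    In each repetition the users are shuffled and dealt out to municipalities
    according to their sizes; the per-municipality means are then averaged
    over all repetitions.
    """
    if sum(municipality_sizes.values()) > len(user_values):
        raise ValueError("not enough users to fill all municipalities")
    rng = random.Random(seed)
    values = list(user_values)
    totals = {m: 0.0 for m in municipality_sizes}
    for _ in range(repetitions):
        rng.shuffle(values)
        start = 0
        for m, size in municipality_sizes.items():
            totals[m] += mean(values[start:start + size])
            start += size
    return {m: total / repetitions for m, total in totals.items()}


# Toy run: 100 users with distinct values, two municipalities of 50 users each.
baseline = null_model([float(v) for v in range(100)], {"A": 50, "B": 50})
```

Because the assignment ignores where users actually live, the aggregated values of the null model carry no information about a municipality, which is why no correlation with the socio-economic indicators is expected.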
measure       DI      PCI     PER     UR
mean S       -0.43    0.49   -0.49   -0.17
mean r_g      0.01   -0.25    0.01   -0.04
median S     -0.43    0.48   -0.47   -0.17
median r_g    0.16   -0.21    0.47   -0.1
std S         0.20   -0.26    0.27    0.11
std r_g       0.01    0.28   -0.21    0.13

TABLE II. CORRELATIONS BETWEEN AGGREGATED MOBILITY
MEASURES AND SOCIO-ECONOMIC INDICATORS.
(S = mobility entropy, r_g = radius of gyration; DI = deprivation index, PCI = per capita income, PER = primary education rate, UR = unemployment rate.)
VII. DISCUSSION OF THE RESULTS
The most remarkable result in our study is the observation
that human mobility, and mobility diversity in particular, is
associated with socio-economic indicators on a municipality
scale. To be specific, on a municipality level mobility entropy
is positively correlated with per capita income and negatively
correlated with deprivation index, primary education rate and
unemployment rate (Figure 5). Generalizing our empirical find-
ings, we state that a greater diversification of human mobility is
linked to a higher overall wealth, to a more educated territory
and to a lower level of deprivation. Remarkably, the mobility
entropy distribution varies systematically across geographical
units defined on socio-economic indicators (Figure 6),
delineating subpopulations where a different distribution of
entropy emerges depending on the values of the socio-economic
indicators. This is an important finding when compared
to Song et al. [36], a seminal work on the predictability
of human mobility, which states that mobility entropy is very
stable across different subpopulations delineated by personal
characteristics like gender or age group. The contrast between
our findings and the result of Song et al. suggests that socio-
economic situations on a city scale are more related to in-
dividual mobility than individual demographic characteristics.
The observed variation also suggests a relation between
socio-economic development and predictability: people residing in
more developed and richer territories show a higher mobility
entropy and hence more unpredictable mobility patterns.
Although the relations between mobility diversity and
socio-economic indicators appear clearly, it is difficult to
formulate a hypothesis to explain their connections. Without
a doubt, the relation between socio-economic indicators and
mobility diversity is bidirectional. It might be that a
well-developed territory provides a wide range of activities, an
advanced network of public transportation, a higher availability
and diversification of jobs, and other elements that foster
mobility diversity. It might also be that a higher mobility
diversification of individuals leads to a higher economic
well-being, as it could nourish the economy, establish economic
opportunities and facilitate flows of people and goods.
Interpretations of the relation between mobility diversity and
socio-economic development are not directly derivable from
the empirical results and should therefore be combined with
more thorough theoretical insights.

Fig. 5. The correlations between human mobility measures and socio-economic indicators: (a) mobility entropy vs deprivation index; (b) mobility entropy
vs logarithm of per capita income; (c) mobility entropy vs education rate; (d) mobility entropy vs unemployment rate; (e) radius of gyration vs deprivation
index; (f) radius of gyration vs logarithm of per capita income; (g) radius of gyration vs education rate; (h) radius of gyration vs unemployment rate. We split
the municipalities into ten equal-sized groups according to the deciles of the measures on the x axis. For each group, we compute the mean and the standard
deviation of the measures on the y axis and plot them through the black error bars. ρ indicates the Pearson correlation coefficient between the two measures (in
all the cases the p-value < 0.001). We observe that mobility entropy has stronger correlations with socio-economic indicators than radius of gyration.
Another interesting result is that mobility volume and mobility
diversity show opposite correlations, i.e. high values of
aggregated mobility volume correspond to low socio-economic
development, while high values of aggregated mobility
diversity correspond to high socio-economic development (Figure
5). Assuming that human mobility is driven by people's daily
activities, a possible explanation is that people living in
well-developed municipalities have a wide range of activities
available nearby, resulting in high mobility diversity. In contrast,
people living in less developed municipalities, such as municipalities
in the countryside, are forced to travel in search of activities that
cannot be found in their municipality, resulting in a high
mobility volume. To investigate this hypothesis we compute
the correlation between the aggregated mobility diversity and
the aggregated mobility volume. We find a negative correlation
(ρ = −0.38), consistent with this hypothesis: at municipality scale,
high mobility diversity is linked to low mobility volume (Figure 7).
We plan to investigate this aspect in more depth in order to
understand the reason for this correlation.
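The comparison described above, a Pearson correlation between two aggregated measures together with the decile grouping used in the figures, can be illustrated with a minimal sketch. The data below are synthetic stand-ins generated for illustration only, not the municipality-level aggregates of this study, and the function names pearson and decile_summary are our own hypothetical helpers.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

def decile_summary(x, y, n_groups=10):
    """Split observations into equal-sized groups by rank of x and
    return the mean and standard deviation of y within each group."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)                      # indices sorted by x
    groups = np.array_split(order, n_groups)   # ten rank-based groups
    means = np.array([y[g].mean() for g in groups])
    stds = np.array([y[g].std() for g in groups])
    return means, stds

# Synthetic stand-in for municipality-level aggregates (assumption:
# these values are invented and only mimic a negative relation).
rng = np.random.default_rng(0)
entropy = rng.uniform(0.2, 0.9, size=1000)
radius = 30.0 - 20.0 * entropy + rng.normal(0.0, 8.0, size=1000)

rho = pearson(entropy, radius)
means, stds = decile_summary(entropy, radius)
print(f"rho = {rho:.2f}")
for k, (m, s) in enumerate(zip(means, stds), start=1):
    print(f"decile {k:2d}: mean radius = {m:6.2f}, std = {s:5.2f}")
```

Grouping by rank rather than by fixed-width bins keeps the groups equal-sized, which matches the description of the figures; the per-group mean and standard deviation correspond to the plotted error bars.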
VIII. CONCLUSION
In this paper we investigate the relationships between
human mobility patterns and socio-economic development in
French municipalities. Starting from nation-wide mobile phone
data we extract for each individual two mobility measures: ra-
dius of gyration, the characteristic distance traveled by an indi-
vidual, and mobility entropy, the diversification of movements
over her locations. We then aggregate the individual mobility
measures at municipality level by taking the mean, the median
and the variance across the population of users assigned to
each municipality. Finally, we compare the aggregated mobility
measures with external socio-economic indicators measuring
education level, unemployment rate, income and deprivation.
We find that both mobility measures correlate with
the socio-economic indicators, with mobility entropy showing the
strongest correlations. We validate our results against a null
model, which produces zero correlations, allowing us to reject
the hypothesis that our findings occurred by chance. Building
on these results, we plan to extend our study in
three directions.
First, since mobile phone data also provide information
about social interactions, it would be interesting to extract
measures capturing the social behavior of individuals. The seminal
work by Eagle et al. showed that social diversity is a good
proxy for the socio-economic development of territories [11]. It
would be interesting to compare the correlations produced by
social diversity and mobility diversity in order to understand
and quantify the different roles they play in the socio-economic
development of a territory. Is mobility diversity a better proxy
for socio-economic development than social diversity?
Fig. 6. The distributions of mobility entropy in the different deciles
of deprivation index. We split the municipalities into ten equal-sized groups
according to the deciles of deprivation index. For each group, we
plot the distribution of mobility entropy. The blue dashed curve represents a fit
of the distribution; the red dashed line represents the mean of the distribution.
We observe a systematic variation of both the mean and the variance of the distribution
of mobility entropy across the deciles defined by deprivation index.
Second, to learn more about the relationship between the
aggregated mobility measures and the socio-economic indicators
it would be useful to implement and validate predictive
models. These models can aim either to predict the
actual value of socio-economic development of a territory,
e.g. with regression models, or to predict the class of
socio-economic development, i.e. the level of socio-economic
development of a given geographic unit, as done by classification
models. If we find that the accuracy and the prediction errors
of the models do not depend on the training and test sets
selected, we would have further confirmation that mobility
measures extracted from Big Data offer a real possibility
to continuously monitor the socio-economic development of
territories and provide policy makers with an important tool
for decision making.
Fig. 7. The correlation between aggregated mobility diversity and
aggregated mobility volume. We split the municipalities into ten
equal-sized groups according to the deciles of the measure on the x axis. For
each group, we compute the mean and the standard deviation of the measure
on the y axis and plot them as black error bars. ρ indicates the
Pearson correlation coefficient between the two measures (p-value < 0.001).
We observe a negative correlation suggesting that high mobility entropy is
linked to low mobility volume, and vice versa.
Third, we plan to investigate the relation between human
mobility patterns and socio-economic development from a
multidimensional perspective, including many other indicators
to understand which aspects of socio-economic
development best correlate with the proposed mobility
measures. The new indicators will allow us to refine our study
of the relation between mobility measures extracted from Big
Data and the socio-economic development of territories. In the
meantime, experiences like ours may help shape the
discussion on how to measure some aspects of well-being
with Big Data, which are available everywhere on earth. If
we learn how to use such a resource, we have the potential of
creating a digital nervous system, in support of a generalized,
sustainable development of our societies.
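The stability check proposed in the second direction above, namely whether prediction error depends on the training and test sets selected, can be sketched with a k-fold cross-validation of a linear regression. This is only an illustration under stated assumptions: the data are synthetic, the linear model and the choice of five folds are ours, and kfold_rmse is a hypothetical helper rather than part of any existing pipeline.

```python
import numpy as np

def kfold_rmse(X, y, n_folds=5, seed=0):
    """Fit ordinary least squares on each training split and return
    the root-mean-square error on the corresponding held-out fold."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    errors = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        # Prepend a column of ones so the model includes an intercept
        A_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
        A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
        residuals = y[test_idx] - A_test @ coef
        errors.append(float(np.sqrt(np.mean(residuals ** 2))))
    return np.array(errors)

# Synthetic stand-in data (assumption: invented values, not the
# aggregated measures or the indicators used in this paper).
rng = np.random.default_rng(1)
n = 500
entropy = rng.uniform(0.2, 0.9, size=n)
radius = rng.uniform(2.0, 30.0, size=n)
indicator = 1.0 + 3.0 * entropy - 0.05 * radius + rng.normal(0.0, 0.5, size=n)

X = np.column_stack([entropy, radius])
rmse = kfold_rmse(X, indicator)
print("per-fold RMSE:", np.round(rmse, 3))
print("spread across folds:", round(float(rmse.std()), 3))
```

A small spread of the per-fold errors relative to their mean would indicate that the model's performance does not hinge on a particular split; a classification variant would follow the same pattern with accuracy in place of RMSE.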
ACKNOWLEDGMENT
The authors would like to thank Orange for providing
the CDR data, and Giovanni Lima and Pierpaolo Paolini for the
contributions made during their master's theses. We are
grateful to Carole Pornet and colleagues for providing the
socio-economic indicators and for computing the deprivation
index for the French municipalities. We also thank Maarten
Vanhoof and Lorenzo Gabrielli for the insightful discussions.
This work has been partially funded by the projects Cimplex
(grant agreement 641191), PETRA (grant agreement 609042),
and SoBigData RI (grant agreement 654024).
REFERENCES
[1] A. Amini, K. Kung, C. Kang, S. Sobolevsky, and C. Ratti. The impact
of social segregation on human mobility in developing and urbanized
regions. EPJ Data Science, 3, 2014.
[2] A.-L. Barabási. The origin of bursts and heavy tails in human dynamics.
Nature, 435:207–211, 2005.
[3] V. D. Blondel, A. Decuyper, and G. Krings. A survey of results on
mobile phone datasets analysis, 2015. arXiv:1502.03406.
[4] J. Blumenstock. Calling for better measurement: Estimating an individ-
ual’s wealth and well-being. In Proceedings of the 20th ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining,
KDD’14. ACM, 2014.
[5] J. Brea, J. Burroni, M. Minnoni, and C. Sarraute. Harnessing mobile
phone social network topology to infer users demographic attributes.
In Proceedings of the 8th Workshop on Social Network Mining and
Analysis, SNAKDD’14. ACM, 2014.
[6] D. Brockmann, L. Hufnagel, and T. Geisel. The scaling laws of human
travel. Nature, 439:462, 2006.
[7] E. Cho, S. A. Myers, and J. Leskovec. Friendship and mobility: user
movement in location-based social networks. In Proceedings of the 17th
ACM SIGKDD International Conference on Knowledge Discovery and
Data Mining, KDD’11, pages 1082–1090. ACM, 2011.
[8] P. J. H. Daas, M. J. Puts, and B. Buelens. Big data and official statistics.
In The 2013 New Techniques and Technologies for Statistics conference,
2013.
[9] A. Decuyper, A. Rutherford, A. Wadhwa, J. Bauer, G. Krings, T. Gutier-
rez, V. D. Blondel, and M. A. Luengo-Oroz. Estimating food consump-
tion and poverty indices with mobile phone data. CoRR, abs/1412.2595,
2014.
[10] P. Deville, C. Linard, S. Martin, M. Gilbert, F. R. Stevens, A. E.
Gaughan, V. D. Blondel, and A. J. Tatem. Dynamic population
mapping using mobile phone data. Proceedings of the National
Academy of Sciences (PNAS), 111(45):15888–15893, 2014.
[11] N. Eagle, M. Macy, and R. Claxton. Network Diversity and Economic
Development. Science, 328(5981):1029–1031, May 2010.
[12] N. Eagle and A. S. Pentland. Eigenbehaviors: identifying structure in
routine. Behavioral Ecology and Sociobiology, 63(7):1057–1066, 2009.
[13] V. Frias-Martinez, V. Soto, J. Virseda, and E. Frias-Martinez. Can cell
phone traces measure social development? In Third Conference on the
Analysis of Mobile Phone Datasets, NetMob, 2013.
[14] B. Furletti, L. Gabrielli, F. Giannotti, L. Milli, M. Nanni, D. Pedreschi,
R. Vivio, and G. Garofalo. Use of mobile phone data to estimate
mobility flows. Measuring urban population and inter-city mobility
using big data in an integrated approach. In 47th SIS Scientific Meeting
of the Italian Statistical Society, Cagliari, June 2014.
[15] F. Galton. Vox populi. Nature, 75(7), 1907.
[16] F. Giannotti, M. Nanni, D. Pedreschi, F. Pinelli, C. Renso, S. Rinzivillo,
and R. Trasarti. Unveiling the complexity of human mobility by
querying and mining massive trajectory data. The VLDB Journal,
20(5):695–719, 2011.
[17] F. Giannotti, D. Pedreschi, A. Pentland, P. Lukowicz, D. Kossmann,
J. L. Crowley, and D. Helbing. A planetary nervous system for social
mining and collective awareness. EPJ Special Topics, 214:49–75, 2014.
[18] M. C. González, C. A. Hidalgo, and A.-L. Barabási. Understanding
individual human mobility patterns. Nature, 453(7196):779–782, June
2008.
[19] T. Gutierrez, G. Krings, and V. D. Blondel. Evaluating socio-economic
state of a country analyzing airtime credit and mobile phone datasets.
CoRR, abs/1309.4496, 2013.
[20] S. Jiang, J. F. Jr, and M. González. Clustering daily patterns of human
activities in the city. Data Mining and Knowledge Discovery, 25:478–
510, 2012.
[21] D. Karamshuk, C. Boldrini, M. Conti, and A. Passarella. Human
mobility models for opportunistic networks. IEEE Communications
Magazine, 49(12):157–165, 2011.
[22] L. Liao, D. J. Patterson, D. Fox, and H. Kautz. Learning and inferring
transportation routines. Artif. Intell., 171(5-6):311–331, Apr. 2007.
[23] J. Lorenz, H. Rauhut, F. Schweitzer, and D. Helbing. How social
influence can undermine the wisdom of crowd effect. Proceedings of
the National Academy of Sciences (PNAS), 108(22), 2011.
[24] L. Lotero, A. Cardillo, R. Hurtado, and J. Gomez-Gardenes. Several
multiplexes in the same city: The role of socioeconomic differences in
urban mobility. Available at SSRN 2507816, 2014.
[25] S. Marchetti, C. Giusti, M. Pratesi, N. Salvati, F. Giannotti, D. Pe-
dreschi, S. Rinzivillo, L. Pappalardo, and L. Gabrielli. Small area
model-based estimators using big data sources. Journal of Official
Statistics, 31(2), 2015.
[26] J. Onnela, J. Saramaki, J. Hyvonen, G. Szabo, D. Lazer, K. Kaski,
J. Kertesz, and A. L. Barabasi. Structure and tie strengths in mobile
communication networks. Proceedings of the National Academy of
Sciences (PNAS), 104(18):7332–7336, 2007.
[27] Indicators and a monitoring framework for the sustainable development
goals: Launching a data revolution for the SDGs. A report by the
Leadership Council of the Sustainable Development Solutions Network,
20 March 2015.
[28] L. Pappalardo, S. Rinzivillo, Z. Qu, D. Pedreschi, and F. Giannotti.
Understanding the patterns of car travel. EPJ Special Topics, 215(1):61–
73, 2013.
[29] L. Pappalardo, F. Simini, S. Rinzivillo, D. Pedreschi, F. Giannotti, and
A.-L. Barabási. Returners and explorers dichotomy in human mobility.
Nature Communications, 6(8166), 2015.
[30] D. Pennacchioli, M. Coscia, S. Rinzivillo, D. Pedreschi, and F. Gian-
notti. Explaining the product range effect in purchase data. In IEEE
International Conference on Big Data, pages 648–656, 2013.
[31] C. Pornet, C. Delpierre, O. Dejardin, P. Grosclaude, L. Launay, L. Gui-
ttet, T. Lang, and G. Launoy. Construction of an adaptable European
transnational ecological deprivation index: the French version. Journal
of Epidemiology and Community Health, 66(11):982–9, 2012.
[32] S. Rinzivillo, L. Gabrielli, M. Nanni, L. Pappalardo, D. Pedreschi, and
F. Giannotti. The purpose of motion: Learning activities from individual
mobility networks. In Proceedings of International Conference on Data
Science and Advanced Analytics, DSAA’14, 2014.
[33] S. Rinzivillo, S. Mainardi, F. Pezzoni, M. Coscia, D. Pedreschi, and
F. Giannotti. Discovering the geographical borders of human mobility.
Künstliche Intelligenz, 26(3):253–260, 2012.
[34] F. Simini, M. C. González, A. Maritan, and A.-L. Barabási. A universal
model for mobility and migration patterns. Nature, 484(7392):96–100,
2012.
[35] C. Smith-Clarke, A. Mashhadi, and L. Capra. Poverty on the cheap:
Estimating poverty maps using aggregated mobile communication net-
works. In Proceedings of the SIGCHI Conference on Human Factors
in Computing Systems, pages 511–520. ACM, 2014.
[36] C. Song, Z. Qu, N. Blumm, and A.-L. Barabási. Limits of predictability
in human mobility. Science, 327(5968):1018–1021, 2010.
[37] P. Struijs and P. J. H. Daas. Quality approaches to big data in official
statistics. In European conference on Quality in Official Statistics, 2014.
[38] J. Surowiecki. The Wisdom of Crowds: Why the Many Are Smarter
than the Few and How Collective Wisdom Shapes Business, Economies,
Societies, and Nations. Doubleday Books, New York, 2004.
[39] Data, data, everywhere. The Economist, 25 February 2010.
[40] D. Wang, D. Pedreschi, C. Song, F. Giannotti, and A.-L. Barabási.
Human mobility, social ties, and link prediction. In Proceedings of the
17th ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining, KDD ’11, pages 1100–1108, New York, NY, USA,
2011. ACM.
[41] A world that counts: mobilizing the data revolution for sustainable
development. A report by the United Nations Secretary-General’s Inde-
pendent Expert Advisory Group on a Data Revolution for Sustainable
Development (IEAG), November 2014.
... The location of people in cities is predictable 19 and is strictly connected with the circadian rhythms of social activities 20 , as well as home and work locations. The spatio-temporal variability of commuting patterns 21 is intertwined with the mode of journey 22 , the population density (that is, urbanization level) 23 and the socio-economic status [24][25][26] . ...
... At first glance, the positive correlation between the radius of gyration and the unemployment rate (Fig. 3a,b) seems to be dissonant from previous works 25,97 . However, this result can be due to a rise in the unemployment rate and the spatial mobility levels before the pandemic. ...
Article
Full-text available
Socio-economic constructs and urban topology are crucial drivers of human mobility patterns. During the coronavirus disease 2019 pandemic, these patterns were reshaped in their components: the spatial dimension represented by the daily travelled distance, and the temporal dimension expressed as the synchronization time of commuting routines. Here, leveraging location-based data from de-identified mobile phone users, we observed that, during lockdowns restrictions, the decrease of spatial mobility is interwoven with the emergence of asynchronous mobility dynamics. The lifting of restriction in urban mobility allowed a faster recovery of the spatial dimension compared with the temporal one. Moreover, the recovery in mobility was different depending on urbanization levels and economic stratification. In rural and low-income areas, the spatial mobility dimension suffered a more considerable disruption when compared with urbanized and high-income areas. In contrast, the temporal dimension was more affected in urbanized and high-income areas than in rural and low-income areas.
... Several studies have shown that individuals of higher SES tend to travel further and have greater mobility diversity than those of lower SES (Carlsson-Kanyama & Linden, 1999;Frias-Martinez et al., 2013;Pappalardo et al., 2015). ...
Article
Understanding individual's socioeconomic status (SES) can provide supporting information for designing political and economic policies. Acquiring large‐scale economic survey data is time‐consuming and laborious. The widespread mobile phone data, which can reflect human mobility and social network characteristics, has become a low‐cost data source for researchers to infer SES. However, previous studies often oversimplify human mobility features and social network features extracted from mobile phone data into general statistical features, resulting in discounting some important temporal and relational information. Therefore, we propose a comprehensive framework for individual SES prediction that effectively utilizes a combination of human mobility and social relationships. In this framework, Word2Vec module extracts human mobility features from mobile phone positioning data, and graph neural network (GNN) module GraphSAGE captures social network characteristics constructed from call detail records. We evaluated the effectiveness of our proposed approach by training the model with real‐world data in Beijing. According to the experimental results, our proposed hybrid approach outperformed the other methods evidently, demonstrating that human mobility and social links are complementary in the characterization of SES. Coupling human mobility and social links can further deepen our understanding of cities' economic geography.
... • Spatial entropy: We use entropy to measure the heterogeneity of time distribution across geographical space. Spatial entropy has been used in previous works [52][53][54] and is defined as: ...
Article
Full-text available
Despite the historically documented regularity in human mobility patterns, the relaxation of spatial and temporal constraints, brought by the widespread adoption of telecommuting and e-commerce during the COVID-19 pandemic, as well as a growing desire for flexible work arrangements in a post-pandemic work, indicates a potential reshaping of these patterns. In this paper, we investigate the multifaceted impacts of relaxed spatio-temporal constraints on human mobility, using well-established metrics from the travel behavior literature. Further, we introduce a novel metric for schedule regularity, accounting for specific day-of-week characteristics that previous approaches overlooked. Building on the large body of literature on the impacts of COVID-19 on human mobility, we make use of passively tracked Point of Interest (POI) data for approximately 21,700 smartphone users in the US, and analyze data between January 2020 and September 2022 to answer two key questions: (1) has the COVID-19 pandemic and its associated relaxation of spatio-temporal activity patterns reshaped the different aspects of human mobility, and (2) have we achieved a state of stable post-pandemic “new normal”? We hypothesize that the relaxation of the spatiotemporal constraints around key activities will result in people exhibiting less regular schedules. Findings reveal a complex landscape: while some mobility indicators have reverted to pre-pandemic norms, such as trip frequency and travel distance, others, notably at-home dwell-time, persist at altered levels, suggesting a recalibration rather than a return to past behaviors. Most notably, our analysis reveals a paradox: despite the documented large-scale shift towards flexible work arrangements, schedule habits have strengthened rather than relaxed, defying our initial hypotheses and highlighting a desire for regularity. 
The study’s results contribute to a deeper understanding of the post-pandemic “new normal”, offering key insights on how multiple facets of travel behavior were reshaped, if at all, by the COVID-19 pandemic, and will help inform transportation planning in a post-pandemic world.
... 이렇듯 코로나 19를 중심으로 재난문자의 효과에 대한 연구가 이루어지고 있으나 폭염을 비롯한 다른 많은 재난 유형의 재난문자 어떤 효과를 가지는지에 대한 연구는 부족한 실정이다. 더욱 이, 폭염과 같이 특정 시간대에 도시민 활동 및 이동 관리를 통한 피해 경감과 예방의 목적을 가진 재난문자는 도시민 이동에 대해 어떤 효과를 미치는지 이해할 필요성이 높다.2.3 도시민 이동에 영향을 미치는 지역 특성 요인도시계획부터 사회학 및 지리학 등 학술 영역은 인간 활동 및 이동과 지역 및 개인의 특성 간의 관계 규명에 집중하고 있다(Frias-Martinez et al., 2010;Lee and Holme, 2015;Pappalardo et al., 2015;Jung and Nam, 2019;Heo et al., 2020). 관련 연구들에서는 이동성 또는 유동인구에 영향을 미치는 지역 특성을 크게 사회경제적 측면 또는 물리환경적 측면에서 분석하고 있다. ...
Article
Heatwave emergency alert messages (EAM) not only provide objective information on heatwave occurrences but also include behavioral guidelines, such as avoiding outdoor activities or going out to respond to heatwaves and minimize damage. To investigate the EAM's effectiveness, we analyzed the changes in the floating population during 2021 using Seoul dong-unit and hourly location-based mobile big data. We also examined what socioeconomic and physical environmental characteristics could cause differences in the effectiveness, focusing on the fact that EAM may have different effects depending on regional factors. The findings revealed a notable reduction in the overall floating population of Seoul, ranging from 0.1 to 3.4%. When examined by region, an average decrease of 2.4 to 5.6% was observed, indicating the different effects of EAM by region. Further analysis of regional characteristics highlighted that areas with low EAM effectiveness were characterized by a lower ratio of senior residents, higher employment concentration, and higher accessibility to public transportation. Additionally, the number of household members, gender distribution, average age, and presence of green spaces can contribute to the gap in the effectiveness of EAM between regions.
... By more actively sharing vaccine resources under "enlightened self-interest" incentive mechanism, up to 94.8% (95% CI = 94.3%-95.3%, P < 0.001) of inter-regional mobility reduction can be averted compared with the "selfish" strategy, promoting the free movements of talents and goods [39][40][41] in these regions. Under a more equitable global vaccine distribution, vaccine-producing regions could maintain a higher level of inter-regional mobility, providing another incentive for actively sharing vaccines. ...
Article
Full-text available
Background Despite consensus that vaccines play an important role in combatting the global spread of infectious diseases, vaccine inequity is still a prevalent issue due to a deep-seated mentality of self-priority. We aimed to evaluate the existence and possible outcomes of a more equitable global vaccine distribution and explore a concrete incentive mechanism that promotes vaccine equity. Methods We designed a metapopulation epidemiological model that simultaneously considers global vaccine distribution and human mobility, which we then calibrated by the number of infections and real-world vaccination records during the coronavirus disease 2019 (COVID-19) pandemic from March 2020 to July 2021. We explored the possibility of the enlightened self-interest incentive mechanism, which comprises improving one’s own epidemic outcomes by sharing vaccines with other countries, by evaluating the number of infections and deaths under various vaccine sharing strategies using the proposed model. To understand how these strategies affect the national interests, we distinguished imported from local cases for further cost-benefit analyses that rationalise the enlightened self-interest incentive mechanism behind vaccine sharing. Results The proposed model accurately reproduces the real-world cumulative infections for both global and regional epidemics (R²>0.990), which can support the following evaluations of different vaccine sharing strategies: High-income countries can reduce 16.7 (95% confidence interval (CI) = 8.4-24.9, P < 0.001) million infection cases and 82.0 (95% CI = 76.6-87.4, P < 0.001) thousand deaths on average by more actively sharing vaccines in an enlightened self-interest manner, where the reduced internationally imported cases outweigh the threat from increased local infections. 
Such vaccine sharing strategies can also reduce 4.3 (95% CI = 1.2-7.5, P < 0.01) million infections and 7.0 (95% CI = 5.7-8.3, P < 0.001) thousand deaths in middle- and low-income countries, effectively benefiting the whole global population. Lastly, the more equitable vaccine distribution could help largely reduce the global mobility reduction needed for pandemic control. Conclusions The incentive mechanism of enlightened self-interest we explored here could motivate vaccine equity by realigning the national interest to more equitable vaccine distributions. The positive results could promote multilateral collaborations in global vaccine redistribution and reconcile conflicted national interests, which could in turn benefit the global population.
... International efforts have been made to monitor migration and mobility by using Big Data like mobile phone location data [33], geocoded Twitter messages [34], and GPS records [35]. Big Data also offer the potential capacity to measure, monitor, and predict socioeconomic phenomena and interactions in quasi-real time [36][37][38]. Nevertheless, accessible and timely global or representative data are still a critical challenge [39]. ...
Article
Full-text available
Promoting human mobility and reducing inequality among countries are the Sustainable Development Goals’ (SDGs) targets. However, measuring human mobility, assessing its heterogeneity and changes, and exploring associated mechanisms and context effects are still key challenges, especially for developing countries. This study attempts to review the concept of human mobility with complex thinking, assess human mobility across forty countries in Sub-Saharan Africa (SSA), and examine the effect of climatic and socioeconomic factors. Based on the coined definition of human mobility, international migration and cross-border trips are taken to assess human mobility in terms of permanent migration and temporary moves. The forty SSA countries are hence classified into four mobility groups. Regression models are performed to identify key determinants and estimate their effects on mobility. The results reveal that seven of these forty countries had a high mobility, whereas most experienced a decline in permanent migration. Lesotho, Cabo Verde, and Namibia presented high temporary moves, while Eritrea, Rwanda, Equatorial Guinea, and Liberia had a high permanent migration. Climatic and socioeconomic conditions demonstrated significant effects on mobility but were different for temporary moves and permanent migration. Wet extremes reduced mobility, whereas extreme temperature variations had positive effects. Dry extremes promoted permanent migration but inhibited temporary moves. Economic wealth and political instability promoted permanent migration, while the young population counteracted temporary moves. Food insecurity and migrant networks stimulated human mobility. The analysis emphasises the interest in analysing human mobility for risk reduction and sustainability management at the multi-county level.
Article
The development of Information and Communication Technology has shifted human activity from offline to online, and promoted digital economy. These changes challenge traditional methodologies relying on physical human activity as a micro-level reflection of the macro-level economy. To address this, a hierarchical framework is proposed to characterize cyber human activity, incorporating activity diversity, size, and preference. Then, Ordinary Least Squares and Geographically Weighted Regression models are used to examine the spatial interplay between cyber human activity and township-level economy. K-Medoids clustering is further applied to coefficients in cyber GWR model to reveal mechanisms influencing local economies. Taking diverse geospatial data in Jilin Province, China as an example, the result indicates that the significant extreme values of cyber human activity have divided Jilin Province into four subregions, closely aligning with local economic segmentation. Moreover, the local economy can be better reflected by cyber human activity rather than physical ones, especially in highly digitized regions. Furthermore, the regions with similar influence mechanisms cluster geographically. These clusters are categorized into three types, namely, dynamics, robust, and balanced, representing different mechanisms influencing the local economy. Practical and theoretical implications are discussed, including using cyber human activity in assessing the economy and implementing adaptable economic policies at township level.
Preprint
Full-text available
Accurate identification, quantification, and continual monitoring of carbon emissions constitute pivotal elements for proactive climate interventions. Conventional methodologies like direct measurement, including point-source assessments or remote sensing, often face challenges related to high costs or limited accuracy. Especially in low- and middle-income nations experiencing escalating emissions, carbon monitoring heavily relies on the administrative capabilities of local governments, which frequently lack adequate monitoring infrastructures. Addressing this predicament, our study introduces a computational framework to forecast CO 2 emissions by leveraging comprehensive observable human activity data from third-party sources. Our findings elucidate a robust correlation between multi-origin CO 2 emissions and human mobility ( r = 0.89). Notably, machine learning models adeptly predict these emissions by integrating characteristics extracted from temporally aggregated, anonymized mobility networks (R ² ≈1.0). We demonstrate that the model effectively captures the notable reduction in CO 2 emissions during the COVID-19 lockdown, with both human mobility and CO 2 emissions in China decreasing by 56.97% and 32.45%, respectively. The prediction accuracy remains high for countries with varying social economic development, such as the U.S., Italy and Mexico. This study presents an inexpensive, real-time, and robust method of quantifying CO 2 emissions on a large scale with high precision, and it could facilitate tailored CO 2 emission reduction strategies, grounded in robust scientific evidence derived from the dynamics of human mobility.
Article
Transportation research has shown that socio‐demographic factors impact people's mobility patterns. During the COVID‐19 pandemic, some of these effects have changed in accordance with changing mobility needs adapting to the pandemic, including restrictions on in‐person gatherings, closure of in‐person businesses, and working from home. We investigate two gaps in current knowledge in this area of transportation research: to what extent the associations between socio‐demographic factors and mobility metrics have changed, and how these associations vary across geographic space. We used aggregate deidentified cell tower location data to measure two mobility metrics—movement time and radius of gyration—and socio‐demographic data from the 2016 Canadian Census to model these associations across Ontario, Canada in 2020 using a linear model and a geographically weighted regression model. We find that certain associations between socio‐demographics and mobility have changed from what we previously observed before the pandemic, and we can see the variation of these associations across space. These findings will improve our understanding of how socio‐demographic factors affect mobility patterns in different communities and demonstrate the importance of measuring these associations at a more fine‐grained level using models that consider spatial variation to best reflect the nature of these associations.
Article
Full-text available
The availability of massive digital traces of human whereabouts has offered a series of novel insights on the quantitative patterns characterizing human mobility. In particular, numerous recent studies have lead to an unexpected consensus: the considerable variability in the characteristic travelled distance of individuals coexists with a high degree of predictability of their future locations. Here we shed light on this surprising coexistence by systematically investigating the impact of recurrent mobility on the characteristic distance travelled by individuals. Using both mobile phone and GPS data, we discover the existence of two distinct classes of individuals: returners and explorers. As existing models of human mobility cannot explain the existence of these two classes, we develop more realistic models able to capture the empirical findings. Finally, we show that returners and explorers play a distinct quantifiable role in spreading phenomena and that a correlation exists between their mobility patterns and social interactions.
Article
More and more data are being produced by an increasing number of electronic devices physically surrounding us and on the internet. The large amount of data and the high frequency at which they are produced have resulted in the introduction of the term ‘Big Data’. Because these data reflect many different aspects of our daily lives and because of their abundance and availability, Big Data sources are very interesting from an official statistics point of view. This article discusses the exploration of both opportunities and challenges for official statistics associated with the application of Big Data. Experiences gained with analyses of large amounts of Dutch traffic loop detection records and Dutch social media messages are described to illustrate the topics characteristic of the statistical analysis and use of Big Data.
Article
The timely, accurate monitoring of social indicators, such as poverty or inequality, on a fine-grained spatial and temporal scale is a crucial tool for understanding social phenomena and policymaking, but poses a great challenge to official statistics. This article argues that an interdisciplinary approach, combining the body of statistical research in small area estimation with the body of research in social data mining based on Big Data, can provide novel means to tackle this problem successfully. Big Data derived from the digital crumbs that humans leave behind in their daily activities are in fact providing ever more accurate proxies of social life. Social data mining from these data, coupled with advanced model-based techniques for fine-grained estimates, has the potential to provide a novel microscope through which to view and understand social complexity. This article suggests three ways to use Big Data together with small area estimation techniques, and shows how Big Data has the potential to mirror aspects of well-being and other socioeconomic phenomena.
Article
This study leverages mobile phone data to analyze human mobility patterns in a developing nation, especially in comparison to those of a more industrialized nation. Developing regions, such as the Ivory Coast, are marked by a number of factors that may influence mobility, such as less infrastructural coverage and maturity, fewer economic resources and less stability, and in some cases, more cultural and language-based diversity. By comparing mobile phone data collected from the Ivory Coast to similar data collected in Portugal, we are able to highlight both qualitative and quantitative differences in mobility patterns, such as differences in likelihood to travel and in the time required to travel, that are relevant to considerations of policy, infrastructure, and economic development. Our study illustrates how cultural and linguistic diversity in developing regions (such as the Ivory Coast) can present challenges to mobility models that were conceptualized, and perform well, in less culturally diverse regions. Finally, we address these challenges by proposing novel techniques to assess the strength of borders in a regional partitioning scheme and to quantify the impact of border strength on mobility model accuracy.
Article
In this paper, we review some recent advances in the study of mobile phone datasets. This area of research emerged a decade ago, with the increasing availability of large-scale anonymized datasets, and has grown into a stand-alone topic. We survey the contributions made so far on the social networks that can be constructed with such data, the study of personal mobility, geographical partitioning, urban planning, and help towards development, as well as security and privacy issues.
Article
Recent studies have shown the value of mobile phone data to tackle problems related to economic development and humanitarian action. In this research, we assess the suitability of indicators derived from mobile phone data as a proxy for food security indicators. We compare the measures extracted from call detail records and airtime credit purchases to the results of a nationwide household survey conducted at the same time. Results show high correlations (> 0.8) between indicators derived from mobile phone data and several relevant food security variables such as expenditure on food or vegetable consumption. This correspondence suggests that, in the future, proxies derived from mobile phone data could be used to provide valuable up-to-date operational information on food security throughout low- and middle-income countries.
Article
Significance: Knowing where people are is critical for accurate impact assessments and intervention planning, particularly those focused on population health, food security, climate change, conflicts, and natural disasters. This study demonstrates how data collected by mobile phone network operators can cost-effectively provide accurate and detailed maps of population distribution over national scales and any time period while guaranteeing phone users’ privacy. The methods outlined may be applied to estimate human population densities in low-income countries where data on population distributions may be scarce, outdated, and unreliable, or to estimate temporal variations in population density. The work highlights how facilitating access to anonymized mobile phone data might enable fast and cheap production of population maps in emergency and data-scarce situations.
Chapter
In this work we analyze the architecture of real urban mobility networks from the multiplex perspective. In particular, based on empirical data about the mobility patterns in the cities of Bogotá and Medellín, each city is represented by six multiplex networks, each one representing the origin-destination trips performed by a subset of the population corresponding to a particular socioeconomic status. The nodes of each multiplex are the different urban locations, whereas links represent the existence of a trip from one node (origin) to another (destination). The different layers of each multiplex, in turn, correspond to the different existing transportation modes. By exploiting the characterization of multiplex transportation networks combining different transportation modes, we aim at characterizing the mobility patterns of each subset of the population. Our results show that the socioeconomic characteristics of the population have an extraordinary impact on the layer organization of these multiplex systems.
Conference Paper
We study the structure of the social graph of mobile phone users in the country of Mexico, with a focus on demographic attributes of the users (more specifically the users' age). We examine assortativity patterns in the graph, and observe a strong age homophily in the communication preferences. We propose a graph-based algorithm for the prediction of the age of mobile phone users. The algorithm exploits the topology of the mobile phone network, together with a subset of known users' ages (seeds), to infer the age of the remaining users. We provide the details of the methodology, and show experimental results on a network GT with more than 70 million users. By carefully examining the topological relations of the seeds to the rest of the nodes in GT, we find topological metrics which have a direct influence on the performance of the algorithm. In particular we characterize subsets of users for which the accuracy of the algorithm is 62% when predicting between 4 age categories (whereas a pure random guess would yield an accuracy of 25%). We also show that we can use the probabilistic information computed by the algorithm to further increase its inference power to 72% on a significant subset of users.