All content in this area was uploaded by Costas Panagiotakis on Jan 01, 2023
Personalized Itinerary Recommendation via
Expectation-Maximization
Costas Panagiotakis
Department of Management Science and Technology
Hellenic Mediterranean University
72100 Agios Nikolaos, Crete, Greece
Email: cpanag@hmu.gr
Evangelia Daskalaki
Department of Electrical and Computer Engineering
Hellenic Mediterranean University
71004 Heraklion, Crete, Greece
Email: eva@ics.forth.gr
Harris Papadakis
Department of Electrical and Computer Engineering
Hellenic Mediterranean University
Heraklion 71004, Crete, Greece
Email: adanar@hmu.gr
Paraskevi Fragopoulou
Department of Electrical and Computer Engineering
Hellenic Mediterranean University
71004 Heraklion, Crete, Greece
Email: fragopou@ics.forth.gr
Abstract—The personalized itinerary recommendation problem
consists in selecting a subset of locations to visit from among a larger
set while maximizing the benefit for the tourist. In this work, we
propose an efficient deterministic method for the recommenda-
tion of personalized itineraries consisting of a sequence of Points
of Interest (POIs) that maximizes the expected user satisfaction
and adheres to user time constraints. Experimental results on a
large number of synthetic and real-world datasets demonstrate
the high performance of our framework.
Index Terms—Recommender Systems, Orienteering Problem
I. INTRODUCTION
Recommender Systems predict the preferences of users
for specific items, based on collective analysis of prior user
preferences [1]. They have become increasingly popular in
assisting users in the decision making process. Recommender
systems have been applied successfully to the important and
complex task of planning and scheduling tour itineraries, which
comprise sequences of Points of Interest (POIs) based on the
unique preferences of individual tourists [2]. The selection of
the most valuable POIs is not trivial, owing to the constraints
and parameters of the problem as well as the limitations of each
individual tourist. In this work, our main goal is to provide a
sequence of POIs that maximizes user satisfaction under several
given constraints, such as the user time budget, the POIs’ opening
hours, and spatial constraints (e.g. the user’s start and end points
and the POIs’ locations).
Figure 1 depicts an instance of a personalized tour itinerary,
where the user starts at point 6 and ends at point 4. In this
example, a solution provided by the proposed framework is
plotted with red color, and consists of the three POIs (10, 16,
11) with the highest user satisfaction score. The size and the
color of a POI correspond to the duration of the visit and the
gained user satisfaction, respectively. Additionally, all graph
edges are assigned a travelling time. According to the proposed
timetable (see Fig. 1(b)), the tour starts at 10:00 and ends at
12:53, respecting the user’s time budget (10:00 to 13:00).
[Figure 1: (a) 2D map of 16 numbered POIs (both axes 0 to 100) with a colour bar from 0 to 1; (b) timetable of the itinerary; (c) schema with the blocks User Preferences, Recommender System, POIs Satisfaction, POIs Opening Hours, User Time Budget, POIs Travel Times, User Start and End points, PIREM, SR, and Itinerary.]
Fig. 1. (a),(b) An example of a personalized itinerary on a 2D map with 16
POIs. The proposed itinerary consists of three POIs (10, 16, 11). The tour
should start at point 6 at 10:00 and end at point 4 before 13:00 (user time
budget: 10:00-13:00). (a) A map of 16 POIs, where each POI is drawn by a
circle. The proposed itinerary is indicated by the red line. (b) The timetable
of the personalized itinerary. (c) A schema of the proposed framework.
The main contribution of this work concerns the formulation
of the Personalized Itinerary Recommendation (PIR) prob-
lem based on the maximization of an appropriate objective
function, leading to a high performance and computation-
ally efficient deterministic method. For each visited POI,
the proposed objective function takes into account user’s
satisfaction, the POI’s visit duration as well as the number
of already selected POIs in order to achieve higher values as
the number of POIs in the itinerary increases. In addition, the
gained user satisfaction is related to the POI’s visit duration,
which makes the proposed objective function more realistic.
Another significant contribution concerns the applicability of
the proposed method, as it can be easily combined with any
recommender system (see Figure 1(c)). Finally, the creation
of a large synthetic dataset, used to test the Personalized
Itinerary Recommendation under various parameters, consti-
tutes another important contribution of this work.
II. RELATED WORK
Many prior studies formulated the Itinerary Recommenda-
tion as a variant of the Orienteering Problem (OP) [3] or the
Travelling Salesman Problem (TSP) [4]. These methods, however,
generally did not succeed in incorporating personalization
for individual users. In personalization-based approaches, the
main challenge remains to implicitly infer the preferences of
tourists and incorporate these as part of the recommended tour
itinerary [2].
The PersTour algorithm [5] considers both POI popularity
and user preferences to recommend suitable POIs for the user
to visit and the amount of time to spend at each one. The authors
of [6] adopt a two-phase heuristic approach combining an
artificial bee colony algorithm and a differential evolution
algorithm, taking into consideration the spatial heterogeneity
of POIs. The method proposed in [7] recommends emotionally
pleasing tours in a city. To quantify the extent to which
various urban locations are pleasant, data from a crowd-
sourcing platform was utilized. The selection of the best
itinerary is performed by first identifying the $M$ shortest paths
using Eppstein’s algorithm [8], and subsequently computing
the average rank for all locations in each of the first $m$
($m < M$) paths. At each exploration, the path with the
lowest (best) average rank is maintained. In [9], a Genetic
Algorithm (GA) is proposed to provide a travel plan consisting
of a set of highly-ranked tourist attractions and restaurants
with respect to several constraints. GA uses natural selection
and genetics principles to solve the optimization problem of
itinerary recommendation. It is based on multistage processing
that includes initialization, selection, crossover, and mutation
to generate and refine the candidate solution. AGAM [10] is
another genetic algorithm with crossover and mutation for
solving the same problem. Recently, additional practical tourism
constraints have been included in the tourist trip design problem,
such as mandatory visits, limits on the number of locations
of each type, and the order in which selected locations are
visited [11]. In [11], four methods based on the branch-and-check
approach are proposed to solve the classical problem with
these additional constraints. The master problem selects
a subset of locations satisfying all constraints except the
time-related ones, and these selections constitute candidate solutions.
III. PROBLEM DEFINITION
First, we define preliminaries concerning the input of our
approach. We assume a graph (e.g. a city map) with $n$ POIs
$P = \{p_1, \ldots, p_n\}$. Let $T$ be the $n \times n$ travelling time
matrix of the pair-wise travelling times between all POIs. $T$ can be
computed by applying Johnson’s algorithm [12] on the graph of POIs
in $O(n^2)$, under the assumption that the number of graph
edges is $O(n)$, which is usually true for city maps.
Additionally, for each POI $p_i$ the visit duration $d_i$ and the
opening time window $o_i$ are known. Without loss of generality,
we can assume that $p_1$ and $p_n$ are the given starting and
ending locations (POIs) of the tour. According to the problem
definition, the user provides the starting time $st$ and the time
budget $B$ of the tour. This means that the tour itinerary
should end at $st + B$ or earlier. $s_i$ denotes the user
satisfaction gained per hour by visiting POI $p_i$. In our framework,
$s_i$ is computed offline, e.g. by a recommender system based on user
preferences or other features (travel history, etc.), as depicted in
Figure 1(c). The output of our approach is an itinerary $c$, which
defines the visited POIs as well as the corresponding temporal
information. An itinerary $c$ is defined by a sequence
of triples, where each triple $(p_i, at_i, dt_i)$ comprises the
visited POI $p_i$ with the corresponding arrival time $at_i$ and
departure time $dt_i$. We denote by $v(c)$ the sequence of triples
$(p_i, at_i, dt_i)$ of itinerary $c$ for which it holds that $dt_i > at_i$.
For every $p_i \in v(c)$ with $i \geq 2$, the arrival time is given by
$at_i = dt_{i-1} + T_{p_{i-1}, p_i}$. The
itinerary should end at POI $p_n$, meaning that the last triple of $c$
is $c(|c|) = (p_n, at_n, dt_n)$ with $dt_n \leq st + B$,
so that the tour itinerary ends at time $st + B$ or earlier.
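The definitions above can be made concrete with a short Python sketch that builds the triples $(p_i, at_i, dt_i)$ from a visiting order and checks the time budget and the opening windows. The function names, the 0-based indexing, times expressed in decimal hours, the zero visit time at the start and end locations, and the rule that a visit must fit entirely inside one opening interval are assumptions of this sketch rather than details given in the text.

```python
from typing import List, Sequence, Tuple

Triple = Tuple[int, float, float]  # (POI index, arrival at_i, departure dt_i), in hours


def build_timetable(order: Sequence[int], st: float,
                    T: Sequence[Sequence[float]], d: Sequence[float]) -> List[Triple]:
    """Turn a visiting order (start POI first, end POI last) into triples."""
    triples: List[Triple] = [(order[0], st, st)]  # first triple (p_1, st, st)
    for i in range(1, len(order)):
        # at_i = dt_{i-1} + T[p_{i-1}, p_i]
        at = triples[-1][2] + T[order[i - 1]][order[i]]
        # Assumption: the end location is only reached, not visited.
        stay = 0.0 if i == len(order) - 1 else d[order[i]]
        triples.append((order[i], at, at + stay))
    return triples


def is_legal(triples: Sequence[Triple], st: float, B: float, o) -> bool:
    """Check the time budget and the opening windows of the visited POIs."""
    if triples[-1][2] > st + B:  # requires dt_n <= st + B
        return False
    # Assumption: a visit must lie entirely inside one opening interval.
    return all(any(lo <= at and dt <= hi for lo, hi in o[p])
               for p, at, dt in triples[1:-1])


# Hypothetical 3-POI instance: 0 = start, 1 = candidate visit, 2 = end.
T = [[0.0, 0.5, 1.0], [0.5, 0.0, 0.25], [1.0, 0.25, 0.0]]
d = [0.0, 1.0, 0.0]
o = [[(0.0, 24.0)], [(9.0, 14.0)], [(0.0, 24.0)]]
plan = build_timetable([0, 1, 2], st=10.0, T=T, d=d)
print(plan)                                 # arrival and departure time per POI
print(is_legal(plan, st=10.0, B=3.0, o=o))  # budget 10:00 to 13:00
```

The sketch applies no waiting time, mirroring the recurrence for $at_i$; an implementation that lets the tourist wait for a POI to open would need a different arrival rule.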
A. Evaluating an itinerary
Solving the itinerary recommendation problem amounts to
finding a legal itinerary $c^*$ (i.e. one satisfying the aforementioned
problem constraints) that maximizes an appropriately defined
objective function $F$, so that $c^* = \arg\max_{c \in LS} F(c)$, where
$LS$ is the set of legal itineraries according to the problem
constraints. In order to assess an itinerary, we propose an
objective function $F$ that has the following properties in order
to achieve the highest user satisfaction while respecting the
given problem constraints:
• For each POI $p_i$ of $c$, $F$ increases linearly with the
corresponding gained user satisfaction per hour $s_i$
multiplied by the visit duration $dt_i - at_i$. Intuitively,
the larger the gained satisfaction, the more preferable the
itinerary $c$.
• The number of visited points $|v(c)|$ slightly increases the
value of the objective function, so that when two legal
itineraries yield almost the same user satisfaction, the
larger itinerary is preferred.
• The value of the objective function is non-negative for legal
itineraries and $-\infty$ for non-legal ones.
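The aforementioned properties are captured by defining
the objective function $F(c)$ as follows:
$$F(c) = \begin{cases} (1 + \log(|v(c)|)) \cdot \sum_{(p_i, at_i, dt_i) \in c} s_i \cdot (dt_i - at_i), & \text{if } c \in LS \\ -\infty, & \text{if } c \notin LS \end{cases} \qquad (1)$$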
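As an illustration, Eq. (1) can be evaluated with a few lines of Python. This is a minimal sketch: the function name `objective`, the use of the natural logarithm, the `legal` flag supplied by the caller, and the value 0 for an itinerary without any visit (where the logarithm is undefined) are assumptions, and the numbers in the example are hypothetical.

```python
import math
from typing import Sequence, Tuple

Triple = Tuple[int, float, float]  # (POI index, arrival at_i, departure dt_i), in hours


def objective(triples: Sequence[Triple], s: Sequence[float], legal: bool = True) -> float:
    """Value of F(c) from Eq. (1) for an itinerary given as triples."""
    if not legal:
        return -math.inf                      # non-legal itineraries score -infinity
    # v(c): the triples with dt_i > at_i, i.e. the POIs that are actually visited
    visited = [(p, at, dt) for p, at, dt in triples if dt > at]
    if not visited:
        return 0.0                            # assumption: avoids log(0)
    gain = sum(s[p] * (dt - at) for p, at, dt in visited)
    return (1.0 + math.log(len(visited))) * gain  # natural log assumed


# Hypothetical satisfaction values per hour for 4 POIs (0 = start, 3 = end).
s = [0.0, 0.8, 0.5, 0.0]
one_visit = [(0, 10.0, 10.0), (1, 10.5, 11.5), (3, 12.0, 12.0)]
two_visits = [(0, 10.0, 10.0), (1, 10.5, 11.5), (2, 11.75, 12.25), (3, 12.5, 12.5)]
print(objective(one_visit, s))    # one visit: the log term vanishes
print(objective(two_visits, s))   # two visits: gain scaled by 1 + log(2)
```

With a single visit the factor $1 + \log(|v(c)|)$ equals 1, so the score is just the satisfaction gained; each extra visit raises the factor slightly, which is the tie-breaking behaviour described in the second property.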
According to Eq. 1, if $c' \supset c$ with $c, c' \in LS$,
then $F(c') > F(c)$. Therefore, as $c$ grows, e.g. by applying
an iterative algorithm, $F(c)$ increases. The
expected value of the objective function of the itinerary $c$
gives an upper limit of its current value $F(c)$, taking into
account that the duration of an itinerary is at
most $B$. Let $\bar{F}(c) \geq F(c)$ be the following simplified version
of the expected value of the objective function (see Eq. 2),
obtained under the assumption that the value of $F(c)$ increases linearly
with the duration of itinerary $c$, which is true according to Eq.
1 if we ignore the term $(1 + \log(|v(c)|))$.
$$\bar{F}(c) = \frac{B}{dt_n - at_1} \cdot F(c) \qquad (2)$$
In this formulation, the total duration of itinerary $c$ is given
by the difference $dt_n - at_1$. According to the proposed
methodology, $\bar{F}(c)$ is maximized.
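The effect of Eq. (2) is easiest to see on numbers. The sketch below scales a given value of $F(c)$ by the ratio of the time budget to the current itinerary duration. The function name, the handling of a zero-duration itinerary, and the two example itineraries are hypothetical and serve only to illustrate the formula.

```python
import math


def expected_objective(F_value: float, at_first: float, dt_last: float, B: float) -> float:
    """Expected value of the objective, Eq. (2): B / (dt_n - at_1) * F(c)."""
    duration = dt_last - at_first
    if F_value == -math.inf or duration <= 0:
        return F_value            # assumption: nothing to extrapolate in these cases
    return B / duration * F_value


# Two hypothetical legal itineraries, both starting at 10:00 with B = 3 hours.
short = expected_objective(0.8, at_first=10.0, dt_last=11.75, B=3.0)
long_ = expected_objective(0.9, at_first=10.0, dt_last=12.75, B=3.0)
print(short, long_)
```

Here the longer itinerary has the higher current value of $F$ (0.9 against 0.8), but the shorter one leaves more of the budget unused and therefore obtains the higher expected value, so it is the one a method maximizing Eq. (2) would keep extending.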
IV. THE PIREM ALGORITHM
According to the proposed method, the itinerary recommendation
problem is solved by sequentially adding the most
suitable unvisited POI to the current itinerary, i.e. the one that
maximizes the expected value of the objective function as
defined in Eq. (2). The inputs of the proposed method are
the variables $P$, $st$, $B$, $T$, $d_i$, $o_i$, $s_i$, $i \in \{1, \ldots, n\}$,
as described in Section III. The goal of the proposed method is to compute
a solution for the PIR problem. The set $S$ of the indexes of
visited POIs is initialized to the empty set, while the first triple
of $c^*$ is set equal to $(p_1, st, st)$, according to the problem
definition. In the main loop of the proposed PIREM algorithm,
we get the set $U$ of the unvisited POI indexes, which is
used to find the next visited POI. In the computation of $U$,
we ignore POI $p_n$ ($U = \{1, \ldots, n-1\} - S$), since it is
always inserted after the last POI of $c^*$ ($c^*(|c^*|)$) at the end
of the method. This loop terminates when an iteration makes no
changes or the set $U$ is empty.
Subsequently, we evaluate whether the insertion of each
unvisited POI $p_k$, $k \in U$, at position $m$ of the current
optimal itinerary $c^*$ is legal according to the problem
constraints and whether it improves the current optimal value
$FB$ (the expected value of the objective function). The insertion of $p_k$
at position $m$ of $c$ is valid if both of the
following statements are true:
1) All the visited POIs of $c$ are open during their visits.
2) The tour $c$ ends at time $st + B$ or earlier.
Finally, we check whether the insertion of $p_k$ improves the current
optimal value $FB$ and, if so, we update $FB$. The current optimal
itinerary $c^*$ and the set $S$ are updated in this loop, so that the most
suitable POI is inserted at the most suitable position of $c^*$.
As a consequence of the proposed expectation maximization based
criterion, an itinerary of short duration is considered more promising
and is preferred as the optimal itinerary $c^*$ to be
extended over an itinerary of long duration with a similar value
of the objective function.
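The insertion procedure described in this section can be sketched in Python as follows. This is one possible reading of the description, not the authors' implementation: in every iteration it tries each unvisited POI at each position, keeps the legal insertion with the highest expected value of Eq. (2), and stops when no legal insertion exists. The function names, 0-based indexing (POI 0 is the start, POI n-1 the end), times in decimal hours, the natural logarithm, the zero visit time at the end location, and the comparison of candidates within a single iteration are all assumptions.

```python
import math


def expected_value(order, st, B, T, d, o, s):
    """Expected objective of Eq. (2) for a visiting order, or -inf if not legal."""
    t, gain, visits = st, 0.0, 0
    for i in range(1, len(order)):
        at = t + T[order[i - 1]][order[i]]       # at_i = dt_{i-1} + travel time
        if i == len(order) - 1:                  # end location: reached, not visited
            t = at
            break
        dt = at + d[order[i]]
        if not any(lo <= at and dt <= hi for lo, hi in o[order[i]]):
            return -math.inf                     # POI closed during the visit
        gain += s[order[i]] * (dt - at)
        visits += 1
        t = dt
    if t > st + B:
        return -math.inf                         # time budget exceeded
    if visits == 0 or t == st:
        return 0.0
    F = (1.0 + math.log(visits)) * gain          # Eq. (1)
    return B / (t - st) * F                      # Eq. (2)


def pirem(st, B, T, d, o, s):
    """Greedy insertion: add the POI/position pair with the best expected value."""
    n = len(d)
    order = [0, n - 1]                           # start and end locations
    while True:
        best_val, best_order = -math.inf, None
        for k in range(1, n - 1):                # candidate POIs
            if k in order:
                continue
            for m in range(1, len(order)):       # candidate insertion positions
                cand = order[:m] + [k] + order[m:]
                val = expected_value(cand, st, B, T, d, o, s)
                if val > best_val:
                    best_val, best_order = val, cand
        if best_order is None:                   # no legal insertion is left
            return order
        order = best_order


# Hypothetical instance with four POIs placed on a line.
T = [[0.0, 0.5, 0.25, 1.0], [0.5, 0.0, 0.25, 0.5],
     [0.25, 0.25, 0.0, 0.75], [1.0, 0.5, 0.75, 0.0]]
d = [0.0, 1.0, 0.5, 0.0]
s = [0.0, 0.9, 0.4, 0.0]
o = [[(9.0, 24.0)] for _ in range(4)]
print(pirem(9.0, 3.0, T, d, o, s))
print(pirem(9.0, 2.0, T, d, o, s))
```

With a budget of 3 hours both candidate POIs fit, and the order that wastes less travel time is chosen; with a budget of 2 hours only the more rewarding POI can be visited.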
The PIREM-SR algorithm: The solution of PIREM
may be a local maximum of the objective function due to the
sequential optimization. Thus, we propose an extra optional
step that improves the PIREM solution via a better exploration
of the search space, as follows:
1) Successive replacements of the visited POIs
with unvisited ones are performed. In each iteration of
the main loop of this step, we select the replacement that
most improves the value of the objective function.
2) If the value of the objective function cannot
be improved further, the method terminates.
We denote this variant of the algorithm with the extra step
of successive replacements as PIREM-SR. The
solutions proposed by PIREM-SR are better than or equivalent to
the corresponding solutions of PIREM, since the extra step
can only result in an improvement.
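The successive-replacement step can be sketched in the same style. The sketch below assumes that a replacement swaps one visited POI for one unvisited POI at the same position of the itinerary, that candidates are compared using the objective $F$ of Eq. (1), and that the loop stops as soon as no replacement strictly improves $F$. Function names, 0-based indexing, and the example data are hypothetical.

```python
import math


def objective(order, st, B, T, d, o, s):
    """F of Eq. (1) for a visiting order; -inf if a constraint is violated."""
    t, gain, visits = st, 0.0, 0
    for i in range(1, len(order)):
        at = t + T[order[i - 1]][order[i]]
        if i == len(order) - 1:                  # end location: reached, not visited
            t = at
            break
        dt = at + d[order[i]]
        if not any(lo <= at and dt <= hi for lo, hi in o[order[i]]):
            return -math.inf
        gain += s[order[i]] * (dt - at)
        visits += 1
        t = dt
    if t > st + B:
        return -math.inf
    return (1.0 + math.log(visits)) * gain if visits else 0.0


def successive_replacements(order, st, B, T, d, o, s):
    """Repeatedly apply the single replacement that raises F the most."""
    order = list(order)
    best = objective(order, st, B, T, d, o, s)
    while True:
        unvisited = [k for k in range(1, len(d) - 1) if k not in order]
        top_val, top_order = best, None
        for m in range(1, len(order) - 1):       # positions of visited POIs
            for k in unvisited:
                cand = order[:m] + [k] + order[m + 1:]
                val = objective(cand, st, B, T, d, o, s)
                if val > top_val:                # strict improvement only
                    top_val, top_order = val, cand
        if top_order is None:                    # no replacement improves F
            return order
        order, best = top_order, top_val


# Hypothetical instance: the budget allows a single one-hour visit.
T = [[0.0 if i == j else 0.5 for j in range(4)] for i in range(4)]
d = [0.0, 1.0, 1.0, 0.0]
s = [0.0, 0.3, 0.9, 0.0]
o = [[(9.0, 24.0)] for _ in range(4)]
print(successive_replacements([0, 1, 3], 9.0, 2.5, T, d, o, s))
```

Because every accepted replacement strictly increases $F$ and the number of itineraries is finite, the loop always terminates, and its result is never worse than the itinerary it started from.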
V. EXPERIMENTAL EVALUATION
In our experiments, we have created 2048 different
experimental setups on 64 synthetic datasets using several
problem parameters. Our intention is to provide a large number
of random experimental setups with realistic
default parameter values, in order to be able to fairly
compare all the methods under almost real conditions. Each
of the 64 synthetic datasets is generated by adding $n$ POIs at
random positions on a 2D map, where $n \in \{8, 16, 24, 32, 40\}$.
The roads (edges) of each map are generated as follows:
we sequentially connect the closest POIs according to the
following rule. An edge is created if the distance between its
middle point and the rest of the edges exceeds a predefined
threshold, so as not to create edges that are very close to
each other. In order to create the 64 synthetic datasets, we have
created 16 maps for every value of $n$ following the aforementioned
procedure. Subsequently, we set the parameters for each
POI $p_i$ of a synthetic dataset. Parameters $d_i$ and $o_i$ are selected
randomly from $\{0.25, 0.5, 0.75, 1\}$ and {[9:00, 24:00], [12:00,
21:00], [9:00, 14:00], [14:00, 24:00], [9:00, 14:00] ∪ [17:00,
21:00]}, respectively. Finally, for each synthetic dataset, we
create 32 different experimental setups by randomly selecting
the starting and ending locations of the tour from the available
POIs. For each setup, we set the starting time of the tour to 9:00
($st$ = 9:00), while the time budget $B$ is randomly selected from
$\{5, 6, 7, 8\}$. The value of parameter $s_i$ is randomly selected in
$[0, 1]$. One example of the synthetic datasets with $n = 16$ is
illustrated in Fig. 1.
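The parameter sampling described above can be sketched as follows. The sketch covers only the random parameters of one experimental setup; the road (edge) generation is omitted because the distance threshold is not specified. The function name, the dictionary layout, the 100 × 100 map extent, uniform sampling, durations and budgets expressed in hours, and the requirement that the start and end locations differ are assumptions.

```python
import random

DURATIONS = [0.25, 0.5, 0.75, 1.0]                       # candidate values of d_i
WINDOWS = [[(9, 24)], [(12, 21)], [(9, 14)], [(14, 24)],
           [(9, 14), (17, 21)]]                          # candidate values of o_i
BUDGETS = [5, 6, 7, 8]                                   # candidate values of B


def sample_setup(n: int, rng: random.Random) -> dict:
    """Draw the random parameters of one synthetic experimental setup."""
    start, end = rng.sample(range(n), 2)                 # assumption: start != end
    return {
        # Assumption: POIs are placed uniformly on a 100 x 100 map.
        "positions": [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(n)],
        "d": [rng.choice(DURATIONS) for _ in range(n)],  # visit durations
        "o": [rng.choice(WINDOWS) for _ in range(n)],    # opening time windows
        "s": [rng.random() for _ in range(n)],           # satisfaction in [0, 1)
        "start": start,
        "end": end,
        "st": 9.0,                                       # tours start at 9:00
        "B": rng.choice(BUDGETS),
    }


setup = sample_setup(16, random.Random(0))
print(setup["start"], setup["end"], setup["B"])
```

Passing an explicitly seeded `random.Random` instance keeps each setup reproducible, which matters when several methods have to be compared on exactly the same instances.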
Additionally, in order to test our method with real data,
we used two real datasets from the cities of Vienna and Budapest
presented in [5]. The Vienna and Budapest datasets comprise a
set of users and their visits to $n = 28$ and $n = 38$ POIs,
respectively. For the real datasets, we create 256 different
experimental setups following the same procedure applied to
the synthetic datasets (see previous paragraph). The value of
parameter $s_i$ of a POI is given by the ratio of the POI visits
according to the data provided by [5].
In our experiments, we have included the proposed methods
PIREM and PIREM-SR. In order to show the importance of the
EM criterion, we have implemented two variants of the proposed
methods that maximize the value of the objective function
$F(c)$ instead of its expected value $\bar{F}(c)$ of Eq. (2), called
PIRM and PIRM-SR. Moreover, to evaluate the performance of the
proposed method, we compared it against the PIR methods of [7]
and [9], described in Section II. Hereafter, the method proposed
in [7], which is based on shortest paths, is called SPM, and the
genetic algorithm proposed in [9] is called GA. Both methods have
been modified to maximize the proposed objective function
$F(c)$. The itineraries provided by the aforementioned methods
are evaluated according to the objective function $F(c)$, which
measures their quality (user satisfaction). Moreover, we have
evaluated the methods’ computational efficiency by measuring
their execution times. All the analysis has been done using
MATLAB 2020a on an Intel Core i7 at 3.20 GHz with 32 GB of
RAM.
Table I presents the average values of the objective function $F$
for the six methods on the synthetic (for various values of
$n$) and real datasets. The proposed PIREM-SR
method clearly outperforms all methods for any map size
with $n \geq 24$. PIREM also shows high performance,
since it outperforms all methods other than PIREM-SR for any map
size with $n \geq 24$. The next best results
are obtained by GA, which outperforms all the methods for
small maps ($n = 16$). This can be explained by the fact that,
with the parameters used for GA, when $n = 16$ it appears that
GA exhaustively searches the solution space, yielding high
performance. This also becomes clear when comparing the
execution times of GA and PIREM. Low performance results
are obtained by SPM and PIRM. PIREM clearly outperforms
PIRM due to the proposed expectation maximization criterion,
which extends the more promising itineraries by taking into account
their duration as well as the current value of the objective
function.
TABLE I
THE AVERAGE VALUES OF F ON THE SYNTHETIC AND REAL DATASETS.
            Synthetic Datasets                           Real Datasets
Method      n = 16   n = 24   n = 32   n = 40   Average   Vienna   Budapest   Average
PIREM       7.37     8.33     8.58     8.58     8.22      5.77     6.12       5.94
PIREM-SR    7.47     8.43     8.67     8.69     8.32      5.86     6.16       6.01
PIRM        6.26     6.62     6.28     6.56     6.43      4.97     4.59       4.78
PIRM-SR     6.27     6.64     6.31     6.58     6.45      5.05     4.60       4.82
SPM         5.53     6.26     6.39     6.90     6.27      4.00     4.68       4.34
GA          7.51     8.08     8.10     8.19     7.97      5.26     5.53       5.39
[Figure 2: plot of the precision Pr (vertical axis, 0 to 0.8) against n (horizontal axis: 16, 24, 32, 40) for PIREM, PIREM-SR, PIRM, PIRM-SR, SPM, and GA.]
Fig. 2. The average Precision of each method for different values of n.
The results on the real datasets largely agree with the
corresponding results on the synthetic datasets. The methods’
ranking obtained for each value of $n$ and for each real dataset largely
agrees with the average results (see the Average columns of Table I)
for synthetic and real datasets, respectively. Figure 2 shows the
average precision of each method for different values of $n$ on
the synthetic datasets. For each method, the precision is computed
as the percentage of datasets on which the method yields the
best itinerary according to the objective function $F(c)$
over all methods. The results of Fig. 2 agree with the results
of Table I concerning the ranking of the methods.
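The precision measure can be illustrated with a small Python sketch. The function name, the input layout (one dictionary of scores per experiment), and the choice to credit every method that ties for the best score are assumptions; the scores in the example are made up and are not taken from the experiments.

```python
from typing import Dict, List


def precision(scores: List[Dict[str, float]]) -> Dict[str, float]:
    """Fraction of experiments in which each method reaches the best F value."""
    wins = {method: 0 for method in scores[0]}
    for row in scores:
        top = max(row.values())
        for method, value in row.items():
            if value == top:          # assumption: tied methods all count as best
                wins[method] += 1
    return {method: count / len(scores) for method, count in wins.items()}


# Made-up F values of two methods on four experiments.
example = [
    {"A": 1.0, "B": 2.0},
    {"A": 3.0, "B": 3.0},
    {"A": 0.5, "B": 0.4},
    {"A": 1.0, "B": 2.0},
]
print(precision(example))
```

Under this tie rule the precision values of the methods need not sum to 1, which is relevant whenever two methods often return the same itinerary.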
Concerning the extra step of successive replacements
(method PIREM-SR), it slightly improves the
results of PIREM in about 50% of the 2048 experiments
performed. According to the results of Table I, in the cases where
PIREM-SR provides an improvement, the
user satisfaction obtained by PIREM-SR is on average about 2.5% higher
than the user satisfaction obtained by PIREM.
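As a rough consistency check of this figure against the synthetic averages of Table I, the following back-of-the-envelope estimate can be made. It assumes that PIREM and PIREM-SR obtain identical scores in the setups that are not improved, and it treats a ratio of averages as a proxy for an average of ratios, so it is only indicative:

```latex
\frac{8.32 - 8.22}{8.22} \approx 0.012,
\qquad
\frac{0.012}{0.5} \approx 0.024 \quad (\approx 2.4\%)
```

The overall gain of about 1.2%, concentrated in roughly half of the setups, corresponds to a gain of about 2.4% in the improved cases, which is in line with the reported value of about 2.5%.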
The proposed method shows higher computational efficiency
than SPM and GA. On average,
PIREM-SR is about 110 and 180 times faster than SPM and
GA, respectively. The average execution time of
the PIREM-SR method on the synthetic datasets is 0.015 s.
PIREM appears to be about 270 and 450 times
faster than SPM and GA, respectively.
VI. CONCLUSIONS
In this work, we proposed a new efficient deterministic
method that recommends personalized tours consisting of
a sequence of Points of Interest (POIs) and is based on EM.
More specifically, we proposed the PIREM and PIREM-SR
methods, which sequentially select unvisited POIs taking into
account user interests, the user time budget, POI opening hours, and
spatial constraints. Our experimental evaluation showed
that the proposed method can be successfully applied to real
and synthetic datasets, providing high performance by
maximizing user satisfaction while adhering to the user time budget.
ACKNOWLEDGEMENTS
This research has been co-financed by the European Union
and Greek national funds through the Operational Program
Competitiveness, Entrepreneurship and Innovation, under the
call RESEARCH - CREATE - INNOVATE B cycle (project
code: T2EDK-03135).
REFERENCES
[1] C. Panagiotakis, H. Papadakis, A. Papagrigoriou, and P. Fragopoulou,
“Improving recommender systems via a dual training error based correc-
tion approach,” Expert Systems with Applications, vol. 183, p. 115386,
2021.
[2] K. H. Lim, J. Chan, S. Karunasekera, and C. Leckie, “Tour recommen-
dation and trip planning using location-based social media: a survey,”
Knowledge and Information Systems, vol. 60, no. 3, pp. 1247–1275,
2019.
[3] P. Vansteenwegen, W. Souffriau, and D. Van Oudheusden, “The orien-
teering problem: A survey,” European Journal of Operational Research,
vol. 209, no. 1, pp. 1–10, 2011.
[4] G. Gutin and A. P. Punnen, The traveling salesman problem and its
variations. Springer Science & Business Media, 2006, vol. 12.
[5] K. H. Lim, J. Chan, C. Leckie, and S. Karunasekera, “Personalized trip
recommendation for tourists based on user interests, points of interest
visit durations and visit recency,” Knowledge and Information Systems,
vol. 54, no. 2, pp. 375–406, 2018.
[6] H. Ji, W. Zheng, X. Zhuang, and Z. Lin, “Explore for a day?
generating personalized itineraries that fit spatial heterogeneity of
tourist attractions,” Information & Management, vol. 58, no. 8,
p. 103557, 2021. [Online]. Available: https://www.sciencedirect.com/
science/article/pii/S0378720621001312
[7] D. Quercia, R. Schifanella, and L. M. Aiello, “The shortest path to
happiness: Recommending beautiful, quiet, and happy routes in the city,”
in Proceedings of the 25th ACM conference on Hypertext and social
media, 2014, pp. 116–125.
[8] D. Eppstein, “Finding the k shortest paths,” SIAM Journal on computing,
vol. 28, no. 2, pp. 652–673, 1998.
[9] B. S. Wibowo and M. Handayani, “A genetic algorithm for generating
travel itinerary recommendation with restaurant selection,” in IEEE
International Conference on Industrial Engineering and Engineering
Management, 2018, pp. 427–431.
[10] P. Yochum, L. Chang, T. Gu, and M. Zhu, “An Adaptive Genetic
Algorithm for Personalized Itinerary Planning,” IEEE Access, vol. 8,
pp. 88147–88157, 2020.
[11] D. M. Vu, Y. Kergosien, J. E. Mendoza, and P. Desport, “Branch-
and-check approaches for the tourist trip design problem with rich
constraints,” Computers and Operations Research, vol. 138,
p. 105566, 2022.
[12] D. B. Johnson, “Efficient algorithms for shortest paths in sparse net-
works,” Journal of the ACM (JACM), vol. 24, no. 1, pp. 1–13, 1977.