
Collaborative Filtering in Recommender Systems: Technicalities, Challenges, Applications, and Research Trends

PRADEEP KUMAR SINGH, PIJUSH KANTI DUTTA PRAMANIK, and
PRASENJIT CHOUDHURY
Department of Computer Science and Engineering,
National Institute of Technology Durgapur, India
E-mail: pijushjld@yahoo.co.in (P. K. D. Pramanik)

ABSTRACT
The rapid development and extensive use of recommender systems (RSs) have changed the face of the online service experience. The enormous volume of data generated, and the complexity involved in analyzing these data for effective recommendation, have attracted researchers from different domains, especially data analytics. In this direction, collaborative filtering (CF) has been the most widely considered approach. The objective of this chapter is to present a comprehensive study of CF. The chapter is written in a tutorial fashion so that it can be followed by readers who are beginners in this field or unfamiliar with RSs. Different aspects of CF, such as classifications, approaches, data extraction methods, similarity metrics, prediction approaches, and performance metrics, are studied meticulously. The application of CF in different domains is reviewed. More than 100 research articles are surveyed and categorized according to the application domain of CF they cover. The challenges involved in the successful adoption of CF are examined. In addition to a brief survey on CF, a systematic survey of current research trends (2011–2017) in CF, considering 277 related papers, is presented. Future directions of CF are also discussed.
184 New Age Analytics

The recommender system (RS) has become the backbone of e-commerce. In addition to a basic search facility, every e-commerce portal is adopting an RS as an integral part of it. As the e-commerce market grows continuously, more products and services are made available for online purchase. Amid this sea of online products and services, customers find it very difficult to locate the item appropriate for them. E-commerce vendors address this difficulty by recommending items that the customer might like or desire. The technical scheme that enables this recommendation process is termed an RS. An RS attempts to predict the items that prospective online buyers may prefer and recommends these anticipated items. Unlike search tools, with which people ferret out online products themselves, recommendation engines aim to actively draw users' attention to products they are likely to like. The overall objective is to relieve users from explicit and tiresome searching and to improve the online shopping experience. The success of an e-business largely depends on the intelligence of the algorithm used for product recommendation. Hence, in the age of digital marketing, it is crucial for online stores to adopt intelligent recommendation techniques in order to survive market competition. Companies such as Flipkart, Amazon, eBay, Netflix, MovieLens, and IMDb use RSs extensively and innovatively as a core part of their business innovation and exploration.
An RS assesses the preferences and choices of users by tracking and analyzing their buying and browsing habits and history. The tool used for this purpose is generally known as the filtering approach. A filtering approach selects what to present from an array of available commodities using various filtering parameters, so that the filtered products are more favorable to the recipient. Several filtering approaches for RSs appear in the literature: (i) content-based (CB), (ii) CF, (iii) hybrid filtering (HB), (iv) knowledge-based (KB), and (v) context-aware (CA), as shown in Figure 8.1. Among these, CF has been the most popular filtering approach over the past few years (Burke, 2002). CF works by comparing users' activities, purchases, ratings, and preferences and using these data for subsequent analysis. Customers prefer products that have been liked or given higher preference by people with similar taste (Deshpande and Karypis, 2004). Hence, CF is the most important approach in this regard.
FIGURE 8.1 A general framework for the recommender system.
CF attempts to guess the target user's interests by assessing the interests of the top-n similar users, on the assumption that if two persons' choices match for certain things, it is highly probable that their choices will match for other things as well. CF assumes that the ratings given by similar users tend to be substantively similar and that similar items tend to receive similar ratings. CF algorithms exploit this assumption for recommendation and use the similarity value to predict user preferences. Similarity allows recommender engines to find users' purchase patterns and to understand how similar those rating patterns are to those of other users. All rating information is stored in memory and used for prediction when building the top-n recommendation list. Similar users or items contribute heavily to the prediction phase of a CF-based RS, so the top-n list of recommended items is affected if the similarity computation gives a wrong result. CF uses two approaches for considering similarity (Singh, Pramanik, and Choudhury, 2018):
i. User Similarity-Based Approach (USBA): Predicts the rating based on rating information collected from similar users; and
ii. Item Similarity-Based Approach (ISBA): Uses the same idea as USBA, but with item similarity instead of user similarity.
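The two approaches operate on the same rating data viewed from different sides. As a minimal sketch, assuming the ratings are held in a nested dictionary (a hypothetical layout chosen only for illustration, not one prescribed by the chapter), USBA would compare per-user rating profiles, while ISBA would compare the per-item profiles obtained by transposing them:

```python
from collections import defaultdict
from typing import Dict

# Hypothetical layout (an assumption): user id -> {item id: rating}
UserRatings = Dict[str, Dict[str, float]]


def transpose_ratings(user_ratings: UserRatings) -> Dict[str, Dict[str, float]]:
    """Turn user -> {item: rating} into item -> {user: rating}."""
    item_ratings: Dict[str, Dict[str, float]] = defaultdict(dict)
    for user, rated_items in user_ratings.items():
        for item, rating in rated_items.items():
            item_ratings[item][user] = rating
    return dict(item_ratings)


user_ratings = {
    "alice": {"m1": 5.0, "m2": 3.0},
    "bob": {"m1": 4.0},
}
item_ratings = transpose_ratings(user_ratings)
# USBA would compare user_ratings["alice"] with user_ratings["bob"];
# ISBA would compare item_ratings["m1"] with item_ratings["m2"].
```

The user and item identifiers above are made up; any hashable keys would serve equally well.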
After the similarity values of users and items are computed, prediction approaches are used to predict the rating of a target item for a target user. The CF-based RS then generates a list of top-n items and recommends it to the target user.
The remainder of the chapter is structured as follows. Several application domains of RSs are described in Section 8.2, where more than 100 papers are surveyed and categorized according to the application domain they address and the filtering approach they use. Section 8.3 describes the different CF approaches. The working principle of neighborhood-based CF is explained in Section 8.4. Section 8.5 introduces the data extraction methods used in CF. Section 8.6 explains the similarity metrics used in CF algorithms, while a comparative study of different similarity metrics is presented in Section 8.7. Sections 8.8 and 8.9 discuss the prediction approaches and performance metrics used in CF-based RSs, respectively. The challenges in CF-based RSs, as well as the security and trust attacks, are discussed in Section 8.10. Section 8.11 mentions some of the notable works on CF. Section 8.12 presents the research trends in CF-based RSs, for which 277 related papers are studied. The future scope of CF-based RSs is discussed briefly in Section 8.13. Finally, the conclusion of the chapter is presented in Section 8.14.

RSs have found many application domains. A few of them are mentioned below:
1. E-Government: The government's use of the internet and computers to deliver services to citizens. It is an effective modern means for the government to connect with people across the country.
2. E-Library and E-Learning: The delivery of education to individuals entirely over the internet with the help of electronic devices. It is a formal way of delivering education through electronic resources.
3. E-Tourism: The digital process implemented to realize the e-commerce strategy in tourism. It also helps keep clients connected with travel partners, and it provides an excellent medium for a company's marketing and promotion.
4. E-Resource: Any resource or collection preserved in electronic format. This type of resource requires an electronic device to access the information. Since the resource is available in electronic format, huge sets of data can be made available for access.
5. E-Commerce: Any commercial transaction, exchange, or transfer of data carried out via the internet is termed electronic commerce or e-commerce. It is the fastest method of conducting business in the modern world and thus leads to the digitization of society.
The performance of these applications can be improved using memory-based CF. People can easily provide their opinions about the services of these applications and, as a result, receive more personalized, diverse, novel, and accurate recommendations. Table 8.1 lists different application domains of RSs, together with notable research works in each domain and the filtering approaches used in those works.
TABLE 8.1 Application Domains of Recommender Systems, Notable Works Towards That Domain, and Filtering Approach Used

| Application Domain | Filtering Approach | Recommender System |
|---|---|---|
| E-government | Knowledge-based | Meo, Quattrone, and Ursino, 2008; Teran and Meier, 2010; Esteban et al., 2014; Cornelis et al., 2007 |
| E-government | Collaborative | Guo and Lu, 2007 |
| E-government | Collaborative, Hybrid, Knowledge-based | Wu, Zhang, and Lu, 2015; Lu et al., 2010 |
| E-library and e-learning | Content-based, Collaborative, Hybrid | Balabanović and Shoham, 1997; Renda and Straccia, 2005 |
| E-library and e-learning | Hybrid, Knowledge-based | Porcel, López-Herrera, and Herrera-Viedma, 2009; Porcel, Herrera-Viedma, and Moreno, 2009; Porcel and Herrera-Viedma, 2010; Serrano-Guerrero et al., 2011; Cobos et al., 2013 |
| E-library and e-learning | Knowledge-based, Content-based | Zaíane, 2002; Chen, Duh, and Liu, 2004; Chen and Duh, 2008; Capuano et al., 2014; Farzan and Brusilovsky, 2006; Santos et al., 2014; Lu, 2004; Biletskiy et al., 2009 |
| E-tourism | Knowledge-based | Burke, Hammond, and Young, 1996; Fesenmaier et al., 2003; García-Crespo et al., 2011 |
| E-tourism | Knowledge-based, Collaborative, Context-aware, Hybrid | Avesani, Massa, and Tiella, 2005; Martínez, Rodríguez, and Espinilla, 2009; Ruotsalo et al., 2013; García-Crespo et al., 2009; Console et al., 2003; Moreno et al., 2013 |
| E-tourism | Content-based, Collaborative, Hybrid, Demographic | Schiaffino and Amandi, 2009; Luz et al., 2013; Baraglia et al., 2012 |
| E-tourism | Context-aware | Tung and Soo, 2004; Pashtan et al., 2003; Rikitianskii, Harvey, and Crestani, 2014; Xing et al., 2013 |
| E-tourism | Collaborative, Context-aware | Yanga and Hwang, 2013 |
| E-resource | Content-based | Jinni, 2017; Rotten Tomatoes, 2017; IMDb, 2017; Asnicar and Tasso, 1997; ACRnews, 2017; Chesnevar and Maguitman, 2004; Park, 2013 |
| E-resource | Collaborative | Ali and Van Stam, 2004; Konstan et al., 1997; FoxTrit, 2017; Miller, Konstan, and Riedl, 2004; Hauver and French, 2001; Marcel et al., 2003; Lee, Cho, and Kim, 2010; TASTEKiD, 2017; nanoCROWD, 2017; Movielens, 2017 |
| E-resource | Context-aware, Collaborative | Braunhofer, Kaminskas, and Ricci, 2013; Baltrunas et al., 2012; Levandoski et al., 2012; Natarajan, Shin, and Dhillon, 2013; Oh et al., 2014 |
| E-resource | Collaborative, Knowledge-based | Zhang, Zhou, and Zhang, 2011; Hayes and Cunningham, 2001; Sánchez et al., 2011; Boutet et al., 2013 |
| E-resource | Content-based, Collaborative, Hybrid | Smyth and Cotter, 2000; Blanco-Fernández et al., 2006; Salter and Antonopoulos, 2006; Melville, Mooney, and Nagarajan, 2002; Domingues et al., 2013; Christou, Amolochitis, and Tan, 2016; Parra, Brusilovsky, and Trattner, 2014; Amolochitis, Christou, and Tan, 2014 |
| E-resource | Knowledge-based, Content-based | Jäschke et al., 2007; Hotho et al., 2006; Celma and Serra, 2008; Bjelica, 2010; Moukas and Maes, 1998; Billsus and Pazzani, 2000; Nguyen, Lu, and Lu, 2014; Martín-Vicente et al., 2012; Zhang et al., 2012 |
| E-commerce | Knowledge-based, Demographic | Garfinkel et al., 2006; Mccarthy et al., 2004; Cao and Li, 2007; Hu et al., 2012; Zhao et al., 2016; Zhao et al., 2014 |
| E-commerce | Knowledge-based, Content-based | Burke, 1999; Nanopoulos et al., 2010; Zhang et al., 2013; Yin et al., 2014 |
| E-commerce | Collaborative, Hybrid | Pratikshashiv, 2015; Lawrence et al., 2001; Chen and Pu, 2012; Walter et al., 2012; Liu and Karger, 2015 |

The CF technique can be classified into two categories as shown in Figure 8.2
(Su and Khoshgoftaar, 2009):
i. Model-Based CF: It uses machine learning (ML) algorithms, such as Bayesian networks, clustering, and rule-based approaches, to build a model on the user-item rating dataset and then recommends items to the user.
ii. Neighborhood/Memory-Based CF: Similarity computation and prediction computation are the two major steps used in this category of CF.
FIGURE 8.2 Collaborative filtering techniques.


Figure 8.3 shows the conceptual framework of neighborhood-based CF (Yang et al., 2016). Neighborhood-based CF determines the closest neighbors using the following two algorithms:
i. User-Based CF Algorithm: A user similarity metric is used to find the nearest neighbors. The rating values of these neighbors and their similarity values are used to predict the target user's unrated items and to form the top-n recommendation list.
ii. Item-Based CF Algorithm: The nearest neighbors are determined using the similarity values of items. These similarity values and the rating values of the neighbors are used to form the recommendation list for the user.
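Both algorithms share the same neighbor-selection step and differ only in what a rating profile describes. The sketch below is illustrative rather than the chapter's implementation: `top_n_neighbors` and `shared_rating_count` are hypothetical names, and the stand-in similarity (the number of co-rated entries) is an assumption that would be replaced by one of the metrics discussed later in the chapter.

```python
from typing import Callable, Dict, List, Tuple

Profile = Dict[str, float]  # co-entity id -> rating
Similarity = Callable[[Profile, Profile], float]


def shared_rating_count(a: Profile, b: Profile) -> float:
    """Stand-in similarity (an assumption): number of co-rated entries."""
    return float(len(set(a) & set(b)))


def top_n_neighbors(target: str, profiles: Dict[str, Profile],
                    similarity: Similarity, n: int) -> List[Tuple[str, float]]:
    """Return the n profiles most similar to `target`, best first."""
    scored = [
        (other, similarity(profiles[target], profile))
        for other, profile in profiles.items()
        if other != target
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Item-based view: each profile maps user id -> rating for one item.
# A user-based view would map item id -> rating for one user instead.
items = {
    "A": {"u1": 5.0, "u2": 3.0},
    "B": {"u1": 4.0, "u2": 2.0, "u3": 1.0},
    "C": {"u3": 5.0},
}
nearest = top_n_neighbors("A", items, shared_rating_count, n=1)
```

Passing user profiles instead of item profiles turns the same function into the user-based variant.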
FIGURE 8.3 A conceptual framework for neighborhood-based collaborative filtering.
Table 8.2 shows the descriptions of the notations used in this chapter.


CF uses ratings in the recommendation process. Two types of ratings are used in CF for recommendation: explicit ratings and implicit ratings (Li et al., 2018).
TABLE 8.2 Notations and Their Descriptions

| Notation | Description |
|---|---|
| $Sim(i,j)$ | Similarity between two items i and j |
| $R_{u,i}$ | Rating value of user u on item i |
| $\bar{R}_u$ | Average or mean rating value of user u |
| $\lvert U_{i,j} \rvert$ | Number of users who have rated both items i and j |
| $\hat{r}_{i,u}$ | Predicted rating value of user u on item i |
| $\bar{r}_i$ | Average or mean rating value of item i |
1. Explicit: These are the specific ratings that a user gives to a product (for example, a user rates a book 3 on a scale of 1 to 5). Explicit ratings are used directly to extract users' interests for future recommendations. The disadvantage of explicit data is that it makes the user responsible for data collection, and users seldom take the trouble to rate a particular item.
2. Implicit: These ratings are collected by logging the data generated while the user browses the website. Implicit data are easier to collect because they put no pressure on the user to rate the products on the site. However, dealing with implicit ratings is complicated, as it is hard to infer users' preferences from the collected browsing data.
Using the collected ratings (explicit or implicit), RSs predict the unknown ratings of the user based on different similarity metrics, and these predicted ratings are used in the recommendation process.


There are various similarity metrics used in the CF to find the nearest
neighbors and similarity values (Sarwar et al., 2001; Bilge and Kaleli,
2014; Bobadilla et al., 2012). The metrics used in the item-based CF are:
1. Cosine Similarity (CS): It quantifies the similarity between two samples by the cosine of the angle between their rating vectors. The similarity values lie in the range [–1, 1], where 1 indicates maximum similarity and –1 indicates maximum dissimilarity. CS between two items i and j is calculated using:

$$\mathrm{sim}(i, j) = \cos(\vec{i}, \vec{j}) = \frac{\vec{i} \cdot \vec{j}}{\lVert \vec{i} \rVert_2 \, \lVert \vec{j} \rVert_2}$$

Here, $\vec{i} \cdot \vec{j}$ denotes the dot product of the rating vectors of the two items i and j.
2. Adjusted Cosine Similarity (ACS): It is similar to cosine similarity but also accounts for each user's individual rating scale. To achieve this, it subtracts the user's average rating from each of that user's ratings. It is computed by:

$$\mathrm{sim}(i, j) = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_u)(R_{u,j} - \bar{R}_u)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_u)^2} \, \sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_u)^2}}$$

Here, $R_{u,i}$ and $R_{u,j}$ are the rating values of user u on the two items i and j, respectively, and $\bar{R}_u$ is the average rating value of user u.
3. Pearson Correlation (PC): It is the most popular similarity metric and is widely used in various experiments. The similarity lies in the range [–1, 1], where 1 indicates maximum similarity and –1 indicates that the rating patterns are opposite. Similarity using PC in the item-based CF algorithm is:

$$\mathrm{sim}(i, j) = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_i)(R_{u,j} - \bar{R}_j)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_i)^2} \, \sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_j)^2}}$$

Here, $\bar{R}_i$ and $\bar{R}_j$ are the mean rating values of the two items i and j, respectively.
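4. Jaccard Similarity (JS): It considers only which ratings the two items have in common, irrespective of the absolute rating values. It is calculated by [1]:

$$\mathrm{sim}(i, j) = \frac{\lvert R_i \cap R_j \rvert}{\lvert R_i \cup R_j \rvert}$$

Here, $R_i$ and $R_j$ denote the sets of users who have rated items i and j, respectively.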
5. Spearman Correlation (SC): It is calculated just like PC, but it uses the ranks of the rating values instead of the rating values themselves. The equation for calculating the similarity value by SC is as follows:

$$\mathrm{sim}(i, j) = \frac{\sum_{u \in U} (k_{u,i} - \bar{k}_i)(k_{u,j} - \bar{k}_j)}{\sqrt{\sum_{u \in U} (k_{u,i} - \bar{k}_i)^2} \, \sqrt{\sum_{u \in U} (k_{u,j} - \bar{k}_j)^2}}$$

Here, $k_{u,i}$ and $k_{u,j}$ are the ranks of the rating values of user u on items i and j, respectively, and $\bar{k}_i$ and $\bar{k}_j$ denote the average ranks of items i and j, respectively.
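6. Euclidean Distance (ED): The Euclidean distance takes the square root of the sum of squared differences between the individual ratings of the two samples whose similarity we want to find. The distance indicates how different the rating patterns are:

$$\mathrm{sim}(i, j) = \sqrt{\frac{\sum_{u \in U_{i,j}} (r_{i,u} - r_{j,u})^2}{\lvert U_{i,j} \rvert}}$$

Here, $r_{i,u}$ and $r_{j,u}$ are the ratings of user u on items i and j, and $U_{i,j}$ is the set of users who have rated both items.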
7. Manhattan Distance (MD): The equation to find similarity using MD is given below.

$$\mathrm{sim}(i, j) = \frac{\sum_{u \in U_{i,j}} \lvert r_{i,u} - r_{j,u} \rvert}{\lvert U_{i,j} \rvert}$$
8. Mean Squared Distance (MSD): It is similar to ED; the only difference is that the whole Euclidean distance is squared, which removes the square root and makes the calculation easier. The equation of MSD for calculating the similarity value is:

$$\mathrm{sim}(i, j) = \frac{\sum_{u \in U_{i,j}} (r_{i,u} - r_{j,u})^2}{\lvert U_{i,j} \rvert}$$
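The metrics above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: each item is represented as a hypothetical dictionary from user id to rating, all sums run over the users who rated both items, item means are taken over all ratings of the item, and SC is omitted because it is PC applied to ranks.

```python
import math
from typing import Dict, List

Ratings = Dict[str, float]  # user id -> rating given to one item


def _co_raters(ri: Ratings, rj: Ratings) -> List[str]:
    """Users who rated both items (assumed meaning of the summation set)."""
    return sorted(set(ri) & set(rj))


def _normalized_dot(a: List[float], b: List[float]) -> float:
    """Dot product over the product of L2 norms; 0.0 when undefined."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cosine(ri: Ratings, rj: Ratings) -> float:
    users = _co_raters(ri, rj)
    return _normalized_dot([ri[u] for u in users], [rj[u] for u in users])


def adjusted_cosine(ri: Ratings, rj: Ratings, user_mean: Dict[str, float]) -> float:
    """Center every rating on the mean rating of the user who gave it."""
    users = _co_raters(ri, rj)
    return _normalized_dot([ri[u] - user_mean[u] for u in users],
                           [rj[u] - user_mean[u] for u in users])


def pearson(ri: Ratings, rj: Ratings) -> float:
    """Center every rating on the mean rating of its item."""
    users = _co_raters(ri, rj)
    if not users:
        return 0.0
    mean_i = sum(ri.values()) / len(ri)
    mean_j = sum(rj.values()) / len(rj)
    return _normalized_dot([ri[u] - mean_i for u in users],
                           [rj[u] - mean_j for u in users])


def jaccard(ri: Ratings, rj: Ratings) -> float:
    union = set(ri) | set(rj)
    return len(set(ri) & set(rj)) / len(union) if union else 0.0


def mean_squared_distance(ri: Ratings, rj: Ratings) -> float:
    users = _co_raters(ri, rj)
    if not users:
        raise ValueError("the two items share no raters")
    return sum((ri[u] - rj[u]) ** 2 for u in users) / len(users)


def euclidean_distance(ri: Ratings, rj: Ratings) -> float:
    return math.sqrt(mean_squared_distance(ri, rj))


def manhattan_distance(ri: Ratings, rj: Ratings) -> float:
    users = _co_raters(ri, rj)
    if not users:
        raise ValueError("the two items share no raters")
    return sum(abs(ri[u] - rj[u]) for u in users) / len(users)


# Made-up ratings: item_j is item_i shifted down by one star.
item_i = {"u1": 5.0, "u2": 3.0, "u3": 4.0}
item_j = {"u1": 4.0, "u2": 2.0, "u3": 3.0}
```

Note that the three distance functions grow as items become less alike, so a ranking of nearest neighbors based on them would sort in ascending order, whereas the correlation-style metrics sort in descending order.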

The purpose of an RS is to provide optimized and personalized product recommendations to users. The literature offers various similarity metrics for an RS to choose from, and they yield different lists of top-n recommended items.
Table 8.3 lists the top-10 movies most similar to the target movie (id 1) under the traditional similarity measures. It can be observed that the top-10 lists differ across similarity measures; only mean squared distance and Euclidean distance produce the same list. Hence, there is a need for a comparative study of similarity metrics: because each similarity measure has some limitations, such a study can help improve the accuracy of CF-based RSs. To construct Table 8.3, we used the MovieLens dataset ml-20m. The following filtering criteria were applied to reduce sparsity:
i. Select the users who have rated at least 100 movies.
ii. Select the movies that have received at least 1000 ratings.
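The two criteria can be sketched as a single filtering pass. The function name `filter_dense`, the `(user_id, movie_id, rating)` tuple layout, and the decision to count ratings on the unfiltered data and apply both criteria once are all assumptions; the chapter does not state whether the criteria were applied once or repeated until both hold.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Rating = Tuple[int, int, float]  # (user_id, movie_id, rating), assumed layout


def filter_dense(ratings: Iterable[Rating],
                 min_user_ratings: int = 100,
                 min_movie_ratings: int = 1000) -> List[Rating]:
    """Keep ratings whose user and movie both meet the minimum counts.

    Counts are computed once on the unfiltered data (an assumption).
    """
    ratings = list(ratings)
    user_counts = Counter(user for user, _, _ in ratings)
    movie_counts = Counter(movie for _, movie, _ in ratings)
    return [
        (user, movie, rating)
        for user, movie, rating in ratings
        if user_counts[user] >= min_user_ratings
        and movie_counts[movie] >= min_movie_ratings
    ]


# Toy data with small thresholds so the effect is visible.
toy = [(1, 10, 4.0), (1, 11, 3.0), (2, 10, 5.0), (3, 12, 2.0)]
dense = filter_dense(toy, min_user_ratings=2, min_movie_ratings=2)
```

With the default thresholds of 100 and 1000 the same call would reproduce criteria (i) and (ii) on a full rating file.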
TABLE 8.3 List of Top-10 Similar Movies of Target Movie id 1, Using the Traditional Similarity Measures

| Similarity Metric | Top-10 Similar Movies |
|---|---|
| Pearson Correlation | 926 1272 1276 623 730 869 215 95 999 1301 |
| Cosine Distance | 1276 1192 1027 956 1079 949 352 1088 215 915 |
| Adjusted Cosine Distance | 1276 352 1027 1079 926 1088 580 401 15 729 |
| Mean Squared Distance | 1276 352 1088 926 1079 729 1027 1142 1239 354 |
| Euclidean Distance | 1276 352 1088 926 1079 729 1027 1142 1239 354 |
| Manhattan Distance | 1276 29 15 1088 1079 352 85 1239 1182 119 |
| Spearman Correlation | 1276 352 1027 926 1079 869 1088 915 949 1142 |

Different prediction approaches have been utilized in the prediction phase
of CF-based RS (Sarwar et al., 2001; Wu et al., 2013; Herlocker et al.,
1999). These methods for item-based CF are:
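1. Mean Centering (MC): In this approach, the mean rating of the target item is added to a weighted average (WA) of the differences between the available ratings of the top-n similar items and their respective means, using as weights the correlation values computed by the similarity measures. The MC prediction is given by:

$$\hat{r}_{i,u} = \bar{r}_i + \frac{\sum_{j \in N_u(i)} \mathrm{sim}(i, j)\,(r_{j,u} - \bar{r}_j)}{\sum_{j \in N_u(i)} \lvert \mathrm{sim}(i, j) \rvert}$$

Here, $\hat{r}_{i,u}$ is the predicted rating of user u on item i, $r_{j,u}$ is the rating of user u on item j, and $N_u(i)$ is the set of top-n items similar to item i that user u has rated.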
2. Weighted Average (WA): To predict the rating for a target item, a WA of all available ratings of the top-n similar items is calculated, using as weights the correlation values computed by the similarity measures. The equation to predict the rating using WA is:

$$\hat{r}_{i,u} = \frac{\sum_{j \in N_u(i)} \mathrm{sim}(i, j)\, r_{j,u}}{\sum_{j \in N_u(i)} \lvert \mathrm{sim}(i, j) \rvert}$$
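3. Z Score (ZS): It extends MC by also using the standard deviation of the item ratings. The equation of Z-score for item-based CF is as follows:

$$\hat{r}_{i,u} = \bar{r}_i + \sigma_i \, \frac{\sum_{j \in N_u(i)} \mathrm{sim}(i, j)\,(r_{j,u} - \bar{r}_j) / \sigma_j}{\sum_{j \in N_u(i)} \lvert \mathrm{sim}(i, j) \rvert}$$

Here, $\sigma_i$ and $\sigma_j$ represent the standard deviations of the rating values of items i and j, respectively.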
These prediction approaches have some limitations on sparse datasets. Hence, for more personalized and accurate recommendations, there is a need for a comparative study of prediction approaches in CF. By exchanging i and j with u and v, respectively, we obtain the corresponding equations of the similarity metrics and prediction approaches for user-based CF.
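The three item-based prediction approaches can be illustrated with a short sketch. The function names and the tuple layout of `neighbors` are hypothetical, the similarity values are assumed to be precomputed, and the Z-score variant divides each deviation by the neighbor item's own standard deviation, which is the common textbook form and an assumption here.

```python
from typing import List, Tuple


def _total_weight(sims: List[float]) -> float:
    weight = sum(abs(s) for s in sims)
    if weight == 0:
        raise ValueError("all similarity weights are zero")
    return weight


def weighted_average(neighbors: List[Tuple[float, float]]) -> float:
    """neighbors: (similarity to target item, user's rating of neighbor)."""
    weight = _total_weight([sim for sim, _ in neighbors])
    return sum(sim * rating for sim, rating in neighbors) / weight


def mean_centering(item_mean: float,
                   neighbors: List[Tuple[float, float, float]]) -> float:
    """neighbors: (similarity, user's rating of neighbor, neighbor mean)."""
    weight = _total_weight([sim for sim, _, _ in neighbors])
    offset = sum(sim * (rating - mean) for sim, rating, mean in neighbors)
    return item_mean + offset / weight


def z_score(item_mean: float, item_std: float,
            neighbors: List[Tuple[float, float, float, float]]) -> float:
    """neighbors: (similarity, rating, neighbor mean, neighbor std)."""
    weight = _total_weight([sim for sim, _, _, _ in neighbors])
    offset = sum(sim * (rating - mean) / std
                 for sim, rating, mean, std in neighbors)
    return item_mean + item_std * offset / weight


# Made-up example: two neighbor items the target user has rated.
wa = weighted_average([(1.0, 5.0), (0.5, 3.0)])
mc = mean_centering(3.0, [(1.0, 5.0, 4.0), (0.5, 3.0, 3.0)])
zs = z_score(3.0, 2.0, [(1.0, 5.0, 4.0, 0.5), (0.5, 3.0, 3.0, 1.0)])
```

Predictions may fall outside the rating scale (the Z-score example does), so a deployed system would typically clip them to the valid range.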

Various performance metrics have been used in the literature of
CF-based RSs (Singh, Pramanik, and Choudhury, 2018; Samundeeswary
and Krishnamurthy, 2017; Zuva and Zuva, 2017; Pampın, Jerbi, and
O’Mahony, 2015):
1. Mean Absolute Error (MAE): It is the average absolute error in the rating predictions. The equation for calculating MAE is:

$$MAE = \frac{\sum_{i=1}^{N} \lvert p_i - \hat{q}_i \rvert}{N}$$

Here, $\langle p_i, \hat{q}_i \rangle$ denotes each pair of original and predicted ratings, and N is the total number of such pairs.
2. Root Mean Square Error (RMSE): Squaring the errors before averaging them and then taking the square root of the result gives the equation of RMSE:

$$RMSE = \sqrt{\frac{\sum_{i=1}^{N} (p_i - \hat{q}_i)^2}{N}}$$
3. Coverage: Item coverage is the percentage of items included in the recommendation list over the number of potential items:

$$Coverage_{item} = \frac{n}{N} \times 100$$

User coverage is the percentage of users for whom the recommender was able to generate a recommendation list over the number of potential users:

$$Coverage_{user} = \frac{u}{U} \times 100$$

Catalog coverage is the percentage of recommended user-item pairs over the total number of potential pairs. The number of recommended user-item pairs can be represented by the length of the recommended lists L:

$$Coverage_{catalog} = \frac{length(L)}{N \times U} \times 100$$

And finally, user interaction coverage is the percentage of rated predictions over the total number of ratings. Here, n and u represent the number of items in the recommendation list and the number of users involved in the generation of this recommendation list, respectively. N and U denote the number of potential items and the number of potential users, whereas L is the list of recommended user-item pairs.
4. Diversity: It measures how dissimilar the recommended items are for a user. The similarity between items is often determined using the items' content (e.g., movie genres) but can also be determined from how similarly the items are rated.

$$Diversity(L(u)) = \frac{2}{N(N-1)} \sum_{i_j, i_k \in L(u),\; j < k} \bigl(1 - \mathrm{sim}(i_j, i_k)\bigr)$$

Here, $\mathrm{sim}(i_j, i_k)$ denotes the similarity between item j and item k, and N is the number of items in the recommendation list L(u) of user u.
5. Serendipity: It is the measure of how surprising the successful or relevant recommendations are. The probability of a recommendation is simply a function of its overall rank over n items:

$$P_i = \frac{n - rank_i}{n - 1}$$

Here, $P_i$ represents the probability of item i being recommended and $rank_i$ is the rank of item i over n items. The unexpected recommendations are found as:

$$UNEXP = RS \setminus PM$$

Here, PM denotes the set of recommendations generated by a primitive prediction model, and RS denotes the recommendations generated by the RS under evaluation. UNEXP thus consists of the recommendations in RS that do not belong to PM. We define serendipity as follows:

$$Serendipity = \frac{\sum_{i=1}^{N} u(RS_i)}{N}$$

Here, $u(RS_i)$ denotes the usefulness of the i-th unexpected recommendation and N is the number of such recommendations.
6. Novelty: It can be defined as:

$$Novelty = -\frac{\sum_{i \in L} \log_2 P_i}{n}$$

Higher novelty values indicate that less popular items are being recommended, so less well-known items are likely being surfaced for users.
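A few of these metrics can be sketched directly from their definitions. The function names and argument layouts below are hypothetical, and only MAE, RMSE, item coverage, and list diversity are covered; serendipity and novelty are left out because they depend on a primitive baseline model and a popularity estimate that the sketch does not assume.

```python
import math
from itertools import combinations
from typing import Callable, Sequence, Set


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error over paired original and predicted ratings."""
    return sum(abs(p - q) for p, q in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error over the same pairs."""
    squared = sum((p - q) ** 2 for p, q in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def item_coverage(recommended_items: Set[int], num_potential_items: int) -> float:
    """Percentage of potential items that appear in recommendation lists."""
    return 100.0 * len(recommended_items) / num_potential_items


def list_diversity(items: Sequence[str],
                   sim: Callable[[str, str], float]) -> float:
    """Average pairwise dissimilarity (1 - sim) within one list."""
    n = len(items)
    if n < 2:
        return 0.0
    total = sum(1.0 - sim(a, b) for a, b in combinations(items, 2))
    return 2.0 * total / (n * (n - 1))


# Made-up similarities between three recommended items.
pair_sim = {("a", "b"): 0.5, ("a", "c"): 0.0, ("b", "c"): 1.0}
diversity = list_diversity(["a", "b", "c"], lambda x, y: pair_sim[(x, y)])
```

MAE and RMSE assume the two sequences are non-empty and aligned pair by pair; RMSE penalizes large individual errors more heavily than MAE does.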
The equations for computing precision, recall, F-measure, and accuracy are as follows:
1. Precision: It is the fraction of the recommended items that are actually relevant to the target user.

$$Precision = \frac{t_p}{t_p + f_p}$$

2. Recall: It is the fraction of the relevant items that are part of the set of recommended items. Hence, the equation for calculating recall becomes:

$$Recall = \frac{t_p}{t_p + f_n}$$

3. F-Measure: Precision and recall values are used to compute the F-measure, and the equation is:

$$F\text{-}measure = \frac{2 \times Precision \times Recall}{Precision + Recall}$$

4. Accuracy: It shows how close the predictions are to the actual outcomes, computed as the fraction of all decisions that are correct. The equation for computing accuracy is as follows:

$$Accuracy = \frac{t_p + t_n}{t_p + t_n + f_p + f_n}$$

Here, $t_p$, $f_p$, $t_n$, and $f_n$ denote the true positives, false positives, true negatives, and false negatives, respectively.
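As a small worked example, the four counts can be derived from sets of recommended and relevant items and then plugged into the equations above. The helper names and the idea of treating every non-recommended, non-relevant catalog item as a true negative are assumptions made for illustration.

```python
from typing import Set, Tuple


def confusion_counts(recommended: Set[int], relevant: Set[int],
                     catalog: Set[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for one user's recommendation list."""
    tp = len(recommended & relevant)            # recommended and relevant
    fp = len(recommended - relevant)            # recommended, not relevant
    fn = len(relevant - recommended)            # relevant, not recommended
    tn = len(catalog - recommended - relevant)  # neither (assumed negatives)
    return tp, fp, tn, fn


def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn) if tp + fn else 0.0


def f_measure(p: float, r: float) -> float:
    return 2 * p * r / (p + r) if p + r else 0.0


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


# Made-up example: 10 catalog items, 4 recommended, 3 relevant.
catalog = set(range(1, 11))
tp, fp, tn, fn = confusion_counts({1, 2, 3, 4}, {2, 3, 5}, catalog)
p, r = precision(tp, fp), recall(tp, fn)
```

In this example two of the four recommended items are relevant, so precision is one half, while two of the three relevant items were recommended, so recall is two thirds.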

8.10.1 NEW USER PROBLEM
When users newly register with an RS, they do not have any ratings in their profile, and ratings are what denote the taste or preferences of the users. Since CF is based on user preferences, in the absence of ratings it is unable to recommend many of the items (Lakshmi and Lakshmi, 2014). Even when users have scanty profiles with very few ratings, CF fails to render reliable, personalized recommendations to these users. To overcome this problem, an RS can use demographic features from the user's profile for recommendation. This has its own limitation, however: two users with the same profile may not have the same intent towards a particular item.
8.10.2 NEW ITEM PROBLEM
The new item problem is another cold start problem, which arises because items are recurrently added to the list (Lakshmi and Lakshmi, 2014). An item can be recommended to users only after it has been rated.
8.10.3 SPARSITY PROBLEM
Sparsity arises when a user has used a particular product but did not bother to rate it, or when the user was completely unfamiliar with the product and therefore did not rate it (Lakshmi and Lakshmi, 2014). One approach to overcoming this problem is clustering. A clustering method groups the data according to the preferences of the users, which makes it easier to recommend items. However, some issues remain to be resolved in the case of multi-level clustering.
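The degree of sparsity is commonly quantified as the fraction of user-item pairs that carry no rating. The chapter does not give this formula, so the sketch below uses the widely used definition as an assumption, with a hypothetical function name:

```python
def sparsity(num_ratings: int, num_users: int, num_items: int) -> float:
    """Fraction of the user-item matrix that is empty (assumed definition)."""
    if num_users <= 0 or num_items <= 0:
        raise ValueError("need at least one user and one item")
    return 1.0 - num_ratings / (num_users * num_items)


# Made-up example: 3 users, 4 items, 4 observed ratings.
example = sparsity(num_ratings=4, num_users=3, num_items=4)
```

A value close to 1 means that almost every user-item pair is unrated, which is the situation the filtering criteria and clustering methods try to mitigate.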
8.10.4 SCALABILITY PROBLEM
CF works on a database that contains user-item ratings, and it faces scalability issues when the sets of users and items are large. For a large item set, the complexity of CF algorithms becomes too high. High scalability is required because many systems must respond immediately to online requests and make recommendations for all users based on their purchase and rating history (Alloway, 2018; Poonam, Goudar, and Sunita, 2015).
8.10.5 SYNONYMY PROBLEM
Another problem with the CF approach is the synonymy problem (Xiaoyuan and Taghi, 2009). Most CF algorithms are unable to identify the same or similar items when they appear under different names (synonyms). This causes association problems; e.g., "kids' movie" and "kids' film" refer to basically the same items, but memory-based CF finds no match between the two terms when computing similarity. A related problem is abbreviations, which are used a lot nowadays. Users are sometimes shown different results when they search using abbreviations. The system should therefore recognize these shortened words and categorize them together with their full forms. Further issues are caused by symbols or smileys. Some users prefer smileys to give a review of a product. For example, a user who liked a product may simply give a smiley or a thumbs up, and a thumbs down for dislike. Such symbols should also be evaluated, because some sites, like Amazon, do not give any importance to smileys; such sites rather ask the user to write a review of a minimum of 20 words (BBC, 2018). There is also the problem of reviews written in different regional languages. Different users want to review a product in their respective languages, e.g., Hindi ("bahut achha"), Bengali ("khub bhalo"), English ("very nice"), etc. These expressions in different languages convey the same meaning: that the product is good. But if only one language is considered, the reviews of other users lose their importance. For the betterment of RSs, it is important to take all these issues into account.
8.10.6 LONG TAIL (LT) PROBLEM
In addition to the above-mentioned problems, another major problem arises, namely the long tail (LT) problem. This section discusses what the LT problem is and how to deal with it.
RSs basically use the past records of the user to anticipate the user's possible future likes and dislikes and recommend accordingly. A better RS would also propose less common options to draw the user's interest, rather than recommending similar kinds of items repeatedly. Diversity is related to this aspect: it implies the need to recommend diverse items to the user and concerns how different the items are from each other. RSs often fail to satisfy this aspect, which leads to the LT problem (Lei, 2013). The user is deprived of many other necessary items just because he did not rate those items or did not have any access to them. The LT problem occurs when many items remain unrated or low rated. To deal with this problem, one idea is to rank the items in different ways: the users' ratings need to be segregated and then ranked. Apart from the highest-rated items, there is also a need to recommend low-rated items. Low-rated items are usually not given any importance, and researchers face the problem of filtering out the important ones among them. There is also a need to rank items according to purchase history and then recommend the least purchased items. Moreover, the more prestigious items need not necessarily be at the top of the list; this aspect can be seen in the case of recommending books.
The LT problem can be reduced in an RS by considering (i) accuracy,
(ii) similarity, (iii) diversity, and (iv) LT (Oscar, 2010; Daniel and Kartik,
2007; Park and Tuzhilin, 2008; Hongzhi et al., 2012).
Accuracy: A good RS should always check the accuracy of
the items recommended. The more accurate the recommended items
are, the more smoothly the RS runs.
Similarity: This aspect concerns how similar a product
is to the user’s past interests. Various algorithms are
used in RSs to find the similarity between users or items.
Diversity: The same kind of products should not be recommended
to the user on a regular basis, as this lessens the interest of the
user. A dynamic RS gives more diversity.
Long Tail (LT): Some good items do not come into the top-n
recommended list because they have few ratings.
As a result, only the more popular items are recommended.
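The aspects listed above can be made concrete with a small measurement sketch. The Python below is illustrative only and is not taken from the chapter: the function names, the (user, item, rating) tuple layout, the 20% cut-off for the popular "head", and the category-based notion of diversity are all assumptions chosen for the example.

```python
from collections import Counter
from itertools import combinations


def tail_items(ratings, head_share=0.2):
    """Items outside the most-rated `head_share` of the catalogue.

    `ratings` is a list of (user, item, rating) tuples. The 20% default
    cut-off is an assumption, not a value from the chapter.
    """
    counts = Counter(item for _, item, _ in ratings)
    ranked = [item for item, _ in counts.most_common()]
    head_size = max(1, round(head_share * len(ranked)))
    return set(ranked[head_size:])


def long_tail_share(recommended, tail):
    """Fraction of a recommendation list that comes from the tail."""
    return sum(1 for item in recommended if item in tail) / len(recommended)


def intra_list_diversity(recommended, category_of):
    """Fraction of item pairs in the list that fall in different categories.

    Expects at least two recommended items.
    """
    pairs = list(combinations(recommended, 2))
    different = sum(1 for a, b in pairs if category_of[a] != category_of[b])
    return different / len(pairs)


# Toy data (hypothetical): item "A" is popular, the others are rarely rated.
ratings = [("u1", "A", 5), ("u2", "A", 4), ("u3", "A", 5), ("u4", "A", 4),
           ("u1", "B", 3), ("u2", "B", 4), ("u3", "C", 5), ("u4", "D", 2)]
tail = tail_items(ratings)
top_n = ["A", "B", "C"]
category_of = {"A": "x", "B": "x", "C": "y"}
print(long_tail_share(top_n, tail))
print(intra_list_diversity(top_n, category_of))
```

A list that scores well on accuracy alone may still have a low tail share, which is one simple way to observe the LT problem in a given top-n list.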
8.10.7 ATTACKS
The open nature of CF-based RSs makes them prone to attacks known as
shilling attacks. Every RS identifies a set of items favored by a certain
user, termed the recommendation list for that user. Unscrupulous people
use unethical means to push their product into the top-n recommendation
list or to pull their competitors’ products down from that list. Hence,
every attack is either a push or a nuke attack. To accomplish this,
attackers inject fake profiles into the RS and give biased ratings to
items, leading to erroneous recommendations. An attacker creates a fake
profile in such a way that it remains effective and undetected at the same
time. An attack profile is represented as an m-dimensional vector of
ratings, where m is the total number of items in the system. Every attack
profile has four subparts (Mobasher et al., 2007):
Target Item ($I_T$): A singleton item, which is to be pushed or nuked.
Selected Items ($I_S$): A set of items whose rating is determined by a
function based on the type of attack.
Filler Items ($I_F$): A set of items chosen and rated randomly to mimic
the behavior of an authentic user.
Unrated Items ($I_N$): A set of items not rated by the attacker.
The attackers closely follow certain attack models while designing
an attack (Mobasher et al., 2005; Mobasher et al., 2015; Kaur and Goel,
2016). The target item is generally given the highest or the lowest
rating, but the rating functions of the filler and selected items lead to
different attack models. Let $\bar{r}$ denote the average rating in the RS
over all items and users, and let $\bar{r}_i$ denote the average rating of a
certain item $i$. Similarly, let $\sigma$ be the standard deviation of all
rating values over all users and items, and $\sigma_i$ the standard deviation
of the ratings of item $i$. Let $N(r, \sigma^2)$ denote the Gaussian
distribution with mean $r$ and variance $\sigma^2$, and let $\rho(i)$ be the
function by which the filler items $I_F$ are rated.
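The four-part structure of an attack profile can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the representation used by the cited authors: the class name `AttackProfile`, its field names, and the use of `None` for unrated items are hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AttackProfile:
    """One injected profile, split into the subparts described above."""
    target_item: str                               # I_T: item to push or nuke
    target_rating: int                             # highest (push) or lowest (nuke)
    selected: dict = field(default_factory=dict)   # I_S: item -> rating
    filler: dict = field(default_factory=dict)     # I_F: item -> rating

    def as_vector(self, all_items):
        """m-dimensional rating vector; None marks unrated items (I_N)."""
        rated = {**self.filler, **self.selected,
                 self.target_item: self.target_rating}
        return [rated.get(item) for item in all_items]


# Hypothetical four-item catalogue, so m = 4.
catalogue = ["i1", "i2", "i3", "i4"]
profile = AttackProfile(target_item="i1", target_rating=5,
                        selected={"i2": 5}, filler={"i3": 3})
vector = profile.as_vector(catalogue)
print(vector)
```

Every item that is not the target, a selected item, or a filler item falls into the unrated set by construction, so it does not need a field of its own.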
8.10.7.1 RANDOM ATTACK
Except for the target item, the ratings in an attack profile are generated
randomly, based on the distribution of ratings in the database. This attack
was first mentioned by Lam and Riedl (2004). The set $I_S$ contains no
items, whereas $I_F$ contains randomly picked items whose ratings are drawn
from $N(\bar{r}, \sigma^2)$, centered on the overall average rating in the
database. In the random attack:
$I_S = \emptyset$ and $\rho(i) = N(\bar{r}, \sigma^2)$
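A random attack profile can be sketched in a few lines of Python. The sketch makes several assumptions that are not in the chapter: ratings are stored as (user, item, rating) tuples, the scale runs from 1 to 5, and drawn values are rounded and clipped to that scale. The function name and parameters are hypothetical.

```python
import random
import statistics


def random_attack_profile(ratings, target_item, n_filler, push=True,
                          r_min=1, r_max=5, rng=None):
    """Build one random-attack profile as an item -> rating dict.

    I_S is empty; filler ratings are drawn from a Gaussian with the
    overall mean and standard deviation of all ratings in `ratings`.
    """
    rng = rng or random.Random()
    values = [r for _, _, r in ratings]
    mean, std = statistics.mean(values), statistics.pstdev(values)

    candidates = sorted({item for _, item, _ in ratings} - {target_item})
    profile = {}
    for item in rng.sample(candidates, n_filler):      # I_F, picked at random
        drawn = round(rng.gauss(mean, std))
        profile[item] = min(r_max, max(r_min, drawn))  # clip to the scale
    profile[target_item] = r_max if push else r_min    # I_T
    return profile


# Hypothetical rating data.
ratings = [("u1", "a", 4), ("u1", "b", 2), ("u2", "a", 5),
           ("u2", "c", 3), ("u3", "t", 1)]
push_profile = random_attack_profile(ratings, "t", n_filler=2,
                                     rng=random.Random(0))
nuke_profile = random_attack_profile(ratings, "t", n_filler=2, push=False,
                                     rng=random.Random(0))
print(push_profile)
```

Because only the global mean and deviation are needed, this model requires very little knowledge of the system.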
8.10.7.2 AVERAGE ATTACK
In the average attack, the mean rating of each item across all users is
used to generate the filler ratings of an attack profile. As in the random
attack, $I_S$ remains empty. The filler items are rated by the function
$N(\bar{r}_i, \sigma_i^2)$, centered on the average rating $\bar{r}_i$ of each
item $i$ in the database (Burke et al., 2005). The average attack proves to
be more effective than the random attack. In the average attack:
$I_S = \emptyset$ and $\rho(i) = N(\bar{r}_i, \sigma_i^2)$
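The difference from the random attack is only in where the Gaussian is centered. The following sketch draws each filler rating around that item's own mean. As before, the (user, item, rating) layout, the 1 to 5 scale, the rounding and clipping, and all names are assumptions made for illustration.

```python
import random
import statistics
from collections import defaultdict


def average_attack_profile(ratings, target_item, n_filler, push=True,
                           r_min=1, r_max=5, rng=None):
    """Build one average-attack profile as an item -> rating dict.

    I_S is empty; each filler item i is rated from a Gaussian with that
    item's own mean and standard deviation.
    """
    rng = rng or random.Random()
    per_item = defaultdict(list)
    for _, item, rating in ratings:
        per_item[item].append(rating)

    candidates = sorted(item for item in per_item if item != target_item)
    profile = {}
    for item in rng.sample(candidates, n_filler):      # I_F
        mean_i = statistics.mean(per_item[item])
        std_i = statistics.pstdev(per_item[item])
        drawn = round(rng.gauss(mean_i, std_i))
        profile[item] = min(r_max, max(r_min, drawn))  # clip to the scale
    profile[target_item] = r_max if push else r_min    # I_T
    return profile


# Hypothetical data in which every item has zero rating variance, so the
# filler ratings equal the item means exactly.
ratings = [("u1", "a", 2), ("u2", "a", 2), ("u1", "b", 4),
           ("u2", "b", 4), ("u1", "t", 3)]
profile = average_attack_profile(ratings, "t", n_filler=2,
                                 rng=random.Random(1))
print(profile)
```

The per-item statistics make the fake profile look more like a genuine user, at the cost of needing more knowledge about the system.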
8.10.7.3 BANDWAGON ATTACK
In the bandwagon attack, attack profiles are generated by assigning high
ratings to popular items, which form the selected item set. Here,
$I_S$ = {popular items} and $\rho(i) = N(\bar{r}_i, \sigma_i^2)$. This attack
needs additional knowledge about the most popular items in an RS.
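A bandwagon profile can be sketched by taking the most frequently rated items as the selected set. The sketch assumes that popularity means rating count, that the target is pushed with the top rating, and that filler items follow the per-item Gaussian given above. The data layout, the 1 to 5 scale, and all names are hypothetical.

```python
import random
import statistics
from collections import defaultdict


def bandwagon_attack_profile(ratings, target_item, n_popular, n_filler,
                             r_min=1, r_max=5, rng=None):
    """Build one bandwagon-attack profile as an item -> rating dict."""
    rng = rng or random.Random()
    per_item = defaultdict(list)
    for _, item, rating in ratings:
        per_item[item].append(rating)
    per_item.pop(target_item, None)

    # I_S: the n_popular most frequently rated items, given the top rating.
    by_popularity = sorted(per_item, key=lambda i: (-len(per_item[i]), i))
    profile = {item: r_max for item in by_popularity[:n_popular]}

    # I_F: random items from the rest, rated around each item's own mean.
    for item in rng.sample(by_popularity[n_popular:], n_filler):
        mean_i = statistics.mean(per_item[item])
        std_i = statistics.pstdev(per_item[item])
        drawn = round(rng.gauss(mean_i, std_i))
        profile[item] = min(r_max, max(r_min, drawn))

    profile[target_item] = r_max                       # I_T, pushed
    return profile


# Hypothetical data: "p" is clearly the most popular item.
ratings = [("u1", "p", 5), ("u2", "p", 4), ("u3", "p", 5),
           ("u1", "q", 2), ("u2", "r", 3), ("u3", "t", 1)]
profile = bandwagon_attack_profile(ratings, "t", n_popular=1, n_filler=1,
                                   rng=random.Random(0))
print(profile)
```

Rating popular items highly gives the fake profile overlap with many genuine users, which is what makes it similar to them in neighborhood-based CF.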
8.10.7.4 SEGMENT ATTACK
The segment attack is designed to increase the recommendation of a set of
target items to a certain set of users. Here, $I_S$ consists of the items
similar to the target items and $\rho(i) = N(\bar{r}_i, \sigma_i^2)$. The items
in $I_S$ are termed segment items and are well liked by the target users.
Like the target items, the segment items in $I_S$ are given high ratings,
while the filler items are given low ratings.
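A segment profile can be sketched as follows. The section gives both a Gaussian form for the filler function and the statement that filler items receive low ratings; this sketch follows the latter and simply uses the minimum rating, which is an assumption. The segment items are supplied by the caller, and the function name, parameters, and 1 to 5 scale are hypothetical.

```python
import random


def segment_attack_profile(all_items, target_item, segment_items, n_filler,
                           r_min=1, r_max=5, rng=None):
    """Build one segment-attack profile as an item -> rating dict."""
    rng = rng or random.Random()
    # I_S: segment items liked by the targeted users, given the top rating.
    profile = {item: r_max for item in segment_items}

    # I_F: random items outside the segment, given the lowest rating.
    rest = sorted(set(all_items) - set(segment_items) - {target_item})
    for item in rng.sample(rest, n_filler):
        profile[item] = r_min

    profile[target_item] = r_max                       # I_T, pushed
    return profile


# Hypothetical catalogue: "s1" and "s2" are similar to the target "t".
all_items = ["t", "s1", "s2", "f1", "f2"]
profile = segment_attack_profile(all_items, "t", ["s1", "s2"], n_filler=2,
                                 rng=random.Random(0))
print(profile)
```

The contrast between high segment ratings and low filler ratings is what ties the fake profile to the targeted group of users rather than to the whole user base.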

CF is at the heart of RSs. It has been employed to develop recommendation
techniques that suggest the most suitable items to customers. Yang
et al. (2014) have presented a survey of CF-based RSs, categorizing
them into social recommendation approaches using matrix factorization
and neighborhood-based methods. They have also proposed the idea of
utilizing information from social networks as an additional input to CF
for better-quality recommendations. Elahi et al. (2016) have discussed the
two most popular rating prediction algorithms used in CF, the
neighborhood-based model and the latent factor model, while throwing some
light on the cold start problem faced by CF techniques. Instead of using
the entire data for CF, they introduce the use of active learning, which
involves obtaining high-quality data that better represent a user’s
preferences, as a solution to the cold start problem. An excellent
comparison of the CF algorithms found in the literature has been made by
Cacheda et al. (2011) using several evaluation metrics, presenting the
merits and demerits of every technique. To deal with sparse datasets, they
have also introduced a new CF algorithm that focuses on the differences
between items or users rather than on their similarities. Shi et
al. (2014) have presented a brief review of CF, explaining the traditional
memory-based and model-based CF approaches in detail. In addition,
they survey some extended CF algorithms that make use of information
sources other than the user-item matrix and present the challenges
faced by them. From the perspective of e-vendors in the domain of
e-commerce, Karimova (2016) has presented a literature review of
various recommendation techniques, including the CF approaches. The
analysis reveals the limitations of CF, such as computational complexity
and accuracy. An excellent survey of CF has been done by Su and
Khoshgoftaar (2009), where the three categories of CF algorithms
(memory-based, model-based, and hybrid) have been studied in detail. The
strengths and weaknesses of these algorithms, as well as their predictive
performance, have been analyzed using several evaluation metrics. Finally,
the various challenges faced by CF, such as scalability, sparsity,
synonymy, gray sheep, shilling attacks, and privacy protection, have been
presented along with their possible solutions. Nagarnaik and Thomas (2015)
have surveyed various recommendation techniques, including CF algorithms,
explaining their categories. They have reviewed the literature on the
techniques proposed to overcome the challenges faced by CF algorithms.
Also, a hybrid CF technique has been proposed that combines CF techniques
and pattern-finding algorithms for better-quality web page recommendation.
Yang et al. (2016) have shown the entire framework of a typical CF-based
RS. They have surveyed in detail the working of CF algorithms (similarity
metrics, prediction algorithms, and neighbor selection), and case studies
have been presented to measure the accuracy of various CF algorithms using
evaluation metrics.


Journal articles and conference proceedings were collected from four major
electronic databases, i.e., ACM Digital Library, IEEE Xplore, Springer,
and ScienceDirect (Elsevier), to find the research trends in CF for
recommendation. The following queries were executed in Google Scholar:
a. (CF in RSs OR (issue OR issues OR challenge OR challenges OR
problem OR problems)).
b. (Neighborhood CF in recommendation systems OR (issue OR
issues OR challenge OR challenges OR problem OR problems)).
c. (CF OR neighborhood CF OR RSs)
Additional specifications were also set in the advanced
search of Google Scholar to further filter the papers:
a. 2011–2017 was selected in the date field.
b. IEEE OR Elsevier OR ACM OR Springer was selected in the
“published in” field.
The application of the above queries yielded around 500 research papers.
Of these, we selected only the 277 papers that are related
to computer science and its allied fields. The keywords and abstract of
each paper were used to categorize the collected papers, which led
to the results shown in Figures 8.4–8.6. Figure 8.4 depicts the
distribution of the selected 277 CF papers among the top publications.
Out of the 277 papers, 77 are model-based, 162 are
neighborhood-based, and the rest apply some other filtering technique in
addition to CF, as shown in Figure 8.5. Finally, Figure 8.6 shows that the
162 papers on neighborhood-based CF can be further segregated into
a number of sub-domains based on the problems they try to rectify.
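As a quick check on the counts reported above, the size and share of each category can be worked out directly. The percentages below are derived here from the stated totals and rounded to one decimal place; they are not figures quoted in the chapter.

```latex
\begin{align*}
\text{papers using other filtering techniques} &= 277 - 77 - 162 = 38 \\
\text{model-based share} &= 77 / 277 \approx 27.8\% \\
\text{neighborhood-based share} &= 162 / 277 \approx 58.5\% \\
\text{other techniques share} &= 38 / 277 \approx 13.7\% \\
\text{papers retained after screening} &= 277 / 500 \approx 55.4\%
\end{align*}
```

The last line uses the approximate figure of 500 retrieved papers, so it should be read as a rough proportion only.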
Figure 8.4 Distribution of the number of CF papers in each publication.
Figure 8.5 Number of papers in different categories of CF.
Figure 8.6 The percentage share of published papers in different categories of neighborhood-based CF.


The accuracy of neighborhood-based CF mainly depends upon the top-n
list of similar users or items (Yi et al., 2019; Soojung, 2019).
Recommendations based on these similar users/items tend towards more
popular items. But an ideal RS has other properties as well, such as
greater personalization, diversity, serendipity, and novelty.
Recommendations using neighborhood-based CF can be improved if researchers
combine information from social networks and contextual information about
the user or item with the rating information (Ambulgekar et al., 2019).

RSs have proved useful in several fields of e-services. Among the several
filtering approaches used in RSs, CF is the most common and popular. The
main assumption behind CF is that the ratings given to items by
similar users will be close and that similar items have similar rating
patterns. On this basis, CF suggests recommendable items to the users. CF
extracts user ratings either implicitly or explicitly. Several metrics are
used to find the user-user and item-item similarity. The similarity values
are then used to predict the recommendable items. Several prediction
approaches such as MC, WA, Z score (ZS), etc. are used. Different
performance metrics such as MAE, RMSE, coverage, etc. measure the
correctness of recommendations. For the effective implementation of a
CF-based RS, several challenges need to be addressed. New user and new
item, sparsity, scalability, synonymy, and LT problems are among them.
Security and trust attacks on RSs are also a major concern for people’s
acceptance of RSs. Owing to the utility of CF in RSs, it has attracted the
attention of academicians and researchers seeking to make it more
effective. The future of CF-based RSs will be more personalized and
diverse, with more serendipity.
KEYWORDS
data extraction
e-commerce
future direction

performance metrics
recommendation system
recommender system attacks
research trends
similarity metrics
REFERENCES
ACRnews, (2017). [Online]. Available: http://www.acr-news.com/ (accessed on 16 February
2020).
Ali, K., & Van Stam, W., (2004). “TiVo: Making show recommendations using a distributed
collaborative filtering architecture.” In: Proceedings of the Tenth ACM SIGKDD Interna-
tional Conference on Knowledge Discovery and Data Mining.
Alloway, T., (2018). “Amazon Urges California Referendum on Online Tax.” [Online]. Avail-
able: https://ftalphaville.ft.com/2011/07/12/619451/amazon-urges-california-referendum-
on-online-tax/ (accessed on 16 February 2020).
Ambulgekar, H. P., Manjiri, K. P., & Kokare, M. B., (2019). “A survey on collaborative
filtering: Tasks, approaches and applications.” In: Proceedings of International Ethical
Hacking Conference (pp. 289–300).
Amolochitis, E., Christou, I. T., & Tan, Z. H., (2014). “Implementing a commercial-strength
parallel hybrid movie recommendation engine.” IEEE Intelligent Systems, 29, 92–96.
Asnicar, F., & Tasso, C., (1997). “ifWeb: A prototype of user model-based intelligent agent
for document filtering and navigation in the world wide web.” In: Proceedings of 6th
International Conference on User Modeling.
Avesani, P., Massa, P., & Tiella, R., (2005). “Moleskiing.it: A Trust-aware Recommender
System for Ski Mountaineering.” In: Proceedings of the ACM Symposium on Applied
Computing.
Balabanović, M., & Shoham, Y., (1997). “Fab: Content-based, collaborative recommenda-
tion.” Communications of the ACM, 40, pp. 66–72.
Baltrunas, L., Ludwig, B., Peer, S., & Ricci, F., (2012). “Context relevance assessment and
exploitation in mobile recommender systems.” Personal and Ubiquitous Computing,
16(5), 507–526.
Baraglia, R., Frattari, C., Muntean, C. I., Nardini, F. M., & Silvestri, F., (2012). “RecTour:
A recommender system for tourists.” In: Proceedings of the 2012 IEEE/WIC/ACM
International Joint Conference on Web Intelligence and Intelligent Agent Technology.
BBC, (2018). “Orlando Figes to Pay Fake Amazon Review Damages.” [Online]. Available:
https://www.bbc.com/news/uk-10670407 (accessed on 16 February 2020).
Biletskiy, Y., Baghi, H., Keleberda, I., & Fleming, M., (2009). “An adjustable personalization
of search and delivery of learning objects to learners.” Expert Systems with Applications,
36(5), pp. 9113–9121.
Bilge, A., & Kaleli, C., (2014). “A multi-criteria item-based collaborative filtering
framework.” In: 11th International Joint Conference on Computer Science and Software
Engineering.
Billsus, D., & Pazzani, M. J., (2000). “User modeling for adaptive news access.” User
Modeling and User-Adapted Interaction, 10, pp. 147–180.
Bjelica, M., (2010). “Towards TV recommender system: Experiments with user modeling.”
IEEE Transactions on Consumer Electronics, 56(3), 1763–1769.
Blanco-Fernández, Y., Arias, J. J. P., Nores, M. L., Gil-Solla, A., & Cabrer, M. R., (2006).
“AVATAR: An improved solution for personalized TV based on semantic inference.”
IEEE Transactions on Consumer Electronics, 52(1), pp. 223–231.
Bobadilla, J., Hernando, A., Ortega, F., & Abraham, G., (2012). Collaborative filtering
based on significances. Information Sciences, 185(1), pp. 1–17.
Boutet, A., Frey, D., Guerraoui, R., Jegou, A., & Kermarrec, A. M., (2013). “WhatsUp: A
decentralized instant news recommender.” In: IEEE 27th International Symposium on
Parallel Distributed Processing.
Braunhofer, M., Kaminskas, M., & Ricci, F., (2013). “Location-aware music recommenda-
tion.” International Journal of Multimedia Information Retrieval, 2(1), 31–44.
Burke, R. D., Hammond, K. J., & Young, B. C., (1996). “Knowledge-based navigation of
complex information spaces.” In: Proceedings of the Thirteenth National Conference
on Artificial Intelligence and Eighth Innovative Applications of Artificial Intelligence
Conference (AAAI). Portland, Oregon.
Burke, R., (1999). “The wasabi personal shopper: A case-based recommender system.”
In: Proceedings of the 11th National Conference on Innovative Applications of Artificial
Intelligence.
Burke, R., (2002). “Hybrid recommender systems: Survey and experiments.” User
Modeling and User-Adapted Interaction, 12(4), pp. 331–370.
Burke, R., Mobasher, B., Bhaumik, R., & Williams, C., (2005). “Segment-based injection
attacks against collaborative filtering recommender systems.” In: Fifth IEEE International
Conference on Data Mining (ICDM’05).
Cacheda, F., Carneiro, V., Fernández, D., & Formoso, V., (2011). “Comparison of
collaborative filtering algorithms: Limitations of current techniques and proposals for
scalable, high-performance recommender systems.” In: TWEB (Vol. 5, No. 1/2, p. 33).
Cao, Y., & Li, Y., (2007). “An intelligent fuzzy-based recommendation system for consumer
electronic products.” Expert Systems with Applications, 33(1), pp. 230–240.
Capuano, N., Gaeta, M., Ritrovato, P., & Salerno, S., (2014). “Elicitation of latent learning
needs through learning goals recommendation.” Computers in Human Behavior, 30, pp.
663–673.
Celma, Ò., & Serra, X., (2008). “FOAFing the music: Bridging the semantic gap in music
recommendation.” Web Semantics: Science, Services and Agents on the World Wide Web,
6(4).
Chen, C. M., & Duh, L. J., (2008). “Personalized web-based tutoring system based on
fuzzy item response theory.” Expert Systems with Applications, 34, pp. 2298–2315.
Chen, C. M., Duh, L. J., & Liu, C. Y., (2004). “A personalized courseware recommendation
system based on fuzzy item response theory.” In: IEEE International Conference on
e-Technology, e-Commerce and e-Service.
Chen, L., & Pu, P., (2012). “Critiquing-based recommenders: Survey and emerging trends.”
User Modeling and User-Adapted Interaction, 22(1), pp. 125–150.
Chesnevar, C. I., & Maguitman, A. G., (2004). “ArgueNet: An argument-based recom-
mender system for solving Web search queries.” In: 2nd International IEEE Conference
on Intelligent Systems.
Christou, I. T., Amolochitis, E., & Tan, Z. H., (2016). “AMORE: design and implementation
of a commercial-strength parallel hybrid movie recommendation engine.” Knowledge
and Information Systems, 47(3), 671–696.
Cobos, C., Rodriguez, O., Rivera, J., Betancourt, J., Mendoza, M., León, E., &
Herrera-Viedma, E., (2013). “A hybrid system of pedagogical pattern recommendations based on
singular value decomposition and variable data attributes.” Information Processing and
Management: An International Journal, 49, 607–625.
Console, L., Torre, I., Lombardi, I., Gioria, S., & Surano, V., (2003). “Personalized and
adaptive services on board a car: An application for tourist information.” Journal of
Intelligent Information Systems, 21(3), pp. 249–284.
Cornelis, C., Lu, J., Guo, X., & Zhang, G., (2007). “One-and-only item recommendation
with fuzzy logic techniques.” Information Sciences, 177, pp. 4906–4921.
Daniel, M. F., & Kartik, H., (2007). “Recommender systems and their impact on sales
diversity.” In: Proceedings of the 8th ACM Conference on Electronic Commerce (EC ‘07)
(pp. 192–199). ACM, New York, NY, USA.
Deshpande, M., & Karypis, G., (2004). “Item-based top-n recommendation Algorithms.”
ACM Transactions on Information Systems (TOIS), 22, pp. 143–177.
Domingues, M., Gouyon, F., Jorge, A., Leal, J., Vinagre, J., Lemos, L., & Sordo, M.,
(2013). “Combining usage and content in an online recommendation system for music
in the long tail.” International Journal of Multimedia Information Retrieval, 2(1), 3–13.
Elahi, M., Ricci, F., & Rubens, N., (2016). “A survey of active learning in collaborative
filtering recommender systems.” In: Computer Science Review (Vol. 20, pp. 29–50).
Esteban, B., Tejeda-Lorente, Á., Porcel, C., Arroyo, M., & Herrera-Viedma, E., (2014).
“TPLUFIB-WEB: A fuzzy linguistic Web system to help in the treatment of low back
pain problems.” Knowledge-Based Systems, 67, pp. 429–438.
Farzan, R., & Brusilovsky, P., (2006). “Social navigation support in a course recommendation
system.” In: Adaptive Hypermedia and Adaptive Web-Based Systems: 4th International
Conference (AH 2006). Dublin, Ireland.
Fesenmaier, D. R., Ricci, F., Schaumlechner, E., Wöber, K., & Zanella, C., (2003).
“DIETORECS: Travel advisory for multiple decision styles.” In: Proceedings of the
International Conference on Information and Communication Technologies in Tourism.
Wien, Austria.
FoxTrit, (2017). [Online]. Available: http://www.foxtrot.com/wp-content/endurance-page-
cache/_index.html (accessed on 16 February 2020).
García-Crespo, A., Chamizo, J., Rivera, I., Mencke, M., Colomo-Palacios, R., & Gómez-
Berbís, J. M., (2009). “SPETA: Social pervasive e-tourism advisor.” Telematics and
Informatics, 26, pp. 306–315.
García-Crespo, Á., López-Cuadrado, J. L., Colomo-Palacios, R., González-Carrasco, I., &
Ruiz-Mezcua, B., (2011). “Sem-Fit: A semantic based expert system to provide recommen-
dations in the tourism domain.” Expert Systems with Applications, 38, pp. 13310–13319.
Garfinkel, R., Gopal, R., Tripathi, A., & Yin, F., (2006). “Design of a shopbot and recom-
mender system for bundle purchases.” Decision Support Systems, 42(3), pp. 1974–1986.
Guo, X., & Lu, J., (2007). “Intelligent e-government services with personalized recom-
mendation techniques.” International Journal of Intelligent Systems, 22, pp. 401–417.
Hauver, D., & French, J., (2001). “Flycasting: using collaborative filtering to generate a
playlist for online radio.” In: First International Conference on Web Delivering of Music.
Hayes, C., & Cunningham, P., (2001). “Smart radio-community based music radio.” Knowl-
edge Based Systems, 14, pp. 197–201.
Herlocker, J. L., Konstan, J. A., Borchers, A., & Riedl, J., (1999). “An algorithmic framework
for performing collaborative filtering.” In: 22nd Annual International ACM SIGIR
Conference on Research and Development in Information Retrieval.
Hongzhi, Y., Bin, C., Jing, L., Junjie, Y., & Chen, C., (2012). “Challenging the long tail
recommendation.” In: Proc. VLDB Endow., 5(9), 896–907.
Hotho, A., Jäschke, R., Schmitz, C., & Stumme, G., (2006). “Information retrieval in folk-
sonomies: Search and ranking.” In: The Semantic Web: Research and Applications: 3rd
European Semantic Web Conference. Budva, Montenegro.
Hu, J., Wang, B., Liu, Y., & Li, D. Y., (2012). “Personalized tag recommendation using
social influence.” Journal of Computer Science and Technology, 27(3), pp. 527–540.
IMDb, (2017). [Online]. Available: http://www.imdb.com/ (accessed on 16 February 2020).
Jäschke, R., Marinho, L. B., Hotho, A., Schmidt-Thieme, L., & Stumme, G., (2007). “Tag
Recommendations in Folksonomies.” In: 11th European Conference on Principles and
Practice of Knowledge Discovery in Databases. Warsaw, Poland.
Jinni, (2017). [Online]. Available: http://www.jinni.com/ (accessed on 16 February 2020).
Karimova, F., (2016). “A survey of e-commerce recommender systems.” European
Scientific Journal, 12(34), 75–89.
Kaur, P., & Goel, S., (2016). “Shilling attack models in recommender system.” In: Inter-
national Conference on Inventive Computation Technologies (ICICT) (Vol. 2, pp. 1–5).
Konstan, J. A., Miller, B. N., Maltz, D., Herlocker, J. L., Gordon, L. R., & Riedl, J., (1997).
“GroupLens: Applying collaborative filtering to Usenet news.” Communications of the
ACM, 40, pp. 77–87.
Lakshmi, S. S., & Lakshmi, T. A., (2014). “Recommendation systems: Issues and challenges.”
In: International Journal of Computer Science and Information Technologies, 5.
Lam, S. K., & Riedl, J., (2004). “Shilling recommender systems for fun and profit.” In:
Proceedings of the 13th International Conference on World Wide Web (pp. 393–402).
Lawrence, R., Almasi, G., Kotlyar, V., Viveros, M., & Duri, S., (2001). “Personalization of
supermarket product recommendations.” Data Mining and Knowledge Discovery, 5(1),
pp. 11–32.
Lee, S. K., Cho, Y. H., & Kim, S. H., (2010). “Collaborative filtering with ordinal scale-based
implicit ratings for mobile music recommendations.” Information Sciences, 180(11), pp.
2142–2155.
Lei, S., (2013). “Trading-off among accuracy, similarity, diversity, and long-tail: A
graph-based recommendation approach.” In: Proceedings of the 7th ACM Conference on
Recommender Systems.
Levandoski, J. J., Sarwat, M., Eldawy, A., & Mokbel, M. F., (2012). “LARS: A location-
aware recommender system.” In: Proceedings of the 2012 IEEE 28th International
Conference on Data Engineering.
Li, D., Miao, C., Chu, S., Mallen, J., Yoshioka, T., & Srivastava, P., (2018). “Stable Matrix
Approximation for Top-n Recommendation on Implicit Feedback Data.” In: Hawaii
International Conference on System Sciences 2018 (HICSS-51).
Liu, Q., & Karger, D. R., (2015). “Kibitz: End-to-end recommendation system builder.”
In: RecSys.
Lu, J., (2004). “Personalized e-learning material recommender system.” In: Proceedings of
International Conference on Information Technology for Application.
Lu, J., Shambour, Q., Xu, Y., Lin, Q., & Zhang, G., (2010). “BizSeeker: A hybrid semantic
recommendation system for personalized government-to-business e-services.” Internet
Research, 20, pp. 342–365.
Luz, N., Moreno, M., Anacleto, R., Almeida, A., & Martins, C., (2013). “A hybrid recom-
mendation approach for a tourism system.” Expert Systems with Applications, 9(40),
3532–3550.
Marcel, M. A., Ball, M., Boley, H., Greene, S., Howse, N., Lemire, D., & Mcgrath, S.,
(2003). “RACOFI: A rule-applying collaborative filtering system.” In: Proc. IEEE/WIC
COLA’03. Halifax, Canada.
Martín-Vicente, M. I., Gil-Solla, A., Ramos-Cabrer, M., Blanco-Fernández, Y., & Servia-
Rodríguez, S., (2012). “Semantics-driven recommendation of coupons through digital
TV: Exploiting synergies with social networks.” In: IEEE International Conference on
Consumer Electronics.
Martínez, L., Rodríguez, R. M., & Espinilla, M., (2009). “REJA: A georeferenced hybrid
recommender system for restaurants.” In: Proceedings of the 2009 IEEE/WIC/ACM
International Joint Conference on Web Intelligence and Intelligent Agent Technology
(pp. 187–190).
Mccarthy, K., Reilly, J., Mcginty, L., & Smyth, B., (2004). “Thinking positively-explanatory
feedback for conversational recommender systems.” In: Proceedings of the ECCBR 2004
Workshops.
Melville, P., Mooney, R. J., & Nagarajan, R., (2002). “Content-boosted collaborative filtering
for improved recommendations.” In: Eighteenth National Conference on Artificial Intel-
ligence. Edmonton, Alberta, Canada.
Meo, P. D., Quattrone, G., & Ursino, D., (2008). “A decision support system for designing
new services tailored to citizen profiles in a complex and distributed e-government
scenario.” Data and Knowledge Engineering, 67, pp. 161–184.
Miller, B. N., Konstan, J. A., & Riedl, J., (2004). “PocketLens: Toward a personal recom-
mender system.” ACM Transactions on Information Systems, 22(3), pp. 437–476.
Mobasher, B., Burke, R., Bhaumik, R., & Sandvig, J. J., (2007). “Attacks and remedies in
collaborative recommendation.” IEEE Intelligent Systems, 22(3), 56–63.
Mobasher, B., Burke, R., Bhaumik, R., & Williams, C., (2007). “Toward trustworthy
recommender systems: An analysis of attack models and algorithm robustness.” ACM Trans.
Internet Technol., 7(4).
Mobasher, B., Burke, R., Bhaumik, R., & Williams, C., (2005). “Effective attack models
for shilling item-based collaborative filtering systems.” In: Proceedings of the WebKDD
Workshop, Held in Conjunction with ACM SIGKDD2005.
Moreno, A., Valls, A., Isern, D., Marin, L., & Borràs, J., (2013). “SigTur/E-destination:
Ontology-based personalized recommendation of tourism and leisure activities.”
Engineering Applications of Artificial Intelligence, 26(1), pp. 633–651.
Moukas, A., & Maes, P., (1998). “Amalthaea: An evolving multi-agent information filtering
and discovery system for the WWW.” Autonomous Agents and Multi-Agent Systems, 1(1),
59–88.
Movielens, (2017). [Online]. Available: https://movielens.org/ (accessed on 16 February 2020).
Nagarnaik, P., & Thomas, A., (2015). “Survey on recommendation system methods.” In:
2nd International Conference on Electronics and Communication Systems (ICECS) (pp.
1603–1608).
nanoCROWD, (2017). [Online]. Available: http://nanocrowd.com/ (accessed on 16 February
2020).
Nanopoulos, A., Rafailidis, D., Symeonidis, P., & Manolopoulos, Y., (2010). “Music box:
Personalized music recommendation based on cubic analysis of social tags.” IEEE
Transactions on Audio, Speech and Language Processing, 18(2), pp. 407–412.
Natarajan, N., Shin, D., & Dhillon, I. S., (2013). “Which app will you use next?: Collaborative
filtering with interactional context.” In: Proceedings of the 7th ACM Conference on
Recommender Systems.
Nguyen, T. T. S., Lu, H. Y., & Lu, J., (2014). “Web-page recommendation based on web
usage and domain knowledge.” IEEE Transactions on Knowledge and Data Engineering,
26(10), pp. 2574–2587.
Oh, J., Kim, S., Kim, J., & Yu, H., (2014). “When to recommend: A new issue on TV show
recommendation.” Information Sciences, 280, pp. 261–274.
Oscar, C., (2010). “Music Recommendation and Discovery: The Long Tail, Long Fail, and
Long Play in the Digital Music Space.” Springer Publishing Company, Incorporated.
Pampín, H. J. C., Jerbi, H., & O’Mahony, M. P., (2015). “Evaluating the relative performance
of collaborative filtering recommender systems.” Journal of Universal Computer Science,
21(13), 1849–1868.
Park, Y. J., (2013). “An adaptive match-making system reflecting the explicit and implicit
preferences of users.” Expert Systems with Applications: An International Journal, 40,
1196–1204.
Park, Y. J., & Tuzhilin, A., (2008). “The long tail of recommender systems and
how to leverage it.” In: Proceedings of the 2008 ACM Conference on Recommender
Systems (pp. 11–18).
Parra, D., Brusilovsky, P., & Trattner, C., (2014). “See what you want to see: Visual user-
driven approach for hybrid recommendation.” In: Proceedings of the 19th International
Conference on Intelligent User Interfaces.
Pashtan, A., Blattler, R., Andi, A. H., & Scheuermann, P., (2003). “CATIS: A context-aware
tourist information system.” In: Proceedings of the 4th International Workshop of Mobile
Computing.
Poonam, T. B., Goudar, R. M., & Sunita, B., (2015). “Article: Survey on collaborative
filtering, content-based filtering and hybrid recommendation system.” International
Journal of Computer Applications, 31–36.
Porcel, C., & Herrera-Viedma, E., (2010). “Dealing with incomplete information in a
fuzzy linguistic recommender system to disseminate information in university digital
libraries.” Knowledge-Based Systems, 23, pp. 32–39.
Porcel, C., Herrera-Viedma, E., & Moreno, J. M., (2009). “A multi-discipliner recommender
system to advice research resources in university digital libraries.” Expert Systems with
Applications, 36, pp. 12520–12528.
Porcel, C., López-Herrera, A. G., & Herrera-Viedma, E., (2009). “A recommender system
for research resources based on fuzzy linguistic modeling.” Expert Systems with
Applications: An International Journal, 36, pp. 5173–5183.
Pratikshashiv, (2015). “Flipkart Uses Collaborative Based Filtering.” [Online]. Available:
https://pratikshashiv.wordpress.com/ (accessed on 16 February 2020).
Renda, M. E., & Straccia, U., (2005). “A personalized collaborative digital library
environment: A model and an application.” Information Processing and Management: An
International Journal, 41, 5–21.
Rikitianskii, A., Harvey, M., & Crestani, F., (2014). “A personalized recommendation system
for context-aware suggestions.” In: Advances in Information Retrieval: 36th European
Conference on IR Research. ECIR.
Rotten Tomatoes, (2017). [Online]. Available: https://www.rottentomatoes.com/ (accessed
on 16 February 2020).
Ruotsalo, T., Haav, K., Stoyanov, A., Roche, S., Fani, E., Deliai, R., Mäkelä, E., Kauppinen,
T., & Hyvönen, E., (2013). “Smart museum: A mobile recommender system for the web
of data.” Web Semantics: Science, Services and Agents on the World Wide Web, 20.
Samundeeswary, K., & Krishnamurthy, V., (2017, June). “Comparative study of recommender
systems built using various methods of collaborative filtering algorithm.” In:
2017 International Conference on Computational Intelligence in Data Science (ICCIDS)
(pp. 1–6). IEEE.
Salter, J., & Antonopoulos, N., (2006). “Cinema screen recommender agent: Combining
collaborative and content-based filtering.” IEEE Intelligent Systems, 21(1), pp. 35–41.
Sánchez, L. Q., Recio-García, J. A., & Díaz-Agudo, B., (2011). “Happy movie: A Facebook
application for recommending movies to groups.” In: 23rd International Conference on
Tools with Artificial Intelligence (ICTAI).
Santos, O. C., Boticario, J. G., & Pérez-Marín, D., (2014). “Extending web-based
educational systems with personalized support through user centered designed
recommendations along the e-learning life cycle.” Science of Computer Programming,
88, pp. 92–109.
Sarwar, B., Karypis, G., Konstan, J., & Riedl, J., (2001). “Item-based collaborative filtering
recommendation algorithms.” In: 10th International Conference on World Wide Web.
Schiaffino, S., & Amandi, A., (2009). “Building an expert travel agent as a software agent.”
Expert Systems with Applications: An International Journal, 36(2), 1291–1299.
Serrano-Guerrero, J., Herrera-Viedma, E., Olivas, J. A., Cerezo, A., & Romero, F. P.,
(2011). “A google wave-based fuzzy recommender system to disseminate information
in university digital libraries 2.0.” Information Sciences: An International Journal, 181,
1503–1516.
Shi, Y., Larson, M., & Hanjalic, A., (2014). “Collaborative filtering beyond the user-item
matrix: A survey of the state of the art and future challenges.” ACM Computing Surveys,
47(1), Article 3, pp. 1–45.
Singh, P. K., Pramanik, P. K. D., & Choudhury, P., (2018). “A comparative study of different
similarity metrics in highly sparse rating dataset.” In: Data Management, Analytics and
Innovation, Proceedings of ICDMAI (Vol. 2, pp. 45–60). Springer.
Smyth, B., & Cotter, P., (2000). “A personalized television listings service.” Communications
of the ACM, 43(8), pp. 107–111.
Soojung, L., (2019). “Using entropy for similarity measures in collaborative filtering.” In:
Journal of Ambient Intelligence and Humanized Computing.
Su, X., & Khoshgoftaar, T. M., (2009). “A survey of collaborative filtering techniques.”
Advances in Artificial Intelligence. Article ID 421425.
TASTEKiD, (2017). [Online]. Available: https://www.tastekid.com/ (accessed on 16 February
2020).
Teran, L., & Meier, A., (2010). “A fuzzy recommender system for elections.” In: Electronic
Government and the Information Systems Perspective, First International Conference
(EGOVIS). Bilbao, Spain.
Tung, H. W., & Soo, V. W., (2004). “A personalized restaurant recommender agent for
mobile e-service.” In: Proceedings of the 2004 IEEE International Conference on
e-Technology, e-Commerce and e-Service (EEE’04). Washington, DC, USA.
Walter, F. E., Battiston, S., Yildirim, M., & Schweitzer, F., (2012). “Moving recommender
systems from on-line commerce to retail stores.” Information Systems and e-Business
Management, 10(3), pp. 367–393.
Wei, K., Huang, J., & Fu, S., (2007). “A survey of e-commerce recommender systems.” In:
International Conference on Service Systems and Service Management (pp. 1–5).
Wu, D., Zhang, G., & Lu, J., (2015). “A fuzzy preference tree-based recommender system
for personalized business-to-business e-services.” IEEE Transactions on Fuzzy Systems,
23, pp. 29–43.
Wu, J., Chen, L., Feng, Z., Zhou, M., & Wu, Z., (2013). “Predicting quality of service for
selection by neighborhood-based collaborative filtering.” IEEE Transactions on Systems,
Man, and Cybernetics: Systems, 43(2), pp. 428–439.
Xiaoyuan, S., & Taghi, K. M., (2009). “A Survey of Collaborative Filtering Techniques.”
Adv. in Artif. Intell.
Xing, X., Yu, Z., Liuhang, Z., & Yuan, N. J., (2013). “T-finder: A recommender system
for finding passengers and vacant taxis.” IEEE Transactions on Knowledge and Data
Engineering, 25, pp. 2390–2403.
Yang, X., Guo, Y., Liu, Y., & Steck, H., (2014). “A survey of collaborative filtering based
social recommender systems.” In: Computer Communications, 41, 1–10.
Yang, Z., Wu, B., Zheng, K., Wang, X., & Lei, L., (2016). “A survey of collaborative
filtering-based recommender systems for mobile internet applications.” IEEE Access, 4,
pp. 3273–3287.
Yanga, W. S., & Hwang, S. Y., (2013). “iTravel: A recommender system in mobile peer-to-
peer environment.” Journal of Systems and Software, 86(1), 12–20.
Yi, M., Nianhao, X., Ruichun, T., Liang, L., & Xiaohan, Y., (2019). “An efficient similarity
measure for collaborative filtering.” Procedia Computer Science, 147, pp. 416–421.
Yin, H., Cui, B., Sun, Y., Hu, Z., & Chen, L., (2014). “LCARS: A spatial item recommender
system.” ACM Transactions on Information Systems, 32(3), Article 11, pp. 1–37.
Zaïane, O. R., (2002). “Building a recommender agent for e-learning systems.” In:
Proceedings of the International Conference on Computers in Education. Washington, DC, USA.
Zhang, H., Gao, Y., Chen, H., & Li, Y., (2012). “TravelHub: A semantics-based mobile
recommender for composite services.” In: 16th International Conference on Computer
Supported Cooperative Work in Design (CSCWD).
Zhang, Z. K., Zhou, T., & Zhang, Y. C., (2011). “Tag-aware recommender systems: A
state-of-the-art survey.” Journal of Computer Science and Technology, 26.
Zhang, Z., Lin, H., Liu, K., Wu, D., Zhang, G., & Lu, J., (2013). “A hybrid fuzzy-based
personalized recommender system for telecom products/services.” Information Sciences,
235, pp. 117–129.
Zhao, W. X., Li, S., He, Y., Wang, L., Wen, J. R., & Li, X., (2016). “Exploring demographic
information in social media for product recommendation.” Knowledge and Information
Systems, 49(1), pp. 61–89.
Zhao, X. W., Guo, Y., He, Y., Jiang, H., Wu, Y., & Li, X., (2014). “We know what you
want to buy: A demographic-based system for product recommendation on microblogs.”
In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining.
Zuva, K., & Zuva, T., (2017). “Diversity and serendipity in recommender systems.” In:
Proceedings of the International Conference on Big Data and Internet of Thing.