Item-based Collaborative Filtering Recommendation Algorithms

Abstract

Recommender systems apply knowledge discovery techniques to the problem of making personalized recommendations for information, products or services during a live interaction. These systems, especially the k-nearest neighbor collaborative filtering based ones, are achieving widespread success on the Web. The tremendous growth in the amount of available information and the number of visitors to Web sites in recent years poses some key challenges for recommender systems. These are: producing high quality recommendations, performing many recommendations per second for millions of users and items, and achieving high coverage in the face of data sparsity. In traditional collaborative filtering systems the amount of work increases with the number of participants in the system. New recommender system technologies are needed that can quickly produce high quality recommendations, even for very large-scale problems. To address these issues we have explored item-based collaborative filtering techniques. Item-based techniques first analyze the user-item matrix to identify relationships between different items, and then use these relationships to indirectly compute recommendations for users. In this paper we analyze different item-based recommendation generation algorithms. We look into different techniques for computing item-item similarities (e.g., item-item correlation vs. cosine similarities between item vectors) and different techniques for obtaining recommendations from them (e.g., weighted sum vs. regression model). Finally, we experimentally evaluate our results and compare them to the basic k-nearest neighbor approach. Our experiments suggest that item-based algorithms provide dramatically better performance than user-based algorithms, while at the same time p...
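The two phases described in the abstract, relating items through the user-item matrix and then deriving a prediction from those relationships, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the dictionary layout of the ratings, the toy data, the neighborhood size `k`, and the choice to treat missing ratings as 0 in the cosine measure are all hypothetical.

```python
import math

def cosine_item_similarity(ratings, i, j):
    """Cosine similarity between the rating vectors of items i and j.

    `ratings` maps user -> {item: rating}. Missing ratings count as 0,
    which is an assumption of this sketch.
    """
    dot = norm_i = norm_j = 0.0
    for user_ratings in ratings.values():
        r_i = user_ratings.get(i, 0.0)
        r_j = user_ratings.get(j, 0.0)
        dot += r_i * r_j
        norm_i += r_i * r_i
        norm_j += r_j * r_j
    if norm_i == 0.0 or norm_j == 0.0:
        return 0.0
    return dot / (math.sqrt(norm_i) * math.sqrt(norm_j))

def predict_weighted_sum(ratings, user, target, k=2):
    """Predict `user`'s rating of `target` as a similarity-weighted
    average of the user's own ratings on the k most similar items."""
    scored = [
        (cosine_item_similarity(ratings, target, item), rating)
        for item, rating in ratings[user].items()
        if item != target
    ]
    top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    denom = sum(abs(sim) for sim, _ in top)
    if denom == 0.0:
        return None  # no usable neighbors
    return sum(sim * rating for sim, rating in top) / denom

# Hypothetical toy data: dave has not rated item "C".
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 2, "C": 5},
    "carol": {"A": 1, "B": 5},
    "dave": {"A": 5, "B": 3},
}
prediction = predict_weighted_sum(ratings, "dave", "C")
```

Because the item-item similarities depend only on the ratings matrix, they could be computed ahead of time and reused across users; the sketch recomputes them on each call only to stay short.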
Item-Based Collaborative Filtering Recommendation Algorithms
Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl
GroupLens Research Group/Army HPC Research Center
Department of Computer Science and Engineering
University of Minnesota, Minneapolis, MN 55455
ABSTRACT
1. INTRODUCTION
Copyright is held by the author/owner.
WWW10, May 1-5, 2001, Hong Kong.
ACM 1-58113-348-0/01/0005.
1.1 Related Work
1.2 Contributions
1.3 Organization
2. COLLABORATIVE FILTERING BASED RECOMMENDER SYSTEMS
2.0.1 Overview of the Collaborative Filtering Process
2.0.2 Challenges of User-based Collaborative Filtering Algorithms
[Figure: The collaborative filtering process. Input (ratings table): users u_1, u_2, ..., u_a (the active user), ..., u_m by items i_1, i_2, ..., i_j (the item for which a prediction is sought), ..., i_n. The CF algorithm feeds an output interface that returns a prediction P_{a,j} (prediction on item j for the active user) and a recommendation {T_i1, T_i2, ..., T_iN} (Top-N list of items for the active user).]
3. ITEM-BASED COLLABORATIVE FILTERING ALGORITHM
3.1 Item Similarity Computation
3.1.1 Cosine-based Similarity
3.1.2 Correlation-based Similarity
3.1.3 Adjusted Cosine Similarity
[Figure: Ratings matrix with users 1, 2, ..., u, ..., m-1, m as rows and items 1, 2, 3, ..., i, ..., j, ..., n-1, n as columns, highlighting the ratings R in columns i and j. Item-item similarity is computed by looking into co-rated items only. For items i and j, the similarity s_{i,j} is computed from the co-rated pairs. Note: each of these co-rated pairs is obtained from a different user; in this example they come from users 1, u, and m-1.]
3.2 Prediction Computation
3.2.1 Weighted Sum
3.2.2 Regression
3.3 Performance Implications
1 2 3 ii-1 i+1 n-1 n
1
2
u
m
m-1
2nd 1st 3rd 5th4th
Ranking of the items similar to the i-th item
R R R R
u
R R R R
i1 2 3 i-1 m-1 m
s
i,1
s
i,3
s
i,i-1
s
i,m
-
-
prediction
weighted sum regression-based
4. EXPERIMENTAL EVALUATION
4.1 Data set
4.2 Evaluation Metrics
4.2.1 Experimental Procedure
Experimental steps.
Benchmark user-based system.
Experimental platform.
4.3 Experimental Results
4.3.1 Effect of Similarity Algorithms
[Figure: Relative performance of different similarity measures. Vertical axis: MAE (0.66 to 0.86). Measures compared: adjusted cosine, pure cosine, correlation.]
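MAE, the quantity on the vertical axis of the accuracy plots in this section, is the mean absolute error between predicted and actual ratings; lower values indicate more accurate predictions. A short Python sketch follows, with a hypothetical function name and made-up sample values.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute deviation between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(p - a) for p, a in zip(predicted, actual))
    return total / len(predicted)

# Made-up predictions and held-out ratings for three user-item pairs.
mae = mean_absolute_error([4.0, 3.5, 2.0], [5, 3, 2])
```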
4.3.2 Sensitivity of Training/Test Ratio
4.3.3 Experiments with neighborhood size
4.3.4 Quality Experiments
[Figure: Sensitivity of the parameter x. Horizontal axis: train/test ratio, x (0.2 to 0.9). Vertical axis: MAE (0.73 to 0.87). Series: itm-itm, itm-reg.]
[Figure: Sensitivity of the neighborhood size. Horizontal axis: no. of neighbors (10 to 200). Vertical axis: MAE (0.736 to 0.751). Series: itm-itm, itm-reg.]
4.3.5 Performance Results
4.4 Sensitivity of the Model Size
[Figure: Sensitivity of the model size (at selected train/test ratio). Horizontal axis: model size (25 to 200, and full item-item). Vertical axis: MAE (0.72 to 0.84). Series: x=0.3, x=0.5, x=0.8.]
[Figure: Item-item vs. user-user at selected neighborhood sizes (at x=0.8). Horizontal axis: no. of neighbors (10, 20, 60, 90, 125, 200). Vertical axis: MAE (0.725 to 0.755). Series: user-user, item-item, item-item-regression, nonpers.]
[Figure: Item-item vs. user-user at selected density levels (at no. of neighbors = 30). Horizontal axis: train/test ratio, x (0.2, 0.5, 0.8, 0.9). Vertical axis: MAE (0.72 to 0.84). Series: user-user, item-item, item-item-regression, nonpers.]
4.4.1 Impact of the model size on run-time and throughput
4.5 Discussion
5. CONCLUSION
6. ACKNOWLEDGMENTS
[Figure: Recommendation time vs. model size (at selected train/test ratio). Horizontal axis: model size (25 to 200, and full item-item). Vertical axis: rec. time in seconds (0.75 to 35.75). Series: x=0.3, x=0.5, x=0.8.]
[Figure: Throughput vs. model size (at selected train/test ratio). Horizontal axis: model size (25 to 200, and full item-item). Vertical axis: throughput in recs./sec (0 to 100000). Series: x=0.3, x=0.5, x=0.8.]
7. REFERENCES
... In order to showcase the ability of LARP to learn effective and generalizable representations, we integrate the representations generated by Equations 10 and 12 with three cold-start playlist continuation methods: ItemKNN [45], DropoutNet [56], and CLCRec [58]. It should be noted that these methods are originally designed for cold-start recommendation and we adapt them to the task of cold-start playlist continuation (please see B for more details). ...
... • ItemKNN [45] is a parameter-free approach which relies on the cosine similarity between the input partial playlist ē and all the candidate tracks ES, to find the most similar tracks. • DropoutNet [56] is designed to treat the cold-start as a robustness task in which sparse interactive signals are reconstructed using content features. ...
... Ablation study with ItemKNN [45] as the recommender. The last row is our final method, LARP-TPC-fusion. ...
Preprint
As online music consumption increasingly shifts towards playlist-based listening, the task of playlist continuation, in which an algorithm suggests songs to extend a playlist in a personalized and musically cohesive manner, has become vital to the success of music streaming. Currently, many existing playlist continuation approaches rely on collaborative filtering methods to perform recommendation. However, such methods will struggle to recommend songs that lack interaction data, an issue known as the cold-start problem. Current approaches to this challenge design complex mechanisms for extracting relational signals from sparse collaborative data and integrating them into content representations. However, these approaches leave content representation learning out of scope and utilize frozen, pre-trained content models that may not be aligned with the distribution or format of a specific musical setting. Furthermore, even the musical state-of-the-art content modules are either (1) incompatible with the cold-start setting or (2) unable to effectively integrate cross-modal and relational signals. In this paper, we introduce LARP, a multi-modal cold-start playlist continuation model, to effectively overcome these limitations. LARP is a three-stage contrastive learning framework that integrates both multi-modal and relational signals into its learned representations. Our framework uses increasing stages of task-specific abstraction: within-track (language-audio) contrastive loss, track-track contrastive loss, and track-playlist contrastive loss. Experimental results on two publicly available datasets demonstrate the efficacy of LARP over uni-modal and multi-modal models for playlist continuation in a cold-start setting. Code and dataset are released at: https://github.com/Rsalganik1123/LARP.
Article
Selecting the right academic major significantly shapes an individual's future career path, making it a longstanding focus of research. The shift to online platforms, accelerated by the challenges posed by the coronavirus pandemic, has transformed counseling and guidance systems. Consequently, developing robust online support systems has become imperative for extending guidance to all students. This article introduces the design, development, and evaluation of “My Future Career,” a multidimensional recommendation system (RS) crafted to aid students in navigating university and academic major selection decisions. The system relies on three key student‐driven parameters: central university entrance exam scores, rankings, and occupational personality types, utilizing cosine similarity and normalized distance to align user and item profiles. Following the system's completion, an assessment was conducted using data from real users, revealing an impressive accuracy (hit rate 100%, precision 88%) in recommendations following the inclusion of contextual post‐filtering features. The findings not only highlight the system's effectiveness but also underscore the positive user experience, as students express contentment with its ease of use and practical utility. The results emphasize the endorsement of experts regarding the system's consistency (52%), relevance (96%), and acceptance (96%) in providing recommendations.
Chapter
This paper addresses the issue of how to effectively use users' historical data in restaurant recommender systems, as opposed to systems, such as FindMe, that only rely on online operations. Towards that end, the authors propose a bias-based SVD method as the underlying recommendation algorithm and test it against the traditional item-based collaborative filtering method on the Entrée restaurant dataset. The results are promising as the obtained Root-Mean-Square-Error (RMSE) values reach 0.58 for the SVD and 0.62 for the item-based system. Researchers can extend the transformation from user behaviors to ratings in more application domains other than the restaurant one.
Article
Purpose: The collaborative filtering algorithm is a classical and widely used approach in product recommendation systems. However, the existing algorithms rely mostly on common ratings of items and do not consider temporal information about items or user interests. To solve this problem, this study proposes a new user-item composite filtering (UICF) recommendation framework by leveraging temporal semantics. Design/methodology/approach: The UICF framework fully utilizes the time information of item ratings for measuring the similarity of items and takes into account the short-term and long-term interest decay for computing users’ latest interest degrees. For an item that is a candidate for recommendation to a user, the user's interest degrees on all historically rated items are weighted by their similarities with the candidate item and then added up to predict the recommendation degree. Findings: Comprehensive experiments on the MovieLens and KuaiRec datasets for user movie recommendation were conducted to evaluate the performance of the proposed UICF framework. Experimental results show that the UICF outperformed three well-known recommendation algorithms, Item-Based Collaborative Filtering (IBCF), User-Based Collaborative Filtering (UBCF) and User-Popularity Composite Filtering (UPCF), in the root mean square error (RMSE), mean absolute error (MAE) and F1 metrics, especially yielding an average decrease of 11.9% in MAE. Originality/value: A UICF recommendation framework is proposed that combines a time-aware item similarity model and a time-wise user interest degree model. It overcomes the limitations of common rating items and utilizes temporal information in item ratings and user interests effectively, resulting in more accurate and personalized recommendations.
Article
Collaborative Filtering (CF) is achieving a plateau of high popularity. Still, recommendation success is challenged by the diversity of user preferences, structural sparsity of user-item ratings, and inherent subjectivity of rating scales. The increasing user base and item dimensionality of e-commerce and e-entertainment platforms creates opportunities, while further raising generalization and scalability needs. Moved by the need to answer these challenges, user-based and item-based clustering approaches for CF became pervasive. However, classic clustering approaches assess user (item) rating similarity across all items (users), neglecting the rich diversity of item and user profiles. Instead, as preferences are generally simultaneously correlated on subsets of users and items, biclustering approaches provide a natural alternative, being successfully applied to CF for nearly two decades and synergistically integrated with emerging deep learning CF stances. Notwithstanding, biclustering-based CF principles are dispersed, causing state-of-the-art approaches to show accentuated behavioral differences. This work offers a structured view on how biclustering aspects impact recommendation success, coverage, and efficiency. To this end, we introduce a taxonomy to categorize contributions in this field and comprehensively survey state-of-the-art biclustering approaches to CF, highlighting their limitations and potentialities.
Article
Multi-behavioral recommender systems have emerged as a solution to address data sparsity and cold-start issues by incorporating auxiliary behaviors alongside target behaviors. However, existing models struggle to accurately capture varying user preferences across different behaviors and fail to account for diverse item preferences within behaviors. Various user preference factors (such as price or quality) entangled in the behavior may lead to sub-optimization problems. Furthermore, these models overlook the personalized nature of user behavioral preferences by employing uniform transformation networks for all users and items. To tackle these challenges, we propose the Disentangled Cascaded Graph Convolutional Network (Disen-CGCN), a novel multi-behavior recommendation model. Disen-CGCN employs disentangled representation techniques to effectively separate factors within user and item representations, ensuring their independence. In addition, it incorporates a multi-behavioral meta-network, enabling personalized feature transformation across user and item behaviors. Furthermore, an attention mechanism captures user preferences for different item factors within each behavior. By leveraging attention weights, we aggregate user and item embeddings separately for each behavior, computing preference scores that predict overall user preferences for items. Our evaluation of benchmark datasets demonstrates the superiority of Disen-CGCN over state-of-the-art models, showcasing an average performance improvement of 7.07% and 9.00% on respective datasets. These results highlight Disen-CGCN’s ability to effectively leverage multi-behavioral data, leading to more accurate recommendations.
Article
blem through a collaborative filtering approach. PHOAKS works by automatically recognizing, tallying, and redistributing recommendations of Web resources mined from Usenet news messages. A collaborative filtering system that recognizes and reuses recommendations. PHOAKS: 60 March 1997/Vol. 40, No. 3 COMMUNICATIONS OF THE ACM the same types of benefits. In the case of ratings-based systems, for example, everyone rates objects of interest. Yet there is evidence that people naturally prefer to play distinct producer/consumer roles in the information ecology [2]; in particular, only a minority of people expend the effort of judging information and volunteering their opinions to others. Independently, we have observed such role specialization in Netnews; authors volunteer long lists of recommended Web resources at a stable, but low, rate. PHOAKS assumes the roles of recommendation provider and recommendati
Article
Recommender systems assist and augment a natural social process. In a typical recommender system, people provide recommendations as inputs, which the system then aggregates and directs to appropriate recipients. In some cases, the primary transformation is in the aggregation; in others, the system's value lies in its ability to make good matches between recommenders and those seeking recommendations. This special section includes descriptions of five recommender systems. A sixth article analyzes incentives for provision of recommendations. Recommender systems introduce two interesting incentive problems. First, once one has established a profile of interests, it is easy to free ride by consuming evaluations provided by others. Second, if anyone can provide recommendations, content owners may generate mountains of positive recommendations for their own materials and negative recommendations for their competitors. Recommender systems also raise concerns about personal privacy.
Article
A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term by document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100 item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising.
Conference Paper
When making a choice in the absence of decisive first-hand knowledge, choosing as other like-minded, similarly-situated people have successfully chosen in the past is a good strategy - in effect, using other people as filters and guides: filters to strain out potentially bad choices and guides to point out potentially good choices. Current human-computer interfaces largely ignore the power of the social strategy. For most choices within an interface, new users are left to fend for themselves and if necessary, to pursue help outside of the interface. We present a general history-of-use method that automates a social method for informing choice and report on how it fares in the context of a fielded test case: the selection of videos from a large set. The positive results show that communal history-of-use data can serve as a powerful resource for use in interfaces.
Article
predicated on the belief that information filtering can be more effective when humans are involved in the filtering process. Tapestry was designed to support both content-based filtering and collaborative filtering, which entails people collaborating to help each other perform filtering by recording their reactions to documents they read. The reactions are called annotations; they can be accessed by other people’s filters. Tapestry is intended to handle any incoming stream of electronic documents and serves both as a mail filter and repository; its components are the indexer, document store, annotation store, filterer, little box, remailer, appraiser and reader/browser. Tapestry’s client/server architecture, its various components, and the Tapestry query language are described.