Item-Based Collaborative Filtering Recommendation
Algorithms
Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl
GroupLens Research Group/Army HPC Research Center
Department of Computer Science and Engineering
University of Minnesota, Minneapolis, MN 55455
ABSTRACT
1. INTRODUCTION
Copyright is held by the author/owner.
WWW10, May 1-5, 2001, Hong Kong.
ACM 1-58113-348-0/01/0005.
1.1 Related Work
1.2 Contributions
1.3 Organization
2. COLLABORATIVE FILTERING BASED RECOMMENDER SYSTEMS
2.0.1 Overview of the Collaborative Filtering Process
2.0.2 Challenges of User-based Collaborative Filtering Algorithms
[Figure 1: The collaborative filtering process. The input is an m x n ratings table (users u_1, u_2, ..., u_a, ..., u_m; items i_1, i_2, ..., i_j, ..., i_n). For the active user u_a, the CF algorithm produces as output either a prediction P_{a,j} (prediction on item j for the active user) or a top-N list {T_{i1}, T_{i2}, ..., T_{iN}} of recommended items for the active user.]
3. ITEM-BASED COLLABORATIVE FILTERING ALGORITHM
3.1 Item Similarity Computation
3.1.1 Cosine-based Similarity
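Cosine-based similarity treats two items as vectors in the space of users and measures the cosine of the angle between them. A minimal sketch of this computation in plain Python (the function name and list layout are illustrative, not from the paper):

```python
import math

def cosine_similarity(ratings_i, ratings_j):
    """Cosine of the angle between two item rating vectors.

    ratings_i, ratings_j: equal-length lists where position u holds
    user u's rating of the item (0 for an unrated cell).
    """
    dot = sum(a * b for a, b in zip(ratings_i, ratings_j))
    norm_i = math.sqrt(sum(a * a for a in ratings_i))
    norm_j = math.sqrt(sum(b * b for b in ratings_j))
    if norm_i == 0 or norm_j == 0:
        return 0.0  # an all-zero (unrated) item has no defined angle
    return dot / (norm_i * norm_j)
```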
3.1.2 Correlation-based Similarity
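Correlation-based similarity is the Pearson correlation between the two items' ratings, restricted to the users who co-rated both items. A sketch under the assumption that each item's ratings are stored as a user-to-rating dict (names are illustrative):

```python
import math

def pearson_similarity(ratings_i, ratings_j):
    """Pearson correlation between items i and j over co-rated users only.

    ratings_i, ratings_j: dicts mapping user id -> rating.
    """
    common = set(ratings_i) & set(ratings_j)  # users who rated both items
    if len(common) < 2:
        return 0.0
    mean_i = sum(ratings_i[u] for u in common) / len(common)
    mean_j = sum(ratings_j[u] for u in common) / len(common)
    num = sum((ratings_i[u] - mean_i) * (ratings_j[u] - mean_j) for u in common)
    den_i = math.sqrt(sum((ratings_i[u] - mean_i) ** 2 for u in common))
    den_j = math.sqrt(sum((ratings_j[u] - mean_j) ** 2 for u in common))
    if den_i == 0 or den_j == 0:
        return 0.0
    return num / (den_i * den_j)
```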
3.1.3 Adjusted Cosine Similarity
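The adjusted cosine variant offsets each co-rated rating by the *user's* mean rather than the item's, compensating for users who rate on systematically different scales. A minimal sketch (function name and dict layout are my own):

```python
import math

def adjusted_cosine(ratings_i, ratings_j, user_means):
    """Adjusted cosine similarity over the users who co-rated i and j.

    ratings_i, ratings_j: dicts user id -> rating.
    user_means: dict user id -> that user's mean rating over all items.
    """
    common = set(ratings_i) & set(ratings_j)
    if not common:
        return 0.0
    # Center every rating on the rating user's own average.
    num = sum((ratings_i[u] - user_means[u]) * (ratings_j[u] - user_means[u])
              for u in common)
    den_i = math.sqrt(sum((ratings_i[u] - user_means[u]) ** 2 for u in common))
    den_j = math.sqrt(sum((ratings_j[u] - user_means[u]) ** 2 for u in common))
    if den_i == 0 or den_j == 0:
        return 0.0
    return num / (den_i * den_j)
```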
[Figure 2: Item-item similarity is computed by looking into co-rated items only. For items i and j the similarity s_{i,j} is computed over the rows (users) that rated both. Note: each of these co-rated pairs is obtained from a different user; in this example they come from users 1, u, and m-1.]
3.2 Prediction Computation
3.2.1 Weighted Sum
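In the weighted-sum scheme, the prediction for a user on target item i is the user's ratings on the items most similar to i, weighted by those similarities and normalized by the sum of their absolute values so the result stays on the rating scale. A minimal sketch (function name and dict layout are mine):

```python
def predict_weighted_sum(user_ratings, similarities):
    """Weighted-sum prediction for one (user, target item) pair.

    user_ratings: dict item id -> the user's rating, over items similar
    to the target; similarities: dict item id -> s(target, item).
    Computes P = sum(s * R) / sum(|s|).
    """
    rated = [j for j in user_ratings if j in similarities]
    num = sum(similarities[j] * user_ratings[j] for j in rated)
    den = sum(abs(similarities[j]) for j in rated)
    return num / den if den else 0.0
```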
3.2.2 Regression
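The regression variant keeps the same weighted-sum form but, instead of the neighbor's raw rating, uses a rating approximated by a linear model fit between the target item's and the neighbor item's co-rated vectors. A least-squares helper sketch (the helper name is mine; the exact model details follow the paper's description only loosely):

```python
def fit_line(xs, ys):
    """Least-squares fit ys ~ alpha * xs + beta for two co-rated
    rating vectors; returns (alpha, beta)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, my  # degenerate: fall back to the mean rating
    alpha = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return alpha, my - alpha * mx
```

The fitted pair (alpha, beta) maps a rating R on the similar item to alpha * R + beta, and that regressed value enters the weighted sum in place of R.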
3.3 Performance Implications
[Figure 3: Item-based prediction. The items rated by user u (columns 1, 2, ..., i-1, i+1, ..., n) are ranked by similarity to the i-th item (similarities s_{i,1}, s_{i,3}, ..., s_{i,i-1}, ..., s_{i,m}); the prediction is then computed by either the weighted sum or the regression-based method.]
4. EXPERIMENTAL EVALUATION
4.1 Data set
4.2 Evaluation Metrics
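The quality metric reported throughout the evaluation is MAE, the mean absolute deviation between predicted and actual ratings over the test set. A one-line sketch (function name is mine):

```python
def mean_absolute_error(predictions, actuals):
    """MAE = average of |p - q| over the N prediction/rating pairs."""
    assert predictions and len(predictions) == len(actuals)
    return sum(abs(p - q) for p, q in zip(predictions, actuals)) / len(predictions)
```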
4.2.1 Experimental Procedure
Experimental steps.
Benchmark user-based system.
Experimental platform.
4.3 Experimental Results
4.3.1 Effect of Similarity Algorithms
[Figure: Relative performance of different similarity measures. Bar chart of MAE (range roughly 0.66-0.86) for adjusted cosine, pure cosine, and correlation.]
4.3.2 Sensitivity of Training/Test Ratio
4.3.3 Experiments with neighborhood size
4.3.4 Quality Experiments
[Figure: Sensitivity of the parameter x. MAE vs. train/test ratio x (0.2-0.9) for itm-itm and itm-reg.]
[Figure: Sensitivity of the neighborhood size. MAE vs. number of neighbors (10-200) for itm-itm and itm-reg.]
4.3.5 Performance Results
4.4 Sensitivity of the Model Size
[Figure: Sensitivity of the model size (at selected train/test ratios x = 0.3, 0.5, 0.8). MAE vs. model size (25-200 and full item-item).]
[Figure: Item-item vs. user-user at selected neighborhood sizes (at x = 0.8). MAE vs. number of neighbors (10-200) for user-user, item-item, item-item-regression, and the non-personalized baseline.]
[Figure: Item-item vs. user-user at selected density levels (at no. of neighbors = 30). MAE vs. train/test ratio x (0.2-0.9) for user-user, item-item, item-item-regression, and the non-personalized baseline.]
4.4.1 Impact of the model size on run-time and throughput
4.5 Discussion
5. CONCLUSION
6. ACKNOWLEDGMENTS
[Figure: Recommendation time vs. model size (at selected train/test ratios x = 0.3, 0.5, 0.8). Recommendation time in seconds vs. model size (25-200 and full item-item).]
[Figure: Throughput vs. model size (at selected train/test ratios x = 0.3, 0.5, 0.8). Throughput in recommendations/sec vs. model size (25-200 and full item-item).]
7. REFERENCES