
Implementation of Knowledge based Collaborative Filtering and Machine Learning for E-Commerce Recommendation System

Authors:
  • SRM Institute of Science and Technology Delhi NCR Campus

Journal of Physics: Conference Series
PAPER • OPEN ACCESS
Implementation of Knowledge based Collaborative
Filtering and Machine Learning for E-Commerce
Recommendation System
To cite this article: Mahesh Kumar Singh et al 2021 J. Phys.: Conf. Ser. 2007 012032
Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution
of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Published under licence by IOP Publishing Ltd
ICCEMME 2021
Journal of Physics: Conference Series 2007 (2021) 012032
IOP Publishing
doi:10.1088/1742-6596/2007/1/012032
Implementation of Knowledge based Collaborative Filtering and
Machine Learning for E-Commerce Recommendation System
Mahesh Kumar Singh1, Om Prakash Rishi2, Akhilesh Kumar Singh3, Pushpendra
Singh4, Pushpa Choudhary5
1,3,5 Department of Information Technology, G L Bajaj Institute of Technology and
Management Greater Noida, U P, India.
2Department of Computer Science and Informatics. University of Kota, Kota, Rajasthan,
India.
4Department of Information Technology, Indraprastha Engineering College, Ghaziabad,
UP, India
e-mail: 1maheshkrsg@gmail.com, 2omprakashrishi@yahoo.com,
3akhileshaks@gmail.com, 4pushpendra.singh1@gmail.com, 5pushpak2728@gmail.com
Abstract. This is the era of the I-way. The development of high-speed computing and large storage devices has changed the way people work. It has affected traditional business processes and shifted them towards online business. This shift creates problems such as information overload and irrelevant information, which confuse both customers and enterprises. Recommendation systems address these problems, and the design and development of efficient recommendation systems is a key area of recent research. Collaborative filtering (CF) and content-based filtering algorithms are widely used to implement such systems. Collaborative filtering uses user features, while content-based filtering uses item features. Most CF methods are rating or review based and process homogeneous information. In this paper we propose a knowledge-based collaborative filtering algorithm for large data sets that uses the various activities performed by users while interacting with items on an E-commerce web site, such as click, select and purchase. The performance of the system is compared with baseline models on a real-world Amazon E-commerce dataset using precision, recall and NDCG, for various combinations of the activities performed by users on items.
Keywords: Web Recommendation System (WRS), Web Artificial Intelligence, Web Information Technology, Collaborative Filtering (CF), Knowledge base, Knowledge graph, Web Usage Mining, Web Engineering Applications.
1. INTRODUCTION
The Internet has transformed the traditional ways of doing business, and almost every company wants its own web site to support and conduct its business. Since the Internet provides a very large marketplace, every customer is faced with multiple choices. Suppose a customer wants to read a book but has no specific subject area in mind. There are many books of the same kind, so the customer spends a lot of time searching for a relevant one. If a site or app suggests books relevant to the customer based on what he/she has read previously, it saves the customer a great deal of time. This feature of a web site is known as a recommendation system.
Traditionally, a person bought a product only when it was suggested by friends or relatives. This was the usual way of purchasing when there was any doubt about the product. In the era of the I-way, that circle has expanded to include online sites that use some sort of recommendation engine [1]. A recommendation engine (figure 1) uses various algorithms to filter and recommend the most relevant products to customers on the basis of their past behavior, that is, it recommends products that the user is likely to buy.
Some popular websites that use recommendation systems are shown in table 1.
Figure 1 Architecture of Recommendation Engine
Almost every collaborative filtering method uses unstructured data such as ratings, reviews or images to profile users for personalized recommendation. In this paper we extend the power of collaborative filtering (CF) using large-scale structured heterogeneous user behavior data. The main building block of the proposed CF combines traditional CF with a knowledge base. The behavior of the users is represented by a directed graph called a knowledge graph.
Table 1 Some popular sites that used recommendation system

| Sr. no | Site | User | Item | Description |
|---|---|---|---|---|
| 1 | LinkedIn | Member | Members or jobs | Members are interested in other members or jobs. |
| 2 | Facebook | Member | Members | Members are interested in other members. |
| 3 | Amazon | Member | Products or books | Members are interested in products. |
| 4 | Netflix | Member | Movies or story series | Members are interested in watching movies or story series. |
| 5 | Flipkart | Member | Products | Members are interested in products. |

1.1 Knowledge graph
The relation between customer and product is represented by a directed graph called a knowledge graph [2,3]. It is a directed graph of triplets (subject, predicate, object), called SPO. The subject and object are entities, and the predicate shows the relationship between these entities [4]. Entities are denoted by nodes and relationships by edges.
Consider the statement “A customer c1 buys a product p1 of category cat1 and brand b1 that falls in price range r1, and customer c1 selects product p2 of brand b2 and category cat1 in price range r2.” The SPO triplets of this statement are listed in Table 2.
Table 2 Subject Predicate Object list of the given statement
Combining all SPO triplets forms a directed multigraph of the given statement, as shown in figure 2.
Figure 2 Sample Knowledge Graph for the Statement
The other activities of the customer can be written as “A customer C1 also_view a product p3 of brand
b3 of category cat1 fall_in price range r1. Customer C1 also_buy product p4 of brand b2 of category cat1
fall_in price_range r2.”
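As a minimal sketch, the customer activities described above can be held as (subject, predicate, object) triples and grouped into an adjacency map that plays the role of the directed multigraph. The predicate names `has_category` and `has_brand` and the helper `build_graph` are illustrative assumptions, not names used by the authors.

```python
from collections import defaultdict

# SPO triples transcribed from the example statements.
# "has_category" and "has_brand" are assumed predicate names.
triples = [
    ("c1", "buy", "p1"),
    ("p1", "has_category", "cat1"),
    ("p1", "has_brand", "b1"),
    ("p1", "fall_in", "r1"),
    ("c1", "select", "p2"),
    ("p2", "has_category", "cat1"),
    ("p2", "has_brand", "b2"),
    ("p2", "fall_in", "r2"),
    ("c1", "also_view", "p3"),
    ("c1", "also_buy", "p4"),
]


def build_graph(spo_triples):
    """Map each subject to its outgoing (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in spo_triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = build_graph(triples)
print(graph["c1"])
```

Keeping the edges as a list per subject allows several differently labelled edges between the same pair of nodes, which is what makes the structure a multigraph.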
1.1.1 Construction of Knowledge Base. During the construction of a knowledge base [5], it is necessary to consider parameters such as completeness, accuracy, and quality of data, which determine the usefulness of the knowledge base. There are four major groups of knowledge base construction methods: the curated method, the collaborative method, the automated semi-structured method and the automated unstructured method.
1.1.1.1. Curated method. In this method triplets are created manually by a closed group of experts. The accuracy of a curated knowledge base is very high, but the method is not scalable because it depends on human experts.
1.1.1.2. Collaborative method. In this method triplets are created manually by an open group of volunteers. This method is used in Wikipedia and Freebase and scales better, but it also has limitations; for example, the growth of Wikipedia has been slowing down.
1.1.1.3. Automated semi-structured method. In this method triplets are extracted automatically from semi-structured text using rules. This method is applied to Wikipedia infoboxes and yields large and highly accurate knowledge graphs such as YAGO [6] and DBpedia [7], but semi-structured text covers only a very small fraction of the information stored on the web.
1.1.1.4. Automated unstructured method. In this method triplets are created automatically from unstructured text using machine learning and natural language processing. This method tries to read the web and extract facts from the natural language text of web pages, as in NELL and Knowledge Vault.
A knowledge graph is similar to a knowledge base and is classified as either schema based or schema free. Some popular schema-based knowledge bases are listed in table 3.
The schema-based approach uses predefined, globally uniquely identified entities and relations in a fixed vocabulary, while in the schema-free approach entities and relations are identified using open information extraction techniques.
Table 3 Size in million (M) of some popular Schema-Based Knowledge Bases

| Name of Knowledge Graph | Entities | Relations | Facts |
|---|---|---|---|
| Freebase [8] | 45 M | 360000 | 735 M |
| Wikidata [9] | 19 M | 1735 | 68 M |
| DBpedia [7] | 5.2 M | 1467 | 638 M |
| YAGO2 [6] | 10.8 M | 150 | 547 M |
| Google Knowledge Graph [3] | 670 M | 360000 | 200000 M |
1.1.2. Learning of knowledge graph. Let $E = \{e_1, e_2, e_3, \ldots, e_n\}$ be the set of all entities (that is, all the subject and object entities) and $R = \{r_1, r_2, r_3, \ldots, r_m\}$ be the set of all relations in the knowledge graph. Each possible triplet $x_{i,k,j} = (e_i, r_k, e_j)$ over this set of entities and relations is modelled as a binary random variable $y_{i,k,j} \in \{0, 1\}$. All possible triplets are grouped in a tensor $Y \in \{0,1\}^{N_e \times N_r \times N_e}$, where $N_e$ and $N_r$ are the numbers of entities and relations, which contains

$$ y_{i,k,j} = \begin{cases} 1 & \text{if the triplet } (e_i, r_k, e_j) \text{ exists} \\ 0 & \text{otherwise} \end{cases} \tag{1} $$
Let $e_h$ and $e_t$ be the head and tail entities in the knowledge graph and $r_k$ the edge between them. The tail can then be related to the head by a translation, $e_t = \mathrm{trans}(e_h, r_k) = e_h + r_k$. Applying this relation to all nodes makes the relations among the nodes easy to calculate.
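The relation between head, edge and tail can be illustrated with a small numerical sketch. The embedding size, the random vectors and the helper `translation_distance` are assumptions made for illustration; the paper does not state how the embeddings are trained or scored.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4  # embedding size, chosen arbitrarily for this sketch

# Hypothetical embeddings for one head entity and one relation.
e_h = rng.normal(size=dim)
r_k = rng.normal(size=dim)

# The predicted tail is the head shifted by the relation vector.
e_t_pred = e_h + r_k


def translation_distance(head, relation, tail):
    """Euclidean distance between head + relation and a candidate tail."""
    return float(np.linalg.norm(head + relation - tail))


# Two hypothetical candidate tails: one close to the prediction, one far.
candidates = {"p1": e_t_pred + 0.01, "p2": e_t_pred + 1.0}
best = min(candidates, key=lambda name: translation_distance(e_h, r_k, candidates[name]))
print(best)
```

A smaller distance means the candidate is a more plausible tail for the given head and relation, so candidates can be ranked by this value.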
Figure 3 Binary Relation Presentation of Data
2. RELATED WORK
Recommendation systems try to identify a user's interest in a specific domain of content based on the user's previous experience. When a user interacts with an E-commerce site, he/she provides a set of implicit or explicit information about his/her tastes, such as clicks, ratings and comments. Recommendation systems fall into two main categories: personalized [10,11] and non-personalized. Personalized systems use the history of the customer's navigation and behavior, as in content-based filtering [12], collaborative filtering [13] and PageRank [14] in social network analysis, while non-personalized systems do not require any historical data and use characteristics of the products, as in popularity-based recommendation.
2.1. Content based filtering system
This system recommends products on the basis of the user's past preferences. It stores the information related to each user in a vector known as the profile vector, and the information related to each product in another vector, the product vector. The algorithm computes the cosine of the angle between the profile vector and the product vector. It uses traditional classification and clustering techniques such as Support Vector Machines [13] or Nearest Neighbors algorithms [12]. There are two types of users, implicit and explicit. Users whose information is updated automatically by the system are called implicit users, while users who give feedback to the system within a given range are called explicit users. According to Aggarwal [1], the approach has some drawbacks: the accuracy of the system depends heavily on the specific application used to extract item features, on over-specialization and on training size. The three major limitations of content-based filtering are over-specialization, cold start and limited content.
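The cosine step described above can be sketched as follows. The three-dimensional feature vectors and product names are made-up illustrative data; a real system would derive them from item attributes.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


# Hypothetical profile vector for one user and product vectors for three items.
profile = [1.0, 0.0, 1.0]
products = {
    "p1": [1.0, 0.0, 1.0],
    "p2": [0.0, 1.0, 0.0],
    "p3": [1.0, 1.0, 0.0],
}

# Rank products by similarity to the profile, most similar first.
ranked = sorted(products, key=lambda p: cosine_similarity(profile, products[p]), reverse=True)
print(ranked)
```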
2.2. Collaborative filtering algorithm
Collaborative filtering uses user behavior to recommend items. It is the most commonly used approach in industry since it does not depend on any additional information. There are two types of collaborative filtering techniques: memory-based and model-based.
Memory-based collaborative filtering [15] uses item-based and user-based approaches, in which recommendations are generated from the preferences of the nearest neighbors [16]. Model-based collaborative filtering uses matrix factorization approaches such as SVD and tensor factorization [17], and is widely used to predict the products a customer is most likely to purchase. Graph-based or social-network-based recommendation systems [18] use information available from social networks, such as user preferences and the influence of friends, to overcome the cold start and data sparsity problems of recommendation systems.
3. DESIGN ARCHITECTURE OF PROPOSED MODEL
The architecture of the proposed model (see fig 4) contains four basic components. The user supplies data to the system and receives a list of recommended items. Event data pre-processing transforms the data as required by the system. The ranking algorithm generates the rank score of each item based on the user's preferences. The matching algorithm measures the similarity among items and among users using item-item and user-user similarity methods respectively.
Figure 4 Architecture of Proposed Model
3.1. Input Data
Input data sets are collected from the Amazon [16] E-commerce web site. They contain the various activities performed by users while navigating the items. Let m be the number of users and n the number of items; then there are m x n possible user-item activities. Three activities (view, add_to_cart and transaction) are used in the system. The user-item activities are represented by an m x n preference matrix (see eq. (2)).
=
 ……… 

..……………….
..
...
 …………. 
=…(2)
3.2. Data pre-processing
The input datasets contain many attributes in different domains. Every input data set consists of individual data objects, and all the data sets share common properties such as the type of data object, size, dimensionality, sparsity and abstraction. Several methods are used to process the data before it is used by the proposed recommendation system.
3.2.1. Feature selection. The input from users and items contains many attributes. Feature selection is a dimension reduction method that is primarily used to remove redundant and irrelevant attributes from the datasets. The system considers the c_id, ip_address and session attributes from the customer dataset, P_id and P_cat from the product dataset, and P_view, P_select and P_buy from the relation datasets. All other attributes are removed.
3.2.2 Data binarization. Binarization assigns binary values to the attributes. Here the values of the relations are used to compute the preference score of a product, so the attributes P_view, P_select and P_buy are binarized to 0 for no and 1 for yes. The entries of the preference matrix are therefore

$$ C_{ij} = \begin{cases} 1 & \text{if customer } i \text{ viewed, selected or bought product } j \\ 0 & \text{otherwise} \end{cases} \tag{2} $$
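A minimal sketch of the binarization step, assuming the events arrive as (customer index, product index, event name) tuples. The event log, the matrix sizes and the helper `binarize` are hypothetical; the paper does not describe its storage format.

```python
import numpy as np

# Hypothetical event log: (customer index, product index, event name).
events = [
    (0, 0, "view"),
    (0, 0, "buy"),
    (0, 1, "select"),
    (1, 1, "view"),
    (1, 1, "view"),  # repeated events still binarize to a single 1
]
m, n = 2, 3  # number of customers and products in this toy example


def binarize(event_log, relation, num_customers, num_products):
    """Return an m x n 0/1 matrix for one relation (view, select or buy)."""
    matrix = np.zeros((num_customers, num_products), dtype=int)
    for i, j, name in event_log:
        if name == relation:
            matrix[i, j] = 1
    return matrix


C_view = binarize(events, "view", m, n)
C_select = binarize(events, "select", m, n)
C_buy = binarize(events, "buy", m, n)
print(C_view)
```

Keeping one binary matrix per relation preserves the distinction between view, select and buy, which the ranking step needs in order to weight them differently.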
3.3. Computation of preference or ranking score using knowledge recommendation system (KRS)
The preference or ranking score of a product is computed from the number of customers that participated in the events in a specific period of time.
3.3.1. Event Database. This is the collection of customer events performed on different categories (P_cat) of products (see Table 4). Let Cbij indicate that customer i bought product j, Csij that customer i only selected product j, and Cvij that customer i viewed product j. The preference order between the events is product_buy > product_select > product_view. The customer's preference for a product can be represented in matrix form as Cij, meaning that customer i prefers product j in terms of buy, select or view. If there are m customers and n products, then i = 1, 2, 3, …, m and j = 1, 2, 3, …, n. The product preference matrix (Cij) can be written as the m x n matrix (2).
Table 4 Events Data base

| Name of Attribute | Data Type | Description |
|---|---|---|
| P_view | Number | Product view or click |
| P_select | Number | Product selected or add_to_cart |
| P_buy | Number | Product bought or transaction |
| P_also_buy | Number | Buy with some other products |
| Price | Real Number | Price of the product |
| P_cat | Text | Category of the product |
| P_price_range | Number | Range of price |
3.3.2. Customer data. The user database (see Table 6) contains specific data such as c_id, url, location of the customer, navigational details and timestamp. The web server logs are represented as the tuple {u_id, ip_address, url, timestamp, location}, which records the user accessing the web, the pages requested, the corresponding browsing time and the location of the user. Of these, the c_id of the customer is the most useful for recommendation. The u_id and the urls visited by the customer can be related through the association matrix.
$$ \begin{bmatrix} v_{11} & v_{12} & \cdots & v_{1n} \\ v_{21} & v_{22} & \cdots & v_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ v_{m1} & v_{m2} & \cdots & v_{mn} \end{bmatrix} \tag{3} $$

Here vij is the browsing information indicating that, at a particular time, user i visited the first j pages (products).
The URL gives information about the product, such as product_id and product_category (see Table 5). The user's information together with the product's information is used for the recommendation.

Table 5 Product Data base

| Name of Attribute | Data Type | Description |
|---|---|---|
| P_ID | Varchar | Unique identification number of the product |
| P_cat | Text | Category of product |
| P_brand | Text | Name of the brand of a product |
| P_name | Text | Name of the product |
| Price | Real number | Price of product |
| Unique_page | Number | Number of unique pages visited by the user per product |
| P_session | Time | Time spent by user for a particular product ID |
Table 6 User Data base

| Name of Attribute | Data Type | Description |
|---|---|---|
| C_ID | Number | Unique identification number of any user |
| Age | Number | Age group of the customer |
| Gender | Text | Gender group |
3.3.3 Method. The dataset consists of product details, customer details, page details, actions, and the categories and brands of products. The relation between user and item is denoted by the triple (Ci, Rk, Pj), which shows that customer i is related to product j through relation k. The first aim is to find the ordered pairs (Ci, Pj), that is, the lists of products bought, selected and viewed by each customer. We then find the ordered pairs (Ck, Pj), that is, the list of customers related to product Pj.
The customer's preference rank can be calculated by the formula

$$ C_{ij} = X\,C^{v}_{ij} + Y\,C^{s}_{ij} + Z\,C^{b}_{ij} \tag{4} $$
where X, Y and Z are weight-adjusting coefficients corresponding to the three different shopping relations. The coefficients are not equal, since the relations differ in preference, and Z > Y > X. The values X = 0.25, Y = 0.5 and Z = 1 are suitable for good results. The formula gives the preference score of a product; a higher value means a higher preference.
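The weighting can be sketched numerically. This sketch assumes the score is a plain weighted sum of the binary view, select and buy indicators, with X applied to view, Y to select and Z to buy (following the stated preference order buy > select > view). The exact form of the authors' formula may differ, and the small matrices are made-up data.

```python
import numpy as np

# Weights reported in the text; mapping them to view/select/buy is an assumption.
X, Y, Z = 0.25, 0.5, 1.0

# Hypothetical binary relation matrices for 2 customers and 3 products.
C_view = np.array([[1, 1, 0], [1, 0, 1]])
C_select = np.array([[0, 1, 0], [1, 0, 0]])
C_buy = np.array([[0, 1, 0], [0, 0, 0]])

# Weighted preference score per (customer, product) pair.
score = X * C_view + Y * C_select + Z * C_buy

# Products for customer 0, ordered from highest to lowest score.
ranking_c0 = np.argsort(-score[0]).tolist()
print(score)
print(ranking_c0)
```

With these weights a product that was viewed, selected and bought scores 1.75, while a product that was only viewed scores 0.25.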
3.4. Matching algorithm for recommendation (Association Rule Mining)
Rule mining is used to create knowledgeable groups of data of similar category, so that customer profiles are associated with each product. This is valuable for recommendation, both for the customer and for the enterprise. Since there are three categories of product events, association rules can be generated from three different kinds of transaction sets: the buy transaction set, the selected_but_not_buy transaction set and the also_view transaction set. For each transaction from the web logs there are three steps in generating the association rules. The first sets the minimum support and minimum confidence, the second replaces each product in the transaction set with its corresponding product category, and the third generates association rules for each transaction set using the Apriori algorithm. The result can be given by a matrix P = Pkl called the product matrix, where k = 1, 2, …, n indexes the products and l = 1, 2, 3 or 4 indexes the transactional steps. Pkl represents the association degree among the product categories in the different transactional steps.
=1.0    
0.50
0.25
0.00    
   
 ……(5)
The relation among these three parts of information is denoted by the triple (Ci, Rk, Pj), which shows that customer i is related to product j through relation k. The first aim is to find the ordered pairs (Ci, Pj), that is, the lists of products bought, selected and viewed by each customer. We then find the ordered pairs (Ck, Pj), that is, the list of customers related to product Pj.
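The support and confidence thresholds used in the rule mining steps can be illustrated with a simplified sketch. It only counts single categories and category pairs, which corresponds to the first two levels of an Apriori-style search, not the full level-wise algorithm. The transactions, thresholds and the helper `pair_rules` are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical buy transactions, with products already replaced by categories.
transactions = [
    {"phone", "case"},
    {"phone", "case", "charger"},
    {"phone", "charger"},
    {"book"},
]


def pair_rules(transaction_sets, min_support, min_confidence):
    """Return rules (lhs, rhs, support, confidence) over category pairs."""
    total = len(transaction_sets)
    item_counts = Counter(item for t in transaction_sets for item in t)
    pair_counts = Counter(
        pair for t in transaction_sets for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / total
        if support < min_support:
            continue  # prune infrequent pairs
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


rules = pair_rules(transactions, min_support=0.5, min_confidence=0.9)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

The same function could be run separately on the buy, selected_but_not_buy and also_view transaction sets to obtain one rule set per kind of transaction.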
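Table 6 User Data base (continued)

| Name of Attribute | Data Type | Description |
|---|---|---|
| URL | Text | Uniform Resource Locator for each web page |
| IP_address | Text | Machine_ID |
| Session | Time | Time spent by user |
| Location | Number | Zip code of user |
| Page | Number | Number of unique pages visited by users for different products |
| S_ID | Text | Session ID |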
There are two different matrices, the customer preference matrix and the product association matrix. Before recommending, the matching score of each product with respect to each customer is computed; this score indicates how close the product is to the customer.
3.4.1. Customer-Customer collaborative filtering. This algorithm finds the similarity score between customers. Based on this similarity score, it picks the most similar customers and recommends products that these similar customers have liked, selected or bought previously.
A user-based collaborative filtering technique works on a set of customers who have the same relation to a product as customer u, aggregating these relations using formula (6).
, =
(,)., …….. (6)
where Nu is the set of K customers whose interests are similar to those of the target customer u, Sim(u,k) is the similarity between customer u and neighbor customer k, and rk,j is the rating given by customer k to product j.
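A minimal sketch of user-based prediction, assuming the score is the similarity-weighted sum over the K most similar customers divided by K, and assuming cosine similarity for Sim(u,k) (the paper does not name the similarity function). The rating matrix is made-up data.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def predict_user_based(R, u, j, K):
    """Score product j for customer u from the K most similar customers."""
    sims = [(cosine(R[u], R[k]), k) for k in range(R.shape[0]) if k != u]
    neighbours = sorted(sims, reverse=True)[:K]
    return sum(sim * R[k, j] for sim, k in neighbours) / K


# Hypothetical customer x product matrix (rows: customers, columns: products).
R = np.array([
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
    [0.0, 0.0, 1.0],
])

score = predict_user_based(R, u=0, j=2, K=1)
print(round(score, 4))
```

Here customer 1 is the nearest neighbour of customer 0, so the score for product 2 equals their cosine similarity multiplied by customer 1's value for that product.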
3.4.2. Product-Product collaborative filtering. Product-based collaborative filtering considers the similarity among products or services. It is assumed that similar products are related in a similar way by the same customer. Hence the products recommended to customer u are scored or ranked by aggregating the similarity between the candidate product and the products that customer u was related to in the past. The score can be computed by formula (7).
$$ r_{u,j} = \frac{1}{K} \sum_{k \in N_i} Sim(j,k) \cdot r_{u,k} \tag{7} $$
where Ni denotes the set of K products or items neighboring j, and Sim(j,k) is the similarity value between products j and k.
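The item-based score can be sketched in the same way, assuming cosine similarity between product columns (an assumption, since the similarity function is not specified) and a made-up customer x product matrix.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def predict_item_based(R, u, j, K):
    """Score product j for customer u from the K products most similar to j."""
    sims = [(cosine(R[:, j], R[:, k]), k) for k in range(R.shape[1]) if k != j]
    neighbours = sorted(sims, reverse=True)[:K]
    return sum(sim * R[u, k] for sim, k in neighbours) / K


# Hypothetical customer x product matrix (rows: customers, columns: products).
R = np.array([
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
    [0.0, 1.0, 0.0],
])

score = predict_item_based(R, u=2, j=0, K=2)
print(round(score, 4))
```

The only difference from the user-based variant is the axis: similarities are computed between columns (products) and the sum runs over products that the same customer has already interacted with.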
4. EXPERIMENTS AND RESULT ANALYSIS
Experiments are performed on Amazon E-commerce datasets [19], which comprise five sub-datasets: automotive, mobile phones, home appliances, movies and books. The behavior data, i.e. relations (events) such as view, select and buy, represent interactions collected over a period of 4.5 months. The original data contain 2,756,101 events, including 2,664,312 views, 69,332 selects and 22,457 buys, produced by 1,407,580 unique customers on 8,885 unique products.
The statistics of these datasets are summarized in the table. We consider Top-N recommendation measures including Precision, Recall, Hit Ratio and NDCG [20] for evaluating the model and the baselines. The first three measures evaluate the quality of the recommendation system, and the last measures the accuracy and ranking positions of the correct products in the output list. There are many relations among the entities of the knowledge graph, but this paper considers only three.
Buy relation: a customer c bought the product p.
Select relation: a customer c added the product p to the cart (add_to_cart).
View relation: a customer c visited the page of product p.
4.1. Computation of baselines
We use the following methods as baselines for performance comparison.
4.1.1. Bayesian Personalized Ranking (BPR) [21]. This is a popular method for Top-N recommendation that uses matrix factorization as the prediction component. It is based on triplets (u, i, j), where user u interacted with item i but not with item j. With $\hat{x}_{ui}$ denoting the predicted score of item i for user u, the relationship between items i and j with respect to user u is given by

$$ \hat{x}_{uij} = \hat{x}_{ui} - \hat{x}_{uj} > 0 \tag{8} $$
4.1.2. Hidden Factors and Topics (BPR_HFT) [22]. This method is used for textual reviews; we use HFT under the BPR pairwise ranking framework for a fair comparison.
4.1.3. Visual Bayesian Personalized Ranking (VBPR) [23]. This method is used for recommendation with images.
4.1.4. Deep Convolutional Neural Network (DCNN) [24]. This is a review-based deep recommendation method that jointly models the users and the products.
4.1.5. Joint Representation Learning (JRL) [19]. This model can leverage multi-modal information for Top-N recommendation.
4.2 Ranking accuracy
Ranking accuracy measures the utility of the recommended products or services with respect to the ranking given by the user. Discounted Cumulative Gain (DCG) is a very popular metric for evaluating ranking accuracy. The Normalized DCG [27] is defined as follows

$$ DCG = \frac{1}{m} \sum_{u=1}^{m} \sum_{j \in I_u} \frac{g_{uj}}{\log_2(v_j + 1)}, \qquad NDCG = \frac{DCG}{IDCG} \tag{9} $$
where m denotes the total number of users in the test dataset De, Iu is the set of products/services liked by user u, vj is the position of j in the recommended list, guj represents the utility gain of product j for user u, and IDCG is the ideal value, computed from the true relevance values using the same formula as DCG. Another way to evaluate the accuracy of the recommended list is to consider the tradeoff between the length of the list RL and the number of relevant products/services for the user. Items in RL that are relevant are true positives (tp), items in RL that are not relevant are false positives (fp), and relevant items missing from RL are false negatives (fn). Precision and Recall [10] are then computed as follows.

$$ Precision = \frac{tp}{tp + fp}, \qquad Recall = \frac{tp}{tp + fn} \tag{10} $$

The two can be summarized in a single metric, the F-measure [10], which is computed by the following formula.

$$ F\text{-}measure = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall} \tag{11} $$
4.3 Settings of parameters
All the parameters used in this filter are initialized in the range (0,1) and updated by Stochastic Gradient Descent (SGD). The learning rate is searched over {1.0, 0.1, 0.01, 0.001, 0.0001} and the embedding dimension over {10, 50, 100, 200, 300, 400, 500, 600}, which gives a final learning rate of 0.01 and a dimension of 200. For computing the baselines, 70% of the products of each user are used for training and the rest for testing. The system generates the top 10 recommendations for each user from the test dataset.
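The per-user 70/30 split can be sketched as below. The text does not say how the 70% is chosen, so the random shuffle, the fixed seed and the helper `split_per_user` are assumptions; the interaction lists are made-up data.

```python
import random


def split_per_user(interactions, train_fraction=0.7, seed=0):
    """Split each user's product list into train and test parts."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = {}, {}
    for user, items in interactions.items():
        shuffled = list(items)
        rng.shuffle(shuffled)  # random selection is an assumption
        cut = int(round(train_fraction * len(shuffled)))
        train[user] = shuffled[:cut]
        test[user] = shuffled[cut:]
    return train, test


# Hypothetical interaction lists for two customers.
interactions = {
    "c1": ["p%d" % k for k in range(10)],
    "c2": ["p1", "p2", "p3"],
}

train, test = split_per_user(interactions)
print(len(train["c1"]), len(test["c1"]))
```

Splitting per user rather than over the whole event log keeps every user represented in both the training and the test data.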
4.4 Performance Comparison
The performance of the proposed filter is shown in table 7 and table 8. Table 7 compares the performance with various baseline models, and table 8 shows the performance with the possible combinations of relations. From the experimental results (see fig 5) it is clear that both review-based and rating-based models enhance the performance of the recommendation system, but a heterogeneous information source-based model such as JRL performs better than the other baselines. The knowledge-based collaborative filtering (KCF) in turn performs better than JRL consistently over the five datasets and all evaluation measures, which verifies the proposed system.
Table 7 Performance table on top 10 recommendation between baselines and proposed system.
Table 8 Performance on top 10 recommendation when incorporating between varieties of relational structures used in knowledge graph. The final result is significantly better than all other models
Figure 5 Performance of top 10 recommendation when incorporating between varieties of relational structures used in Knowledge Graph
[Chart. Panels: Automotive, Mobile phones, Home appliances, Movies, Books. Measures per panel: Recall, Prec, NDCG. Vertical axis: 0 to 8. Series: Buy, Select, View, Buy+Select, Buy+View, View+Select, All (KCF).]
5. CONCLUSION
This paper discussed the concept of the knowledge graph, its learning and its creation. We created a knowledge-based collaborative filtering method that processes heterogeneous, unstructured information by converting it into structured form using a knowledge graph, which is a directed graph. The triplet relation between the user and the product played a vital role in the development of the proposed CF. The experiments used real-world datasets to measure the performance of various filters used in recommendation systems based on ratings, reviews and heterogeneous information. The results show that the proposed filter performs much better than the filters discussed, and we therefore conclude that the proposed CF is well suited to building recommendation systems.
REFERENCES
[1] Aggarwal C.C. 2016 “Recommender Systems”, Springer.
[2] Catherine R., K. Mazaitis, M. Eskenazi, W. Cohen: “Explainable Entity Based Recommendation
with Knowledge Graph”, RecSys 2017.
[3] K. Bollacker, C. Evans, P. Paritosh, T. Sturge, and J. Taylor, “Freebase: a collaboratively
created graph database for structuring human knowledge,” in Proceedings of the 2008 ACM
SIGMOD international conference on Management of data. ACM, 2008, pp. 1247–1250.
[4] L. De Raedt, Logical and relational learning. Springer, 2008.
[5] Mahesh Kumar Singh, Om Prakash Rishi, and Sumit Wadhwa, “Application of Page Ranking
Algorithm Based on Numbers of Link Visits in Web Recommendation System for Online
Business,” 4th International Conference on Green Computing and Engineering Technology
(ICGCET-2018), 17th–19th August 2018, Aalborg University, Niels Bohrs Vej 8,
Esbjerg, Denmark.
[6] A. Carlson, J. Betteridge, B. Kisiel, B. Settles, E. R. H. Jr, and T. M. Mitchell, “Toward an
Architecture for Never-Ending Language Learning,” in Proceedings of the Twenty-Fourth
AI Press, 2010, pp. 1306–1313
[7] S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. Ives, “DBpedia: A Nucleus
for a Web of Open Data,” in The Semantic Web. Springer Berlin Heidelberg, 2007, vol.
4825, pp. 722–735.
[8] Ruining He, Wang-Cheng Kang, and Julian McAuley, “Translation based Recommendation,” in
RecSys. ACM, 2017.
[9] F. M. Suchanek, G. Kasneci, and G. Weikum, “Yago: A Core of Semantic Knowledge,” in
Proceedings of the 16th International Conference on World Wide Web. New York, NY,
USA: ACM, 2007, pp. 697–706.
[10] Mahesh Kumar Singh, Om Prakash Rishi, and Sumit Wadhwa, “Knowledge Generation through
Web Mining Techniques for E-business Recommendation System,” 1st International
Conference on Communication and Computing (ICCCT-2018), 7th and 8th Sept 2018,
GLBITM Gr Noida, UP, India.
[11] M. K. Singh, O. P. Rishi, S. Awasthi, A. P. Srivastava and S. Wadhwa, "Classification and
Comparison of Web Recommendation Systems used in Online Business." 2020 International
Conference on Computation, Automation and Knowledge Management (ICCAKM), Dubai,
United Arab Emirates, 2020, pp. 471-480.
[12] P. Lops, M. De Gemmis, and G. Semeraro, “Content-based recommender systems: state of the art
and trends,” in Recommender Systems Handbook, Springer, 2011, pp. 73–105.
[13] Z. Xia, Y. Dong, and G. Xing, “Support vector machines for collaborative filtering,” in Proceedings
of the Forty-fourth Annual Southeast Regional Conference, ACM, 2006, pp. 169–174.
[14] L. Page, S. Brin, R. Motwani, and T. Winograd, “The PageRank citation ranking: Bringing
order to the Web,” Technical report, Stanford Digital Libraries SIDL-WP-1999-0120, 1999.
[15] M. Y. H. Al-Shamri, “Power coefficient as a similarity measure
for memory-based collaborative recommender systems,” Expert Systems with Applications,
vol. 41, no. 13, pp. 5680–5688, 2014.
[16] M. Y. H. Al-Shamri, “Power coefficient as a similarity measure
for memory-based collaborative recommender systems,” Expert Systems with Applications,
vol. 41, no. 13, pp. 5680–5688, 2014.
[17] K. Miyahara and M. Pazzani, “Collaborative filtering with the simple Bayesian classifier,”
Proceedings of the Eighteenth Pacific Rim International Conference on PRICAI 2000
Topics in Artificial Intelligence, 2000, pp. 679–689.
[18] Mahesh Kumar Singh and Om Prakash Rishi, “Event Driven Recommendation for E-commerce
using Knowledge based Collaborative Filtering Technique,” Scalable Computing: Practice
and Experience, ISSN 1895-1767, vol. 21, no. 3, 2020, pp. 369–378, DOI
10.12694/scpe.v21i3.1709.
[19] J. McAuley and J. Leskovec, “Hidden factors and hidden topics: understanding rating dimensions
with review text,” in Proceedings of the 7th ACM Conference on Recommender Systems,
Hong Kong, China, 12–16 October 2013, pp. 165–172.
[20] H. Seifoddini and M. Djassemi: “The production data-based similarity coefficient versus
jaccard’s similarity coefficient,” Computers & industrial engineering, vol. 21, no. 1, 1991,
pp. 263–266.
[21] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, “BPR: Bayesian personalized
ranking from implicit feedback,” in Proceedings of the Twenty-Fifth Conference on
Uncertainty in Artificial Intelligence, Montreal, QC, Canada, 18–21 June 2009, pp. 452–461.
[22] Y. Zhang, Q. Ai, X. Chen, and W. B. Croft, “Joint representation learning for top-n recommendation
with heterogeneous information sources,” in Proceedings of the 2017 ACM Conference on
Information and Knowledge Management, Singapore, 6–10 November 2017, pp. 1449–1458.
[23] He, R., & McAuley, J. (2016). VBPR: Visual Bayesian Personalized Ranking from Implicit
Feedback. Proceedings of the AAAI Conference on Artificial Intelligence, 30(1). Retrieved
from https://ojs.aaai.org/index.php/AAAI/article/view/9973
[24] Mahesh Kumar Singh, Om Prakash Rishi, Anukrati Sharma, and Zaved Akthatar, “Knowledge
Extraction Through Page Rank Using Web Mining Techniques for E-business: A Review,” a
chapter in “Maximizing Business Performance and Efficiency through Intelligent Systems”