Implementation of Knowledge based Collaborative Filtering and Machine Learning for E-Commerce Recommendation System

August 2021
Journal of Physics Conference Series 2007(1):012032

DOI:10.1088/1742-6596/2007/1/012032

License
CC BY 3.0

Authors:

Om Prakash Rishi

University of Kota

Pushpendra Singh

SRM Institute of Science and Technology Delhi NCR Campus

Journal of Physics: Conference Series

PAPER • OPEN ACCESS

To cite this article: Mahesh Kumar Singh et al 2021 J. Phys.: Conf. Ser. 2007 012032

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution

of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

Published under licence by IOP Publishing Ltd

ICCEMME 2021

Journal of Physics: Conference Series 2007 (2021) 012032

IOP Publishing

doi:10.1088/1742-6596/2007/1/012032

Implementation of Knowledge based Collaborative Filtering and

Machine Learning for E-Commerce Recommendation System

Mahesh Kumar Singh1, Om Prakash Rishi2, Akhilesh Kumar Singh3, Pushpendra

Singh4, Pushpa Choudhary5

1,3,5 Department of Information Technology, G L Bajaj Institute of Technology and

Management Greater Noida, U P, India.

2Department of Computer Science and Informatics. University of Kota, Kota, Rajasthan,

India.

4Department of Information Technology, Indraprastha Engineering College, Ghaziabad,

UP, India

e-mail: 1maheshkrsg@gmail.com, 2omprakashrishi@yahoo.com,

3akhileshaks@gmail.com, 4pushpendra.singh1@gmail.com, 5pushpak2728@gmail.com

Abstract. This is the era of the I-way. The development of high-speed computing and large storage devices has changed the way people work. It has affected traditional business processes and shifted them towards online business. This shift creates problems such as information overload and irrelevant information, which confuse both customers and enterprises. Recommendation systems address these problems, and the design and development of efficient recommendation systems is a key area of recent research. Collaborative filtering (CF) and content-based filtering algorithms are widely used to implement such systems: collaborative filtering uses users' features, while content-based filtering uses items' features. Most CF methods are rating based or review based and process homogeneous information. In this paper we propose a knowledge-based collaborative filtering algorithm for large datasets that uses the various activities users perform while interacting with items on an e-commerce web site, such as click, select and purchase. The performance of the system is compared with the base models on a real-world Amazon e-commerce dataset, using precision, recall and NDCG as evaluation measures, for various combinations of the activities performed by users on items.

Keywords: Web Recommendation System (WRS), Web Artificial Intelligence, Web Information Technology, Collaborative Filtering (CF), Knowledge Base, Knowledge Graph, Web Usage Mining, Web Engineering Applications.

1. INTRODUCTION

The Internet has transformed traditional ways of doing business, and almost every company wants its own web site to support and conduct its business. Since the Internet provides a very large marketplace, every customer is faced with many choices. Suppose a customer is looking for a book to read without any specific area in mind. There are many books of the same kind, so the customer spends a lot of time searching for a relevant one. A site or app that suggests relevant books based on what the customer has read previously saves the customer much of this time. This feature of a web site is known as a recommendation system.

Traditionally, a person bought a product only when it was suggested by friends or relatives. This was the usual way of purchasing when there was any doubt about the product, but in the era of the I-way that circle has expanded to include online sites that use some sort of recommendation engine [1]. A recommendation engine (figure 1) uses various algorithms to filter and recommend the most relevant products to customers on the basis of their past behavior; that is, it recommends products that the user is likely to buy.

Some popular websites that use recommendation systems are shown in table 1.

Figure 1 Architecture of Recommendation Engine

Almost every collaborative filtering method uses unstructured data such as ratings, reviews or images to profile users for personalized recommendation. In this paper we extend the power of collaborative filtering (CF) using large-scale structured heterogeneous user behavior data. The main building block of the proposed CF combines traditional CF with a knowledge base. The behavior of the users is represented by a directed graph called a knowledge graph.

Table 1 Some popular sites that use recommendation systems

| Sr. no | Site | User | Item | Description |
|---|---|---|---|---|
| 1 | LinkedIn | Member | Members or jobs | Members are interested in other members or jobs. |
| 2 | Facebook | Member | Members | Members are interested in other members. |
| 3 | Amazon | Member | Products or books | Members are interested in products. |
| 4 | Netflix | Member | Movies or story series | Members are interested in watching movies or story series. |
| 5 | Flipkart | Member | Products | Members are interested in products. |

1.1 Knowledge graph

The relation between customers and products is represented by a directed graph called a knowledge graph [2,3]. It is a directed graph of triplets (subject, predicate, object), called SPO triplets. The subject and the object are entities, and the predicate shows the relationship between these entities [4]. Entities are denoted by nodes and relationships by edges.

Consider the statement “Customer C1 buys a product P1 of category cat1 and brand b1 that falls in price range r1, and customer C1 selects a product P2 of brand b2 and category cat1 in price range r2.” The SPO triplets of this statement are listed in Table 2.

Table 2 Subject Predicate Object list of the given statement

Combining all SPO triplets forms a multi-directed graph of the given statement, as shown in figure 2.


Figure 2 Sample Knowledge Graph for the Statement

The other activities of the customer can be written as “A customer C1 also_view a product p3 of brand

b3 of category cat1 fall_in price range r1. Customer C1 also_buy product p4 of brand b2 of category cat1

fall_in price_range r2.”
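The statements above can be turned into SPO triplets mechanically. The following is a minimal Python sketch of that idea, not the paper's implementation. The predicates `buy`, `select` and `fall_in` follow the text, while `of_category`, `of_brand` and the helper functions `build_graph` and `objects_of` are assumed names introduced only for illustration.

```python
from collections import defaultdict

# SPO triplets read off the example statement. The predicates buy, select and
# fall_in follow the text; of_category and of_brand are assumed names.
triples = [
    ("C1", "buy", "P1"),
    ("P1", "of_category", "cat1"),
    ("P1", "of_brand", "b1"),
    ("P1", "fall_in", "r1"),
    ("C1", "select", "P2"),
    ("P2", "of_category", "cat1"),
    ("P2", "of_brand", "b2"),
    ("P2", "fall_in", "r2"),
]


def build_graph(spo_list):
    """Store the directed multigraph as subject -> list of (predicate, object)."""
    graph = defaultdict(list)
    for subject, predicate, obj in spo_list:
        graph[subject].append((predicate, obj))
    return graph


def objects_of(graph, subject, predicate):
    """Return every object reachable from `subject` through `predicate`."""
    return [obj for pred, obj in graph.get(subject, []) if pred == predicate]


graph = build_graph(triples)
bought = objects_of(graph, "C1", "buy")        # products C1 bought
selected = objects_of(graph, "C1", "select")   # products C1 selected
same_category = (objects_of(graph, "P1", "of_category")
                 == objects_of(graph, "P2", "of_category"))
```

Storing edges per subject keeps the graph multi-directed: the same pair of nodes may be linked by several predicates, which is what the also_view and also_buy relations require.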

1.1.1 Construction of Knowledge Base. During the construction of a knowledge base [5], it is necessary to consider parameters such as completeness, accuracy and data quality, which determine the usefulness of the knowledge base. There are four major groups of knowledge base construction methods: curated, collaborative, automated semi-structured and automated unstructured.

1.1.1.1. Curated method. In this method triplets are created manually by a closed group of experts. The accuracy of a curated knowledge base is very high, but the method is not scalable because it depends on human experts.

1.1.1.2. Collaborative method. In this method triplets are created manually by an open group of volunteers. This method is used in Wikipedia and Freebase and scales better, but it also has limitations; for example, the growth of Wikipedia has been slowing down.

1.1.1.3. Automated semi-structured method. In this method triplets are extracted automatically from semi-structured text using hand-crafted rules. Applied to sources such as Wikipedia infoboxes, it yields large and highly accurate knowledge graphs like YAGO [6] and DBpedia [7], but semi-structured text covers only a small fraction of the information stored on the web.

1.1.1.4. Automated unstructured method. In this method triplets are created automatically from unstructured text using machine learning and natural language processing. This method tries to read the web by extracting facts from the natural language text of web pages, as in NELL and Knowledge Vault.

A knowledge graph is similar to a knowledge base, and knowledge bases are classified as schema based or schema free. Some popular schema-based knowledge bases are listed in table 3.

The schema-based approach uses predefined, globally unique identifiers for entities and relations drawn from a fixed vocabulary, while in the schema-free approach entities and relations are identified using open information extraction techniques.

Table 3 Size in million (M) of some popular Schema-Based Knowledge Bases

| Name of Knowledge Graph | Entities | Relations | Facts |
|---|---|---|---|
| Freebase [8] | 45 M | 360000 | 735 M |
| Wikidata [9] | 19 M | 1735 | 68 M |
| DBpedia [7] | 5.2 M | 1467 | 638 M |
| YAGO2 [6] | 10.8 M | 150 | 547 M |
| Google Knowledge Graph [3] | 670 M | 360000 | 200000 M |

1.1.2. Learning of knowledge graph. Let $E=\{e_1,e_2,e_3,\dots,e_n\}$ be the set of all entities (that is, all subject and object entities) and $R=\{r_1,r_2,r_3,\dots,r_m\}$ be the set of all relations in the knowledge graph. Each possible triplet $x_{i,k,j}=(e_i,r_k,e_j)$ over these sets is modeled as a binary random variable $y_{i,k,j}\in\{0,1\}$. All possible triplets are grouped in a tensor $Y\in\{0,1\}^{N_e\times N_r\times N_e}$, with $N_e=n$ and $N_r=m$, which contains

$$ y_{i,k,j} = \begin{cases} 1 & \text{if the triplet } (e_i, r_k, e_j) \text{ exists} \\ 0 & \text{otherwise} \end{cases} \tag{1} $$

Let $e_h$ and $e_t$ be the head and tail entities in the knowledge graph and $r_k$ the edge between them. Then $e_t$ can be obtained from $e_h$ by a translation, $e_t=\mathrm{trans}(e_h,r_k)=e_h+r_k$. Applying this relation to all nodes makes the relations among the nodes easy to compute.
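To make the two ideas above concrete, the following Python sketch builds the binary tensor of eq. (1) for a toy graph and checks the translation relation $e_h + r_k$ against a tail entity. It is an illustration under stated assumptions: the entity and relation names, the hand-picked 2-d vectors (they are not learned embeddings) and the function `translation_distance` are all hypothetical.

```python
import math

entities = ["C1", "P1", "P2"]      # toy entity set E (hypothetical)
relations = ["buy", "select"]      # toy relation set R (hypothetical)
observed = [("C1", "buy", "P1"), ("C1", "select", "P2")]

e_idx = {e: i for i, e in enumerate(entities)}
r_idx = {r: k for k, r in enumerate(relations)}

# Binary tensor Y[i][k][j] = 1 iff the triplet (e_i, r_k, e_j) exists, as in eq. (1).
Y = [[[0] * len(entities) for _ in relations] for _ in entities]
for head, rel, tail in observed:
    Y[e_idx[head]][r_idx[rel]][e_idx[tail]] = 1


def translation_distance(e_head, r, e_tail):
    """Euclidean distance between the translated head e_h + r_k and the tail e_t."""
    return math.sqrt(sum((h + x - t) ** 2 for h, x, t in zip(e_head, r, e_tail)))


# Hand-picked 2-d vectors (not learned) so that C1 + buy lands exactly on P1.
entity_vec = {"C1": [0.0, 0.0], "P1": [1.0, 0.0], "P2": [0.0, 1.0]}
relation_vec = {"buy": [1.0, 0.0], "select": [0.0, 1.0]}

d_observed = translation_distance(entity_vec["C1"], relation_vec["buy"], entity_vec["P1"])
d_unobserved = translation_distance(entity_vec["C1"], relation_vec["buy"], entity_vec["P2"])
```

A small distance indicates that the triplet fits the translation relation; in this toy setting the observed triplet (C1, buy, P1) has distance zero, while the unobserved (C1, buy, P2) does not.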

Figure 3 Binary Relation Presentation of Data

2. RELATED WORK

Recommendation systems try to identify a user's interests in a specific domain of content based on the user's previous experiences. When a user interacts with an e-commerce site, he/she provides a set of implicit or explicit information about his/her tastes, such as clicks, ratings and comments. Recommendation systems fall into two main categories: personalized [10,11] and non-personalized. Personalized systems use the history of customers' navigation and behavior, as in content-based filtering [12], collaborative filtering [13] and PageRank [14] in social network analysis, while non-personalized systems do not require any historical data and rely on characteristics of the products, as in popularity-based recommendation.

2.1. Content based filtering system

This system recommends products on the basis of the user's past preferences. It stores the information related to each user in a vector known as the profile vector, and the information related to each product in another vector, the product vector. The algorithm computes the cosine of the angle between the profile vector and the product vector. It uses traditional classification and clustering techniques such as Support Vector Machines [13] or nearest-neighbor algorithms [12]. There are two types of users, implicit and explicit. Users whose information is updated automatically by the system are called implicit users, while users who give feedback to the system within a given range are called explicit users. According to Aggarwal [1], the approach has some drawbacks: its accuracy depends heavily on the features used to describe items in the specific application, and it is affected by overspecialization and training size. Content-based filtering has three major limitations: overspecialization, cold start and limited content.
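The cosine comparison between a profile vector and product vectors can be sketched in a few lines of Python. The feature vectors and product names below are made up for illustration, and the zero-vector guard is a design choice of this sketch rather than something the text specifies.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no preference information
    return dot / (norm_a * norm_b)


# Hypothetical profile vector of one user over three item features.
profile = [1.0, 0.0, 1.0]
# Hypothetical product vectors over the same three features.
products = {
    "p1": [1.0, 0.0, 1.0],
    "p2": [0.0, 1.0, 0.0],
    "p3": [1.0, 1.0, 0.0],
}

# Rank products by decreasing similarity to the user's profile.
ranked = sorted(products,
                key=lambda p: cosine_similarity(profile, products[p]),
                reverse=True)
```

Products whose feature vectors point in the same direction as the profile are ranked first, which also illustrates the overspecialization problem: items unlike anything in the profile never surface.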

2.2. Collaborative filtering algorithm

It uses user behavior to recommend items. It is one of the most commonly used families of algorithms in industry, since it does not depend on any additional information. There are two types of collaborative filtering techniques: memory-based and model-based.

Memory-based collaborative filtering [15] uses item-based and user-based approaches, in which recommendations are generated from the preferences of the nearest neighbors [16]. Model-based collaborative filtering uses matrix factorization approaches such as SVD and tensor factorization [17], and is widely used to predict the product a customer is most likely to purchase. Graph-based or social-network-based recommendation systems [18] use information available from the social network, such as user preferences and the influence of friends, to overcome the cold start and data sparsity problems of recommendation systems.

3. DESIGN ARCHITECTURE OF PROPOSED MODEL

The architecture of the proposed model (see fig 4) contains four basic components. The user interacts with the system by providing data and receives a list of items as recommendations. Event data pre-processing transforms the data to meet the system requirements. The ranking algorithm generates a rank score for each item based on the user's preferences. The matching algorithm measures the similarity among items and among users using item-item and user-user similarity methods respectively.

Figure 4 Architecture of Proposed Model

3.1. Input Data

Input datasets are collected from the Amazon [16] e-commerce web site. They contain the various activities performed by users while navigating the items. Let m be the number of users and n the number of items; then m x n user-item activities are possible. Three kinds of activity (view, add_to_cart and transaction) are used in the system. The user-item activities are represented by an m x n preference matrix (see eq. (2)).

$$ C = \begin{bmatrix} c_{11} & \cdots & c_{1n} \\ \vdots & \ddots & \vdots \\ c_{m1} & \cdots & c_{mn} \end{bmatrix} = [c_{ij}]_{m \times n} \tag{2} $$

3.2. Data pre-processing

The input datasets contain many attributes from different domains. Every input dataset consists of individual data objects, and all the datasets share common properties such as the type of the data objects, size, dimensionality, sparsity and abstraction. Several methods are used to process the data before it is used by the proposed recommendation system.

3.2.1. Feature selection. The input from users and items contains many attributes. Feature selection is a dimensionality reduction method that is primarily used to remove redundant and irrelevant attributes from the datasets. The system considers the c_id, ip_address and session attributes from the customer dataset, P_id and P_cat from the product dataset, and P_view, P_select and P_buy from the relation datasets; all other attributes are removed.

3.2.2 Data binarization. This method assigns binary values to the attributes. Here the values of the relations are used to compute the preference score of a product, so the attributes P_view, P_select and P_buy are binarized to 0 for no and 1 for yes. The entries of the preference matrix therefore become

$$ c_{i,j} = \begin{cases} 1 & \text{if customer } i \text{ performed the event on product } j \\ 0 & \text{otherwise} \end{cases} \tag{2} $$
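As an illustration of the binarization step, the sketch below turns a small event log into one binary customer-by-product matrix per event type. The log rows, the event labels `view`, `select` and `buy`, and the function `binary_matrix` are assumptions made for this example; the paper does not give its pre-processing code.

```python
# Hypothetical event log rows: (customer_id, product_id, event).
events = [
    ("c1", "p1", "view"),
    ("c1", "p1", "buy"),
    ("c1", "p2", "select"),
    ("c2", "p2", "view"),
]

customers = sorted({c for c, _, _ in events})   # row order of the matrices
products = sorted({p for _, p, _ in events})    # column order of the matrices


def binary_matrix(events, event_type, customers, products):
    """Return an m x n matrix with 1 where the customer performed `event_type`."""
    row = {c: i for i, c in enumerate(customers)}
    col = {p: j for j, p in enumerate(products)}
    matrix = [[0] * len(products) for _ in customers]
    for customer, product, event in events:
        if event == event_type:
            matrix[row[customer]][col[product]] = 1  # repeated events stay 1
    return matrix


P_view = binary_matrix(events, "view", customers, products)
P_select = binary_matrix(events, "select", customers, products)
P_buy = binary_matrix(events, "buy", customers, products)
```

Keeping one matrix per relation preserves the distinction between view, select and buy, which the later ranking step weights differently.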

3.3. Computation of preference or ranking score using knowledge recommendation system (KRS)

The preference or ranking score of a product is computed from the number of customers who participated in the events during a specific period of time.

3.3.1. Event Database. It is the collection of customer events performed on the different categories (P_cat) of products (see Table 4). Let $C^b_{ij}$ indicate that customer i bought product j, $C^s_{ij}$ that customer i only selected product j, and $C^v_{ij}$ that customer i viewed product j. The preference order among the events is product_buy > product_select > product_view. The customer's preference for a product can be represented in matrix form as $C_{ij}$, meaning that customer i prefers product j in terms of buy, select or view. If there are m customers and n products, then i=1,2,3,...,m and j=1,2,3,...,n, and the product preference matrix ($C_{ij}$) is the m x n matrix of eq. (2).

Table 4 Events Data base

| Name of Attribute | Data Type | Description |
|---|---|---|
| P_view | Number | Product view or click |
| P_select | Number | Product selected or add_to_cart |
| P_buy | Number | Product bought or transaction |
| P_also_buy | Number | Buy with some other products |
| Price | Real Number | Price of the product |
| P_cat | Text | Category of the product |
| P_price_range | Number | Range of price |

3.3.2. Customer data. The user database (see Table 6) contains customer-specific data such as c_id, url, location of the customer, navigational details and timestamp. The web server logs are represented as the tuple {u_id, ip_address, url, timestamp, location}, which records the user who accessed the web site, the pages requested, the corresponding browsing time and the location of the user. The c_id of the customer is the attribute most useful for recommendation. The association between each u_id and the URLs visited by that customer can be expressed by the association matrix

$$ V = \begin{bmatrix} v_{11} & v_{12} & \cdots & v_{1n} \\ v_{21} & v_{22} & \cdots & v_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ v_{m1} & v_{m2} & \cdots & v_{mn} \end{bmatrix} \tag{3} $$

Here $v_{ij}$ is the browsing information: it records that user i visited page (product) j at a particular time.

The URL gives information about the product, such as product_id and product_category (see Table 5). The user's information is used together with the product's information for recommendation.

Table 5 Product Data base

| Name of Attribute | Data Type | Description |
|---|---|---|
| P_ID | Varchar | Unique identification number of the product |
| P_cat | Text | Category of product |
| P_brand | Text | Name of the brand of a product |
| P_name | Text | Name of the product |
| Price | Real number | Price of product |
| Unique_page | Number | Number of unique pages visited by the user per product |
| P_session | Time | Time spent by user for a particular product ID |

Table 6 User Data base

| Name of Attribute | Data Type | Description |
|---|---|---|
| C_ID | Number | Unique identification number of any user |
| Age | Number | Age group of the customer |
| Gender | Text | Gender group |

3.3.3 Method. The dataset consists of product details, customer details, page details, actions, and the categories and brands of products. The relation between user and item is denoted by the triple (Ci, Rk, Pj), which states that customer i is related to product j through relation k. The first aim is to find the ordered pairs (Ci, Pj), that is, the lists of products bought, selected and viewed by each customer; we then find the ordered pairs (Ck, Pj), that is, the list of customers related to product Pj.

The customer's preference rank can be calculated by the formula

$$ P_{ij} = X \cdot F^{v}_{ij} + Y \cdot F^{s}_{ij} + Z \cdot F^{b}_{ij} \tag{4} $$

where $F^{v}_{ij}$, $F^{s}_{ij}$ and $F^{b}_{ij}$ are the view, select and buy fractions computed from $C^v_{ij}$, $C^s_{ij}$ and $C^b_{ij}$, and X, Y and Z are the weight-adjusting coefficients of the three shopping relations. The coefficients are not equal; they follow the preference order of the relations, so Z>Y>X. The values X=0.25, Y=0.5 and Z=1 give good results. The formula thus yields the preference score of each product, where a higher value means a stronger preference.
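The weighted combination of the three relations can be sketched as follows. This is an illustration, not the paper's exact formula: the normalization used here (the share of the m customers who performed each event on a product) is an assumption, as are the function name `preference_scores` and the toy matrices. Only the weights X=0.25, Y=0.5 and Z=1 come from the text.

```python
X, Y, Z = 0.25, 0.5, 1.0   # weights for view, select and buy (from the text)


def preference_scores(view, select, buy, x=X, y=Y, z=Z):
    """Score each product by weighted shares of customers per event type.

    Each argument is a binary m x n matrix (customers x products).
    Dividing by m is an assumed normalization, not taken from the paper.
    """
    m = len(view)
    n = len(view[0])
    scores = []
    for j in range(n):
        view_share = sum(view[i][j] for i in range(m)) / m
        select_share = sum(select[i][j] for i in range(m)) / m
        buy_share = sum(buy[i][j] for i in range(m)) / m
        scores.append(x * view_share + y * select_share + z * buy_share)
    return scores


# Toy data: 4 customers, 2 products.
view = [[1, 1], [1, 0], [0, 1], [1, 1]]
select = [[1, 0], [0, 0], [0, 1], [0, 1]]
buy = [[1, 0], [0, 0], [0, 0], [0, 1]]

scores = preference_scores(view, select, buy)
```

Because Z>Y>X, a single purchase moves the score as much as four views, which matches the stated preference order buy > select > view.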

3.4. Matching algorithm for recommendation (Association Rule Mining)

Rule mining is used to group data of similar categories so that customer profiles are associated with each product. This is valuable for recommendation to both the customer and the enterprise. Since there are three kinds of events, association rules can be generated from three different kinds of transaction sets: the buy transaction set, the selected_but_not_buy transaction set and the also_view transaction set. For each transaction set from the web logs, association rules are generated in three steps. The first step sets the minimum support and minimum confidence, the second replaces each product in the transaction set with its corresponding product category, and the third generates association rules for each transaction set using the Apriori algorithm. The result can be given by a matrix P=[Pkl], called the product matrix, where k=1,2,...,n indexes the products and l=1,2,3 or 4 indexes the case; its entries represent the degree of association among the product categories in the different transaction sets.

$$ P_{kl} = \begin{cases} 1.0 & \text{if product } k \text{ occurs in the buy transaction set} \\ 0.50 & \text{if product } k \text{ occurs in the selected\_but\_not\_buy transaction set} \\ 0.25 & \text{if product } k \text{ occurs in the also\_view transaction set} \\ 0.00 & \text{otherwise} \end{cases} \tag{5} $$

The relation among these three kinds of information is denoted by the triple (Ci, Rk, Pj), which states that customer i is related to product j through relation k. The first aim is to find the ordered pairs (Ci, Pj), that is, the lists of products bought, selected and viewed by each customer; we then find the ordered pairs (Ck, Pj), that is, the list of customers related to product Pj.
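The three rule-mining steps can be illustrated with a deliberately simplified sketch. It is not a full Apriori implementation: it applies the same support-based pruning but stops at itemsets of size two, which is enough to show how support and confidence produce category-level rules. The product-to-category mapping, the transactions, the thresholds and the function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical product -> category mapping and buy transaction set.
product_category = {"p1": "books", "p2": "books", "p3": "mobile", "p4": "movies"}
buy_transactions = [["p1", "p3"], ["p2", "p3"], ["p1", "p4"], ["p3"]]


def to_categories(transactions, mapping):
    """Step 2: replace each product by its category."""
    return [frozenset(mapping[p] for p in t) for t in transactions]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def pair_rules(transactions, min_support, min_confidence):
    """Step 3, restricted to pairs: rules lhs -> rhs between frequent categories."""
    items = sorted({i for t in transactions for i in t})
    # Apriori pruning: only frequent single items can form frequent pairs.
    frequent = [i for i in items
                if support(frozenset([i]), transactions) >= min_support]
    rules = []
    for a, b in combinations(frequent, 2):
        pair_support = support(frozenset([a, b]), transactions)
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support(frozenset([lhs]), transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


cat_transactions = to_categories(buy_transactions, product_category)
# Step 1: the thresholds below are illustrative choices.
rules = pair_rules(cat_transactions, min_support=0.5, min_confidence=0.6)
```

The same function can be run separately on the selected_but_not_buy and also_view transaction sets to obtain one rule set per relation.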

Table 6 (continued)

| Name of Attribute | Data Type | Description |
|---|---|---|
| URL | Text | Uniform Resource Locator for each web page |
| IP_address | Text | Machine_ID |
| Session | Time | Time spent by user |
| Location | Number | Zip code of user |
| Page | Number | Number of unique pages visited by users for different products |
| S_ID | Text | Session ID |


There are now two matrices, the customer preference matrix and the product association matrix. Before recommending, a matching score is computed for each product associated with each customer; this score indicates how close the product is to the customer.

3.4.1. Customer-Customer collaborative filtering. This algorithm computes the similarity score between customers. Based on this score, it picks the most similar customers and recommends products that these similar customers have previously liked, selected or bought.

User-based collaborative filtering assumes that a set of similar customers has the same relation to a product, and aggregates these relations using formula (6).

$$ r_{u,j} = \sum_{k \in N_u} \mathrm{Sim}(u,k)\, r_{k,j} \tag{6} $$

where $N_u$ is the set of K customers whose interests are most similar to those of the target customer u, Sim(u,k) is the similarity between customer u and neighbor k, and $r_{k,j}$ is the rating given by customer k to product j.
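Formula (6) can be sketched in Python as a similarity-weighted sum over the K nearest customers. The text does not say which similarity function is used, so cosine similarity is an assumption here, as are the toy preference scores and the function names.

```python
import math

# Hypothetical customer -> {product: preference score} data.
ratings = {
    "u1": {"p1": 1.0, "p2": 0.5},
    "u2": {"p1": 1.0, "p2": 0.5, "p3": 1.0},
    "u3": {"p2": 0.25, "p3": 0.5},
}


def cosine(a, b):
    """Cosine similarity between two sparse score dictionaries (assumed Sim)."""
    dot = sum(a[p] * b[p] for p in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_user_based(ratings, u, j, k=2):
    """Sum Sim(u, v) * r[v][j] over the k customers most similar to u who scored j."""
    sims = [(cosine(ratings[u], ratings[v]), v)
            for v in ratings if v != u and j in ratings[v]]
    neighbours = sorted(sims, reverse=True)[:k]
    return sum(sim * ratings[v][j] for sim, v in neighbours)


score_u1_p3 = predict_user_based(ratings, "u1", "p3", k=2)
```

Here customer u1 has never interacted with p3, so the score comes entirely from the neighbours u2 and u3, with the more similar u2 contributing most.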

3.4.2. Product-Product collaborative filtering. Product-based collaborative filtering considers the similarity among products or services. It assumes that similar products are related in a similar way by the same customer. Hence the products recommended to customer u are scored or ranked by aggregating their similarity to the products that customer u was related to in the past. The score is computed by formula (7).

$$ r_{u,j} = \sum_{k \in N_j} \mathrm{Sim}(j,k)\, r_{u,k} \tag{7} $$

where $N_j$ denotes the set of products or items neighboring j and Sim(j,k) is the similarity value between products j and k.
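The item-based counterpart in formula (7) can be sketched the same way, this time comparing product columns instead of customer rows. As before, cosine similarity, the toy scores and the function names are assumptions for illustration.

```python
import math

# Hypothetical customer -> {product: preference score} data.
ratings = {
    "u1": {"p1": 1.0, "p2": 0.5},
    "u2": {"p1": 1.0, "p2": 0.5, "p3": 1.0},
    "u3": {"p2": 0.25, "p3": 0.5},
}


def item_vector(ratings, j):
    """Scores given to product j, keyed by customer."""
    return {u: scores[j] for u, scores in ratings.items() if j in scores}


def cosine(a, b):
    """Cosine similarity between two sparse score dictionaries (assumed Sim)."""
    dot = sum(a[x] * b[x] for x in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_item_based(ratings, u, j, k=2):
    """Sum Sim(j, i) * r[u][i] over the k products of u most similar to j."""
    target = item_vector(ratings, j)
    sims = [(cosine(target, item_vector(ratings, i)), i)
            for i in ratings[u] if i != j]
    neighbours = sorted(sims, reverse=True)[:k]
    return sum(sim * ratings[u][i] for sim, i in neighbours)


score_u1_p3 = predict_item_based(ratings, "u1", "p3", k=2)
```

Item-item similarities change more slowly than customer behavior, so in practice they are often precomputed; this sketch recomputes them on every call for clarity.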

4. EXPERIMENTS AND RESULT ANALYSIS

Experiments are performed on the Amazon e-commerce datasets [19], which comprise five sub-datasets: automotive, mobile phones, home appliances, movies and books. The behavior data, that is, the relations (events) view, select and buy, represent interactions collected over a period of 4.5 months. The original data contain 2,756,101 events, including 2,664,312 views only, 69,332 selects only and 22,457 buys, produced by 1,407,580 unique customers on 8,885 unique products.

The statistics of these datasets are summarized in the table. We consider top-N recommendation measures including Precision, Recall, Hit Ratio and NDCG [20] for evaluating the model and the baselines. The first three measures evaluate the quality of the recommendation system, and the last evaluates the accuracy and ranking positions of the correct products in the output list. There are many relations among the entities of the knowledge graph, but this paper considers only three.

Buy relation: a customer c bought the product p.

Select relation: a customer c added the product p to the cart (add_to_cart).

View relation: a customer c visited the page of product p.

4.1. Computation of baselines

We use the following methods as baselines for performance comparison.

4.1.1. Bayesian Personalized Ranking (BPR) [21]. It is a popular method for top-N recommendation that uses matrix factorization as the prediction component. It is based on triplets (u, i, j), where user u interacted with item i but not with item j. The relationship between items i and j with respect to user u is given by

$$ \hat{x}_{uij} = \hat{x}_{ui} - \hat{x}_{uj} > 0 \tag{8} $$

where $\hat{x}_{ui}$ is the predicted score of item i for user u.

4.1.2. Hidden Factors and Topics (BPR_HFT) [22]. It is a method that uses textual reviews; we use HFT under the BPR pairwise ranking framework for a fair comparison.

4.1.3. Visual Bayesian Personalized Ranking (VBPR) [23]. This method is used for recommendation with images.

4.1.4. Deep Convolutional Neural Network (DCNN) [24]. It is a review-based deep recommendation method that jointly models the users and the products.

4.1.5. Joint Representation Learning (JRL) [19]. It is a model that can leverage multi-modal information for top-N recommendation.

4.2 Ranking accuracy

It deals with the level of utility of the recommended products or services with respect to the ranking preferred by the user. Discounted Cumulative Gain (DCG) is a very popular metric for evaluating ranking accuracy. DCG and the Normalized DCG (NDCG) [27] are defined as follows.

$$ DCG = \frac{1}{m}\sum_{u=1}^{m}\sum_{j \in I_u}\frac{g_{uj}}{\log_2(v_j+1)}, \qquad NDCG = \frac{DCG}{IDCG} \tag{9} $$

where m denotes the total number of users in the test dataset De, $I_u$ is the set of products/services liked by user u, $v_j$ is the position of j in the recommended list, $g_{uj}$ represents the utility gain given by user u to product j, and IDCG is the ideal value of DCG, computed from the true relevance values using the same formula as DCG. Another way to evaluate the accuracy of the recommended list is to consider the trade-off between the length of the list RL and the number of products/services in it that are actually relevant for the user. The list RL contains true positives (tp) and false positives (fp), while relevant products that are missing from it are false negatives (fn). Precision and Recall [10] are then computed as follows.

$$ Precision = \frac{tp}{tp+fp}, \qquad Recall = \frac{tp}{tp+fn} \tag{10} $$

Both can be summarized in a single metric, the F-measure [10], computed by the following formula.

$$ F = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall} \tag{11} $$

4.3 Settings of parameters

All the parameters of the filter are initialized in the range (0,1) and updated by Stochastic Gradient Descent (SGD). The learning rate is searched over {1.0, 0.1, 0.01, 0.001, 0.0001} and the dimension over {10, 50, 100, 200, 300, 400, 500, 600}, which gives a final learning rate of 0.01 and a dimension of 200. For the baselines, 70% of the products of each user are used for training and the rest for testing. The system generates the top 10 recommendations for each user in the test dataset.

4.4 Performance Comparison

The performance of the proposed filter is shown in table 7 and table 8. Table 7 compares its performance with the various base models, and table 8 shows the performance for the possible combinations of relations. From the experimental results (see fig 5) it is clear that both review-based and rating-based models enhance the performance of the recommendation system, while a model based on heterogeneous information sources, such as JRL, performs better than these baselines. The knowledge-based collaborative filtering (KCF) in turn performs better than JRL consistently over the five datasets and all evaluation measures, which validates the proposed system.

Table 7 Performance on top 10 recommendation of the baselines and the proposed system.

Table 8 Performance on top 10 recommendation when incorporating different combinations of the relational structures used in the knowledge graph. The final result is significantly better than all other models.

Figure 5 Performance of top 10 recommendation when incorporating different combinations of the relational structures used in the Knowledge Graph (panels: Automotive, Mobile phones, Home appliances, Movies, Books; measures: Recall, Prec, NDCG; series: Buy, Select, View, Buy+Select, Buy+View, View+Select, All (KCF))


5. CONCLUSION

This paper discussed the concept of the knowledge graph, its construction and its learning. We created a knowledge-based collaborative filtering method that processes heterogeneous, unstructured information by converting it into structured form using a knowledge graph, which is a directed graph. The triplet relation between user and product played a vital role in the development of the proposed CF. The experiments used real-world datasets to measure the performance of various filters used in recommendation systems based on rating, review and heterogeneous information. The results show that the proposed filter performs much better than the filters discussed; we therefore conclude that the proposed CF is well suited for building recommendation systems.

REFERENCES

[1] C. C. Aggarwal, Recommender Systems. Springer, 2016.

[2] R. Catherine, K. Mazaitis, M. Eskenazi, and W. Cohen, “Explainable Entity-based Recommendations with Knowledge Graphs,” in RecSys, 2017.

[3] K. Bollacker, C. Evans, P. Paritosh, T. Sturge, and J. Taylor, “Freebase: a collaboratively

created graph database for structuring human knowledge,” in Proceedings of the 2008 ACM

SIGMOD international conference on Management of data. ACM, 2008, pp. 1247–1250.

[4] L. De Raedt, Logical and relational learning. Springer, 2008.

[5] M. K. Singh, O. P. Rishi, and S. Wadhwa, “Application of Page Ranking Algorithm Based on Numbers of Link Visits in Web Recommendation System for Online Business,” 4th International Conference on Green Computing and Engineering Technology (ICGCET-2018), 17–19 August 2018, Aalborg University, Niels Bohrs Vej 8, Esbjerg, Denmark.

[6] A. Carlson, J. Betteridge, B. Kisiel, B. Settles, E. R. Hruschka Jr., and T. M. Mitchell, “Toward an Architecture for Never-Ending Language Learning,” in Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence. AAAI Press, 2010, pp. 1306–1313.

[7] S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. Ives, “DBpedia: A Nucleus

for a Web of Open Data,” in The Semantic Web. Springer Berlin Heidelberg, 2007, vol.

4825, pp. 722–735.

[8] R. He, W.-C. Kang, and J. McAuley, “Translation-based Recommendation,” in RecSys. ACM, 2017.

[9] F. M. Suchanek, G. Kasneci, and G. Weikum, “Yago: A Core of Semantic Knowledge,” in

Proceedings of the 16th International Conference on World Wide Web. New York, NY,

USA: ACM, 2007, pp. 697–706.

[10] M. K. Singh, O. P. Rishi, and S. Wadhwa, “Knowledge Generation through Web Mining Techniques for E-business Recommendation System,” 1st International Conference on Communication and Computing (ICCCT-2018), 7–8 September 2018, GLBITM, Greater Noida, UP, India.

[11] M. K. Singh, O. P. Rishi, S. Awasthi, A. P. Srivastava and S. Wadhwa, "Classification and

Comparison of Web Recommendation Systems used in Online Business." 2020 International

Conference on Computation, Automation and Knowledge Management (ICCAKM), Dubai,

United Arab Emirates, 2020, pp. 471-480.

[12] P. Lops, M. de Gemmis, and G. Semeraro, “Content-based recommender systems: state of the art and trends,” in Recommender Systems Handbook. Springer, 2011, pp. 73–105.

[13] Z. Xia, Y. Dong, and G. Xing, “Support vector machines for collaborative filtering,” in Proceedings of the Forty-fourth Annual Southeast Regional Conference. ACM, 2006, pp. 169–174.

[14] L. Page, S. Brin, R. Motwani, and T. Winograd, “The PageRank citation ranking: Bringing order to the Web,” Technical report, Stanford Digital Libraries SIDL-WP-1999-0120, 1999.

[15] M. Y. H. Al-Shamri, “Power coefficient as a similarity measure for memory-based collaborative recommender systems,” Expert Systems with Applications, vol. 41, no. 13, pp. 5680–5688, 2014.


[16] M. Y. H. Al-Shamri, “Power coefficient as a similarity measure for memory-based collaborative recommender systems,” Expert Systems with Applications, vol. 41, no. 13, pp. 5680–5688, 2014.

[17] K. Miyahara and M. Pazzani, “Collaborative filtering with the simple Bayesian classifier,” Proceedings of the Eighteenth Pacific Rim International Conference on PRICAI 2000 Topics in Artificial Intelligence, 2000, pp. 679–689.

[18] M. K. Singh and O. P. Rishi, “Event Driven Recommendation System for E-commerce using Knowledge based Collaborative Filtering Technique,” Scalable Computing: Practice and Experience, ISSN 1895-1767, vol. 21, no. 3, 2020, pp. 369–378, doi:10.12694/scpe.v21i3.1709.

[19] J. McAuley and J. Leskovec, “Hidden factors and hidden topics: understanding rating dimensions with review text,” in Proceedings of the 7th ACM Conference on Recommender Systems, Hong Kong, China, 12–16 October 2013, pp. 165–172.

[20] H. Seifoddini and M. Djassemi, “The production data-based similarity coefficient versus Jaccard’s similarity coefficient,” Computers & Industrial Engineering, vol. 21, no. 1, 1991, pp. 263–266.

[21] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, “BPR: Bayesian personalized ranking from implicit feedback,” in Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, Montreal, QC, Canada, 18–21 June 2009, pp. 452–461.

[22] Y. Zhang, Q. Ai, X. Chen, and W. B. Croft, “Joint representation learning for top-N recommendation with heterogeneous information sources,” in Proceedings of the 2017 ACM Conference on Information and Knowledge Management, Singapore, 6–10 November 2017, pp. 1449–1458.

[23] R. He and J. McAuley, “VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 30, no. 1, 2016. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/9973

[24] M. K. Singh, O. P. Rishi, A. Sharma, and Z. Akthatar, “Knowledge Extraction Through Page Rank Using Web Mining Techniques for E-business: A Review,” a chapter in Maximizing Business Performance and Efficiency through Intelligent Systems.
