PAVE: Personalized Academic Venue recommendation Exploiting co-publication networks

December 2017
Journal of Network and Computer Applications 104

DOI:10.1016/j.jnca.2017.12.004

Authors:

Shuo Yu

Dalian University of Technology

Zhuo Yang

The University of Warwick

PAVE: Personalized Academic Venue Recommendation Exploiting Co-publication Networks

Shuo Yu^a, Jiaying Liu^a, Zhuo Yang^{a,∗}, Zhen Chen^a, Huizhen Jiang^a, Amr Tolba^{b,c}, Feng Xia^a

^a School of Software, Dalian University of Technology, Dalian 116620, China

^b Computer Science Department, Community College, King Saud University, Riyadh 11437, Saudi Arabia

^c Mathematics Department, Faculty of Science, Menoufia University, Shebin-El-kom 32511, Egypt

Abstract

The number of academic venues has grown dramatically with the rapid development of information technology. Researchers need to identify high-quality and fruitful academic venues. However, the information overload problem in big scholarly data creates tremendous challenges for mining these venues and relevant information. In this work, we propose PAVE, a novel Personalized Academic Venue recommendation model Exploiting co-publication networks. PAVE runs a random walk with restart model on a co-publication network that contains two kinds of associations: coauthor relations and author-venue relations. We define a transfer matrix with bias to drive the random walk by exploiting three academic factors: co-publication frequency, relation weight, and researchers' academic level. PAVE is inspired by the fact that researchers are more likely to contact those who have high co-publication frequencies and similar academic levels. Additionally, PAVE accounts for the difference in weight between the two kinds of associations. Extensive experiments on the DBLP data set demonstrate that, in comparison to relevant baseline approaches, PAVE performs better in terms of precision, recall, F1, and average venue quality.

Keywords: Big scholarly data, Recommender systems, Academic venue recommendation, Random walk, Network science

∗Corresponding author

Email address: yangzhuo@dlut.edu.cn (Zhuo Yang)

Preprint submitted to Elsevier, November 23, 2017

1. Introduction

It is challenging to mine useful and effective information in big scholarly data due to information overload [38]. The numbers of researchers, publications, and academic venues have grown dramatically with the rapid development of information technology. Recommender systems help researchers deal with the rapid growth and complexity of information, and provide users with personalized information services. With the continuous growth of research paper repositories, recommendation technology for academic entities has developed gradually [11]. Nowadays, academic recommender systems mainly focus on four aspects: collaborator recommendation, paper recommendation, citation recommendation, and academic venue recommendation [41] [4]. In particular, the immense growth of academic venues makes it troublesome for researchers to choose the most relevant venue. This growth is evidenced by DBLP [18], a service that provides open bibliographic information on major computer science journals and proceedings, which had recorded 3,711 conferences and 1,391 journals as of 2015. Academic venue recommender systems have demonstrated their necessity and importance because they provide researchers with a personalized venue information service.

To better recommend personalized venues for researchers, we consider researchers' requirements for venues from the perspective of scientific research progress, as follows. (1) Where can researchers find high-quality venues? (2) Which conferences are the most relevant for researchers to participate in? (3) Which venues are the most suitable for researchers to submit papers to? Firstly, researchers usually get inspiration from papers in high-quality venues. When doing research, researchers are better off following high-quality conferences and journals, in which they can find more high-quality and relevant publications. Since researchers want to grow fast in a certain domain by obtaining more specific knowledge and fresh ideas, they need to study more publications that can inspire them. However, new researchers usually do not know which conferences or journals are better choices for their research, because venues differ greatly in research focus, research method, writing style, etc. To avoid blind choices and detours, it is necessary to recommend more high-quality conferences and journals to researchers [30]. Secondly, researchers participate in conferences to communicate with other researchers and promote scientific collaboration. As is well known, almost all researchers participate in conferences every year. Academic conferences not only serve as platforms to present research work, but also connect researchers in a domain for in-depth communication and boost potential collaboration. Thus, researchers can benefit a lot and make progress together [26]. However, choosing the most relevant conferences to attend is a tedious task, especially for new researchers, due to information overload. Finally, researchers need to submit papers to the most suitable high-quality venues. With the rapid growth in both the quantity and variety of publication venues in recent decades, it is difficult to decide where to submit papers. For experts who have much publication experience, selecting suitable conferences, journals, or scientific forums to publish their papers might be a trivial task, since they already know these venues well. That is, they might have target venues in mind before they finish their papers. However, junior researchers who have few or no publication records may not be sure which specific venue their work should be submitted to. Under the guidance of senior researchers, junior researchers may have some distinct or indistinct target venues for which to prepare their work. Nonetheless, since submissions may be rejected, researchers always need backup plans. Thus, choosing an appropriate venue is essential [35] [40].

In recent years, a variety of approaches to academic venue recommendation have been proposed [25, 40, 22, 8, 42, 20]. There are also some smart conference systems or solutions that help improve the participation experience and solve conference recommendation problems [34]. Although there are many methods, systems, and solutions, some factors that may influence recommendation in practice are not fully taken into consideration. Moreover, some work recommends venues based on a homogeneous network [5]. However, an academic network is generally composed of authors, keywords, affiliations, and venues, and is therefore heterogeneous. In this work, we recommend venues for researchers based on a heterogeneous network and take the three aforementioned requirements into account as well. We propose a novel Personalized Academic Venue recommendation model Exploiting co-publication networks (PAVE). We first integrate the academic entities (i.e., authors, publications, and venues) into a co-publication network [17], which contains two kinds of nodes (authors and venues) and two kinds of associations (co-author relations and author-venue relations). Figure 1 shows an example of the co-publication network. Alice, Bob, Cindy, and David are four researchers whose papers have been published in the four venues. The links include co-author relations (e.g., the link between Alice and Bob) and author-venue relations (e.g., the link between venue A and Alice). Those links, along with the nodes, compose the co-publication network. Furthermore, we introduce three metrics in PAVE: 1) co-publication frequency, which reflects how many times a relation occurs; 2) relation weight, which lets the two kinds of relations contribute differently to the network edges; and 3) academic level, since researchers are more likely to contact those who have similar academic levels. Researchers are more likely to approach those with a similar academic level than those with a high academic level, and researchers with similar academic levels show similar characteristics in some ways. For example, in Figure 1, Alice and Bob are neighbors. If they are of similar academic level, then Venue C and Venue D, in which Bob has already published but Alice has not, should not only be taken into account when we recommend venues for Alice, but also be given more attention. This is because researchers are more likely to contact other researchers with similar academic levels and to publish papers in a venue that is likely to accept their papers. Details are given in Section 3.2. Based on these three hypotheses, we define a transfer matrix with bias to drive the random walk with restart (RWR) model [2, 12, 36] by introducing these three academic factors: co-publication frequency, relation weight, and researchers' academic level. Besides, we present a new metric called Ave-Quality to evaluate recommendation performance, in addition to the precision, recall, and F1 metrics. Ave-Quality shows the quality of the recommended venues. In our experiments, PAVE proves to be effective in producing better academic venue recommendations.

Figure 1: An example of co-publication networks.

In summary, we make the following contributions in this paper.

• We develop an innovative solution based on a random walk with restart model to deal with academic venue recommendation over big scholarly data. The proposed solution is effective at producing personalized academic venue recommendations.

• To reveal researchers' real preferences for academic venues, we define a transfer matrix with bias by utilizing the aforementioned three academic factors, which leads the random walk to run on the co-publication network with preference.

• In addition to precision, recall, and F1, we propose a new metric to evaluate performance. Extensive experiments on the DBLP data set compare PAVE with the basic RWR model, a topic-based model, and a friends-based model, and promising results are presented and analyzed.

The rest of the paper is organized as follows. Related work is discussed in the next section. Section 3 introduces the PAVE model. Section 4 presents the performance evaluation results of PAVE, followed by a section dedicated to the conclusion.

2. Related work

2.1. Academic Recommendation

Recommender systems are proposed to deal with information overload and help people make decisions by providing accessible and high-quality recommendations. Academic recommendation generally consists of academic collaborator recommendation, paper recommendation, citation recommendation, and academic venue recommendation. Across these recommendation tasks, methods are basically divided into four types: collaborative filtering (CF) recommendation, content-based recommendation, network-based recommendation, and hybrid recommendation. Details are as follows.

1. Collaborative filtering recommendation. CF is a popular and widely accepted approach for recommender systems, with variants such as user-based CF, item-based CF, and Matrix Factorization (MF). Some papers focus on academic recommendation by exploiting CF algorithms. For example, Yu et al. [43] present a prediction method based on collaborative filtering for personalized academic recommendation. Liang et al. [20] propose a new probabilistic approach that directly incorporates user exposure to items into collaborative filtering. They consider the continuity feature of users' browsing content to help discover collaborative users.

2. Content-based Recommendation. Content-based recommendation mainly focuses on profiles, the content of papers, and the context. It is widely used in academic paper recommendation [19] [?] and citation recommendation [13] [3] [7]. Sugiyama and Kan [31] examined the effect of modelling a researcher's past works in recommending scholarly papers to the researcher. The key part of this model is to enhance the profile derived directly from past works with additional information, which comes from the papers referenced by the past works and the papers that cite them. High-quality papers can provide valuable ideas and can also be cited in our own papers. Since it is challenging to obtain relevant papers of high value, He et al. [13] present the initiative of building a context-aware citation recommendation system and implement a prototype system in CiteSeerX. Caragea et al. [7] propose an application of Singular Value Decomposition to build a reliable citation recommendation system and to recommend the most relevant citations. Pan and Li [24] use topic model techniques to perform topic analysis on research papers.

3. Network-based Recommendation. In academia, collaboration makes researchers more fruitful and productive. The friends-based model is a kind of neighborhood-based recommendation approach, which is simple and fundamental among social network-based recommendation methods. Lopes et al. [21] present an innovative approach to recommend collaborations in the context of academic social networks. Specifically, they introduce the architecture for such an approach and the metrics involved in recommending collaborations, and also present an initial case study to validate their approach. West et al. [33] propose a citation-based method that makes it possible to recommend at multiple scales of relevance for different users by using the hierarchical structure of scientific knowledge. Xia et al. [37] consider features of different researchers and propose a novel recommendation method that results in better recommendations.

In addition, based on the collaboration network, the random walk model is frequently used to analyze the network. Fouss et al. [12] use a Markov-chain model of random walk to compute similarities between elements of a graph. Stokes et al. [29] use a biased random walk to estimate the expected time of finding a maximum-degree node in a graph. Xia et al. [36] present MVCWalker (Random Walk-Based Most Valuable Collaborators Recommendation Exploiting Academic Factors), which takes three academic factors into consideration, i.e., coauthor order, latest collaboration time, and times of collaboration. They compare MVCWalker with the basic RWR model and a common neighbor-based model in various aspects and achieve better performance. Notably, researchers have already begun to study weights in random walk models using supervised learning algorithms. Backstrom and Leskovec [2] develop a method based on Supervised Random Walks that learns, in a supervised way, how to bias a PageRank-like random walk on the network so that it visits given nodes (i.e., positive training examples) more often than the others. With a similar goal, we propose the transfer matrix with bias by introducing three academic factors in this work.

4. Hybrid-based Recommendation. For collaboration recommendation, as more and more content is mined, hybrid methods [16] [10] [32] have gradually emerged. Cohen and Ebel [10] focus on one particular flavor of context-based collaborator recommendation in a social network, given a set of keywords. However, collaborations among different domains broaden research considerably. Tang et al. [32] analyze cross-domain collaboration data from research publications and propose the Cross-domain Topic Learning (CTL) model for ranking and recommending potential cross-domain collaborators. They considered a linear combination of the scores obtained by the content-based and the CF methods.

The cold start issue is one of the most fundamental and intractable issues in recommender systems. When tackling the cold start problem, the above-mentioned methods perform differently. CF recommendation methods rely on the historical collaboration relationships of other scholars, which results in poorer performance compared with other kinds of methods. In contrast to CF, content-based recommendation methods mitigate this issue to some extent, since these methods focus on researchers' own profiles. Nevertheless, content-based recommendation methods still suffer from the cold start challenge when new scholars have no historical profiles or collaboration relationships of their own [1]. Network-based and hybrid recommendation methods perform better in solving cold start issues [27].

2.2. Academic Venue Recommendation

Plenty of studies have been done on academic collaborator recommendation, academic paper and citation recommendation, and conference session recommendation, while few focus on academic venue recommendation. The traditional way of recommending a venue to a researcher is to analyze her/his papers and compare them to the topics of different conferences using content-based analysis. However, this approach is not very precise due to mismatches caused by ambiguity in text comparisons. As a result, many researchers focus on social network-based and CF methods. Additionally, some socially aware approaches and hybrid methods have also been proposed for academic venue recommendation, as mentioned above.

Several previous studies address this problem. Yang et al. [40] propose a memory-based neighborhood collaborative filtering model to recommend venues by incorporating both topic and writing-style information of papers. They assume that papers and venues are distinguishable by their writing styles [39]. Pham et al. [25] propose a clustering approach based on the social information of users to derive academic recommendations. They utilize clustering techniques to improve the accuracy of collaborative filtering. However, this approach mainly involves predicting the publishing venue for a manuscript. Similarly, Luong et al. [22] propose a social network-based approach to recommend publication venues by exploring an author's network of related co-authors and other researchers in the same domain. In addition, Asabere et al. [42] propose a socially aware approach to recommend presentation session (community) venues to participants based on high research interest similarity, strong social relations, and the matching of contextual information between the presenters and participants at the conference venue. Similarly, Xia et al. [35] propose a presentation session recommender for smart conference participants by utilizing social properties such as tie strength and degree centrality. Hornick et al. [14] provide a framework for extending preference-based recommender systems to deal with problems such as the conference recommendation problem. Huynh Hoang [15] proposed a collaborative knowledge model that runs on the collaborative network and combines graph theory and probability theory, which aims at supporting publication venue recommendation. Besides, Wongchokprasitti et al. [34] present a design for a community-based conference navigator system that collects the wisdom of the community to help conference participants examine the schedule of paper presentations and add the most interesting sessions.

Previous works have not recommended venues according to their associations with researchers. In our paper, we describe the academic publishing scene with a co-publication network that includes an author-venue network and a co-author network, and model the real publishing process with an RWR model based on graph theory and probability theory. Our academic venue recommendation model, PAVE, extends the basic RWR model. We propose the transfer matrix with bias by introducing three academic factors, i.e., co-publication frequency, relation weight, and researchers' academic level, so that the random walk performs better when making academic venue recommendations.

3. Design of PAVE

In this section, we describe the details of PAVE. Furthermore, we explain how to compute link importance in the co-publication network by taking three academic factors into consideration.

3.1. Overview of PAVE

We exploit PAVE to mine specific academic venues and make personalized recommendations for researchers. The model is inspired by the fact that researchers usually desire to keep in contact with suitable academic venues, i.e., identifying high-quality and fruitful academic venues, participating in the academic conferences that are most closely related to their research, and contributing to suitable venues where it is possible for them to publish their research papers and achievements. Additionally, PAVE extends our previous work [9], which proposes a random walk-based academic venue recommendation method and achieves good recommendation results. In this work, we regard the topic distributions of the content of researchers' publications and of venues' publications as feature vectors, which are calculated by an LDA (Latent Dirichlet Allocation) model [6]. We set the K argument of LDA to 10, which means we cluster 10 topics for each venue and researcher. Then, we consider more factors to evaluate the model. Most importantly, the three academic factors we introduce (co-publication frequency, relation weight, and researchers' academic level) aim at improving the recommendations by biasing the random walk, so that it traverses more easily to the positive nodes. However, in order to improve the academic level of researchers, the high academic level of the venues needs to be guaranteed. Therefore, we define a new metric called Ave-Quality to evaluate the academic level of the recommended venues. The detailed process of PAVE is described below, and the structure of the PAVE model is illustrated in Figure 2.

Figure 2: Structure of PAVE.

We model a co-publication network that consists of the author-venue network and the co-author network. As shown in Figure 1, there are two kinds of nodes (venues and researchers) and two kinds of links (co-author relations and author-venue relations). PAVE evolves from the basic RWR model, which has been shown to be suitable for calculating the similarity of nodes in networks. In PAVE, whether a venue should be recommended depends on its importance to the target researcher. The importance is defined by the rank score of the venue, which is determined by two factors: the number of neighbor nodes and the rank scores of incident nodes. The idea resembles PageRank [23], a successful application of RWR, which provides a suitable reference. Equation (1) is similar to PageRank in form.

$$AR_u = \frac{1-\alpha}{N} + \alpha \sum_{v \in I_u} AR_v P_{u,v} \qquad (1)$$

$AR$ represents the rank score vector. $AR_u$ is the rank score (academic level) of node $u$. $I_u$ is the set of nodes incident to node $u$. $P_{u,v}$ is the transition probability from node $v$ to node $u$. $\alpha$ is the damping factor. $N$ is the number of nodes in the network. PAVE computes the node ranking by driving an imaginary walker that walks randomly in the network. At each step the walker has two choices: with probability $\alpha$, it walks to a next node $v$, which is one of $u$'s direct neighbors ($v \in I_u$); or with probability $1-\alpha$, it returns to the source vertex $u$. Equation (1) represents one step to obtain the rank score of node $u$. For all nodes in the whole network, the approach is defined by Equation (2), which is an iterative process.

$$AR^{(t+1)} = \alpha S \cdot AR^{(t)} + (1-\alpha) q \qquad (2)$$

$AR^{(t)}$ is the rank score vector at step $t$. $q$ is a row vector $(q_0, q_1, \ldots, q_u, \ldots, q_n)$. For the target node $u$, $q_u = 1$ and the others equal 0. Note that $AR^{(0)} = q$. $S$ is the transfer matrix, representing the probability for each node to skip to the next node. For the basic RWR model, each cell of matrix $S$ (i.e., $P_{u,v}$ in Equation (1)) is defined as $\frac{1}{L_v}$, in which $L_v$ is the number of node $v$'s neighbors. This means that the walker has the same probability of skipping to each next node. In PAVE, we guide the walker by introducing three academic factors. The change of $P_{u,v}$ enables the walker to skip based on preference, which is shown in Section 4 to be better for academic venue recommendation.
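As an illustration, the iteration in Equation (2) for the basic RWR model can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the toy graph, the function name `random_walk_with_restart`, and the values of `alpha`, `tol`, and `max_iter` are assumptions made for the example, and the transition probability is the uniform 1/L_v of the basic model rather than the biased one used by PAVE.

```python
def random_walk_with_restart(neighbors, source, alpha=0.8, tol=1e-10, max_iter=1000):
    """Iterate AR(t+1) = alpha * S * AR(t) + (1 - alpha) * q (basic RWR).

    neighbors: dict mapping every node to the list of its neighbors
               (undirected graph, so the lists are symmetric).
    The entry of S for a step from v to u is 1 / L_v, L_v = degree of v.
    """
    nodes = list(neighbors)
    # Restart vector q: 1 for the target (source) node, 0 elsewhere.
    q = {u: 1.0 if u == source else 0.0 for u in nodes}
    ar = dict(q)  # AR(0) = q
    for _ in range(max_iter):
        new_ar = {}
        for u in nodes:
            # Score flowing into u from each neighbor v, split evenly by v.
            incoming = sum(ar[v] / len(neighbors[v]) for v in neighbors[u])
            new_ar[u] = alpha * incoming + (1 - alpha) * q[u]
        delta = sum(abs(new_ar[u] - ar[u]) for u in nodes)
        ar = new_ar
        if delta < tol:  # approximate convergence
            break
    return ar


# Hypothetical toy network (not the exact graph of Figure 1).
toy_graph = {
    "Alice": ["Bob", "Cindy", "VenueA", "VenueB"],
    "Bob": ["Alice", "VenueC"],
    "Cindy": ["Alice", "VenueD"],
    "VenueA": ["Alice"],
    "VenueB": ["Alice"],
    "VenueC": ["Bob"],
    "VenueD": ["Cindy"],
}
scores = random_walk_with_restart(toy_graph, "Alice")
```

In this toy run the source node keeps the largest score, and the scores sum to one because each column of the uniform transfer matrix sums to one.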

With reference to Figure 2, the process of PAVE is described in detail as

follows.

• Step 1. The initial input data is a set of publications with author and venue information. PAVE first extracts the co-author relations and author-venue relations, and then generates the co-publication network. There is a link between two authors if they coauthored at least one paper, and a link between a researcher and a venue if the researcher published a paper in the venue.

• Step 2. After initializing the rank scores of nodes and the weights of edges, PAVE runs on the network. During the random walk, the walker skips to the next node with a modified probability that considers the three academic factors. The walk stops when the rank scores approximately converge or the number of iterations reaches the upper limit.

• Step 3. After obtaining the converged rank score of each node, PAVE sorts the venues according to their rank scores. Finally, after removing the venues that the target author has already contacted, the top N venues are recommended to the target author.
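Steps 1 and 3 above can be sketched as follows; Step 2 is the random walk itself. This is a hedged sketch under stated assumptions: the input format (a list of `(authors, venue)` pairs), the function names `build_co_publication_network` and `top_n_new_venues`, and the rank scores in the example are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def build_co_publication_network(publications):
    """Step 1: extract co-author and author-venue relations with counts.

    publications: iterable of (list_of_authors, venue) pairs (assumed format).
    """
    coauthor = Counter()      # (author_a, author_b), sorted -> joint papers
    author_venue = Counter()  # (author, venue) -> papers in that venue
    for authors, venue in publications:
        unique_authors = sorted(set(authors))
        for a in unique_authors:
            author_venue[(a, venue)] += 1
        for a, b in combinations(unique_authors, 2):
            coauthor[(a, b)] += 1
    return coauthor, author_venue


def top_n_new_venues(rank_scores, author_venue, target, n):
    """Step 3: sort venues by rank score, drop those the target already
    published in, and keep the top n."""
    known = {v for (a, v) in author_venue if a == target}
    venues = {v for (_, v) in author_venue}
    candidates = [v for v in venues if v not in known]
    # Highest score first; ties broken by name for a stable order.
    candidates.sort(key=lambda v: (-rank_scores.get(v, 0.0), v))
    return candidates[:n]


# Hypothetical publication records.
publications = [
    (["Alice", "Bob"], "VenueA"),
    (["Alice", "Bob"], "VenueA"),
    (["Alice", "Cindy"], "VenueB"),
    (["Bob"], "VenueC"),
    (["Cindy", "David"], "VenueD"),
]
coauthor, author_venue = build_co_publication_network(publications)

# Made-up converged rank scores standing in for the output of Step 2.
example_scores = {"VenueA": 0.09, "VenueB": 0.05, "VenueC": 0.06, "VenueD": 0.03}
recommended = top_n_new_venues(example_scores, author_venue, "Alice", 2)
```

Keeping the relation counts at this stage is a design choice of the sketch: the same counters can later serve as the co-publication frequencies used to bias the walk.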

We then present details of how the transfer matrix with bias is computed by considering the three academic factors.

3.2. Transfer Matrix with Bias

A random walk in a network is a transition from one node to another. If the walker is at node $u$, the probability that it walks to node $v$ in the next step is determined only by the conditions of node $u$ and node $v$. That is, the probability that the walker walks to node $v$ is independent of the steps before node $u$. Such a process is called a Markov process, so a random walk is in fact a Markov process.

Let $p_{uv}$ represent the probability that the walker walks from node $u$ to node $v$. Then the $p_{uv}$ can be arranged in the following matrix form, which is called the transfer matrix.

$$P = \begin{pmatrix} p_{11} & \cdots & p_{1m} \\ \vdots & \ddots & \vdots \\ p_{m1} & \cdots & p_{mm} \end{pmatrix}$$

Obviously, $0 \le p_{uv} \le 1$ and $\sum_{v=1}^{m} p_{uv} = 1$.

Let $t_u(n)$ be the probability that the walker stops at node $u$ after $n$ steps; $t_u(n)$ is called the $n$-step state probability. Then the state vector is

$$T(n) = (t_1(n), t_2(n), \cdots, t_m(n)) \qquad (3)$$

Clearly, $\sum_{u=1}^{m} t_u(n) = 1$.

According to the total probability formula, we get

$$t_v(n+1) = \sum_{u=1}^{m} t_u(n)\, p_{uv}, \quad n = 0, 1, 2, \cdots \qquad (4)$$

Then we get the general recursive formula

$$T(n) = T(0) P^n \qquad (5)$$
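To make the recursion concrete, the sketch below applies Equation (4) repeatedly, which is equivalent to Equation (5). The three-node transfer matrix and the helper names `step` and `state_after` are assumptions made for this example only.

```python
def step(state, P):
    """One application of Equation (4): t_v(n+1) = sum_u t_u(n) * p_uv."""
    m = len(P)
    return [sum(state[u] * P[u][v] for u in range(m)) for v in range(m)]


def state_after(state0, P, n):
    """State vector T(n) = T(0) P^n, computed by n successive steps."""
    state = list(state0)
    for _ in range(n):
        state = step(state, P)
    return state


# Hypothetical 3-node transfer matrix; every row sums to 1.
P = [
    [0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
]
T0 = [1.0, 0.0, 0.0]          # the walker starts at node 1
T1 = state_after(T0, P, 1)    # distribution after one step
T2 = state_after(T0, P, 2)    # distribution after two steps
```

Each additional step costs one vector-matrix multiplication, which is why the number of steps n drives the running time.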

Equation (5) shows that, to improve the efficiency of the algorithm, we should reduce $n$ so as to cut down the number of multiplications by the transition probability matrix. Whatever the final recommendation ranking of different algorithms is, a transition probability matrix with bias can lead the walker to a suitable venue faster than a matrix without bias. This is because the walker walks purposefully under a transition probability matrix with bias, which reduces the number of steps. Thus both $n$ and the number of multiplications can be reduced, cutting down the running time of the algorithm. That is, we should guide the walker to more appropriate nodes by using a transfer matrix with bias instead of one without bias. Therefore, we use a transfer matrix with bias in PAVE, in which each element represents the transition probability between the two corresponding nodes.

Figure 3: Process of random walk.

Figure 3 shows the process of the random walk with the new transfer matrix. After initializing the weights of nodes and edges, we modify the transfer matrix through the following three steps and obtain the transfer matrix with bias.

• Differentiate the weights of author-venue relations and co-author relations.

• Explore the frequency of interactions among researchers.

• Take the academic level of researchers into consideration.

In the example shown in Figure 1, there are eight academic entities. Suppose we want to recommend venues to Alice, who has never contacted venues C and D. According to the characteristics of the RWR model, the walker can walk from Alice to venues C and D via Bob and Cindy respectively. After several iterations of walking, venues C and D are recommended to Alice based on the sorted rank scores. However, several academic factors can be introduced to better match the real scenario. We exploit three of them to redefine the transfer matrix in RWR.

Generally, researchers prefer contacting the academic entities (researchers and venues) with which they interact frequently, i.e., venues in which they publish often or researchers with whom they collaborate often. As shown in Figure 1, Alice prefers contacting Bob rather than Cindy because Alice collaborated with Bob twice and with Cindy once. Bob therefore seems more important than Cindy to Alice. Furthermore, Alice prefers contacting venue A rather than B, since Alice published two papers in venue A. Based on this assumption, we define co-publication frequency as in Equation (6).

$$F_{u,v} = \begin{cases} CP_{u,v}, & u \in \text{Authors},\ v \in \text{Venues} \\ CT_{u,v}, & u, v \in \text{Authors} \end{cases} \qquad (6)$$

where $CP_{u,v}$ is the count of author $u$'s publications in venue $v$, and $CT_{u,v}$ is the number of times author $u$ has collaborated with author $v$.

In addition, there are two kinds of associations in co-publication networks, i.e., co-author relations and author-venue relations. The basic random walk model ignores the difference between these two relations. Author-venue relations seem to be more important than co-author relations, because the event of publishing a paper in a venue is more informative when profiling a researcher's interest. This proposition is supported by subsequent experiments, in which it leads to better performance when making academic recommendations. We measure the relation weight using Equation (7), based on a ratio $\beta$.

$$W_{u,v} = \beta F_{u,v} \qquad (7)$$

The ratio $\beta$ is an empirical value used to regulate the relative importance of author-venue relations and co-author relations. The issue is how to set $\beta$ for each of these two kinds of relations to achieve the best recommendation performance. We conducted a number of experiments to determine $\beta$. In PAVE, $\beta$ is set to 20 for author-venue relations and 1 for co-author relations, which supports our hypothesis that author-venue relations are more important than co-author relations when profiling researchers' interest.

Finally, we propose an assumption: the interest features of academic entities are more accurately reflected by neighbors of a similar level. Researchers are more likely to contact other researchers with similar academic levels and to publish papers in venues that are most likely to accept their papers. In other words, relations between similar-level academic entities carry more weight, and the walker in PAVE should move along these nodes with higher probability. To measure the level similarity of academic entities, we define a simple metric, shown in Equation (8).

$$LevSim_{u,v} = 1 - \frac{\lVert AR_u - AR_v \rVert}{\max_{x \in N_u}\left(\lVert AR_u - AR_x \rVert\right)} \tag{8}$$

Here $N_u$ is the set of neighbors of node $u$. Equation (8) aims to identify the neighbors with the smallest rank score disparity, using a normalization over $u$'s neighborhood. If node $v$ shows the maximal gap with node $u$ compared with $u$'s other neighbors, $LevSim_{u,v}$ will be zero. When computing the transfer probability $S_{u,v}$ from node $u$ to node $v$, the PAVE model adopts Equation (9), so that the walker runs on the network with a modified bias.

$$S_{u,v} = \frac{W_{u,v}}{\sum_{x \in N_u} W_{u,x}} \, LevSim_{u,v} \tag{9}$$
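To make Equations (7) to (9) concrete, here is a minimal sketch that evaluates the biased transfer probabilities for a single source node. The function names, the handling of the degenerate case where all neighbours share the same level, and the sample frequencies and AR values are assumptions for illustration only; the β values 20 and 1 are those reported in the text.

```python
def lev_sim(u, v, neighbors, ar):
    """Equation (8): 1 - |AR_u - AR_v| / max over neighbours x of |AR_u - AR_x|."""
    max_gap = max(abs(ar[u] - ar[x]) for x in neighbors[u])
    if max_gap == 0:
        return 1.0  # assumption: every neighbour is at the same level as u
    return 1.0 - abs(ar[u] - ar[v]) / max_gap


def transfer_probability(u, v, neighbors, freq, ar, venues,
                         beta_venue=20.0, beta_author=1.0):
    """Equation (9): S_uv = W_uv / sum_x W_ux * LevSim_uv, with W = beta * F."""
    def weight(x):
        # u is assumed to be an author here, so the relation type depends on x.
        beta = beta_venue if x in venues else beta_author
        return beta * freq[(u, x)]

    total = sum(weight(x) for x in neighbors[u])
    return weight(v) / total * lev_sim(u, v, neighbors, ar)


# Hypothetical inputs: frequencies follow the Alice example, AR values are made up.
neighbors = {"Alice": ["Bob", "Cindy", "A", "B"]}
freq = {("Alice", "Bob"): 2, ("Alice", "Cindy"): 1,
        ("Alice", "A"): 2, ("Alice", "B"): 1}
ar = {"Alice": 5.0, "Bob": 4.0, "Cindy": 1.0, "A": 6.0, "B": 9.0}
venues = {"A", "B"}

S = {v: transfer_probability("Alice", v, neighbors, freq, ar, venues)
     for v in neighbors["Alice"]}
print(S)
```

With these made-up values, Cindy and venue B have the largest level gap to Alice and therefore receive probability zero, while venue A dominates because of both its frequency and the author-venue weight. Note that the values produced by Equation (9) as written do not necessarily sum to one over the neighbours; the sketch does not renormalize them.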

4. Performance Evaluation

We conducted extensive experiments using data from DBLP [18], a computer science bibliography website hosted at the University of Trier in Germany. In this section, we describe the statistics of the data set, the evaluation metrics, and our experimental procedure for evaluating the performance of PAVE, followed by a detailed analysis of the results.

4.1. Experimental Settings

To measure the performance of PAVE, we implement three comparison approaches, i.e., the basic RWR model, a Topic-based model and a Friends-based model. The detailed settings are as follows. (1) RWR is a popular model widely used in recommender systems. It is set up and evaluated in the same way as PAVE, except that it does not use the transfer matrix with bias: in RWR, the probabilities of moving to each neighbor node are equal. (2) The Topic-based method is a content-based recommendation approach in the strict sense, and a well-known approach for content-based recommender systems. The core of the approach is to compute the similarity between researchers and venues. In this implementation, we regard the topic distributions of researchers' publication content and venues' publication content as feature vectors, which are calculated by an LDA model [6]. We set the K argument in LDA to 10, which means that each venue and each researcher is represented over 10 topics. The similarity of researchers and venues is defined as the cosine similarity of these feature vectors. (3) The Friends-based model is a neighborhood-based recommendation approach of the kind widely used in social network-based recommendation. The basic idea of the Friends-based model is to recommend venues according to the number of neighbors who have relations with the venues. In this implementation, we treat the researcher's collaborators and "collaborators of collaborators" as neighbors. If many neighbors contact a venue, the venue should be recommended to the researcher.
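The Friends-based baseline described above can be sketched as a simple neighbour count. The data layout, the function name `friends_based`, and the sample data are hypothetical; the text does not say whether venues the researcher has already contacted are excluded, so this sketch keeps them.

```python
from collections import Counter


def friends_based(target, coauthors, venues_of, top_n=3):
    """Rank venues by how many 1-hop and 2-hop collaborators are linked to them.

    coauthors: dict mapping a researcher to an iterable of collaborators.
    venues_of: dict mapping a researcher to the set of venues they published in.
    """
    first = set(coauthors.get(target, ()))
    second = {c2 for c in first for c2 in coauthors.get(c, ())}
    neighbours = (first | second) - {target}
    counts = Counter()
    for n in neighbours:
        for venue in venues_of.get(n, ()):
            counts[venue] += 1
    return counts.most_common(top_n)


# Hypothetical data: Dave is a collaborator of Alice's collaborator Bob.
coauthors = {
    "Alice": ["Bob", "Cindy"],
    "Bob": ["Alice", "Dave"],
    "Cindy": ["Alice"],
    "Dave": ["Bob"],
}
venues_of = {"Bob": {"C"}, "Cindy": {"C", "D"}, "Dave": {"C"}}

ranking = friends_based("Alice", coauthors, venues_of)
print(ranking)  # [('C', 3), ('D', 1)]
```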

4.2. Data Set

DBLP indexes more than 3.35 million articles in computer science. In these experiments, the large scale of the data makes it time-consuming to process the data and run the PAVE model. To reduce training time, we use a subset of DBLP. This subset covers the field of data mining, involving 74 venues (36 journals and 38 conferences) and 70,326 researchers altogether. Researchers and venues are connected by 163,446 articles in this co-publication network. Covering most high-quality journals and conferences in the data mining area, the subset has been used by other related studies with no subjective bias [36]. The statistics pertaining to the data set are shown in Table 1. The data set is divided into two parts: the data before 2011 are chosen as the training set, and the rest as the test set.

Figure 4: Detailed statistics of the data set from DBLP.

Table 1: Statistics of Data Set from DBLP.

Statistics   Venues   Researchers   Articles
Number       74       70,326        163,446

The detailed statistical characteristics of this co-publication network are shown in Figure 4. Figure 4(a) describes the number of participants or contributors for each venue. Almost half of the venues have no more than 500 researchers, while 11 venues are so large that up to 3,000 researchers publish papers in them. We can also observe from Figure 4(b) that 94.09% of the 70,326 researchers contact no more than 3 venues (77.67% for 1 venue, 11.88% for 2 venues and 4.54% for 3 venues). However, there are also some excellent researchers with a high academic level (accounting for 0.13%) who contribute to more than 14 venues. Similarly, Figure 4(c) shows the same trend for the number of researchers' publications. Most researchers published no more than five papers, but 1.64% of researchers published more than 14 papers. Figure 4(d) shows the number of co-authors of each researcher. In general, the distributions in Figure 4(b), Figure 4(c), and Figure 4(d) are consistent with a long-tail distribution, which corresponds to the fact that a small number of researchers contribute most of the output or have the most co-authors. We can conclude that the degrees (numbers of neighbors) of most researchers are under 14, which indicates that this data set is very sparse.

All experiments were performed on a 64-bit Linux-based operating system, Ubuntu 12.04, with a 4-duo 3.2 GHz Intel CPU and 8 GB of memory, and were implemented in Python.

4.3. Metrics

In our previous work [9], we employed three popular metrics [28], namely precision, recall and F1 score, to evaluate recommendation performance. In this work, we propose a new metric, Ave-Quality, to additionally evaluate the quality of the recommended venues. For academic recommendation, the output is usually a recommendation list, and there is also an accepted list for the target node. We can therefore divide the result data into three parts, as follows.

• A: The recommended and collaborated nodes;

• B: The recommended and not collaborated nodes;

• C: The collaborated and not recommended nodes.

Using A, B and C to denote the sizes of these three parts, precision is defined as:

$$P = \frac{A}{A + B} \tag{10}$$

Recall is defined as:

$$R = \frac{A}{A + C} \tag{11}$$

To obtain a single metric that integrates precision and recall, we use the F1 score, usually called F1, which is defined as:

$$F1 = \frac{2(P \cdot R)}{P + R} \tag{12}$$
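Equations (10) to (12) translate directly into set operations on the recommendation list and the accepted list. The following is a minimal sketch; the function name, the zero-division fallbacks, and the sample lists are assumptions added for illustration.

```python
def precision_recall_f1(recommended, accepted):
    """Compute P, R and F1 from a recommendation list and an accepted list."""
    recommended, accepted = set(recommended), set(accepted)
    a = len(recommended & accepted)  # recommended and accepted
    b = len(recommended - accepted)  # recommended but not accepted
    c = len(accepted - recommended)  # accepted but not recommended
    p = a / (a + b) if a + b else 0.0
    r = a / (a + c) if a + c else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


# Hypothetical example: three venues recommended, two venues actually accepted.
p, r, f1 = precision_recall_f1(["A", "C", "D"], ["C", "E"])
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.333 0.5 0.4
```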

In recommender systems, the quality of recommended items is of great concern: the higher the quality of the recommended items, the better the recommendation performs. It is worth noting that a recommendation could still be of high quality even if the author's paper was rejected. However, such data can hardly be obtained, which makes it difficult to consider rejected papers. As a consequence, in this work we regard a recommendation as high quality when the author's paper was accepted for publication in the test set. To evaluate the quality of the recommended venues generated by PAVE, we propose a metric, Ave-Quality, based on Google's h5-index¹. The h5-index is a well-known and authoritative metric that represents a venue's academic level. A venue has an h5-index of h if it has published h papers in the most recent 5 years, each of which has been cited at least h times. The formal definition is shown in Equation (13), where V is the set of recommended items, M is the length of the recommendation list, and $H5_v$ is the h5-index of venue v. If the average h5-index of the recommended venues is high, PAVE performs well in recommending high-quality venues.

$$\text{Ave-Quality} = \frac{\sum_{v \in V} H5_v}{M} \tag{13}$$

¹ https://scholar.google.com/intl/en/scholar/metrics.html#metrics
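Equation (13) is a plain average of h5-index values over the recommendation list. The sketch below shows the computation; the function name and the h5-index numbers are made up for illustration and are not real Google Scholar values.

```python
def ave_quality(recommended, h5_index):
    """Average h5-index of the recommended venues (Equation (13))."""
    m = len(recommended)
    if m == 0:
        return 0.0  # assumption: an empty list scores zero
    return sum(h5_index[v] for v in recommended) / m


# Hypothetical h5-index values for three venues.
h5_index = {"A": 30, "C": 60, "D": 45}
print(ave_quality(["A", "C", "D"], h5_index))  # 45.0
```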

In this work, we use these four metrics to evaluate the performance of PAVE.

4.4. Results and Analysis

In this section, we first run several experiments for PAVE, basic RWR, and the topic-based and friends-based recommendation models on the data set discussed above. We randomly choose 100 researchers as target nodes, run PAVE for each target node, and then average the metric values over the 100 runs. We repeat these experiments with recommendation lists of different lengths to evaluate the influence of the list length on the results. Additionally, PAVE and RWR are run with α set to 0.8, which the experiments reported later in this section show to be appropriate.
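The evaluation protocol described above (sample target researchers, recommend lists of several lengths, average the metrics) can be outlined as follows. This is only a sketch under assumptions: `evaluate`, the `recommend` callback, the fixed random seed, and the stub data are all hypothetical, and the actual recommenders are not reproduced here.

```python
import random
from statistics import mean


def evaluate(recommend, test_venues, researchers, lengths, sample_size=100, seed=0):
    """Average precision, recall and F1 over randomly sampled target researchers.

    recommend(researcher, n) returns a list of up to n venues;
    test_venues[researcher] is the set of venues the researcher used in the test set.
    """
    rng = random.Random(seed)
    targets = rng.sample(researchers, min(sample_size, len(researchers)))
    results = {}
    for n in lengths:
        rows = []
        for t in targets:
            rec = set(recommend(t, n))
            acc = test_venues[t]
            hits = len(rec & acc)
            p = hits / len(rec) if rec else 0.0
            r = hits / len(acc) if acc else 0.0
            f1 = 2 * p * r / (p + r) if p + r else 0.0
            rows.append((p, r, f1))
        # Column-wise mean gives the averaged (precision, recall, F1) for length n.
        results[n] = tuple(mean(col) for col in zip(*rows))
    return results


# Stub recommender and test data, purely for demonstration.
names = ["r1", "r2", "r3", "r4", "r5"]
stub = lambda researcher, n: ["A", "B", "C"][:n]
truth = {name: {"A"} for name in names}
res = evaluate(stub, truth, names, lengths=[1, 2], sample_size=3)
print(res)
```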

Figure 5: Performance of PAVE, basic RWR, topic-based and friends-based recommendation model.

Figure 6: Impact of researchers’ publications number (PN) on PAVE.

Figure 7: Impact of damping coefficient (Dm) on PAVE.

In recommendation models, higher efficiency generally means higher recommendation accuracy with a shorter recommendation list. Figure 5 shows the performance of PAVE, basic RWR, and the topic-based and friends-based recommendation models. The x axis represents the length of the recommendation list, which is in the range of 1-25. The y axis represents precision, recall and F1 score, respectively. In Figure 5(a), the precision of the topic-based model decreases as the recommendation list grows, and the other three models decline with fluctuation as the list grows. The topic-based and friends-based recommendation models perform better in precision only when the length of the recommendation list is 1, whereas PAVE and basic RWR perform better in precision overall. Looking closely at the range 1 to 11 on the x axis, PAVE achieves higher precision, reaching a peak value of 8.7% when recommending 3 venues. As the recommendation list grows, the performance of the four recommendation approaches tends to become similar. In Figure 5(b), recall rises for all models. PAVE and basic RWR show no significant difference, but their recall is better than that of the topic-based and friends-based approaches. As the number of recommended venues approaches the total number of venues, recall approaches 1. According to Figure 5(c), the F1 score shows a trend similar to precision. The F1 score of PAVE reaches its highest value of 12.95% when recommending 9 venues for each researcher. The upgrade rate, $\frac{F1(\mathrm{PAVE}) - F1(\mathrm{RWR})}{F1(\mathrm{RWR})}$, is 11.3% in comparison with basic RWR. It is worth mentioning that PAVE reaches its peak at point 9, while basic RWR achieves its highest F1 score at point 11. This means that the recommendation efficiency of PAVE is higher.

These experimental results demonstrate that the RWR-based models can achieve more accurate academic venue recommendation than the topic-based and friends-based approaches. Furthermore, our transfer matrix with bias improves the performance of PAVE and makes the recommendation more efficient. Compared with RWR, the proposed transfer matrix with bias in PAVE makes it possible for the walker to move along preferred paths rapidly and precisely. Based on the analysis of the experimental data and the theory of the PAVE model, we conclude that the PAVE model does improve recommendation accuracy and that the modification of the transfer matrix with bias is appropriate.

We also ran several additional experiments to measure the performance of PAVE on different groups of researchers. We mainly focused on differences in researchers' academic level, as reflected by the number of publications. To some extent, the number of publications reflects a researcher's contributions and activeness. Generally, in the computer science domain, junior researchers show a lower academic level with few publications, while senior professors show a higher academic level with many high-quality publications. We divide the researchers into three sets: (1) C1 contains researchers with 2 to 8 publications. Researchers with only one publication are ignored, to ensure that each target researcher can appear in both the training and test sets; (2) C2 contains researchers with 8 to 15 publications; (3) C3 contains researchers with more than 15 publications. The experimental results are shown in Figure 6.

From Figure 6, we can see significant differences in the effect on different sets of researchers, even though similar trends are shown in precision, recall, and F1 score. In Figure 6(c), PAVE achieves the highest F1 score of 16.37% at point 5 when recommending academic venues for researchers with 2 to 8 publications. The results indicate that PAVE performs better at recommending academic venues for researchers with fewer publications, i.e., junior researchers, which matches our aim of recommending academic venues to support more effective research and collaboration.

We also conducted experiments to show the impact of the damping coefficient on PAVE, as shown in Figure 7. Since the damping coefficient lies between 0 and 1, we tested four values: 0.2, 0.4, 0.6 and 0.8. The metrics precision, recall, and F1 again show similar trends. From Figure 7(a), we can see that precision reaches its highest value of 8.7% when the damping coefficient is 0.8. Recall shows an upward trend and is also higher with a damping coefficient of 0.8. Similar to precision, F1 reaches higher values when the damping coefficient is 0.8, as shown in Figure 7(c). Overall, PAVE shows the best performance when the damping coefficient is 0.8.

Figure 8: The New Metric Ave-Quality.

Furthermore, we explore the performance of the four models on Ave-Quality, with α set to 0.8. In Figure 8, we can see that PAVE shows the best performance on Ave-Quality. In other words, PAVE recommends venues of a higher academic level to researchers than the other models. Ave-Quality reaches its peak when the recommendation list has length 3. As the recommendation list grows, Ave-Quality shows a downward trend, but PAVE remains better than the others. This phenomenon is consistent with the theory that a random walk model with the biased transfer matrix can identify high-level nodes, which means that the three academic factors we exploit lead the rank value to transfer along high-level nodes. Therefore, the PAVE model can rank high-level nodes at the top of the recommendation list and ultimately improve the quality of the recommended venues. In conclusion, PAVE shows better performance than the other baseline methods.

5. Conclusion

In this paper, we have focused on academic venue recommendation for researchers based on big scholarly data, which is necessary in current academia. To this end, we have proposed a novel academic venue recommendation model called PAVE, which exploits three academic factors (i.e., co-publication frequency, relation weight and researchers' academic level) to define a transfer matrix with bias that drives a random walk with restart model running on co-publication networks. We conducted extensive experiments on a subset of the DBLP data set to evaluate the performance of PAVE in comparison with other state-of-the-art approaches: basic RWR, topic-based approaches, and friends-based approaches. The experimental results show that PAVE outperforms the other approaches in terms of precision, recall, F1 score, and Ave-Quality. According to the extended experiment, PAVE performs better at recommending academic venues for researchers with fewer publications, i.e., junior researchers.

Nonetheless, there is still much work for future study in this direction. We only exploit three academic factors in co-publication networks. There are many other features, such as citation relations, that need to be explored in PAVE. As future work, more experiments will be performed on other academic data sets.

Acknowledgments

The authors extend their appreciation to the International Scientific Partnership Program ISPP at King Saud University for funding this research work through ISPP#0078.

References

[1] Adomavicius, G., Tuzhilin, A. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering 2005;17(6):734–749.

[2] Backstrom, L., Leskovec, J. Supervised random walks: predicting and recommending links in social networks. In: Proceedings of the 4th ACM international conference on Web search and data mining. ACM; 2011. p. 635–644.

[3] Balog, K., Ramampiaro, H., Takhirov, N., Nørvåg, K. Multi-step classification approaches to cumulative citation recommendation. In: Proceedings of the 10th Conference on Open Research Areas in Information Retrieval. 2013. p. 121–128.

[4] Beel, J., Gipp, B., Langer, S., Breitinger, C. Research-paper recommender systems: a literature survey. International Journal on Digital Libraries 2016;17(4):305–338.

[5] Beierle, F., Tan, J., Grunert, K. Analyzing social relations for recommending academic conferences. In: Proceedings of the 8th ACM International Workshop on Hot Topics in Planet-scale mObile computing and online Social neTworking. ACM; 2016. p. 37–42.

[6] Blei, D.M., Ng, A.Y., Jordan, M.I. Latent dirichlet allocation. The Journal of Machine Learning Research 2003;3:993–1022.

[7] Caragea, C., Silvescu, A., Mitra, P., Giles, C.L. Can't see the forest for the trees?: a citation recommendation system. In: Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries. ACM; 2013. p. 111–114.

[8] Chen, J., Chen, G., Zhang, H., Huang, J., Zhao, G. Social recommendation based on multi-relational analysis. In: IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology. IEEE; volume 2; 2012. p. 471–477.

[9] Chen, Z., Xia, F., Jiang, H., Liu, H., Zhang, J. Aver: Random walk based academic venue recommendation. In: Proceedings of the 24th International Conference on World Wide Web Companion. WWW; 2015. p. 579–584.

[10] Cohen, S., Ebel, L. Recommending collaborators using keywords. In: Proceedings of the 22nd international conference on World Wide Web companion. WWW; 2013. p. 959–962.

[11] Dhanda, M., Verma, V. Recommender system for academic literature with incremental dataset. Procedia Computer Science 2016;89:483–491.

[12] Fouss, F., Pirotte, A., Renders, J.M., Saerens, M. Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Transactions on Knowledge and Data Engineering 2007;19(3):355–369.

[13] He, Q., Pei, J., Kifer, D., Mitra, P., Giles, L. Context-aware citation recommendation. In: Proceedings of the 19th international conference on World wide web. ACM; 2010. p. 421–430.

[14] Hornick, M.F., Tamayo, P. Extending recommender systems for disjoint user/item sets: The conference recommendation problem. IEEE Transactions on Knowledge and Data Engineering 2012;24(8):1478–1490.

[15] Huynh, T., Hoang, K. Modeling collaborative knowledge of publishing activities for research recommendation. In: International Conference on Computational Collective Intelligence Technologies and Applications. Springer; 2012. p. 41–50.

[16] Lee, D.H., Brusilovsky, P., Schleyer, T. Recommending collaborators using social features and mesh terms. Proceedings of the American Society for Information Science and Technology 2011;48(1):1–10.

[17] Lemarchand, G.A. The long-term dynamics of co-authorship scientific networks: Iberoamerican countries (1973–2010). Research Policy 2012;41(2):291–305.

[18] Ley, M. Dblp: some lessons learned. Proceedings of the VLDB Endowment 2009;2(2):1493–1500.

[19] Li, L., Chu, W., Langford, J., Wang, X. Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms. In: Proceedings of the fourth ACM international conference on Web search and data mining. ACM; 2011. p. 297–306.

[20] Liang, D., Charlin, L., McInerney, J., Blei, D.M. Modeling user exposure in recommendation. In: Proceedings of the 25th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee; 2016. p. 951–961.

[21] Lopes, G.R., Moro, M.M., Wives, L.K., De Oliveira, J.P.M. Collaboration recommendation on academic social networks. In: Advances in Conceptual Modeling–Applications and Challenges. Springer; 2010. p. 190–199.

[22] Luong, H., Huynh, T., Gauch, S., Do, L., Hoang, K. Publication venue recommendation using author network's publication history. In: Intelligent Information and Database Systems. Springer; 2012. p. 426–435.

[23] Page, L., Brin, S., Motwani, R., Winograd, T. The pagerank citation ranking: bringing order to the web. Stanford Digital Libraries Working Paper 1999;9(1):1–14.

[24] Pan, C., Li, W. Research paper recommendation with topic analysis. In: 2010 International Conference on Computer Design and Applications. IEEE; volume 4; 2010. p. V4–264–V4–268.

[25] Pham, M.C., Cao, Y., Klamma, R., Jarke, M. A clustering approach for collaborative filtering recommendation using social network analysis. J UCS 2011;17(4):583–604.

[26] Pham, M.C., Kovachev, D., Cao, Y., Mbogos, G.M., Klamma, R. Enhancing academic event participation with context-aware and social recommendations. In: IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. IEEE Computer Society; 2012. p. 464–471.

[27] Rohani, V.A., Kasirun, Z.M., Kumar, S., Shamshirband, S. An effective recommender algorithm for cold start problem in academic social networks. Mathematical Problems in Engineering 2014;2014(2):505–519.

[28] Shani, G., Gunawardana, A. Evaluating recommendation systems. In: Recommender systems handbook. Springer; 2011. p. 257–297.

[29] Stokes, J., Weber, S. A markov chain model for the search time for max degree nodes in a graph using a biased random walk. In: 2016 Annual Conference on Information Science and Systems (CISS). IEEE; 2016. p. 448–453.

[30] Sugiyama, K., Kan, M.Y. Scholarly paper recommendation via user's recent research interests. In: Proceedings of the 10th annual joint conference on Digital libraries. ACM; 2010. p. 29–38.

[31] Sugiyama, K., Kan, M.Y. Towards higher relevance and serendipity in scholarly paper recommendation. ACM SIGWEB Newsletter 2015;(Winter):4.

[32] Tang, J., Wu, S., Sun, J., Su, H. Cross-domain collaboration recommendation. In: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2012. p. 1285–1293.

[33] West, J.D., Wesley-Smith, I., Bergstrom, C.T. A recommendation system based on hierarchical clustering of an article-level citation network. IEEE Transactions on Big Data 2016;2:113–123.

[34] Wongchokprasitti, C., Brusilovsky, P., Parra-Santander, D. Conference navigator 2.0: community-based recommendation for academic conferences. In: Workshop on Social Reminder Systems. ACM; 2010.

[35] Xia, F., Asabere, N.Y., Rodrigues, J.J., Basso, F., Deonauth, N., Wang, W. Socially-aware venue recommendation for conference participants. In: IEEE International Conference on Ubiquitous Intelligence and Computing. IEEE; 2013. p. 134–141.

[36] Xia, F., Chen, Z., Wang, W., Li, J., Yang, L.T. Mvcwalker: Random walk-based most valuable collaborators recommendation exploiting academic factors. IEEE Transactions on Emerging Topics in Computing 2014;2(3):364–375.

[37] Xia, F., Liu, H., Lee, I., Cao, L. Scientific article recommendation: Exploiting common author relations and historical preferences. IEEE Transactions on Big Data 2016;2:101–112.

[38] Xia, F., Wang, W., Bekele, T.M., Liu, H. Big scholarly data: A survey. IEEE Transactions on Big Data 2017;3(1):18–35.

[39] Yang, Z., Davison, B.D. Distinguishing venues by writing styles. In: Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries. ACM; 2012. p. 371–372.

[40] Yang, Z., Davison, B.D. Venue recommendation: Submitting your paper with style. In: International Conference on Machine Learning and Applications. IEEE; volume 1; 2012. p. 681–686.

[41] Yang, Z., Yin, D., Davison, B.D. Recommendation in academia: A joint multi-relational model. In: IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. IEEE; 2014. p. 566–571.

[42] Yaw Asabere, N., Xia, F., Wang, W., Rodrigues, J.J., Basso, F., Ma, J. Improving smart conference participation through socially aware recommendation. IEEE Transactions on Human-Machine Systems 2014;44(5):689–700.

[43] Yu, J., Xie, K., Zhao, H., Liu, F. Prediction of user interest based on collaborative filtering for personalized academic recommendation. In: 2nd International Conference on Computer Science and Network Technology. IEEE; 2012. p. 584–588.

PUB-VEN: a personalized recommendation system for suggesting publication venues

Article

Full-text available

Oct 2023
MULTIMED TOOLS APPL

Researchers would like to publish their research articles in reputed journals along with quick review time. However, with the growing number of academic publications, it is becoming more difficult for scholars to find venues that are relevant to their domain. This study aims on the development of a technique that focuses on the priorities of the researchers that are linked to the recommendation of suitable suggestion of publication journal. The developed Recommendation System (RS) takes title, abstract, and keyword of the manuscript to be submitted. The proposed algorithm, named PUB-VEN which is hybridization of Content-Based Filtering (CBF), and Collaborative Filtering (CF), which is integrated with the Multi-Criteria Decision Making (MCDM) process to provide suitable journal recommendations by considering the researcher's point of view about different attributes gathered such as impact factor, eigen factor, average review time, etc. which affect the research process effectively. Our results demonstrate that the PUB-VEN provides better recommendations in comparison with state-of-the-art algorithms such as Term Frequency and Inverse Document Frequency (TF-IDF) and Latent Semantic Analysis (LSA). The study concluded that PUB-VEN is providing better precision, recall, F1 Score, Discounted Cumulative Gain (DCG), and Normalized DCG (NCDG). For precision, the gain ranges from 1% to 16%, the improvement in recall is between 33% and 3%, the betterment of result in F1 is by the ratio which ranges from 27% and 2%, the improvement in the result of DCG lies between 15% and 5% and the result of NDCG gain ranges from 6% to 1%. It is useful for the researchers in finding suitable venue for publication.

Understanding user intent modeling for conversational recommender systems: a systematic literature review

Article

Full-text available

Jun 2024
USER MODEL USER-ADAP

User intent modeling in natural language processing deciphers user requests to allow for personalized responses. The substantial volume of research (exceeding 13,000 publications in the last decade) underscores the significance of understanding prevalent models in AI systems, with a focus on conversational recommender systems. We conducted a systematic literature review to identify models frequently employed for intent modeling in conversational recommender systems. From the collected data, we developed a decision model to assist researchers in selecting the most suitable models for their systems. Furthermore, we conducted two case studies to assess the utility of our proposed decision model in guiding research modelers in selecting user intent modeling models for developing their conversational recommender systems. Our study analyzed 59 distinct models and identified 74 commonly used features. We provided insights into potential model combinations, trends in model selection, quality concerns, evaluation measures, and frequently used datasets for training and evaluating these models. The study offers practical insights into the domain of user intent modeling, specifically enhancing the development of conversational recommender systems. The introduced decision model provides a structured framework, enabling researchers to navigate the selection of the most apt intent modeling methods for conversational recommender systems.

A deep learning approach to enhance accuracy and diversity of recommendation for interdisciplinary journals

Preprint

Full-text available

May 2024

To meet scholars' need to recommend both higher accuracy and diversity when submitting interdisciplinary papers, this paper proposes an improved journal diversity recommendation method based on the attention mechanism in deep learning. This method can retain all key information in long texts by using the attention mechanism. It identifies and stores the research directions and hotspots covered in different papers across journals to extract common research topics for each journal type. Five deep learning models based on attention mechanism are introduced, 104,176 paper abstracts from 111 Web of Science journals are used to fine-tune the models. After learning on training set and model testing on the test set, recommendation accuracy and diversity results are calculated for 9 categories. Finally, the recommendation accuracy and diversity of the 5 attention mechanism based deep learning models are compared with benchmark models across different journal types. The experimental results demonstrate the feasibility and superiority of this method comprehensively considering the metrics of accuracy and diversity at a large scale. It provides theoretical and practical advancements to develop an effective journal recommender system which helps scholars to make wise decision for journal submission.

Understanding User Intent Modeling for Conversational Recommender Systems: A Systematic Literature Review

Preprint

Full-text available

Aug 2023

Context: User intent modeling is a crucial process in Natural Language Processing that aims to identify the underlying purpose behind a user’s request, enabling personalized responses. With a vast array of approaches introduced in the literature (over 13,000 papers in the last decade), understanding the related concepts and commonly used models in AI-based systems is essential. Method: We conducted a systematic literature review to gather data on models typically employed in designing conversational recommender systems. From the collected data, we developed a decision model to assist researchers in selecting the most suitable models for their systems. Additionally, we performed two case studies to evaluate the effectiveness of our proposed decision model. Results: Our study analyzed 59 distinct models and identified 74 commonly used features. We provided insights into potential model combinations, trends in model selection, quality concerns, evaluation measures, and frequently used datasets for training and evaluating these models. Contribution: Our study contributes practical insights and a comprehensive understanding of user intent modeling, empowering the development of more effective and personalized conversational recommender systems. With the Conversational Recommender System, researchers can perform a more systematic and efficient assessment of fitting intent modeling frameworks.

Comparing different search methods for the open access journal recommendation tool B!SON

Article

Jul 2023
Int J Digit Libr

Finding a suitable open access journal to publish academic work is a complex task: Researchers have to navigate a constantly growing number of journals, institutional agreements with publishers, funders’ conditions and the risk of predatory publishers. To help with these challenges, we introduce a web-based journal recommendation system called B!SON. A systematic requirements analysis was conducted in the form of a survey. The developed tool suggests open access journals based on title, abstract and references provided by the user. The recommendations are built on open data, publisher-independent and work across domains and languages. Transparency is provided by its open source nature, an open application programming interface (API) and by specifying which matches the shown recommendations are based on. The recommendation quality has been evaluated using two different evaluation techniques, including several new recommendation methods. We were able to improve the results from our previous paper with a pre-trained transformer model. The beta version of the tool received positive feedback from the community and in several test sessions. We developed a recommendation system for open access journals to help researchers find a suitable journal. The open tool has been extensively tested, and we found possible improvements for our current recommendation technique. Development by two German academic libraries ensures the longevity and sustainability of the system.

Effective community detection with topic modeling in article recommender systems using LS-SLM and PCC-LDA

Article

Mar 2024
J INTELL FUZZY SYST

This paper introduces an innovative approach, the LS-SLM (Local Search with Smart Local Moving) technique, for enhancing the efficiency of article recommendation systems based on community detection and topic modeling. The methodology undergoes rigorous evaluation using a comprehensive dataset extracted from the “dblp.v12.json” citation network. Experimental results presented herein provide a clear depiction of the superior performance of the LS-SLM technique when compared to established algorithms, namely the Louvain Algorithm (LA), Stochastic Block Model (SBM), Fast Greedy Algorithm (FGA), and Smart Local Moving (SLM). The evaluation metrics include accuracy, precision, specificity, recall, F-Score, modularity, Normalized Mutual Information (NMI), betweenness centrality (BTC), and community detection time. Notably, the LS-SLM technique outperforms existing solutions across all metrics. For instance, the proposed methodology achieves an accuracy of 96.32%, surpassing LA by 16% and demonstrating a 10.6% improvement over SBM. Precision, a critical measure of relevance, stands at 96.32%, showcasing a significant advancement over GCR-GAN (61.7%) and CR-HBNE (45.9%). Additionally, sensitivity analysis reveals that the LS-SLM technique achieves the highest sensitivity value of 96.5487%, outperforming LA by 14.2%. The LS-SLM also demonstrates superior specificity and recall, with values of 96.5478% and 96.5487%, respectively. The modularity performance is exceptional, with LS-SLM obtaining 95.6119%, significantly outpacing SLM, FGA, SBM, and LA. Furthermore, the LS-SLM technique excels in community detection time, completing the process in 38,652 ms, showcasing efficiency gains over existing techniques. The BTC analysis indicates that LS-SLM achieves a value of 94.6650%, demonstrating its proficiency in controlling information flow within the network.

A Comprehensive Survey on Deep Graph Representation Learning

Article

Feb 2024
NEURAL NETWORKS

Scholarly recommendation systems: a literature survey

Article

Jun 2023
KNOWL INF SYST

A scholarly recommendation system is an important tool for identifying prior and related resources such as literature, datasets, grants, and collaborators. A well-designed scholarly recommender significantly saves the time of researchers and can provide information that would not otherwise be considered. The usefulness of scholarly recommendations, especially literature recommendations, has been established by the widespread acceptance of web search engines such as CiteSeerX, Google Scholar, and Semantic Scholar. This article discusses different aspects and developments of scholarly recommendation systems. We searched the ACM Digital Library, DBLP, IEEE Explorer, and Scopus for publications in the domain of scholarly recommendations for literature, collaborators, reviewers, conferences and journals, datasets, and grant funding. In total, 225 publications were identified in these areas. We discuss methodologies used to develop scholarly recommender systems. Content-based filtering is the most commonly applied technique, whereas collaborative filtering is more popular among conference recommenders. The implementation of deep learning algorithms in scholarly recommendation systems is rare among the screened publications. We found fewer publications in the areas of the dataset and grant funding recommenders than in other areas. Furthermore, studies analyzing users’ feedback to improve scholarly recommendation systems are rare for recommenders. This survey provides background knowledge regarding existing research on scholarly recommenders and aids in developing future recommendation systems in this domain.

A Survey on Recommendation System for Future Researchers Using Classifiers

Conference Paper

Apr 2023

Author-Profile-Based Journal Recommendation for a Candidate Article: Using Hybrid Semantic Similarity and Trend Analysis

Article

Jan 2023

Finding the right journal for a manuscript is difficult and often time-consuming because authors weigh several criteria when searching for an appropriate journal. One of the most important criteria is the content similarity between the journals and the manuscript: the subject of the manuscript should match the scope of the journal. The manuscript content should also be close to the journal's current trends for a higher chance of acceptance. A second criterion is the impact factor, acceptance rate, review time and publishing house of the journal, which should suit the author's past publication profile. In this study, a novel method is proposed in which both the content of the article and the profile of the author or authors are considered together to find the appropriate journal. To the best of our knowledge, this is the first effort in this direction. Experimental results on real data sets show that the proposed method is applicable and achieves high accuracy.

Big Scholarly Data: A Survey

Article

Jan 2017

With the rapid growth of digital publishing, harvesting, managing, and analyzing scholarly information have become increasingly challenging. The term Big Scholarly Data is coined for the rapidly growing scholarly data, which contains information including millions of authors, papers, citations, figures, tables, as well as scholarly networks and digital libraries. Nowadays, various scholarly data can be easily accessed and powerful data analysis technologies are being developed, which enable us to look into science itself with a new angle. In this paper, we examine the background and state of the art of big scholarly data. We first introduce the background of scholarly data management and relevant technologies. Secondly, we review data analysis methods, such as statistical analysis, social network analysis, and content analysis for dealing with big scholarly data. Finally, we look into representative research issues in this area, including scientific impact evaluation, academic recommendation, and expert finding. For each issue, the background, main challenges, and latest research are covered. These discussions aim to provide a general overview and big picture to scholars interested in this emerging area. This survey paper concludes with a discussion of open issues and promising future directions.

AVER: Random Walk Based Academic Venue Recommendation

Conference Paper

May 2015

Academic venues act as the main platform for academic communities and as a bridge connecting researchers, and they have developed rapidly in recent years. However, information overload in big scholarly data creates tremendous challenges for mining useful information in order to point researchers to high-quality and fruitful academic venues, thereby enabling them to participate in relevant academic conferences and to contribute to important or influential journals. In this work, we propose AVER, a novel random walk based Academic VEnue Recommendation model. AVER runs a random walk with restart model on a co-publication network which contains two kinds of associations, coauthor relations and author-venue relations. Moreover, we define a transfer matrix with bias to drive the random walk by exploiting three academic factors: co-publication frequency, weight of relations and researchers' academic level. AVER is inspired by the fact that researchers are more likely to contact those who have high co-publication frequency and similar academic levels. Additionally, in AVER, we consider the difference of weights between two kinds of associations. We conduct extensive experiments on the DBLP data set in order to evaluate the performance of AVER. The results demonstrate that, in comparison to relevant baseline approaches, AVER performs better in terms of precision, recall and F1.

Recommender System for Academic Literature with Incremental Dataset

Article

Dec 2016

Owing to the colossal expansion of research paper repositories, the importance of recommender systems has increased, as they can guide researchers to papers relevant to them in this vast collection. Furthermore, recommendation methods such as collaborative filtering or content-based filtering do not allow users to state their personalized requirements explicitly; hence the focus has shifted towards customized recommender systems that can examine users' preferences by considering their inputs. However, state-of-the-art recommendation techniques that satisfy users' personalized requirements make a strong assumption of a static dataset. In this work, we present a customized recommender system that accounts for the ever-growing nature of research paper repositories. To accomplish this, we use the recently introduced Efficient Incremental High-Utility Itemset Mining algorithm (EIHI), which is specialized to work with dynamic datasets. Experimental results show that the proposed system satisfies researchers' personalized requirements and at the same time handles the incremental nature of the research paper repository efficiently.

Analyzing Social Relations for Recommending Academic Conferences

Conference Paper

Jul 2016

Recommender systems are used to filter through vast amounts of items and recommend those that potentially have the highest relevance for the user. Recently, research dealing with recommendations in academia increased. In this paper, we analyze to what extent social relations from existing data can be utilized to generate academic conference recommendations. We design and implement a social recommender system and show how, without the need for explicit ratings, viable recommendations can be made, while at the same time reducing the cost of kNN-neighborhood selection.

Scientific Article Recommendation: Exploiting Common Author Relations and Historical Preferences

Article

Jun 2016

Scientific article recommender systems are playing an increasingly important role for researchers in retrieving scientific articles of interest in the coming era of big scholarly data. Most existing studies have designed unified methods for all target researchers and hence the same algorithms are run to generate recommendations for all researchers no matter which situations they are in. However, different researchers may have their own features and there might be corresponding methods for them resulting in better recommendations. In this paper, we propose a novel recommendation method which incorporates information on common author relations between articles (i.e., two articles with the same author(s)). The rationale underlying our method is that researchers often search articles published by the same author(s). Since not all researchers have such author-based search patterns, we present two features, which are defined based on information about pairwise articles with common author relations and frequently appeared authors, to determine target researchers for recommendation. Extensive experiments we performed on a real-world dataset demonstrate that the defined features are effective to determine relevant target researchers and the proposed method generates more accurate recommendations for relevant researchers when compared to a Baseline method.

Modeling User Exposure in Recommendation

Conference Paper

Apr 2016

Collaborative filtering analyzes user preferences for items (e.g., books, movies, restaurants, academic papers) by exploiting the similarity patterns across users. In implicit feedback settings, all the items, including the ones that a user did not consume, are taken into consideration. But this assumption does not accord with the common sense understanding that users have a limited scope and awareness of items. For example, a user might not have heard of a certain paper, or might live too far away from a restaurant to experience it. In the language of causal analysis (Imbens & Rubin, 2015), the assignment mechanism (i.e., the items that a user is exposed to) is a latent variable that may change for various user/item combinations. In this paper, we propose a new probabilistic approach that directly incorporates user exposure to items into collaborative filtering. The exposure is modeled as a latent variable and the model infers its value from data. In doing so, we recover one of the most successful state-of-the-art approaches as a special case of our model (Hu et al. 2008), and provide a plug-in method for conditioning exposure on various forms of exposure covariates (e.g., topics in text, venue locations). We show that our scalable inference algorithm outperforms existing benchmarks in four different domains both with and without exposure covariates.

A Markov chain model for the search time for max degree nodes in a graph using a biased random walk

Conference Paper

Mar 2016

Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions

Article

Jul 2005
IEEE T KNOWL DATA EN

This paper presents an overview of the field of recommender systems and describes the current generation of recommendation methods that are usually classified into the following three main categories: content-based, collaborative, and hybrid recommendation approaches. This paper also describes various limitations of current recommendation methods and discusses possible extensions that can improve recommendation capabilities and make recommender systems applicable to an even broader range of applications. These extensions include, among others, an improvement of understanding of users and items, incorporation of the contextual information into the recommendation process, support for multicriteria ratings, and a provision of more flexible and less intrusive types of recommendations.

A Recommendation System Based on Hierarchical Clustering of an Article-Level Citation Network

Article

Jun 2016

The scholarly literature is expanding at a rate that necessitates intelligent algorithms for search and navigation. For the most part, the problem of delivering scholarly articles has been solved. If one knows the title of an article, locating it requires little effort and, paywalls permitting, acquiring a digital copy has become trivial. However, the navigational aspect of scientific search - finding relevant, influential articles that one does not know exist - is in its early development. In this paper, we introduce EigenfactorRecommends - a citation-based method for improving scholarly navigation. The algorithm uses the hierarchical structure of scientific knowledge, making possible multiple scales of relevance for different users. We implement the method and generate more than 300 million recommendations from more than 35 million articles from various bibliographic databases including the AMiner dataset. We find little overlap with co-citation, another well-known citation recommender, which indicates potential complementarity. In an online A-B comparison using SSRN, we find that our approach performs as well as co-citation, but this new approach offers much larger recommendation coverage. We make the code and recommendations freely available at babel.eigenfactor.org and provide an API for others to use for implementing and comparing the recommendations on their own platforms.

Recommendation in Academia: A joint multi-relational model

Conference Paper

Aug 2014
