Figure 1 - uploaded by Niloy Ganguly
(a) Median Average Citation (MAC) versus MEAT. MEAT values are bucketed into 12 equal-sized bins spanning the range (1, 498.8). (b) MAC versus SRI and (c) MAC versus RADI. For both (b) and (c), the x-axis values are bucketed into the intervals (≥ 0 and < 0.1), (≥ 0.1 and < 0.2), and so on.
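The binning in panel (a) can be sketched as follows. The MEAT scores and citation counts below are randomly generated stand-ins (the actual values come from the paper's dataset), and "equal size" is assumed to mean equal bin width:

```python
import numpy as np

# Hypothetical MEAT scores and per-paper average citations;
# stand-ins for the paper's real data.
rng = np.random.default_rng(0)
meat = rng.uniform(1, 498.8, size=1000)
citations = rng.poisson(10, size=1000).astype(float)

# 12 equal-width bins over the stated range (1, 498.8).
edges = np.linspace(1, 498.8, 13)
bin_idx = np.digitize(meat, edges[1:-1])  # bin index 0..11 per paper

# Median Average Citation (MAC) within each bin, as plotted in (a).
mac = [np.median(citations[bin_idx == b]) for b in range(12)]
print(len(mac))  # one MAC value per bin
```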

Source publication
Conference Paper
Full-text available
Peer-review system has long been relied upon for bringing quality research to the notice of the scientific community and also preventing flawed research from entering into the literature. The need for the peer-review system has often been debated as in numerous cases it has failed in its task and in most of these cases editors and the reviewers wer...

Similar publications

Preprint
Full-text available
The search for and management of external funding now occupies much valuable researcher time. Whilst funding is essential for some types of research and beneficial for others, it may also constrain academic choice and creativity. Thus, it is important to assess whether it is ever detrimental or unnecessary. Here we investigate whether funded resear...
Article
Full-text available
Diversity in human capital is widely seen as critical to creating holistic and high quality research, especially in areas that engage with diverse cultures, environments, and challenges. Quantification of diverse academic collaborations and their effect on research quality is lacking, especially at international scale and across different domains. Here, we...
Preprint
Full-text available
A review of the various global (or hemispheric) millennial temperature reconstructions was carried out. Unlike previous reviews, technical analyses presented via internet blogs were considered in addition to the conventional peer-reviewed literature. There was a remarkable consistency between all of the reconstructions in identifying three climatic...
Article
Full-text available
In many countries research evaluations confer high importance to mainstream journals, which are considered to publish excellent research. Accordingly, research evaluation policies discourage publications in non-mainstream journals under the assumption that they publish low quality research. This approach has prompted a policy debate in low and midd...

Citations

... Several studies have shown that the lack of fairness in the peer review process has a major impact on which papers are accepted to conferences and journals [41]. Reviewers tend to accept papers whose authors share their gender and come from the same region [33]. ...
Chapter
Full-text available
To prevent potential bias in the paper review and selection process for conferences and journals, most venues employ double-blind review. Despite this, studies show that bias still exists. This implicit bias may persist when recommendation algorithms for paper selection are employed instead. To address this, we describe three fair algorithms that specifically take into account author diversity in paper recommendation. In contrast to fair algorithms that only take into account one protected variable, our methods provide fair outcomes across multiple protected variables concurrently. Five demographic characteristics (gender, ethnicity, career stage, university rank, and geolocation) are included in our multidimensional author profiles. The Overall Diversity approach uses a score for overall diversity to rank publications. The Round Robin Diversity technique chooses papers from authors who are members of each protected group in turn, whereas the Multifaceted Diversity method chooses papers that first fill the demographic feature with the highest importance. By selecting papers from a pool of SIGCHI 2017, DIS 2017, and IUI 2017 papers, we recommend papers for SIGCHI 2017 and evaluate these algorithms using user profiles with Boolean and continuous-valued attributes. We contrast the papers that were recommended with those that were accepted by the conference. We find that, using profiles with either Boolean or continuous feature values, all three techniques boost diversity while experiencing just a slight decrease in paper quality or no decrease at all. Our best-performing algorithm, Multifaceted Diversity, recommends a set of papers whose authors achieve demographic parity with the demographics of the pool of authors of all submitted papers. Compared to the authors of the papers actually selected for the conference, the recommended paper authors are 42.50% more diverse and achieve a 2.45% boost in paper quality as measured by the h-index.
Our approach could be applied to reduce bias in the selection of grant proposals, conference papers, journal articles, and other academic duties.
Keywords: User profiling · Paper recommendation · Diversity and fairness
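A minimal sketch of the Round Robin Diversity idea described above, under the assumption that it greedily takes the highest-scored remaining paper from each protected group in turn; the papers, group labels, and quality scores are hypothetical:

```python
from collections import deque

# Hypothetical (paper_id, protected_group, quality_score) tuples.
papers = [
    ("p1", "junior", 0.9), ("p2", "senior", 0.8),
    ("p3", "junior", 0.7), ("p4", "senior", 0.95),
    ("p5", "junior", 0.6),
]

def round_robin_select(papers, groups, k):
    """Pick k papers, cycling over protected groups in a fixed order
    and taking each group's best remaining paper on its turn."""
    by_group = {g: sorted((p for p in papers if p[1] == g),
                          key=lambda p: -p[2]) for g in groups}
    order, chosen = deque(groups), []
    while len(chosen) < k and any(by_group.values()):
        g = order[0]
        order.rotate(-1)  # next group gets the following turn
        if by_group[g]:
            chosen.append(by_group[g].pop(0))
    return chosen

picked = round_robin_select(papers, ["junior", "senior"], 3)
print([p[0] for p in picked])  # ['p1', 'p4', 'p3']
```

The alternation guarantees each group is represented before any group gets a second pick, which is the source of the diversity gain the abstract reports.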
... Several studies have shown that the lack of fairness in the peer review process has a major impact on which papers are accepted to conferences and journals [41]. Reviewers tend to accept papers whose authors share their gender and come from the same region [33]. ...
Preprint
Full-text available
To prevent potential bias in the paper review and selection process for conferences and journals, most venues employ double-blind review. Despite this, studies show that bias still exists. Recommendation algorithms for paper review may also have implicit bias. To address this, we offer three fair methods that specifically take into account author diversity in paper recommendation. Our methods provide fair outcomes across many protected variables concurrently, in contrast to typical fair algorithms that only use one protected variable. Five demographic characteristics (gender, ethnicity, career stage, university rank, and geolocation) are included in our multidimensional author profiles. The Overall Diversity approach uses a score for overall diversity to rank publications. The Round Robin Diversity technique chooses papers from authors who are members of each protected group in turn, whereas the Multifaceted Diversity method chooses papers that first fill the demographic feature with the highest importance. We compare the effectiveness of author diversity profiles based on Boolean and continuous-valued features. By selecting papers from a pool of SIGCHI 2017, DIS 2017, and IUI 2017 papers, we recommend papers for SIGCHI 2017 and evaluate these algorithms using the user profiles. We contrast the papers that were recommended with those that were selected by the conference. We find that, using profiles with either Boolean or continuous feature values, all three techniques boost diversity while decreasing utility only slightly or not at all. By choosing authors who are 42.50% more diverse and with a 2.45% boost in utility, our best technique, Multifaceted Diversity, suggests a set of papers that match demographic parity. The selection of grant proposals, conference papers, journal articles, and other academic duties might all use this strategy.
... A quality peer-review system helps authors improve. There is much debate on the quality [11] and bias in a peer-review system [7,9,12]. Jefferson et al. [11] investigated the quality of editorial peer review. ...
... They also studied whether the peer review system can be improved. In [12], the authors investigated anomalies in a peer review system. They computed different features from the editor and the reviewer information available. ...
... In the absence of a public-domain peer review dataset with sufficient datapoints, most work on peer review before 2017 was conducted on private datasets, limiting the number of studies. In a case study on peer review data from the Journal of High Energy Physics, Sikdar et al. [19] identified key features to cluster anomalous editors and reviewers whose actions are often not aligned with the goals of the peer-review process in general. Sikdar et al. [20] show that specific characteristics of the assigned reviewers influence the long-term citation profile of a paper more than the linguistic features of the review or the author characteristics. ...
Conference Paper
Full-text available
Scientific papers are complex, and understanding their usefulness requires prior knowledge. Peer reviews are comments on a paper provided by designated experts in that field and hold a substantial amount of information, not only for the editors and chairs to make the final decision, but also to judge the potential impact of the paper. In this paper, we propose to use aspect-based sentiment analysis of scientific reviews to extract useful information, which correlates well with the accept/reject decision. Working on a dataset of close to 8k reviews from ICLR, one of the top conferences in the field of machine learning, we use an active learning framework to build a training dataset for aspect prediction, which is further used to obtain the aspects and sentiments for the entire dataset. We show that the distribution of aspect-based sentiments obtained from a review is significantly different for accepted and rejected papers. We use the aspect sentiments from these reviews to make an intriguing observation: certain aspects present in a paper and discussed in the review strongly determine the final recommendation. As a second objective, we quantify the extent of disagreement among the reviewers refereeing a paper. We also investigate the extent of disagreement between the reviewers and the chair and find that inter-reviewer disagreement may be linked to disagreement with the chair. One of the most interesting observations from this study is that reviews where the reviewer score and the aspect sentiments extracted from the review text are consistent are also more likely to concur with the chair's decision.
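The per-decision aspect-sentiment distribution that the abstract compares can be illustrated with a toy tally; the (aspect, sentiment, decision) triples below are hypothetical stand-ins for the pipeline's output, not ICLR data:

```python
from collections import Counter

# Hypothetical aspect-level sentiment labels extracted from reviews,
# each paired with the paper's final decision.
reviews = [
    ("novelty", "pos", "accept"), ("clarity", "pos", "accept"),
    ("novelty", "neg", "reject"), ("soundness", "neg", "reject"),
    ("novelty", "pos", "accept"), ("soundness", "neg", "reject"),
]

# Tally (aspect, sentiment) pairs separately for accepted and
# rejected papers; comparing the two tallies is the core analysis.
dist = {"accept": Counter(), "reject": Counter()}
for aspect, sentiment, decision in reviews:
    dist[decision][(aspect, sentiment)] += 1

print(dist["accept"][("novelty", "pos")])  # 2
```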
... A quality peer-review system helps authors improve. There is much debate on the quality [11] and bias in a peer-review system [7,9,12]. Jefferson et al. [11] investigated the quality of editorial peer review. ...
... They also studied whether the peer review system can be improved. In [12], the authors investigated anomalies in a peer review system. They computed different features from the editor and the reviewer information available. ...
Preprint
Full-text available
New researchers are usually very curious about the recipe that could accelerate the chances of their paper getting accepted at a reputed forum (journal/conference). In search of such a recipe, we investigate the profiles and peer review texts of authors whose papers almost always get accepted at a venue (the Journal of High Energy Physics in our current work). We find that authors with a high acceptance rate are likely to have a high number of citations, a high $h$-index, a higher number of collaborators, etc. We notice that they receive relatively lengthy and positive reviews for their papers. In addition, we construct three networks -- co-reviewer, co-citation and collaboration networks -- and study the network-centric features and intra- and inter-category edge interactions. We find that the authors with a high acceptance rate are more `central' in these networks; the volume of intra- and inter-category interactions is also drastically different for the authors with a high acceptance rate compared to the other authors. Finally, using the above set of features, we train standard machine learning models (random forest, XGBoost) and obtain very high class-wise precision and recall. In a follow-up discussion we also narrate how, apart from the author characteristics, the peer-review system might itself have a role in propelling the distinction among the different categories, which could lead to potential discrimination and unfairness and calls for further investigation by the system admins.
... In the absence of a public-domain peer review dataset with sufficient datapoints, most work on peer review before 2017 was conducted on private datasets, limiting the number of studies. In a case study on peer review data from the Journal of High Energy Physics, Sikdar et al. [19] identified key features to cluster anomalous editors and reviewers whose actions are often not aligned with the goals of the peer-review process in general. Sikdar et al. [20] show that specific characteristics of the assigned reviewers influence the long-term citation profile of a paper more than the linguistic features of the review or the author characteristics. ...
Preprint
Full-text available
Scientific papers are complex, and understanding their usefulness requires prior knowledge. Peer reviews are comments on a paper provided by designated experts in that field and hold a substantial amount of information, not only for the editors and chairs to make the final decision, but also to judge the potential impact of the paper. In this paper, we propose to use aspect-based sentiment analysis of scientific reviews to extract useful information, which correlates well with the accept/reject decision. Working on a dataset of close to 8k reviews from ICLR, one of the top conferences in the field of machine learning, we use an active learning framework to build a training dataset for aspect prediction, which is further used to obtain the aspects and sentiments for the entire dataset. We show that the distribution of aspect-based sentiments obtained from a review is significantly different for accepted and rejected papers. We use the aspect sentiments from these reviews to make an intriguing observation: certain aspects present in a paper and discussed in the review strongly determine the final recommendation. As a second objective, we quantify the extent of disagreement among the reviewers refereeing a paper. We also investigate the extent of disagreement between the reviewers and the chair and find that inter-reviewer disagreement may be linked to disagreement with the chair. One of the most interesting observations from this study is that reviews where the reviewer score and the aspect sentiments extracted from the review text are consistent are also more likely to concur with the chair's decision.
... As pointed out in [22], time (in number of days) could be considered an indicator of a reviewer's performance and hence an indicator of the long-term citation of the paper. As a proof of concept we segregate the papers based on the assigned reviewer's time since the last assignment and calculate the mean citation (refer to Fig. 12a). ...
Article
Full-text available
The importance of and need for the peer-review system is highly debated in the academic community, and recently there has been a growing consensus to get rid of it completely. This is one of the steps in the publication pipeline that usually requires the publishing house to invest a significant portion of its budget in order to ensure quality editing and reviewing of the submissions received. A very pertinent question, therefore, is whether such investments are worth making at all. To answer this question, in this paper we perform a rigorous measurement study on a massive dataset (29k papers with 70k distinct review reports) to unfold the detailed characteristics of the peer-review process, considering the three most important entities of this process: (i) the paper, (ii) the authors and (iii) the referees. We thereby identify different factors related to these three entities which can be leveraged to predict the long-term impact of a submitted paper. These features, when plugged into a regression model, achieve a high \(R^2\) of 0.85 and RMSE of 0.39. Analysis of feature importance indicates that reviewer- and author-related features are most indicative of the long-term impact of a paper. We believe that our framework could be utilized in assisting editors to decide the fate of a paper.
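The two reported regression metrics can be computed as follows; the true/predicted values below are toy numbers, not the paper's citation data:

```python
import numpy as np

# Toy ground-truth and predicted (log-)citation values.
y_true = np.array([2.0, 3.0, 4.0, 5.0])
y_pred = np.array([2.1, 2.9, 4.2, 4.8])

# Root mean squared error: average magnitude of prediction error.
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))

# Coefficient of determination: fraction of variance explained.
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(round(float(rmse), 3), round(float(r2), 3))  # 0.158 0.98
```

Higher \(R^2\) (closer to 1) and lower RMSE both indicate a better fit, which is why the paper reports the pair together.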
... In addition, the reviews received by authors having higher acceptance to submission ratio tend to contain more positive sentiments on average. Reviewers: In a previous work [26] the authors showed that the reviewers who tend to accept or reject most of the papers assigned to them fail to correctly judge the quality of the papers. We include such history based features of the reviewers in the set of supporting features. ...
... The success of the peer-review process is immensely dependent on the reviewers as they determine the quality of a paper and, consequently, the quality of the journal. We hence investigate certain reviewer behaviors (pointed out in [26]) that could be indicative of his/her performance. ...
... In line with [26] we consider the time (in number of days) since the last assignment for the assigned reviewer as an indicator of the reviewer's performance and hence an indicator of the long-term citation of the paper. To verify our hypothesis we segregate the papers based on the assigned reviewer's time since the last assignment and calculate the mean citation. ...
Article
Full-text available
A `peer-review system', in the context of judging research contributions, is one of the prime steps undertaken to ensure the quality of the submissions received; a significant portion of the publishing budget is spent by publication houses towards the successful completion of peer review. Nevertheless, the scientific community is largely reaching a consensus that the peer-review system, although indispensable, is nonetheless flawed. A very pertinent question therefore is "could this system be improved?". In this paper, we attempt to present an answer to this question by considering a massive dataset of around $29k$ papers with roughly $70k$ distinct review reports, together consisting of $12m$ lines of review text, from the Journal of High Energy Physics (JHEP) between 1997 and 2015. Specifically, we introduce a novel \textit{reviewer-reviewer interaction network} (an edge exists between two reviewers if they were assigned by the same editor) and show that, surprisingly, simple structural properties of this network such as degree, clustering coefficient and centrality (closeness, betweenness etc.) serve as strong predictors of the long-term citations (i.e., the overall scientific impact) of a submitted paper. These features, when plugged into a regression model, alone achieve a high $R^2$ of 0.79 and a low $RMSE$ of 0.496 in predicting the long-term citations. In addition, we design a set of supporting features built from the basic characteristics of the submitted papers, the authors and the referees (e.g., the popularity of the submitting author, the acceptance rate history of a referee, the linguistic properties laden in the text of the review reports etc.), which further results in an overall improvement with $R^2$ of 0.81 and $RMSE$ of 0.46.
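A sketch of how the reviewer-reviewer interaction network and two of the named structural features (degree and local clustering coefficient) might be built from editor assignments; the editor-to-reviewer mapping below is hypothetical toy data:

```python
from itertools import combinations

# Hypothetical assignments: each editor's list of assigned reviewers.
# Two reviewers are linked if the same editor assigned them both.
assignments = {
    "editor_a": ["r1", "r2", "r3"],
    "editor_b": ["r2", "r3"],
    "editor_c": ["r3", "r4"],
}

# Build the undirected reviewer-reviewer adjacency structure.
adj = {}
for reviewers in assignments.values():
    for u, v in combinations(reviewers, 2):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

def clustering(node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = adj[node]
    if len(nbrs) < 2:
        return 0.0
    links = sum(1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a])
    return 2.0 * links / (len(nbrs) * (len(nbrs) - 1))

degree = {n: len(adj[n]) for n in adj}
print(degree["r3"], round(clustering("r3"), 3))  # 3 0.333
```

Per-reviewer features like these, aggregated over a paper's assigned reviewers, are the kind of predictors the abstract feeds into its regression model.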