Figure 1 - uploaded by Niloy Ganguly
(a) Median Average Citation (MAC) versus MEAT. MEAT values are bucketed into 12 equal-sized bins spanning the range (1, 498.8). (b) MAC versus SRI and (c) MAC versus RADI. For both (b) and (c), the x-axis values are bucketed into the intervals (≥ 0 and < 0.1), (≥ 0.1 and < 0.2), and so on.
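The binning in panel (a) can be sketched as follows. The MEAT scores and citation counts below are randomly generated stand-ins (the actual values come from the paper's dataset), and "equal size" is assumed to mean equal bin width:

```python
import numpy as np

# Hypothetical MEAT scores and per-paper average citations;
# stand-ins for the paper's real data.
rng = np.random.default_rng(0)
meat = rng.uniform(1, 498.8, size=1000)
citations = rng.poisson(10, size=1000).astype(float)

# 12 equal-width bins over the stated range (1, 498.8).
edges = np.linspace(1, 498.8, 13)
bin_idx = np.digitize(meat, edges[1:-1])  # bin index 0..11 per paper

# Median Average Citation (MAC) within each bin, as plotted in (a).
mac = [np.median(citations[bin_idx == b]) for b in range(12)]
print(len(mac))  # one MAC value per bin
```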

Source publication
Conference Paper
Full-text available
Peer-review system has long been relied upon for bringing quality research to the notice of the scientific community and also preventing flawed research from entering into the literature. The need for the peer-review system has often been debated as in numerous cases it has failed in its task and in most of these cases editors and the reviewers wer...

Similar publications

Preprint
Full-text available
The search for and management of external funding now occupies much valuable researcher time. Whilst funding is essential for some types of research and beneficial for others, it may also constrain academic choice and creativity. Thus, it is important to assess whether it is ever detrimental or unnecessary. Here we investigate whether funded resear...
Article
Full-text available
Diversity in human capital is widely seen as critical to creating holistic and high quality research, especially in areas that engage with diverse cultures, environments, and challenges. Quantification of diverse academic collaborations and their effect on research quality is lacking, especially at international scale and across different domains. Here, we...
Preprint
Full-text available
A review of the various global (or hemispheric) millennial temperature reconstructions was carried out. Unlike previous reviews, technical analyses presented via internet blogs were considered in addition to the conventional peer-reviewed literature. There was a remarkable consistency between all of the reconstructions in identifying three climatic...
Article
Full-text available
In many countries research evaluations confer high importance to mainstream journals, which are considered to publish excellent research. Accordingly, research evaluation policies discourage publications in non-mainstream journals under the assumption that they publish low quality research. This approach has prompted a policy debate in low and midd...

Citations

... Several studies have shown that the lack of fairness in the peer review process has a major impact on which papers are accepted to conferences and journals [41]. Reviewers tend to accept papers whose authors share their gender and come from the same region [33]. ...
Chapter
Full-text available
To prevent potential bias in the paper review and selection process for conferences and journals, most venues employ double-blind review. Despite this, studies show that bias still exists. This implicit bias may persist when recommendation algorithms for paper selection are employed instead. To address this, we describe three fair algorithms that specifically take into account author diversity in paper recommendation. In contrast to fair algorithms that only take into account one protected variable, our methods provide fair outcomes across multiple protected variables concurrently. Five demographic characteristics (gender, ethnicity, career stage, university rank, and geolocation) are included in our multidimensional author profiles. The Overall Diversity approach uses a score for overall diversity to rank publications. The Round Robin Diversity technique chooses papers from authors who are members of each protected group in turn, whereas the Multifaceted Diversity method chooses papers that first fill the demographic feature with the highest importance. By selecting papers from a pool of SIGCHI 2017, DIS 2017, and IUI 2017 papers, we recommend papers for SIGCHI 2017 and evaluate these algorithms using user profiles with Boolean and continuous-valued attributes. We contrast the papers that were recommended with those that were accepted by the conference. We find that, using profiles with either Boolean or continuous feature values, all three techniques boost diversity while experiencing just a slight decrease in paper quality or no decrease at all. Our best-performing algorithm, Multifaceted Diversity, recommends a set of papers whose authors achieve demographic parity with the demographics of the pool of authors of all submitted papers. Compared to the authors of the papers actually selected for the conference, the recommended paper authors are 42.50% more diverse and achieve a 2.45% boost in paper quality as measured by the h-index.
Our approach could be applied to reduce bias in the selection of grant proposals, conference papers, journal articles, and other academic duties.
Keywords: User profiling · Paper recommendation · Diversity and fairness
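A minimal sketch of the Round Robin Diversity idea described above, under the assumption that it greedily takes the highest-scored remaining paper from each protected group in turn; the papers, group labels, and quality scores are hypothetical:

```python
from collections import deque

# Hypothetical (paper_id, protected_group, quality_score) tuples.
papers = [
    ("p1", "junior", 0.9), ("p2", "senior", 0.8),
    ("p3", "junior", 0.7), ("p4", "senior", 0.95),
    ("p5", "junior", 0.6),
]

def round_robin_select(papers, groups, k):
    """Pick k papers, cycling over protected groups in a fixed order
    and taking each group's best remaining paper on its turn."""
    by_group = {g: sorted((p for p in papers if p[1] == g),
                          key=lambda p: -p[2]) for g in groups}
    order, chosen = deque(groups), []
    while len(chosen) < k and any(by_group.values()):
        g = order[0]
        order.rotate(-1)  # next group gets the following turn
        if by_group[g]:
            chosen.append(by_group[g].pop(0))
    return chosen

picked = round_robin_select(papers, ["junior", "senior"], 3)
print([p[0] for p in picked])  # ['p1', 'p4', 'p3']
```

The alternation guarantees each group is represented before any group gets a second pick, which is the source of the diversity gain the abstract reports.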
... Several studies have shown that the lack of fairness in the peer review process has a major impact on which papers are accepted to conferences and journals [41]. Reviewers tend to accept papers whose authors share their gender and come from the same region [33]. ...
Preprint
Full-text available
To prevent potential bias in the paper review and selection process for conferences and journals, most venues employ double-blind review. Despite this, studies show that bias still exists. Recommendation algorithms for paper review may also have implicit bias. To address this, we offer three fair methods that specifically take into account author diversity in paper recommendation. Our methods provide fair outcomes across many protected variables concurrently, in contrast to typical fair algorithms that only use one protected variable. Five demographic characteristics (gender, ethnicity, career stage, university rank, and geolocation) are included in our multidimensional author profiles. The Overall Diversity approach uses a score for overall diversity to rank publications. The Round Robin Diversity technique chooses papers from authors who are members of each protected group in turn, whereas the Multifaceted Diversity method chooses papers that first fill the demographic feature with the highest importance. We compare the effectiveness of author diversity profiles based on Boolean and continuous-valued features. By selecting papers from a pool of SIGCHI 2017, DIS 2017, and IUI 2017 papers, we recommend papers for SIGCHI 2017 and evaluate these algorithms using the user profiles. We contrast the papers that were recommended with those that were selected by the conference. We find that, using profiles with either Boolean or continuous feature values, all three techniques boost diversity while decreasing utility only slightly or not at all. By choosing authors who are 42.50% more diverse and with a 2.45% boost in utility, our best technique, Multifaceted Diversity, suggests a set of papers that match demographic parity. The selection of grant proposals, conference papers, journal articles, and other academic duties might all use this strategy.
... A quality peer-review system helps authors improve. There is much debate on the quality [11] and bias in a peer-review system [7,9,12]. Jefferson et al. [11] investigated the quality of editorial peer review. ...
... They also studied whether the peer review system can be improved. In [12], the authors investigated anomalies in a peer review system. They computed different features from the editor and the reviewer information available. ...
... In the absence of a public-domain peer review dataset with sufficient datapoints, most work on peer review before 2017 was conducted on private datasets, limiting the number of studies. In a case study on peer review data from the Journal of High Energy Physics, Sikdar et al. [19] identified key features to cluster anomalous editors and reviewers whose actions are often not aligned with the goals of the peer-review process in general. Sikdar et al. [20] show that specific characteristics of the assigned reviewers influence the long-term citation profile of a paper more than the linguistic features of the review or the author characteristics. ...
Conference Paper
Full-text available
Scientific papers are complex, and understanding their usefulness requires prior knowledge. Peer reviews are comments on a paper provided by designated experts in that field and hold a substantial amount of information, not only for the editors and chairs to make the final decision, but also to judge the potential impact of the paper. In this paper, we propose to use aspect-based sentiment analysis of scientific reviews to extract useful information, which correlates well with the accept/reject decision. Working on a dataset of close to 8k reviews from ICLR, one of the top conferences in the field of machine learning, we use an active learning framework to build a training dataset for aspect prediction, which is further used to obtain the aspects and sentiments for the entire dataset. We show that the distribution of aspect-based sentiments obtained from a review is significantly different for accepted and rejected papers. We use the aspect sentiments from these reviews to make an intriguing observation: certain aspects present in a paper and discussed in the review strongly determine the final recommendation. As a second objective, we quantify the extent of disagreement among the reviewers refereeing a paper. We also investigate the extent of disagreement between the reviewers and the chair and find that inter-reviewer disagreement may be linked to disagreement with the chair. One of the most interesting observations from this study is that reviews where the reviewer score and the aspect sentiments extracted from the review text are consistent are also more likely to concur with the chair's decision.
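The per-decision aspect-sentiment distribution that the abstract compares can be illustrated with a toy tally; the (aspect, sentiment, decision) triples below are hypothetical stand-ins for the pipeline's output, not ICLR data:

```python
from collections import Counter

# Hypothetical aspect-level sentiment labels extracted from reviews,
# each paired with the paper's final decision.
reviews = [
    ("novelty", "pos", "accept"), ("clarity", "pos", "accept"),
    ("novelty", "neg", "reject"), ("soundness", "neg", "reject"),
    ("novelty", "pos", "accept"), ("soundness", "neg", "reject"),
]

# Tally (aspect, sentiment) pairs separately for accepted and
# rejected papers; comparing the two tallies is the core analysis.
dist = {"accept": Counter(), "reject": Counter()}
for aspect, sentiment, decision in reviews:
    dist[decision][(aspect, sentiment)] += 1

print(dist["accept"][("novelty", "pos")])  # 2
```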
... A quality peer-review system helps authors improve. There is much debate on the quality [11] and bias in a peer-review system [7,9,12]. Jefferson et al. [11] investigated the quality of editorial peer review. ...
... They also studied whether the peer review system can be improved. In [12], the authors investigated anomalies in a peer review system. They computed different features from the editor and the reviewer information available. ...
Preprint
Full-text available
New researchers are usually very curious about the recipe that could accelerate the chances of their paper getting accepted at a reputed forum (journal/conference). In search of such a recipe, we investigate the profiles and peer review texts of authors whose papers almost always get accepted at a venue (the Journal of High Energy Physics in our current work). We find that authors with a high acceptance rate are likely to have a high number of citations, a high $h$-index, a higher number of collaborators, etc. We notice that they receive relatively lengthy and positive reviews for their papers. In addition, we construct three networks -- co-reviewer, co-citation and collaboration networks -- and study the network-centric features and intra- and inter-category edge interactions. We find that the authors with a high acceptance rate are more `central' in these networks; the volume of intra- and inter-category interactions is also drastically different for the authors with a high acceptance rate compared to the other authors. Finally, using the above set of features, we train standard machine learning models (random forest, XGBoost) and obtain very high class-wise precision and recall. In a follow-up discussion we also narrate how, apart from the author characteristics, the peer-review system might itself have a role in propelling the distinction among the different categories, which could lead to potential discrimination and unfairness and calls for further investigation by the system admins.
... In the absence of a public-domain peer review dataset with sufficient datapoints, most work on peer review before 2017 was conducted on private datasets, limiting the number of studies. In a case study on peer review data from the Journal of High Energy Physics, Sikdar et al. [19] identified key features to cluster anomalous editors and reviewers whose actions are often not aligned with the goals of the peer-review process in general. Sikdar et al. [20] show that specific characteristics of the assigned reviewers influence the long-term citation profile of a paper more than the linguistic features of the review or the author characteristics. ...
Preprint
Full-text available
Scientific papers are complex, and understanding their usefulness requires prior knowledge. Peer reviews are comments on a paper provided by designated experts in that field and hold a substantial amount of information, not only for the editors and chairs to make the final decision, but also to judge the potential impact of the paper. In this paper, we propose to use aspect-based sentiment analysis of scientific reviews to extract useful information, which correlates well with the accept/reject decision. Working on a dataset of close to 8k reviews from ICLR, one of the top conferences in the field of machine learning, we use an active learning framework to build a training dataset for aspect prediction, which is further used to obtain the aspects and sentiments for the entire dataset. We show that the distribution of aspect-based sentiments obtained from a review is significantly different for accepted and rejected papers. We use the aspect sentiments from these reviews to make an intriguing observation: certain aspects present in a paper and discussed in the review strongly determine the final recommendation. As a second objective, we quantify the extent of disagreement among the reviewers refereeing a paper. We also investigate the extent of disagreement between the reviewers and the chair and find that inter-reviewer disagreement may be linked to disagreement with the chair. One of the most interesting observations from this study is that reviews where the reviewer score and the aspect sentiments extracted from the review text are consistent are also more likely to concur with the chair's decision.
... As pointed out in [22], time (in number of days) could be considered an indicator of a reviewer's performance and hence an indicator of the long-term citation of the paper. As a proof of concept we segregate the papers based on the assigned reviewer's time since the last assignment and calculate the mean citation (refer to Fig. 12a). ...
Article
Full-text available
The importance of and need for the peer-review system is highly debated in the academic community, and recently there has been a growing consensus to get rid of it completely. This is one of the steps in the publication pipeline that usually requires the publishing house to invest a significant portion of its budget in order to ensure quality editing and reviewing of the submissions received. A very pertinent question, therefore, is whether such investments are worth making at all. To answer this question, in this paper we perform a rigorous measurement study on a massive dataset (29k papers with 70k distinct review reports) to unfold the detailed characteristics of the peer-review process, considering the three most important entities of this process: (i) the paper, (ii) the authors and (iii) the referees. We thereby identify different factors related to these three entities which can be leveraged to predict the long-term impact of a submitted paper. These features, when plugged into a regression model, achieve a high \(R^2\) of 0.85 and RMSE of 0.39. Analysis of feature importance indicates that reviewer- and author-related features are most indicative of the long-term impact of a paper. We believe that our framework could be utilized in assisting editors to decide the fate of a paper.
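The two reported regression metrics can be computed as follows; the true/predicted values below are toy numbers, not the paper's citation data:

```python
import numpy as np

# Toy ground-truth and predicted (log-)citation values.
y_true = np.array([2.0, 3.0, 4.0, 5.0])
y_pred = np.array([2.1, 2.9, 4.2, 4.8])

# Root mean squared error: average magnitude of prediction error.
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))

# Coefficient of determination: fraction of variance explained.
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(round(float(rmse), 3), round(float(r2), 3))  # 0.158 0.98
```

Higher \(R^2\) (closer to 1) and lower RMSE both indicate a better fit, which is why the paper reports the pair together.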
... In addition, the reviews received by authors having higher acceptance to submission ratio tend to contain more positive sentiments on average. Reviewers: In a previous work [26] the authors showed that the reviewers who tend to accept or reject most of the papers assigned to them fail to correctly judge the quality of the papers. We include such history based features of the reviewers in the set of supporting features. ...
... The success of the peer-review process is immensely dependent on the reviewers as they determine the quality of a paper and, consequently, the quality of the journal. We hence investigate certain reviewer behaviors (pointed out in [26]) that could be indicative of his/her performance. ...
... In line with [26] we consider the time (in number of days) since the last assignment for the assigned reviewer as an indicator of the reviewer's performance and hence an indicator of the long-term citation of the paper. To verify our hypothesis we segregate the papers based on the assigned reviewer's time since the last assignment and calculate the mean citation. ...
Article
Full-text available
A `peer-review system', in the context of judging research contributions, is one of the prime steps undertaken to ensure the quality of the submissions received; a significant portion of the publishing budget is spent by publication houses towards the successful completion of peer review. Nevertheless, the scientific community is largely reaching a consensus that the peer-review system, although indispensable, is nonetheless flawed. A very pertinent question therefore is "could this system be improved?". In this paper, we attempt to present an answer to this question by considering a massive dataset of around $29k$ papers with roughly $70k$ distinct review reports, together consisting of $12m$ lines of review text, from the Journal of High Energy Physics (JHEP) between 1997 and 2015. Specifically, we introduce a novel \textit{reviewer-reviewer interaction network} (an edge exists between two reviewers if they were assigned by the same editor) and show that, surprisingly, simple structural properties of this network such as degree, clustering coefficient and centrality (closeness, betweenness etc.) serve as strong predictors of the long-term citations (i.e., the overall scientific impact) of a submitted paper. These features, when plugged into a regression model, alone achieve a high $R^2$ of 0.79 and a low $RMSE$ of 0.496 in predicting the long-term citations. In addition, we design a set of supporting features built from the basic characteristics of the submitted papers, the authors and the referees (e.g., the popularity of the submitting author, the acceptance rate history of a referee, the linguistic properties laden in the text of the review reports etc.), which further results in an overall improvement with $R^2$ of 0.81 and $RMSE$ of 0.46.
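A sketch of how the reviewer-reviewer interaction network and two of the named structural features (degree and local clustering coefficient) might be built from editor assignments; the editor-to-reviewer mapping below is hypothetical toy data:

```python
from itertools import combinations

# Hypothetical assignments: each editor's list of assigned reviewers.
# Two reviewers are linked if the same editor assigned them both.
assignments = {
    "editor_a": ["r1", "r2", "r3"],
    "editor_b": ["r2", "r3"],
    "editor_c": ["r3", "r4"],
}

# Build the undirected reviewer-reviewer adjacency structure.
adj = {}
for reviewers in assignments.values():
    for u, v in combinations(reviewers, 2):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

def clustering(node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = adj[node]
    if len(nbrs) < 2:
        return 0.0
    links = sum(1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a])
    return 2.0 * links / (len(nbrs) * (len(nbrs) - 1))

degree = {n: len(adj[n]) for n in adj}
print(degree["r3"], round(clustering("r3"), 3))  # 3 0.333
```

Per-reviewer features like these, aggregated over a paper's assigned reviewers, are the kind of predictors the abstract feeds into its regression model.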