GigaScience, 8, 2019, 1–20
doi: 10.1093/gigascience/giz053
RESEARCH
Over-optimization of academic publishing metrics:
observing Goodhart’s Law in action
Michael Fire 1,* and Carlos Guestrin 2,*
1Software and Information Systems Engineering Department, Ben-Gurion University, Be’er Sheva 84105, Israel
and 2Paul G. Allen School of Computer Science & Engineering, University of Washington, Stevens Way NE,
Seattle, WA 98195, USA
Correspondence address. Michael Fire, Software and Information Systems Engineering Department, Ben-Gurion University, Be’er Sheva 84105, Israel.
Tel: +972-8-6461111; E-mail: micky@gmail.com http://orcid.org/0000-0002-6075-2568, Carlos Guestrin, Paul G. Allen School of Computer Science &
Engineering, University of Washington, Stevens Way NE, Seattle, WA 98195, USA.
E-mail: guestrin@cs.washington.edu http://orcid.org/0000-0001-6348-5939
Abstract
Background The academic publishing world is changing significantly, with ever-growing numbers of publications each year and shifting publishing patterns. However, the metrics used to measure academic success, such as the number of publications, citation number, and impact factor, have not changed for decades. Moreover, recent studies indicate that these metrics have become targets and follow Goodhart’s Law, according to which, “when a measure becomes a target, it ceases to be a good measure.” Results In this study, we analyzed >120 million papers to examine how the academic publishing world has evolved over the last century, with a deeper look into the specific field of biology. Our study shows that the validity of citation-based measures is being compromised and their usefulness is lessening. In particular, the number of publications has ceased to be a good metric as a result of longer author lists, shorter papers, and surging publication numbers. Citation-based metrics, such as citation number and h-index, are likewise affected by the flood of papers, self-citations, and lengthy reference lists. Measures such as a journal’s impact factor have also ceased to be good metrics due to the soaring numbers of papers that are published in top journals, particularly from the same pool of authors. Moreover, by analyzing properties of >2,600 research fields, we observed that citation-based metrics are not beneficial for comparing researchers in different fields, or even in the same department. Conclusions Academic publishing has changed considerably; now we need to reconsider how we measure success.
Keywords: science of science; scientometrics; Goodhart’s Law; data science; big data; academic publishing metrics
Introduction
In the past century, the academic publishing world has changed drastically in volume and velocity [1]. The volume of papers has increased sharply from <1 million papers published in 1980 to >7 million papers published in 2014 [2]. Furthermore, the speed at which researchers can share and publish their studies has increased significantly. Today’s researchers can publish not only in an ever-growing number of traditional venues, such as conferences and journals, but also in electronic preprint repositories and in mega-journals that provide rapid publication times [1,3].
Along with the exponential increase in the quantity of published papers, the number of ranked scientific journals has increased to >34,000 active peer-reviewed journals in 2014 [1], and the number of published researchers has soared [4]. As part of this escalation, metrics such as the number of papers, number of citations, impact factor, h-index, and altmetrics are being used to compare the impact of papers, researchers, journals, and universities [5–8].
Received: 19 November 2018; Revised: 30 January 2019; Accepted: 12 April 2019
© The Author(s) 2019. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Using quantitative metrics to rank researchers contributes to a hypercompetitive research environment, which
is changing academic culture—and not in a positive direction [9].
Studies suggest that publication patterns have changed as a
result of Goodhart’s Law, according to which, “When a measure
becomes a target, it ceases to be a good measure” [9,10]. Good-
hart’s Law, and the closely related Campbell’s Law [11], influence
many systems in our everyday life, including educational [11],
biological [12], and other decision-making systems [13,14]. As
an example, Goodhart’s Law can be found operating in the New
York Police Department’s manipulation of crime reports (the
“measure”) in order to improve crime statistics (the “target”) [15].
Another example is found in the educational system, revealing
that when “test scores become the goal of the teaching process,
they both lose their value as indicators of educational status and
distort the educational process in undesirable ways” [11]. One
more example can be found in the eld of medicine, where the
National Health Service in the UK sets incentives (pay for per-
formance) for primary care physicians to improve the quality of
care. Indeed, “they found the measures improved for diabetes
and asthma care in the rst years of the program. These im-
provements were on the basis of care reported in the medical
records but not necessarily on care provided. The main effect
of this pay-for-performance program may be to promote better
recording of care rather than better care.” [16]
Recent studies indicate that when measures become tar-
gets in academic publishing, the effectiveness of the measures
can be compromised, and unwelcome and unethical behaviors
may develop, such as salami publications [17], ghost author-
ships [18], p-hacking [19], metrics manipulation [20], faking re-
search data [21], faking of peer reviews [22], and even plagiariz-
ing by a peer reviewer [23].
If the inuence of Goodhart’s Law on academia is indeed sig-
nicant, then it should be possible to observe that academic en-
tities, such as researchers and journals, will over-optimize their
own measures to achieve a desired target. Similar to the con-
sequences of making test scores a target, chasing after certain
measures in the academic publishing world to gain advantage in
the battle of “impact or perish” [10] can have undesirable effects.
Certainly, newer academic publishing metrics have emerged
that are less biased [20,24–27], and these may thwart the trend
of measures becoming targets. Yet, the traditional metrics retain
a strong hold on the overall academic system, and they are still
widely used for ranking purposes [28,29].
In the present study, our main goal was to utilize new ad-
vances in data science tools to perform an in-depth and precise
bottom-up analysis of academic publishing over the decades.
Our comprehensive analysis ranged from micro to macro levels
as we studied individual researchers’ behaviors as well as be-
havioral changes within large research domains. Additionally,
we wanted to uncover how and whether Goodhart’s Law has
changed academic publishing, with an in-depth look at trends
within biology and genetics.
Our study was greatly inuenced by a recent study by Ed-
wards and Roy [9], who observed that academia has become a
hypercompetitive environment that can lead to unethical be-
haviors. The driving force behind such behaviors is an effort
to manipulate the metrics that measure the research’s impact
solely to increase the quantitative measures (and hence the sta-
tus) of the research.
To achieve our research goals, we developed an open
source code framework to analyze data from several large-scale
datasets containing >120 million publications, with 528 million
references and 35 million authors, since the beginning of the
19th century (see Results of Author Trends section). This pro-
vided a precise and full picture of how the academic publishing
world has evolved.
The objective of our study was to use this huge quantity of
data to examine the validity of commonly used citation-based
metrics for academic publishing. Specically, we wanted to see
whether Goodhart’s Law applied: are researchers focusing too much
on simply attaining certain target metrics at the expense of high-
quality, relevant research?
The remainder of the paper is organized as follows: in the
Background section, we provide an overview of related studies.
In the Data Description section, we present the datasets used
in this study, and in the Analyses section, we describe the algo-
rithms and experiments used to analyze the study’s data. In the
Results, Discussion, and Conclusions sections, we present and
discuss our results and offer our conclusions from the present
study.
Background
This research is a large-scale scientometrics study (also referred
to as the “science of science” [30]). Scientometrics is the study of
quantitative features and characteristics of scientific research.
In this section, we present studies that analyze changes in aca-
demic publications in recent years (see the Changes in Publica-
tion Trends section), and we provide an overview of common
metrics that measure the impact of published papers (see the
Success Metrics and Citation Trends section).
Changes in publication trends
One prevalent and increasing trend is to publish papers in
preprint repositories, such as arXiv, bioRxiv, and Research Papers
in Economics (RePEc) [1]. For example, the use of arXiv surged
from 4,275 submitted papers in September 2006 to 11,973 papers
in November 2018 [31]. Additionally, >1 million papers are now
downloaded from bioRxiv every month [32]. Another current
trend is to publish papers in mega-journals, such as PLoS One
and Nature’s Scientific Reports. Mega-journals are a new type of scientific journal that publishes peer-reviewed, open-access articles, where the articles have been reviewed for scientific trustworthiness but not for scientific merit. Mega-journals accelerate review and publication times to 3–5 months and usually have high acceptance rates of >50% [3]. In the first quarter of 2017, >11,000 papers were published in PLoS One and Scientific Reports [33].
Another observable trend is that more and more papers are
written by hundreds or even thousands of authors. This phe-
nomenon is known as hyperauthorship [34] or author inflation [35] and is common across research fields, where the ma-
jority of papers with >1,000 authors are produced in the phys-
ical sciences [36]. For example, the recent Laser Interferome-
ter Gravitational-Wave Observatory paper [37] listed >1,000 au-
thors [38]. Robert Aboukhalil measured this trend [39] and dis-
covered that the mean number of authors of academic papers
has increased sharply since the beginning of the 20th century.
Recently, Steven Kelly observed an unexpected increase in the
mean number of authors of papers in the biological sciences [4].
While papers’ mean number of authors has increased over
time, not all the authors have signicantly contributed to the
paper. In addition, honorary and ghost authors are prevalent.
Wislar et al. found such evidence in biomedical journals [40],
and similar ndings were observed by Kennedy et al. [41]
and by Vera-Badillo et al. [42]. The Economist recently pub-
lished an article titled “Why research papers have so many
authors” [43].
Lewison and Hartley [44] analyzed how papers’ titles have
changed over time. They discovered that titles’ lengths have
been increasing, along with the percentage of titles containing
colons. Additionally, Gwilym Lockwood observed that “articles
with positively-framed titles, interesting phrasing, and no word-
play get more attention online” [45].
In addition to paper title lengths increasing, Ucar et al. have
found lengthening reference lists for engineering journal arti-
cles, such as those published in Biomedical Engineering and Infor-
mation Theory [46].
Additionally, many studies have focused on how publication
trends have changed over time, often focusing on specific geographical areas, demographic characteristics, specific research domains, or specific journals. For example, Gálvez et al. [47] used
the Science Citation Index to understand publication patterns in
the developing world. Jagsi et al. [48] studied the gender gap in
authorship of academic medical literature over 35 years. They
discovered that the percentage of rst and last authors who
were women increased from 5.9% and 3.7% in 1970 to 29.3%
and 19.3%, respectively, in 2004. Johnson et al. [49] studied pub-
lication trends in top-tier journals of higher education. Peter
Aldhous analyzed publications in the National Academy of Sci-
ences (PNAS [Proceedings of the National Academy of Sciences of the
United States of America]) journal to consider the influence of an
“old boys’ club” mentality [50]. In 2009, Porter and Rafols [51]
used bibliometric indicators alongside a new index of interdis-
ciplinarity to measure how the degree of interdisciplinarity has
changed between 1975 and 2005 for 6 research domains. Porter
and Rafols’ findings suggest that “science is indeed becoming
more interdisciplinary, but in small steps.”
In 2016, Fanelli and Larivière [52] analyzed the publication
patterns of >40,000 researchers for more than a century. They
observed that for researchers in their early career, both the to-
tal number of papers and the mean number of collaborators in-
creased over time. Fanelli and Larivière also observed that when
the publication rate was adjusted to account for co-authorship,
then “the publication rate of scientists in all disciplines has not
increased overall, and has actually mostly declined” [52]. In 2017,
Dong et al. [53] used a dataset consisting of 89 million publica-
tions to study the evolution of scientic development over the
past century. In their study, Dong et al. examined trends in col-
laborations, citations, and impact. From the collaboration per-
spective, Dong et al. observed that “the average length of a pub-
lication’s author list tripled between 1900 and 2015, suggest-
ing an increasingly collaborative scientic process.” From ana-
lyzing citation patterns, they observed a sharp increase in the
number of references over time, where in recent years, on av-
erage, papers reference 30 other papers. From the perspective
of impact and innovations, Dong et al. observed “diversification of scientific development across the planet over the past
century” [53]. While both our study and that of Dong et al. use
the Microsoft Academic Graph (MAG) dataset, Dong et al. fo-
cused on the advancement of science and the globalization of
scientic collaborations, citations, and innovations. Our study’s
primary goal was to perform an in-depth analysis of how the
world of academic publishing has evolved over the decades.
Moreover, we used additional large-scale datasets (see the Data
Description section) to fully examine how academic publish-
ing has evolved, investigating both micro trends (trends in the
structure of papers) and macro trends (trends within research
elds).
Success metrics and citation trends
Over the years, various metrics have been proposed to mea-
sure papers, journal importance, and authors’ impact. One of the
most straightforward and commonly used measures is to sim-
ply count the researcher’s number of publications. Another com-
mon metric is the citation number, either of a particular paper or
the total citations received by all the author’s papers. However,
not all citations are equal [54]. Moreover, different research fields
have different citation metrics, and therefore comparing them
creates a problem: “The purpose of comparing citation records
is to discriminate between scientists” [55].
One of the best-known and most-used measures to evalu-
ate journals’ importance is the impact factor, devised >60 years
ago by Eugene Gareld [7]. The impact factor measures the fre-
quency in which an average article in a journal has been cited in
a specic year. Over time, the measure has been used to “eval-
uate institutions, scientic research, entire journals, and indi-
vidual articles” [56]. Another common metric to measure a re-
searcher’s output or a journal’s impact is the h-index, which
measures an author’s or a journal’s number of papers that have
at least h citations each [6]. It has been shown that the h-index
can predict academic achievements [57].
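As a concrete illustration of these two definitions, the sketch below shows how an h-index and a two-year impact factor can be computed from citation counts. The function names and toy numbers are ours for illustration only; they are not taken from the study’s code framework.

```python
from typing import Iterable


def h_index(citations: Iterable[int]) -> int:
    """Largest h such that at least h papers have >= h citations each."""
    h = 0
    for rank, count in enumerate(sorted(citations, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def two_year_impact_factor(citations_in_year: int, citable_items_prev_2_years: int) -> float:
    """Citations received in a given year to items published in the previous 2 years,
    divided by the number of citable items published in those 2 years."""
    return citations_in_year / citable_items_prev_2_years


print(h_index([10, 8, 5, 4, 3]))         # 4: four papers have at least 4 citations each
print(two_year_impact_factor(250, 100))  # 2.5
```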
The above measures have been the standard for measuring
academic publishing success. According to recent studies, and
following Goodhart’s Law, these metrics have now become tar-
gets, ripe for manipulation [9,10,58]. All types of manipula-
tive methods are used, such as increasing the number of self-
citations [20], increasing the number of publications by slic-
ing studies into the smallest quantum acceptable for publica-
tion [59], indexing false papers [60], and merging papers on
Google Scholar [61]. Indeed, a recent study by Fong and Wil-
hite [58], which used data from >12,000 responses to a series of
surveys sent to >110,000 scholars from 18 different disciplines,
discovered “widespread misattribution in publications and in re-
search proposals.” Fong and Wilhite’s findings revealed that the
majority of researchers disapprove of this type of metric manip-
ulation, yet many feel pressured to participate; other researchers
blandly state “that it is just the way the game is played” [58].
While many of the aforementioned measures are easy to
compute, they fail to consider the added contribution that is
generally provided by the rst and last authors. This issue be-
comes more cardinal with a sharply increasing number of pa-
pers with hundreds of co-authors. For example, “the h-index
does not work well in the eld of life sciences, where an author’s
position on a paper typically depends on the author’s contribu-
tion” [25]. To tackle this issue, various measures such as the c-
index [24] and revised h-index [25] have been suggested. These
measures give higher weights to authors according to the co-
author order.
To overcome other shortcomings of commonly used mea-
sures, other alternative measures have been suggested. For ex-
ample, the q-index [20] and w-index [26] are alternatives to the
h-index. Likewise, the SCImago Journal Rank (SJR indicator) [62]
and simple citation distributions [63] are offered as alterna-
tives to the impact factor. Additional measures that normalize
citation-based indicators using a paper’s field of study and year
of publication have also been suggested, and these are being
used by several institutions [27].
Senior employees at several leading science publishers called
upon journals to refrain from using the impact factor and sug-
gested replacing it with simple citation distributions [63,64].
Similarly, the altmetric [65] was proposed as an alternative met-
ric to the impact factor and h-index. The altmetric [66] is a gen-
eralization of article-level metrics and considers other aspects
of the impact of the work, such as the number of downloads,
article views, mentions in social media, and more. The altmet-
ric measure has gained in popularity in recent years, and several
large publishers have started providing this metric to their read-
ers. Additionally, Semantic Scholar [67] offers various measures
to judge papers and researchers’ influence. A thorough report
regarding potential uses and limitations of metrics was written
by Wilsdon et al. [8]. Additionally, an overview of the changing
scholarly landscape can be found in the study by Roemer and
Borchardt [5].
Even with their many known shortcomings [8,24,55,68–
70], measures such as the impact factor, citation number, and
h-index are still widely used. For example, the Journal Cita-
tion Reports publishes annual rankings based on journals’ im-
pact factors, and it continues to be widely followed [29]. As an-
other example, the widely used Google Scholar web search en-
gine [71] calculates the h-index and total number of citations of
researchers, as well as journals’ h-index, to rank journals and
conferences [28].
Data Description
The Microsoft Academic Graph dataset
In this study we primarily used the MAG [72], which was released
as part of the 2016 KDD Cup [73]. The large-scale MAG dataset
contains scientic publication records of >120 million papers,
along with citation relationships among those publications as
well as relationships among authors, institutions, journals, con-
ferences, and elds of study. In addition, the MAG dataset con-
tains every author’s sequence number for each paper’s author
list. Furthermore, the dataset contains links between a publi-
cation and the eld or elds of study to which it belongs. The
elds of study are organized in hierarchical rankings with 4 lev-
els, L0L3, where L0 is the highest level, such as a research eld
of computer science, and L3 is the lowest level, such as a re-
search eld of decision tree [2,73]. Since its publication, the
MAG dataset has gained increasing popularity among scholars
who utilize the dataset for scientometric studies [74]. An in-
depth overview of the MAG dataset properties was presented by
Herrmannova and Knoth [2]. According to their analysis of the
MAG dataset, the 5 top elds of study—based on the number of
papers—are physics, computer science, engineering, chemistry,
and biology, with the number of papers ranging from slightly
<15 million in biology to >20 million in physics [2].
Even though the MAG dataset contains papers that were pub-
lished through 2016, we wanted to use years in which the data
were the most comprehensive, so we focused our analysis on
120.7 million papers that were published through the end of
2014. Furthermore, we noted that the dataset contains many pa-
pers that are news items, response letters, comments, and so
forth. Even though these items are important, they can affect a
correct understanding of the underlying trends in scientic pub-
lications. Therefore, we focused our research on a dataset sub-
set, which consists of >22 million papers. This subset contains
only papers that have a Digital Object Identifier (DOI) and ≥5 references. Additionally, while calculating various authors’ proper-
ties, we primarily considered only the 22.4 million authors with
unique author ID values in the selected papers’ subset. (Identi-
fying all the papers by the same author (also known as author
disambiguation [75]) is a challenging task. The MAG dataset pro-
vides a unique author ID for names that were matched to be the
same individual. Recently, Microsoft Academic published a post
titled “How Microsoft Academic uses knowledge to address the
problem of conation/disambiguation,” which explains how Mi-
crosoft Academic performs author disambiguation [76].)
The AMiner dataset
The AMiner open academic graph dataset [77] contains data
from >154 million papers. The dataset contains various pa-
pers’ attributes, such as titles, keywords, abstracts, venues, lan-
guages, and ISSNs. In our study, we primarily used the AMiner
dataset to analyze papers’ abstracts, to estimate papers’ lengths,
and to compare results with those obtained using the MAG
dataset in order to validate the existence of observed patterns
in both datasets. The AMiner is a relatively new dataset, and we
are among the rst to use it for a scientometric study.
The SCImago Journal Ranking dataset
To better understand trends in journal publications, we used
the SCImago Journal Ranking (SJR) open dataset [78,79]. This
dataset contains details of >23,000 journals with unique names
between 1999 and 2016. For each journal, the SJR dataset con-
tains the journal’s SJR value, the number of published papers,
the h-index, and the number of citations in each year. Addition-
ally, the SJR dataset contains the best quartile, ranked from Q1
to Q4, of each journal. Journal quartiles are determined by the
value of the boundary at the 25th, 50th, and 75th percentiles
of an ordered distribution of the SJR indicator. Then, journals
ranked Q1, Q2, Q3, and Q4 reect the top 25%, 2550%, 5075%,
and the bottom 25% of the distribution of the SJR indicator, re-
spectively. The quartile rank is typically used to compare and
rank journals within a given subject category.
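To make the quartile definition concrete, here is a minimal pandas sketch that assigns Q1–Q4 labels from an SJR distribution within one subject category. The column names and toy values are illustrative assumptions, not the actual SJR schema.

```python
import pandas as pd

# Toy SJR values for journals within a single subject category (illustrative numbers).
journals = pd.DataFrame({
    "journal": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "sjr":     [9.1, 5.2, 3.3, 2.8, 1.9, 1.1, 0.7, 0.3],
})

# Quartile boundaries at the 25th, 50th, and 75th percentiles of the ordered SJR values;
# the lowest quarter of the distribution is labeled Q4 and the top quarter Q1.
journals["quartile"] = pd.qcut(journals["sjr"], q=4, labels=["Q4", "Q3", "Q2", "Q1"])
print(journals.sort_values("sjr", ascending=False))
```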
The Join dataset
To match the MAG journal IDs with their corresponding ranking measures, such as h-index and SJR, we joined all 3 datasets in the following manner: first, we joined the MAG and AMiner datasets by matching unique DOI values. Then, we matched ISSN values between the MAG-AMiner joined dataset and the SJR dataset.
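A minimal sketch of this two-step join with pandas merges is shown below. The in-memory data frames and column names such as doi and issn are illustrative assumptions; the released notebooks define the actual schema.

```python
import pandas as pd

# Stand-ins for the three loaded datasets (schemas assumed for illustration).
mag = pd.DataFrame({"paper_id": [1, 2], "doi": ["10.1/a", "10.1/b"], "journal_id": [7, 8]})
aminer = pd.DataFrame({"doi": ["10.1/a", "10.1/b"], "issn": ["1234-5678", "9876-5432"]})
sjr = pd.DataFrame({"issn": ["1234-5678"], "sjr": [3.2], "h_index": [150], "best_quartile": ["Q1"]})

# Step 1: join MAG and AMiner records on their unique DOI values.
mag_aminer = mag.merge(aminer, on="doi", how="inner")

# Step 2: attach the SJR ranking measures by matching ISSN values.
joined = mag_aminer.merge(sjr, on="issn", how="left")
print(joined)
```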
Analyses
Analysis of publication trends
We used our developed code framework (see the Methods sec-
tion) to explore how papers, authors, journals, and research
fields have evolved over time. In the following subsections, we describe the specific calculations that were performed. Moreover, our Supplementary Materials section includes the precise code implementations that were used to obtain most of our results and to create the figures presented throughout the present study.
Paper trends
To explore how the quantity and structure of academic papers
have changed over time, we performed the following: first, we calculated how many papers were published in the MAG dataset every year. Then, we utilized the pycld2 package [80] to detect the language of each paper’s title and calculated the number of papers in each language (a minimal sketch of this step appears after the feature list below). Next, we calculated the following paper features over time:
- Mean number of words in titles and mean number of characters per word (for papers with English titles)
- Percentage of titles that used question or exclamation marks (for papers with English titles)
- Mean number of authors
- Percentage of papers in which authors appear in alphabetical order
- Mean number of words in abstracts
- Mean number of keywords
- Mean number of references
- Length of papers
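As referenced above, a minimal sketch of the title-language detection step with pycld2 follows. The helper function and example titles are illustrative; the released Jupyter Notebooks contain the actual implementation.

```python
import pycld2

def title_language(title: str) -> str:
    """Return the detected language name for a paper title, or 'Unknown' on failure."""
    try:
        is_reliable, _, details = pycld2.detect(title)
    except Exception:
        return "Unknown"
    # details holds (language_name, language_code, percent, score) tuples, best match first.
    return details[0][0] if is_reliable else "Unknown"

print(title_language("Over-optimization of academic publishing metrics"))  # ENGLISH
print(title_language("Optimierung akademischer Publikationsmetriken"))     # likely GERMAN (short strings may be unreliable)
```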
In addition, we utilized the papers with existing field-of-research values, matching the papers to their corresponding fields in order to identify each paper’s top-level (L0) research field. Using the top-level data, we were able to estimate the number of multidisciplinary papers that had >1 L0 research field. Afterwards, we calculated the percentage and total number of papers with no citations after 5 years, as well as the overall percentage of papers with self-citations over time.
Additionally, we selected papers that were published between 1990 and 2009 (we selected only papers having English titles and abstracts, existing author lists, references, and valid lengths; in addition, we checked whether each paper’s title contained question or exclamation marks).
Using the selected papers, we calculated the Spearman correla-
tions among the title lengths, author numbers, reference num-
bers, overall lengths, and number of citations after 5 years. The
results of the above-described calculations are presented in the
Results of Paper Trends section. Moreover, the code implemen-
tation is provided in the “Part III - A: Analyzing Changing Trends
in Academia - Paper Trends” Jupyter Notebook (see the Availabil-
ity of Source Code and Requirements section).
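A minimal sketch of this correlation step is shown below. The toy feature values are for illustration only; the actual analysis used millions of papers drawn from the joined datasets.

```python
import pandas as pd

# Toy per-paper features (illustrative values only).
papers = pd.DataFrame({
    "title_words":  [9, 12, 11, 15, 8, 14],
    "n_authors":    [2, 4, 3, 6, 1, 5],
    "n_references": [12, 30, 25, 41, 9, 38],
    "paper_length": [14, 9, 11, 8, 16, 10],
    "citations_5y": [1, 10, 6, 22, 0, 15],
})

# Pairwise Spearman rank correlations among all paper features.
print(papers.corr(method="spearman").round(2))
```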
Author trends
To study how authors’ behaviors and characteristics have
changed, we performed the following: rst, we calculated how
the number of new authors has changed over time. Second, for
all authors who published their rst paper after 1950, we divided
the authors into groups according to each author’s academic
birth decade, i.e., the decade in which an author published his
or her rst paper. Next, for each group of authors with the same
academic birth decade, we analyzed the following features:
- Mean number of papers the authors in each group published n years after they began their careers, for n ∈ [0, 30]. We performed these group calculations taking into account all papers, as well as only papers with ≥5 references
- Mean number of conference and journal papers each group published n years after they began their careers, for n ∈ [0, 30]
- Mean number of co-authors each group had n years after they began their careers, for n ∈ [0, 30]
- Authors’ median sequence number each group had n years after they began their careers, for n ∈ [0, 60]. Additionally, we calculated the mean percentage of times the authors in each group were first authors
The results of the above-described calculations are presented
in the Results of Author Trends section. Moreover, the code im-
plementation is provided in the “Part III - B: Analyzing Changing
Trends in Academia - Author Trends” Jupyter Notebook (see the
Availability of Source Code and Requirements section).
Journal trends
To investigate how journal publication trends have changed over
time, we used the SJR dataset to calculate the following features
between 1999 and 2016:
- Number of journals with unique journal IDs that were active in each year
- Number of new journals that were published each year
- Mean and maximal number of papers in each journal
Additionally, we utilized the SJR dataset to calculate how the
journals’ best quartile, mean h-index, mean SJR, and mean ci-
tation number [Citation Number/Documents Number (2 years)]
metrics changed between 1999 and 2016.
Furthermore, we selected the 40 journals with the highest SJR
values in 2016 and matched them to their corresponding journal
IDs in the MAG dataset by matching each journal’s ISSN and ex-
act name in the MAG-AMiner joined dataset. (The top journal
name was compared to the journal’s name in the MAG dataset.)
Using this method, we identied 30 unique journal IDs in the
MAG dataset that published 110,825 papers with 5 references.
Then, for the matching journal IDs, we calculated the following features over time, for all papers that were published in the selected top journals (a minimal sketch of the returning-author computation follows the list below):
- First and last authors’ mean career age
- Percentage of papers in which the first author had previously published in one of the top journals
- Percentage of papers in which the last author had previously published in one of the top journals
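A simplified sketch of the returning-author computation referenced above is shown below. The schema is illustrative, and here “returning” means the last author already appeared as last author on an earlier paper in the selected top journals, which simplifies the criterion described in the text.

```python
import pandas as pd

# Toy papers from the selected top journals, with last-author IDs (illustrative).
papers = pd.DataFrame({
    "year":        [2010, 2011, 2012, 2012, 2014, 2014],
    "last_author": ["a", "b", "a", "c", "b", "d"],
}).sort_values("year")

# A last author is "returning" if she or he appeared on an earlier top-journal paper.
papers["returning_last_author"] = papers.groupby("last_author").cumcount() > 0

# Yearly percentage of papers whose last author is returning.
print((papers.groupby("year")["returning_last_author"].mean() * 100).round(1))
```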
The results of the above-described calculations are presented
in the Results of Journal Trends section. Moreover, the code im-
plementation is provided in the “Part III - C: Analyzing Changing
Trends in Academia - Journal Trends” Jupyter Notebook (see the
Availability of Source Code and Requirements section).
Additionally, for >8,400 journals with ≥100 published papers with ≥5 references, we calculated the following features over time:
- Number of papers
- Number of authors
- Top keywords in a specific year
- First/last/all authors’ mean or median academic age
- Mean length of papers
- Percentage of returning first/last/all authors, i.e., those who had published ≥1 prior paper in the journal
We developed a website with an interactive interface, which
visualizes how the above features changed for each journal (see
the Availability of Supporting Data and Materials section).
Field-of-research trends
We utilized the MAG dataset eld-of-study values and the hi-
erarchical relationship between various elds to match papers
to their research elds in various levels (L0L3). Then, for each
eld of study in its highest hierarchical level (L0), we calculated
the following features over time: number of papers, number of
authors, number of references, and mean number of citations
after 5 years. Next, we focused on the eld of biology, which is in
the L0 level. For all the L1 subelds of biology, we repeated the
same feature calculations as in the previous step. Afterwards,
we focused on genetics. For all the L2 subelds of genetics, we
repeated the same feature calculations as in the previous step.
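A minimal sketch of the per-field aggregation described above is shown below. The toy tables and column names are illustrative assumptions; in MAG, papers, fields of study, and their hierarchy are stored in separate relation tables.

```python
import pandas as pd

# Toy paper-to-L0-field links and per-paper features (illustrative values).
paper_fields = pd.DataFrame({
    "paper_id": [1, 2, 3, 4],
    "l0_field": ["Biology", "Biology", "Computer science", "History"],
})
papers = pd.DataFrame({
    "paper_id":     [1, 2, 3, 4],
    "year":         [2009, 2010, 2010, 2010],
    "n_authors":    [6, 5, 3, 1],
    "n_references": [35, 40, 22, 30],
    "citations_5y": [9, 14, 4, 1],
})

# Per-field, per-year aggregates: paper counts and mean authors/references/citations.
per_field = (
    papers.merge(paper_fields, on="paper_id")
          .groupby(["l0_field", "year"])
          .agg(n_papers=("paper_id", "size"),
               mean_authors=("n_authors", "mean"),
               mean_references=("n_references", "mean"),
               mean_citations_5y=("citations_5y", "mean"))
)
print(per_field)
```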
Additionally, to better understand the differences in citation patterns of various fields of research, we performed the following: for each field of study with ≥100 papers published in 2009, we calculated the following features, using only papers that were published in 2009 and had ≥5 references:
- Number of papers
- Number of authors
- Median and mean number of citations after 5 years
- Maximal number of citations after 5 years
The full features of >2,600 L3 fields of study are presented in Table 1.

Table 1: L3 fields-of-study features in 2009
Parent field of study | Field of study name | Median citations after 5 years | Maximum citations after 5 years | No. of papers | Mean no. of authors
Engineering | Structural material | 61.0 | 1,250 | 174 | 6.14
Biology | Genetic recombination | 50.5 | 451 | 196 | 6.07
Biology | Nature | 48.0 | 5,660 | 4,162 | 6.28
Biology | microRNA | 47.0 | 3,076 | 1,691 | 6.24
Biology | Induced pluripotent stem cell... | 39.0 | 987 | 213 | 6.53
Economics | Signalling | 39.0 | 695 | 1,030 | 5.87
Biology | Genome evolution | 35.5 | 392 | 140 | 5.04
Biology | Non-coding RNA | 35.0 | 1,414 | 375 | 5.39
Biology | Post-transcriptional modification... | 34.0 | 1,414 | 315 | 5.49
Biology | Autophagy | 34.0 | 789 | 381 | 5.71
Mathematics | Finite impulse response | 2.0 | 167 | 337 | 3.00
Computer science | Pixel | 2.0 | 380 | 2,484 | 3.27
Computer science | Ontology | 2.0 | 616 | 733 | 3.35
Computer science | Mesh networking | 2.0 | 62 | 274 | 3.43
Computer science | Camera resectioning | 2.0 | 43 | 114 | 3.13
Computer science | Session initiation protocol... | 2.0 | 116 | 100 | 3.60
Chemistry | Gallium | 2.0 | 73 | 484 | 3.43
Mathematics | Presentation of a group | 2.0 | 91 | 706 | 3.22
Mathematics | Spiral | 2.0 | 80 | 122 | 3.65
Mathematics | Block code | 2.0 | 54 | 281 | 2.83
The results of the above-described calculations are presented
in the Results of Fields-of-Research Trends section. Moreover,
the code implementation is provided in the “Part III - D: Ana-
lyzing Changing Trends in Academia - Research Fields” Jupyter
Notebook (see the Availability of Source Code and Requirements
section).
Results
In the following subsections, we present all the results for the
experiments that were described in the Analysis of Publication
Trends section. Additional results are presented in the Supple-
mentary Materials.
Results of paper trends
In recent years there has been a surge in the number of pub-
lished academic papers, with >7 million new papers each year
and >1.8 million papers with ≥5 references (see Fig. 1). (There
is a decline in the number of papers after 2014, probably due
to missing papers in the MAG dataset, which was released in
2016.) Additionally, by analyzing the language of the papers’ ti-
tles, we observed an increase in papers with non-English titles
(see Fig. 2).
As described in the Paper Trends section, we analyzed how
various properties of academic papers have changed over time
to better understand how papers’ structures have evolved. In
this analysis, we discovered that papers’ titles became longer,
from a mean of 8.71 words in 1900 to a mean of 11.83 words in
2014 (see Fig. 3). Moreover, the mean number of characters per ti-
tle word increased from 5.95 in 1900 to 6.6 in 2014 (see Fig. 3). Ad-
ditionally, we observed that in recent years the percentage of pa-
pers with question or exclamation marks in their titles increased
sharply, from <1% of all papers in 1950 to >3% of all papers in
2013 (see Fig. S2). Furthermore, the use of interrobangs (repre-
sented by ?! or !?) in titles also increased sharply, from 0.0005%
in 1950 to 0.0037% in 2013 (see Fig. S2).
Figure 1: The number of papers over time. The total number of papers has surged
exponentially over the years.
Figure 2: Papers with titles in the top 9 non-English languages (Chinese, French, German, Japanese, Korean, Polish, Portuguese, Russian, and Spanish), 1980–2014. An increasing number of papers have non-English titles.
We explored how the number and order of the author list
have changed over time. The number of authors for papers with ≥5 references more than tripled over the years, from a mean of
1.41 authors to a mean of 4.51 authors per paper between 1900
and 2014 (see Fig. S3). Also, the maximal number of authors for
a single paper in each year increased sharply over time, espe-
cially in recent years (see Fig. S4). In fact, some recent papers
actually listed >3,000 authors. Moreover, we observed that the
percentage of author lists ordered alphabetically decreased in
recent years, from 43.5% of all papers published in 1950 to 21.0%
of all papers published in 2014 (see Fig. S5). Furthermore, we dis-
covered that with a higher number of authors, it is less likely that
the author list will be ordered alphabetically (see Fig. 4). For ex-
ample, in 2014 only 1% of papers with ≥6 authors were ordered
alphabetically.
When calculating how the abstracts of papers have changed
over time, we discovered that the abstract length increased from
a mean of 116.3 words in 1970 to a mean of 179.8 words in 2014
(see Fig. S6). Moreover, with each decade since 1950, the distri-
butions shifted to the right, showing that papers with longer ab-
stracts of 400 and even 500 words have become more common
over time (see Fig. 5). Additionally, we analyzed how the number
of keywords in papers has changed. We discovered that both the
number of papers containing keywords increased, as well as the
mean number of keywords per paper (see Fig. S7).
By estimating the percentage and number of multidisci-
plinary papers over time, we discovered an increase in the num-
ber of multidisciplinary papers until 2010, followed by a sharp
decrease (see Figs 6and S8). After performing further analysis,
we believe the decline in the number of multidisciplinary papers
is a result of papers with missing keywords in the MAG dataset,
such as papers that were published in PLoS One. These papers
have dynamically changing keywords in the online version but
not in the ofine version.
By examining how the number of references has changed
over time, we observed a sharp increase in the mean number of
references per paper (see Fig. S9). In addition, by analyzing the
reference number distributions grouped by publishing decade,
we can observe that higher numbers of references have become
increasingly common. For example, in 1960 few papers had >20
references, but by 2010 many papers had >20 references, and
some >40 references (see Fig. S10).
We also examined how self-citation trends have changed,
and we observed that both the total number of self-citations and
the percentage of papers with self-citations increased substan-
tially (see Fig. S12). Also, the mean number of self-citations per
paper, as well as the maximal number of self-citations in each
year, increased sharply (see Fig. 7). For example, 3.67% of all papers in 1950 contained ≥1 self-citation, while 8.29% contained
self-citations in 2014 (see Fig. S12). Moreover, the maximal num-
ber of self-citations in a single paper increased sharply from 10
self-citations in a paper published in 1950 to >250 self-citations
in a paper published in 2013 (see Fig. 7).
By using the AMiner dataset to analyze how papers’ lengths
have changed, we discovered that the mean and median length
of papers decreased over time (see Fig. 8). The mean length of
a paper was 14.4, 10.1, and 8.4 pages in 1950, 1990, and 2014,
respectively.
By analyzing citation patterns over time, we discovered that
the percentage of papers with no citations other than self-
citations after 5 years decreased (see Fig. 9). Nevertheless, 72.1%
of all papers published in 2009, and 25.6% of those with ≥5 references, were still without any citations after 5 years (see Fig. 9).
Moreover, the total number of papers without any citations in-
creased sharply (see Fig. S11).
Additionally, by analyzing the citation distributions of papers
published in different decades, we discovered that citation dis-
tributions changed notably over time (see Fig. 10).
Last, using the properties of >3.29 million papers published between 1950 and 2009, we discovered positive correlations among the papers’ citation numbers after 5 years and the following features: (i) title lengths (rs = 0.1), (ii) number of authors (rs = 0.22), (iii) abstract lengths (rs = 0.26), (iv) number of keywords (rs = 0.15), (v) number of references (rs = 0.48), (vi) paper lengths (rs = 0.13), and (vii) use of question or exclamation marks in the title (rs = 0.022) (see Fig. S13). (Similar correlation values were obtained by calculating the correlations for papers published in a specific year.)
Results of author trends
By analyzing the number of new authors each year, we discov-
ered a sharp increase over time, with several million new au-
thors publishing each year in recent years (see Fig. S14). (How-
ever, it is possible that the same author has several MAG author
IDs.) Additionally, when analyzing the trends grouped by the au-
thors’ academic birth decades, we discovered a substantial in-
crease in the mean number of published papers for the later
birth decades (see Fig. 11). For example, researchers who started
their careers in 1950 published on average 1.55 papers in a period
of 10 years, while researchers who started their careers in 2000
published on average 4.05 papers in the same time frame. Fur-
thermore, we observed that authors who started their careers
after 1990 tended to publish more in conferences in the first
years of their career than their more senior peers who started
their careers in the 1950s or 1970s (see Fig. S15). For example,
researchers who started their careers in the 1970s published
on average 2 conference papers and 1.65 journal papers after
10 years; researchers who started their careers in the 2000s pub-
lished 4 conference papers and 2.59 journal papers in the same
time frame.
We can also observe that the mean number of co-authors
has considerably increased over the decades (Fig. 12). Moreover,
we note that researchers who started their careers in the 1950s
and 1970s had on average only a few co-authors over a period
of 25 years, while researchers who started their careers in the
1990s had >60 co-authors in the same career length of 25 years
(see Fig. 12).
Figure 3: Mean title length over time. A paper’s mean title length increased from 8.71 words to 11.83 words. Moreover, the mean word length increased from 5.95
characters to 6.6 characters per title word.
Figure 4: Percentage of papers with author lists in alphabetical order, grouped
by the number of authors. The more authors, the less likely the authors will be
listed alphabetically in the byline.
Figure 5: Distribution over time of the number of words in abstracts. Over time,
papers’ abstracts have tended to become longer.
Last, by exploring how author sequence numbers evolved,
we discovered that with seniority, the researchers’ median se-
quence number increased (see Fig. S16). Additionally, with se-
niority, the percentage of published papers with the researcher
listed as the rst author decreased (Fig. 13). Moreover, by look-
ing at the decade researchers started their careers, we can see a
sharp decline in the percentages of rst authors (Fig. 13). Overall,
early-career researchers are publishing more in their careers but
appear as rst authors much less than in previous generations.
Results of journal trends
By analyzing journal trends using the SJR and MAG datasets, we
discovered that the number of journals increased greatly over
the years, with 20,975 active ranked journals in 2016 (Fig. 14).
Furthermore, we observed that hundreds of new ranked jour-
nals were published each year (see Figs S17 and S18). In addi-
tion, we discovered that the number of published papers per
journal increased sharply, from a mean of 74.2 papers in 1999
to a mean of 99.6 papers in 2016 (Fig. 14). We also observed that
in recent years, journals that publish thousands of papers have
become more common. For example, in 2016, according to the
SJR dataset, 197 journals published >1,000 papers each.
By exploring how various metrics have changed over time,
we discovered the following: First, over the past 18 years, the
number of papers published in Q1 and Q2 journals more than
doubled, from 550,109 Q1 papers and 229,373 Q2 papers in 1999
to 1,187,514 Q1 papers and 554,782 Q2 papers in 2016 (Fig. 15).
According to the SJR dataset, in 2016, 51.3% of journal papers
were published in Q1 journals and only 8.66% were published
in Q4 journals. Second, the h-index decreased over recent years
from a mean value of 37.4 and median of 23.0 in 1999 to a mean
value of 31.3 and median of 16 in 2016 (see Fig. S19). Third, we
noted that the SJR and the mean number of citations measures
both increased considerably during the past 18 years (see Figs 16
and S20).
Besides the number of papers in top journals doubling be-
tween 2000 and 2014, the number of authors increased substan-
tially (see Fig. S21). (The total number of authors each year was
determined by summing the number of authors in each pub-
lished paper.) Additionally, by calculating the mean academic career ages of first and last authors, we discovered that in recent years the mean academic age has increased notably (Fig. 17). Moreover, when looking at first and last authors who previously published in one of the selected top-30 journals, we discovered that over time the percentage of returning authors increased substantially (see Fig. 18). By 2014, 46.2% of all published papers in top-30 selected journals were published by last authors who had published ≥1 paper in a top-30 selected journal before (Fig. 18).
Figure 6: The number and percentage of multidisciplinary papers over time. Between 1900 and 2010, both the number and percentage of multidisciplinary papers
increased.
Figure 7: The mean and maximal number of self-citations. Both the mean and maximal number of self-citations increased over time.
Figure 8: Papers’ lengths. Both the papers’ mean and median lengths decreased over time. In the right panel, the horizontal line indicates the median, and the box
encloses the interquartile range.
Figure 9: Papers with no citations other than self-citations after 5 years. The percentage of papers with no citations after 5 years decreased; nevertheless, 72.1% of all
papers published in 2009 had no citations after 5 years.
Figure 10: Citation distributions over time. The citation distributions of different
decades show notable changes.
Figure 11: Mean number of papers by authors’ academic birth decades. With
each decade, the rate of paper publication has increased.
By calculating the number of papers, number of authors, authors’ mean age, and percentage of returning authors in each selected top-30 journal, we observed the following: (i) the number of published papers per year increased considerably in the vast majority of the journals (see Fig. S22); (ii) the mean career ages of last authors in the vast majority of the selected journals considerably increased (see Fig. S23); e.g., in Cell, the last authors’ career ages increased from 4.5 years in 1980 to 20 years in 2014 (see Fig. S23); and (iii) the percentage of returning authors in the vast majority of the selected journals increased drastically; e.g., in Nature Genetics, in 86.6% of 2014 papers, ≥1 of the authors had published in the journal before (see Fig. 19).
Figure 12: Mean number of co-authors by academic birth decade. The mean number of co-authors has considerably increased over the decades.
Figure 13: Percentage of times researcher was first author. We can observe that over time, on average, the percentage of senior researchers as first authors declined. Moreover, in the same time intervals, the percentage of times recent generations of researchers were first authors declined compared to older generations.
Figure 14: Number of active journals over time. Over a period of 18 years, from 1999 to 2016, both the number of active journals and the papers per journal increased
greatly.
Figure 15: Journals’ quartile number of papers over time. The number of papers
published in Q1 journals has vastly increased.
Results of elds-of-research trends
By matching each paper to its L0 eld of study and analyzing
each eld’s properties, we discovered substantial differences in
these properties. Namely, we observed the following:
rA large variance in the number of published papers in each
eld. For example, 231,756 papers were published in the eld
of biology in 2010, but only 5,684 were published that year in
the eld of history (see Figs 20: and S24).
rA considerable variance in the mean number of paper au-
thors among the various research elds. For example, the
number of authors in 2010 ranged from a mean of 2.28 au-
thors in the eld of political science to a mean of 5.39 authors
in medicine (see Fig. S25).
rA variance in the papers’ mean number of references in dif-
ferent elds. For example, in 2010, the mean reference num-
ber in the elds of material science and engineering was
<24, while in the elds of biology and history it was >33 (see
Fig. S26).
rA big variance in each L0 eld’s mean and median number
of citations after 5 years. For example, for 2009 papers in the
elds of computer science and political science, the median
number of citations 5 years after publication was 4, while in
biology and environmental science, the median was 9 and 13
citations, respectively (Fig. 21).
By repeating the above analysis for the L1 subfields of biology and for the L2 subfields of genetics, we uncovered similar differences among fields of study. Namely, we observed the following for subfields in the same hierarchical level: (i) significant variance in the mean number of papers (see Figs S27 and S28), (ii) notable variance in the mean number of authors (see Figs S29 and S30), (iii) noteworthy variance in the mean number of references (see Figs S31 and S32), and (iv) vast variance in median citation numbers (see Figs S33 and S34).
Last, by analyzing various features of 2,673 L3 fields of study, we observed a huge variance in the different properties (see Table 1 and Fig. S35). For example, several fields of study, such as gallium (chemistry), ontology (computer science), and presentation of a group (mathematics), had median citation numbers of 2, while other fields of study, such as microRNA and genetic recombination (biology), had median citation numbers of 47 and 50.5, respectively (see Table 1 and Fig. S35).
By analyzing the results presented in the Results section, the following can be noted: first, we observed that the structure of academic papers has changed in distinct ways in recent decades. While the mean overall length of papers has become shorter (see Fig. 8), the title, abstract, and references have become longer (see the Results of Paper Trends section and Figs 3, 5, S3, S6, S9, and S10). Also, the number of papers that include keywords has increased considerably, as has the mean number of keywords in each paper (see Fig. S7). Furthermore, the mean and median number of authors per paper has increased sharply (see Figs S3 and S4).
Discussion
Below we discuss 9 aspects of our study that provide insights
into current academic publishing trends, and we explore the po-
tential impact of our results.
First, these results support Goodhart’s Law as it relates to
academic publishing: the measures (e.g., number of papers,
number of citations, h-index, and impact factor) have become
targets, and now they are no longer good measures. By making
papers shorter and collaborating with more authors, researchers
are able to produce more papers in the same amount of time.
Moreover, we observed that the majority of changes in papers’
Figure 16: The mean number of citations [Citation Number/Documents Number (2 years)] over time. The mean number of citations values have almost doubled in the
past 18 years; additionally, their distributions have changed considerably.
Figure 17: Top-selected journals’ mean first- and last-author ages. Both the first and last authors’ mean ages have increased sharply.
Figure 18: Percentage of papers with returning first or last authors. The percentage of returning first or last top-journal authors increased considerably.
properties are correlated with papers that receive higher num-
bers of citations (see Fig. S13). Authors can use longer titles
and abstracts, or use question or exclamation marks in titles,
to make their papers more appealing. Thus, more readers are
attracted to the paper, and ideally they will cite it, i.e., academic
clickbait [45]. These results support our hypothesis that the ci-
tation number has become a target. Consequently, the proper-
ties of academic papers have evolved in order to win—to score a
bullseye on the academic target.
Figure 19: Mean percentage of returning authors in top-selected journals over time. In most journals, the number of papers with ≥1 author who previously published in the journal increased sharply. In many of the selected journals, the percentage of papers with returning authors was >60%, and in some cases >80%.
It is worth noting that while the study’s results provide evidence that many citation-based measures have become targets, there may also be other factors that influence academic publication trends. For example, the academic hypercompetitive environment itself may prompt an increase in productivity [81], hence increasing the number of papers. However, this claim contradicts the findings of Fanelli and Larivière that researchers’ individual productivity did not increase in the past century [52].
Nevertheless, it is important to keep in mind that there may
be other underlying factors that contributed to the observed re-
sults.
Second, we observed that over time fewer papers list authors alphabetically, especially papers with a relatively high number of authors (see Results of Paper Trends section and Figs 4 and S5). These results may indicate the increased importance of an author's sequence number in the author list, which may reflect the author's contribution to the study. This result is another signal of the increasing importance of measures that rate an individual's research contribution.
Third, from matching papers to their L0 fields of study, we observed that the number of multidisciplinary papers has increased sharply over time (see Fig. 6). It is important to keep in mind that these results were obtained by matching keywords to their corresponding fields of study. Therefore, these results have several limitations: first, not all papers contain keywords. Second, the dataset may not extract keywords from papers in the
Figure 20: L0 Fields-of-study number of papers over time. The numbers of papers in each field of study have increased drastically.
correct manner. For example, we found some papers contained keywords in their online version but not in their offline version (see Results of Paper Trends section). It is also possible that in some fields it is less common to use keywords. Therefore, the papers' keywords may be missing in the datasets, and the presented results may be an underestimate of the actual number of multidisciplinary studies. Nevertheless, we observed a strong trend of increasing numbers of multidisciplinary papers.
Fourth, from seeing sharp increases in both the maximal and mean number of self-citations (see Results of Paper Trends section and Figs 7, 9, 10, and S12), it is clear that citation numbers have become a target for some researchers, who cite their own papers dozens, or even hundreds, of times. Furthermore, we observed a general increasing trend for researchers to cite their previous work in their new studies. Moreover, from analyzing the percentage of papers without citations after 5 years, we observed that a huge quantity of papers (>72% of all papers and 25% of all papers with ≥5 references) have no citations at all (see Fig. 9). Obviously, many resources are spent on papers with limited impact. The lack of citations may indicate that researchers are publishing more papers of poorer quality to boost their total number of publications. Additionally, by exploring papers' citation distributions (see Fig. 10), we can observe that different decades have very different citation distributions. This result indicates that comparing citation records of researchers who published papers during different periods can be challenging.
Figure 21: L0 Fields-of-study median citation number after 5 years. There is notable variance among the L0 fields-of-study median citation numbers.
Fifth, by exploring trends in authors (see Results of Author Trends section and Figs 11, 12, 13, S14, S15, and S16), we observed an exponential growth in the number of new researchers who publish papers. We also observed that young career researchers tend to publish considerably more than researchers in previous generations, using the same time frames for comparison (see Fig. 11). Moreover, young career researchers tend to publish much more of their work in conferences at the beginning of their careers than older researchers did in previous decades (see Fig. S15). We also observed that young career researchers tend to collaborate considerably more at the beginning of their careers than those who are older (see Fig. 12). Furthermore, we see that the mean percentage of researchers as first authors early in their career is considerably less than in previous generations (see Fig. 13). In addition, authors' median sequence numbers typically increase over time, and the rate is typically faster for young career researchers (see Fig. S16). These results emphasize the changes in academia in recent years. In a culture of "publish or perish," researchers publish more by increasing collaboration (and being added to more author lists) and by publishing more conference papers than in the past. However, as can be observed by the overall decline of researchers as first authors, young career researchers may be publishing more in their careers but contributing less to each paper. The numbers can be misleading: a researcher who has 5 "first author" claims but has published 20 papers may be less of a true contributor than one with 4 "first author" claims and 10 published papers.
Sixth, by analyzing journal trends (see Results of Journal
Trends section), we see a rapid increase in the number of ranked
active journals in recent years (see Fig. 14). Moreover, on aver-
age, journals publish more papers than in the past, and dozens
of journals publish >1,000 papers each year (see Figs 14 and S17).
With the increase in the number of active journals, we observed
rapid changes in impact measures: (i) the number of papers pub-
lished in the rst and second quartiles (Q1 and Q2) has increased
sharply, and today the vast majority of papers are published in
these quartiles (see Fig. 15); (ii) the journals’ mean and median
h-index have decreased sharply (see Fig. S18); and (iii) both the
SJR and the mean number of citations have increased consider-
ably (see Figs 16 and S20). With these substantial changes, it is
clear that some measures, such as the use of quartiles and the
h-index, are rapidly losing meaning and value. Moreover, with
the abundance of journals, researchers can “shop around” for a
high-impact journal and submit a rejected paper from one Q1
journal to another Q1 journal, time after time, and then start
the review process again. These repeated reviews for the same
paper waste time, and in the long run the burden of reviewing
papers several times may affect the quality of the reviews.
There are compelling reasons to change the current system.
We need to think about making all reviews open and online.
We should consider the function of published journals; for that
matter, is it even necessary to have journals in a world with
>20,000 journals that publish hundreds or even thousands of
papers each year? We need to seriously evaluate the measures
we use to judge research work. If all these measures have been
devalued to being merely targets, they are no longer effective
measures. Instead, they should be adapted to meet our current
needs and priorities. Moreover, today there are alternative mea-
sures to evaluate researchers’ contributions and journals’ im-
pacts (see Background section). It would be benecial to the aca-
demic community to promote the use of these measures, while
concurrently raising awareness of the many limitations of the
traditional measures that are still commonly used.
Seventh, by focusing on trends in selected top journals, we can observe that these journals have changed considerably in recent years (see Figs 17, 18, 20, S21, and S22). The number of papers in the selected journals has increased sharply, along with the career age of the authors and the percentage of returning authors. The number of submissions to top journals, such as Nature, has increased greatly in recent years [82]; however, many of these journals mainly publish papers in which ≥1 of the authors has previously published in the journal (see Figs 18 and 20). We believe that this situation is also a result of Goodhart's Law. The target is the impact factor, and so researchers are vigorously seeking journals with high impact factors. Therefore, the yearly volume of papers sent to these top journals has considerably increased, and, overwhelmed by the volume of submissions, editors at these journals may choose safety over risk and select papers written only by well-known, experienced researchers.
Eighth, by analyzing how features evolve in the various L0 fields of study using the MAG dataset, we can observe that different fields have completely different sets of features (see Figs 19, 20, 21, S25, and S26 and Table 1). While some fields have hundreds of thousands of papers published yearly, others have only thousands published yearly (see Figs 20 and S22). Moreover, similarly large differences are reflected in other examined fields' features, such as the mean number of references and the mean and median citation numbers (see Figs 21 and S35).
Last, by examining >2,600 research fields of various scales (see Table 1 and Fig. S35), we observed vast diversity in the properties of papers in different domains: some research domains grew phenomenally while others did not. Even research domains in the same subfields presented a wide range of properties, including papers' number of references and median number of citations per research field (see Table 1 and Figs S31, S32, S33, and S34). These results indicate that measures such as citation number, h-index, and impact factor are useless for comparing researchers in different fields, and even for comparing researchers in the same subfield, such as genetics. These results emphasize that using citation-based measures to compare various academic entities is like comparing apples to oranges, and amounts to using them to "discriminate between scientists" [55]. Moreover, using these measures as gauges to compare academic entities can drastically affect the allocation of resources and consequently damage research. For example, to improve their world ranking, universities might choose to invest in faculty for computer science and biology, rather than faculty for less-cited research fields, such as economics and psychology. Moreover, even within a department, the selection of new faculty members can be biased due to using targeted measures, such as citation number and impact factor. A biology department might hire genetics researchers in the field of epigenetics, instead of researchers in the field of medical genetics, due to the higher mean number of citations in the epigenetics field. Over time, this can unfairly favor high-citation research fields at the expense of other equally important fields.
Conclusions
In this study, we performed a large-scale analysis of academic
publishing trends, examining data on >120 million papers and
>20,000 journals. By analyzing this huge dataset, we can observe
that over the past century, especially the past few decades, pub-
lished research has changed considerably, including the num-
bers of papers, authors, and journals; the lengths of papers; and
the mean number of references in specic elds of study (Fig. 22).
While the research environment has changed, many of the
measures to determine the impact of papers, authors, and jour-
nals have not changed. Even with the development of some new
and better measures, the academic publishing world too often
defaults to the traditional measures based on citations, such
as impact factor and citation number, that were used 60 years
ago, in a time before preprint repositories and mega-journals
existed and before academia became such a hypercompetitive
environment. Most important, however, is that these measures
have degenerated into becoming purely targets. Goodhart’s Law
is clearly being illustrated: when a citation-based measure be-
comes the target, the measure itself ceases to be meaningful,
useful, or accurate.
Our study's extensive analysis of academic publications reveals why using citation-based metrics as measures of impact is wrong at the core: First, not all citations are equal; there is a big difference between a study that cites a paper that greatly influenced it and a study that cites multiple papers with only minor connections. Many of the impact measures widely used today do not take into consideration distinctions among the various types of citations. Second, it is not logical to measure a paper's impact based on the citation numbers of other papers that are published in the same journal. In the academic world, there are >20,000 journals that publish hundreds or even thousands of papers each year, with papers written by hundreds or even thousands of authors. It is even less logical to measure a researcher's impact based on a paper co-authored with many other researchers according to the journal in which it is published. Third, as we demonstrated in the Results section, it is wrong to compare studies from different fields, and even to compare papers and researchers within the same parent field of study, due to the many differences in the median and mean number of citations in each field (see Table 1).
As we have revealed in this study, to measure impact
with citation-based measures—that have now become targets—
clearly has many undesirable effects. The number of papers with
limited impact has increased sharply (see Fig. S11), papers may
contain hundreds of self-citations (see Fig. 7), and some top jour-
nals have become “old boys’ clubs” that mainly publish papers
from the same researchers (see Figs 17 and 18). Moreover, us-
ing citation-based measures to compare researchers in differ-
ent elds may have the dangerous effect of allocating more re-
sources to high-citation domains, shortchanging other domains
that are equally important.
We believe the solution to the aforementioned issues is to use
data-science tools and release new and open datasets in order
to promote using existing unbiased measures or to develop new
measures that will more accurately determine a paper’s impact
in a specic research eld. Moreover, it is vital to raise awareness
Figure 22: Measuring success in academic publishing.
of the shortcomings of commonly used measures, such as the
number of citations, h-index, and impact factor. Certain metrics
have been proposed, but the key is to wisely and carefully evalu-
ate new measures to ensure that they will not follow Goodhart’s
Law and end up merely as targets. Researchers do valuable work.
Communicating the work to others is vital, and correctly assess-
ing the impact of that work is essential.
Methods
To analyze the above MAG and AMiner large-scale datasets, we
developed an open source framework written in Python, which
provided an easy way to query the datasets. The framework uses
TuriCreate’s SFrame dataframe objects [83] to perform big-data
analysis on tens of millions of records to calculate how vari-
ous properties have changed over time. For example, we used
SFrame objects to analyze how the mean number of authors and
title lengths evolved. However, while SFrame is exceptionally useful for calculating various statistics over all papers' features, it is less convenient and less computationally cost-effective for
performing more complicated queries, such as calculating the
mean age of the last authors in a certain journal in a specic
year.
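To make the SFrame-based part of the pipeline more concrete, the following minimal sketch shows what such a yearly aggregation might look like. It is not the authors' actual code; the file name and the column names (year, authors_number) are illustrative assumptions.

```python
# Hedged sketch: mean number of authors per publication year with
# TuriCreate's SFrame. File and column names are illustrative only.
import turicreate as tc

# One row per paper, with (at least) a publication year and an author count.
papers = tc.SFrame.read_csv("papers.csv")

# Group the papers by year and average the author counts within each year.
mean_authors_by_year = papers.groupby(
    key_column_names="year",
    operations={"mean_authors": tc.aggregate.MEAN("authors_number")},
)

print(mean_authors_by_year.sort("year"))
```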
To perform more complex calculations, we loaded the
datasets into the MongoDB database [84]. Next, we developed
a code framework that easily let us obtain information on pa-
pers, authors, paper collections, venues, and research fields. The framework supports calculating complex features of the specified object in a straightforward manner. For example, with only a few relatively simple lines of Python code, we were able to calculate the mean number of co-authors per author in a specific year for authors who started their careers in a specific decade. An
overview of our code framework is presented in Fig. S1.
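As a rough illustration of the kind of query this MongoDB-backed layer supports, the sketch below computes the example statistic mentioned above. It is not the authors' framework: the database and collection names and the document fields (first_publication_year, coauthors_by_year) are assumptions made purely for illustration.

```python
# Hedged sketch: mean number of co-authors in 2015 for authors whose first
# paper appeared in the 1990s, assuming a pre-built per-author collection.
from statistics import mean

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
authors = client["academic"]["authors"]  # hypothetical database/collection

# Authors who started their careers in the 1990s.
cohort = authors.find({"first_publication_year": {"$gte": 1990, "$lt": 2000}})

# Collect each author's 2015 co-author count, if recorded.
coauthor_counts = [
    doc["coauthors_by_year"]["2015"]
    for doc in cohort
    if "2015" in doc.get("coauthors_by_year", {})
]

if coauthor_counts:
    print(f"Mean co-authors in 2015: {mean(coauthor_counts):.2f}")
```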
To make our framework accessible to other researchers and
to make this study completely reproducible, we have written
Jupyter Notebook tutorials that demonstrate how the SFrame
and MongoDB collections were constructed from the MAG,
AMiner, and SJR datasets (see Availability of Source Code and
Requirements section and RRID:SCR_016958).
Availability of supporting data and materials
An interactive web interface to explore the study’s data is avail-
able at the project’s website. The web interface provides re-
searchers the ability to interactively explore and better under-
stand how various journals’ properties have changed over time
(see Fig. S36 and RRID:SCR_016958). Additionally, the website
contains the Fields-of-Research Features data.
Supporting data and a copy of the code are available from the
GigaScience GigaDB repository [85].
Availability of source code and requirements
One of the main goals of this study was to create an open
source framework, which provided an easy way to query the
datasets. Our code framework, including tutorials, is available
at the project’s website.
• Project name: Science Dynamics
• Project home page: http://sciencedynamics.data4good.io/
• Operating system(s): Platform independent
• Programming language: Python
• Other requirements: Python 2.7, MongoDB, TuriCreate Python Package
• License: MIT License
• RRID:SCR_016958
Additional les
Additional Figure S1: Overview of the code framework. The
datasets are loaded into SFrame objects and MongoDB collec-
tions. The SFrame objects are used mainly to obtain general
insights by analyzing tens of millions of papers and author
records. The MongoDB collections are used to construct Paper
and Author objects that can be used to analyze more compli-
cated statistics for specific venues and research fields with usu-
ally hundreds of thousands of records.
Additional Figure S2: Percentage of titles with question or ex-
clamation marks. The percentage of papers with question or ex-
clamation marks in their titles increased over time, as well as the
percentage of titles with interrobangs (represented by ?! or !?).
Additional Figure S3: Mean number of authors over time.
There has been an increase in the mean number of authors, es-
pecially in recent decades.
Additional Figure S4: Maximal number of authors over time.
In recent years the maximal number of authors per paper in-
creased sharply from 520 authors in 2000 to >3,100 authors in
2010.
Additional Figure S5: Percentage of author lists in alphabeti-
cal order. There has been a decline in the number of author lists
organized in alphabetical order.
Additional Figure S6: Mean length of abstracts. Since 1970
there has been an increase in abstracts’ mean number of words.
Additional Figure S7: Keyword trends. Both the number of papers with keywords and the mean number of keywords per paper have increased.
Additional Figure S8: Mean number of fields of study over time. Over time both the mean numbers of L0 and L1 fields of study per paper increased considerably. We believe the decrease in the mean number of L0 and L1 fields is a direct result of the
decrease in the number of papers with keywords in the same
years (see the Results of Paper Trends section).
Additional Figure S9: Mean number of references over time.
Over time, the mean number of references sharply increased.
Additional Figure S10: Distributions over time of references
in papers. Over time, papers with a relatively high number of
references have become more common.
Additional Figure S11: Total number of papers with no cita-
tions after 5 years. The number of papers in this category in-
creased sharply over time.
Additional Figure S12: Total number of self-citations and per-
centage of papers with self-citations. We can observe that over
time both the total number of self-citations as well as the per-
centage of papers with self-citations increased significantly.
Additional Figure S13: Spearman correlation heat map for
papers’ properties. We can observe positive correlations among
papers’ various structural properties and the papers’ total num-
ber of citations after 5 years.
Additional Figure S14: New authors over time. The number
of authors, with unique MAG author IDs, who published their
first paper each year.
Additional Figure S15: Authors’ mean number of conference
and journal papers over time. The mean publication rate of both
journal and conference papers increased with every decade.
Additional Figure S16: Authors’ median sequence number
over time. We can see that over time the median sequence num-
bers increased; i.e., senior researchers tend to have higher se-
quence numbers.
Additional Figure S17: Number of journals over time accord-
ing to the MAG dataset. There has been a drastic increase in the
number of journals since the 1960s.
Additional Figure S18: Number of new journals by year. Hun-
dreds of new ranked journals are being published each year.
Additional Figure S19: Journals’ h-index mean and median
values. Over time both the mean and median values of the jour-
nals’ h-index measures decreased.
Additional Figure S20: SJR values over time. We can observe
that over time both the mean and median SJR values increased.
Additional Figure S21: Top journals’ number of papers and
authors over time. We can observe that both the number of pa-
pers and authors increased sharply in recent years.
Additional Figure S22: Top selected journals’ number of pa-
pers over time. In the vast majority of the selected journals the
number of published papers with ≥5 references increased con-
siderably over time.
Additional Figure S23: Top selected journals’ mean author
career age over time. In the vast majority of the selected jour-
nals, the mean age of authors, especially last authors, increased
greatly over time.
Additional Figure S24: L0 Fields-of-study number of papers
over time. We can observe the large diversity in the number of
papers published in each L0 research eld.
Additional Figure S25: L0 Fields-of-study mean authors num-
ber. We can observe a variation in the mean number of authors
across the various research fields.
Additional Figure S26: L0 Fields-of-study mean references
numbers. We can observe variance among the reference num-
bers in different elds.
Additional Figure S27: Biology L1-subfields number of papers over time. We can observe a large variance in the number of papers over time in the various biology subfields.
Additional Figure S28: Genetics L2-subfields number of papers over time. We can observe a large variance in the number of papers over time in the various genetics subfields.
Additional Figure S29: Biology L1-subfields mean number of authors over time. We can observe a variance in the mean number of authors over time in the various biology subfields.
Additional Figure S30: Genetics L3-subfields mean number of authors over time. We can observe a significant variance in the mean number of authors over time in the various genetics subfields.
Additional Figure S31: Biology L1-subfields mean number of references over time. We can observe a variance in the mean number of references over time in the various biology subfields.
Additional Figure S32: Genetics L2-subfields mean number of references over time. We can observe a significant variance in the mean number of references over time in the various genetics subfields.
Additional Figure S33: Biology L1-subfields median number of 5-year citations over time. We can observe a variance in the median number of citations over time in the various biology subfields.
Additional Figure S34: Genetics L2-subfields median number of 5-year citations over time. We can observe a significant variance in the median number of citations over time in the various genetics subfields.
Additional Figure S35: L3 Fields-of-study median 5-year citation distributions by parent fields. We can observe the high variance among the L3 fields-of-study median citation numbers.
Additional Figure S36: Interactive website. We have developed an interactive website at http://sciencedynamics.data4good.io/ that makes it possible to view and interact directly with the study's data.
Abbreviations
AWS: Amazon Web Services; DOI: Digital Object Identifier; ISSN:
International Standard Serial Number; MAG: Microsoft Aca-
demic Graph; PNAS: Proceedings of the National Academy of Sci-
ences of the United States of America; RePEc: Research Papers in
Economics; SJR: SCImago Journal Rank.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
Both M.F. and C.G. conceived the concept of this study and de-
veloped the methodology. M.F. developed the study’s code and
visualization and performed the data computational analysis.
C.G. supervised the research.
Funding
This research was supported by the Washington Research Foun-
dation Fund for Innovation in Data-Intensive Discovery, the
Moore/Sloan Data Science Environments Project at the Univer-
sity of Washington, the Amazon Web Services (AWS) Cloud Cred-
its for Research, and the Microsoft Azure Research Award.
Acknowledgments
First and foremost, we thank the AMiner, Microsoft Academic
Graph, and SJR teams for making their datasets available online.
Additionally, we thank the AWS Cloud Credits for Research. We
also thank the Washington Research Foundation Fund for Inno-
vation in Data-Intensive Discovery, the Moore/Sloan Data Sci-
ence Environments Project at the University of Washington, and
Microsoft Azure Research Award for supporting this study. Fur-
thermore, we wish to thank the reviewers and the editors for
their insightful comments, which we feel have substantially im-
proved our paper. In addition, we wish to thank Lior Rokach and
Yuval Shahar for their helpful suggestions.
We also wish to especially thank Carol Teegarden for edit-
ing and proofreading this article to completion and Sean Mc-
Naughton for designing and illustrating the article’s infographic.
References
1. Ware M, Mabe M. The STM report: An overview of scientific and scholarly journal publishing. 2015. https://www.stm-assoc.org/2015_02_20_STM_Report_2015.pdf. Accessed 12 May 2019.
2. Herrmannova D, Knoth P. An analysis of the Mi-
crosoft Academic Graph. D-Lib Mag 2016;22(9/10),
doi:10.1045/september2016-herrmannova.
3. Björk BC. Have the "mega-journals" reached the limits to
growth? PeerJ 2015;3:e981.
4. Kelly S. The continuing evolution of publishing in the biolog-
ical sciences. Biol Open 2018;7(8), doi: 10.1242/bio.037325.
5. Roemer RC, Borchardt R. From bibliometrics to altmet-
rics: A changing scholarly landscape. Coll Res Lib News
2012;73(10):596–600.
6. Hirsch JE. An index to quantify an individual's scientific re-
search output. Proc Natl Acad Sci U S A 2005;102(46):16569–
72.
7. Gareld E. The agony and the ecstasy—the history and
meaning of the journal impact factor. 2005. http://garfield.l
ibrary.upenn.edu/papers/jifchicago2005.pdf.
8. Wilsdon J. The Metric Tide: Independent Review of the Role
of Metrics in Research Assessment and Management. Sage;
2015.
9. Edwards MA, Roy S. Academic research in the 21st cen-
tury: Maintaining scientific integrity in a climate of per-
verse incentives and hypercompetition. Environ Eng Sci
2017;34(1):51–61.
10. Biagioli M. Watch out for cheats in citation game. Nat News
2016;535(7611):201.
11. Campbell DT. Assessing the impact of planned social
change. Eval Program Plann 1979;2(1):67–90.
12. Newton AC. Implications of Goodhart’s Law for monitoring
global biodiversity loss. Conserv Lett 2011;4(4):264–68.
13. Mizen P. Central Banking, Monetary Theory and Practice: Es-
says in honour of Charles Goodhart, vol. 1. Edward Elgar;
2003.
14. Chrystal KA, Mizen PD, Mizen P. Goodhart’s Law: Its origins,
meaning and implications for monetary policy. In: Central
Banking, Monetary Theory and Practice: Essays in honour of
Charles Goodhart. Edward Elgar, 2003: 221–43.
15. Francescani C. NYPD report confirms manipulation of crime stats. Reuters 2012. https://www.reuters.com/article/us-crime-newyork-statistics/nypd-report-confirms-manipulation-of-crime-stats-idUSBRE82818620120309. Accessed 12 May 2019.
16. Kliger AS. Quality measures for dialysis: Time for a balanced
scorecard. Clin J Am Soc Nephrol 2016;11(2):363–68.
17. Šupak Smolčić V. Salami publication: Definitions and examples. Biochem Med (Zagreb) 2013;23(3):237–41.
18. Schofferman J, Wetzel FT, Bono C. Ghost and guest au-
thors: You can’t always trust who you read. Pain Med
2015;16(3):416–20.
19. Head ML, Holman L, Lanfear R et al. The extent
and consequences of p-hacking in science. PLoS Biol
2015;13(3):e1002106.
20. Bartneck C, Kokkelmans S. Detecting h-index manipulation
through self-citation analysis. Scientometrics 2010;87(1):85–
98.
21. Kupferschmidt K. Tide of lies. Science 2018;361(6403):636–41.
22. Haug CJ. Peer-review fraud–hacking the scientific publication
process. N Engl J Med 2015;373(25):2393–95.
23. Dansinger M. Dear plagiarist: A letter to a peer reviewer who
stole and published our manuscript as his own. Ann Intern
Med 2017;166(2):143.
24. Post A, Li AY, Dai JB, et al. c-index and subindices of the h-
index: New variants of the h-index to account for variations
in author contribution. Cureus 2018;10(5):e2629.
25. Romanovsky AA. Revised h index for biomedical research.
Cell Cycle 2012;11(22):4118–21.
26. Wu Q. The w-index: A measure to assess scientific impact
by focusing on widely cited papers. J Am Soc Inf Sci Technol
2010;61(3):609–14.
27. Waltman L, van Eck NJ, van Leeuwen TN, et al. Towards a
new crown indicator: An empirical analysis. Scientometrics
2011;87(3):467–81.
28. Top publications. Google Scholar. https://scholar.google.com/citations?view_op=top_venues. Accessed 13 February 2019.
29. Journal Citation Reports (JCR). https://jcr.clarivate.com/. Ac-
cessed 12 May 2019.
30. Fortunato S, Bergstrom CT, Börner K, et al. Science of science.
Science 2018;359(6379):eaao0185.
31. arXiv Monthly Submission Rates. https://arxiv.org/stats/monthly_submissions. Accessed 20 January 2019.
32. Learn JR. What bioRxiv's first 30,000 preprints reveal about
biologists. Nat News 2019, doi:10.1038/d41586-019-00199-6.
33. Davis P. Scientific Reports overtakes PLoS One as largest megajournal. The Scholarly Kitchen. https://scholarlykitchen.sspnet.org/2017/04/06/scientific-reports-overtakes-plos-one-as-largest-megajournal/. Accessed 9 July 2018.
34. Cronin B. Hyperauthorship: A postmodern perversion or evi-
dence of a structural shift in scholarly communication prac-
tices? J Am Soc Inf Sci Technol 2001;52(7):58–69.
35. Von Bergen C, Bressler MS. Academe’s unspoken ethical
dilemma: Author inflation in higher education. Res High
Educ 2017;32.
36. Mallapaty S. Paper authorship goes hyper. 2018. Nature Index. https://www.natureindex.com/news-blog/paper-authorship-goes-hyper.
37. Abbott BP, Abbott R, Abbott T, et al. Observation of gravita-
tional waves from a binary black hole merger. Phys Rev Lett
2016;116(6):061102.
38. Castelvecchi D. LIGO’s unsung heroes. Nature News 2017.
https://www.nature.com/news/ligo-s-unsung-heroes-1.22786.
39. Aboukhalil R. The rising trend in authorship. The Winnower
2014;2:e141832, doi:10.15200/winn.141832.26907.
40. Wislar JS, Flanagin A, Fontanarosa PB, et al. Honorary and
ghost authorship in high impact biomedical journals: A cross
sectional survey. BMJ 2011;343:d6128.
41. Kennedy MS, Barnsteiner J, Daly J. Honorary and ghost
authorship in nursing publications. J Nurs Scholarsh
2014;46(6):416–22.
42. Vera-Badillo FE, Napoleone M, Krzyzanowska MK, et al. Hon-
orary and ghost authorship in reports of randomised clinical
trials in oncology. Eur J Cancer 2016;66:1–8.
43. The Economist. Why research papers have so many authors. 2016. http://www.economist.com/news/science-and-technology/21710792-scientific-publications-are-getting-more-and-more-names-attached-them-why. Accessed 12 May 2019.
44. Lewison G, Hartley J. What’s in a title? Numbers of words and
the presence of colons. Scientometrics 2005;63(2):341–56.
45. Lockwood G. Academic clickbait: Articles with positively-
framed titles, interesting phrasing, and no wordplay
get more attention online. The Winnower 2016;3,
doi:10.15200/winn.146723.36330.
46. Ucar I, López-Fernandino F, Rodriguez-Ulibarri P, et al.
Growth in the number of references in engineering jour-
nal papers during the 1972–2013 period. Scientometrics
2014;98(3):1855–64.
47. Gálvez A, Maqueda M, Martinez-Bueno M, et al. Scientific publication trends and the developing world: What can the volume and authorship of scientific articles tell us about scientific progress in various regions? Am Sci 2000;88(6):526–33.
48. Jagsi R, Guancial EA, Worobey CC, et al. The “gender gap”
in authorship of academic medical literature–a 35-year per-
spective. N Engl J Med 2006;355(3):281–87.
49. Johnson MR, Wagner NJ, Reusch J. Publication trends in top-
tier journals in higher education. J Appl Res Higher Educ
2016;8(4):439–54.
50. Aldhous P. Scientific publishing: the inside track. Nat News
2014;510(7505):330.
51. Porter A, Rafols I. Is science becoming more interdisci-
plinary? Measuring and mapping six research fields over
time. Scientometrics 2009;81(3):719–45.
52. Fanelli D, Larivière V. Researchers' individual publica-
tion rate has not increased in a century. PLoS One
2016;11(3):e0149504.
53. Dong Y, Ma H, Shen Z, et al. A century of science: Glob-
alization of scientic collaborations, citations, and innova-
tions. In: Proceedings of the 23rd ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining ACM;
2017:1437–46.
54. Yan E, Ding Y. Weighted citation: An indicator of an article’s
prestige. J Am Soc Inf Sci Technol 2010;61(8):1635–43.
55. Lehmann S, Jackson AD, Lautrup BE. Measures for measures.
Nature 2006;444(7122):1003.
56. Gareld E. The meaning of the impact factor. Rev Int Psicol
Clin Salud 2003;3(2):363–9.
57. Hirsch JE. Does the h index have predictive power? Proc Natl
Acad Sci U S A 2007;104(49):19193–8.
58. Fong EA, Wilhite AW. Authorship and citation manipulation
in academic research. PloS One 2017;12(12):e0187394.
59. The cost of salami slicing. Nat Mater 2005;4,
doi:10.1038/nmat1305.
60. Delgado López-Cózar E, Robinson-García N, Torres-Salinas D.
The Google scholar experiment: How to index false papers
and manipulate bibliometric indicators. J Assoc Inf Sci Tech-
nol 2014;65(3):446–54.
61. Van Bevern R, Komusiewicz C, Niedermeier R, et al. H-index
manipulation by merging articles: Models, theory, and exper-
iments. Artif Intell 2016;240:19–35.
62. Falagas ME, Kouranos VD, Arencibia-Jorge R, et al. Compar-
ison of SCImago journal rank indicator with journal impact
factor. FASEB J 2008;22(8):2623–28.
63. Lariviere V, Kiermer V, MacCallum CJ, et al. A simple pro-
posal for the publication of journal citation distributions.
BioRxiv 2016, doi:10.1101/062109.
64. Callaway E. Beat it, impact factor! Publishing elite turns
against controversial metric. Nat News 2016;535(7611):210.
65. Altmetric. https://www.altmetric.com/. Accessed 14 Febru-
ary 2019.
66. Grifn SA, Oliver CW, Murray A. Altmetrics! Can you afford
to ignore it? Br J Sports Med 2017;52(18):1160–1.
67. Semantic Scholar. https://www.semanticscholar.org. Ac-
cessed 14 February 2019.
68. Seglen PO. Why the impact factor of journals should not be
used for evaluating research. BMJ 1997;314(7079):498.
69. Byrne A. Comment: Measure for measure. Nature
2017;546(7666):S22.
70. Hecht F, Hecht BK, Sandberg AA. The journal ”impact factor”:
a misnamed, misleading, misused measure. Cancer Genet Cytogenet 1998;104(2):77–81.
71. Google Scholar. https://scholar.google.com. Accessed 13
February 2019.
72. Sinha A, Shen Z, Song Y, et al. An overview of Microsoft
Academic Service (MAS) and applications. In: Proceedings of
the 24th International Conference on World Wide Web ACM;
2015:243–46, doi:10.1145/2740908.2742839.
73. KDD Cup 2016: Whose papers are accepted the most: towards
measuring the impact of research institutions. 2016. http://
www.kdd.org/kdd-cup/view/kdd-cup-2016/Data.
74. Semantic Scholar - An Overview of Microsoft Academic Service (MAS) and Applications. https://www.semanticscholar.org/paper/An-Overview-of-Microsoft-Academic-Service-(MAS)-and-Sinha-Shen/b6b6d2504fd57d27a0467654fa62169cc7dedbdd?navId=citing-papers. Accessed 14 February 2019.
75. Pitts M, Savvana S, Roy SB, et al. ALIAS: Author Disam-
biguation in Microsoft Academic Search Engine Dataset.
In: Proceedings of the 17th International Conference
on Extending Database Technology (EDBT 2014); 2014,
doi:10.5441/002/edbt.2014.65.
76. How Microsoft Academic uses knowledge to address the problem of conflation/disambiguation. 2018. https://www.microsoft.com/en-us/research/project/academic/articles/microsoft-academic-uses-knowledge-address-problem-conflation-disambiguation/. Accessed 21 January 2019.
77. Tang J, Zhang J, Yao L, et al. ArnetMiner: extraction and
mining of academic social networks. In: Proceedings of
the 14th ACM SIGKDD International Conference on Knowl-
edge Discovery and Data Mining, Las Vegas, NV, 2008. 2008,
doi:10.1145/1401890.1402008.
78. Butler D. Free journal-ranking tool enters citation market.
Nature 2008;451:6.
79. Scimago Journal & Country Rank. https://www.scimagojr.com/journalrank.php. Accessed 14 February 2019.
80. Pycld2 - Python Package. https://pypi.org/project/pycld2/.
Accessed 14 February 2019.
81. Colavizza G, Franssen T, van Leeuwen T. An empirical inves-
tigation of the tribes and their territories: are research spe-
cialisms rural and urban? J Informetr 2019;13(1):105–17.
82. Nature: Editorial Criteria and Processes. https://www.nature.com/nature/for-authors/editorial-criteria-and-processes. Accessed 15 July 2018.
83. Low Y, Gonzalez JE, Kyrola A, et al. Graphlab: A new frame-
work for parallel machine learning. arXiv 2014:1006.4990.
84. MongoDB. http://www.mongodb.com. Accessed 14 February
2019.
85. Fire M, Guestrin C. Supporting data for ”Over-optimization
of academic publishing metrics: observing Goodhart’s Law
in action." GigaScience Database 2019. http://dx.doi.org/10.5524/100587.