Toll-based access vs pirate access: a webometric study of academic publishers

Raashida Amin, Arshia Ayoub and Sumeera Amin
University of Kashmir, Srinagar, India, and
Zahid Ashraf Wani
Department of Library and Information Science,
University of Kashmir, Srinagar, India
Abstract
Purpose: This paper compares the Web traffic ranking, usage and popularity of the websites of databases of reputed publishers that provide access on a subscription basis, namely, ScienceDirect and Emerald Insight, with those of Sci-Hub, on the basis of data obtained from the Alexa databank (www.alexa.com). Sci-Hub is a website that provides pirated open access to the research literature, where piracy, according to The Economic Times (2020), refers to the unauthorized duplication of copyrighted content.
Design/methodology/approach: In the present study, the collected data were studied quantitatively with the help of descriptive research methodology. The Alexa databank was singled out as the source of data. This study crawled through the Alexa databank on 01.12.2019 and collected relevant data regarding Sci-Hub, ScienceDirect and Emerald Insight using the search terms Sci-hub.tw, Sciencedirect.com and Emeraldinsight.com sequentially. The criteria taken into consideration include global traffic rank, the average number of page views per user, time taken for uploading, bounce rate, percentage of users, the number of in-links and daily time spent on the site.
Findings: The results of this study showed that ScienceDirect has the highest traffic rank and the most in-linking sites among the surveyed databases, but the highest number of page visits and the fastest downloading speed were recorded for Sci-Hub. It was also observed that users spent less time on ScienceDirect and Emerald Insight than on Sci-Hub. This study further observed that Sci-Hub has the lowest bounce rate. Users from both developing and developed economies use Sci-Hub, though the highest number of visitors belongs to the developing nations.
Originality/value: This study provides an overview of the performance of toll-based publishing databases against a pirated database, based on different criteria, through the World Wide Web. Although this study in no way supports or endorses unauthorized and illegal access to knowledge, such data helps in depicting and analyzing how much a particular database is accessed by its users all over the globe and illustrates the time spent by users while accessing a specific database, thus revealing user preferences in information-seeking activities. This study provides an overall view of the adoption of open resources.
Keywords: Open access, Emerald Insight, Global traffic rank, Pirated access, Bounce rate, Sci-Hub, ScienceDirect
Paper type: Research paper
Received 20 December 2020
Revised 2 February 2021
Accepted 12 February 2021
Digital Library Perspectives
© Emerald Publishing Limited
2059-5816
DOI 10.1108/DLP-12-2020-0127
The current issue and full text archive of this journal is available on Emerald Insight at: https://www.emerald.com/insight/2059-5816.htm
Introduction
User experience and perception of a product or service (e-service) is an important aspect of its quality assessment. Generally, the quality of a product or service depends on what a customer or user receives from it (Saeid et al., 2008). The extent of its usability and accessibility are important factors in its evaluation (Lindgren and Jansson, 2013). Thus, a product or service such as a website gets repeated visits from users only when it satisfies their needs and demands. These visits and downloads are used to gauge the visibility and utility of the product or service through Web-log analysis, an application of webometrics. According to Bjorneborn and Ingwersen (2004), webometrics is the study of Web-based phenomena using quantitative techniques and drawing upon informetric methods, mainly for the purpose of ranking, categorizing, comparing and evaluating websites (Islam, 2011). Statistical measures of aspects such as site traffic and download time are used for evaluation purposes (Hung and McQueen, 2004), and these evaluations and assessments help organizations to generate and evolve products and services that are more user-centered.
Internet services have become information sources of prime importance. Information seekers, particularly researchers, aim to access and retrieve information hassle-free. However, the major hurdle in accessing information is the costs and pay-walls that commercial publishers impose for access to their resources. One such well-known publisher is Elsevier, the world's largest scientific publisher, which has about 2,000 journals, publishes almost 350,000 peer-reviewed articles per year (Dave, 2016) and hosts ScienceDirect, a gateway to those journals. Though Elsevier provides subscription-based access, it supports the open science movement to some extent (Rogers, 2017) and provides open access to 250,000 articles on ScienceDirect (ScienceDirect, 2019). However, this small initiative provides no relief to users, and the huge subscription charges disgruntle researchers; this has led more than 15,000 researchers to boycott Elsevier, and open access advocates have been protesting against the company and its subscription-based business model (Dave, 2016).
It has been estimated that three-quarters of scholarly literature is inaccessible because of pay-walls (Bosman and Kramer, 2018). Institutions in developing countries and other smaller institutions struggle most to get access to scholarly literature (Meadows, 2015). Open access initiatives have, to a certain extent, removed the barriers (Piwowar et al., 2018). Similarly, Sci-Hub, a website started in 2011 by Alexandra Elbakyan (Bohannon, 2016a), is liberating publicly funded research literature. It provides universal open access (Milova, 2017) to almost 62 million publications (Greshake, 2017). It allows free downloading of PDF versions of scholarly articles that are otherwise pay-walled or subscription-based on their respective journal sites. Sci-Hub uses institutional networks as proxies to bypass pay-walls and gain access to subscription-based journals (Himmelstein et al., 2018) by tracking leaked authentication credentials of academic institutions (Elbakyan, 2017). Because of these copyright violations, Elsevier filed a civil lawsuit against Sci-Hub in a US court and won US$15m in damages for infringement by Sci-Hub (Himmelstein et al., 2018). However, the owner of Sci-Hub continues to operate it from Russia using a multitude of domain names and IP addresses (Elsevier seeks millions from Sci-Hub for copyright infringement, 2018). Schiermeier (2017) notes that Mat Mckay, a spokesperson for the International Association of Scientific, Technical and Medical Publishers in Oxford, UK, besides warning those violating the rights, stated that Sci-Hub neither fosters scientific achievement nor values researchers' achievements and adds no value to the scholarly community. Further, despite all the outrage over the pricing policies of commercial publishers, the scientific community is still reliant on one symbolic function of publishers, that is, to allocate academic capital (Schmitt, 2014). Elaborating further, Larivière et al. (2015) note that publications in high-impact journals of prestigious publishers such as Elsevier or Springer help researchers gain tenure and grants; these incentives push researchers to publish in these journals, which in turn gives the commercial publishers a grip on the scientific community. With this in view, this study seeks to gauge the popularity, rank and utility of Sci-Hub in comparison with ScienceDirect, a leading information gateway of Elsevier, using Alexa ranking, a website metrics system owned by Amazon (Alexa.com, 2019). It uses the method of Web analytics, which according to the Web Analytics Association is “the measurement, collection, analysis and reporting of internet data for the purpose of understanding and optimizing web-usage” (Fang, 2007).
Literature review
ScienceDirect and Sci-Hub are two popular products used by scientists. While the goal of the former is to expand the boundaries of knowledge for the benefit of humanity, the latter aspires to universal open access to knowledge. Because both aim at knowledge dissemination, it becomes viable to gauge their global standing and ranking. To the best of our knowledge, this study is the first of its kind, and no previous literature comparing fee-based and pirated databases was found. Thus, we have made an attempt to study the literature on more or less similar topics.
Booth and Jansen (2009) affirm the use of metrics, such as number of visitors, number of visits and visit duration, as performance indicators. Besides, Hong (2007) conducted a survey on Korean websites and concluded that website metrics are key to measuring success. Further, Zahran, Al-Nuaim, Rutter and Benyon (2014) note that websites that are more usable have a higher page rank and score higher popularity in Alexa.
Various studies comparing website use through web traffic analysis have been conducted. Accordingly, the literature surveyed has been categorized under various Web-analysis studies as:
Newspaper websites
Naheem (2016) conducted a webometric analysis of Malayalam newspaper websites using Alexa Internet and evaluated them on the basis of traffic rank, page views, speed, bounce percentage, time on site, etc. Odeyemi (2017) conducted a webometric analysis of Nigerian newspaper websites using the Alexa ranking tool and collected data pertaining to eight indicators: traffic rank, page views, bounce percentage, downloading speed, links, time on site, search percentage and users. Of the 17 leading newspapers selected, two, i.e. Vanguard and The Punch, had the highest traffic ranks. The Guardian was found to have the greatest downloading speed. The highest number of average page views per day was found for The Punch and the highest expected daily time spent for New Telegraph. A similar study by Muthuraja and Veerabasavaiah (2018) analyzed the webometrics of Kannada-language newspaper websites using the Alexa Internet tool. The study found that Vijayakarnataka had the highest traffic rank in India, while Udayavani had the highest global traffic rank. Jowkar and Didegah (2010) also evaluated 24 Iranian newspaper websites using correspondence analysis based on Alexa ranking data. They found that Iran newspaper had the highest traffic rank and the largest count of links. Bashirmazandaran was found to be the leading newspaper for foreign users, while Karvakargar had no foreign users. Besides, most of the foreign users were traced to the USA.
University/institutional websites
Shafi and Bhat (2014) conducted a performance and visibility analysis of 21 Indian research institutions on the Web using Alexa. The results reveal that the global traffic ranks, number of page views, number of links and time on site of Indian research institutions are low. The traffic ranks of Indian research institutions differ significantly, while differences in the page ranks were not significant. Further, they found that Indian research institutions' websites have not been able to attract foreign visitors. Similarly, Khan and Idrees (2015) conducted a Web Impact Factor (WIF) analysis of the websites of Pakistani universities, ranked the top five university websites of Pakistan and also compared the websites of universities from India, Bangladesh, Sri Lanka and Indonesia on the basis of WIFs. The webometric tools used for data collection include Open Site Explorer Service (i.e. Developer Shed) and the search engines Google and Bing. A total of 41,960 webpages and 49,740 in-links were traced by the study for the top-ten ranked Pakistani universities, with the University of Punjab (PU) website having the top rank. Although the aggregated Repeated WIFs for Pakistani universities, as gauged by the study, ranked third among the South Asian countries, the study reports a low worldwide impact for the websites.
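For readers unfamiliar with the measure, a Web Impact Factor in its simplest form divides the number of in-links to a site by the number of pages on that site. The sketch below applies that simple ratio to the aggregate counts quoted above, purely as an illustration. The result is not a figure reported by Khan and Idrees (2015), whose exact WIF variant is not described here, and the function name `simple_wif` is hypothetical.

```python
def simple_wif(in_links: int, pages: int) -> float:
    """Simple Web Impact Factor: in-links divided by number of pages."""
    if pages <= 0:
        raise ValueError("pages must be positive")
    return in_links / pages


# Aggregate counts quoted for the top-ten ranked Pakistani universities.
ratio = simple_wif(in_links=49_740, pages=41_960)
print(f"Illustrative aggregate ratio: {ratio:.3f}")
```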
Massive open online courses (MOOCs) platforms
Abu and Jayasekara (2017) conducted a study to evaluate the ten leading MOOC providers using the Alexa ranking tool. The study found Udemy to be the most popular MOOC provider by global ranking and Iversity the least popular. The lowest bounce rate was traced for Open2study and the highest for Codecademy. FutureLearn was found to have the highest daily page views per visitor and Coursera the lowest.
The increasing subscription charges of scientific publications hinder smooth access to them and also hamper public advocacy for policy changes and research funding. For some time, however, the scientific community has been showing resentment against the relentless, exploitative behaviour of the leading for-profit publishers, and academic libraries are increasingly cancelling subscriptions. Even prestigious universities are unable to afford them (How is Sci-Hub affecting academic publishing?, 2018). Harvard University in 2010 (Sample, 2012), the University of Konstanz in 2014 (Vogel, 2014) and the University of California in 2019 (UC and Elsevier, 2019) broke off negotiations and cancelled subscriptions with the scientific publisher Elsevier because they were unable and unwilling to bear its aggressive pricing policy. Ultimately, researchers are pushed to embrace newer and easier means of getting the desired literature. Millions of researchers from developing as well as developed nations opt for Sci-Hub to access content freely (Kaube, 2018). Based on the server log data supplied by Sci-Hub's owner, Alexandra Elbakyan, it was found that Sci-Hub served 28 million documents in a six-month period. The study further reports that 2.6 million download requests came from Iran, 3.4 million from India and 4.4 million from China. Besides, it was estimated that half a million of the downloads provided by Sci-Hub in a one-week period were of Elsevier papers (Bohannon, 2016b). Daniel Himmelstein, a bio-data scientist at the University of Pennsylvania, reports that these results suggest “the beginning of the end” for subscription-based scientific research access avenues (Mc Kenzie, 2017), whereas Justin Spence, partner and co-founder of PSI Ltd. and the IP Registry, arguing against Sci-Hub, states that it instils a false perception that high-impact peer-reviewed research can be created and distributed without any essential cost and that there is no harm in utilizing pirated content (Anderson, 2019).
Thus, we can conclude that web-log analytics is an indicator for gauging the utility, visibility and ranking of websites and that it can be used effectively for comparative analysis of their position and performance.
Problem
The study attempted to compare the web traffic ranking, usage and popularity of the websites of reputed publishers that provide access on a subscription basis, namely, ScienceDirect and Emerald Insight, with those of Sci-Hub, which provides pirated access, on the basis of data obtained from the Alexa databank.
Scope
The scope of this study is confined to the website of Sci-Hub (Sci-hub.tw) and to globally refereed and accredited databases, namely, ScienceDirect (sciencedirect.com) from Elsevier and Emerald Insight (emeraldinsight.com). ScienceDirect and Emerald Insight were selected for the study because they are among the acclaimed and well-known sources of authentic and credible information (Ansari and Raza, 2019; Khan, 2016). ScienceDirect contains the world's largest electronic collection of full-text and bibliographic information on science, technology and medicine (An introduction to ScienceDirect, 2020). It provides access to over a quarter of the world's full-text scientific, technical and medical articles, written by reputed authors and read by researchers from around the globe (The best academic research databases, 2019). Emerald is considered the largest provider in international management development and information science (Craven and Dallas, 2009). The Independent Publishers Guild recently named Emerald Publishing the Ingram Content Group Independent Publisher of the Year, 2020. Emerald was honoured with this title for its continuing commitment to a successful books and journals program, together with the innovation of its online content platform, Emerald Insight (Independent Publishers Guild, 2020).
Objective
The main objectives of the study are as follows:
• to compare the global traffic rank and in-links of the websites under study to gain an insight into their popularity and visibility over the internet;
• to analyze the usage of these sites on the basis of daily pages viewed per visitor, daily time on site, bounce rate and speed of uploading; and
• to identify the top five user countries of these websites.
Methodology
The data was collected on 01.12.2019. The methodology for the current study comprises the following phases:
Phase I
A detailed and comprehensive scan of the literature was carried out to sift relevant studies and gain a better view of the topic in hand. It was found that not much work had been done on this specific topic; nevertheless, studies that could help the investigators refine the modus operandi for the study were identified and drawn upon.
Phase II
To collect the data pertaining to the study, web metric platforms that offer free access to data were reviewed. After reviewing the platforms, the Alexa databank was selected for the collection of pertinent data because it is the largest web metric platform that offers free access to the data required by the study.
Phase III
Relevant queries, i.e. the respective names of the selected databases (Sci-Hub, ScienceDirect and Emerald Insight), were run as keywords in the alexa.com search box, and the required data, in line with the set objectives, was harvested from the Alexa databank (www.alexa.com). The collected data was tabulated, analyzed, interpreted and correlated to reach logical conclusions.
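The tabulation and comparison step described in Phase III can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' actual procedure: the figures are those printed in Tables 1 to 3, while the `SiteMetrics` record and the `best_site` helper are hypothetical names introduced here. The sketch assumes that a lower value is better for traffic rank, bounce rate and upload time, and that a higher value is better for the other criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteMetrics:
    """One row of Alexa figures for a website (hypothetical record type)."""
    name: str
    global_rank: int          # lower is better
    linking_sites: int        # higher is better
    pages_per_visitor: float  # higher is better
    bounce_rate_pct: float    # lower is better
    upload_time_s: float      # lower is better


# Values as printed in Tables 1 to 3 of the paper (collected 01.12.2019).
SITES = [
    SiteMetrics("Sciencedirect.com", 200, 64225, 2.99, 56.20, 1.861),
    SiteMetrics("Sci-hub.tw", 783, 510, 3.49, 23.40, 1.501),
    SiteMetrics("Emeraldinsight.com", 4797, 8216, 2.34, 56.3, 3.85),
]

# Criteria for which a smaller number means a better result. This is an
# assumption made explicit here; the paper applies it implicitly.
LOWER_IS_BETTER = {"global_rank", "bounce_rate_pct", "upload_time_s"}


def best_site(metric: str) -> str:
    """Return the name of the site that leads on the given criterion."""
    chooser = min if metric in LOWER_IS_BETTER else max
    return chooser(SITES, key=lambda site: getattr(site, metric)).name


if __name__ == "__main__":
    for metric in ("global_rank", "linking_sites", "pages_per_visitor",
                   "bounce_rate_pct", "upload_time_s"):
        print(f"{metric:18s} -> {best_site(metric)}")
```

Keeping the direction of each criterion in one place (`LOWER_IS_BETTER`) avoids the common slip of treating a numerically larger rank or bounce rate as the better result.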
Results and discussion
Global traffic rank and inlinks
The study crawled through and compared a pirated database (Sci-Hub) that provides free access with databases (ScienceDirect and Emerald Insight) that provide access on a subscription basis. Different criteria were used for the study. ScienceDirect (200) has the best global traffic rank among the three, followed by Sci-Hub (783) and Emerald Insight (4,797). The exemplary global traffic rank of ScienceDirect is attributed to the fact that it is the gateway to the millions of academic articles published by Elsevier and includes high impact factor titles such as The Lancet, Cell and Tetrahedron (The best academic research databases, 2019).
The study also focuses on in-links, which are links coming into (or pointing to) a webpage and indicate connectivity on the Web. ScienceDirect (64,225) has the highest number of linking-in sites, followed by Emerald Insight (8,216) and Sci-Hub (510). The low number of in-links to Sci-Hub may be owing to the fact that it is considered an illegal firm/portal by dominant governments and corporates globally; therefore, not many sites host links to it. Table 1 offers a detailed view.
Table 1. Global traffic rank and inlinks
Name of website      Global traffic rank   Linking in sites
Sciencedirect.com    200                   64,225
Sci-hub.tw           783                   510
Emeraldinsight.com   4,797                 8,216
Daily pages viewed per visitor and time on site
Another criterion for evaluation is the average number of pages viewed by users. Among the sites surveyed, Sci-Hub (3.49) has the highest number of daily pages viewed per visitor, followed by ScienceDirect (2.99) and Emerald Insight (2.34). Sci-Hub (6:01) also has the highest daily time spent on site, followed by Emerald Insight (3:25) and ScienceDirect (3:01). The large number of daily page visits and the highest daily time spent on the site indicate the significance of Sci-Hub in the present scenario. This can be attributed to the fact that it provides access to the full text of all its contents without any restrictions or limitations, while ScienceDirect and Emerald Insight provide subscription-based access to their contents, which becomes a huge barrier to accessing the required information and ultimately leaves the user disappointed and disinterested. Table 2 offers a detailed overview.
Bounce rate and average time taken for uploading
Bounce rate is the rate at which visitors leave a website without exploring it further or completing a particular activity on it. Sci-Hub (23.40%) has the lowest bounce percentage, followed by ScienceDirect (56.20%) and Emerald Insight (56.3%). The average time of uploading is lowest for Sci-Hub (1.501 s), followed by ScienceDirect (1.861 s) and Emerald Insight (3.85 s). The unrestricted information on Sci-Hub gives its users barrier-free access to all the data available on it. This increases the probability that users will spend more time on Sci-Hub, leading to a low bounce rate. By contrast, when a search engine redirects a user to ScienceDirect or Emerald Insight and neither the user nor the user's institution subscribes to these sources, the user fails to access the required information. This forces the user to leave the site after visiting the first page because access to the required information on that source is restricted. This suggests that researchers consider Sci-Hub a promising alternative despite its breach of the code of conduct. Table 3 offers a clear picture.
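Alexa reports bounce rate as a ready-made percentage, and the paper does not describe how the figure is derived. As a rough illustration only, the sketch below assumes the common web-analytics convention in which a bounce is a visit that views a single page. The function name and the sample visits are hypothetical and are not drawn from the study's data.

```python
def bounce_rate(pages_per_visit: list[int]) -> float:
    """Percentage of visits that viewed exactly one page.

    Assumes the common definition of a bounce as a single-page visit;
    this is not taken from the paper or from Alexa's documentation.
    """
    if not pages_per_visit:
        raise ValueError("at least one visit is required")
    bounces = sum(1 for pages in pages_per_visit if pages == 1)
    return 100.0 * bounces / len(pages_per_visit)


# Hypothetical log: pages viewed in each of ten visits to some site.
sample_visits = [1, 4, 1, 2, 6, 1, 3, 1, 5, 2]
print(f"Bounce rate: {bounce_rate(sample_visits):.1f}%")
```

Under this convention, a visitor who lands on a pay-walled article page and leaves immediately counts as a bounce, which is consistent with the explanation offered above for the higher bounce rates of the subscription-based sites.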
Table 2. Daily pages viewed per visitor and time on site
Name of website      Daily pages viewed/visitor   Daily time on site
Sciencedirect.com    2.99                         3:01
Sci-hub.tw           3.49                         6:10
Emeraldinsight.com   2.34                         3:25
Table 3. Bounce rate and average time taken for uploading
Name of website      Bounce rate (%)   Average time taken for uploading
Sciencedirect.com    56.20             1.861 sec
Sci-hub.tw           23.40             1.501 sec
Emeraldinsight.com   56.3              3.85 sec
Top five user countries
China is the top user country of all three databases. Although China contributes the largest share of Sci-Hub's users (41.1%), India also contributes a larger share of users to Sci-Hub (11.8%) than to ScienceDirect (8.2%) or Emerald Insight (10.3%). It is evident from Table 4 that the major users of Sci-Hub mostly belong to developing countries, which may be attributed to the poor financial conditions of those countries that deprive them of access to highly reputed subscription-based databases. Sci-Hub is a boon for the researchers and students of such countries, as it helps them to access papers of high repute from expensive journals and databases. The higher use of Sci-Hub by users from developing countries is to be expected; what is notable is that it is also used in developed economies such as Japan and the USA. Thus, Sci-Hub has highlighted the plight of researchers facing undue pay-walls the world over and has ushered in universal acceptance of open access. Table 4 provides a comprehensive view.
Table 4. Top five user countries
Database             Country    Users (%)
Sciencedirect.com    China      32.6
Sciencedirect.com    USA        12.3
Sciencedirect.com    India      8.2
Sciencedirect.com    Japan      6.1
Sciencedirect.com    UK         3.6
Sci-hub.tw           China      41.1
Sci-hub.tw           India      11.8
Sci-hub.tw           Iran       4.9
Sci-hub.tw           Japan      4.6
Sci-hub.tw           USA        3.8
Emeraldinsight.com   China      16.6
Emeraldinsight.com   India      10.3
Emeraldinsight.com   USA        7.6
Emeraldinsight.com   UK         7.5
Emeraldinsight.com   Malaysia   5.9
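The country shares in Table 4 lend themselves to a quick cross-check. The following sketch, which uses only the percentages reported in the table, adds up the share of users that the top five countries contribute to each site and lists the countries that appear in all three top-five lists. The variable and function names are illustrative.

```python
# Percentage of users by country, as reported in Table 4.
TOP_FIVE = {
    "Sciencedirect.com": {"China": 32.6, "USA": 12.3, "India": 8.2,
                          "Japan": 6.1, "UK": 3.6},
    "Sci-hub.tw": {"China": 41.1, "India": 11.8, "Iran": 4.9,
                   "Japan": 4.6, "USA": 3.8},
    "Emeraldinsight.com": {"China": 16.6, "India": 10.3, "USA": 7.6,
                           "UK": 7.5, "Malaysia": 5.9},
}


def combined_share(site: str) -> float:
    """Sum of the top-five country shares for a site, rounded to one decimal."""
    return round(sum(TOP_FIVE[site].values()), 1)


def common_countries() -> set[str]:
    """Countries that appear in the top five of every site."""
    country_sets = [set(shares) for shares in TOP_FIVE.values()]
    return set.intersection(*country_sets)


for site in TOP_FIVE:
    print(f"{site}: top five countries account for {combined_share(site)}% of users")
print("In every top five:", sorted(common_countries()))
```

On these figures, the top five countries account for roughly two-thirds of Sci-Hub's users but less than half of Emerald Insight's, and China, India and the USA are the only countries that appear in all three lists.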
Conclusion
The Alexa databank helps to gauge the performance of a website on the World Wide Web based on different criteria. Such data is very useful in depicting how much a particular website is accessed by users all over the globe and the time users spend on it. This impact analysis gives an insight into the position of a website or an online resource among information-seekers and determines the extent of permeation and uptake of the resource by a community. Using Alexa as a tool for analysis, the current study concluded that even though ScienceDirect has the highest global traffic rank, it still lags behind Sci-Hub in terms of page views and downloading speed because Sci-Hub distributes content freely. Moreover, users spent less time on ScienceDirect and Emerald Insight than on Sci-Hub, which suggests that users are more inclined towards Sci-Hub than towards ScienceDirect and Emerald Insight. The study also analyzed the bounce rate and found that Sci-Hub has the lowest bounce rate of the three. Further, ScienceDirect was found to have a greater number of linking-in websites, which helps it achieve a good traffic ranking despite its high bounce rate. Furthermore, Sci-Hub has the highest number of users from developing nations, as it provides free access to the resources of its database without any constraint or restriction. Therefore, commercial players need to rethink and innovate in their publishing business model, or else pirate access and open access may eat up a major chunk of their market in the near future, especially in the developing world.
Scientific knowledge should be considered a public good and needs to be delivered either free of cost or at the minimum possible cost. But huge profiteering by online publishers has, in the words of Suber (2012), turned it into a priced commodity. This prompts researchers to opt even for unauthorized means of accessing and acquiring knowledge. Also, subscriptions to all or most peer-reviewed journals are unlikely to be affordable for an individual researcher or a research institute (Suber, 2012); even Harvard University, known for having the richest libraries in the world, complained in 2012 that some publishers had increased subscription charges by 145% in a span of six years (Manuel, 2018). It even urged the institution's 2,100 staff to make their research freely available and to resign from journals that keep research behind pay-walls, as the price hikes imposed by large publishers cost the library around $3.5m a year, which the University could not afford (Sample, 2012).
Although piracy is considered a crime, and providing access to knowledge through illegal means may harm scholarly communication in the long run and poses a threat not only to publishers but also to academic and research libraries, huge profiteering from a public good is equally a violation and goes against the ethos of science. Thus, acceptable prices and ready access to important lifesaving research from publishers would prove helpful (Anderson, 2019). Besides, stakeholders such as governments, academicians, librarians, funders and publishers should work individually as well as collectively to help break the pay-walls without breaching the codes of ethics of scholarly communication.
Summarized below are some of the measures that can serve the purpose:
• Stringent rules and regulations pertaining to pricing/charging, providing access, etc., should be set, to be followed by online publishers the world over. Besides, some serious initiatives should be taken and implemented. For example, it should be made mandatory for publishers to keep all the literature produced in a country open and freely accessible to the citizens of that country, i.e. without any subscription charges, with particular concern for developing nations, or at least to provide institutions with default open access to all articles authored at those institutions.
• National governments should make efforts and chalk out programmes and policies such as the “one nation one subscription policy” that was recommended by some academicians in India in October 2019 (Mukunth, 2019), and work on implementing them too.
• Academicians should set up departmental and institutional repositories and make it imperative for scientists/researchers to deposit their publications for free access. Besides, researchers should opt for open-access journals for publishing their work.
• Funders should make it imperative that scientists receiving any public or private grants publish their works in open access journals or on open access platforms (European Science Foundation, 2020).
• Publishers need to reform their policies. They should decrease embargo periods to the minimum possible limit, support the concept of universal access to knowledge, reduce subscription costs through effective implementation of open access models and work in collaboration with libraries in the dissemination of scientific work.
Significance
The findings of this study reveal user adoption of and preference for open literature in general and Sci-Hub in particular. While the study does not endorse the ethics of Sci-Hub, it sincerely attempts to serve as a wake-up call for commercial publishers and to make it imperative for all stakeholders to work in unison to mend and modify the burdensome legalities in accessing scientific literature. Besides, this will also help commercial publishers to think beyond charging users directly and to innovate in earning through alternative business models.
Limitations
The limitation of this study is that it is confined to two major fee-based databases and one pirated database. Broadening the scope of the study would give a more comprehensive look into this matter. The other limitation is that only one web metrics tool, i.e. the Alexa databank, was used as a source of data. Future research could use other tools for data collection.
References
Abu, K.S. and Jayasekara, P.K. (2017), "Webometric analysis on leading course providers in MOOC: a study based on Alexa ranking", paper presented at INFLIBNET's convention on automation of libraries in education and research institutions (CALIBER), Chennai, India, available at: http://ir.inflibnet.ac.in/handle/1944/2101
Alexa.com (2019), available at: www.alexa.com
An introduction to ScienceDirect (2020), available at: https://ieconferences.com/an-introduction-to-sciencedirect/
Anderson, R. (2019), "Researcher to reader (R2R) debate: is Sci-Hub good or bad for scholarly communication?" [blog post], available at: https://scholarlykitchen.sspnet.org/2019/04/16/researcher-to-reader-r2r-debate-is-sci-hub-good-or-bad-for-scholarly-communication/
Ansari, N.A. and Raza, M.M. (2019), "Awareness and usage of Emerald Insight database as determinant of research output for researcher scholar of Aligarh Muslim University", Collection Management, Vol. 45 No. 1, p. 45, doi: 10.1080/01462679.2019.1579013.
Bjorneborn, L. and Ingwersen, P. (2004), "Toward a basic framework for webometrics", Journal of the American Society for Information Science and Technology, Vol. 55 No. 14, pp. 1216-1227, doi: 10.1002/asi.20077.
Bohannon, J. (2016a), "The frustrated science student behind Sci-Hub", Science, Vol. 352 No. 6285, p. 511, doi: 10.1126/science.352.6285.511.
Bohannon, J. (2016b), "Who's downloading pirated papers? Everyone", Science, Vol. 352 No. 6285, pp. 508-512, doi: 10.1126/science.aaf5664.
Booth, D. and Jansen, B.J. (2009), "A review of methodologies for analyzing websites", Handbook of Research on Web Log Analysis, IGI Global, Hershey, available at: https://faculty.ist.psu.edu/jjansen/academic/jansen_website_analysis.pdf
Bosman, J. and Kramer, B. (2018), "Open access levels: a quantitative exploration using Web of Science and oaDOI data", PeerJ Preprints, doi: 10.7287/peerj.preprints.3520v1.
Craven, A. and Dallas, G. (2009), "Using what you've got: the development of Emerald Management First", Learned Publishing, Vol. 22 No. 3, pp. 199-205, doi: 10.1087/2009306.
Dave (2016), "The quiet scientific revolution ... bypassing Elsevier paywalls" [blog post], available at: www.skeptical-science.com/science/quiet-scientific-revolution-bypassing-elsevier-paywalls/
Elbakyan, A. (2017), "Some facts on Sci-Hub that Wikipedia gets wrong", Engineuring, available at: https://engineuring.wordpress.com/2017/07/02/some-facts-on-sci-hub-that-wikipedia-gets-wrong/ (accessed 3 January 2019).
Elsevier seeks millions from Sci-Hub for copyright infringement (2018), available at: www.enago.com/academy/elsevier-seeks-millions-fromsci-hub-for-copyright-infringement/
European Science Foundation (2020), "Plan S: making full and immediate open access a reality", available at: www.coalition-s.org
Fang, W. (2007), "Using Google Analytics for improving library website content and design: a case study", Library Philosophy and Practice, Vol. 1 No. 1, doi: 10.7282/T3MK6B6N.
Greshake, B. (2017), "Looking into Pandora's box: the content of Sci-Hub and its usage", F1000Research, Vol. 6, doi: 10.12688/f1000research.11366.1.
Himmelstein, D.S., Romero, A.R., Levernier, J.G., Munro, T.A., McLaughlin, S.R., Tzovaras, B.G. and Greene, C.S. (2018), "Sci-Hub provides access to nearly all scholarly literature", eLife, Vol. 7, p. e32822, doi: 10.7554/eLife.32822.
Hong, I. (2007), "A survey of web site success metrics used by internet-dependent organizations in Korea", Internet Research, Vol. 17 No. 3, pp. 272-290, doi: 10.1108/10662240710758920.
How is Sci-hub affecting academic publishing? (2018), available at: www.enago.com/academy/how-is-sci-hub-affecting-academic-publishing/
Hung, H. and McQueen, J. (2004), "Developing an evaluation instrument for e-commerce web sites from the first-time buyer's viewpoint", Electronic Journal of Information Systems Evaluation, Vol. 7 No. 1, pp. 31-42, available at: www.academia.edu/4250731/Developing_an_Evaluation_Instrument_for_e-Commerce_Web_Sites_from_the_First-Time_Buyer_s_Viewpoint
Independent Publishers Guild (2020), "The independent publishing awards 20", available at: www.independentpublishersguild.com/IPG/Events/IPA/Independent_Publishing_Awards.aspx
Islam, M.A. (2011), "Webometrics study of universities in Bangladesh", Annals of Library and Information Studies, Vol. 58, pp. 307-318, available at: http://nopr.niscair.res.in/bitstream/123456789/13480/4/ALIS%2058%284%29%20307-318.pdf
Jowkar, A. and Didegah, F. (2010), "Evaluating Iranian newspapers' web sites using correspondence analysis", Library Hi Tech, Vol. 28 No. 1, pp. 119-130, doi: 10.1108/07378831011026733.
Kaube, B. (2018), "Scientists should be solving problems, not struggling to access journals", The Guardian, available at: www.theguardian.com/higher-educationnetwork/2018/may/21/scientists-access-journals-researcher-article
Khan, S. (2016), "Use of online databases in the faculties of social sciences and arts in Central universities of Delhi and Uttar Pradesh", PhD thesis, Department of Library and Information Science, Aligarh Muslim University, available at: https://shodhganga.inflibnet.ac.in/bitstream/10603/121708/12/12_chapter3.pdf
Khan, A. and Idrees, H. (2015), "Calculating web impact factor for university websites of Pakistan", The Electronic Library, Vol. 33 No. 5, pp. 883-895, doi: 10.1108/EL-01-2014-0022.
Larivière, V., Haustein, S. and Mongeon, P. (2015), "The oligopoly of academic publishers in the digital era", PLoS One, Vol. 10 No. 6, p. e0127502, doi: 10.1371/journal.pone.0127502.
Lindgren, I. and Jansson, G. (2013), "Electronic services in the public sector: a conceptual framework", Government Information Quarterly, Vol. 30 No. 2, pp. 163-172, doi: 10.1016/j.giq.2012.10.005.
McKenzie, L. (2017), "Sci-hub's cache of pirated papers is so big, subscription journals are doomed, data analyst suggests", Science, doi: 10.1126/science.aan7164.
Manuel, T. (2018), "How scihub is at the forefront of the quest to frame scientific knowledge as public good", The Wire, available at: https://thewire.in/science/how-scihub-is-at-the-forefront-of-the-quest-to-frame-scientific-knowledge-as-public-good (accessed 4 November 2020).
Meadows, A. (2015), "Beyond open: expanding access to scholarly content", The Journal of Electronic Publishing, Vol. 18 No. 3, doi: 10.3998/3336451.0018.301.
Milova, E. (2017), "Alexandra Elbakyan: science should be open to all not behind paywalls", Life Extension Advocacy Foundation, available at: www.leafscience.org/alexandra-elbakyan/
Mukunth, V. (2019), "India will skip plan S, focus on national efforts in science publishing", The Wire, available at: https://thewire.in/the-sciences/plan-s-open-access-scientific-publishing-article-processing-charge-insa-k-vijayraghavan (accessed 4 November 2020).
Muthuraja, S. and Veerabasavaiah, M. (2018), "An evaluation of Kannada newspaper websites using Alexa internet tool: a webometric study", International Journal of Library and Information Studies, Vol. 8 No. 1, pp. 202-209, available at: www.ijlis.org/img/2018_Vol_8_Issue_1/202-209.pdf
Naheem, K.T. (2016), "Malayalam newspaper websites: a webometric study using 'Alexa internet'", International Journal of Digital Library Services, Vol. 6 No. 3, pp. 67-75, available at: www.ijodls.in/uploads/3/6/0/3/3603729/67-75.pdf
Odeyemi, O.J. (2017), "Webometric analysis of Nigerian newspapers websites", International Journal of Digital Library Services, Vol. 7 No. 4, pp. 13-20, available at: www.ijodls.in/uploads/3/6/0/3/3603729/2ijdosl174.pdf
Piwowar, H., Priem, J., Larivière, V., Alperin, J.P., Matthias, L., Norlander, B. and Haustein, S. (2018), "The state of OA: a large-scale analysis of the prevalence and impact of open access articles", PeerJ, Vol. 6, doi: 10.7717/peerj.4375.
Rogers, A. (2017), "It's gonna get a lot easier to break science journal paywalls", available at: www.wired.com/story/its-gonna-get-a-lot-easier-to-break-science-journal-paywalls/
Saeid, M., Ghani, A.A. and Selamat, H. (2008), "Rank-order weighting of web attributes for website evaluation", The International Arab Journal of Information Technology, Vol. 8 No. 1, pp. 30-37, available at: www.ccis2k.org/iajit/PDF/vol.8,no.1/7.pdf
Sample, I. (2012), "Harvard university says it can't afford journal publishers' prices", The Guardian, available at: www.theguardian.com/science/2012/apr/24/harvard-university-journal-publishers-prices
Schiermeier, Q. (2017), "US court grants Elsevier millions in damages from Sci-Hub", Nature, doi: 10.1038/nature.2017.22196.
Schmitt, J. (2014), "Academic journals: the most profitable obsolete technology in history", The Huffington Post blog, available at: www.huffingtonpost.com/jason-schmitt/academic-journals-the-mos_1_b_6368204.html (accessed 2 February 2021).
Science Direct (2019), available at: www.sciencedirect.com/
Shafi, S.M. and Bhat, M.H. (2014), "Performance and visibility of Indian research institutions on the web", VINE: The Journal of Information and Knowledge Management Systems, Vol. 44 No. 4, pp. 537-547, doi: 10.1108/VINE-06-2013-0034.
Suber, P. (2012), Open Access, MIT Press, Cambridge, Mass, available at: http://nrs.harvard.edu/urn-3:HUL.InstRepos:10752204
The best academic research databases (2019), Paperpile, available at: https://paperpile.com/g/academic-research-databases/
The Economic Times (2020), available at: https://economictimes.indiatimes.com/definition/piracy
UC and Elsevier (2019), Office of scholarly communication, University of California, Berkeley, available at: https://osc.universityofcalifornia.edu/open-access-at-uc/publisher-negotiations/uc-and-elsevier/
Vogel, G. (2014), "German University tells Elsevier 'no deal'", Science Insider, available at: http://news.sciencemag.org/people-events/2014/03/german-university-tells-elsevier-no-deal (accessed 2 February 2021).
Corresponding author
Raashida Amin can be contacted at: safarashida@gmail.com