An Analysis of Significance of Revised Web Impact
Factor for ranking the websites of state
universities in Sri Lanka
S. T. C. I. Wimaladharma
Department of Computer Science and Technology
Uva Wellassa University
Badulla, Sri Lanka
Email: charitha@uwu.ac.lk
H.M.U. Manjula Herath
Department of Computer Science and Technology
Uva Wellassa University
Badulla, Sri Lanka
Email: herath@uwu.ac.lk
Abstract—The connectivity and accessibility of a university website, and the amount of information it shares, have become competitive factors among the world's universities. The World Wide Web now contributes considerably to the presence of the information published on university websites. Because information is so widely distributed over the World Wide Web, measuring the quality and the quantity of the information each university website shares is a challenging task. Measuring web impact has therefore become one of the most popular mechanisms among researchers. This study compares the significance of the web impact of the state university websites in Sri Lanka, based on link analysis statistics obtained from two well-known search engines, Google and Yahoo!. The analysis uses the Revised Web Impact Factor: the ratio between the number of inlinks (external backlinks) and the number of web pages of the website that are indexed by the search engines (not all pages of the website). The correlation coefficient between the ranks of the resulting Revised Web Impact Factors and the Impact Factors taken from the Webometrics website published by Cybermetrics Lab was calculated at the 90 percent confidence level. If an academic website increases its link density as measured through the Yahoo! search engine, the increase is relatively significant for its Webometrics impact factor, whereas Google indexing shows less relevance to the Webometrics impact factor.
Index Terms—Webometrics, Link Analysis, revised WIF, University Websites, Sri Lanka, Spearman's Rank Correlation
I. INTRODUCTION
During the past few decades, information and knowledge sharing has become one of the important corporate responsibilities of educational institutes all over the world. The World Wide Web (WWW), an information space [1], was initiated with the intention of making global sources of information available using contemporary technologies [2]. Consequently, the WWW, or simply the Web, has become one of the main sources of information on academic and research activities [3], and the quality and availability of the information shared by different universities have become a significant concern. For this reason, Cybermetrics Lab, which is governed by the National Research Council of Spain, introduced a reliable and multidimensional methodology to quantify the degree of presence, impact, openness and excellence of university websites all over the world [4].
The methodology used for evaluating scientific journals by Thomson ISI, previously known as the Institute for Scientific Information (ISI), was adapted to measure the attractiveness of websites [5]. This new approach to studying information resources, structures and technologies on the Web is known as webometrics [6]. Among the several studies on webometrics, the Web Impact Factor (WIF) was introduced to analyze the linkage between websites. The WIF is the link density of a given website, that is, the ratio between the total number of links of the website and the number of its pages indexed by a search engine (publicly accessible pages) at a given time. The total number of links of a website consists of backlinks (inlinks), hyperlinks on other websites that direct visitors to the website; self-links, navigational links that direct visitors from one page to another within the website; and external links (outlinks), hyperlinks that allow site visitors to access other websites [7].
\[
\mathrm{WIF} = \frac{A}{B} \tag{1}
\]
where A = total number of links of a website and B = number of web pages indexed by a search engine (publicly accessible).
The Web Impact Factor measures the average impact per page of a website regardless of the link type. Generally, self-links reflect the logical structure of a website and how its pages are organized on web servers [8]. Self-links are therefore less important than backlinks, since they are usually used to navigate within the website itself [9] rather than to enrich its information [10]. In recent studies, several researchers have recommended excluding self-links from the WIF calculation. The resulting formulation, which calculates the link density of a website without self-links, is defined as the Revised Web Impact Factor (RWIF) [11].
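The difference between the two indicators can be illustrated with a minimal Python sketch. The function names and the link counts below are hypothetical and are not taken from the study; they only show how removing self-links (and outlinks) from the numerator changes the ratio.

```python
def web_impact_factor(inlinks: int, self_links: int, outlinks: int,
                      indexed_pages: int) -> float:
    """WIF: links of every type divided by the number of indexed pages."""
    if indexed_pages <= 0:
        raise ValueError("indexed_pages must be positive")
    return (inlinks + self_links + outlinks) / indexed_pages


def revised_web_impact_factor(inlinks: int, indexed_pages: int) -> float:
    """RWIF: external backlinks only, divided by the number of indexed pages."""
    if indexed_pages <= 0:
        raise ValueError("indexed_pages must be positive")
    return inlinks / indexed_pages


# Hypothetical counts for an imaginary website (illustration only).
inlinks, self_links, outlinks, indexed_pages = 200, 700, 100, 500

wif = web_impact_factor(inlinks, self_links, outlinks, indexed_pages)
rwif = revised_web_impact_factor(inlinks, indexed_pages)

print(f"WIF  = {wif:.2f}")
print(f"RWIF = {rwif:.2f}")
```

In this invented case most of the links are navigational, so the WIF is considerably larger than the RWIF, which is the effect the exclusion of self-links is meant to remove.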
978-1-5090-6132-7/16/$31.00 © 2016 IEEE
TABLE I
WEBOMETRICS RANKS OF SRI LANKAN STATE UNIVERSITIES IN THE LOCAL CONTEXT

| University Name | Webometrics Rank | Impact Rank (Global) | Impact Rank (Local) |
|---|---|---|---|
| University of Colombo | 1 | 3965 | 3 |
| University of Peradeniya | 2 | 6091 | 4 |
| University of Moratuwa | 3 | 3487 | 2 |
| University of Kelaniya | 4 | 6696 | 6 |
| University of Ruhuna | 5 | 8064 | 8 |
| University of Sri Jayewardenepura | 6 | 6582 | 5 |
| University of Jaffna | 7 | 9910 | 10 |
| University of the Visual and Performing Arts | 8 | 1451 | 1 |
| Open University of Sri Lanka | 9 | 9492 | 9 |
| Wayamba University of Sri Lanka | 10 | 8050 | 7 |
| Sabaragamuwa University | 11 | 12424 | 11 |
| Eastern University of Sri Lanka | 12 | 13262 | 12 |
| Rajarata University | 13 | 13898 | 14 |
| South Eastern University of Sri Lanka | 14 | 13300 | 13 |
| Uva Wellassa University | 15 | 15190 | 15 |
A. Objectives
The foremost concern of the study is to examine the importance of the external backlinks that refer to the Sri Lankan state university websites, using the query results of the Google and Yahoo! search engines. The research further aims to
1) Calculate the RWIFs for the Sri Lankan state university websites using Google and Yahoo! search engine results.
2) Determine the Spearman's rank correlation coefficient between the ranks of the RWIFs and the Webometrics impact factors of the fifteen national university websites.
3) Analyze the significance of the Google and Yahoo! based RWIFs of each website for its Webometrics impact factor.
II. METHODOLOGY
1) Data Collection: The official websites of the fifteen Sri Lankan state universities [12] were selected for this study. All the selected websites share .ac.lk as their top-level and second-level domains, which minimizes the effect of differing top-level domains on the study. Two kinds of webometric tools were mainly employed for data collection: Developer Shed, an online open site explorer service [13], and two commercial search engines (Google and Yahoo!). The required data were collected between 1 May 2016 and 15 September 2016 using these tools. In addition to the link data, the impact ranks of the selected websites were obtained from the Webometrics website, published in July 2016 by Cybermetrics Lab, as shown in Table I, where the ranking is based on connectivity criteria [4]. Although the Cybermetrics research group has defined several rankings, such as presence, impact, openness and excellence, this study focuses on the impact ranking of the selected websites.
It is well established that search engines do not index all the web pages of a website; instead, they return estimated figures produced by the search engine algorithms [5]. Several years ago, researchers used the AltaVista search engine for most web-related studies because it offered advanced search facilities that the Google search engine lacked at that time [6]. In 2003 AltaVista was acquired by Yahoo! [7], and as a result the Google and Yahoo! search engines became competitors in providing domain statistics.
When investigating the search queries compatible with each search engine, it was found that different search engines produce different results for the same query [8]. In addition, certain queries are search engine dependent and cannot be used with other search engines. For example, the link and linkdomain search commands are no longer available in the Yahoo! search engine [14]. Therefore, alternative search tools and criteria were adopted. Google supports the linkdomain and host search commands for obtaining the number of inlinks (external backlinks) to a website [15], which is the numerator of the RWIF formula (see formula 2).
The advanced search queries and descriptions of their results are shown in Table II [11]. The total number of indexed pages of each website, the denominator of the RWIF formula, was obtained with the site:domain name search command from the Google and Yahoo! search engines. The dataset was validated for all the selected websites using SEO CHAT, an online link analyzer tool. The comparison result set for the www.uwu.ac.lk website is shown in Table III.
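The query strings described above can be assembled programmatically. The following Python sketch only builds the strings listed in Table II and the site: command; the function name and dictionary keys are hypothetical, and whether a given search engine still honours these operators is an assumption that would need to be checked at query time.

```python
from typing import Dict


def build_queries(domain: str) -> Dict[str, str]:
    """Build the advanced search query strings for one website domain."""
    return {
        # Pages on other domains that link back to the domain (Table II, Google).
        "inlinks": f"linkdomain:{domain} NOT domain:{domain}",
        # Pages whose body contains the domain name as a keyword (Table II, Yahoo!/Bing).
        "mentions": f'inbody:"{domain}"',
        # Pages of the website itself that the search engine has indexed.
        "indexed_pages": f"site:{domain}",
    }


queries = build_queries("www.uwu.ac.lk")
for purpose, query in queries.items():
    print(f"{purpose}: {query}")
```

Keeping the query construction in one place makes it easier to issue identical queries for all fifteen domains, so that any differences in the counts come from the search engines rather than from the wording of the queries.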
As a secondary parameter, the amount of digital information each website shares was counted for the analysis. Accordingly, different formats of digital information, such as Portable Document Format (PDF), Microsoft Word (doc and docx) and Microsoft PowerPoint (ppt and pptx), were taken into account [9].
2) Data Analysis: The Revised Web Impact Factor was calculated as follows [11]:
\[
\mathrm{RWIF} = \frac{A}{B} \tag{2}
\]
where A = inlinks (external backlinks) to the website and B = number of web pages in the website which are indexed by the search engine.
Spearman's Rank Correlation Coefficient:
\[
R = 1 - \frac{6 \sum d^2}{n^3 - n} \tag{3}
\]
where
• d is the difference between ranks
• n is the sample size
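Formula 3 translates directly into a few lines of Python. The sketch below is an illustration with a hypothetical function name and toy rank lists; it assumes that neither list contains tied ranks, which is the condition under which this simplified form of the coefficient holds.

```python
from typing import Sequence


def spearman_rho(ranks_x: Sequence[int], ranks_y: Sequence[int]) -> float:
    """Spearman's rank correlation from two rank lists (no tied ranks assumed)."""
    if len(ranks_x) != len(ranks_y):
        raise ValueError("both rank lists must have the same length")
    n = len(ranks_x)
    if n < 2:
        raise ValueError("at least two observations are required")
    # d is the difference between the paired ranks; sum the squared differences.
    sum_d2 = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d2) / (n ** 3 - n)


# Toy rank lists (not from the study): identical and fully reversed orderings.
identical = spearman_rho([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
reversed_order = spearman_rho([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])

print(identical, reversed_order)
```

Identical orderings give the maximum coefficient and fully reversed orderings give the minimum, which is a quick sanity check on the implementation before it is applied to real rank data.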
The websites were then ranked according to the RWIFs calculated from the Google and Yahoo! datasets respectively. In order to analyze the relationship between the ranks calculated above, Spearman's rank correlation method was used. The reason for selecting this statistical method is
TABLE II
ADVANCED SEARCH QUERIES AND THEIR OUTPUTS

| Command | Description | Supported by |
|---|---|---|
| linkdomain:domain name NOT domain:domain name | Returns number of web pages that link back to domain name | Google |
| inbody:“domain name” | Returns number of web pages containing the keyword domain name irrespective of being a hyperlink | Yahoo!/Bing |
TABLE III
CROSS CHECKED VALUES FROM ANALYZER TOOLS AND SELECTED SEARCH ENGINES

| Data Collection Tool Type | Website | Tool/Command line description | Number of web pages found Google | Number of web pages found Yahoo! |
|---|---|---|---|---|
| Online Link Analyser Tool | http://tools.seochat.com/tools/domain-indexed-pages/ | Returns the total indexed page count for each URL | 443 | 1130 |
| Online Commercial search engines | http://www.google.com and http://www.yahoo.com | site:www.uwu.ac.lk returns a list of indexed pages for a website. | 443 | 1130 |
that it can be applied to ranked datasets coming from any distribution, i.e. the population need not be normally distributed [16].
The correlation coefficient between the ranks of Google
RWIFs and webometrics impact factors was calculated using
the statistics shown in Table IV. Similarly, the correlation coef-
ficient between the ranks of Yahoo! RWIFs and webometrics
impact factors was calculated using the statistics shown in
Table V.
3) Research Hypothesis:
• H1: There is a rank order correlation between the impact factors calculated using search engine statistics and the impact factors available on the Webometrics ranking website.
• H0: There is no rank order correlation between the above impact factors.
4) Results and Discussions: The main intention of the study was to investigate the behavior of the RWIFs of national university websites in Sri Lanka based on the data obtained from the Google and Yahoo! search engines. RWIFs were calculated for the fifteen websites using Google link analysis, as shown in Table IV. The Open University of Sri Lanka achieved the highest RWIF (= 38.30570902), while the University of Jaffna had the lowest (= 0.066188525). The main reason for the large difference between the maximum and the minimum RWIFs is that the University of Jaffna (646 inlinks across 9760 indexed pages) has a much lower link density than the Open University of Sri Lanka (208000 inlinks across 5430 indexed pages). Similarly, Table V shows the RWIFs based on the Yahoo! results. The University of Colombo achieved the maximum (= 3.000000000), whereas the Rajarata University of Sri Lanka had the minimum (= 0.781609195). It was observed that the degree of informativeness plays an important role in attracting inlinks from other websites. If a university website contains a wealth of information that other websites can refer to, other institutional websites naturally tend to link to it.
The calculated RWIFs of the 15 websites under Google and Yahoo! were ranked separately. As a result, there were three sets of ranks for the 15 websites (Google RWIF ranks, Yahoo! RWIF ranks and Webometrics impact ranks), as shown in Table IV and Table V. In order to understand whether there is a relationship between each RWIF rank and the Webometrics Impact Factor ranks, Spearman's rank correlation coefficients between the Google and Webometrics ranks (see equation 4) and between the Yahoo! and Webometrics ranks (see equation 5) were calculated.
Spearman's Correlation Coefficient (Google RWIF ranks vs. Webometrics IF ranks)
\[
R_{G,W} = 1 - \frac{6 \times 448}{15\,(15^2 - 1)} = 0.2 \tag{4}
\]
The critical value of $R_{G,W}$ for N = 15 and α = 0.10 is 0.447, and
\[
|R_{G,W}| = 0.2 < 0.447
\]
i.e. the absolute value of the calculated $R_{G,W}$ is less than the critical value. Therefore, the null hypothesis cannot be rejected at the 90 percent confidence level: there is no significant relationship between the Google RWIFs and the Webometrics impact factors. The correlation between the Google RWIFs and the Webometrics Impact Factors is shown graphically in Figure 1.
Spearman's Correlation Coefficient (Yahoo! RWIF ranks vs. Webometrics IF ranks)
\[
R_{Y,W} = 1 - \frac{6 \times 314}{15\,(15^2 - 1)} = 0.43929 \tag{5}
\]
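Equations 4 and 5, together with the comparison against the critical value, can be checked with a short Python sketch. The function names are hypothetical; the sums of squared rank differences (448 and 314) are the totals of Tables IV and V, and the critical value 0.447 is the one quoted in the text for N = 15 and α = 0.10 rather than a value derived here.

```python
def spearman_from_sum_d2(sum_d2: float, n: int) -> float:
    """Spearman's coefficient from the sum of squared rank differences."""
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


def decide(r: float, critical_value: float) -> str:
    """Reject H0 only when |r| exceeds the critical value."""
    return "reject H0" if abs(r) > critical_value else "fail to reject H0"


N = 15                  # number of university websites
CRITICAL_VALUE = 0.447  # as quoted in the text for N = 15, alpha = 0.10

r_google = spearman_from_sum_d2(448, N)  # total of d^2 in Table IV
r_yahoo = spearman_from_sum_d2(314, N)   # total of d^2 in Table V

for label, r in (("Google", r_google), ("Yahoo!", r_yahoo)):
    print(f"{label}: R = {r:.5f} -> {decide(r, CRITICAL_VALUE)}")
```

Both coefficients fall below the quoted critical value, the Yahoo! one only narrowly, so the decision for the Yahoo! ranks is sensitive to the exact critical value taken from the statistical table.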
TABLE IV
GOOGLE STATISTICS OF THE SELECTED WEBSITES

| University Name | A = inlinks (backlinks to the website) | B = Number of web pages published in the website indexed by SEO Chat | Google RWIF = A/B | G = Google Rank (local) | R = Impact Rank | d² = (G − R)² |
|---|---|---|---|---|---|---|
| University of Colombo | 4260 | 6020 | 0.707641196 | 8 | 3 | 25 |
| University of Peradeniya | 2040 | 10800 | 0.188888889 | 13 | 4 | 81 |
| University of Moratuwa | 51700 | 4790 | 10.79331942 | 2 | 2 | 0 |
| University of Kelaniya | 53000 | 47100 | 1.125265393 | 4 | 6 | 4 |
| University of Ruhuna | 1610 | 4030 | 0.399503722 | 11 | 8 | 9 |
| University of Sri Jayewardenepura | 1910 | 7270 | 0.262723521 | 12 | 5 | 49 |
| University of Jaffna | 646 | 9760 | 0.066188525 | 15 | 10 | 25 |
| University of the Visual and Performing Arts | 235 | 260 | 0.903846154 | 7 | 1 | 36 |
| Open University of Sri Lanka | 208000 | 5430 | 38.30570902 | 1 | 9 | 64 |
| Wayamba University of Sri Lanka | 3270 | 3460 | 0.945086705 | 6 | 7 | 1 |
| Sabaragamuwa University | 1630 | 966 | 1.687370600 | 3 | 11 | 64 |
| Eastern University of Sri Lanka | 1960 | 1970 | 0.994923858 | 5 | 12 | 49 |
| Rajarata University | 334 | 2290 | 0.145851528 | 14 | 14 | 0 |
| South Eastern University of Sri Lanka | 1780 | 2840 | 0.626760563 | 9 | 13 | 16 |
| Uva Wellassa University | 177 | 443 | 0.399548533 | 10 | 15 | 25 |
| Total (Σ) | | | | 120 | 120 | 448 |
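The ranking step behind the G column can be reproduced from the A and B columns of Table IV. The Python sketch below is an illustration only: the helper name is hypothetical, the counts are copied from the table, and the ranking assumes no two websites have exactly the same RWIF.

```python
from typing import Dict

# (A = inlinks, B = indexed pages) for each website, copied from Table IV.
GOOGLE_COUNTS = {
    "University of Colombo": (4260, 6020),
    "University of Peradeniya": (2040, 10800),
    "University of Moratuwa": (51700, 4790),
    "University of Kelaniya": (53000, 47100),
    "University of Ruhuna": (1610, 4030),
    "University of Sri Jayewardenepura": (1910, 7270),
    "University of Jaffna": (646, 9760),
    "University of the Visual and Performing Arts": (235, 260),
    "Open University of Sri Lanka": (208000, 5430),
    "Wayamba University of Sri Lanka": (3270, 3460),
    "Sabaragamuwa University": (1630, 966),
    "Eastern University of Sri Lanka": (1960, 1970),
    "Rajarata University": (334, 2290),
    "South Eastern University of Sri Lanka": (1780, 2840),
    "Uva Wellassa University": (177, 443),
}


def rank_descending(scores: Dict[str, float]) -> Dict[str, int]:
    """Assign rank 1 to the largest score (assumes no tied scores)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


google_rwif = {name: a / b for name, (a, b) in GOOGLE_COUNTS.items()}
google_rank = rank_descending(google_rwif)

for name in sorted(google_rank, key=google_rank.get):
    print(f"{google_rank[name]:2d}  {google_rwif[name]:12.9f}  {name}")
```

The same helper could be applied to the Yahoo! counts in Table V; only the input dictionary would change.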
TABLE V
YAHOO! STATISTICS OF THE SELECTED WEBSITES

| University Name | A = inlinks (backlinks to the website) | B = Number of web pages published in the website indexed by SEO Chat | Yahoo! RWIF = A/B | Y = Yahoo! Rank (local) | R = Impact Rank | d² = (Y − R)² |
|---|---|---|---|---|---|---|
| University of Colombo | 10200 | 3400 | 3.000000000 | 1 | 3 | 4 |
| University of Peradeniya | 11100 | 5320 | 2.086466165 | 5 | 4 | 1 |
| University of Moratuwa | 13000 | 4720 | 2.754237288 | 3 | 2 | 1 |
| University of Kelaniya | 10500 | 10600 | 0.990566038 | 10 | 6 | 16 |
| University of Ruhuna | 4890 | 2290 | 2.135371179 | 4 | 8 | 16 |
| University of Sri Jayewardenepura | 8780 | 3080 | 2.850649351 | 2 | 5 | 9 |
| University of Jaffna | 3070 | 3860 | 0.795336788 | 14 | 10 | 16 |
| University of the Visual and Performing Arts | 1080 | 1300 | 0.830769231 | 13 | 1 | 144 |
| Open University of Sri Lanka | 5100 | 4340 | 1.175115207 | 8 | 9 | 1 |
| Wayamba University of Sri Lanka | 3780 | 3430 | 1.102040816 | 9 | 7 | 4 |
| Sabaragamuwa University | 1650 | 1770 | 0.932203390 | 11 | 11 | 0 |
| Eastern University of Sri Lanka | 1250 | 637 | 1.962323391 | 6 | 12 | 36 |
| Rajarata University | 2040 | 2610 | 0.781609195 | 15 | 14 | 1 |
| South Eastern University of Sri Lanka | 2110 | 2410 | 0.875518672 | 12 | 13 | 1 |
| Uva Wellassa University | 1630 | 1110 | 1.468468468 | 7 | 15 | 64 |
| Total (Σ) | | | | 120 | 120 | 314 |
Fig. 1. Google rank vs Webometrics impact rank (scatter plot; axes: Impact Rank and Google Rank; R² = 0.04)

The critical value of $R_{Y,W}$ for N = 15 and α = 0.10 is 0.447, and
\[
|R_{Y,W}| = 0.43929 < 0.447
\]
Although there is a moderate positive relationship between the Yahoo! RWIF ranks and the Webometrics IF ranks of the 15 Sri Lankan national university websites, the absolute value of the calculated $R_{Y,W}$ (= 0.43929) is just below the critical value (= 0.447). Therefore, the null hypothesis cannot be rejected, and no distinguishable relationship between the Yahoo! RWIF ranks and the Webometrics IF ranks can be concluded for the population. The scatter plot of the ranks of the 15 university websites is shown in Figure 2, which indicates the tendency of the relationship between the two variables.
The above analysis suggests that the Sri Lankan state university websites have a greater impact through the Yahoo! search engine than through the Google search engine. That is, the number of web pages indexed by Yahoo! during the study period is closer to the numbers in the Webometrics study. Nevertheless, when considering the entire set of websites listed in Webometrics, the correlation coefficient is not strong enough to generalize the sample result.
Fig. 2. Yahoo rank vs Webometrics impact rank (scatter plot; axes: Impact Rank and Yahoo Rank; R² = 0.193)
III. CONCLUSION
Even though the impact factor is a preferred criterion when ranking academic websites, it is highly dependent on the search engines that index the pages of the websites. This study finds that the Yahoo! RWIFs of the 15 Sri Lankan state university websites are positively correlated with the Webometrics (local) Impact Factors, although the correlation falls just below the critical value at the 90 percent confidence level. However, the RWIF of a website is a qualitative indicator that helps to measure the visibility of the website through its different types of links. Most importantly, if an academic website aspires to improve its Webometrics impact factor rank through Search Engine Optimization, it would be more effective to submit and index links in the Yahoo! search engine than in the Google search engine.
REFERENCES
[1] M. R. Henzinger, “Hyperlink analysis for the web,” IEEE Internet
Computing, vol. 5, no. 4, pp. 45–50, 2001.
[2] T. Berners-Lee, R. Cailliau, J.-F. Groff, and B. Pollermann, “World-
wide web: the information universe,” Internet Research, vol. 2, no. 3,
pp. 155–159, 2012.
[3] A. Noruzi. (2005) Web impact factors for Iranian universities. [Online].
Available: http://www.webology.org/2005/v2n1/a11.html
[4] (2016) Webometrics, January 2016 edition: 2016.1.1. [Online].
Available: http://www.webometrics.info/en
[5] A. Noruzi, “The web impact factor: a critical review,” The Electronic
Library, vol. 24, pp. 490–500, 2006.
[6] T. Almind and P. Ingwersen, “Informetric analyses on the world wide
web: Methodological approaches to webometrics,” Journal of Documen-
tation, vol. 53, no. 4, pp. 404–426, 1997.
[7] K. Arif and I. Haroon, “Article information: Calculating web impact
factor for university websites of Pakistan,” Electron. Libr., vol. 33, no. 5,
pp. 883–895, 2015.
[8] P. Ingwersen, “The calculation of web impact factors,” Journal of
Documentation, vol. 54, no. 2, pp. 236–243, 1998.
[9] A. Smith, “A tale of two web spaces,” Journal of Documentation, vol. 55,
no. 5, pp. 577–592, 1999.
[10] M. Thelwall, “Web impact factors and search engine coverage,” Journal
of Documentation, vol. 56, no. 2, pp. 185–189, 2000.
[11] K. Arif and I. Haroon, “Article information: Calculating web impact
factor for university websites of Pakistan,” Electron. Libr., vol. 33, no. 5,
pp. 883–895, 2015.
[12] (2016) Universities, universities and higher educational institutions, UGC
Sri Lanka. [Online]. Available: http://www.ugc.ac.lk/en/universities-and-institutes.html
[13] (2016) The developer shed network. [Online]. Available:
https://www.seochat.com/developer-shed/
[14] M. Vijayakumar, “Webometric analysis of university websites in sri
lanka,” vol. 2, no. 3, pp. 155–159, 2012.
[15] A. B. A. Bakar and N. P. . N. Leyni, “Webometric study of world class
universities websites. qualitative and quantitative methods in libraries
(qqml),” Special Issue Bibliometrics and Scientometrics, pp. 105–115,
2015.
[16] R. Chakravarty and S. Wasan, “Webometric analysis of library web-
sites of higher educational institutes (heis) of india: A study through
google search engine,” DESIDOC Journal of Library and Information
Technology, vol. 35, no. 5, pp. 325–329, 2015.