The validity of the SERVQUAL
and SERVPERF scales
A meta-analytic view of 17 years of research
across five continents
François A. Carrillat
HEC Montréal, Montréal, Canada
Fernando Jaramillo
Department of Marketing, University of Texas at Arlington, Arlington,
Texas, USA, and
Jay P. Mulki
Marketing Group, Northeastern University, Boston, Massachusetts, USA
Abstract
Purpose – The purpose is to investigate the difference between SERVQUAL’s and SERVPERF’s
predictive validity of service quality.
Design/methodology/approach – Data from 17 studies containing 42 effect sizes of the
relationships between SERVQUAL or SERVPERF and overall service quality (OSQ) are
meta-analyzed.
Findings – Overall, SERVQUAL and SERVPERF are equally valid predictors of OSQ. Adapting the
SERVQUAL scale to the measurement context improves its predictive validity; conversely, the
predictive validity of SERVPERF is not improved by context adjustments. In addition, measures of
service quality gain predictive validity when used in: less individualistic cultures, non-English
speaking countries, and industries with an intermediate level of customization (hotels, rental cars, or
banks).
Research limitations/implications – No studies using non-adapted scales were
conducted outside of the USA, making it impossible to disentangle the impact of scale adaptation vs
contextual differences on the moderating effect of language and culture. More comparative studies on
the usage of adapted vs non-adapted scales outside the USA are needed before settling this issue
meta-analytically.
Practical implications – SERVQUAL scales need to be adapted to the study context more so
than SERVPERF. Owing to their equivalent predictive validity, the choice between SERVQUAL and
SERVPERF should be dictated by diagnostic purposes (SERVQUAL) vs the need for a shorter instrument
(SERVPERF).
Originality/value – Because of the high statistical power of meta-analysis, these findings could be
considered a major step toward ending the debate over whether SERVPERF is superior to SERVQUAL
as an indicator of OSQ.
Keywords Services, SERVQUAL, Culture, Quality
Paper type Research paper
Over the years, marketing researchers have reached consensuses on several issues related
to the domain of services. First, as the economy has become mostly service-based,
researchers now consider the marketing discipline as being service dominated.
Received 9 January 2006
Revised 4 February 2007
Accepted 17 May 2007
International Journal of Service Industry Management
Vol. 18 No. 5, 2007
pp. 472-490
© Emerald Group Publishing Limited
0956-4233
DOI 10.1108/09564230710826250
Consumers in OECD countries spend more on services than on tangible goods (Martin,
1999). Indeed, service activities constitute about 70 percent of the GDP of OECD countries (OECD, 2005),
and this trend is expected to continue in the coming decade. The globalization of services
marketing has presented both academics and practitioners with challenges and opportunities
in this area (Javalgi et al., 2006). Reflecting this changing emphasis, services marketing
has become a well-established field of academic inquiry and now represents an alternative
paradigm to the marketing of goods (Lovelock and Gummesson, 2004).
Researchers also agree that a central topic in service research is service quality (SQ),
which is a critical determinant of business performance as well as firms’ long-term
viability (Bolton and Drew, 1991; Gale, 1994). This is because SQ leads to customer
satisfaction which in turn has a positive impact on customer word-of-mouth,
attitudinal loyalty, and purchase intentions (Gremler and Gwinner, 2000). The view
that SQ results from customers’ evaluation of the service encounter prevails in the
literature (Cronin and Taylor, 1992; Parasuraman et al., 1985). Under this perspective,
researchers further agree that SQ is best represented as an aggregate of the discrete
elements from the service encounter such as reliability, responsiveness, competence,
access, courtesy, communication, credibility, security, understanding, and tangible
elements of the service offer (Cronin and Taylor, 1992; Dabholkar et al., 2000;
Parasuraman et al., 1985).
On the other hand, the question of the operationalization of SQ has continued to evoke
discussion. This discussion has been primarily centered on two important issues. The
first relates to the debate of whether SERVQUAL or SERVPERF should be used for
measuring SQ (Cui et al., 2003; Hudson et al., 2004; Jain and Gupta, 2004; Kettinger and
Lee, 1997; Mukherje and Nath, 2005; Quester and Romaniuk, 1997). SERVQUAL,
grounded in the Gap model, measures SQ as the calculated difference between customer
expectations and performance perceptions of a service encounter (Parasuraman et al.,
1988, 1991). Cronin and Taylor (1992) challenged this approach and developed the
SERVPERF scale which directly captures customers’ performance perceptions in
comparison to their expectations of the service encounter. In spite of recent attempts in
the literature toward settling this issue, the SERVQUAL-SERVPERF debate has never
been so relevant. In fact, numerous authors have supported the view that SERVPERF is
a better alternative than SERVQUAL (Babakus and Boller, 1992; Brady et al., 2002;
Brown et al., 1993; Zhou, 2004) while, on the other hand, SERVQUAL has enjoyed and
continues to enjoy widespread acceptance as a measure of SQ (Chebat et al., 1995; Furrer
et al., 2000; Zeithaml and Bitner, 2003). In addition, the Web of Science reveals that the
original SERVQUAL paper published in 1988, as well as the subsequent 1991 scale
refinement paper, have both received more than 46 percent of their total citations within
the last five years. The same is true of SERVPERF, which also received more than
46 percent of its citations within the last five years. This indicates that Cronin and
Taylor’s (1994) conceptual arguments in favor of SERVPERF, while they may have
contributed to SERVPERF’s popularity, have not reduced SERVQUAL’s usage among
scholars. In addition, it suggests that the multilevel scale offered by Brady and Cronin
(2001) as a reconciling perspective has not moved researchers away from either
SERVQUAL or SERVPERF. Therefore, shedding light on whether one scale is better
than the other remains a very important question to be answered.
The second issue centers on the trade-off between the generalizability and
specificity level of the SERVQUAL and SERVPERF scales (Asubonteng et al., 1996).
A scale can be applied in more diversified contexts as its items become more abstract
(Babakus and Boller, 1992; Dabholkar et al., 2000). However, this limits the scale’s
ability to capture specific context elements (Babakus and Boller, 1992; Dabholkar et al.,
2000). There is a general acceptance of the need to modify scale items to suit study
context. However, empirical investigation regarding the impact of item adaptation on
scale validity (i.e. when original SERVQUAL/SERVPERF items versus modified items
are used) has not been undertaken. In addition, research is needed to assess the
appropriateness of the SERVQUAL/SERVPERF scales when they are used outside
the USA. This is because differences in national culture or language not only require
modification of items but also create distortions in how respondents perceive the
construct under investigation (Herk et al., 2005).
The above discussion raises several important research questions. First, are
SERVQUAL and SERVPERF adequate predictors of SQ? And, as proposed by Cronin
and Taylor (1992), is SERVPERF a better predictor of SQ than SERVQUAL? Second, is
there an improvement in the predictive validity of the SERVQUAL and SERVPERF
measures when the scale items are adapted to the study context? Third, does the
predictive power of SERVQUAL and SERVPERF depend on national culture or scale
language? Finally, is the predictive validity of SERVQUAL and SERVPERF influenced
by the type of industry in which the study is conducted?
The current study addresses these research questions by meta-analyzing empirical SQ
research. Meta-analysis is appropriate for addressing these research questions because it
systematically integrates findings across studies, controls for statistical artifacts, and
provides very robust answers about relationships among variables (Arthur et al., 2001;
Hunter and Schmidt, 2004). Our meta-analytic framework relies on 42 effect sizes from 17
empirical studies conducted across five continents spanning 17 years.
Previous research has already attempted to compare SERVQUAL and SERVPERF
(Brady et al., 2002; Cronin and Taylor, 1992; Cui et al., 2003; Hudson et al., 2004; Jain and
Gupta, 2004; Kettinger and Lee, 1997; Quester and Romaniuk, 1997). However,
considering these studies individually provides dispersed evidence that might add to, rather
than subtract from, the ambiguity surrounding the measurement debate. For instance, Jain and
Gupta (2004) as well as Kettinger and Lee (1997) found that SERVPERF was more
strongly correlated to overall service quality (OSQ) than SERVQUAL whereas Quester
and Romaniuk (1997) reported that SERVQUAL exhibited a stronger relationship with
OSQ than SERVPERF. In some cases, studies comparing SERVQUAL and SERVPERF
focus on dimensionality issues without considering predictive validity (Cui et al., 2003;
Hudson et al., 2004). Furthermore, the aforementioned studies rely on one or two samples
at most, which prevents them from drawing robust conclusions and from testing the
impact of contingency factors such as country, language, or industry. Therefore,
the current research constitutes a significant contribution to the service literature
because it provides answers to the SERVQUAL/SERVPERF validity debate tackled by
Cronin and Taylor (1992, 1994), Brady et al. (2002), and Parasuraman et al. (1994).
In addition, because meta-analysis is based on the accumulation of empirical evidence
over the years, it allows investigating moderating factors by comparing sub-groups of
studies that share a similar characteristic, e.g. the country where the sample was
drawn (Lipsey and Wilson, 2001).
This paper is organized as follows. First, a review of the literature is presented
and hypotheses are developed. Second, a description of the meta-analytic
procedure is provided. Third, results, as well as implications and suggestions for
further research, are discussed.
Conceptual background
Both SERVQUAL’s and SERVPERF’s operationalizations relied on the conceptual
definition that SQ is an attitude toward the service offered by a firm resulting from a
comparison of expectations with performance (Parasuraman et al., 1985, 1988; Cronin
and Taylor, 1992). However, SERVQUAL directly measures both expectations and
performance perceptions whereas SERVPERF only measures performance
perceptions. SERVPERF uses only performance data because it assumes that
respondents provide their ratings by automatically comparing performance
perceptions with performance expectations. Thus, SERVPERF assumes that directly
measuring performance expectations is unnecessary.
Research comparing the predictive validity of SERVQUAL with SERVPERF has
been based on assessing which of the two measures is a better predictor of OSQ. OSQ
has been used as the criterion because it is a global representation of the quality of the
service offered by an organization (Cronin and Taylor, 1992, 1994; Jain and Gupta,
2004; Kettinger and Lee, 1997; Quester and Romaniuk, 1997). In their comparison of
SERVQUAL with SERVPERF, Cronin and Taylor (1992) built their argument for the
superiority of SERVPERF over SERVQUAL by empirically showing that SERVPERF
is a better predictor of OSQ than SERVQUAL. Also, Parasuraman et al. (1988) assessed
the construct validity of SERVQUAL by evaluating whether the scale was an adequate
predictor of OSQ. In view of this, the predictive validity of SERVQUAL and
SERVPERF is assessed by meta-analyzing extant empirical research on the strength of
the relationship between each scale and OSQ.
The predictive validity of SERVQUAL and SERVPERF
SERVQUAL and SERVPERF are based on rigorous scale development procedures
(Parasuraman et al., 1988, 1991) and have been widely used by researchers. Therefore,
it is expected that both the SERVQUAL and SERVPERF measures of SQ will be
strongly related to OSQ. The literature on scale development does not specifically point
to a particular correlation value with a criterion against which the predictive validity of
a scale can be assessed. However, it is possible to turn to less formal guidelines
formulated by researchers. According to Cohen’s (1992) rule of thumb, a “small” effect
size is observed when the correlation is 0.10, a “medium” effect size is obtained when
the correlation is 0.30, and a “large” effect size corresponds to a correlation of 0.50.
These guidelines have been previously used to qualify the strength of meta-analytic
correlations (Jaramillo et al., 2005). Therefore, the following is hypothesized:
H1. The correlation between SERVQUAL or SERVPERF and OSQ will be strong
and above 0.50.
The disconfirmation vs performance-only debate
In Parasuraman et al.’s (1985) “disconfirmation” perspective, the SQ construct is seen as
an attitude resulting from customers’ comparison of their expectations about the service
encounter with their perceptions of the service encounter. The SERVQUAL instrument
operationalizes this construct as the difference between expected and actual (perceived)
performance (Parasuraman et al., 1988, 1991). Alternatively, SERVPERF is based on the
“performance only” perspective and operationalizes SQ as customers’ evaluations of
the service encounter. As a result, SERVPERF uses only the performance items of the
SERVQUAL scale (Brady et al., 2002; Cronin and Taylor, 1992, 1994).
In discussing the relative merits of each scale, the debate has been primarily
centered on predictive validity and specifically on whether SERVQUAL or SERVPERF
better captures SQ. First, some researchers have argued that SERVPERF is a better
measure because it does not depend on ambiguous customers’ expectations.
Arguments in favor of SERVPERF are based on the notion that performance
perceptions are already the result of customers’ comparison of the expected and actual
service (Babakus and Boller, 1992; Oliver and DeSarbo, 1988). Therefore, performance
only measures should be preferred to avoid redundancy. Second, as Teas (1993)
points out, Parasuraman et al.’s (1991) conceptualization of SQ is inconsistent with its
operationalization. Teas (1993) argues that, since Parasuraman et al. (1991) define
expectations as a type of attitude, customer expectations must be considered as ideal
points. Hence, the Gap model implication that superior perceptions of SQ occur when
performance increasingly exceeds expectations is theoretically inconsistent. The
classical attitudinal perspective suggests that positive attitudes are formed when
evaluations of an object are close to an expected ideal point. Therefore, SQ should peak
when perceptions equal expectations (Teas, 1993).
Parasuraman et al. (1994) defended SERVQUAL by demonstrating that there was
virtually no difference in predictive power between SERVQUAL and SERVPERF.
Although discussions have continued on whether disconfirmation-based measures are
superior to performance-only based measures (Dabholkar et al., 2000; Hudson et al.,
2004; Jain and Gupta, 2004), the above discussed arguments point toward the
superiority of SERVPERF over SERVQUAL. Thus:
H2. The relationship between SQ and OSQ is stronger when SQ is measured with
SERVPERF than with SERVQUAL.
Contextual factors
Any scale represents a compromise between relevance and the extent to which it can be
applied in a wide array of contexts (Babakus and Boller, 1992). Scale modification is
done by adding, deleting or rewording items to ensure suitability for a particular
research context. SERVQUAL and SERVPERF scale modifications have led to
discussions about:
• the universal versus context specific character of the scales; and
• whether changes to fit a specific context result in better predictive validity.
It is important to mention that in their original development, SERVQUAL and
SERVPERF were purported to be universal measures of SQ because the scale
development process relied on samples from multiple industries (Cronin and Taylor,
1992; Parasuraman et al., 1988). However, Parasuraman et al. (1988) recognize that
SERVQUAL can be adapted to the specific research needs of a particular organization.
As Rossiter (2002) indicates, the specificities of the measurement context play an
important role in construct validity.
Researchers are particularly concerned about the effect of environmental factors on
the validity of SQ scales (Babin et al., 2004). In fact, researchers have failed to replicate
the five original dimensions of the SERVQUAL/SERVPERF scales, namely tangibility,
reliability, responsiveness, assurance, and empathy (White and Schneider, 2000).
Based on this, researchers have noted that SQ scales need to be adapted to the study
context (Carman, 1990). For instance, tangibility might not be relevant for a cable
company because the customer might never see the facilities of the service provider,
whereas it may be critical for a healthcare facility customer. In their study on the
photography industry, Dabholkar et al. (2000) dropped items related to physical
facilities (tangibility) from the original SERVQUAL because customers did not have to
visit the company’s site; however, they added items related to “salespeople pressure”
that are absent from SERVQUAL. The above discussion suggests that context adapted
versions of SERVQUAL and SERVPERF, hereinafter referred to as MQUAL and
MPERF, will have a better predictive validity than non-modified versions (QUAL or
PERF, respectively). Thus:
H3a. The relationship between SQ and OSQ will be stronger when SQ is measured
with MQUAL rather than with QUAL.
H3b. The relationship between SQ and OSQ will be stronger when SQ is measured
with MPERF rather than with PERF.
Country culture
Studies using SERVQUAL and SERVPERF have been conducted across more than 17
countries and on each and every continent. The use of these scales in an international
context raises a legitimate concern about validity across borders because research has
shown that cultural values influence customer responses on measures of SQ (Laroche
et al., 2004; Zhou, 2004). According to Herk et al. (2005), research conducted
internationally can be affected both by construct bias (i.e. the construct studied differs
across countries) and item bias (i.e. items are distorted when used internationally). For
instance, Sultan et al. (2000) found significant differences across US and European
passengers on their expectations and performance perceptions of airlines SQ. In
addition, Mattila (1999) found that Western customers are more likely than their Asian
counterparts to rely on tangible cues from the physical environment, which evidences
that the tangibility dimension of SERVQUAL is more important for them.
Researchers have found that cultural differences can also create item bias.
Steenkamp and Baumgartner (1998) show that both:
(1) the metric invariance (i.e. the interpretation of the distance between the scale
points); and
(2) the scalar invariance (i.e. whether scale latent means have systematic biases) of
items become uncertain when scales are used across cultures.
In fact, Diamantopoulos et al. (2006) found that international differences in response
styles (i.e. item wording, type of scale, etc.) generate item bias. Therefore, we propose
that SERVQUAL and SERVPERF are likely to be affected by construct and item biases
when used in international settings.
In order to account for cultural differences, it was decided to rely on Hofstede’s (1997)
individualism/collectivism (IDV) measure of national culture. IDV is useful and
parsimonious for explaining cross-cultural differences in attitudes and behaviours. Also,
IDV has satisfactory reliability and uni-dimensionality (Cano et al., 2004; Triandis, 1995).
Research indicates that IDV may affect perceptions of OSQ and its dimensions. For
instance, Furrer et al. (2000) argue that, in high individualistic cultures, consumers tend
to be independent, have an ethic of self-responsibility and demand a higher level of SQ.
Furrer et al. (2000) also note that individualistic consumers prefer to maintain a
significant distance between themselves and the service provider. In addition, their
study results show that consumers with a high degree of individualism considered
“responsiveness” and “tangibles” dimensions as more important compared to
consumers from collectivistic cultures. Individualistic customers tend to focus on their
own benefits and interests, and expect the service providers to do their best in catering to
their needs (Donthu and Yoo, 1998). Thus, individualistic customers pay careful
attention to the service provided and are not likely to accept lower SQ. Donthu and Yoo’s
(1998) study showed that individualistic customers have higher OSQ expectations,
higher empathy and assurance expectations from their service providers compared to
customers from collectivistic societies. SERVQUAL and SERVPERF were developed in
the USA, a country with the highest IDV level (Hofstede, 1997). In view of this, the
existing dimensions of the SERVQUAL and SERVPERF scales should match more
closely with the expectations of consumers from individualistic countries. As a result, it
is expected that the predictive validity of SERVQUAL will be diminished in countries
with a lower IDV level:
H4a. The strength of the relationship between SERVQUAL or SERVPERF and
OSQ decreases as the degree of individualism of the country decreases.
Country language
It is generally known that language translation can exacerbate cultural bias. Even when
scales are carefully translated and closely checked by experts
(Witkowski and Wolfinbarger, 2002; Zhou, 2004), the absence of a concept in a language
prevents perfect accuracy in scale translation (Herk et al., 2005). Thus, scale
translation can result in higher measurement error which attenuates relationships
among constructs (Hunter and Schmidt, 2004). Therefore, the following is hypothesized:
H4b. The strength of the relationship between SERVQUAL or SERVPERF and
OSQ is stronger when SERVQUAL or SERVPERF is administered in English
than when translated.
Type of services
It is expected that SERVQUAL or SERVPERF will perform differently depending on
the industry in which they are used. This is because the relevance of the scale
dimensions depends on the study setting (White and Schneider, 2000). Many
categorizations of services have been proposed in the literature (Bitner, 1992; Lovelock,
1983; Silvestro et al., 1992). Among the numerous service classifications, Silvestro
et al.’s (1992) production perspective has emerged as integrative of other service
typologies. Silvestro et al. (1992) divide service providers into the following three
groups that range from lower to higher intensity of customer processing:
(1) Professional services (PS) (i.e. low customer processing intensity) include services
provided by lawyers, business consultants, or field engineering. Some
characteristics of this group are: few transactions, highly customized,
process-oriented and long customer contact times. Value is added by front office
service employees who rely extensively on their own judgment to perform
the service.
(2) Service shops (SS) (i.e. intermediate customer processing intensity) such as
hotels, rental cars, or banks. This group has an intermediate level of
customization and judgment from service employees. Value added is generated
in both the back and front offices.
(3) Mass services (MS) (i.e. high customer processing intensity) such as those provided
by retailers, transportation, or confectionery. This group has many customer
transactions, few contact opportunities, and limited customization. Value added
comes from the back office and service employees use little judgment.
According to Silvestro et al. (1992), as the intensity of customer processing decreases, the
emphasis on process rather than product intensifies. The process elements of a service
are by nature intangible while the product elements are more tangible (Zeithaml and
Bitner, 2003). Therefore, less customer-processing-oriented service industries will have
more intangible service offers. Because SERVQUAL is purported to measure the service
aspects of the quality of customer experience, it is expected to perform better when
customer-processing intensity decreases while intangibility increases. Thus:
H5. The strength of the relationship between SERVQUAL and OSQ decreases as
the service category moves from PS to SS and to MS.
Methodology
All studies containing an effect size (r) that measures the strength of the relationship
between SQ (SERVQUAL, SERVPERF) and OSQ were eligible for inclusion.
Valid statistics included Pearson’s correlation coefficients (r) or any other statistics
that could be converted to r, such as F-value, t-value, p-value, and χ². Empirical
studies published in 1988 or after and available before May 30, 2005 were included in this
meta-analysis. This timeframe is used since SERVQUAL was first published in 1988.
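To illustrate the conversion step, the Python sketch below shows the textbook formulas for turning commonly reported test statistics into r. It is a minimal illustration rather than the authors’ actual procedure, and it assumes the simplest cases: a t-statistic, an F-statistic with one numerator degree of freedom, and a chi-square statistic with one degree of freedom; all numbers used are hypothetical.

```python
import math

def r_from_t(t, df):
    # r from a t-statistic with df degrees of freedom
    return math.sqrt(t ** 2 / (t ** 2 + df))

def r_from_f(f, df_error):
    # valid only for F-tests with one numerator degree of freedom
    return math.sqrt(f / (f + df_error))

def r_from_chi2(chi2, n):
    # valid only for chi-square statistics with one degree of freedom (phi coefficient)
    return math.sqrt(chi2 / n)

# Hypothetical example: t = 5.2 with df = 150 corresponds to r of about 0.39.
print(round(r_from_t(5.2, 150), 2))
```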
Study search
The following procedure was used to obtain an ample collection of studies reporting
the desired effect sizes. First, an electronic search of the following databases was
conducted: Science Direct, Emerald, and ProQuest (ABI/INFORM Global and dissertation
abstracts). Second, a manual examination of the articles identified from the
computer-based searches was carried out. Third, manual searches of leading
marketing and service journals were conducted. To contact marketing researchers, a
call for working papers, forthcoming articles, conference papers, and unpublished
research was posted on ELMAR-AMA (<5,000 subscribers). The search process
yielded a total of 17 studies containing 42 effect sizes resulting from studying 9,880
respondents (Table I).
Meta-analytic model
Meta-analyses can be conducted using either a fixed-effect (FE) or a random-effect (RE)
model (Hunter and Schmidt, 2004). A FE model assumes that the same ρ value
underlies the observed effect sizes in all the studies, whereas the RE model allows for
variation of the population parameter ρ across studies.
Table I. Coding of effect sizes included in the meta-analysis
Authors  Scale(a)  Country  IDV score(b)  Language  Services(c)  n(d)  r(e)
Angur et al. (1999) QUAL USA 91 English Mass 143 0.70
Angur et al. (1999) PERF USA 91 English Mass 143 0.72
Babakus and Boller (1992) PERF USA 91 English Shop 520 0.66
Bojanic (1991) PERF USA 91 English Pro 32 0.57
Brady et al. (2002) MPERF USA 91 English * 1,548 0.62
Cronin and Taylor (1992) PERF USA 91 English * 660 0.60
Cronin and Taylor (1992) QUAL USA 91 English * 660 0.54
Dabholkar et al. (2000) MQUAL USA 91 English Mass 397 0.78
Dabholkar et al. (2000) MPERF USA 91 English Mass 397 0.65
Freeman and Dart (1993) MQUAL Canada 80 English Pro 217 0.63
Jabnoun and Al-Tamimi (2003) MQUAL UAE 38 Non-English Mass 462 0.82
Lam (1995) PERF Hong Kong 25 English Mass 214 0.82
Lam (1995) QUAL Hong Kong 25 English Mass 214 0.69
Lam (1997) PERF Hong Kong 25 English Pro 82 0.71
Lee et al. (2000) MQUAL USA 91 English Shop 196 0.75
Lee et al. (2000) MQUAL USA 91 English Pro 128 0.59
Lee et al. (2000) MPERF USA 91 English Mass 197 0.72
Lee et al. (2000) MPERF USA 91 English Pro 128 0.71
Lee et al. (2000) MPERF USA 91 English Shop 196 0.81
Lee et al. (2000) MQUAL USA 91 English Mass 197 0.47
Mehta et al. (2000) MPERF Singapore 20 Non-English Shop 161 0.63
Mehta et al. (2000) MPERF Singapore 20 Non-English Shop 161 0.75
Mittal and Lassar (1996) MQUAL USA 91 English Pro 123 0.79
Mittal and Lassar (1996) QUAL USA 91 English Pro 123 0.77
Mittal and Lassar (1996) MQUAL USA 91 English Pro 110 0.86
Mittal and Lassar (1996) QUAL USA 91 English Pro 110 0.85
Pariseau and McDaniel (1997) MQUAL USA 91 English Mass 39 0.71
Quester and Romaniuk (1997) PERF Australia 90 English Pro 182 0.55
Quester and Romaniuk (1997) QUAL Australia 90 English Pro 182 0.51
Smith (1999) MQUAL UK 89 English Pro 177 0.38
Smith (1999) MPERF UK 89 English Pro 177 0.36
Wal et al. (2002) QUAL South Africa 65 English Shop 583 0.08
Witkowski and Wolfinbarger (2002) MQUAL Germany 67 Non-English Mass 101 0.63
Witkowski and Wolfinbarger (2002) MQUAL USA 91 English Shop 86 0.62
Witkowski and Wolfinbarger (2002) MQUAL USA 91 English Shop 75 0.62
Witkowski and Wolfinbarger (2002) MQUAL Germany 67 Non-English Shop 114 0.56
Witkowski and Wolfinbarger (2002) MQUAL Germany 67 Non-English Mass 132 0.54
Witkowski and Wolfinbarger (2002) MQUAL USA 91 English Mass 81 0.59
Witkowski and Wolfinbarger (2002) MQUAL USA 91 English Pro 103 0.59
Witkowski and Wolfinbarger (2002) MQUAL Germany 67 Non-English Pro 105 0.58
Witkowski and Wolfinbarger (2002) MQUAL USA 91 English Shop 105 0.57
Witkowski and Wolfinbarger (2002) MQUAL Germany 67 Non-English Shop 119 0.57
Notes: (a) QUAL = original SERVQUAL, MQUAL = modified SERVQUAL, PERF = original SERVPERF, MPERF = modified SERVPERF; (b) Hofstede’s individualism score; (c) type of service industry based on Silvestro et al. (1992); (d) sample size; (e) observed effect size; * these studies relied on multiple industries spanning across service types and were not included in this moderator analysis
Credibility intervals (i.e. the
distribution of population parameter values) were computed in addition to confidence
intervals (i.e. the range of the true population value) (Hunter and Schmidt, 2004).
Hunter and Schmidt’s (2004) RE model was used as it accounts for both random and
systematic variance and has been shown to yield very accurate credibility intervals in
simulation studies (Hall and Brannick, 2002). Also, both the observed mean
correlations (r) and the corrected mean correlations (r_c) were estimated by following
Arthur et al.’s (2001) procedure to account for measurement error.
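As a rough illustration of the computation described above, the following Python sketch applies the bare-bones Hunter and Schmidt (2004) logic to a handful of hypothetical effect sizes: it computes the sample-size-weighted mean correlation, subtracts the variance expected from sampling error, corrects the mean for attenuation using assumed reliabilities, and forms a 90 percent credibility interval. The data and reliability values are invented for the example, and the code is a simplified sketch, not the procedure actually used by the authors.

```python
import numpy as np

# Hypothetical inputs (not the study's data): observed correlations, sample
# sizes, and assumed reliabilities of the SQ scale (rxx) and the OSQ criterion (ryy).
r = np.array([0.70, 0.66, 0.54, 0.78])
n = np.array([143, 520, 660, 397])
rxx, ryy = 0.90, 0.85

# Bare-bones Hunter-Schmidt: sample-size-weighted mean and variance of r.
r_bar = np.sum(n * r) / np.sum(n)
var_r = np.sum(n * (r - r_bar) ** 2) / np.sum(n)

# Variance expected from sampling error alone, and the residual ("true") variance.
var_e = (1 - r_bar ** 2) ** 2 / (n.mean() - 1)
var_rho = max(var_r - var_e, 0.0)

# Correct the mean (and the residual SD) for attenuation due to unreliability.
attenuation = np.sqrt(rxx * ryy)
r_corrected = r_bar / attenuation
sd_rho = np.sqrt(var_rho) / attenuation

# 90 percent credibility interval for the corrected population correlation.
cred_90 = (r_corrected - 1.645 * sd_rho, r_corrected + 1.645 * sd_rho)
print(round(r_bar, 2), round(r_corrected, 2), tuple(round(x, 2) for x in cred_90))
```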
Test of moderators
When estimating the significance of nominal moderator variables with two categories,
we relied on the “standard method” as advised by Schenker and Gentleman (2001) and
implemented in a recent marketing meta-analysis (Jaramillo et al., 2005). The “standard
method” consists of building only one interval around the difference between the
two point estimates by adding and subtracting the appropriate z-value multiplied by
the square root of the sum of the squared SEs of each point estimate. If that interval does
not include zero, the difference between the two point estimates is statistically
significant. The standard method is preferred to comparisons of confidence intervals
since it provides greater statistical power for testing moderating hypotheses (Jaramillo et al.,
2005; Schenker and Gentleman, 2001). Note that since all the moderator hypotheses
were directional, the z-value used for computing the interval around the difference
between the point estimates corresponded to a 90 percent confidence level to generate
an α level of 0.05 as in a one-tailed test (Jaramillo et al., 2005).
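A minimal sketch of the “standard method” described above, assuming hypothetical point estimates and standard errors: the interval is built around the difference between the two estimates, and z = 1.645 yields a 90 percent interval, i.e. an α of 0.05 for a directional (one-tailed) test. This is an illustration, not the authors’ code.

```python
import math

def standard_method(est1, se1, est2, se2, z=1.645):
    """Interval around the difference between two point estimates
    (Schenker and Gentleman, 2001). z = 1.645 corresponds to a 90 percent
    interval, i.e. a one-tailed alpha of 0.05 for directional hypotheses."""
    diff = est1 - est2
    half_width = z * math.sqrt(se1 ** 2 + se2 ** 2)
    return diff - half_width, diff + half_width

# Illustrative numbers only: the difference is significant if the interval excludes zero.
low, high = standard_method(0.75, 0.04, 0.68, 0.05)
print(round(low, 2), round(high, 2), "significant" if (low > 0 or high < 0) else "not significant")
```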
When testing for continuous moderators, or nominal moderators with more than two
categories, the weighted regression approach of Lipsey and Wilson (2001) was adopted.
This procedure consists in regressing the disattenuated effect sizes on independent
variables (continuous or dummy coded) with w_i (the inverse variance component, which
gives more weight to effect sizes coming from homogeneous distributions) as the weight
for each observation. The moderation effect of IDV is tested using weighted regression
analysis. Weighted regression analysis is adequate to test the moderating effect of IDV
since it is a continuous variable (Cano et al., 2004; Lipsey and Wilson, 2001).
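The weighted regression step can be sketched as follows, using invented effect sizes, inverse-variance weights, and IDV scores. The sketch also applies the standard-error adjustment noted under Table III (dividing the standard errors a conventional WLS routine reports by the square root of the mean square residual before forming z-values); it illustrates the Lipsey and Wilson (2001) approach under these assumptions and is not the authors’ code.

```python
import numpy as np

# Hypothetical inputs (not the study's data): disattenuated effect sizes,
# inverse-variance weights, and the countries' IDV scores.
r_c = np.array([0.77, 0.82, 0.69, 0.63, 0.59])
w   = np.array([120.0, 300.0, 150.0, 90.0, 75.0])
idv = np.array([91.0, 38.0, 25.0, 20.0, 67.0])

# Weighted least squares via the normal equations: (X'WX) b = X'W y.
X = np.column_stack([np.ones_like(idv), idv])
XtWX = X.T @ (w[:, None] * X)
XtWy = X.T @ (w * r_c)
b = np.linalg.solve(XtWX, XtWy)

# Standard errors as a conventional WLS routine would report them
# (they carry the weighted mean square residual as a multiplier) ...
resid = r_c - X @ b
ms_resid = np.sum(w * resid ** 2) / (len(r_c) - X.shape[1])
se_wls = np.sqrt(ms_resid * np.diag(np.linalg.inv(XtWX)))

# ... and the Lipsey-Wilson (2001) adjustment cited in Table III: divide those
# standard errors by the square root of the mean square residual to obtain the
# standard errors used for the z-tests.
se_adj = se_wls / np.sqrt(ms_resid)
z = b / se_adj
print(np.round(b, 4), np.round(z, 2))
```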
Results
Table II presents the results of the meta-analysis. The overall strength of the relationship
between SERVQUAL and OSQ is larger than 0.50 (r = 0.58; r_c = 0.68;
CI 90 percent = 0.50-0.66). The average SERVPERF and OSQ correlation is also larger
than 0.50 (r = 0.64; r_c = 0.75; CI 90 percent = 0.52-0.77). Since the lower bound values
of the 90 percent confidence intervals for both SERVQUAL and SERVPERF are above
0.50, the grand mean correlations can be interpreted as large (Cohen, 1992). This indicates
that both SERVQUAL and SERVPERF are valid measures of SQ, thus bringing support
for H1. The presence of moderators of the SERVQUAL-OSQ and SERVPERF-OSQ
relationships is evidenced in statistically significant Q-statistics (Table II). The Q-statistic
is distributed as a χ² with k-1 degrees of freedom and is compared to the corresponding
critical χ² statistic. A significant Q-statistic demonstrates that the effect size distribution
is heterogeneous and indicates that the population varies systematically according to
some factors other than subject level sampling and measurement errors (Lipsey and
Wilson, 2001).
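For completeness, the Q-statistic test of homogeneity mentioned above can be sketched as follows, again with hypothetical correlations and sample sizes and with one common approximation for the inverse sampling variance of r; it is illustrative only.

```python
import numpy as np
from scipy import stats

# Hypothetical observed correlations and sample sizes (not the study's data).
r = np.array([0.70, 0.66, 0.54, 0.78, 0.62])
n = np.array([143, 520, 660, 397, 1548])

# Q-statistic: weighted sum of squared deviations from the mean effect size,
# using (n_i - 1) / (1 - r_bar^2)^2 as an approximate inverse sampling variance.
r_bar = np.sum(n * r) / np.sum(n)
w = (n - 1) / (1 - r_bar ** 2) ** 2
Q = np.sum(w * (r - r_bar) ** 2)

# A significant Q (compared with chi-square at k - 1 df) signals heterogeneity,
# i.e. likely moderators beyond sampling and measurement error.
k = len(r)
critical = stats.chi2.ppf(0.95, df=k - 1)
print(round(Q, 1), round(critical, 2), "heterogeneous" if Q > critical else "homogeneous")
```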
Table II. Overall meta-analytic results and categorical moderators for the relationship between SQ and OSQ
Subgroup  k(a)  n(b)  r(c)  r_c(d)  Q-statistic(e)  Confidence interval(f)  Credibility interval(g)  Percentage of variance explained(h)  FS N(i)
Overall  42  9,880  0.61  0.71  288  0.56-0.66  0.44-0.98  67.9  2,982
SERVQUAL  27  5,082  0.58  0.68  241  0.50-0.66  0.33-1.03  70.9  1,836
QUAL  7  2,015  0.46  0.54  56  0.27-0.66  0.11-0.96  71.6  378
MQUAL  20  3,067  0.66  0.77  57  0.60-0.72  0.57-0.97  64.4  1,540
SERVPERF  15  4,798  0.64  0.75  38  0.52-0.77  0.34-1.15  68.7  1,125
PERF  7  1,751  0.65  0.73  11  0.59-0.71  0.62-0.84  64.2  511
MPERF  8  2,965  0.64  0.75  29  0.57-0.70  0.64-0.87  42.9  600
English speaking  34  8,525  0.60  0.70  260  0.54-0.66  0.42-0.97  67.8  2,380
Non-English speaking  8  1,355  0.69  0.79  19  0.60-0.77  0.63-0.96  61.6  632
Notes: (a) number of effect sizes; (b) sample size; (c) attenuated mean effect size; (d) disattenuated (i.e. corrected) mean effect size; (e) critical values range from 12.59 to 56.93; (f) at the 95 percent level; (g) at the 90 percent level; (h) variance explained by sampling and measurement artifacts; (i) fail-safe N: number of studies with an effect size of zero (r_i = 0) needed to reduce the mean effect size (r_c) to 0.01
H2 posited that the relationship between SERVPERF and OSQ is stronger than
the SERVQUAL-OSQ relationship. However, a comparison of the strength of these
relationships reveals no significant difference. As shown in Table II, although the
mean SERVPERF-OSQ correlation (r_c = 0.75) is larger than the SERVQUAL-OSQ
correlation (r_c = 0.68), the difference is not statistically significant. In effect, the
90 percent confidence interval for the difference between the two point estimates
(r_c = 0.75 and r_c = 0.68) includes zero (CI 90 percent = -0.06 to 0.19), indicating that
there is no significant difference between the predictive validity of SERVQUAL versus
SERVPERF (Schenker and Gentleman, 2001).
H3a and H3b stated that the modified SERVQUAL or SERVPERF scales would be
more strongly related to OSQ than the original scales. The observed difference between
the predictive validity of the original SERVQUAL and its modified version was
statistically significant (QUAL r_c = 0.54 vs MQUAL r_c = 0.77; Δr_c = 0.23,
CI 90 percent = 0.06-0.40). This suggests that the predictive validity of SERVQUAL
increases when it is adapted to the study context. However, the observed difference
between the predictive validity of the original version of SERVPERF and its modified
version was not statistically significant (PERF r_c = 0.73 vs MPERF r_c = 0.75;
Δr_c = 0.02, CI 90 percent = -0.04 to 0.09). This suggests that the predictive validity of
SERVPERF does not change when the scale is modified.
According to H4a and H4b, the predictive validity of SERVQUAL on OSQ
decreases:
• as the individualism of the country sample decreases; and
• when the study is conducted in a non-English speaking country.
A weighted regression with the disattenuated correlations between SQ and OSQ as the
dependent variable, and IDV as the independent variable, revealed that a country’s
individualism negatively impacts the predictive validity of SERVQUAL (B = -0.001,
p < 0.05), which is contrary to what was hypothesized in H4a (Table III). In addition, the
mean effect size for English speaking countries was smaller than the mean effect size for
non-English speaking countries (non-English speaking r_c = 0.79 vs English speaking
r_c = 0.70; CI 90 percent = 0.04-0.16); thus, not providing support for H4b (Table II).
According to H5, when moving from lower to higher levels of customer processing
intensity, the predictive validity of SERVQUAL on OSQ should decrease. H5 implied
that the SERVQUAL-OSQ relationships should be strongest for PS followed by SS, and
weakest for MS. As shown in Table III, the strongest SERVQUAL-OSQ relationships
are for SS (B1 = 0.096, p < 0.05), followed by PS (base line), and then MS (B2 = -0.12,
p < 0.05). Hence, H5 is not supported.
Table III. Results for continuous moderators and multimodal categorical moderators
Moderator  Model  β  Adjusted SE(a)  z-value
Individualism-collectivism   y = β1x1 + ε          B1 = -0.001   0.0004   -2.68*
Industry type(b)             y = β1x1 + β2x2 + ε   B1 = 0.096    0.035     2.78*
                                                   B2 = -0.12    0.037    -3.28*
Notes: * significant at α = 0.05; (a) when applied in a meta-analytic study, although the β coefficient estimates are accurate, their standard errors need to be adjusted; Lipsey and Wilson (2001) indicate that the standard errors of the β coefficients need to be divided by the square root of the mean square residuals of the regression model in order to yield the z-values used for significance testing; (b) professional services is the base level; B1 corresponds to service shops and B2 to mass services
Discussion
The study results have important implications because they question isolated findings
from earlier studies. In spite of the discussions and several arguments provided by
researchers about the superiority of SERVPERF over SERVQUAL (Cronin and Taylor,
1992, 1994), the results of this meta-analysis suggest that both scales are adequate and
equally valid predictors of OSQ. Because of the high statistical power of meta-analysis
(Cohn and Becker, 2003), these findings could be considered a major step toward ending
the debate over whether SERVPERF is superior to SERVQUAL as an indicator of OSQ.
As Parasuraman et al. (1994) pointed out, the use of performance-only (SERVPERF)
vs the expectation/performance difference scale (SERVQUAL) should be governed by
whether the scale is used for a diagnostic purpose or for establishing theoretically
sound models. We believe that the SERVQUAL scale would have greater interest for
practitioners because of its richer diagnostic value. By comparing customer
expectations of service versus perceived service across dimensions, managers can
identify service shortfalls and use this information to allocate resources to improve SQ
(Parasuraman et al., 1994).
Our findings also reveal that the need to adapt the measure to the context of the
study is greater when SERVQUAL rather than SERVPERF is used. In effect, the
original versions of SERVQUAL had a significantly lower OSQ predictive validity
than the modified versions. However, both the original and modified versions of
SERVPERF had the same level of OSQ predictive validity. This has important
implications for both practitioners and academics. Practitioners using SERVQUAL for
OSQ diagnostic purposes need to spend greater effort in modifying the scale for
context than SERVPERF users.
Our results also show an interesting pattern. Since SERVQUAL and SERVPERF
were originally developed in the USA, we expected that the predictive validity of these
instruments would be higher when used in countries with national cultures and
languages similar to the US. However, results show that the predictive validity of
SERVQUAL and SERVPERF on OSQ was higher for non-English speaking countries
and for countries with lower levels of individualism. A closer examination of the
sample used in our study revealed that all studies conducted in non-English speaking
countries as well as those conducted in less individualistic countries relied on modified
versions of the SERVQUAL scale. Hence, scale modification rather than cultural
context could be driving the results. Since there were no studies conducted outside the
US using non-modified scales, it was not possible to isolate the effect of national culture
and language. Further research is needed to address this important issue. An
interesting avenue would be an experimental design where respondents outside the US
would be given a modified scale (i.e. adapted to the industry context) and others would
be given the original items; this would allow teasing apart the effects of culture and
scale adaptation on the scale’s validity.
Finally, results suggest the predictive validity of SERVQUAL on OSQ is highest in
medium customer processing intensity contexts with an intermediate degree of
intangibility (SS) followed by low customer processing intensity (PS) and high
customer processing intensity (MS).
A plausible explanation for this finding is that SERVQUAL was developed as a
scale generalizable across service contexts. Hence, predictive validity peaks in the
category that represents a compromise between the emphasis on process and product
(i.e. service shop). Another reason could be the varying degree of importance of the
service used in the analysis to the customer. Additional research is needed for a better
understanding of this result.
With the growing proliferation of self-service technology (SST) encounters,
factors that contribute to satisfaction and dissatisfaction in the SST customer
interaction have drawn considerable interest from researchers and practitioners
(Meuter et al., 2000). Further research could explore the degree of predictive validity of
SERVQUAL on OSQ in SST customer interactions.
Like any other meta-analysis, this study is subject to the file drawer problem which
prevents the true effect size from being uncovered (Lipsey and Wilson, 2001). However,
as shown in Table II, the fail-safe N statistic reveals that several hundred studies
unaccounted for, with an effect size of zero, would be necessary to nullify the effect
sizes computed. This strengthens the confidence in the results obtained. Finally, in this
study, SERVQUAL and SERVPERF were only assessed through their predictive
validity of OSQ. A future meta-analysis could employ additional validation techniques.
For example, meta-analysis can be used to construct a broader nomological network
that includes constructs related to SQ such as customer satisfaction, customer loyalty,
purchase intention, and word-of-mouth (Zeithaml, 2000). Researchers could then assess
whether using SERVQUAL or SERVPERF affects the effect of SQ on the
above-mentioned constructs.
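The fail-safe N reported in Table II (note i) can be reproduced, up to rounding of the mean corrected effect size, with the simple formula sketched below; the check against the overall row is illustrative only and is not the authors’ code.

```python
def fail_safe_n(k, mean_effect, criterion=0.01):
    """Number of additional zero-effect studies needed to pull the mean
    effect size down to the criterion value (Table II, note i): solves
    (k * mean_effect) / (k + N) = criterion for N."""
    return k * (mean_effect - criterion) / criterion

# Illustrative check against the overall row of Table II (k = 42, r_c = 0.71):
# prints 2940, of the same order as the reported 2,982 (the gap is consistent
# with the published mean effect size being rounded to two decimals).
print(round(fail_safe_n(42, 0.71)))
```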
References
Angur, M.G., Nataraajan, R. and Jahera, J.S. Jr (1999), “Service quality in the banking industry:
an assessment in a developing economy”, The International Journal of Bank Marketing,
Vol. 17 No. 3, pp. 116-25.
Arthur, W.J., Bennett, W. and Huffcutt, A.I. (2001), Conducting Meta-Analysis Using SAS,
Lawrence Erlbaum Associates, Mahwah, NJ.
Asubonteng, P., McCleary, K.J. and Swan, J.E. (1996), “SERVQUAL revisited: a critical review of
service quality”, Journal of Services Marketing, Vol. 10 No. 6, pp. 62-70.
Babakus, E. and Boller, G.W. (1992), “An empirical assessment of the SERVQUAL scale”,
Journal of Business Research, Vol. 24 No. 3, pp. 253-68.
Babin, B., Chebat, J-C. and Michon, R. (2004), “Perceived appropriateness and its effect on quality,
affect and behavior”, Journal of Retailing & Consumer Services, Vol. 11 No. 5, pp. 287-98.
Bitner, M.J. (1992), “Servicescapes: the impact of physical surroundings on customers and
employees”, Journal of Marketing, Vol. 56 No. 2, pp. 57-71.
Bojanic, D.C. (1991), “Quality measurement in professional services firms”, Journal of
Professional Services Marketing, Vol. 7 No. 2, pp. 27-36.
Bolton, R.N. and Drew, J.H. (1991), “A multistage model of customers’ assessments of service
quality and value”, Journal of Consumer Research, Vol. 17 No. 4, pp. 375-84.
Brady, M.K. and Cronin, J.J. Jr (2001), “Some new thoughts on conceptualizing perceived service
quality: a hierarchical approach”, Journal of Marketing, Vol. 65 No. 3, pp. 34-49.
Brady, M.K., Cronin, J.J. Jr and Brand, R.R. (2002), “Performance-only measurement of service
quality: a replication and extension”, Journal of Business Research, Vol. 55 No. 1, pp. 17-31.
Brown, T.J., Churchill, G.A. Jr and Peter, P.J. (1993), “Improving the measurement of service
quality”, Journal of Retailing, Vol. 68 No. 1, pp. 127-39.
Cano, C.R., Carrillat, F.A. and Jaramillo, F. (2004), “A meta-analysis of the relationship between
market orientation and business performance: evidence from five continents”,
International Journal of Research in Marketing, Vol. 21 No. 2, pp. 179-200.
Carman, J.M. (1990), “Consumer perceptions of service quality: an assessment of the SERVQUAL
dimensions”, Journal of Retailing, Vol. 66 No. 1, pp. 33-55.
Chebat, J-C., Filiatrault, P., Gelinas-Chebat, C. and Vaninsky, A. (1995), “Impact of waiting
attribution and consumer’s mood on perceived quality”, Journal of Business Research,
Vol. 34 No. 3, pp. 191-6.
Cohen, J. (1992), “A power primer”, Psychological Bulletin, Vol. 112 No. 1, pp. 155-9.
Cohn, L.D. and Becker, B.J. (2003), “How meta-analysis increases statistical power”,
Psychological Methods, Vol. 8 No. 3, pp. 243-53.
Cronin, J.J. Jr and Taylor, A.S. (1992), “Measuring service quality: a reexamination and an
extension”, Journal of Marketing, Vol. 56 No. 3, pp. 55-67.
Cronin, J.J. Jr and Taylor, A.S. (1994), “SERVPERF versus SERVQUAL: reconciling performance
based and perception based – minus – expectation measurements of service quality”,
Journal of Marketing, Vol. 58 No. 1, pp. 125-31.
Cui, C.C., Lewis, B.R. and Park, W. (2003), “Service quality measurement in the banking sector
Korea”, International Journal of Bank Marketing, Vol. 21 No. 4, pp. 191-201.
Dabholkar, P.A., Shepherd, C.D. and Thorpe, D.I. (2000), “A comprehensive framework for
service quality: an investigation of critical conceptual and measurement issues through a
longitudinal study”, Journal of Retailing, Vol. 76 No. 2, pp. 139-73.
Diamantopoulos, A., Reynolds, N.L. and Simintiras, A.C. (2006), “The impact of response styles
on the stability of cross-national comparisons”, Journal of Business Research, Vol. 59 No. 8,
pp. 925-35.
Donthu, N. and Yoo, B. (1998), “Cultural influences on service quality expectations”, Journal of
Service Research, Vol. 1 No. 2, pp. 178-86.
Freeman, K.D. and Dart, J. (1993), “Measuring the perceived quality of professional business
services”, Journal of Professional Services Marketing, Vol. 9 No. 1, pp. 27-47.
Furrer, O., Liu, B.S-C. and Sudharshan, D. (2000), “The relationships between culture and service
quality perceptions: basis for cross-cultural market segmentation and resource allocation”,
Journal of Service Research, Vol. 2 No. 4, pp. 355-71.
Gale, B.T. (1994), Managing Customer Value: Creating Quality and Service that Customers can
See, The Free Press, New York, NY.
Gremler, D.D. and Gwinner, K.P. (2000), “Customer-employee rapport in service relationships”,
Journal of Service Research, Vol. 3 No. 1, pp. 82-104.
Hall, S.M. and Brannick, M.T. (2002), “Comparison of two random-effects methods of
meta-analysis”, Journal of Applied Psychology, Vol. 87 No. 2, pp. 377-89.
Herk, H.V., Poortinga, Y.H. and Verhallen, T.M.M. (2005), “Equivalence of survey data: relevance
for international marketing”, European Journal of Marketing, Vol. 39 Nos 3/4, pp. 351-64.
Hofstede, G. (1997), Cultures and Organizations: Software of the Mind, McGraw-Hill, Berkshire.
Hudson, S., Hudson, P. and Miller, G.A. (2004), “The measurement of service quality in the tour
operating sector: a methodological comparison”, Journal of Travel Research, Vol. 42 No. 3,
pp. 305-12.
Hunter, J.E. and Schmidt, F.L. (2004), Methods of Meta-Analysis: Correcting Error and Bias in
Research Findings, 2nd ed., Sage, Thousand Oaks, CA.
Jabnoun, N. and Al-Tamimi, H.A.H. (2003), “Measuring perceived service quality at UAE
commercial banks”, The International Journal of Quality & Reliability Management, Vol. 20
Nos 4/5, pp. 458-72.
Jain, S.K. and Gupta, G. (2004), “Measuring service quality: SERVQUAL vs SERVPERF scales”,
The Journal for Decision Makers, Vol. 29 No. 2, pp. 25-37.
Jaramillo, F., Carrillat, F.A. and Locander, W.B. (2005), “A meta-analytic comparison of
managerial ratings and self-evaluations”, Journal of Personal Selling & Sales Management,
Vol. 25 No. 4, pp. 315-29.
Javalgi, R.R.G., Martin, C.L. and Young, R.B. (2006), “Marketing research, market orientation and
customer relationship management: a framework and implications for service providers”,
Journal of Services Marketing, Vol. 20 No. 1, pp. 12-23.
Kettinger, W.J. and Lee, C.C. (1997), “Pragmatic perspectives on the measurement of information
systems service quality”, MIS Quarterly, Vol. 21 No. 2, pp. 223-41.
Lam, S.S.K. (1995), “Measuring service quality: an empirical analysis in Hong Kong”,
International Journal of Management, Vol. 12 No. 2, pp. 182-8.
Lam, S.S.K. (1997), “SERVQUAL: a tool for measuring patients’ opinions of hospital service
quality in Hong Kong”, Total Quality Management, Vol. 8 No. 4, pp. 152-4.
Laroche, M., Ueltschy, L.C., Abe, S., Cleveland, M. and Yannopoulos, P.P. (2004), “Service quality
perceptions and customer satisfaction: evaluating the role of culture”, Journal of
International Marketing, Vol. 12 No. 3, pp. 58-85.
Lee, T., Lee, Y. and Yoo, D. (2000), “The determinants of perceived service quality and its
relationship with satisfaction”, The Journal of Services Marketing, Vol. 14 No. 3, pp. 217-31.
Lipsey, M.W. and Wilson, D.B. (2001), Practical Meta-Analysis, Sage, Thousand Oaks, CA.
Lovelock, C.H. (1983), “Classifying services to gain strategic marketing insights”, Journal of
Marketing, Vol. 47 No. 3, pp. 9-21.
Lovelock, C.H. and Gummesson, E. (2004), “Whither services marketing? In search of a new
paradigm and fresh perspectives”, Journal of Service Research, Vol. 7 No. 1, pp. 20-41.
Martin, C.L. (1999), “The history, evolution and principles of services marketing: poised for the
new millennium”, Marketing Intelligence & Planning, Vol. 17 No. 7, pp. 324-8.
Mattila, A.S. (1999), “The role of culture in the service evaluation process”, Journal of Service
Research, Vol. 1 No. 3, pp. 250-61.
Meuter, M.L., Ostrom, A.L., Roundtree, R.I. and Bitner, M.J. (2000), “Self-service technologies:
understanding customer satisfaction with technology-based service encounters”, Journal
of Marketing, Vol. 64 No. 3, pp. 50-64.
Mehta, S.C., Ashok, K.L. and Han, S.L. (2000), “Service quality in retailing: relative efficiency of
alternative measurement scales for different product-service environments”, International
Journal of Retail & Distribution Management, Vol. 28 No. 2, pp. 62-72.
Mittal, B. and Lassar, W.M. (1996), “The role of personalization in service encounters”, Journal of
Retailing, Vol. 72 No. 1, pp. 95-110.
Mukherje, A. and Nath, P. (2005), “An empirical assessment of comparative approach to service
quality measurement”, Journal of Services Marketing, Vol. 19 No. 3, pp. 174-84.
OECD (2005), available at: http://ocde.P4.Siteinternet.Com/publications/doifiles/012005061t009.Xls
Oliver, R.L. and DeSarbo, W.S. (1988), “Response determinants in satisfaction judgments”,
Journal of Consumer Research, Vol. 14 No. 4, pp. 495-507.
Parasuraman, A., Zeithaml, V.A. and Berry, L.L. (1985), “A conceptual model of service
quality and its implications for future research”, Journal of Marketing, Vol. 49 No. 4,
pp. 41-50.
Parasuraman, A., Zeithaml, V.A. and Berry, L.L. (1988), “SERVQUAL: a multiple-item scale for
measuring consumer perception of service quality”, Journal of Retailing, Vol. 64 No. 1,
pp. 12-40.
Parasuraman, A., Zeithaml, V.A. and Berry, L.L. (1991), “Refinement and reassessment of the
SERVQUAL scale”, Journal of Retailing, Vol. 67 No. 4, pp. 420-51.
Parasuraman, A., Zeithaml, V.A. and Berry, L.L. (1994), “Reassessment of expectations as a
comparison standard in measuring service quality: implications for further research”,
Journal of Marketing, Vol. 58 No. 1, pp. 111-24.
Pariseau, S.E. and McDaniel, J.R. (1997), “Assessing service quality in schools of business”,
International Journal of Quality & Reliability Management, Vol. 14 No. 3, pp. 204-18.
Quester, P.G. and Romaniuk, S. (1997), “Service quality in the Australian advertising industry:
a methodological study”, Journal of Services Marketing, Vol. 11 No. 3, pp. 180-92.
Rossiter, J.R. (2002), “The C-OAR-SE procedure for scale development in marketing”,
International Journal of Research in Marketing, Vol. 19 No. 4, pp. 305-35.
Schenker, N. and Gentleman, J.F. (2001), “On judging the significance of differences by
examining the overlap between confidence intervals”, The American Statistician, Vol. 55
No. 3, pp. 182-6.
Silvestro, R., Fitzgerald, L. and Johnston, R. (1992), “Towards a classification of
services processes”, International Journal of Services Industry Management, Vol. 3 No. 2,
pp. 62-75.
Smith, A.M. (1999), “Some problems when adopting Churchill’s paradigm for the development of
service quality measurement scales”, Journal of Business Research, Vol. 46 No. 2,
pp. 109-20.
Steenkamp, J-B.E.M. and Baumgartner, H. (1998), “Assessing measurement invariance in
cross-national consumer research”, Journal of Consumer Research, Vol. 25 No. 1, pp. 78-90.
Sultan, F., Merlin, C. and Simpson, J. (2000), “International service variants: airline passenger
expectations and perceptions of service quality”, Journal of Services Marketing, Vol. 14
No. 3, pp. 188-96.
Teas, K.R. (1993), “Expectations, performance evaluation, and consumers’ perception of quality”,
Journal of Marketing, Vol. 57 No. 4, pp. 18-34.
Triandis, H.C. (1995), Individualism and Collectivism, Westview Press, Boulder, CO.
Wal, van der R.W.E., Pampallis, A. and Bond, C. (2002), “Service quality in a cellular
telecommunications company: a South African experience”, Managing Service Quality,
Vol. 12 No. 5, pp. 323-35.
White, S.S. and Schneider, B. (2000), “Climbing the commitment ladder: the role of expectations
disconfirmation on customers’ behavioral intentions”, Journal of Service Research, Vol. 2
No. 3, pp. 240-53.
Witkowski, T.H. and Wolfinbarger, M.F. (2002), “Comparative service quality: German and
American ratings across service settings”, Journal of Business Research, Vol. 55 No. 11,
pp. 875-81.
Zeithaml, V.A. (2000), “Service quality, profitability, and the economic worth of customers: what
we know and what we need to learn”, Journal of the Academy of Marketing Science, Vol. 28,
pp. 67-85.
Zeithaml, V.A. and Bitner, M.J. (2003), Services Marketing: Integrating Customer Focus across the
Firm, 3rd ed., Irwin McGraw-Hill, Boston, MA.
Zhou, L. (2004), “A dimension-specific analysis of performance-only measurement of service
quality and satisfaction in China’s retail banking”, The Journal of Services Marketing,
Vol. 18 Nos 6/7, pp. 534-46.
Corresponding author
François A. Carrillat can be contacted at: francois.carrillat@hec.ca
... ServQual has been proven to be a valid and reliable tool across different fields [18,20,21]. A meta-analysis of studies that assessed the strength of the relationship between overall service quality (OSQ) and ServQual has shown that their overall relationship is greater than 0.50 (r=0.58; ...
... CI90%=0.50−0.66) proving that ServQual is a valid measure of service quality [21]. Multiple studies have demonstrated the internal consistency of ServQual with reliability coefficients of the different dimensions consistently higher than 0.7 [18,22]. ...
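For readers unfamiliar with how such pooled correlations and confidence intervals are obtained, the following is a minimal Python sketch of one common approach: transforming study-level correlations with Fisher's z, applying inverse-variance weights, and back-transforming a 90 per cent interval. The correlations and sample sizes below are hypothetical, not the effect sizes from the meta-analysis, and the fixed-effect weighting is a simplification of the procedures typically used in published meta-analytic work.

```python
import numpy as np
from scipy import stats

# Hypothetical study-level correlations between a service quality scale
# and overall service quality (OSQ), with their sample sizes.
r = np.array([0.52, 0.61, 0.58, 0.47, 0.66])
n = np.array([210, 145, 320, 180, 95])

# Fisher's z transformation stabilizes the sampling variance of r.
z = np.arctanh(r)
var_z = 1.0 / (n - 3)      # approximate sampling variance of z
w = 1.0 / var_z            # inverse-variance (fixed-effect) weights

z_bar = np.sum(w * z) / np.sum(w)
se = np.sqrt(1.0 / np.sum(w))

# 90% confidence interval on the z scale, back-transformed to r.
z_crit = stats.norm.ppf(0.95)
lo, hi = np.tanh(z_bar - z_crit * se), np.tanh(z_bar + z_crit * se)
print(f"pooled r = {np.tanh(z_bar):.2f}, 90% CI = [{lo:.2f}, {hi:.2f}]")
```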
Article
Background: Teaching clinics provide low-cost health programs while offering valuable learning opportunities for student clinicians, which in turn contributes to increasing health care accessibility. To date, there is a paucity of literature exploring the satisfaction of patients seen in rehabilitation teaching clinics in developing countries. The Service Quality (ServQual) Scale is a valid and reliable tool that has been used to measure client satisfaction in different work settings and industries. Objectives: The aim of this study was to demonstrate the usefulness of ServQual in measuring the satisfaction of clients in a rehabilitation teaching clinic in a developing country. Methodology: A cross-sectional survey was conducted over three months among Clinic for Therapy Services - Adult and Adolescent Section (CTS-AA) clients who were at least 18 years old, had attended at least three sessions, and could read. Prior to administration in CTS-AA, the ServQual scale was translated into Filipino, validated and pilot tested for reliability. Results: Thirty-two respondents were included in the analysis. There was no statistically significant difference between the expectations and the perceptions of the clients for the domains of reliability (z=1.799, p=0.0721), responsiveness (z=0.839, p=0.4013), assurance (z=1.914, p=0.0556) and empathy (z=1.772, p=0.0764). However, there was a statistically significant difference between the clients' perception and expectation for tangibles (z=4.117, p<0.0001) and between the overall client perception and expectation (z=4.086, p<0.0001). The overall ServQual score for CTS-AA is -0.3782. Conclusion: The ServQual has been shown to be useful in assessing the satisfaction of clients in rehabilitation clinics and the specific areas that need improvement. The tool could still be improved by including items on cost, the relationship of students with supervisors, and treatment outcomes.
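As a rough illustration of the gap-score logic this abstract relies on, the sketch below computes a SERVQUAL gap (perception minus expectation) for one dimension and runs a Wilcoxon signed-rank test on the paired ratings. The ratings are simulated and the dimension label is hypothetical; this is not the CTS-AA data or analysis.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
n_clients = 32

# Hypothetical 7-point ratings for one SERVQUAL dimension (e.g. "tangibles"):
# paired expectation (E) and perception (P) scores from the same respondents.
expectation = rng.integers(4, 8, size=n_clients).astype(float)
perception = rng.integers(3, 8, size=n_clients).astype(float)

# SERVQUAL gap score: perception minus expectation (negative = the service
# falls short of what clients expected).
gap = perception - expectation
print(f"mean gap score = {gap.mean():.4f}")

# Wilcoxon signed-rank test on the paired ratings (the abstract reports
# normal-approximation z values; SciPy returns the signed-rank statistic
# and the p-value).
stat, p_value = wilcoxon(perception, expectation)
print(f"Wilcoxon statistic = {stat:.1f}, p = {p_value:.4f}")
```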
... The paper responds to calls to adapt service quality models to the healthcare context and to use data from both public and private settings outside the United States. [1,2] Indeed, the healthcare management literature points to the lack of (and compelling need for) research on perceived service quality and performance in developing and emerging economies, and to the need for well-validated measurement models for evaluating service quality; most existing models of healthcare service quality are of Western origin. [3] Data on patient satisfaction in CEECs are sparse and often unexploited. ...
Article
Objective: Recent increases in per capita income and longevity in Central and Eastern European countries (CEECs), alongside a slow-changing Soviet-era public healthcare system, have led to the emergence of private hospitals. This paper investigates the differential patient service quality perceptions for private versus public hospitals, as well as for three types of healthcare services: primary, ambulatory, and inpatient care. Methods: Data from 1,673 patients of private and public hospitals in the capital of Romania were collected in face-to-face interviews. Analysis of covariance and partial-least-squares techniques were used to examine the relationships between perceived service quality, hospital ownership status and the type of health service patients received. Results: Over 70% of women prefer private health facilities to public hospitals (compared to less than 30% of men). While private hospitals rank higher than public hospitals on most attributes, the interaction effect of gender and hospital type reveals that assurance and empathy are the only significant attributes in driving women to private hospitals. Both tangible dimensions of service quality (physical facilities and staff appearance) and intangible dimensions (assurance, responsiveness, reliability, and empathy) have a positive impact on perceived overall service quality of healthcare. Improvements in perceptions of hospitals' tangibles and of staff responsiveness and empathy have the greatest potential to enhance perceived overall service quality. Conclusions: This paper demonstrates the importance of breaking down health services into various sub-categories, both in terms of perceived healthcare attributes and in terms of tangible healthcare facilities, such as public and private hospitals.
... However, the link between service quality and consumer expectation is difficult to establish. It has been shown that SERVQUAL and SERVPERF are adequate and valid scales for evaluating service quality (Carrillat et al., 2007). ...
Article
Most theories are tested in developed countries; moreover, there is still a lack of research on SERVPERF, especially in low-cost hotels. Hence, given the high competition among low-cost hotels in Nigeria, many low-cost hotels are seeking alternative ways of competing apart from pricing or operational costs. Thus, the aim of this study is to identify the effects of SERVPERF dimensions on guests’ satisfaction and loyalty in low-cost hotels, as well as the mediating role of guests’ satisfaction in the relationships between SERVPERF dimensions and guests’ loyalty. Data were collected from 300 guests at low-cost hotels and were analyzed using structural equation modeling (SEM). Composite reliability, Cronbach’s alpha and average variance extracted were used to test the reliability and validity of the instrument. Results revealed that empathy, responsiveness and reliability influence guests’ satisfaction. Furthermore, empathy, reliability and satisfaction influence guests’ loyalty. Results on mediation revealed that guests’ satisfaction partially mediates the relationships between empathy as well as reliability and guests’ loyalty, while guests’ satisfaction fully mediates the path between responsiveness and guests’ loyalty. This study recommends that low-cost hotel management emphasize empathy and reliability to increase guest loyalty.
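The mediation results above were obtained with structural equation modeling; as a simplified, hypothetical illustration of the same mediation logic (SERVPERF dimension → satisfaction → loyalty), the sketch below estimates an indirect effect with ordinary least squares on simulated data. Variable names, coefficients, and data are invented for the example and do not come from the study.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 300

# Simulated standardized scores: empathy -> satisfaction -> loyalty,
# plus a direct empathy -> loyalty path (i.e. partial mediation).
empathy = rng.normal(size=n)
satisfaction = 0.5 * empathy + rng.normal(scale=0.8, size=n)
loyalty = 0.3 * empathy + 0.4 * satisfaction + rng.normal(scale=0.8, size=n)

# Path a: predictor -> mediator.
a_fit = sm.OLS(satisfaction, sm.add_constant(empathy)).fit()
a = a_fit.params[1]

# Paths b and c': mediator and predictor -> outcome, estimated jointly.
X = sm.add_constant(np.column_stack([empathy, satisfaction]))
out_fit = sm.OLS(loyalty, X).fit()
c_prime, b = out_fit.params[1], out_fit.params[2]

print(f"indirect effect (a*b) = {a * b:.3f}")    # mediated portion
print(f"direct effect (c')    = {c_prime:.3f}")  # remaining direct path
```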
Article
Full-text available
This meta-analysis compiles data from 30 studies on the quality of public transport services conducted in different countries. The studies collectively investigate different forms of public transportation, such as buses, paratransit, and rail services, using a range of methodological approaches. The recurring themes in service quality are key dimensions such as reliability, responsiveness, assurance, empathy, and tangibles. The analysis demonstrates that these dimensions have a significant impact on user satisfaction, perceived value, and behavioural intentions, underscoring the widespread relevance of service quality measures such as SERVQUAL and SERVPERF. Furthermore, the results highlight differences in how service quality is perceived across regions and the pressing need for targeted policy interventions to improve public transportation systems worldwide.
Article
Full-text available
The services provided by a library need to be managed over time with the multidimensional nature of service quality in mind. This longitudinal study investigates the impact of interventions carried out between 2013 and 2017 at a university library on the dimensions of service quality. The methodology comprises the application of the SERVQUAL instrument at a public university library in 2013 (T1), n=355 users, and in 2017 (T2), n=184 users; in 2017 a case study was also conducted to document the interventions carried out, which were classified into four categories: Infrastructure, Equipment, Processes/Systems, and People. The results show that the gaps for all service quality dimensions were negative in both periods evaluated. It is concluded that the interventions reduced the negative gaps in the service quality dimensions of the library under study.
Article
Full-text available
The purpose of this study was to analyze whether the determinants of quality act as predictors of the quality perceived by users of remote pedagogical activities (Atividades Pedagógicas Não Presenciais) in the Production Engineering program at IFES/Campus Cariacica. To this end, an adapted survey was used to collect data from a sample of students. Once collected and tabulated, the data were analyzed with Partial Least Squares Structural Equation Modeling (PLS-SEM) in order to test the validity of twenty hypotheses. As a result, ten hypotheses were not rejected, showing that the quality determinants Empathy and Assurance in both the asynchronous and synchronous modes, as well as Tangibility in the asynchronous mode, act as predictors of the quality perceived in the remote pedagogical activities. The stated limitations of the work are that it was applied to only a single program on one campus and that a non-probability sampling technique was used. For future work, the authors suggest using a probability sampling technique and expanding the sample to additional programs and other IF campuses.
Article
The purpose of this study is to evaluate the level of customer satisfaction with loan services in the banking industry. The study uses the service quality of loan facilities at Barclays Bank Ghana Limited as a case study. Through a stratified random sampling method, seventy (70) respondents of the bank's branch in Nkawkaw in the Eastern Region of Ghana were selected to participate in the study. Questionnaires and interviews were used to measure the respondents' level of satisfaction with the bank's loan services and other service quality dimensions. The data gathered were analysed using SPSS (Version 16.0) and presented by the use of statistical models such as graphs, charts, and tables. It was observed that the customers were generally satisfied with the loan services when assessed on 12 service quality dimensions. Interest rates, account charges, company reputation and past experience with the bank were found to be important factors in attracting the customers. Existing customers are also willing to recommend the bank to their reference group. These factors are useful for strategic decisions.
Article
Full-text available
The sharing economy has emerged as an influential concept, shaping the way business is transacted. Of particular interest is the transport industry, where the combination of GPS and software development has resulted in the creation of e-hailing transportation systems. Issues such as the provision of quality services persist, giving rise to further investigation. The study therefore delved into service quality, satisfaction and loyalty among users of e-hailing transport services. Using a cross-sectional descriptive survey, a convenience sample was employed to intercept 436 users of e-hailing vehicles at pick-up and drop-off points in Accra, Ghana. A structural equation model was used to test the relationships between service quality, satisfaction and loyalty among users of e-hailing vehicles. Results show that assurance (β = 0.149, p = .022), tangibles (β = 0.140, p = .045), responsiveness (β = 0.040, p = .008), reliability (β = 0.014, p = .015), empathy (β = 0.062, p = .000), system information (β = 0.013, p = .005) and price (β = 0.001, p = .008) significantly influence user satisfaction. Similarly, customer satisfaction was found to influence the loyalty of users of e-hailing vehicles. The study concludes that both individual service quality dimensions and the composite service quality model influence users’ satisfaction, which in turn influences users’ willingness to re-use and recommend e-hailing services to others. Implications for theory and practice are discussed.
Article
Purpose The purpose of this manuscript is to provide a step-by-step primer on systematic and meta-analytic reviews across the service field, to systematically analyze the quality of meta-analytic reporting in the service domain, to provide detailed protocols authors may follow when conducting and reporting these analyses and to offer recommendations for future service meta-analyses. Design/methodology/approach Eligible frontline service-related meta-analyses published through May 2021 were identified for inclusion (k = 33) through a systematic search of Academic Search Complete, PsycINFO, Business Source Complete, Web of Science, Google Scholar and specific service journals using search terms related to service and meta-analyses. Findings An analysis of the existing meta-analyses within the service field revealed that, while these often provide high-quality results, the quality of their reporting can be improved in several ways to enhance the replicability of published meta-analyses in the service domain. Practical implications This research employs a question-and-answer approach to provide a substantive guide for both properly conducting and properly reporting high-quality meta-analytic research in the service field for scholars at various levels of experience. Originality/value This work aggregates best practices from diverse disciplines to create a comprehensive checklist of protocols for conducting and reporting high-quality service meta-analyses while providing additional resources for further exploration.
Article
Full-text available
The authors respond to concerns raised by Cronin and Taylor (1992) and Teas (1993) about the SERVQUAL instrument and the perceptions-minus-expectations specification invoked by it to operationalize service quality. After demonstrating that the validity and alleged severity of many of those concerns are questionable, they offer a set of research directions for addressing unresolved issues and adding to the understanding of service quality assessment.
Article
Full-text available
The attainment of quality in products and services has become a pivotal concern of the 1980s. While quality in tangible goods has been described and measured by marketers, quality in services is largely undefined and unresearched. The authors attempt to rectify this situation by reporting the insights obtained in an extensive exploratory investigation of quality in four service businesses and by developing a model of service quality. Propositions and recommendations to stimulate future research about service quality are offered.
Article
The author examines conceptual and operational issues associated with the “perceptions-minus-expectations” (P-E) perceived service quality model. The examination indicates that the P-E framework is of questionable validity because of a number of conceptual and definitional problems involving the (1) conceptual definition of expectations, (2) theoretical justification of the expectations component of the P-E framework, and (3) measurement validity of the expectation (E) and revised expectation (E*) measures specified in the published service quality literature. Consequently, alternative perceived quality models that address the problems of the traditional framework are developed and empirically tested.
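To make the specification under discussion concrete, a generic formulation of the two scoring rules (not the exact item set or weighting of any particular study) can be written as:

\[
\mathrm{SQ}_{\mathrm{SERVQUAL}} = \frac{1}{k}\sum_{i=1}^{k}\bigl(P_i - E_i\bigr),
\qquad
\mathrm{SQ}_{\mathrm{SERVPERF}} = \frac{1}{k}\sum_{i=1}^{k} P_i ,
\]

where $P_i$ is the perceived performance rating and $E_i$ the expectation rating on item $i$ of a $k$-item scale; the critiques summarized above concern the definition and measurement of the $E_i$ component.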
Article
A typology of service organizations is presented and a conceptual framework is advanced for exploring the impact of physical surroundings on the behaviors of both customers and employees. The ability of the physical surroundings to facilitate achievement of organizational as well as marketing goals is explored. Literature from diverse disciplines provides theoretical grounding for the framework, which serves as a base for focused propositions. By examining the multiple strategic roles that physical surroundings can exert in service organizations, the author highlights key managerial and research implications.