The validity of the SERVQUAL
and SERVPERF scales
A meta-analytic view of 17 years of research
across five continents
François A. Carrillat
HEC Montréal, Montréal, Canada
Fernando Jaramillo
Department of Marketing, University of Texas at Arlington, Arlington,
Texas, USA, and
Jay P. Mulki
Marketing Group, Northeastern University, Boston, Massachusetts, USA
Abstract
Purpose – The purpose is to investigate the difference between SERVQUAL’s and SERVPERF’s
predictive validity of service quality.
Design/methodology/approach – Data from 17 studies containing 42 effect sizes of the
relationships between SERVQUAL or SERVPERF and overall service quality (OSQ) are
meta-analyzed.
Findings – Overall, SERVQUAL and SERVPERF are equally valid predictors of OSQ. Adapting the
SERVQUAL scale to the measurement context improves its predictive validity; conversely, the
predictive validity of SERVPERF is not improved by context adjustments. In addition, measures of
service quality gain predictive validity when used in: less individualistic cultures, non-English
speaking countries, and industries with an intermediate level of customization (hotels, rental cars, or
banks).
Research limitations/implications – No studies using non-adapted scales were
conducted outside of the USA, making it impossible to disentangle the impact of scale adaptation vs
contextual differences on the moderating effect of language and culture. More comparative studies on
the usage of adapted vs non-adapted scales outside the USA are needed before settling this issue
meta-analytically.
Practical implications – SERVQUAL scales need to be adapted to the study context more so
than SERVPERF. Owing to their equivalent predictive validity, the choice between SERVQUAL and
SERVPERF should be dictated by diagnostic purposes (SERVQUAL) vs the need for a shorter instrument
(SERVPERF).
Originality/value – Because of the high statistical power of meta-analysis, these findings could be
considered a major step toward ending the debate over whether SERVPERF is superior to SERVQUAL
as an indicator of OSQ.
Keywords Services, SERVQUAL, Culture, Quality
Paper type Research paper
Over the years, marketing researchers have reached consensuses on several issues related
to the domain of services. First, as the economy has become mostly service-based,
researchers now consider the marketing discipline as being service dominated.
Received 9 January 2006
Revised 4 February 2007
Accepted 17 May 2007
International Journal of Service Industry Management
Vol. 18 No. 5, 2007
pp. 472-490
© Emerald Group Publishing Limited
0956-4233
DOI 10.1108/09564230710826250
Consumers in OECD countries spend more on services than on tangible goods (Martin,
1999). Indeed, service activities constitute about 70 percent of the GDP of OECD countries (OECD, 2005),
and this trend is expected to continue in the coming decade. The globalization of services
marketing has presented both academics and practitioners with challenges and opportunities
in this area (Javalgi et al., 2006). Reflecting this changing emphasis, services marketing
has become a well-established field of academic inquiry and now represents an alternative
paradigm to the marketing of goods (Lovelock and Gummesson, 2004).
Researchers also agree that a central topic in service research is service quality (SQ),
which is a critical determinant of business performance as well as firms’ long-term
viability (Bolton and Drew, 1991; Gale, 1994). This is because SQ leads to customer
satisfaction which in turn has a positive impact on customer word-of-mouth,
attitudinal loyalty, and purchase intentions (Gremler and Gwinner, 2000). The view
that SQ results from customers’ evaluation of the service encounter prevails in the
literature (Cronin and Taylor, 1992; Parasuraman et al., 1985). Under this perspective,
researchers further agree that SQ is best represented as an aggregate of the discrete
elements from the service encounter such as reliability, responsiveness, competence,
access, courtesy, communication, credibility, security, understanding, and tangible
elements of the service offer (Cronin and Taylor, 1992; Dabholkar et al., 2000;
Parasuraman et al., 1985).
On the other hand, the question of the operationalization of SQ has continued to evoke
discussion. This discussion has been primarily centered on two important issues. The
first relates to the debate of whether SERVQUAL or SERVPERF should be used for
measuring SQ (Cui et al., 2003; Hudson et al., 2004; Jain and Gupta, 2004; Kettinger and
Lee, 1997; Mukherje and Nath, 2005; Quester and Romaniuk, 1997). SERVQUAL,
grounded in the Gap model, measures SQ as the calculated difference between customer
expectations and performance perceptions of a service encounter (Parasuraman et al.,
1988, 1991). Cronin and Taylor (1992) challenged this approach and developed the
SERVPERF scale which directly captures customers’ performance perceptions in
comparison to their expectations of the service encounter. In spite of recent attempts in
the literature toward settling this issue, the SERVQUAL-SERVPERF debate has never
been so relevant. In fact, numerous authors have supported the view that SERVPERF is
a better alternative than SERVQUAL (Babakus and Boller, 1992; Brady et al., 2002;
Brown et al., 1993; Zhou, 2004) while, on the other hand, SERVQUAL has enjoyed and
continues to enjoy widespread acceptance as a measure of SQ (Chebat et al., 1995; Furrer
et al., 2000; Zeithaml and Bitner, 2003). In addition, the Web of Science reveals that the
original SERVQUAL paper published in 1988, as well as the subsequent 1991 scale
refinement paper, have both received more than 46 percent of their total citations within
the last five years. The same is true of SERVPERF, which also received more than
46 percent of its citations within the last five years. This indicates that Cronin and
Taylor’s (1994) conceptual arguments in favor of SERVPERF, while they may have
contributed to SERVPERF’s popularity, have not reduced SERVQUAL’s usage among
scholars. In addition, it suggests that the multilevel scale offered by Brady and Cronin
(2001) as a reconciling perspective has not moved researchers away from either
SERVQUAL or SERVPERF. Therefore, shedding light on whether one scale is better
than the other remains a very important question to be answered.
The second issue centers on the trade-off between the generalizability and
specificity level of the SERVQUAL and SERVPERF scales (Asubonteng et al., 1996).
A scale can be applied in more diversified contexts as its items become more abstract
(Babakus and Boller, 1992; Dabholkar et al., 2000). However, this limits the scale’s
ability to capture specific context elements (Babakus and Boller, 1992; Dabholkar et al.,
2000). There is a general acceptance of the need to modify scale items to suit study
context. However, empirical investigation regarding the impact of item adaptation on
scale validity (i.e. when original SERVQUAL/SERVPERF items versus modified items
are used) has not been undertaken. In addition, research is needed to assess the
appropriateness of the SERVQUAL/SERVPERF scales when they are used outside
the USA. This is because differences in national culture or language not only require
modification of items but also create distortions in how respondents perceive the
construct under investigation (Herk et al., 2005).
The above discussion raises several important research questions. First, are
SERVQUAL and SERVPERF adequate predictors of SQ? And, as proposed by Cronin
and Taylor (1992), is SERVPERF a better predictor of SQ than SERVQUAL? Second, is
there an improvement in the predictive validity of the SERVQUAL and SERVPERF
measures when the scale items are adapted to the study context? Third, does the
predictive power of SERVQUAL and SERVPERF depend on national culture or scale
language? Finally, is the predictive validity of SERVQUAL and SERVPERF influenced
by the type of industry in which the study is conducted?
The current study addresses these research questions by meta-analyzing empirical SQ
research. Meta-analysis is appropriate for addressing these research questions because it
systematically integrates findings across studies, controls for statistical artifacts, and
provides very robust answers about relationships among variables (Arthur et al., 2001;
Hunter and Schmidt, 2004). Our meta-analytic framework relies on 42 effect sizes from 17
empirical studies conducted across five continents spanning 17 years.
Previous research has already attempted to compare SERVQUAL and SERVPERF
(Brady et al., 2002; Cronin and Taylor, 1992; Cui et al., 2003; Hudson et al., 2004; Jain and
Gupta, 2004; Kettinger and Lee, 1997; Quester and Romaniuk, 1997). However,
considering these studies individually provides dispersed evidence that might add to, rather
than subtract from, the ambiguity surrounding the measurement debate. For instance, Jain and
Gupta (2004) as well as Kettinger and Lee (1997) found that SERVPERF was more
strongly correlated to overall service quality (OSQ) than SERVQUAL whereas Quester
and Romaniuk (1997) reported that SERVQUAL exhibited a stronger relationship with
OSQ than SERVPERF. In some cases, studies comparing SERVQUAL and SERVPERF
focus on dimensionality issues without considering predictive validity (Cui et al., 2003;
Hudson et al., 2004). Furthermore, the aforementioned studies rely on one or two samples
at most, which prevents them from drawing robust conclusions and from testing the
impact of contingency factors such as country, language, or industry. Therefore,
the current research constitutes a significant contribution to the service literature
because it provides answers to the SERVQUAL/SERVPERF validity debate tackled by
Cronin and Taylor (1992, 1994), Brady et al. (2002), and Parasuraman et al. (1994).
In addition, because meta-analysis is based on the accumulation of empirical evidence
over the years, it allows investigating moderating factors by comparing sub-groups of
studies that share a similar characteristic, e.g. the country where the sample was
drawn (Lipsey and Wilson, 2001).
This paper is organized as follows. First, a review of the literature is presented
and hypotheses are developed. Second, a description of the meta-analytic
procedure is provided. Third, results, as well as implications and suggestions for
further research, are discussed.
Conceptual background
Both SERVQUAL’s and SERVPERF’s operationalizations relied on the conceptual
definition that SQ is an attitude toward the service offered by a firm resulting from a
comparison of expectations with performance (Parasuraman et al., 1985, 1988; Cronin
and Taylor, 1992). However, SERVQUAL directly measures both expectations and
performance perceptions whereas SERVPERF only measures performance
perceptions. SERVPERF uses only performance data because it assumes that
respondents provide their ratings by automatically comparing performance
perceptions with performance expectations. Thus, SERVPERF assumes that directly
measuring performance expectations is unnecessary.
Research comparing the predictive validity of SERVQUAL with SERVPERF has
been based on assessing which of the two measures is a better predictor of OSQ. OSQ
has been used as the criterion because it is a global representation of the quality of the
service offered by an organization (Cronin and Taylor, 1992, 1994; Jain and Gupta,
2004; Kettinger and Lee, 1997; Quester and Romaniuk, 1997). In their comparison of
SERVQUAL with SERVPERF, Cronin and Taylor (1992) built their argument for the
superiority of SERVPERF over SERVQUAL by empirically showing that SERVPERF
is a better predictor of OSQ than SERVQUAL. Also, Parasuraman et al. (1988) assessed
the construct validity of SERVQUAL by evaluating whether the scale was an adequate
predictor of OSQ. In view of this, the predictive validity of SERVQUAL and
SERVPERF is assessed by meta-analyzing extant empirical research on the strength of
the relationship between each scale and OSQ.
The predictive validity of SERVQUAL and SERVPERF
SERVQUAL and SERVPERF are based on rigorous scale development procedures
(Parasuraman et al., 1988, 1991) and have been widely used by researchers. Therefore,
it is expected that both the SERVQUAL and SERVPERF measures of SQ will be
strongly related to OSQ. The literature on scale development does not specifically point
to a particular correlation value with a criterion against which the predictive validity of
a scale can be assessed. However, it is possible to turn to less formal guidelines
formulated by researchers. According to Cohen’s (1992) rule of thumb, a “small” effect
size is observed when the correlation is 0.10, a “medium” effect size is obtained when
the correlation is 0.30, and a “large” effect size corresponds to a correlation of 0.50.
These guidelines have been previously used to qualify the strength of meta-analytic
correlations (Jaramillo et al., 2005). Therefore, the following is hypothesized:
H1. The correlation between SERVQUAL or SERVPERF and OSQ will be strong
and above 0.50.
The disconfirmation vs performance-only debate
In Parasuraman et al.’s (1985) “disconfirmation” perspective, the SQ construct is seen as
an attitude resulting from customers’ comparison of their expectations about the service
encounter with their perceptions of the service encounter. The SERVQUAL instrument
operationalizes this construct as the difference between expected and actual (perceived)
performance (Parasuraman et al., 1988, 1991). Alternatively, SERVPERF is based on the
“performance only” perspective and operationalizes SQ as customers’ evaluations of
the service encounter. As a result, SERVPERF uses only the performance items of the
SERVQUAL scale (Brady et al., 2002; Cronin and Taylor, 1992, 1994).
In discussing the relative merits of each scale, the debate has been primarily
centered on predictive validity and specifically on whether SERVQUAL or SERVPERF
better captures SQ. First, some researchers have argued that SERVPERF is a better
measure because it does not depend on ambiguous customers’ expectations.
Arguments in favor of SERVPERF are based on the notion that performance
perceptions are already the result of customers’ comparison of the expected and actual
service (Babakus and Boller, 1992; Oliver and DeSarbo, 1988). Therefore, performance
only measures should be preferred to avoid redundancy. Second, as Teas (1993)
points out, Parasuraman et al.’s (1991) conceptualization of SQ is inconsistent with its
operationalization. Teas (1993) argues that, since Parasuraman et al. (1991) define
expectations as a type of attitude, customer expectations must be considered as ideal
points. Hence, the Gap model implication that superior perceptions of SQ occur when
performance increasingly exceeds expectations is theoretically inconsistent. The
classical attitudinal perspective suggests that positive attitudes are formed when
evaluations of an object are close to an expected ideal point. Therefore, SQ should peak
when perceptions equal expectations (Teas, 1993).
Parasuraman et al. (1994) defended SERVQUAL by demonstrating that there was
virtually no difference in predictive power between SERVQUAL and SERVPERF.
Although discussions have continued on whether disconfirmation-based measures are
superior to performance-only based measures (Dabholkar et al., 2000; Hudson et al.,
2004; Jain and Gupta, 2004), the above discussed arguments point toward the
superiority of SERVPERF over SERVQUAL. Thus:
H2. The relationship between SQ and OSQ is stronger when SQ is measured with
SERVPERF than with SERVQUAL.
Contextual factors
Any scale represents a compromise between relevance and the extent to which it can be
applied in a wide array of contexts (Babakus and Boller, 1992). Scale modification is
done by adding, deleting or rewording items to ensure suitability for a particular
research context. SERVQUAL and SERVPERF scale modifications have led to
discussions about:
• the universal versus context specific character of the scales; and
• whether changes to fit a specific context result in better predictive validity.
It is important to mention that in their original development, SERVQUAL and
SERVPERF were purported to be universal measures of SQ because the scale
development process relied on samples from multiple industries (Cronin and Taylor,
1992; Parasuraman et al., 1988). However, Parasuraman et al. (1988) recognize that
SERVQUAL can be adapted to the specific research needs of a particular organization.
As Rossiter (2002) indicates, the specificities of the measurement context play an
important role in construct validity.
Researchers are particularly concerned about the effect of environmental factors on
the validity of SQ scales (Babin et al., 2004). In fact, researchers have failed to replicate
the five original dimensions of the SERVQUAL/SERVPERF scales, namely tangibility,
reliability, responsiveness, assurance, and empathy (White and Schneider, 2000).
Based on this, researchers have noted that SQ scales need to be adapted to the study
context (Carman, 1990). For instance, tangibility might not be relevant for a cable
company because the customer might never see the facilities of the service provider,
whereas it may be critical for a healthcare facility customer. In their study on the
photography industry, Dabholkar et al. (2000) dropped items related to physical
facilities (tangibility) from the original SERVQUAL because customers did not have to
visit the company’s site; however, they added items related to “salespeople pressure”
that are absent from SERVQUAL. The above discussion suggests that context adapted
versions of SERVQUAL and SERVPERF, hereinafter referred to as MQUAL and
MPERF, will have a better predictive validity than non-modified versions (QUAL or
PERF, respectively). Thus:
H3a. The relationship between SQ and OSQ will be stronger when SQ is measured
with MQUAL rather than with QUAL.
H3b. The relationship between SQ and OSQ will be stronger when SQ is measured
with MPERF rather than with PERF.
Country culture
Studies using SERVQUAL and SERVPERF have been conducted across more than 17
countries and on each and every continent. The use of these scales in an international
context raises a legitimate concern about validity across borders because research has
shown that cultural values influence customer responses on measures of SQ (Laroche
et al., 2004; Zhou, 2004). According to Herk et al. (2005), research conducted
internationally can be affected both by construct bias (i.e. the construct studied differs
across countries) and item bias (i.e. items are distorted when used internationally). For
instance, Sultan et al. (2000) found significant differences across US and European
passengers on their expectations and performance perceptions of airlines SQ. In
addition, Mattila (1999) found that Western customers are more likely than their Asian
counterparts to rely on tangible cues from the physical environment, which evidences
that the tangibility dimension of SERVQUAL is more important for them.
Researchers have found that cultural differences can also create item bias.
Steenkamp and Baumgartner (1998) show that both:
(1) the metric invariance (i.e. the interpretation of the distance between the scale
points); and
(2) the scalar invariance (i.e. whether scale latent means have systematic biases) of
items become uncertain when scales are used across cultures.
In fact, Diamantopoulos et al. (2006) found that international differences in response
styles (i.e. item wording, type of scale, etc.) generate item bias. Therefore, we propose
that SERVQUAL and SERVPERF are likely to be affected by construct and item biases
when used in international settings.
In order to account for cultural differences, it was decided to rely on Hofstede’s (1997)
individualism/collectivism (IDV) measure of national culture. IDV is useful and
parsimonious for explaining cross-cultural differences in attitudes and behaviours. Also,
IDV has satisfactory reliability and uni-dimensionality (Cano et al., 2004; Triandis, 1995).
Research indicates that IDV may affect perceptions of OSQ and its dimensions. For
instance, Furrer et al. (2000) argue that, in high individualistic cultures, consumers tend
to be independent, have an ethic of self-responsibility and demand a higher level of SQ.
Furrer et al. (2000) also note that individualistic consumers prefer to maintain a
significant distance between themselves and the service provider. In addition, their
study results show that consumers with a high degree of individualism considered
“responsiveness” and “tangibles” dimensions as more important compared to
consumers from collectivistic cultures. Individualistic customers tend to focus on their
own benefits and interests, and expect the service providers to do their best in catering to
their needs (Donthu and Yoo, 1998). Thus, individualistic customers pay careful
attention to the service provided and are not likely to accept lower SQ. Donthu and Yoo’s
(1998) study showed that individualistic customers have higher OSQ expectations,
higher empathy and assurance expectations from their service providers compared to
customers from collectivistic societies. SERVQUAL and SERVPERF were developed in
the USA, a country with the highest IDV level (Hofstede, 1997). In view of this, the
existing dimensions of the SERVQUAL and SERVPERF scales should match more
closely with the expectations of consumers from individualistic countries. As a result, it
is expected that the predictive validity of SERVQUAL will be diminished in countries
with a lower IDV level:
H4a. The strength of the relationship between SERVQUAL or SERVPERF and
OSQ decreases as the degree of individualism of the country decreases.
Country language
It is generally known that language translation can exacerbate cultural bias. Even when
scales are carefully translated and closely checked by experts
(Witkowski and Wolfinbarger, 2002; Zhou, 2004), the absence of a concept in a language
prevents perfect accuracy in scale translation (Herk et al., 2005). Thus, scale
translation can result in higher measurement error which attenuates relationships
among constructs (Hunter and Schmidt, 2004). Therefore, the following is hypothesized:
H4b. The strength of the relationship between SERVQUAL or SERVPERF and
OSQ is stronger when SERVQUAL or SERVPERF is administered in English
than when translated.
Type of services
It is expected that SERVQUAL or SERVPERF will perform differently depending on
the industry in which they are used. This is because the relevance of the scale
dimensions depends on the study setting (White and Schneider, 2000). Many
categorizations of services have been proposed in the literature (Bitner, 1992; Lovelock,
1983; Silvestro et al., 1992). Among the numerous service classifications, Silvestro
et al.’s (1992) production perspective has emerged as integrative of other service
typologies. Silvestro et al. (1992) divide service providers into the following three
groups that range from lower to higher intensity of customer processing:
(1) Professional services (PS) (i.e. low customer processing intensity) include services
provided by lawyers, business consultants, or field engineering. Some
characteristics of this group are: few transactions, highly customized,
process-oriented and long customer contact times. Value is added by front office
service employees who rely extensively on their own judgment to perform
the service.
(2) Service shops (SS) (i.e. intermediate customer processing intensity) such as
hotels, rental cars, or banks. This group has an intermediate level of
customization and judgment from service employees. Value added is generated
in both the back and front offices.
(3) Mass services (MS) (i.e. high customer processing intensity) such as those provided
by retailers, transportation, or confectionery. This group has many customer
transactions, few contact opportunities, and limited customization. Value added
comes from the back office and service employees use little judgment.
According to Silvestro et al. (1992), as the intensity of customer processing decreases, the
emphasis on process rather than product intensifies. The process elements of a service
are by nature intangible while the product elements are more tangible (Zeithaml and
Bitner, 2003). Therefore, less customer-processing-oriented service industries will have
more intangible service offers. Because SERVQUAL is purported to measure the service
aspects of the quality of customer experience, it is expected to perform better when
customer-processing intensity decreases while intangibility increases. Thus:
H5. The strength of the relationship between SERVQUAL and OSQ decreases as
the service category moves from PS to SS and to MS.
Methodology
All studies containing an effect size (r) that measures the strength of the relationship
between SQ (SERVQUAL, SERVPERF) and OSQ were eligible for inclusion.
Valid statistics included Pearson’s correlation coefficients (r) or any other statistics
that could be converted to r, such as F-value, t-value, p-value, and χ². Empirical
studies published in 1988 or after and available before May 30, 2005 were included in this
meta-analysis. This timeframe is used since SERVQUAL was first published in 1988.
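To illustrate the conversion step, the Python sketch below shows the textbook formulas for turning commonly reported test statistics into r. It is a minimal illustration rather than the authors’ actual procedure, and it assumes the simplest cases: a t-statistic, an F-statistic with one numerator degree of freedom, and a chi-square statistic with one degree of freedom; all numbers used are hypothetical.

```python
import math

def r_from_t(t, df):
    # r from a t-statistic with df degrees of freedom
    return math.sqrt(t ** 2 / (t ** 2 + df))

def r_from_f(f, df_error):
    # valid only for F-tests with one numerator degree of freedom
    return math.sqrt(f / (f + df_error))

def r_from_chi2(chi2, n):
    # valid only for chi-square statistics with one degree of freedom (phi coefficient)
    return math.sqrt(chi2 / n)

# Hypothetical example: t = 5.2 with df = 150 corresponds to r of about 0.39.
print(round(r_from_t(5.2, 150), 2))
```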
Study search
The following procedure was used to obtain an ample collection of studies reporting
the desired effect sizes. First, an electronic search of the following databases was
conducted: Science Direct, Emerald, and ProQuest (ABI/INFORM Global and dissertation
abstracts). Second, a manual examination of the articles identified from the
computer-based searches was carried out. Third, manual searches of leading
marketing and service journals were conducted. To contact marketing researchers, a
call for working papers, forthcoming articles, conference papers, and unpublished
research was posted on ELMAR-AMA (<5,000 subscribers). The search process
yielded a total of 17 studies containing 42 effect sizes resulting from studying 9,880
respondents (Table I).
Meta-analytic model
Meta-analyses can be conducted using either a fixed-effect (FE) or a random-effect (RE)
model (Hunter and Schmidt, 2004). A FE model assumes that the same ρ value
underlies the observed effect sizes in all the studies, whereas the RE model allows for
variation of the population parameter ρ across studies.
Table I. Coding of effect sizes included in the meta-analysis
Authors  Scale(a)  Country  IDV score(b)  Language  Services(c)  n(d)  r(e)
Angur et al. (1999) QUAL USA 91 English Mass 143 0.70
Angur et al. (1999) PERF USA 91 English Mass 143 0.72
Babakus and Boller (1992) PERF USA 91 English Shop 520 0.66
Bojanic (1991) PERF USA 91 English Pro 32 0.57
Brady et al. (2002) MPERF USA 91 English * 1,548 0.62
Cronin and Taylor (1992) PERF USA 91 English * 660 0.60
Cronin and Taylor (1992) QUAL USA 91 English * 660 0.54
Dabholkar et al. (2000) MQUAL USA 91 English Mass 397 0.78
Dabholkar et al. (2000) MPERF USA 91 English Mass 397 0.65
Freeman and Dart (1993) MQUAL Canada 80 English Pro 217 0.63
Jabnoun and Al-Tamimi (2003) MQUAL UAE 38 Non-English Mass 462 0.82
Lam (1995) PERF Hong Kong 25 English Mass 214 0.82
Lam (1995) QUAL Hong Kong 25 English Mass 214 0.69
Lam (1997) PERF Hong Kong 25 English Pro 82 0.71
Lee et al. (2000) MQUAL USA 91 English Shop 196 0.75
Lee et al. (2000) MQUAL USA 91 English Pro 128 0.59
Lee et al. (2000) MPERF USA 91 English Mass 197 0.72
Lee et al. (2000) MPERF USA 91 English Pro 128 0.71
Lee et al. (2000) MPERF USA 91 English Shop 196 0.81
Lee et al. (2000) MQUAL USA 91 English Mass 197 0.47
Mehta et al. (2000) MPERF Singapore 20 Non-English Shop 161 0.63
Mehta et al. (2000) MPERF Singapore 20 Non-English Shop 161 0.75
Mittal and Lassar (1996) MQUAL USA 91 English Pro 123 0.79
Mittal and Lassar (1996) QUAL USA 91 English Pro 123 0.77
Mittal and Lassar (1996) MQUAL USA 91 English Pro 110 0.86
Mittal and Lassar (1996) QUAL USA 91 English Pro 110 0.85
Pariseau and McDaniel (1997) MQUAL USA 91 English Mass 39 0.71
Quester and Romaniuk (1997) PERF Australia 90 English Pro 182 0.55
Quester and Romaniuk (1997) QUAL Australia 90 English Pro 182 0.51
Smith (1999) MQUAL UK 89 English Pro 177 0.38
Smith (1999) MPERF UK 89 English Pro 177 0.36
Wal et al. (2002) QUAL South Africa 65 English Shop 583 0.08
Witkowski and Wolfinbarger (2002) MQUAL Germany 67 Non-English Mass 101 0.63
Witkowski and Wolfinbarger (2002) MQUAL USA 91 English Shop 86 0.62
Witkowski and Wolfinbarger (2002) MQUAL USA 91 English Shop 75 0.62
Witkowski and Wolfinbarger (2002) MQUAL Germany 67 Non-English Shop 114 0.56
Witkowski and Wolfinbarger (2002) MQUAL Germany 67 Non-English Mass 132 0.54
Witkowski and Wolfinbarger (2002) MQUAL USA 91 English Mass 81 0.59
Witkowski and Wolfinbarger (2002) MQUAL USA 91 English Pro 103 0.59
Witkowski and Wolfinbarger (2002) MQUAL Germany 67 Non-English Pro 105 0.58
Witkowski and Wolfinbarger (2002) MQUAL USA 91 English Shop 105 0.57
Witkowski and Wolfinbarger (2002) MQUAL Germany 67 Non-English Shop 119 0.57
Notes: (a) QUAL = original SERVQUAL, MQUAL = modified SERVQUAL, PERF = original SERVPERF, MPERF = modified SERVPERF; (b) Hofstede’s individualism score; (c) type of service industry based on Silvestro et al. (1992); (d) sample size; (e) observed effect size; * these studies relied on multiple industries spanning across service types and were not included in this moderator analysis
Credibility intervals (i.e. the
distribution of population parameter values) were computed in addition to confidence
intervals (i.e. the range of the true population value) (Hunter and Schmidt, 2004).
Hunter and Schmidt’s (2004) RE model was used as it accounts for both random and
systematic variance and has been shown to yield very accurate credibility intervals in
simulation studies (Hall and Brannick, 2002). Also, both the observed mean
correlations (r) and the corrected mean correlations (r_c) were estimated by following
Arthur et al.’s (2001) procedure to account for measurement error.
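As a rough illustration of the computation described above, the following Python sketch applies the bare-bones Hunter and Schmidt (2004) logic to a handful of hypothetical effect sizes: it computes the sample-size-weighted mean correlation, subtracts the variance expected from sampling error, corrects the mean for attenuation using assumed reliabilities, and forms a 90 percent credibility interval. The data and reliability values are invented for the example, and the code is a simplified sketch, not the procedure actually used by the authors.

```python
import numpy as np

# Hypothetical inputs (not the study's data): observed correlations, sample
# sizes, and assumed reliabilities of the SQ scale (rxx) and the OSQ criterion (ryy).
r = np.array([0.70, 0.66, 0.54, 0.78])
n = np.array([143, 520, 660, 397])
rxx, ryy = 0.90, 0.85

# Bare-bones Hunter-Schmidt: sample-size-weighted mean and variance of r.
r_bar = np.sum(n * r) / np.sum(n)
var_r = np.sum(n * (r - r_bar) ** 2) / np.sum(n)

# Variance expected from sampling error alone, and the residual ("true") variance.
var_e = (1 - r_bar ** 2) ** 2 / (n.mean() - 1)
var_rho = max(var_r - var_e, 0.0)

# Correct the mean (and the residual SD) for attenuation due to unreliability.
attenuation = np.sqrt(rxx * ryy)
r_corrected = r_bar / attenuation
sd_rho = np.sqrt(var_rho) / attenuation

# 90 percent credibility interval for the corrected population correlation.
cred_90 = (r_corrected - 1.645 * sd_rho, r_corrected + 1.645 * sd_rho)
print(round(r_bar, 2), round(r_corrected, 2), tuple(round(x, 2) for x in cred_90))
```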
Test of moderators
When estimating the significance of nominal moderator variables with two categories,
we relied on the “standard method” as advised by Schenker and Gentleman (2001) and
implemented in a recent marketing meta-analysis (Jaramillo et al., 2005). The “standard
method” consists of building only one interval around the difference between the
two point estimates by adding and subtracting the appropriate z-value multiplied by
the square root of the sum of the squared SEs of each point estimate. If that interval does
not include zero, the difference between the two point estimates is statistically
significant. The standard method is preferred to comparisons of confidence intervals
since it provides greater statistical power for testing moderating hypotheses (Jaramillo et al.,
2005; Schenker and Gentleman, 2001). Note that since all the moderator hypotheses
were directional, the z-value used for computing the interval around the difference
between the point estimates corresponded to a 90 percent confidence level to generate
an α level of 0.05 as in a one-tailed test (Jaramillo et al., 2005).
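A minimal sketch of the “standard method” described above, assuming hypothetical point estimates and standard errors: the interval is built around the difference between the two estimates, and z = 1.645 yields a 90 percent interval, i.e. an α of 0.05 for a directional (one-tailed) test. This is an illustration, not the authors’ code.

```python
import math

def standard_method(est1, se1, est2, se2, z=1.645):
    """Interval around the difference between two point estimates
    (Schenker and Gentleman, 2001). z = 1.645 corresponds to a 90 percent
    interval, i.e. a one-tailed alpha of 0.05 for directional hypotheses."""
    diff = est1 - est2
    half_width = z * math.sqrt(se1 ** 2 + se2 ** 2)
    return diff - half_width, diff + half_width

# Illustrative numbers only: the difference is significant if the interval excludes zero.
low, high = standard_method(0.75, 0.04, 0.68, 0.05)
print(round(low, 2), round(high, 2), "significant" if (low > 0 or high < 0) else "not significant")
```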
When testing for continuous moderators, or nominal moderators with more than two
categories, the weighted regression approach of Lipsey and Wilson (2001) was adopted.
This procedure consists in regressing the disattenuated effect sizes on independent
variables (continuous or dummy coded) with w_i (the inverse variance component, which
gives more weight to effect sizes coming from homogeneous distributions) as the weight
for each observation. The moderation effect of IDV is tested using weighted regression
analysis. Weighted regression analysis is adequate to test the moderating effect of IDV
since it is a continuous variable (Cano et al., 2004; Lipsey and Wilson, 2001).
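The weighted regression step can be sketched as follows, using invented effect sizes, inverse-variance weights, and IDV scores. The sketch also applies the standard-error adjustment noted under Table III (dividing the standard errors a conventional WLS routine reports by the square root of the mean square residual before forming z-values); it illustrates the Lipsey and Wilson (2001) approach under these assumptions and is not the authors’ code.

```python
import numpy as np

# Hypothetical inputs (not the study's data): disattenuated effect sizes,
# inverse-variance weights, and the countries' IDV scores.
r_c = np.array([0.77, 0.82, 0.69, 0.63, 0.59])
w   = np.array([120.0, 300.0, 150.0, 90.0, 75.0])
idv = np.array([91.0, 38.0, 25.0, 20.0, 67.0])

# Weighted least squares via the normal equations: (X'WX) b = X'W y.
X = np.column_stack([np.ones_like(idv), idv])
XtWX = X.T @ (w[:, None] * X)
XtWy = X.T @ (w * r_c)
b = np.linalg.solve(XtWX, XtWy)

# Standard errors as a conventional WLS routine would report them
# (they carry the weighted mean square residual as a multiplier) ...
resid = r_c - X @ b
ms_resid = np.sum(w * resid ** 2) / (len(r_c) - X.shape[1])
se_wls = np.sqrt(ms_resid * np.diag(np.linalg.inv(XtWX)))

# ... and the Lipsey-Wilson (2001) adjustment cited in Table III: divide those
# standard errors by the square root of the mean square residual to obtain the
# standard errors used for the z-tests.
se_adj = se_wls / np.sqrt(ms_resid)
z = b / se_adj
print(np.round(b, 4), np.round(z, 2))
```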
Results
Table II presents the results of the meta-analysis. The overall strength of the relationship
between SERVQUAL and OSQ is larger than 0.50 (r = 0.58; r_c = 0.68;
CI 90 percent = 0.50-0.66). The average SERVPERF and OSQ correlation is also larger
than 0.50 (r = 0.64; r_c = 0.75; CI 90 percent = 0.52-0.77). Since the lower bound values
of the 90 percent confidence intervals for both SERVQUAL and SERVPERF are above
0.50, the grand mean correlations can be interpreted as large (Cohen, 1992). This indicates
that both SERVQUAL and SERVPERF are valid measures of SQ, thus bringing support
for H1. The presence of moderators of the SERVQUAL-OSQ and SERVPERF-OSQ
relationships is evidenced in statistically significant Q-statistics (Table II). The Q-statistic
is distributed as a χ² with k-1 degrees of freedom and is compared to the corresponding
critical χ² statistic. A significant Q-statistic demonstrates that the effect size distribution
is heterogeneous and indicates that the population varies systematically according to
some factors other than subject level sampling and measurement errors (Lipsey and
Wilson, 2001).
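For completeness, the Q-statistic test of homogeneity mentioned above can be sketched as follows, again with hypothetical correlations and sample sizes and with one common approximation for the inverse sampling variance of r; it is illustrative only.

```python
import numpy as np
from scipy import stats

# Hypothetical observed correlations and sample sizes (not the study's data).
r = np.array([0.70, 0.66, 0.54, 0.78, 0.62])
n = np.array([143, 520, 660, 397, 1548])

# Q-statistic: weighted sum of squared deviations from the mean effect size,
# using (n_i - 1) / (1 - r_bar^2)^2 as an approximate inverse sampling variance.
r_bar = np.sum(n * r) / np.sum(n)
w = (n - 1) / (1 - r_bar ** 2) ** 2
Q = np.sum(w * (r - r_bar) ** 2)

# A significant Q (compared with chi-square at k - 1 df) signals heterogeneity,
# i.e. likely moderators beyond sampling and measurement error.
k = len(r)
critical = stats.chi2.ppf(0.95, df=k - 1)
print(round(Q, 1), round(critical, 2), "heterogeneous" if Q > critical else "homogeneous")
```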
Table II. Overall meta-analytic results and categorical moderators for the relationship between SQ and OSQ
Subgroup  k(a)  n(b)  r(c)  r_c(d)  Q-statistic(e)  Confidence interval(f)  Credibility interval(g)  Percentage of variance explained(h)  FS N(i)
Overall  42  9,880  0.61  0.71  288  0.56-0.66  0.44-0.98  67.9  2,982
SERVQUAL  27  5,082  0.58  0.68  241  0.50-0.66  0.33-1.03  70.9  1,836
QUAL  7  2,015  0.46  0.54  56  0.27-0.66  0.11-0.96  71.6  378
MQUAL  20  3,067  0.66  0.77  57  0.60-0.72  0.57-0.97  64.4  1,540
SERVPERF  15  4,798  0.64  0.75  38  0.52-0.77  0.34-1.15  68.7  1,125
PERF  7  1,751  0.65  0.73  11  0.59-0.71  0.62-0.84  64.2  511
MPERF  8  2,965  0.64  0.75  29  0.57-0.70  0.64-0.87  42.9  600
English speaking  34  8,525  0.60  0.70  260  0.54-0.66  0.42-0.97  67.8  2,380
Non-English speaking  8  1,355  0.69  0.79  19  0.60-0.77  0.63-0.96  61.6  632
Notes: (a) number of effect sizes; (b) sample size; (c) attenuated mean effect size; (d) disattenuated (i.e. corrected) mean effect size; (e) critical values range from 12.59 to 56.93; (f) at the 95 percent level; (g) at the 90 percent level; (h) variance explained by sampling and measurement artifacts; (i) fail-safe N: number of studies with an effect size of zero (r_i = 0) needed to reduce the mean effect size (r_c) to 0.01
H2 posited that the relationship between SERVPERF and OSQ is stronger than
the SERVQUAL-OSQ relationship. However, a comparison of the strength of these
relationships reveals no significant difference. As shown in Table II, although the
mean SERVPERF-OSQ correlation (r_c = 0.75) is larger than the SERVQUAL-OSQ
correlation (r_c = 0.68), the difference is not statistically significant. In effect, the
90 percent confidence interval for the difference between the two point estimates
(r_c = 0.75 and r_c = 0.68) includes zero (CI 90 percent = -0.06 to 0.19), indicating that
there is no significant difference between the predictive validity of SERVQUAL versus
SERVPERF (Schenker and Gentleman, 2001).
H3a and H3b stated that the modified SERVQUAL or SERVPERF scales would be
more strongly related to OSQ than the original scales. The observed difference between
the predictive validity of the original SERVQUAL and its modified version was
statistically significant (QUAL r_c = 0.54 vs MQUAL r_c = 0.77; Δr_c = 0.23,
CI 90 percent = 0.06-0.40). This suggests that the predictive validity of SERVQUAL
increases when it is adapted to the study context. However, the observed difference
between the predictive validity of the original version of SERVPERF and its modified
version was not statistically significant (PERF r_c = 0.73 vs MPERF r_c = 0.75;
Δr_c = 0.02, CI 90 percent = -0.04 to 0.09). This suggests that the predictive validity of
SERVPERF does not change when the scale is modified.
According to H4a and H4b, the predictive validity of SERVQUAL on OSQ
decreases:
• as the individualism of the country sample decreases; and
• when the study is conducted in a non-English speaking country.
A weighted regression with the disattenuated correlations between SQ and OSQ as the
dependent variable, and IDV as the independent variable, revealed that a country’s
individualism negatively impacts the predictive validity of SERVQUAL (B = -0.001,
p < 0.05), which is contrary to what was hypothesized in H4a (Table III). In addition, the
mean effect size for English speaking countries was smaller than the mean effect size for
non-English speaking countries (non-English speaking r_c = 0.79 vs English speaking
r_c = 0.70; CI 90 percent = 0.04-0.16); thus, not providing support for H4b (Table II).
According to H5, when moving from lower to higher levels of customer processing
intensity, the predictive validity of SERVQUAL on OSQ should decrease. H5 implied
that the SERVQUAL-OSQ relationships should be strongest for PS followed by SS, and
weakest for MS. As shown in Table III, the strongest SERVQUAL-OSQ relationships
are for SS (B1 = 0.096, p < 0.05), followed by PS (base line), and then MS (B2 = -0.12,
p < 0.05). Hence, H5 is not supported.
Table III. Results for continuous moderators and multimodal categorical moderators
Moderator  Model  β  Adjusted SE(a)  z-value
Individualism-collectivism   y = β1x1 + ε          B1 = -0.001   0.0004   -2.68*
Industry type(b)             y = β1x1 + β2x2 + ε   B1 = 0.096    0.035     2.78*
                                                   B2 = -0.12    0.037    -3.28*
Notes: * significant at α = 0.05; (a) when applied in a meta-analytic study, although the β coefficient estimates are accurate, their standard errors need to be adjusted; Lipsey and Wilson (2001) indicate that the standard errors of the β coefficients need to be divided by the square root of the mean square residuals of the regression model in order to yield the z-values used for significance testing; (b) professional services is the base level; B1 corresponds to service shops and B2 to mass services
Discussion
The study results have important implications because they question isolated findings
from earlier studies. In spite of the discussions and several arguments provided by
researchers about the superiority of SERVPERF over SERVQUAL (Cronin and Taylor,
1992, 1994), the results of this meta-analysis suggest that both scales are adequate and
equally valid predictors of OSQ. Because of the high statistical power of meta-analysis
(Cohn and Becker, 2003), these findings could be considered a major step toward ending
the debate over whether SERVPERF is superior to SERVQUAL as an indicator of OSQ.
As Parasuraman et al. (1994) pointed out, the use of performance-only (SERVPERF)
vs the expectation/performance difference scale (SERVQUAL) should be governed by
whether the scale is used for a diagnostic purpose or for establishing theoretically
sound models. We believe that the SERVQUAL scale would have greater interest for
practitioners because of its richer diagnostic value. By comparing customer
expectations of service versus perceived service across dimensions, managers can
identify service shortfalls and use this information to allocate resources to improve SQ
(Parasuraman et al., 1994).
Our findings also reveal that the need to adapt the measure to the context of the
study is greater when SERVQUAL rather than SERVPERF is used. In effect, the
original versions of SERVQUAL had a significantly lower OSQ predictive validity
than the modified versions. However, both the original and modified versions of
SERVPERF had the same level of OSQ predictive validity. This has important
implications for both practitioners and academics. Practitioners using SERVQUAL for
OSQ diagnostic purposes need to spend greater effort in modifying the scale for
context than SERVPERF users.
Our results also show an interesting pattern. Since SERVQUAL and SERVPERF
were originally developed in the USA, we expected that the predictive validity of these
instruments would be higher when used in countries with national cultures and
languages similar to the US. However, results show that the predictive validity of
SERVQUAL and SERVPERF on OSQ was higher for non-English speaking countries
and for countries with lower levels of individualism. A closer examination of the
sample used in our study revealed that all studies conducted in non-English speaking
countries as well as those conducted in less individualistic countries relied on modified
versions of the SERVQUAL scale. Hence, scale modification rather than cultural
context could be driving the results. Since there were no studies conducted outside the
US using non-modified scales, it was not possible to isolate the effect of national culture
and language. Further research is needed to address this important issue. An
interesting avenue would be an experimental design where respondents outside the US
would be given a modified scale (i.e. adapted to the industry context) and others would
be given the original items; this would allow teasing apart the effects of culture and
scale adaptation on the scale’s validity.
Finally, results suggest the predictive validity of SERVQUAL on OSQ is highest in
medium customer processing intensity contexts with an intermediate degree of
intangibility (SS) followed by low customer processing intensity (PS) and high
customer processing intensity (MS).
A plausible explanation for this finding is that SERVQUAL was developed as a
scale generalizable across service contexts. Hence, predictive validity peaks in the
category that represents a compromise between the emphasis on process and product
(i.e. service shop). Another reason could be the varying degree of importance of the
service used in the analysis to the customer. Additional research is needed for a better
understanding of this result.
With the growing proliferation of self-service technology (SST) encounters,
factors that contribute to satisfaction and dissatisfaction in the SST customer
interaction have drawn considerable interest from researchers and practitioners
(Meuter et al., 2000). Further research could explore the degree of predictive validity of
SERVQUAL on OSQ in SST customer interactions.
Like any other meta-analysis, this study is subject to the file drawer problem which
prevents the true effect size from being uncovered (Lipsey and Wilson, 2001). However,
as shown in Table II, the fail-safe N statistic reveals that several hundred studies
unaccounted for, with an effect size of zero, would be necessary to nullify the effect
sizes computed. This strengthens the confidence in the results obtained. Finally, in this
study, SERVQUAL and SERVPERF were only assessed through their predictive
validity of OSQ. A future meta-analysis could employ additional validation techniques.
For example, meta-analysis can be used to construct a broader nomological network
that includes constructs related to SQ such as customer satisfaction, customer loyalty,
purchase intention, and word-of-mouth (Zeithaml, 2000). Researchers could then assess
whether using SERVQUAL or SERVPERF affects the effect of SQ on the
above-mentioned constructs.
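The fail-safe N reported in Table II (note i) can be reproduced, up to rounding of the mean corrected effect size, with the simple formula sketched below; the check against the overall row is illustrative only and is not the authors’ code.

```python
def fail_safe_n(k, mean_effect, criterion=0.01):
    """Number of additional zero-effect studies needed to pull the mean
    effect size down to the criterion value (Table II, note i): solves
    (k * mean_effect) / (k + N) = criterion for N."""
    return k * (mean_effect - criterion) / criterion

# Illustrative check against the overall row of Table II (k = 42, r_c = 0.71):
# prints 2940, of the same order as the reported 2,982 (the gap is consistent
# with the published mean effect size being rounded to two decimals).
print(round(fail_safe_n(42, 0.71)))
```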
References
Angur, M.G., Nataraajan, R. and Jahera, J.S. Jr (1999), “Service quality in the banking industry:
an assessment in a developing economy”, The International Journal of Bank Marketing,
Vol. 17 No. 3, pp. 116-25.
Arthur, W.J., Bennett, W. and Huffcutt, A.I. (2001), Conducting Meta-Analysis Using SAS,
Lawrence Erlbaum Associates, Mahwah, NJ.
Asubonteng, P., McCleary, K.J. and Swan, J.E. (1996), “SERVQUAL revisited: a critical review of
service quality”, Journal of Services Marketing, Vol. 10 No. 6, pp. 62-70.
Babakus, E. and Boller, G.W. (1992), “An empirical assessment of the SERVQUAL scale”,
Journal of Business Research, Vol. 24 No. 3, pp. 253-68.
Babin, B., Chebat, J-C. and Michon, R. (2004), “Perceived appropriateness and its effect on quality,
affect and behavior”, Journal of Retailing & Consumer Services, Vol. 11 No. 5, pp. 287-98.
Bitner, M.J. (1992), “Servicescapes: the impact of physical surroundings on customers and
employees”, Journal of Marketing, Vol. 56 No. 2, pp. 57-71.
Bojanic, D.C. (1991), “Quality measurement in professional services firms”, Journal of
Professional Services Marketing, Vol. 7 No. 2, pp. 27-36.
Bolton, R.N. and Drew, J.H. (1991), “A multistage model of customers’ assessments of service
quality and value”, Journal of Consumer Research, Vol. 17 No. 4, pp. 375-84.
Brady, M.K. and Cronin, J.J. Jr (2001), “Some new thoughts on conceptualizing perceived service
quality: a hierarchical approach”, Journal of Marketing, Vol. 65 No. 3, pp. 34-49.
Brady, M.K., Cronin, J.J. Jr and Brand, R.R. (2002), “Performance-only measurement of service
quality: a replication and extension”, Journal of Business Research, Vol. 55 No. 1, pp. 17-31.
Brown, T.J., Churchill, G.A. Jr and Peter, P.J. (1993), “Improving the measurement of service
quality”, Journal of Retailing, Vol. 68 No. 1, pp. 127-39.
Cano, C.R., Carrillat, F.A. and Jaramillo, F. (2004), “A meta-analysis of the relationship between
market orientation and business performance: evidence from five continents”,
International Journal of Research in Marketing, Vol. 21 No. 2, pp. 179-200.
Carman, J.M. (1990), “Consumer perceptions of service quality: an assessment of the SERVQUAL
dimensions”, Journal of Retailing, Vol. 66 No. 1, pp. 33-55.
Chebat, J-C., Filiatrault, P., Gelinas-Chebat, C. and Vaninsky, A. (1995), “Impact of waiting
attribution and consumer’s mood on perceived quality”, Journal of Business Research,
Vol. 34 No. 3, pp. 191-6.
Cohen, J. (1992), “A power primer”, Psychological Bulletin, Vol. 112 No. 1, pp. 155-9.
Cohn, L.D. and Becker, B.J. (2003), “How meta-analysis increases statistical power”,
Psychological Methods, Vol. 8 No. 3, pp. 243-53.
Cronin, J.J. Jr and Taylor, A.S. (1992), “Measuring service quality: a reexamination and an
extension”, Journal of Marketing, Vol. 56 No. 3, pp. 55-67.
Cronin, J.J. Jr and Taylor, A.S. (1994), “SERVPERF versus SERVQUAL: reconciling performance
based and perception based – minus – expectation measurements of service quality”,
Journal of Marketing, Vol. 58 No. 1, pp. 125-31.
Cui, C.C., Lewis, B.R. and Park, W. (2003), “Service quality measurement in the banking sector
Korea”, International Journal of Bank Marketing, Vol. 21 No. 4, pp. 191-201.
Dabholkar, P.A., Shepherd, C.D. and Thorpe, D.I. (2000), “A comprehensive framework for
service quality: an investigation of critical conceptual and measurement issues through a
longitudinal study”, Journal of Retailing, Vol. 76 No. 2, pp. 139-73.
Diamantopoulos, A., Reynolds, N.L. and Simintiras, A.C. (2006), “The impact of response styles
on the stability of cross-national comparisons”, Journal of Business Research, Vol. 59 No. 8,
pp. 925-35.
Donthu, N. and Yoo, B. (1998), “Cultural influences on service quality expectations”, Journal of
Service Research, Vol. 1 No. 2, pp. 178-86.
Freeman, K.D. and Dart, J. (1993), “Measuring the perceived quality of professional business
services”, Journal of Professional Services Marketing, Vol. 9 No. 1, pp. 27-47.
Furrer, O., Liu, B.S-C. and Sudharshan, D. (2000), “The relationships between culture and service
quality perceptions: basis for cross-cultural market segmentation and resource allocation”,
Journal of Service Research, Vol. 2 No. 4, pp. 355-71.
Gale, B.T. (1994), Managing Customer Value: Creating Quality and Service that Customers can
See, The Free Press, New York, NY.
Gremler, D.D. and Gwinner, K.P. (2000), “Customer-employee rapport in service relationships”,
Journal of Service Research, Vol. 3 No. 1, pp. 82-104.
Hall, S.M. and Brannick, M.T. (2002), “Comparison of two random-effects methods of
meta-analysis”, Journal of Applied Psychology, Vol. 87 No. 2, pp. 377-89.
Herk, H.V., Poortinga, Y.H. and Verhallen, T.M.M. (2005), “Equivalence of survey data: relevance
for international marketing”, European Journal of Marketing, Vol. 39 Nos 3/4, pp. 351-64.
Hofstede, G. (1997), Cultures and Organizations: Software of the Mind, McGraw-Hill, Berkshire.
Hudson, S., Hudson, P. and Miller, G.A. (2004), “The measurement of service quality in the tour
operating sector: a methodological comparison”, Journal of Travel Research, Vol. 42 No. 3,
pp. 305-12.
Hunter, J.E. and Schmidt, F.L. (2004), Methods of Meta-Analysis: Correcting Error and Bias in
Research Findings, 2nd ed., Sage, Thousand Oaks, CA.
Jabnoun, N. and Al-Tamimi, H.A.H. (2003), “Measuring perceived service quality at UAE
commercial banks”, The International Journal of Quality & Reliability Management, Vol. 20
Nos 4/5, pp. 458-72.
Jain, S.K. and Gupta, G. (2004), “Measuring service quality: SERVQUAL vs SERVPERF scales”,
The Journal for Decision Makers, Vol. 29 No. 2, pp. 25-37.
Jaramillo, F., Carrillat, F.A. and Locander, W.B. (2005), “A meta-analytic comparison of
managerial ratings and self-evaluations”, Journal of Personal Selling & Sales Management,
Vol. 25 No. 4, pp. 315-29.
Javalgi, R.R.G., Martin, C.L. and Young, R.B. (2006), “Marketing research, market orientation and
customer relationship management: a framework and implications for service providers”,
Journal of Services Marketing, Vol. 20 No. 1, pp. 12-23.
Kettinger, W.J. and Lee, C.C. (1997), “Pragmatic perspectives on the measurement of information
systems service quality”, MIS Quarterly, Vol. 21 No. 2, pp. 223-41.
Lam, S.S.K. (1995), “Measuring service quality: an empirical analysis in Hong Kong”,
International Journal of Management, Vol. 12 No. 2, pp. 182-8.
Lam, S.S.K. (1997), “SERVQUAL: a tool for measuring patients’ opinions of hospital service
quality in Hong Kong”, Total Quality Management, Vol. 8 No. 4, pp. 152-4.
Laroche, M., Ueltschy, L.C., Abe, S., Cleveland, M. and Yannopoulos, P.P. (2004), “Service quality
perceptions and customer satisfaction: evaluating the role of culture”, Journal of
International Marketing, Vol. 12 No. 3, pp. 58-85.
Lee, T., Lee, Y. and Yoo, D. (2000), “The determinants of perceived service quality and its
relationship with satisfaction”, The Journal of Services Marketing, Vol. 14 No. 3, pp. 217-31.
Lipsey, M.W. and Wilson, D.B. (2001), Practical Meta-Analysis, Sage, Thousand Oaks, CA.
Lovelock, C.H. (1983), “Classifying services to gain strategic marketing insights”, Journal of
Marketing, Vol. 47 No. 3, pp. 9-21.
Lovelock, C.H. and Gummesson, E. (2004), “Whither services marketing? In search of a new
paradigm and fresh perspectives”, Journal of Service Research, Vol. 7 No. 1, pp. 20-41.
Martin, C.L. (1999), “The history, evolution and principles of services marketing: poised for the
new millennium”, Marketing Intelligence & Planning, Vol. 17 No. 7, pp. 324-8.
Mattila, A.S. (1999), “The role of culture in the service evaluation process”, Journal of Service
Research, Vol. 1 No. 3, pp. 250-61.
Meuter, M.L., Ostrom, A.L., Roundtree, R.I. and Bitner, M.J. (2000), “Self-service technologies:
understanding customer satisfaction with technology-based service encounters”, Journal
of Marketing, Vol. 64 No. 3, pp. 50-64.
Mehta, S.C., Ashok, K.L. and Han, S.L. (2000), “Service quality in retailing: relative efficiency of
alternative measurement scales for different product-service environments”, International
Journal of Retail & Distribution Management, Vol. 28 No. 2, pp. 62-72.
Mittal, B. and Lassar, W.M. (1996), “The role of personalization in service encounters”, Journal of
Retailing, Vol. 72 No. 1, pp. 95-110.
Mukherje, A. and Nath, P. (2005), “An empirical assessment of comparative approach to service
quality measurement”, Journal of Services Marketing, Vol. 19 No. 3, pp. 174-84.
OECD (2005), available at: http://ocde.P4.Siteinternet.Com/publications/doifiles/012005061t009.Xls
Oliver, R.L. and DeSarbo, W.S. (1988), “Response determinants in satisfaction judgments”,
Journal of Consumer Research, Vol. 14 No. 4, pp. 495-507.
Parasuraman, A., Zeithaml, V.A. and Berry, L.L. (1985), “A conceptual model of service
quality and its implications for future research”, Journal of Marketing, Vol. 49 No. 4,
pp. 41-50.
Parasuraman, A., Zeithaml, V.A. and Berry, L.L. (1988), “SERVQUAL: a multiple-item scale for
measuring consumer perception of service quality”, Journal of Retailing, Vol. 64 No. 1,
pp. 12-40.
Parasuraman, A., Zeithaml, V.A. and Berry, L.L. (1991), “Refinement and reassessment of the
SERVQUAL scale”, Journal of Retailing, Vol. 67 No. 4, pp. 420-51.
Parasuraman, A., Zeithaml, V.A. and Berry, L.L. (1994), “Reassessment of expectations as a
comparison standard in measuring service quality: implications for further research”,
Journal of Marketing, Vol. 58 No. 1, pp. 111-24.
Pariseau, S.E. and McDaniel, J.R. (1997), “Assessing service quality in schools of business”,
International Journal of Quality & Reliability Management, Vol. 14 No. 3, pp. 204-18.
Quester, P.G. and Romaniuk, S. (1997), “Service quality in the Australian advertising industry:
a methodological study”, Journal of Services Marketing, Vol. 11 No. 3, pp. 180-92.
Rossiter, J.R. (2002), “The C-OAR-SE procedure for scale development in marketing”,
International Journal of Research in Marketing, Vol. 19 No. 4, pp. 305-35.
Schenker, N. and Gentleman, J.F. (2001), “On judging the significance of differences by
examining the overlap between confidence intervals”, The American Statistician, Vol. 55
No. 3, pp. 182-6.
Silvestro, R., Fitzgerald, L. and Johnston, R. (1992), “Towards a classification of
services processes”, International Journal of Services Industry Management, Vol. 3 No. 2,
pp. 62-75.
Smith, A.M. (1999), “Some problems when adopting Churchill’s paradigm for the development of
service quality measurement scales”, Journal of Business Research, Vol. 46 No. 2,
pp. 109-20.
Steenkamp, J-B.E.M. and Baumgartner, H. (1998), “Assessing measurement invariance in
cross-national consumer research”, Journal of Consumer Research, Vol. 25 No. 1, pp. 78-90.
Sultan, F., Merlin, C. and Simpson, J. (2000), “International service variants: airline passenger
expectations and perceptions of service quality”, Journal of Services Marketing, Vol. 14
No. 3, pp. 188-96.
Teas, K.R. (1993), “Expectations, performance evaluation, and consumers’ perception of quality”,
Journal of Marketing, Vol. 57 No. 4, pp. 18-34.
Triandis, H.C. (1995), Individualism and Collectivism, Westview Press, Boulder, CO.
Wal, van der R.W.E., Pampallis, A. and Bond, C. (2002), “Service quality in a cellular
telecommunications company: a South African experience”, Managing Service Quality,
Vol. 12 No. 5, pp. 323-35.
White, S.S. and Schneider, B. (2000), “Climbing the commitment ladder: the role of expectations
disconfirmation on customers’ behavioral intentions”, Journal of Service Research, Vol. 2
No. 3, pp. 240-53.
Witkowski, T.H. and Wolfinbarger, M.F. (2002), “Comparative service quality: German and
American ratings across service settings”, Journal of Business Research, Vol. 55 No. 11,
pp. 875-81.
Zeithaml, V.A. (2000), “Service quality, profitability, and the economic worth of customers: what
we know and what we need to learn”, Journal of the Academy of Marketing Science, Vol. 28,
pp. 67-85.
Zeithaml, V.A. and Bitner, M.J. (2003), Services Marketing: Integrating Customer Focus across the
Firm, 3rd ed., Irwin McGraw-Hill, Boston, MA.
Zhou, L. (2004), “A dimension-specific analysis of performance-only measurement of service
quality and satisfaction in China’s retail banking”, The Journal of Services Marketing,
Vol. 18 Nos 6/7, pp. 534-46.
Corresponding author
François A. Carrillat can be contacted at: francois.carrillat@hec.ca
... ServQual has been proven to be a valid and reliable tool across different fields [18,20,21]. A meta-analysis of studies that assessed the strength of the relationship between overall service quality (OSQ) and ServQual has shown that their overall relationship is greater than 0.50 (r=0.58; ...
... CI90%=0.50−0.66) proving that ServQual is a valid measure of service quality [21]. Multiple studies have demonstrated the internal consistency of ServQual with reliability coefficients of the different dimensions consistently higher than 0.7 [18,22]. ...
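For readers unfamiliar with how such pooled correlations and confidence intervals are obtained, the following is a minimal Python sketch of one common approach: transforming study-level correlations with Fisher's z, applying inverse-variance weights, and back-transforming a 90 per cent interval. The correlations and sample sizes below are hypothetical, not the effect sizes from the meta-analysis, and the fixed-effect weighting is a simplification of the procedures typically used in published meta-analytic work.

```python
import numpy as np
from scipy import stats

# Hypothetical study-level correlations between a service quality scale
# and overall service quality (OSQ), with their sample sizes.
r = np.array([0.52, 0.61, 0.58, 0.47, 0.66])
n = np.array([210, 145, 320, 180, 95])

# Fisher's z transformation stabilizes the sampling variance of r.
z = np.arctanh(r)
var_z = 1.0 / (n - 3)      # approximate sampling variance of z
w = 1.0 / var_z            # inverse-variance (fixed-effect) weights

z_bar = np.sum(w * z) / np.sum(w)
se = np.sqrt(1.0 / np.sum(w))

# 90% confidence interval on the z scale, back-transformed to r.
z_crit = stats.norm.ppf(0.95)
lo, hi = np.tanh(z_bar - z_crit * se), np.tanh(z_bar + z_crit * se)
print(f"pooled r = {np.tanh(z_bar):.2f}, 90% CI = [{lo:.2f}, {hi:.2f}]")
```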
Article
Background: Teaching clinics provide low-cost health programs while offering valuable learning opportunities for student clinicians, which in turn contributes to increasing health care accessibility. To date, there is a paucity of literature exploring the satisfaction of patients seen in rehabilitation teaching clinics in developing countries. The Service Quality (ServQual) Scale is a valid and reliable tool that has been used to measure client satisfaction in different work settings and industries. Objectives: The aim of this study was to demonstrate the usefulness of ServQual in measuring the satisfaction of clients in a rehabilitation teaching clinic in a developing country. Methodology: A cross-sectional survey was conducted over three months among Clinic for Therapy Services - Adult and Adolescent Section (CTS-AA) clients who were at least 18 years old, had attended at least three sessions, and could read. Prior to administration in CTS-AA, the ServQual scale was translated into Filipino, validated and pilot tested for reliability. Results: Thirty-two respondents were included in the analysis. There was no statistically significant difference between the expectations and the perceptions of the clients for the domains of reliability (z=1.799, p=0.0721), responsiveness (z=0.839, p=0.4013), assurance (z=1.914, p=0.0556) and empathy (z=1.772, p=0.0764). However, there was a statistically significant difference between the clients' perception and expectation for tangibles (z=4.117, p<0.0001) and between the overall client perception and expectation (z=4.086, p<0.0001). The overall ServQual score for CTS-AA is -0.3782. Conclusion: The ServQual has been shown to be useful in assessing the satisfaction of clients in rehabilitation clinics and the specific areas that need improvement. The tool could still be improved by including items on cost, the relationship of students with supervisors, and treatment outcomes.
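As a rough illustration of the gap-score logic this abstract relies on, the sketch below computes a SERVQUAL gap (perception minus expectation) for one dimension and runs a Wilcoxon signed-rank test on the paired ratings. The ratings are simulated and the dimension label is hypothetical; this is not the CTS-AA data or analysis.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
n_clients = 32

# Hypothetical 7-point ratings for one SERVQUAL dimension (e.g. "tangibles"):
# paired expectation (E) and perception (P) scores from the same respondents.
expectation = rng.integers(4, 8, size=n_clients).astype(float)
perception = rng.integers(3, 8, size=n_clients).astype(float)

# SERVQUAL gap score: perception minus expectation (negative = the service
# falls short of what clients expected).
gap = perception - expectation
print(f"mean gap score = {gap.mean():.4f}")

# Wilcoxon signed-rank test on the paired ratings (the abstract reports
# normal-approximation z values; SciPy returns the signed-rank statistic
# and the p-value).
stat, p_value = wilcoxon(perception, expectation)
print(f"Wilcoxon statistic = {stat:.1f}, p = {p_value:.4f}")
```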
... The paper responds to calls to adapt service quality models to the healthcare context and to use data from both public and private settings outside the United States. [1,2] Indeed, the healthcare management literature points to the lack of (and compelling need for) research on perceived service quality and performance in developing and emerging economies, and to the need for well-validated measurement models for evaluating service quality; most existing models of healthcare service quality are of Western origin. [3] Data on patient satisfaction in CEECs are sparse and often unexploited. ...
Article
Objective: Recent increases in per capita income and longevity in Central and Eastern European countries (CEECs), alongside a slow-changing Soviet-era public healthcare system, have led to the emergence of private hospitals. This paper investigates the differential patient service quality perceptions for private versus public hospitals, as well as for three types of healthcare services: primary, ambulatory, and inpatient care. Methods: Data from 1,673 patients of private and public hospitals in the capital of Romania were collected in face-to-face interviews. Analysis of covariance and partial-least-squares techniques were used to examine the relationships between perceived service quality, hospital ownership status and the type of health service patients received. Results: Over 70% of women prefer private health facilities to public hospitals (compared to less than 30% of men). While private hospitals rank higher than public hospitals on most attributes, the interaction effect of gender and hospital type reveals that assurance and empathy are the only significant attributes in driving women to private hospitals. Both tangible dimensions of service quality (physical facilities and staff appearance) and intangible dimensions (assurance, responsiveness, reliability, and empathy) have a positive impact on perceived overall service quality of healthcare. Improvements in perceptions of hospitals' tangibles and of staff responsiveness and empathy have the greatest potential to enhance perceived overall service quality. Conclusions: This paper demonstrates the importance of breaking down health services into various sub-categories, both in terms of perceived healthcare attributes and in terms of tangible healthcare facilities, such as public and private hospitals.
... However, the link between service quality and consumer expectation is difficult to establish. It has been shown that SERVQUAL and SERVPERF are adequate and valid scales for evaluating service quality (Carrillat et al., 2007). ...
Article
Most theories are tested in developed countries; moreover, there is still a lack of research on SERVPERF, especially in low-cost hotels. Hence, given the high competition among low-cost hotels in Nigeria, many low-cost hotels are seeking alternative ways of competing apart from pricing or operational costs. Thus, the aim of this study is to identify the effects of SERVPERF dimensions on guests’ satisfaction and loyalty in low-cost hotels, as well as the mediating role of guests’ satisfaction in the relationships between SERVPERF dimensions and guests’ loyalty. Data were collected from 300 guests at low-cost hotels and were analyzed using structural equation modeling (SEM). Composite reliability, Cronbach’s alpha and average variance extracted were used to test the reliability and validity of the instrument. Results revealed that empathy, responsiveness and reliability influence guests’ satisfaction. Furthermore, empathy, reliability and satisfaction influence guests’ loyalty. Results on mediation revealed that guests’ satisfaction partially mediates the relationships between empathy as well as reliability and guests’ loyalty, while guests’ satisfaction fully mediates the path between responsiveness and guests’ loyalty. This study recommends that low-cost hotel management emphasize empathy and reliability to increase guest loyalty.
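The mediation results above were obtained with structural equation modeling; as a simplified, hypothetical illustration of the same mediation logic (SERVPERF dimension → satisfaction → loyalty), the sketch below estimates an indirect effect with ordinary least squares on simulated data. Variable names, coefficients, and data are invented for the example and do not come from the study.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 300

# Simulated standardized scores: empathy -> satisfaction -> loyalty,
# plus a direct empathy -> loyalty path (i.e. partial mediation).
empathy = rng.normal(size=n)
satisfaction = 0.5 * empathy + rng.normal(scale=0.8, size=n)
loyalty = 0.3 * empathy + 0.4 * satisfaction + rng.normal(scale=0.8, size=n)

# Path a: predictor -> mediator.
a_fit = sm.OLS(satisfaction, sm.add_constant(empathy)).fit()
a = a_fit.params[1]

# Paths b and c': mediator and predictor -> outcome, estimated jointly.
X = sm.add_constant(np.column_stack([empathy, satisfaction]))
out_fit = sm.OLS(loyalty, X).fit()
c_prime, b = out_fit.params[1], out_fit.params[2]

print(f"indirect effect (a*b) = {a * b:.3f}")    # mediated portion
print(f"direct effect (c')    = {c_prime:.3f}")  # remaining direct path
```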
Article
Full-text available
This meta-analysis compiles data from 30 studies on the quality of public transport services conducted in different countries. The studies collectively investigate different forms of public transportation, such as buses, paratransit, and rail services, using a range of methodological approaches. The recurring themes in service quality are key dimensions such as reliability, responsiveness, assurance, empathy, and tangibles. The analysis demonstrates that these dimensions have a significant impact on user satisfaction, perceived value, and behavioural intentions, underscoring the widespread relevance of service quality measures such as SERVQUAL and SERVPERF. Furthermore, the results highlight differences in how service quality is perceived across regions and the pressing need for targeted policy interventions to improve public transportation systems worldwide.
Article
Full-text available
The services provided by a library need to be managed over time with the multidimensional nature of service quality in mind. This longitudinal study investigates the impact of interventions carried out between 2013 and 2017 at a university library on the dimensions of service quality. The methodology comprises the application of the SERVQUAL instrument at a public university library in 2013 (T1), n=355 users, and in 2017 (T2), n=184 users; in 2017 a case study was also conducted to document the interventions carried out, which were classified into four categories: Infrastructure, Equipment, Processes/Systems, and People. The results show that the gaps for all service quality dimensions were negative in both periods evaluated. It is concluded that the interventions reduced the negative gaps in the service quality dimensions of the library under study.
Article
Full-text available
The purpose of this study was to analyze whether the determinants of quality act as predictors of the quality perceived by users of remote pedagogical activities (Atividades Pedagógicas Não Presenciais) in the Production Engineering program at IFES/Campus Cariacica. To this end, an adapted survey was used to collect data from a sample of students. Once collected and tabulated, the data were analyzed with Partial Least Squares Structural Equation Modeling (PLS-SEM) in order to test the validity of twenty hypotheses. As a result, ten hypotheses were not rejected, showing that the quality determinants Empathy and Assurance in both the asynchronous and synchronous modes, as well as Tangibility in the asynchronous mode, act as predictors of the quality perceived in the remote pedagogical activities. The stated limitations of the work are that it was applied to only a single program on one campus and that a non-probability sampling technique was used. For future work, the authors suggest using a probability sampling technique and expanding the sample to additional programs and other IF campuses.
Article
The purpose of this study is to evaluate the level of customer satisfaction with loan services in the banking industry. The study uses the service quality of loan facilities at Barclays Bank Ghana Limited as a case study. Through a stratified random sampling method, seventy (70) respondents of the bank's branch in Nkawkaw in the Eastern Region of Ghana were selected to participate in the study. Questionnaires and interviews were used to measure the respondents' level of satisfaction with the bank's loan services and other service quality dimensions. The data gathered were analysed using SPSS (Version 16.0) and presented by the use of statistical models such as graphs, charts, and tables. It was observed that the customers were generally satisfied with the loan services when assessed on 12 service quality dimensions. Interest rates, account charges, company reputation and past experience with the bank were found to be important factors in attracting the customers. Existing customers are also willing to recommend the bank to their reference group. These factors are useful for strategic decisions.
Article
Full-text available
The sharing economy has emerged as an influential concept, shaping the way business is transacted. Of particular interest is the transport industry, where the combination of GPS and software development has resulted in the creation of e-hailing transportation systems. Issues such as the provision of quality services persist, giving rise to further investigation. The study therefore delved into service quality, satisfaction and loyalty among users of e-hailing transport services. Using a cross-sectional descriptive survey, a convenience sample was employed to intercept 436 users of e-hailing vehicles at pick-up and drop-off points in Accra, Ghana. A structural equation model was used to test the relationships between service quality, satisfaction and loyalty among users of e-hailing vehicles. Results show that assurance (β = 0.149, p = .022), tangibles (β = 0.140, p = .045), responsiveness (β = 0.040, p = .008), reliability (β = 0.014, p = .015), empathy (β = 0.062, p = .000), system information (β = 0.013, p = .005) and price (β = 0.001, p = .008) significantly influence user satisfaction. Similarly, customer satisfaction was found to influence the loyalty of users of e-hailing vehicles. The study concludes that both individual service quality dimensions and the composite service quality model influence users’ satisfaction, which in turn influences users’ willingness to re-use and recommend e-hailing services to others. Implications for theory and practice are discussed.
Article
Purpose The purpose of this manuscript is to provide a step-by-step primer on systematic and meta-analytic reviews across the service field, to systematically analyze the quality of meta-analytic reporting in the service domain, to provide detailed protocols authors may follow when conducting and reporting these analyses and to offer recommendations for future service meta-analyses. Design/methodology/approach Eligible frontline service-related meta-analyses published through May 2021 were identified for inclusion (k = 33) through a systematic search of Academic Search Complete, PsycINFO, Business Source Complete, Web of Science, Google Scholar and specific service journals using search terms related to service and meta-analyses. Findings An analysis of the existing meta-analyses within the service field revealed that, while these often provide high-quality results, the quality of their reporting can be improved in several ways to enhance the replicability of published meta-analyses in the service domain. Practical implications This research employs a question-and-answer approach to provide a substantive guide for both properly conducting and properly reporting high-quality meta-analytic research in the service field for scholars at various levels of experience. Originality/value This work aggregates best practices from diverse disciplines to create a comprehensive checklist of protocols for conducting and reporting high-quality service meta-analyses while providing additional resources for further exploration.
Article
Full-text available
The authors respond to concerns raised by Cronin and Taylor (1992) and Teas (1993) about the SERVQUAL instrument and the perceptions-minus-expectations specification invoked by it to operationalize service quality. After demonstrating that the validity and alleged severity of many of those concerns are questionable, they offer a set of research directions for addressing unresolved issues and adding to the understanding of service quality assessment.
Article
Full-text available
The attainment of quality in products and services has become a pivotal concern of the 1980s. While quality in tangible goods has been described and measured by marketers, quality in services is largely undefined and unresearched. The authors attempt to rectify this situation by reporting the insights obtained in an extensive exploratory investigation of quality in four service businesses and by developing a model of service quality. Propositions and recommendations to stimulate future research about service quality are offered.
Article
The author examines conceptual and operational issues associated with the “perceptions-minus-expectations” (P-E) perceived service quality model. The examination indicates that the P-E framework is of questionable validity because of a number of conceptual and definitional problems involving the (1) conceptual definition of expectations, (2) theoretical justification of the expectations component of the P-E framework, and (3) measurement validity of the expectation (E) and revised expectation (E*) measures specified in the published service quality literature. Consequently, alternative perceived quality models that address the problems of the traditional framework are developed and empirically tested.
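To make the specification under discussion concrete, a generic formulation of the two scoring rules (not the exact item set or weighting of any particular study) can be written as:

\[
\mathrm{SQ}_{\mathrm{SERVQUAL}} = \frac{1}{k}\sum_{i=1}^{k}\bigl(P_i - E_i\bigr),
\qquad
\mathrm{SQ}_{\mathrm{SERVPERF}} = \frac{1}{k}\sum_{i=1}^{k} P_i ,
\]

where $P_i$ is the perceived performance rating and $E_i$ the expectation rating on item $i$ of a $k$-item scale; the critiques summarized above concern the definition and measurement of the $E_i$ component.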
Article
A typology of service organizations is presented and a conceptual framework is advanced for exploring the impact of physical surroundings on the behaviors of both customers and employees. The ability of the physical surroundings to facilitate achievement of organizational as well as marketing goals is explored. Literature from diverse disciplines provides theoretical grounding for the framework, which serves as a base for focused propositions. By examining the multiple strategic roles that physical surroundings can exert in service organizations, the author highlights key managerial and research implications.