
Analysing collaborative trials for qualitative microbiological methods: Accordance and concordance

S.D. Langton a,*, R. Chevennement b, N. Nagelkerke c, B. Lombard d
a DEFRA Central Science Laboratory, Sand Hutton, York, YO41 1LZ, UK
b Ecole Nationale d’Industrie Laitière et des Biotechnologies (ENILBIO), Poligny, BP: 49, 39800, Poligny, France
c National Institute of Public Health and the Environment (RIVM), P.O. Box 1, 3720 BA Bilthoven, The Netherlands
d Agence Française de Sécurité Sanitaire des Aliments, Laboratoire d’Etudes et de Recherches en Hygiène et Qualité des Aliments, 41 Rue du 11 Novembre 1918, F-94 700 Maisons-Alfort, France
Received 5 March 2001; received in revised form 25 January 2002; accepted 26 February 2002
Abstract
In qualitative (detection) food microbiology, the usual measures of repeatability and reproducibility are inapplicable. For
such studies, we introduce two new measures: accordance for within laboratory agreement and concordance for between
laboratory agreement, and discuss their properties. These measures are based on the probability of finding the same test results
for identical test materials within and between laboratories, respectively. The concordance odds ratio is introduced to present
their relationship. A method to test whether accordance differs from concordance is discussed. © 2002 Elsevier Science B.V.
All rights reserved.
Keywords: Collaborative trials; Qualitative microbiological methods; Accordance; Concordance
International Journal of Food Microbiology 79 (2002) 175–181
www.elsevier.com/locate/ijfoodmicro
0168-1605/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved.
PII: S0168-1605(02)00107-1
* Corresponding author. Tel.: +44-1904-455100; fax: +44-1904-455254.
E-mail address: s.langton@csl.gov.uk (S.D. Langton).
1. Introduction
The concepts of repeatability (r) and reproducibility
(R) (Anonymous, 1994; IDF, 1991) are widely used in
the analysis of data from collaborative trials in quantitative
microbiology (see for example, Szita et al., 1998;
Schulten et al., 2000). Repeatability provides a measure
of the variability between analyses conducted on
identical test materials by the same technician in the
same laboratory, under conditions as similar as possible
(e.g. by using the same apparatus and reagents within
the shortest possible interval of time), whilst the
reproducibility measures the variability when the analyses
are conducted by different technicians at different
laboratories. The concepts were originally borrowed
from the literature relating to chemical analysis, but
they are equally applicable to microbiological data,
provided the counts are transformed to logarithms
before analysis. Formally, there is a 95% probability
that the absolute difference between two test results
should be less than r and R under repeatability and
reproducibility conditions, respectively.
These measures of repeatability and reproducibility
do not measure departure from the “true value”, i.e.
accuracy. They only indicate to what extent results can
be repeated or reproduced, either correctly or not. In
microbiology, where the objective is to count the
number of colony forming units (cfu), accuracy may
depend on many factors, such as the (food) matrix, the
strain of organism or the presence and type of compet-
ing flora. A particular quantification method may
perform differently under different circumstances.
However, it should not perform randomly. Random
behaviour indicates a lack of ‘‘control’’, e.g. due to
ambiguities in the definition of the method. It is this
random behaviour that r and R are designed to measure.
In addition, (substantial) differences between r and
R may give clues about the origin of randomness. From
the wide use that is being made of these concepts, one
may infer that they must be useful and aid micro-
biologists in decision making and in improving and
standardizing their procedures. The increasing need to
standardize microbiological methods (Lahellec, 1998;
Leclercq et al., 2000) provides an additional argument
for the development and use of (statistical) methods
and the organization of collaborative trials (e.g. Scotter
et al., 2001; Schulten et al., 1998; De Buyser and
Lombard, 2000) that promote further standardization.
A major limitation of repeatability and reproduci-
bility is that they are only applicable to quantitative
data and cannot be applied to qualitative analyses, i.e.
detection methods. This is because both parameters are
expressed in terms of the likely difference in log units
between two test results; clearly this can have no
meaning for qualitative data where two results are
either the same or are different. In this paper, we
describe two new measures, accordance and concord-
ance, which can be regarded as analogous quantities to
repeatability and reproducibility for qualitative data.
These new parameters have been developed as part
of European project SMT 4-CT 96-2098 funded by
the European Commission, which aimed to validate the
six main reference methods (ISO/CEN Standards) in
food microbiology by means of interlaboratory trials.
In particular, they have been applied to the validation
of the reference method EN ISO 11290-1 for the
detection of Listeria monocytogenes in foods (Scotter
et al., 2001).
2. Example data
The methods will be illustrated by means of the
data in Table 1, which are taken from a collaborative
study on L. monocytogenes. These data are just for
one level of one food type using one particular
medium. In practice, a collaborative trial would usu-
ally aim to involve at least 20 laboratories (in this
case, there was a similar number using a different
medium), but a smaller data set makes it easier to
explain the calculations.
3. Statistics for qualitative data
3.1. Accuracy
In a collaborative trial with quantitative data,
whenever possible, results will be compared with
the “true” contents of the sample(s) in order to
demonstrate the accuracy of the method. The equivalent
statistics for qualitative data are straightforward.
For positive samples, the statistic is the sensitivity:
the percentage of samples correctly identified as
positive. In the example this is 46 (eight laboratories
with five correct plus two laboratories with three
correct) out of 50, or 92.0%. For the purposes of this
calculation it must be assumed that all supposedly
positive samples do in fact contain the organism; in
some cases with very low levels, this may not be the
case in reality. As the sensitivity can depend on
circumstances such as the food matrix, a reported
sensitivity only applies to the set of circumstances
under which it was measured.
For blank samples, the percentage of samples
correctly identified as being negative is recorded; this
is known as the specificity.
Table 1
Example data taken from a collaborative study on L. monocytogenes
Laboratory   Replicate number    Number positive
             1  2  3  4  5       (out of 5)
1            +  +  +  +  +       5
2            +  +  +  +  +       5
3            +  +  +  +  +       5
4            +  +  +  +  +       5
5            -  -  +  +  +       3
6            +  +  +  +  +       5
7            -  -  +  +  +       3
8            +  +  +  +  +       5
9            +  +  +  +  +       5
10           +  +  +  +  +       5
Approximate standard errors and confidence limits
for these figures can be based on standard methods for
binomial (positive/negative) data (Armitage and Berry,
1987). For example, using their methods to calculate
an exact 95% confidence interval for the sensitivity,
we obtain 80.8–97.8%. However, such figures
may be misleading if there is any heterogeneity
among laboratories and should not be used where this
is clearly the case. Also, it is assumed that all samples
are in fact positive and that negative results are
therefore incorrect.
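The sensitivity and its exact interval for the example data can be checked with a short script. This is a minimal sketch, assuming that the "exact" interval is the Clopper-Pearson interval, found here by bisection on the binomial tail probability; the function names are illustrative and not taken from the paper.

```python
from math import comb

def binom_tail_ge(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def solve_increasing(f, target, iters=60):
    """Bisection on [0, 1] for f(p) = target, where f increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes out of n (assumed method)."""
    # Lower limit solves P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: binom_tail_ge(k, n, p), alpha / 2)
    # Upper limit solves P(X <= k | p) = alpha / 2,
    # i.e. P(X >= k + 1 | p) = 1 - alpha / 2.
    upper = 1.0 if k == n else solve_increasing(
        lambda p: binom_tail_ge(k + 1, n, p), 1 - alpha / 2)
    return lower, upper

# Number of positives per laboratory (Table 1), five replicates each.
positives = [5, 5, 5, 5, 3, 5, 3, 5, 5, 5]
n_per_lab = 5

k = sum(positives)
n = n_per_lab * len(positives)
sensitivity = k / n
lower, upper = exact_ci(k, n)
print(f"sensitivity = {sensitivity:.1%}, 95% CI {lower:.1%} to {upper:.1%}")
```

As noted in the text, the interval treats all 50 results as independent binomial trials, so it should be read with caution when laboratories differ.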
3.2. Accordance
We will define the qualitative equivalent of repeat-
ability as accordance; this is the (percentage) chance
that two identical test materials analysed by the same
laboratory under standard repeatability conditions will
both be given the same result (i.e. both found positive
or both found negative). To calculate the accordance,
we take each laboratory in turn and calculate the
probability that two samples will give the same result
and then average this probability over all laboratories.
For those laboratories (such as laboratory 1) where all
samples were found positive, the best estimate of the
probability of getting the same result is clearly 1.00 or
100%; all 10 out of 10 possible pairs are both positive.
For the others (in the example, laboratories 5 and 7)
we must count the number of pairs that are either both
negative (1 pair) or both positive (3 pairs). Thus a
total of 4 out of 10 pairs (40%) give the same result. In
general, when a laboratory has n results and k of these
are positive, then the accordance for that laboratory is
estimated as
$$\text{accordance for laboratory} = \frac{k(k-1) + (n-k)(n-k-1)}{n(n-1)}$$
The accordance for the collaborative trial as a
whole is the average (mean) of these probabilities
for each laboratory, which is 88% in this case, as can
be seen from Table 2. The calculation of the standard
errors and confidence intervals is discussed in Appen-
dix A.
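The accordance calculation can be written out in a few lines. The sketch below applies the per-laboratory formula to the counts in Table 1 and averages the results; the variable and function names are illustrative choices, not part of the original method description.

```python
def lab_accordance(k, n):
    """Proportion of within-laboratory pairs giving the same result,
    for a laboratory with k positives out of n replicates."""
    same_pairs = k * (k - 1) + (n - k) * (n - k - 1)
    return same_pairs / (n * (n - 1))

# Number of positives per laboratory (Table 1), five replicates each.
positives = [5, 5, 5, 5, 3, 5, 3, 5, 5, 5]
n = 5

per_lab = [lab_accordance(k, n) for k in positives]
accordance = sum(per_lab) / len(per_lab)

for lab, value in enumerate(per_lab, start=1):
    print(f"laboratory {lab}: {value:.0%}")
print(f"accordance = {accordance:.0%}")
```

Laboratories 5 and 7 each contribute 40% and the others 100%, which gives the mean of 88% reported for the trial.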
3.3. Concordance
The equivalent of reproducibility is concordance,
which is the percentage chance that two identical test
materials sent to different laboratories will both be
given the same result (i.e. both found positive or both
found negative). The most intuitive way to
calculate concordance is simply to enumerate all
possible between laboratory pairings in the data.
To enumerate all pairings, each observation in each
laboratory is considered in turn, starting with the first
observation of laboratory 1, which is positive. This
can be paired with any of the 45 observations from
other laboratories, and all but four of these pairings
(those with samples 1 and 2 of laboratories 5 and 7)
match (i.e. give a pair with both positive); there are
thus 41 pairs giving the same result. Similarly, the
other four samples from laboratory 1 also each have
41 out of 45 pairings giving the same result, and so
there are a total of 225 (5 × 45) between laboratory
pairings involving laboratory 1, of which 205
(5 × 41) give the same result. The same applies to
all other laboratories with all samples positive. For
laboratory 5, with three out of five positives, the two
negative samples each match with just two other
negative samples at other laboratories, whilst the three
positive samples each match with 43 positive samples.
Thus the total number of pairs with the same result for
laboratory 5 (and also laboratory 7) is 133
(2 × 2 + 3 × 43). These calculations are summarised
in Table 3.
The concordance is the percentage of all pairings
giving the same result; in this example this is 84.7%
(1906/2250 × 100). (The alert reader will have
noticed that there are actually only 1125 pairings as
we have double counted them all by this procedure;
this does not affect the results as it applies to all pairs
and is easier than allowing for it during the calculations.)
Table 2
Calculating the accordance for the example data; the accordance is the average of the probabilities in the final column
Laboratory  Number    Positive  Negative  Total pairs  Proportion
            positive  pairs     pairs     the same     the same
1           5         10        0         10           100%
2           5         10        0         10           100%
3           5         10        0         10           100%
4           5         10        0         10           100%
5           3         3         1         4            40%
6           5         10        0         10           100%
7           3         3         1         4            40%
8           5         10        0         10           100%
9           5         10        0         10           100%
10          5         10        0         10           100%
                                          Total = 88   Mean = 88%
An alternative way to calculate concordance is to
first calculate all pairs with the same results from all
laboratories irrespective of whether the result is from the
same laboratory or not. As there are four negative results
and 46 positive results, we have 1041 (1035 + 6)
pairs with the same result (note that the number of pairings
possible between n items is given by n(n − 1)/2,
for example for the 46 positives 46 × 45/2 = 1035).
From the calculation of accordance, we know that 88
are from the same laboratory (Table 2), and so 953
(1041 − 88) pairs with the same result are from different
laboratories (counting each pair only once). Similarly,
it is easily found that there are a total of 1125
(1225 − 100) pairs from different laboratories. Again,
we find 84.7% as the value of concordance.¹ The
calculation of the standard errors and confidence
intervals is discussed in Appendix A, and an Excel
spreadsheet to perform these calculations is available
by email from the authors.
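The two routes to the concordance can be compared numerically. The sketch below enumerates every between-laboratory pair once, and then evaluates the closed-form expression given in the footnote (r positives in total, L laboratories, n replicates, accordance A as a proportion). The function names and the boolean encoding of results are assumptions made for illustration.

```python
from itertools import combinations

def concordance_by_enumeration(results):
    """Count between-laboratory pairs (each pair once) and those that agree.
    `results` is a list of per-laboratory lists of booleans."""
    same = total = 0
    for lab_a, lab_b in combinations(results, 2):
        for x in lab_a:
            for y in lab_b:
                total += 1
                same += (x == y)
    return same, total

def concordance_from_accordance(r, L, n, A):
    """Closed form from the footnote; A is the accordance as a proportion."""
    numerator = 2 * r * (r - n * L) + n * L * (n * L - 1) - A * n * L * (n - 1)
    return numerator / (n**2 * L * (L - 1))

# Number of positives per laboratory (Table 1), five replicates each.
positives = [5, 5, 5, 5, 3, 5, 3, 5, 5, 5]
n = 5
results = [[True] * k + [False] * (n - k) for k in positives]

same, total = concordance_by_enumeration(results)
enumerated = same / total
closed_form = concordance_from_accordance(sum(positives), len(positives), n, 0.88)

print(f"{same} of {total} between-laboratory pairs agree: {enumerated:.1%}")
print(f"closed form: {closed_form:.1%}")
```

Counting each pair once gives 953 of 1125, the same ratio as the double-counted 1906 of 2250 in Table 3.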
3.4. Measuring and testing between laboratory
variation
With quantitative data, variation between laborato-
ries can be tested for statistical significance and its
magnitude can be estimated by examining the labo-
ratory component of variance or, alternatively, simply
by comparing the repeatability with the reproducibil-
ity. A similar comparison between accordance and
concordance can be used to assess the magnitude of
between laboratory variation; if the concordance is
smaller than the accordance (i.e. samples analysed at
the same laboratory are more likely to give the same
result than ones sent to different laboratories) between
laboratory variation is present. This may be due to, for
example, variations in media quality across laborato-
ries, differences in the likelihood of cross-contamina-
tion or different interpretations of guidelines by
different laboratory technicians.
Unfortunately, the magnitude of the concordance
and accordance is strongly dependent on the sensitiv-
ity, making it difficult to assess easily the degree of
between laboratory variation. It is therefore helpful to
calculate the concordance odds ratio (COR)
$$\mathrm{COR} = \frac{\text{accordance} \times (100 - \text{concordance})}{\text{concordance} \times (100 - \text{accordance})}$$
where concordance and accordance are expressed as
percentages. This ratio is less sensitive to the level of
sensitivity than either accordance, concordance, or a
simple ratio between the two, and therefore provides a
convenient measure of the degree to which results
vary between laboratories. The magnitude of the ratio
can be interpreted as the relative chance (‘odds’ in
betting terms) of getting the same result when two
samples are analysed in the same laboratory compared
with when they are sent to different laboratories. Thus an
odds ratio of 1.3 indicates that the samples are 1.3
times more likely to produce the same result (both
positive or both negative) if they are analysed in the
same laboratory than if they are analysed in different
laboratories. Ideally an odds ratio should be close to
1.0, indicating that results are just as likely to be the
same irrespective of whether the two samples are
analysed at the same or different laboratories. How-
ever, in practice, the variation between laboratories
will generally be larger than within laboratories. The
larger the odds ratio, the greater is the level of inter-laboratory
variation in the data.
¹ Concordance can also be calculated from accordance using the formula
$$\text{(estimated) concordance} = \frac{2r(r - nL) + nL(nL - 1) - A\,nL(n - 1)}{n^{2}L(L - 1)}$$
where r = total number of positives, L = number of laboratories,
n = replications per laboratory, and A = accordance, expressed as a
proportion.
Table 3
Calculating the concordance for the example data
Laboratory  Number    Between-laboratory pairings  Total between-laboratory
            positive  with same results            pairings
1           5         205                          225
2           5         205                          225
3           5         205                          225
4           5         205                          225
5           3         133                          225
6           5         205                          225
7           3         133                          225
8           5         205                          225
9           5         205                          225
10          5         205                          225
Total                 1906                         2250
The COR is infinite when accordance is 100% but
concordance is not (i.e. when every laboratory reports either
all positives or all negatives, but the laboratories do not
all agree), indicating serious differences
between laboratories. When both accordance
and concordance are 100%, there is no variability
whatsoever in the data and the COR can be considered
to equal the ideal value of 1.0.
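The concordance odds ratio, including the two boundary cases just described, can be sketched as a small function. The function name is illustrative, and returning `math.inf` for the infinite case is an implementation choice rather than anything prescribed by the paper.

```python
import math

def concordance_odds_ratio(accordance_pct, concordance_pct):
    """COR from accordance and concordance expressed as percentages."""
    if accordance_pct == 100 and concordance_pct == 100:
        # No variability at all: treated as the ideal value.
        return 1.0
    if accordance_pct == 100:
        # Perfect agreement within laboratories but not between them.
        return math.inf
    return (accordance_pct * (100 - concordance_pct)) / (
        concordance_pct * (100 - accordance_pct))

# Example data: accordance 88%, concordance 1906/2250 (about 84.7%).
cor = concordance_odds_ratio(88.0, 1906 / 2250 * 100)
print(f"COR = {cor:.2f}")
```

For the example data this gives a COR of about 1.32, in line with the value listed for the observed arrangement in Table 4.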
Values of the concordance odds ratio above 1.00
may occur by chance variation, and so a statistical
significance test should be used to confirm whether
the evidence for extra variation between laboratories
is convincing. Such tests are based on the idea that if
accordance equals concordance, then each laboratory
has the same probability of finding a positive test. A
cross tabulation of laboratory by result (number pos-
itive, number negative) should then be homogeneous.
The best test to use is an ‘exact test’, which can be
obtained using statistical packages such as SAS or
StatXact. The philosophy behind such tests is that the
probabilities of occurrence are calculated for all sets
of results that could have produced the overall num-
bers of positives and negatives. In the example, with
46 positives and four negatives the arrangements
shown in Table 4 are possible.
The laboratories are not labelled in this table, as
any permutation would give the same overall result
(e.g. for the right hand column, the 1 could be for any
of the laboratories). The test adds up the probabilities
of all those possible arrangements that show at least as
much evidence for between laboratory variation as the
real result—here that means all permutations of the
three columns on the right. If this probability is less
than the conventional value of 0.05 or 5%, it is
unlikely that this degree of between laboratory varia-
tion could have occurred by chance and hence we
conclude that there is significant variation in performance
between laboratories. In the example above,
P = 0.039, indicating that variation between laboratories
is significant at the 5% level. Except in very
simple examples like this one, specialist software is
needed to calculate exact tests.
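For an example as small as this one, the exact test can be reproduced by direct enumeration. The sketch below is one possible reading of the procedure: it conditions on the total number of negatives, weights each allocation of negatives to laboratories by the number of ways it can arise, and sums the allocations whose mean accordance is at least as large as the observed one. Using accordance as the ordering statistic is an assumption made here (for a fixed total it orders arrangements in the same way as the COR); the function names are illustrative.

```python
from math import comb, prod

def lab_accordance(k, n):
    """Within-laboratory agreement; symmetric in positives and negatives."""
    return (k * (k - 1) + (n - k) * (n - k - 1)) / (n * (n - 1))

def mean_accordance(counts, n):
    return sum(lab_accordance(c, n) for c in counts) / len(counts)

def compositions(total, parts, cap):
    """Yield every tuple of `parts` integers in 0..cap that sums to `total`."""
    if parts == 1:
        if total <= cap:
            yield (total,)
        return
    for first in range(min(total, cap) + 1):
        for rest in compositions(total - first, parts - 1, cap):
            yield (first,) + rest

def exact_test(positives, n):
    """Return (ways at least as extreme as observed, total ways, P-value)."""
    negatives = [n - k for k in positives]
    total_neg = sum(negatives)
    L = len(positives)
    observed = mean_accordance(negatives, n)
    extreme_ways = 0
    for arrangement in compositions(total_neg, L, n):
        # Small tolerance guards against floating-point ties.
        if mean_accordance(arrangement, n) >= observed - 1e-12:
            # Number of ways to choose which samples are negative.
            extreme_ways += prod(comb(n, m) for m in arrangement)
    all_ways = comb(n * L, total_neg)
    return extreme_ways, all_ways, extreme_ways / all_ways

positives = [5, 5, 5, 5, 3, 5, 3, 5, 5, 5]
extreme_ways, all_ways, p_value = exact_test(positives, 5)
print(f"P = {extreme_ways}/{all_ways} = {p_value:.3f}")
```

With four negatives among 50 samples this reproduces P = 0.039; for larger trials the number of arrangements grows quickly, which is why specialist software is normally used.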
Where software for exact tests is not available, an
ordinary chi-squared analysis for contingency tables
(see e.g. Armitage and Berry, 1987) can be used as an
alternative. The results of this test will be less reliable
than the exact test with the number of replicates
usually used in collaborative trials, but simulations
suggest that the results provide a reasonable guide to
the significance of between laboratory differences.
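As an illustration of the chi-squared alternative, the sketch below computes the ordinary Pearson statistic for the laboratory by result table of the example data. The comparison value 16.919 is the standard tabulated upper 5% point of the chi-squared distribution with 9 degrees of freedom; the function name and the handling of the degenerate all-positive or all-negative case are choices made for this sketch.

```python
def chi_squared_statistic(positives, n):
    """Pearson chi-squared statistic for an L x 2 table of
    (positive, negative) counts with n replicates per laboratory."""
    L = len(positives)
    total_pos = sum(positives)
    if total_pos in (0, n * L):
        # All results identical: no variation to test.
        return 0.0, L - 1
    expected_pos = total_pos / L
    expected_neg = n - expected_pos
    statistic = 0.0
    for k in positives:
        statistic += (k - expected_pos) ** 2 / expected_pos
        statistic += ((n - k) - expected_neg) ** 2 / expected_neg
    return statistic, L - 1

positives = [5, 5, 5, 5, 3, 5, 3, 5, 5, 5]
statistic, df = chi_squared_statistic(positives, 5)

CRITICAL_5_PERCENT_DF9 = 16.919  # tabulated value, 9 degrees of freedom
print(f"chi-squared = {statistic:.2f} on {df} df")
print("significant at 5%" if statistic > CRITICAL_5_PERCENT_DF9
      else "not significant at 5%")
```

The expected number of negatives per laboratory here is only 0.4, far below the counts for which the chi-squared approximation is usually trusted, which is consistent with the caution in the text that this test is less reliable than the exact test.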
With either test, it must be remembered that the
ability to detect between laboratory differences is
dependent on the number of laboratories and the
number of replicate samples analysed at each labo-
ratory. A non-significant test result should not be
taken to mean that performance does not vary between
laboratories, but rather that such differences have not
been proved; this is particularly true where the P-
value is only just above 0.05. The ideal solution is to
quote the odds ratio with a standard error or con-
fidence limits, but the distribution of the odds ratio is
highly skewed making it very difficult to produce
reliable limits.
Table 4
Possible arrangements of the example data with 46 positives and four negatives
Each column is one arrangement and gives the number of positives (out of 5) at each of the 10 laboratories.
                   4      3      3      2      1
                   4      4      3      4      5
                   4      4      5      5      5
                   4      5      5      5      5
                   5      5      5      5      5
                   5      5      5      5      5
                   5      5      5      5      5
                   5      5      5      5      5
                   5      5      5      5      5
                   5      5      5      5      5
Accordance (%)     84.0   86.0   88.0   90.0   96.0
Concordance (%)    85.1   84.9   84.7   84.5   84.0
COR                0.92   1.09   1.32   1.65   4.57
The observed arrangement is the third column. For convenience, the
laboratories are shown in ascending order of positives.
4. Discussion
We have presented two new statistics, accordance
and concordance, that are intended to assume the role of
repeatability and reproducibility for qualitative (detection)
studies. Like the latter two measures, accordance
and concordance do not measure departure from the
“true” value, i.e. presence or absence of an organism.
They only indicate whether the particular procedure
used is sufficiently “standardised”. Departure from
the true value can be measured using concepts such as
sensitivity (probability of giving a positive result
when the organism is present) and specificity (probability
of giving a negative result when the organism
is absent). These concepts are different: for example,
specificity may be low (e.g. due to the presence of a
cross-reacting competitive organism, or to cross-contamination)
while both accordance and concordance
are high.
Although this is strictly speaking not part of the
description of accordance and concordance, we feel it
is appropriate to say a few words on which data
should be included in the calculation of these meas-
ures. As with the validation of alternative methods
(Anonymous, 2000), data should not be excluded
solely due to a statistical test for outliers (e.g. Dixon,
1951; Grubbs, 1969). This is because routine removal
of ‘outliers’ may give an unrealistic picture of the
method’s performance by removing the extremes of
variation in the data. Data should only be rejected
where there is a clear protocol violation or where a
laboratory is clearly not producing acceptable results.
The latter criterion might include instances where a
laboratory reports a very high proportion of positives
amongst blank samples. However, if such data would
have been deemed “acceptable” outside the context
of the trial (i.e. under routine circumstances), then
they should be retained in the analysis. In any case,
exclusion of data, and on what grounds, should
always be clearly reported.
One option where there are results which are
considered unusual (perhaps identified using Fisher’s
Exact Test), but cannot be ascribed to a protocol
violation, is to analyse the data with and without the
data from that laboratory. The value including all
laboratories should still be regarded as the definitive
figure, but comparison of the two figures will indicate
whether the one doubtful observation is having a
major influence on the overall statistics and what
scope there is for improvement in accordance and
concordance.
Acknowledgements
This study was supported by the European Union’s
Standards, Measurements and Testing Fourth Framework
Programme Project SMT4 CT 96 2098
(Coordinator C. Lahellec, Agence Française de
Sécurité Sanitaire des Aliments).
Appendix A. Standard errors and confidence
intervals
The standard errors of both the accordance and the
concordance depend on the way the data have been
collected and the interpretation we want to give to the
data.
The most common situation is that laboratories
should be considered a representative sample of a
larger “population” of laboratories, e.g. all laboratories
in Europe recognised under a certain programme.
Similarly, they may represent all laboratories in a
certain region or represent, for example, high quality
laboratories. For example, in order to assess the
variability in performance of a test and to certify its
performances, a certification body might randomly
select a small sample of the laboratories it has
approved and ask them to take part in a collaborative
trial.
When laboratories are considered representative,
results from a trial have implications for all laboratories
in the “population” of laboratories, not just the
participating ones. In the above example of a certif-
ication body examining the performance of a test, the
results of the trial have implications for all laborato-
ries which it has approved and therefore could have
been included in the trial. Standard errors then provide
an indication of how different the accordance and
concordance values might have been if, instead of the
actual sample, another sample of laboratories had
been taken.
Less frequently, the participating laboratories may
be “fixed” in the sense that they are not considered a
random sample of a population of laboratories. They
only represent themselves. For example, a group of
laboratories may have joined forces to examine (and
improve) the performance of certain tests in their
laboratories. The results of the trial have no implica-
tions for non-participating laboratories. Standard
errors then refer to how variable the data would have
been under replication of the trial in the same labo-
ratories.
The easiest way to calculate standard errors is by
using a statistical device called the “bootstrap” (Davison
and Hinkley, 1997). From the actual observations
a number M (usually between 20 and 100
suffices) of bootstrap samples is created. The standard
deviation of accordance, concordance or COR (or any
other measure) among bootstrap samples then estimates
the standard error of these measures. A bootstrap
sample is constructed by sampling with replacement
from the actual observations. The sample
size should be exactly equal to that of the original data set.
Where the L laboratories can be considered as a
random sample of a larger population, we proceed
as follows to obtain a single bootstrap sample. We
give laboratories consecutive numbers 1, 2, 3, ..., L.
We write these numbers on pieces of paper and put
them in a box. We then draw a number L times, write
each number down, and after each draw we put the
piece of paper back in the box. Some laboratories are
thus “sampled” more than once, while others are not
included at all. Of course, randomly drawing from a
box can be done efficiently by simulating it on a
computer using a random number generator in, for
example, Excel. Standard errors then have the inter-
pretation of showing how the value of accordance and
concordance might fluctuate under replication of the
same collaborative trial in different samples of labo-
ratories from the same ‘‘population’’.
For fixed laboratories, it makes little sense to
resample laboratories, as these are considered fixed.
Instead, each bootstrap sample is obtained by boot-
strapping within each laboratory. The idea behind this
is that while laboratories are fixed, results obtained
within each laboratory are not. All laboratories are
present (once) in each bootstrap sample, but observa-
tions from within a laboratory are bootstrapped and
thus may be either absent, occur once, or more than
once in each bootstrap sample.
Once bootstrap standard errors have been estimated,
approximate 95% confidence intervals can be
obtained by taking the actual estimate plus or minus
two standard errors.
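Both resampling schemes can be sketched in a few lines. The code below is an illustrative implementation under stated assumptions, not the authors' Excel application: laboratories are represented only by their number of positives, M = 100 bootstrap samples are drawn, a fixed seed is used for repeatability, and concordance is obtained from the closed-form expression in the footnote to Section 3.3. All function names are hypothetical.

```python
import random
from statistics import stdev

def lab_accordance(k, n):
    return (k * (k - 1) + (n - k) * (n - k - 1)) / (n * (n - 1))

def accordance(positives, n):
    return sum(lab_accordance(k, n) for k in positives) / len(positives)

def concordance(positives, n):
    L, r, A = len(positives), sum(positives), accordance(positives, n)
    numerator = 2 * r * (r - n * L) + n * L * (n * L - 1) - A * n * L * (n - 1)
    return numerator / (n**2 * L * (L - 1))

def resample_laboratories(positives, n, rng):
    """Random-laboratory scheme: draw L laboratories with replacement."""
    return [rng.choice(positives) for _ in positives]

def resample_within_laboratories(positives, n, rng):
    """Fixed-laboratory scheme: redraw the n results inside every laboratory."""
    sample = []
    for k in positives:
        results = [1] * k + [0] * (n - k)
        sample.append(sum(rng.choice(results) for _ in range(n)))
    return sample

def bootstrap_se(positives, n, statistic, resample, M=100, seed=0):
    """Standard deviation of `statistic` over M bootstrap samples."""
    rng = random.Random(seed)
    values = [statistic(resample(positives, n, rng), n) for _ in range(M)]
    return stdev(values)

positives = [5, 5, 5, 5, 3, 5, 3, 5, 5, 5]
n = 5

for name, scheme in [("random laboratories", resample_laboratories),
                     ("fixed laboratories", resample_within_laboratories)]:
    for label, stat in [("accordance", accordance), ("concordance", concordance)]:
        estimate = stat(positives, n)
        se = bootstrap_se(positives, n, stat, scheme)
        print(f"{name}, {label}: {estimate:.3f} "
              f"(approx. 95% CI {estimate - 2 * se:.3f} to {estimate + 2 * se:.3f})")
```

The plus or minus two standard errors interval is not constrained to the range 0 to 1, so limits above 100% can occur and would need to be truncated when reported.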
An Excel application to calculate all the statistics
described in this paper, including the bootstrapped
standard errors, can be obtained by sending an email
to s.langton@csl.gov.uk.
References
Anonymous, 1994. ISO 5725 Accuracy (Trueness and Precision) of Measurement Methods and Results. International Organization for Standardization, Geneva, Switzerland.
Anonymous, 2000. prEN ISO 16 140 Microbiology of Food and Animal Feeding Stuffs - Protocol for the Validation of Alternative Methods. International Organization for Standardization, Geneva, Switzerland.
Armitage, P., Berry, G., 1987. Statistical Methods in Medical Research, 2nd edn. Blackwell, Oxford.
Davison, A.C., Hinkley, D.V., 1997. Bootstrap Methods and Their Application. Cambridge Univ. Press, Cambridge.
De Buyser, M.L., Lombard, B., 2000. Validation of ISO Microbiological Methods: Enumeration of Coagulase-Positive Staphylococci According to EN ISO 6888 Part 1 and Part 2: 1999. Agence Française de Sécurité Sanitaire des Aliments, Maisons-Alfort.
Dixon, W.J., 1951. Ratios involving extreme values. Ann. Math. Stat. 22, 68–78.
Grubbs, F.E., 1969. Procedures for detecting outlying observations in samples. Technometrics 11, 1–21.
IDF, 1991. Precision characteristics of analytical methods - outline of collaborative study procedure. International IDF Standard 135B.
Lahellec, C., 1998. Development of standard methods with special reference to Europe. Int. J. Food Microbiol. 45, 13–16.
Leclercq, A., Lombard, B., Mossel, D.A., 2000. Standardisation of food microbiological analysis methods: an asset or a constraint. Sci. Aliments 20, 179–202.
Schulten, S.M., vd Lustgraaf, B.E.B., Nagelkerke, N.J.D., In’t Veld, P.H., 1998. Validation of Microbiological Methods: Enumeration of Bacillus cereus According to ISO 7932 (2nd ed., 1993). RIVM, Bilthoven.
Schulten, S.M., in’t Veld, P.H., Nagelkerke, N.J.D., Scotter, S., de Buyser, M.L., Rollier, P., Lahellec, C., 2000. Evaluation of the ISO 7932 standard for the enumeration of Bacillus cereus in foods. Int. J. Food Microbiol. 57, 53–61.
Scotter, S.L., Langton, S., Lombard, B., Lahellec, C., Schulten, S., Nagelkerke, N., In’t Veld, P.H., Rollier, P., de Buyser, M.L., 2001. Validation of ISO method 11290: Part 1. Detection of Listeria monocytogenes in foods. Int. J. Food Microbiol. 64, 295–306.
Szita, G., Tabajdi, V., Fabian, A., Biro, G., Reichart, O., Kormoczy, P.S., 1998. A novel, selective synthetic acetamide containing culture medium for isolating Pseudomonas aeruginosa from milk. Int. J. Food Microbiol. 43, 123–127.
S.D. Langton et al. / International Journal of Food Microbiology 79 (2002) 175–181 181
... Accordance and concordance are measures to express, respectively, the repeatability (intra-operator variability) and reproducibility (inter-operator variability) of qualitative tests [30,31]. To evaluate the accordance and concordance of the mini-dbPCR-NALFIA, a single individual prepared 8 aliquots of a dilution series of FCR3 ringstage P. falciparum culture and five Plasmodium-negative blood samples. ...
... Here, L represents the number of different operators, and p 0,i , p 1,i , p 2,i and p 3,i represent the proportion of negative, pan single positive, P. falciparum single positive and double positive results for a particular operator i [31]. The 95% CI of the accordance and concordance estimates was calculated by means of bootstrapping [30,32]. ...
Article
Full-text available
Background Point-of-care diagnosis of malaria is currently based on microscopy and rapid diagnostic tests. However, both techniques have their constraints, including poor sensitivity for low parasitaemias. Hence, more accurate diagnostic tests for field use and routine clinical settings are warranted. The miniature direct-on-blood PCR nucleic acid lateral flow immunoassay (mini-dbPCR-NALFIA) is an innovative, easy-to-use molecular assay for diagnosis of malaria in resource-limited settings. Unlike traditional molecular methods, mini-dbPCR-NALFIA does not require DNA extraction and makes use of a handheld, portable thermal cycler that can run on a solar-charged power pack. Result read-out is done using a rapid lateral flow strip enabling differentiation of Plasmodium falciparum and non-falciparum malaria infections. A laboratory evaluation was performed to assess the performance of the mini-dbPCR-NALFIA for diagnosis of pan-Plasmodium and P. falciparum infections in whole blood. Methods Diagnostic accuracy of the mini-dbPCR-NALFIA was determined by testing a set of Plasmodium-positive blood samples from returned travellers (n = 29), and Plasmodium-negative blood samples from travellers with suspected malaria (n = 23), the Dutch Blood Bank (n = 19) and intensive care patients at the Amsterdam University Medical Centers (n = 16). Alethia Malaria (LAMP) with microscopy for species differentiation were used as reference. Limit of detection for P. falciparum was determined by 23 measurements of a dilution series of a P. falciparum culture. A fixed sample set was tested three times by the same operator to evaluate the repeatability, and once by five different operators to assess the reproducibility. Results Overall sensitivity and specificity of the mini-dbPCR-NALFIA were 96.6% (95% CI, 82.2%–99.9%) and 98.3% (95% CI, 90.8%–100%). Limit of detection for P. falciparum was 10 parasites per microlitre of blood. 
The repeatability of the assay was 93.7% (95% CI, 89.5%–97.8%) and reproducibility was 84.6% (95% CI, 79.5%–89.6%). Conclusions Mini-dbPCR-NALFIA is a sensitive, specific and robust method for molecular diagnosis of Plasmodium infections in whole blood and differentiation of P. falciparum. Incorporation of a miniature thermal cycler makes the assay well-adapted to resource-limited settings. A phase-3 field trial is currently being conducted to evaluate the potential implementation of this tool in different malaria transmission areas.
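The abstract above does not say how its confidence intervals were obtained. The sketch below shows one common choice, the exact (Clopper-Pearson) binomial interval, using only the standard library. Both the interval method and the count of 28 detected out of 29 positives (inferred from the reported 96.6%) are assumptions made for illustration, not statements from the source.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def _bisect(f, target):
    """Solve f(p) = target on [0, 1] for a function f that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a binomial proportion k/n."""
    # Lower limit: P(X >= k | p) = alpha / 2
    lower = 0.0 if k == 0 else _bisect(lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper limit: P(X <= k | p) = alpha / 2
    upper = 1.0 if k == n else _bisect(lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper

# Assumed counts: 28 of 29 Plasmodium-positive samples detected.
low, high = clopper_pearson(28, 29)
print(f"sensitivity {28 / 29:.1%}, 95% CI {low:.1%} to {high:.1%}")
```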
... The results were reported as estimated contamination levels (Engling et al. 2000). Parameters for presenting the frequencies of correct results (accuracy), and of sensitivity and specificity were proposed by Langton et al. (2002), which are modified versions of the class of similarity coefficients (Sneath and Sokal 1973). These parameters were used soon after their publication in interlaboratory studies (van Raamsdonk et al. 2003;Gizzi et al. 2004). ...
... In the situation of Booleans the frequency of different results between duplicates can still be used as basis for an indication of precision. The analogue parameters accordance and concordance have been developed to calculate repeatability and reproducibility, respectively, for Boolean results (Langton et al. 2002;van der Voet and van Raamsdonk 2004). ...
Article
Full-text available
Visual examination of visually recognisable substances, including microscopy, focuses on targets or contaminants such as particles of animal origin, plant seeds, spore bodies of moulds, sclerotia, packaging material, microplastic and 'Besatz' (everything that differs from the norm). The two principal results are counts (numbers) and weights for macroscopic methods, or presence/absence for microscopic methods. The level of detection equals at least the size of one unit, usually with a weight exceeding 1 mg, which is in the range of parts per million (ppm). These parameters do not follow a normal distribution but Poisson (counts), lognormal (weights) or binomial (Booleans) distributions, which affects the interpretation of validation parameters. As for other domains, examination methods for visual monitoring need to be properly validated, and quality control during actual application is needed. In most cases, procedures for validation of visual methods are based on principles adopted from other domains, such as chemical analysis. A series of examples from publications shows inconsistent or incorrect implementations of these validation procedures, which stresses the need for dedicated validation procedures. Identification of legal ingredients and composition analysis in the domain of visual examination rely on the expertise of the laboratory staff; therefore, validation of a method usually includes the validation of the expert. In view of these specific circumstances, a Guidance for quality assurance and control of visual methods has been developed, which is presented and discussed in this paper. The general framework of the Guidance is adopted from ISO standards (17023, 17043, 13528). Part 1 of the Guidance includes the general background, theory and principles. Part 2 presents the actual validation procedures with experimental designs and equations for calculating the relevant parameters, and can be used as a blueprint for an SOP in a quality management system. 
An EURL and NRL network for physical hazards is strongly recommended.
... Perkinsus prevalence (% of infected individuals out of the total number of hosts sampled) were plotted for both methods and single-or co-infection was represented when possible. The concordance and the discordance parameters, adapted from Langton et al. (2002), were calculated. The concordance, calculated by counting paired positive samples and paired negative samples, is the percentage of chance that an identical sample analysed by two different methodologies will yield the same result. ...
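The paired-sample concordance described in this snippet can be sketched as follows. The function names and the example results are hypothetical, and treating discordance as the complement of concordance is an assumption, since the snippet does not define it.

```python
def method_concordance(results_a, results_b):
    """Percentage of samples on which two methods give the same presence/absence result."""
    if len(results_a) != len(results_b) or not results_a:
        raise ValueError("both methods need results for the same non-empty set of samples")
    agree = sum(a == b for a, b in zip(results_a, results_b))
    return 100.0 * agree / len(results_a)

def method_discordance(results_a, results_b):
    """Assumed complement of concordance: percentage of samples with differing results."""
    return 100.0 - method_concordance(results_a, results_b)

# Hypothetical paired results (True = parasite detected) for four samples.
method_1 = [True, True, False, False]
method_2 = [True, False, False, False]
print(method_concordance(method_1, method_2), method_discordance(method_1, method_2))
```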
Article
Full-text available
The parasitic species Perkinsus olseni (= atlanticus) (Perkinsea, Alveolata) infects a wide range of mollusc species and is responsible for mortality events and economic losses in the aquaculture industry and fisheries worldwide. Thus far, most studies conducted in this field have approached the problem from a “one parasite-one disease” perspective, notably with regards to commercially relevant clam species, while the impact of other Perkinsus species should also be considered as it could play a key role in the disease phenotype and dynamics. Co-infection of P. olseni and P. chesapeaki has already been sporadically described in Manila clam populations in Europe. Here, we describe for the first time the parasitic distribution of two Perkinsus species, P. olseni and P. chesapeaki, in individual clam organs and in five different locations across Arcachon Bay (France), using simultaneous in situ detection by quantitative PCR (qPCR) duplex methodology. We show that P. olseni single-infection largely dominated prevalence (46–84%) with high intensities of infection (7.2 to 8.5 log-nb of copies.g⁻¹ of wet tissue of Manila clam) depending on location, suggesting that infection is driven by the abiotic characteristics of stations and physiological states of the host. Conversely, single P. chesapeaki infections were observed in only two sampling stations, Ile aux Oiseaux and Gujan, with low prevalences of 2 and 14%, respectively. Interestingly, the co-infection by both Perkinsus spp., ranging in prevalence from 12 to 34%, was distributed across four stations of Arcachon Bay, and was detected in one or two organs maximum. Within these co-infected organs, P. olseni largely dominated the global parasitic load. Hence, the co-infection dynamics between P. olseni and P. chesapeaki may rely on a facilitating role of P. olseni in developing a primary infection which in turn may help P. chesapeaki infect R. philippinarum as a reservoir for a preferred host. 
This ecological study demonstrates that the detection and quantification of both parasitic species, P. olseni and P. chesapeaki, is essential and timely in resolving cryptic infections and their consequences on individual hosts and clam populations.
... Was estimated using concordance i.e. the probability of achieving the same test results for identical samples analysed by different laboratories. Concordance was calculated according to Langton et al. (2002) by ...
Article
Full-text available
A test performance study (TPS) was conducted in 2020 to evaluate performance of serological and PCR-based tests for the detection of Xylophilus ampelinus in homogenised vine samples. In total, 11 labs participated, although there were fewer participants for the serological tests than the PCR-based tests. The panel of samples sent to participants included spiked samples containing 10⁴–10⁸ cfu per ml (serological tests) or 10²–10⁶ cfu per ml (PCR-based tests) as well as positive and negative controls. The five PCR-based tests were found to be fit for purpose, with similar performance across a range of metrics (analytical sensitivity, diagnostic sensitivity and specificity, and repeatability and reproducibility assessed in terms of accordance and concordance, respectively). Serological methods (two immunofluorescence tests and two ELISA tests) were found to be less sensitive with regard to both analytical and diagnostic sensitivity. Furthermore, the occurrence of false positives suggests that a positive IF result may not be conclusive when considered in isolation. One of the ELISA tests exhibited much lower analytical and diagnostic sensitivity than the other serological tests and would not be considered suitable for the purpose considered by this TPS.
... The performance of each master mix was evaluated in terms of diagnostic sensitivity (DSE), diagnostic specificity (DSP), and accuracy (ACC), for which the percentages of true negative (TN), false positive (FP), false negative (FN), and true positive (TP) results provided by the OLs were calculated [21,22]. Reproducibility was calculated according to the method reported by Langton et al. [23], considering samples 1, 2, 4, 5, 6, 7, and 8 with a pathogen concentration at the limit of detection [21]. ...
Article
Full-text available
In 2022, a test performance study (TPS) assessing the influence of different master mixes on the performance of the tetraplex real-time PCR (TqPCR) assay was organized. TqPCR allows for the specific detection and identification of Xylella fastidiosa (Xf) subspecies in a single reaction. Seventeen official laboratories of the Italian National Plant Protection Organization received a panel of 12 blind samples, controls, primers, probes, and different master mixes to participate in the TPS. Furthermore, the Research Centre for Plant Protection and Certification of the Council for Agricultural Research and Economics performed an intra-laboratory study (ITS) on spiked plant matrices to evaluate the analytical sensitivity of TqPCR employing the selected master mixes with the best performance. Naturally infected samples were analyzed for subspecies identification via TqPCR
Citation: Pucci, N.; Scala, V.; Cesari, E.; Crosara, V.; Fiorani, R.; L'Aurora, A.; Lucchesi, S.; Tatulli, G.; Ciarroni, S.; De Amicis, F.; et al. An Inter-Laboratory Comparative Study on the Influence of Reagents to Perform the Identification of the Xylella fastidiosa Subspecies Using Tetraplex Real Time PCR. Horticulturae 2023, 9, 1053. https://doi.
... Statistical analyses were performed using R software, version 4.1.1 [19]. Raw data, consisting of Cq values of templates obtained from the different extractions, were normalized by the respective Cq values obtained by Tissue Lyser and Quick-RNA Plant Kit extraction, which was considered a benchmark protocol. ...
Article
Full-text available
Peach latent mosaic viroid (PLMVd) is an important pathogen that causes disease in peaches. Control of this viroid remains problematic because most PLMVd variants are symptomless, and although there are many detection tests in use, the reliability of PCR-based methods is compromised by the complex, branched secondary RNA structure of the viroid and its genetic diversity. In this study, a duplex RT-qPCR method was developed and validated against two previously published single RT-qPCRs, which were potentially able to detect all known PLMVd variants when used in tandem. In addition, in order to simplify the sample preparation, rapid-extraction protocols based on the use of crude sap or tissue printing were compared with commercially available RNA purification kits. The performance of the new procedure was evaluated in a test performance study involving five participant laboratories. The new method, in combination with rapid-sample-preparation approaches, was demonstrated to be feasible and reliable, with the advantage of detecting all different PLMVd isolates/variants assayed in a single reaction, reducing costs for routine diagnosis.
... 21 After the outlier (no. 13 laboratory in the second interlaboratory study) was excluded, the concordance, accordance, and concordance odds ratio (COR) were calculated from the data of 14 laboratories as previously described, 22 and cumulative distributions of ΔΔCq values were analyzed. A summary of the interlaboratory study data is shown in Table S12. ...
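For readers unfamiliar with the concordance odds ratio named in this snippet, here is a minimal sketch. It uses the usual definition from Langton et al. with accordance and concordance expressed as percentages, COR = accordance × (100 − concordance) / (concordance × (100 − accordance)). The function name and the example values are hypothetical and are not data from the cited study.

```python
import math

def concordance_odds_ratio(accordance_pct, concordance_pct):
    """COR from accordance and concordance given as percentages.

    A value near 1 suggests little between-laboratory variation beyond what is
    seen within laboratories; larger values suggest more.
    """
    if not (0 <= accordance_pct <= 100 and 0 < concordance_pct <= 100):
        raise ValueError("accordance must be in [0, 100] and concordance in (0, 100]")
    if accordance_pct == 100:
        # Denominator is zero: undefined if concordance is also 100, otherwise infinite.
        return math.nan if concordance_pct == 100 else math.inf
    return (accordance_pct * (100 - concordance_pct)) / (
        concordance_pct * (100 - accordance_pct)
    )

print(concordance_odds_ratio(90, 80))  # hypothetical percentages
```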
Article
Full-text available
Real-time polymerase chain reaction (PCR) is the gold standard for DNA detection in many fields, including food analysis. However, robust detection using a real-time PCR for low-content DNA samples remains challenging. In this study, we developed a robust real-time PCR method for low-content DNA using genetically modified (GM) maize at concentrations near the limit of detection (LOD) as a model. We evaluated the LOD of real-time PCR targeting two common GM maize sequences (P35S and TNOS) using GM maize event MON863 containing a copy of P35S and TNOS. The interlaboratory study revealed that the LOD differed among laboratories partly because DNA input amounts were variable depending on measurements of DNA concentrations. To minimize this variability for low-content DNA samples, we developed ΔΔCq-based real-time PCR. In this study, ΔCq and ΔΔCq are as follows: ΔCq = Cq (P35S or TNOS) - Cq (SSIIb; maize endogenous gene), ΔΔCq = ΔCq (analytical sample) - ΔCq (control sample at concentrations near the LOD). The presence of GM maize was determined based on ΔΔCq values. In addition, we used optimized standard plasmids containing SSIIb, P35S, and TNOS with ΔCq equal to the MON863 genomic DNA (gDNA) at concentrations near the LOD as a control sample. A validation study indicated that at least 0.2% MON863 gDNA could be robustly detected. Using several GM maize certified reference materials, we have demonstrated that this method was practical for detecting low-content GM crops and thus for validating GM food labeling. With appropriate standards, this method would be applicable in many fields, not just food.
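The ΔCq and ΔΔCq definitions in the abstract above translate directly into code. The sketch below only evaluates those two differences. The abstract does not state the decision rule applied to ΔΔCq, so none is implemented, and the function names and Cq values are hypothetical.

```python
def delta_cq(cq_target, cq_endogenous):
    """ΔCq = Cq(P35S or TNOS) - Cq(SSIIb, the maize endogenous gene)."""
    return cq_target - cq_endogenous

def delta_delta_cq(sample_target, sample_endogenous, control_target, control_endogenous):
    """ΔΔCq = ΔCq(analytical sample) - ΔCq(control sample near the LOD)."""
    return delta_cq(sample_target, sample_endogenous) - delta_cq(
        control_target, control_endogenous
    )

# Hypothetical Cq values for one analytical sample and one control sample.
value = delta_delta_cq(
    sample_target=35.5, sample_endogenous=24.0,
    control_target=36.0, control_endogenous=24.0,
)
print(value)
```

Because both ΔCq terms are normalised to the endogenous gene, a difference in DNA input between sample and control shifts both Cq values together and largely cancels out, which is the motivation the abstract gives for the approach.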
... At least one positive sample should have a low concentration of the target pest; e.g., close to the limit of detection estimated during the preliminary studies by the validation organiser. • a minimum of two replicates for each of three negative samples and two positive samples, with the samples independent of each other, for determination of the repeatability and reproducibility using the accordance and concordance of Langton et al. (2002). By using accordance and concordance, it is possible to determine if a particular laboratory performs poorly, or if a particular sample or test is performed poorly (see more details in Massart et al. 2022). ...
Chapter
Full-text available
The organisation of a test performance study (TPS) involves different steps that are mostly sequential, but some may be conducted simultaneously. This chapter details the following: the steps regarding the selection of the tests to be validated; the selection of the laboratories to participate in the TPS; the preparation of the materials and the dispatch of the samples; and the completion of the TPS (including the collection and analysis of the TPS results). The reader will be able to get the detailed information on how to define and plan timeline of the TPS, the appropriate number of samples (including replicates) and of laboratories that should be included in the TPS to ensure an appropriate statistical analysis, and how to perform basic analyses of the obtained data. In addition, this chapter covers the most important critical points which can endanger successful TPS organization providing the future TPS organisers in the field of plant health (but also in other similar fields) with the possibility to identify them in advance and carry-out successful TPS.
Article
The growing importance of qualitative analyses and the need for reliable analytical results to support decision-making across the various sectors of the food production chain are unquestionable, with impacts on public health, the economy and consumer rights. However, the validation of these methods has been a critical point in the implementation of quality management systems and in laboratory accreditation processes. Although well-established protocols exist for the validation of quantitative methods, there is still a gap in the development of approaches for implementing metrology in qualitative analyses. In this context, the present work discusses the validation of qualitative methods, covering definitions, experimental design and data analysis for evaluating the applicable performance parameters, such as false result rates, reliability rate, selectivity rate, sensitivity rate, limit of detection, unreliability region, accordance, concordance and robustness.
Article
Full-text available
Numerous microbiological methods for detection or enumeration of microorganisms in food have been developed and are used extensively in routine analysis. However, the development of international trade and the requirement for quality assurance in laboratories have stressed the need for harmonisation of these methods. Standardisation is an appropriate way to solve this issue. It has been established at three levels: National (such as the French Association for Standardisation: Afnor), European (European Committee for Standardisation: CEN) and International (International Organisation for Standardisation: ISO). Nowadays, these organisations essentially promote horizontal standards for microbiological analyses of food and mutual recognition of standards. The definition of microbiological criteria to assess the hygienic quality of food (which is the approach of the Codex Alimentarius Committee for Food Hygiene) cannot be carried out without combining it with a given method of analysis, since the result depends considerably on the methods selected. Consequently, it is not an easy task to obtain the consensus required to establish the various standards (reference, routine or validated commercial methods). Advantages and disadvantages of standardisation depend on the type of service and the framework of analysis carried out by private or public laboratories. Standardisation simplifies technical aspects of food analysis, but also helps laboratories in quality assurance management and customer trade by defining a common language and clarifying its services. Disadvantages of standardisation reside in its rigidity and slow evolution.
Article
Twenty, mostly European, laboratories took part in a collaborative study to validate the general 1993 ISO 7932 standard for the enumeration of Bacillus cereus in foods (Anonymous, ISO 7932, Microbiology – General Guidance for the Enumeration of Bacillus cereus – Colony-count Technique at 30°C. International Organisation for Standardization, Geneva, Switzerland, 1993). The objective was to determine the precision data in terms of repeatability (r) and reproducibility (R) of the method using three different food types at various inoculum levels. The results are intended for publication in Comité Européen de Normalisation (CEN) and ISO standards. The method was challenged with three types of food product: fresh cheese, minced beef and potato powder. Each participant received eight samples for each food type: blind duplicates at four inoculum levels (target values of B. cereus of 0, 10³, 10⁴ and 10⁵–10⁶ colony forming particles (cfp)/g). In addition, two reference materials (RMs, capsules containing milk powder, artificially contaminated with B. cereus) were included in the study. All test materials were tested extensively for homogeneity and stability prior to the collaborative trial. In addition to determining the precision parameters of the ISO method, polymyxin pyruvate egg yolk bromothymol blue agar (PEMBA, incubation at 37°C) was included in the study to evaluate possible differences in performance compared to mannitol egg yolk polymyxin agar (MEYP, incubation at 30°C) which is prescribed in the ISO standard. In this study no difference in performance between MEYP and PEMBA medium was observed. Results from the glucose fermentation confirmation test indicated that generally 48 h was needed to obtain a yellow colour throughout the whole test tube. A high number of false negative results with the Voges Proskauer (VP) reaction, even after 48 h of incubation of the tubes, was observed in some laboratories. 
Values for r and R were therefore calculated without VP test results. Further studies have been initiated by the ISO technical committee in order to improve the performance of the confirmation tests. The precision of the test method appeared to be unaffected by the type of food or the concentration of B. cereus present in the test sample. The overall repeatability value found for the food samples was 0.29 log₁₀ units and the overall reproducibility value was 0.42 log₁₀ units. For the reference material, the repeatability was 0.11 log₁₀ units and the reproducibility 0.22 log₁₀ units. It was recommended to CEN and ISO to include these values into the revised horizontal standard for the enumeration of Bacillus cereus.
Article
Procedures are given in the report for determining statistically whether the highest observation, or the lowest observation, or the highest and lowest observations, or the two highest observations, or the two lowest observations, or perhaps more of the observations in the sample may be considered to be outlying observations or discrepant values. Statistical tests of significance are useful in this connection either in the absence of assignable physical causes or to support a practical judgement that some of the experimental observations are aberrant. Both the statistical formulae and illustrative applications of the procedures to practical examples are given, thus representing a rather complete treatment of significance tests for outliers in single univariate samples.
Article
Includes bibliography
Article
Ratios of the form $(x_n - x_{n-j})/(x_n - x_i)$ for small values of $i$ and $j$ and $n = 3, \cdots, 30$ are discussed. The variables concerned are order statistics, i.e., sample values such that $x_1 < x_2 < \cdots < x_n$. Analytic results are obtained for the distributions of these ratios for several small values of $n$ and percentage values are tabled for these distributions for samples of size $n \leqq 30$.
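The ratio in the abstract above is straightforward to evaluate once the sample is sorted. The sketch below computes $(x_n - x_{n-j})/(x_n - x_i)$ for a suspected high outlier. The function name and the data are hypothetical, and the tabulated percentage points needed to judge significance are not reproduced here.

```python
def order_statistic_ratio(values, i=1, j=1):
    """Return (x_n - x_{n-j}) / (x_n - x_i) for the sorted sample x_1 <= ... <= x_n.

    Indices i and j are 1-based, as in the abstract's notation.
    """
    x = sorted(values)
    n = len(x)
    if n < 3 or i < 1 or j < 1 or i + j > n - 1:
        raise ValueError("need n >= 3, i >= 1, j >= 1 and i + j <= n - 1")
    denominator = x[n - 1] - x[i - 1]
    if denominator == 0:
        raise ValueError("x_n equals x_i, so the ratio is undefined")
    return (x[n - 1] - x[n - j - 1]) / denominator

# Hypothetical sample with one suspiciously large value.
print(order_statistic_ratio([4, 1, 10, 3, 2]))            # i = 1, j = 1
print(order_statistic_ratio([4, 1, 10, 3, 2], i=2, j=2))  # i = 2, j = 2
```

A large ratio means the gap between the largest value and its neighbours is big relative to the spread of the sample. Whether it is large enough to flag an outlier depends on the tabulated distribution for the given n, i and j.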
Article
A selective synthetic medium has been developed both in liquid (Z-broth) and solid (Z-agar) forms for selective isolation of Pseudomonas aeruginosa from foods. The simple, easy to prepare peptone-free synthetic medium contained acetamide that is metabolized to ammonia and acetic acid providing nitrogen and carbon supply. The medium contained no inhibitors. Selectivity of the liquid medium was tested by inoculation of pure cultures of different bacteria belonging to the groups Bacillus, Pseudomonas, Enterobacteriaceae and Staphylococcus. It was found that the selectivity of the medium was complete for the examined range of bacteria. However, a similar result was obtained when nitrofurantoin broth was used. Applicability of the synthetic agar medium was also tested by a nation-wide inter-laboratory test using two milk samples containing 10³/ml (sample I) and 10⁵/ml (sample II) Pseudomonas aeruginosa. According to this test, no microbiologically relevant differences were found between the results obtained by Z-agar and cetrimide agar, a frequently used selective agar, in the case of sample II. However, a relevant and statistically significant difference was found in the results of sample I in favour of the Z-agar, which could indicate the presence of a low number of bacteria. Concerning repeatability and reproducibility, Z-agar proved to be superior to cetrimide agar.