To Predict or Not to Predict: Influences of Task and
Strategy on the Processing of Semantic Relations
Dietmar Roehm¹, Ina Bornkessel-Schlesewsky¹, Frank Rösler², and Matthias Schlesewsky²
Abstract
We report a series of event-related potential experiments
designed to dissociate the functionally distinct processes involved
in the comprehension of highly restricted lexical–semantic rela-
tions (antonyms). We sought to differentiate between influ-
ences of semantic relatedness (which are independent of the
experimental setting) and processes related to predictability
(which differ as a function of the experimental environment).
To this end, we conducted three ERP studies contrasting the
processing of antonym relations (black–white) with that of
related (black–yellow) and unrelated (black–nice) word pairs.
Whereas the lexical–semantic manipulation was kept constant
across experiments, the experimental environment and the
task demands varied: Experiment 1 presented the word pairs
in a sentence context of the form The opposite of X is Y and
used a sensicality judgment. Experiment 2 used a word pair
presentation mode and a lexical decision task. Experiment 3
also examined word pairs, but with an antonymy judgment
task. All three experiments revealed a graded N400 response
(unrelated > related > antonyms), thus supporting the as-
sumption that semantic associations are processed automati-
cally. In addition, the experiments revealed that, in highly
constrained task environments, the N400 gradation occurs
simultaneously with a P300 effect for the antonym condition,
thus leading to the superficial impression of an extremely ‘‘re-
duced’’ N400 for antonym pairs. Comparisons across experi-
ments and participant groups revealed that the P300 effect
is not only a function of stimulus constraints (i.e., sentence
context) and experimental task, but that it is also crucially
influenced by individual processing strategies used to achieve
successful task performance.
INTRODUCTION
Like other domains of higher cognition, language-based
communication relies upon a successful interplay be-
tween previously processed information, information cur-
rently being processed, and the predictions arising from
the combination of these two information sources. It is
generally assumed that predictive processing behavior of
this type is one of the keys to the efficiency of real-time
language comprehension and, as such, that it applies at
all levels of linguistic processing. For example, predic-
tions have been shown to be operative in the phonologi-
cal analysis of the ongoing speech stream (Connolly &
Phillips, 1994), in the establishment of lexical–semantic
relations (Kutas & Federmeier, 2000; Federmeier & Kutas,
1999), in basic processes of syntactic structure building
(Stabler, 1994), and with respect to syntactic relations
(Gibson, 1998). In cognitive neuroscience, predictive lan-
guage processing has been examined primarily with re-
spect to the lexical–semantic domain, thereby revealing
consistent patterns of predictive processing behavior
on the word (Van Petten, 1993), sentence (Kutas &
Hillyard, 1980), and even discourse levels (Hagoort, Hald,
Bastiaansen, & Petersson, 2004; van Berkum, Hagoort, &
Brown, 1999). Within this domain, additional costs arising
from the processing of unpredicted information or infor-
mation that was not otherwise preactivated on the basis
of previous processing steps have been associated with
a specific electrophysiological correlate, the so-called
N400 effect (i.e., a negative deflection over centroparietal
electrodes between approximately 300 and 600 msec after
critical stimulus onset and a maximal peak in the vicinity
of 400 msec). Notably, this effect spans all of the previ-
ously described levels of processing in that it is observable
for words, sentences, and texts. As such, the N400 has
been regarded as a general marker for integration difficulties
as a function of the previous processing context.
Interestingly, this language-related effect of integra-
tion as a function of predictability differs markedly from
the event-related potential (ERP) effect that is common-
ly associated with domain general processes of ‘‘con-
text updating,’’ namely, the P300 effect (Picton, 1993;
Donchin & Coles, 1988; for an alternative perspective,
see Verleger, 1988). The P300 is a positive deflection
with a maximal amplitude at approximately 300 msec
and is typically classified into two subcomponents, the
P3a and the P3b. Whereas the P3a has an anterior scalp
¹Max Planck Institute for Human Cognitive and Brain Sciences, Germany; ²University of Marburg, Germany
© 2007 Massachusetts Institute of Technology. Journal of Cognitive Neuroscience 19:8, pp. 1259–1274
distribution and is typically associated with the process-
ing of novel stimuli, the P3b is characterized by a pos-
terior distribution and is engendered by target items
(Polich, 2004). In addition to these topographical differ-
ences, the P3a often shows a slightly shorter latency than
the P3b. Thus, the N400 and the P300 appear to index
functionally different levels of predictive processing via
distinct electrophysiological characteristics. The exis-
tence of these different levels suggests that there should
be a range of cognitive tasks that lead to a potential
overlap of both processing domains and, thereby, of
both components.
Indeed, a substantial portion of the early electrophys-
iological literature on word-level processing focused
on the question of whether apparent modulations of
the N400 should not rather be attributed to underlying
differences in P300 effects (for discussion, see Rugg,
1990; Holcomb, 1988). This discussion was primarily cen-
tered around effects of word frequency and repetition
and the general finding that highly frequent words and
repeated words engender more positive-going ERP
waveforms than their lower frequent and nonrepeated
counterparts. From these studies, three general conclu-
sions can be drawn: (a) There is evidence for a semantic
modulation of the N400 component that is independent
of other effects. (b) Nonetheless, apparent N400 mod-
ulations may be influenced by differences in the latency
and/or amplitude of simultaneously evoked P300 effects.
(c) P300 effects may follow N400 effects and thus show
up with a latency from 500 msec onward. In this context,
the strength of the interaction between the two compo-
nents appears to depend at least in part on attentional
aspects of processing that modulate the amplitude and
latency of the P300 within the experimental environ-
ment (Holcomb, 1988).
With respect to higher level (i.e., relational) process-
ing demands in language comprehension, the possible
involvement of P300 effects is much more controver-
sial. Whereas it is undisputed that late positive effects at
the word level can be regarded as (decision-related) in-
stances of P300s that follow the N400 (see, e.g., Chwilla,
Kolk, & Mulder, 2000), there is a debate as to whether
late positivities at the sentence level (P600/SPS) can also
be interpreted as members of the P300 family (e.g.,
Osterhout & Hagoort, 1999; Coulson, King, & Kutas,
1998). Perhaps more importantly for the question under
consideration, the possible occurrence of P300 effects
within the N400 time window has only been discussed
in the context of possible latency shifts within the P300.
Thus, it was suggested that apparent modulations of
N400 amplitude might be attributable solely to differ-
ences in P300 latency between the critical conditions
(e.g., Curran, Tucker, Kutas, & Posner, 1993; Polich,
1985). By contrast, the possibility of a true component
overlap between the N400 and the P300 within a single
time window has received virtually no attention in the
literature on relational semantic processing.
The literature does, however, provide some tentative
indications that N400 effects at the relational level may
not always be entirely due to reductions of the negativity
for predicted stimuli, but may rather result from the
co-occurrence of an N400 reduction and a P300 within
the same time range. Within the extensive range of ERP
studies discussed in Kutas and Federmeier’s (2000) re-
view of lexical–semantic processes during language com-
prehension, the effects engendered by a paradigm
involving antonym relations (e.g., black–white) stand
out. In contrast to all other studies, in which the mor-
phology of the ERP components appears fully compat-
ible with the standard functional interpretation that
the N400 component is reduced when a stimulus is pre-
dictable or preactivated, the presentation of the second
item in an antonym pair (Kutas & Iragui, 1998) appears
to go beyond a mere reduction of the N400 in inducing
a clear positive deflection within the same time range
(for similar findings, see Bentin, 1987, and Federmeier
& Kutas, 1999). Kutas and Iragui (1998) calculated differ-
ences between the ERP responses to antonyms (white in
the context of ‘‘The opposite of black’’) and unrelated
words (peach in the same context) and interpreted the
maximal peak of the resulting difference wave as the
N400 effect. This effect was interpreted functionally as
reflecting semantic congruity and was used as an elec-
trophysiological marker of age-related differences in se-
mantic processing. In a similar manner, Bentin (1987)
describes the ‘‘negativity related to the expected anto-
nym’’ in his study as ‘‘almost nonexistent.’’ However,
an alternative interpretation of the ERP pattern ob-
served for antonyms may be that it results from the
co-occurrence of two logically independent processes,
namely, (a) a modulation of the degree of semantic as-
sociation between two words and (b) the predictability
of the antonym (e.g., white) as a target item following
a prime such as ‘‘the opposite of black.’’ Whereas the
former aspect engenders an N400 reduction for the an-
tonym, the latter process gives rise to a P300 effect
within the same time range as the N400. The superpo-
sition of these two effects could yield a surface pattern
with the appearance of an ‘‘extremely reduced’’ N400
effect. If true, this alternative interpretation would lead
to profound consequences for the interpretation of the
results in question. With respect to the Kutas and Iragui
study, for example, it raises the possibility that the age-
related changes observed in the surface N400 effect
might, in fact, stem from changes in the P300 (e.g.,
Yamaguchi & Knight, 1991) rather than in the N400. This
would, of course, call for a fundamentally different inter-
pretation of the neurophysiological effects of aging ob-
served in the study in question.
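The superposition account described above can be illustrated with a toy simulation. The following is only a sketch under strong assumptions: each component is modeled as a Gaussian bump, and every amplitude, latency, and width is a hypothetical value chosen for illustration, not an estimate from any data set. It shows how a graded N400 combined with a P300 that is present only for antonyms could produce a net positive-going waveform for antonyms even though the underlying N400 is merely reduced.

```python
import math


def gaussian(t_ms, peak_ms, width_ms, amplitude_uv):
    """Gaussian-shaped bump: a crude stand-in for a single ERP component."""
    return amplitude_uv * math.exp(-((t_ms - peak_ms) ** 2) / (2 * width_ms ** 2))


# Hypothetical component amplitudes in microvolts (illustration only).
# The N400 is graded by semantic relatedness; the P300 is assumed to be
# elicited only by the predicted antonym.
N400_AMPLITUDE_UV = {"unrelated": -6.0, "related": -4.0, "antonym": -2.0}
P300_AMPLITUDE_UV = {"unrelated": 0.0, "related": 0.0, "antonym": 5.0}


def surface_erp(condition, t_ms):
    """Sum of the two assumed components at time t_ms (microvolts)."""
    n400 = gaussian(t_ms, peak_ms=400, width_ms=60,
                    amplitude_uv=N400_AMPLITUDE_UV[condition])
    p300 = gaussian(t_ms, peak_ms=350, width_ms=60,
                    amplitude_uv=P300_AMPLITUDE_UV[condition])
    return n400 + p300


if __name__ == "__main__":
    for condition in ("unrelated", "related", "antonym"):
        print(condition, round(surface_erp(condition, 400), 2))
```

In this sketch the difference between the unrelated and related conditions is unaffected by the assumed P300, whereas the antonym waveform reflects both components at once.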
The Present Studies
Although the account outlined above appears to provide
an appealing solution to the differing morphologies for
N400 modulations in various contexts, the two compo-
nents that appear to be involved (N400, P300) are dif-
ficult to disentangle on the basis of previous findings.
In the present article, we report a series of three ERP
studies designed to provide a differential modulation of
the functionally distinct processes involved in the pro-
cessing of antonym relations. In particular, we sought to
dissociate between influences of semantic relatedness,
which should be independent of the particular experi-
mental environment, and processes related to predict-
ability, which can be assumed to differ as a function of
the experimental context (e.g., mode of presentation,
task, etc.). To this end, we examined the processing of
antonym relations in three different experimental set-
tings, thus keeping semantic relatedness constant while
varying the degree of predictability of the second ele-
ment on the basis of different task demands and exper-
imental environments.
In Experiment 1, participants read sentences of the
form The opposite of black is white/yellow/nice and
were asked to perform a sensicality judgment. The ex-
perimental design thereby provided a parametric mod-
ulation of semantic association (antonyms > within-
category violations such as yellow > across-category
violations such as nice) in a maximally restricted—and
therefore predictive—setting. Thus, the combination of
the particular task and the sentence context induced a
strong and unique prediction for the sentence-final an-
tonym. Experiment 1 therefore aimed to replicate Kutas
and Iragui’s (1998) finding of a graded N400 effect for
antonym relations that includes a clear positive deflec-
tion for the antonym condition (see also Figure 1B in
Kutas & Federmeier, 2000). In addition, a parametric
semantic manipulation was introduced to allow for a
separation between true effects of semantic association
and effects related to the predictability of the second
item in an antonym pair.
Experiment 2 examined the processing of the identical
parametric semantic modulation in a maximally unre-
stricted setting with respect to task- and environment-
related predictability. To this end, the critical stimuli
(black–white/yellow/nice) were presented as word pairs
and interspersed with word pairs containing a pseudo-
word in first or second position. In this study, participants
were asked to perform a lexical decision task (i.e., to
decide whether a pseudoword had occurred within the
word pair they had just read), thereby rendering the
semantic modulation entirely irrelevant to successful task
performance.
Finally, Experiment 3 aimed to induce an intermediate
degree of predictability with respect to the processing of
the critical antonym pairs. Thus, the same mode of pre-
sentation as in Experiment 2 was chosen (i.e., presenta-
tion of single words as word pairs), albeit without the
inclusion of additional material containing pseudowords.
Moreover, an antonymy decision task was used, thereby
again rendering the semantic relation between the words
relevant for task performance. Nonetheless, because the
word pairs were not embedded in a restrictive sentence
frame, we reasoned that the degree of predictive pro-
cessing would be reduced in this study as opposed to
Experiment 1 (i.e., here, predictions can be based only on
the task as opposed to task plus sentence frame).
On the basis of these manipulations, and assuming
that the degree of semantic association and the degree
of predictability are indeed functionally and neurophysi-
ologically independent of one another, we can formulate
two basic hypotheses. First, the degree of predictability
for the second word in an antonym relation should give
rise to modulations of the P300. This component should
be observable in the same time range as the N400 and
should vary in strength across the three experiments
such that it is strongest in Experiment 1, weakest in
Experiment 2, and of intermediate strength in Experi-
ment 3. Second, as the degree of semantic association
is held constant across the three experiments, we expect
to observe a modulation of the N400 that is indepen-
dent of task and experimental environment. As the ‘‘N400
reduction’’ for the antonyms may co-occur with a posi-
tive deflection (P300) in the same time range (see Hypo-
thesis 1), the best measure for the effects of semantic
relatedness across experiments should be the relative
difference in the N400 time window between the within-
and across-category violations. Finally, in view of pre-
vious findings on the processing of (target) pseudowords,
the pseudowords in Experiment 2 can be expected to
elicit an increased N400 effect followed by a late positivity.
Recognizing a pseudoword as a nonexistent word
requires lexical search processes, thereby increasing N400
amplitude. The task-related P300 effect should therefore
be delayed until the completion of this search and, there-
by, to the post-N400 time range (Bentin, McCarthy, &
Wood, 1985).
METHODS
Materials
All experiments used three critical conditions: antonym
pairs, pairs of related words and pairs of unrelated words.
In Experiment 1, these critical stimuli were embedded
in a fixed sentence context of the form The opposite of
X is Y, with X and Y instantiating the manipulation of
interest (see Table 1). In Experiments 2 and 3, the critical
stimuli were presented as word pairs. Table 1 further
shows that the critical stimuli did not differ in frequency,
F(2,237) = 1.43, p > .24, or word length, F(2,237) = 1.70,
p > .18.
Eighty sets (triplets) of the three conditions (anto-
nyms, related, unrelated) were created, thus resulting in
a total of 240 critical stimuli. To balance the number of
‘‘yes’’ and ‘‘no’’ responses in Experiments 1 and 3, four
lists of 160 critical items were constructed (consisting
of 80 antonym pairs and 40 pairs for each of the two
mismatch conditions). Thus, each list contained all 80
antonym relations, whereas only half of the total materi-
als were used for each list for the other two conditions.
In Experiment 2, four lists of 120 critical items were
constructed (40 pairs from each condition), as there was
no need for a 1:1 ratio of antonyms and nonantonyms
in view of the different task (see below). Rather, the
critical materials for Experiment 2 were supplemented
by 160 additional word pairs containing a pseudoword in
either first (80 pairs) or second position (80 pairs). These
were spread across the four lists of critical items such that
each list always contained 40 pairs with a pseudoword in
the first position and 40 pairs with a pseudoword in the
second position. Thus, each list in Experiment 2 com-
prised a total of 200 word pairs.
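The list composition described above can be checked with a few lines of arithmetic. This is a minimal sketch: the dictionary keys and function names are hypothetical labels, and the counts are those given in the text.

```python
# Number of word pairs per presentation list, as described in the Materials section.
LIST_COMPOSITION = {
    "experiments_1_and_3": {"antonym": 80, "related": 40, "unrelated": 40},
    "experiment_2": {
        "antonym": 40,
        "related": 40,
        "unrelated": 40,
        "pseudoword_first": 40,
        "pseudoword_second": 40,
    },
}


def list_size(design):
    """Total number of word pairs in one list of the given design."""
    return sum(LIST_COMPOSITION[design].values())


def antonym_vs_nonantonym(design):
    """(antonym pairs, related + unrelated pairs) in one list."""
    counts = LIST_COMPOSITION[design]
    return counts["antonym"], counts["related"] + counts["unrelated"]


def pseudoword_present_vs_absent(design):
    """(pairs containing a pseudoword, pairs without one) in one list."""
    counts = LIST_COMPOSITION[design]
    present = counts.get("pseudoword_first", 0) + counts.get("pseudoword_second", 0)
    return present, list_size(design) - present


if __name__ == "__main__":
    for design in LIST_COMPOSITION:
        print(design, list_size(design), antonym_vs_nonantonym(design),
              pseudoword_present_vs_absent(design))
```

The 1:1 ratio of antonyms to nonantonyms holds only for the lists used in Experiments 1 and 3, where the judgment depends on the antonym relation.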
Questionnaire Pretest
To ensure that the critical items indeed differed with
respect to antonymy and semantic relatedness in the de-
sired way, we conducted a questionnaire pretest.
Twenty-two students (13 women, 9 men) of the Uni-
versity of Leipzig participated in the questionnaire study.
Participants were randomly assigned to two equally sized
groups (Group A: 5 women; age 25.5 [2.62] years, mean
[SD]; range 21–29 years; Group B: 8 women; age 23.4
[3.59] years, mean [SD]; range 17–28 years). There
was no significant difference with regard to age between
Groups A and B, F(1,10) = 2.06, p < .19.
Each questionnaire comprised 80 randomized word
pairs, of which 40 were antonym pairs (ANT), 20 were
related category word pairs (REL), and 20 were nonre-
lated category word pairs (NON). Four different versions
were constructed such that no single word was repeated
within a list. Lists were identical for Groups A and B and
the questionnaires differed solely in instruction.
Both groups were instructed to read a given word pair
carefully and to then rate the relationship between the
two words on a 7-point scale by circling a number be-
tween 1 and 7. Whereas participants in Group A were
instructed to judge the degree of antonymy, that is, the
degree to which a given word pair can be considered an
antonym pair (1 = optimal antonym, 7 = not at all an
antonym), participants in Group B were asked to judge
the degree of relatedness between the two words (1 =
very strongly related, 7 = completely unrelated).
The mean rating scores for both groups are shown in
Table 1. A repeated measures analysis of variance (ANOVA)
including the within-participants factor Condition (anto-
nym vs. related vs. unrelated) and the between-participants
factor Group (antonymy judgment vs. relatedness judgment)
revealed main effects of Group, F_1(1,10) = 7.29, p < .02;
F_2(1,79) = 35.56, p < .01, and Condition,
F_1(2,20) = 448.06, p < .01; F_2(2,158) = 1327.78, p < .01,
and a significant Group × Condition interaction,
F_1(2,20) = 31.26, p < .01; F_2(2,158) = 83.56, p < .01.
As the main interest in the task manipulation within
the pretest lay in examining whether the rating differ-
ence between the antonym pairs and the related word
pairs would be modulated by the task requirements, we
performed an additional analysis involving the within-
participants factor Category (antonyms vs. related word
pairs) and the between-participants factor Group. This
analysis also revealed a significant Category × Group
interaction, F_1(1,10) = 41.55, p < .01; F_2(1,79) = 153.11,
p < .01. Nevertheless, resolving the interaction by Group
showed that the difference between antonyms and related
word pairs reached significance for both Group A,
F_1(1,10) = 231.41, p < .01; F_2(1,79) = 585.42, p < .01,
and Group B, F_1(1,10) = 8.18, p < .02; F_2(1,79) = 55.73,
p < .01. An examination of the F values suggests that the
significant Category × Group interaction was due to a
much smaller difference between the two categories under
the instructions emphasizing relatedness than under
the antonymy judgment instruction.
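Each effect above is reported with two F values. By the common convention in psycholinguistics these are by-participants and by-items analyses; the text does not spell this out, so that reading is an assumption here. The sketch below shows only the aggregation step that precedes such analyses: the same ratings are averaged once per participant and once per item. The record layout and the toy ratings are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate(ratings, unit):
    """Average ratings within each (unit, condition) cell.

    ratings: iterable of dicts with keys 'participant', 'item',
             'condition', and 'rating' (hypothetical record layout).
    unit:    'participant' for a by-participants analysis,
             'item' for a by-items analysis.
    """
    cells = defaultdict(list)
    for record in ratings:
        cells[(record[unit], record["condition"])].append(record["rating"])
    return {cell: mean(values) for cell, values in cells.items()}


# Toy ratings on the 7-point scale (illustration only, not the pretest data).
toy_ratings = [
    {"participant": "p1", "item": "i1", "condition": "ANT", "rating": 1},
    {"participant": "p1", "item": "i2", "condition": "ANT", "rating": 2},
    {"participant": "p2", "item": "i1", "condition": "ANT", "rating": 1},
    {"participant": "p2", "item": "i2", "condition": "ANT", "rating": 3},
    {"participant": "p1", "item": "i3", "condition": "REL", "rating": 5},
    {"participant": "p2", "item": "i3", "condition": "REL", "rating": 4},
]

by_participant = aggregate(toy_ratings, "participant")
by_item = aggregate(toy_ratings, "item")
```

The by-participants cells then enter an analysis with participants as the random factor, and the by-items cells enter one with items as the random factor.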
In summary, the results of the pretest confirmed that
our stimulus materials differed along the two critical
dimensions antonymy and relatedness. In addition, they
Table 1. Example Stimuli for Experiments 1–3

Condition      Example                       Length, Mean No. of   Frequency, Mean (SD)     Relatedness,   Antonymy,
                                             Syllables (SD)                                 Mean (SD)      Mean (SD)
A. Antonyms    Schwarz–weiss (black–white)   1.85 (0.80)           34,781.99 (53,684.45)    2.10 (0.83)    1.21 (0.19)
B. Related     Schwarz–gelb (black–yellow)   2.02 (0.75)           19,303.10 (53,631.92)    3.05 (0.77)    4.80 (0.73)
C. Nonrelated  Schwarz–nett (black–nice)     1.78 (0.60)           15,731.45 (107,226.5)    6.35 (0.61)    6.93 (0.16)

In Experiment 1, the critical word pairs were presented in a fixed sentence context (Das Gegenteil von X ist Y, ‘‘The opposite of X is Y’’).
Experiments 2 and 3 used the same critical word pairs in the absence of a sentence context. The third and fourth columns list the mean length and
frequency of the critical second word of the pairs. Frequencies consisted of raw occurrence frequencies within the corpus of the Project Deutscher
Wortschatz (wortschatz.uni-leipzig.de). The final two columns show the results of a norming study (n = 22) designed to test the degree of
relatedness and antonymy of the critical materials (scale 1 to 7; see the section Questionnaire Pretest).
showed that the perceived difference between the an-
tonyms and the related word pairs is larger under task
conditions emphasizing the antonym relation.
Participants
Seventeen undergraduate students from the University
of Marburg participated in each of the three experi-
ments (Experiment 1: 13 women, mean age 23.7 years,
range 20–28 years; Experiment 2: 11 women; mean age
24.6 years; range 20–31 years; Experiment 3: 11 women;
mean age 23.2 years, range: 20–29 years). No participant
took part in more than one of the three studies. All
participants were right-handed (as assessed by an adap-
ted and modified German version of the Edinburgh
Handedness Inventory; Oldfield, 1971), were monolin-
gual native speakers of German, and had normal or
corrected-to-normal vision.
Procedure
In Experiment 1, sentences were presented visually in
the center of a computer screen in a word-by-word man-
ner. Each trial began with the presentation of an as-
terisk (2000 msec) in order to fixate participants’ eyes
at the center of the screen and to alert them to the
upcoming presentation of the sentence. Single words
were presented for 350 msec with an interstimulus
interval (ISI) of 200 msec. The presentation of a sen-
tence was followed by 650 msec of blank screen, after
which participants were required to complete an anto-
nym sentence verification task (signaled by the presen-
tation of a question mark). This task involved judging
whether the proposition expressed by the sentence was
right or wrong. Subjects responded by pressing the left
or right mouse button for yes or no (maximal reaction
time, 3000 msec). After a participant’s reaction, there was
an intertrial interval (ITI) of 2250 msec before the next
trial started.
In Experiments 2 and 3, the critical word pairs were
presented out of sentence context. The first word (prime)
was presented for 400 msec with an ISI of 400 msec,
whereas the second word (target) was presented for
350 msec with an ISI of 650 msec (preceding the task).
In Experiment 2, participants were required to complete
a lexical decision task, which involved judging whether
one of the two words was a pseudoword or not. In Experi-
ment 3, participants judged whether the word pairs were
antonyms or not. In both cases, the left and right mouse
buttons corresponded to yes or no, respectively, and the
reaction times were restricted to 3000 msec. Between the
trials there was an ITI of 1400 msec.
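The presentation parameters above determine when the critical word appears within a trial. The following sketch computes these onsets from the stated durations. It assumes that X and Y are single words, so that the German sentence frame Das Gegenteil von X ist Y comprises six words; the function names are hypothetical.

```python
# Durations in msec, as given in the Procedure section.
EXP1_WORD_MS, EXP1_ISI_MS = 350, 200
PRIME_MS, PRIME_ISI_MS = 400, 400
TARGET_MS, TARGET_ISI_MS = 350, 650


def exp1_critical_word_onset(n_words):
    """Onset of the sentence-final word relative to the onset of the first word."""
    return (n_words - 1) * (EXP1_WORD_MS + EXP1_ISI_MS)


def pair_target_onset():
    """Onset of the target relative to prime onset (Experiments 2 and 3)."""
    return PRIME_MS + PRIME_ISI_MS


def pair_task_prompt_onset():
    """Onset of the task prompt relative to prime onset (Experiments 2 and 3)."""
    return pair_target_onset() + TARGET_MS + TARGET_ISI_MS


if __name__ == "__main__":
    # Six words assumed for 'Das Gegenteil von X ist Y' with single-word X and Y.
    print(exp1_critical_word_onset(6))
    print(pair_target_onset())
    print(pair_task_prompt_onset())
```

Under these assumptions, the critical word follows the start of the sentence by several seconds in Experiment 1 but follows the prime by less than one second in Experiments 2 and 3.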
Participants were asked to avoid movements and
eyeblinks during the presentation of the sentences/word
pairs. All experimental sessions began with a short train-
ing session followed by four experimental blocks, be-
tween which the participants took short breaks. Each
experimental session lasted approximately 2 hr (includ-
ing electrode preparation).
Electroencephalograms (EEGs) were recorded by
means of 27 sintered Ag/AgCl electrodes fixed at the
scalp by means of an elastic cap (Easy Cap International,
Herrsching-Breitbrunn, Germany). The ground electrode
was positioned at C2. Recordings were referenced to the
left mastoid, but re-referenced off-line to linked mastoids.
Electrooculograms (EOGs) were monitored by means of
electrodes placed at the outer canthus of each eye for the
horizontal EOG and above and below the participant’s
left eye for the vertical EOG. Electrode impedances were
kept below 5 kΩ. All EEG and EOG channels were
amplified using a BrainVision BrainAmp amplifier (time
constant, 0.9 sec; high cutoff, 70 Hz) and recorded with a
digitization rate of 250 Hz.
Data Analysis
For each experiment, average ERPs were calculated per
condition per participant from 334 msec before the on-
set of the critical stimulus item (i.e., the second word
of the critical stimulus pairs) to 1000 msec after onset,
before grand averages were computed over all partici-
pants. Trials for which the task was not performed cor-
rectly were excluded, as were trials containing ocular
or other artifacts. For the statistical analysis, multivariate
analyses of variance were computed using the condition
factor Type (antonyms vs. related vs. unrelated) and the
topographical factors Region of Interest (ROI) for the
lateral and midline electrodes. Lateral regions of interest
were defined as follows: left anterior (F7, F3, FC5), left
posterior (P7, P3, CP5), right anterior (F8, F4, FC6), and
right posterior (P8, P4, CP6). For the midline electrodes,
each electrode (FZ, CZ, PZ, and OZ) was treated as an
ROI of its own. All statistical analyses were carried out
in a hierarchical manner; that is, only significant inter-
actions ( p < .05) were resolved. To avoid Type I errors
resulting from violations of sphericity, the correction
proposed by Huynh and Feldt (1970) was applied. The
probability level for planned comparisons was adjusted
according to a modified Bonferroni procedure (Keppel,
1991).
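The region-of-interest scheme described above can be sketched as a small data structure plus an averaging function. The text does not state which amplitude measure entered the analyses, so the use of mean amplitude per time window is an assumption, as are the function name and the input layout (one list of amplitudes per electrode, aligned with a list of time points).

```python
from statistics import mean

# ROI definitions taken from the Data Analysis section.
LATERAL_ROIS = {
    "left_anterior": ("F7", "F3", "FC5"),
    "left_posterior": ("P7", "P3", "CP5"),
    "right_anterior": ("F8", "F4", "FC6"),
    "right_posterior": ("P8", "P4", "CP6"),
}
# Each midline electrode is treated as an ROI of its own.
MIDLINE_ROIS = {name: (name,) for name in ("FZ", "CZ", "PZ", "OZ")}


def roi_mean_amplitude(erp, times_ms, electrodes, window_ms):
    """Mean amplitude over the given electrodes and all samples in the window.

    erp:        dict mapping electrode name to a list of amplitudes (microvolts)
                aligned with times_ms (hypothetical input layout).
    times_ms:   sample times in msec relative to critical word onset.
    electrodes: electrode names belonging to one ROI.
    window_ms:  (start, end) in msec, both inclusive.
    """
    start, end = window_ms
    indices = [i for i, t in enumerate(times_ms) if start <= t <= end]
    return mean(erp[e][i] for e in electrodes for i in indices)


# Synthetic example: 250 Hz sampling gives one sample every 4 msec.
times = list(range(0, 1000, 4))
toy_erp = {name: [float(t) for t in times] for name in LATERAL_ROIS["left_anterior"]}
toy_value = roi_mean_amplitude(toy_erp, times, LATERAL_ROIS["left_anterior"], (240, 440))
```

One such value per participant, condition, and ROI would then form a cell of the statistical design.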
RESULTS
Experiment 1: Sentence Context
Behavioral Data
The sensicality judgment task yielded the following mean
error rates and reaction times, respectively (standard de-
viations in parentheses): antonyms—1.54 (1.84), 478.38
(143.75); related—5.88 (3.30), 533.44 (166.62); unrelated—
0.29 (0.80), 440.45 (119.05).
For the error rates, a repeated measures ANOVA revealed
a main effect of Type, F_Subj(2,32) = 28.75, p < .001;
F_Item(2,158) = 22.05, p < .001. Planned comparisons
showed a significant difference between all pairs of conditions:
antonyms versus unrelated word pairs, F_Subj(1,16) = 6.48,
p < .03; F_Item(1,79) = 6.38, p < .02; antonyms versus
related word pairs, F_Subj(1,16) = 20.19, p < .001;
F_Item(1,79) = 17.78, p < .001; and related versus unrelated
word pairs, F_Subj(1,16) = 50.23, p < .001;
F_Item(1,79) = 37.06, p < .001.
Analysis of the reaction times again showed a main effect
of Type, F_Subj(2,32) = 10.60, p < .001;
F_Item(2,158) = 9.40, p < .003. The planned comparisons
showed the following effects: antonyms versus unrelated
word pairs, F_Subj(1,16) = 11.86, p < .004;
F_Item(1,79) = 8.55, p < .006; antonyms versus related
word pairs, F_Subj(1,16) = 5.03, p < .04;
F_Item(1,79) = 7.39, p < .01; and related versus
unrelated word pairs, F_Subj(1,16) = 16.85, p < .002;
F_Item(1,79) = 11.32, p < .002.
In summary, the behavioral data showed the lowest
error rates and fastest reaction times for the unrelated
condition and the highest error rates and slowest reaction
times for the related condition. The antonym condition
lay between the other two conditions in both cases.
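The degrees of freedom reported for the behavioral analyses follow directly from the design: with n units (participants or items) and k conditions, a one-way repeated measures ANOVA has k − 1 and (k − 1)(n − 1) degrees of freedom. The sketch below is a textbook implementation under that design, not the analysis software used for the study; it applies no sphericity correction, and the function names are hypothetical.

```python
def rm_df(n_units, n_conditions):
    """Degrees of freedom (effect, error) for a one-way repeated measures ANOVA."""
    return n_conditions - 1, (n_conditions - 1) * (n_units - 1)


def rm_anova_oneway(data):
    """One-way repeated measures ANOVA without sphericity correction.

    data: list of rows, one per unit (participant or item), each row holding
          one value per condition.
    Returns (F, df_effect, df_error).
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    condition_means = [sum(row[j] for row in data) / n for j in range(k)]
    unit_means = [sum(row) / k for row in data]

    ss_condition = n * sum((m - grand) ** 2 for m in condition_means)
    ss_units = k * sum((m - grand) ** 2 for m in unit_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    # Residual variation after removing condition and unit effects.
    ss_error = ss_total - ss_condition - ss_units

    df_effect, df_error = rm_df(n, k)
    f_value = (ss_condition / df_effect) / (ss_error / df_error)
    return f_value, df_effect, df_error


if __name__ == "__main__":
    # 17 participants or 80 items, three conditions.
    print(rm_df(17, 3), rm_df(80, 3))
    # Pairwise planned comparisons involve two conditions.
    print(rm_df(17, 2), rm_df(80, 2))
```

With 17 participants and 80 items, these formulas reproduce the degrees of freedom of the by-participants and by-items tests reported above, both for the main effect of Type and for the pairwise planned comparisons.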
ERP Data
Grand average ERPs at the position of the critical (sen-
tence final) word are shown in Figure 1.
As shown in Figure 1, the three conditions differed from
one another in two time windows. In the N400 time win-
dow, unrelated stimuli engendered more negative-going
ERP waveforms than did related stimuli, which, in turn,
engendered more negative ERP responses than did the
antonyms. Furthermore, and as predicted, antonyms elic-
ited a pronounced parietal positive peak within the N400
time window. Following the N400, the two nonantonym
conditions elicited broadly distributed positivities. For the
related condition, this effect had an anterior distribution,
whereas in the unrelated condition, it was more posterior.
On the basis of visual inspection of the data, the
statistical analyses were conducted in two time windows:
240–440 (N400) and 500–750 (late positivity) msec.
Repeated measures ANOVAs for the N400 time window
revealed highly reliable main effects of Type and
Type × ROI interactions for both lateral electrode sites, Type:
F(2,32) = 51.83, p < .001; Type × ROI: F(6,96) = 21.83,
p < .001, and midline electrode sites, Type: F(2,32) =
64.63, p < .001; Type × ROI: F(6,96) = 11.96, p < .001.
Further analyses within each ROI showed reliable ef-
fects of Type in all regions: minimal F(2,32) = 4.95
for the left anterior ROI; maximal F(2,32) = 101.24 at
Pz. Planned comparisons for antonyms versus unrelated
word pairs also reached significance in all regions, mini-
mal F(1,16) = 4.48 for the left anterior ROI; maximal
Figure 1. Grand average ERPs for the antonyms and related and unrelated conditions in Experiment 1 (onset at the vertical bar). Negativity is plotted upward.
F(1,16) = 143.84 at Pz, as did the planned compari-
sons for related versus unrelated word pairs, minimal
F(1,16) = 11.51 at Fz; maximal F(1,16) = 84.60 for the
right posterior ROI. The planned comparisons between
antonyms and related word pairs reached significance
in all but two regions: left anterior ROI and Fz; minimal
significant F(1,16) = 7.99 for the right anterior ROI and
maximal F(1,16) = 79.87 at Pz.
Within the time window for the late positivity, the
statistical analysis again revealed significant main effects of
Type and Type × ROI interactions at lateral, Type: F(2,32) = 25.82,
p < .001; Type × ROI: F(6,96) = 12.68, p < .001, and
midline sites, Type: F(2,32) = 21.51, p < .001; Type ×
ROI: F(6,96) = 9.06, p < .001. The analyses of Type
within each ROI yielded reliable effects in all regions: minimal
F(2,32) = 11.96 at Oz; maximal F(2,32) = 30.72 for the left
posterior ROI. The planned comparisons showed signifi-
cant effects in all regions for antonyms versus unrelated
words, minimal F(1,16) = 11.50 for the right anterior
ROI and maximal F(1,16) = 52.97 for the left posterior
ROI, as well as for antonyms versus related words, minimal
F(1,16) = 5.52 at Oz; maximal F(1,16) = 27.41 for the
right posterior ROI. The planned comparisons for related
versus unrelated words were significant at all but three
sites: left and right anterior ROI and Fz; minimal signifi-
cant F(1,16) = 8.66 for the right posterior ROI and maxi-
mal F(1,16) = 21.05 for the left posterior ROI.
To summarize, Experiment 1 showed a parametric
modulation within the N400 time window, with ERPs to
unrelated word pairs more negative-going than those
to related word pairs, which, in turn, were more nega-
tive than those elicited by antonyms. Moreover, the
morphology of the waveforms showed a positive peak
for antonyms within this time window. This pattern
therefore replicates previous results in the domain of
antonym processing (Kutas & Federmeier, 2000; Kutas &
Iragui, 1998). In addition, and in contrast to previous
studies, the two nonantonym conditions engendered
late positivities in comparison to the antonym stimuli.
Although this late effect had a posterior maximum for the
unrelated condition, its topographical distribution was
more anterior for the related stimuli.
Experiment 2: Word Pairs with a Lexical
Decision Task
Behavioral Data
The mean error rates and reaction times, respectively,
for the lexical decision task were as follows (standard
deviations in parentheses): antonyms—0.73 (1.47),
467.88 (204.71); related—0.88 (1.23), 506.96 (251.24);
unrelated—1.62 (2.49), 524.94 (245.58); pseudowords—
2.06 (2.54), 519.56 (288.45).
For the error rates, a repeated measures ANOVA revealed
no main effect of Type, F_Subj(3,48) = 1.55, p > .24;
F_Item(3,237) = 2.16, p > .13.
By contrast, the analysis of the reaction times showed
a main effect of Type, F_Subj(3,48) = 4.15, p < .03;
F_Item(3,237) = 4.80, p < .004. The planned comparisons
yielded the following effects: antonyms versus unrelated
word pairs, F_Subj(1,16) = 11.52, p < .005; F_Item(1,79) =
15.34, p < .001; antonyms versus related word pairs,
F_Subj(1,16) = 7.04, p < .02; F_Item(1,79) = 6.45, p < .02;
related versus unrelated word pairs, F_Subj(1,16) = 3.30,
p < .09; F_Item(1,79) = 1.82, p < .19; antonyms versus
pseudowords, F_Subj(1,16) = 3.96, p < .07; F_Item(1,79) =
7.03, p < .02; related versus pseudowords, F_Subj(1,16) < 1;
F_Item(1,79) < 1; unrelated versus pseudowords,
F_Subj(1,16) < 1; F_Item(1,79) < 1.
In summary, the behavioral data show no difference
in error rates. However, the antonym condition yielded
the fastest reaction times, whereas pseudowords and
the unrelated condition showed the slowest reaction
times. The reaction times for the related condition lay
in-between these two extremes.
ERP data
Figures 2 and 3 show grand average ERPs at the position
of the critical second word for the three conditions ex-
amined in Experiment 1 and antonyms versus unrelated
word pairs versus pseudowords, respectively. From Fig-
ure 2, it is apparent that the gradation of the N400
observed in Experiment 1 (unrelated > related > anto-
nyms) was also observable in Experiment 2. In contrast to
Experiment 1, however, the antonym condition did not
give rise to a positive deflection within the N400 time
window. In addition, no late positive effects were ob-
served in the comparison of the three critical conditions.
With respect to the pseudowords in second position,
Figure 3 indicates that these did not differ from the
unrelated condition within the N400 time window, that
is, pseudowords and unrelated words evoked an N400 of
similar amplitude in comparison to the antonyms. How-
ever, the two conditions differ strikingly in a later time
window: Only pseudowords engendered a late positivity
following the N400.
All of these descriptive observations were confirmed
by the statistical analysis of the data. ANOVAs for the
N400 time window (270–470 msec) showed highly reli-
able main effects of Type at both lateral, F(3,48) = 20.05,
p < .001, and midline sites, F(3,48) = 20.87, p < .001,
and a significant Type × ROI interaction for the midline
electrodes, F(9,144) = 4.37, p < .005. The resolution
of the main effect of Type at lateral electrode sites re-
vealed significant effects for all contrasts but one: unre-
lated versus pseudowords; minimal significant F(1,16) =
11.66 for the contrast between related word pairs and
pseudowords, maximal F(1,16) = 36.15 for the con-
trast between antonyms versus unrelated word pairs.
Further analyses of Type in each ROI for the midline
electrodes showed reliable effects at all sites: minimal
F(9,144) = 10.07 at Fz and maximal F(9,144) = 20.02
Roehm et al. 1265
at Pz. The planned comparisons between antonyms and
unrelated words reached significance at all sites, mini-
mal F(1,16) = 18.32 at Fz, maximal F(1,16) = 40.47 at
Pz, as did the planned comparisons between antonyms
and related word pairs, minimal F(1,16) = 7.15 at Fz,
maximal F(1,16) = 43.61 at Pz; the planned comparisons
between related and unrelated word pairs, minimal
F(1,16) = 11.02 at Pz, maximal F(1,16) = 12.34 at Cz;
and the planned comparisons between antonyms and
pseudowords, minimal F(1,16) = 10.41 at Fz, maximal
F(1,16) = 41.16 at Pz. The planned comparison between
related words and pseudowords only reached signifi-
cance at Pz, F(1,16) = 6.11, whereas the comparison
between unrelated words and pseudowords was not sig-
nificant at any electrode. ANOVAs for the late positivity
time window (520–770 msec) revealed significant effects
of Type and Type × ROI at both lateral, Type: F(3,48) =
12.83, p < .001; Type × ROI: F(9,144) = 5.61, p < .005,
and midline electrodes, Type: F(3,48) = 11.21, p < .005;
Type × ROI: F(9,144) = 6.02, p < .001. Analyses of Type
within each ROI showed reliable effects in all regions,
minimal F(9,144) = 4.46 for the right anterior ROI,
maximal F(9,144) = 16.64 for the right posterior ROI.
The planned comparisons for antonyms versus unrelat-
ed words and antonyms versus related words showed no
significant effects, whereas the planned comparison for
related versus unrelated words only showed a reliable
difference at Oz, F(1,16) = 6.68. In contrast, the planned
comparisons for antonyms versus pseudowords revealed
reliable effects at all sites, minimal F(1,16) = 5.79 at Oz,
maximal F(1,16) = 17.43 for the right posterior ROI, as
did the planned comparisons for related words versus
pseudowords, minimal F(1,16) = 5.82 for the right
anterior ROI, maximal F(1,16) = 21.18 for the right
posterior ROI, and those for unrelated words versus
pseudowords, minimal F(1,16) = 8.56 at Oz, maximal
F(1,16) = 23.60 for the right posterior ROI.
In summary, Experiment 2 revealed two major differ-
ences in comparison to Experiment 1. On the one hand,
there was no early positive deflection for the antonym
condition. On the other hand, the two nonantonym
nonpseudoword conditions did not show a late positiv-
ity in comparison to the antonyms. By contrast, the
pseudowords in second position, which showed a very
similar N400 effect to the nonrelated word pairs, engen-
dered a late positivity.
The presence or absence of positivity effects in our
critical conditions thus appears to depend crucially
upon the experimental environment in which the stim-
uli are presented (i.e., constraining sentence context
plus a task focusing on the antonym relation in Exper-
iment 1 versus unconstraining word pairs plus a lexical
Figure 2. Grand average ERPs
for antonyms and related and
unrelated category conditions
in Experiment 2 (onset at the
vertical bar). Negativity is
plotted upward.
decision task in Experiment 2). As both the task and
the experimental setting (i.e., constraining versus non-
constraining context) worked in the same direction in
these two experiments, the question arises of how the
processing system would respond to an intermediary
situation. This issue was examined in Experiment 3 by
presenting word pairs with an antonymy judgment task.
If the difference between Experiments 1 and 2 was wholly
determined by either the sentence context or the exper-
imental task, we should expect the results of Experi-
ment 3 to pattern either with Experiment 1 (if only the
task is decisive) or with Experiment 2 (if the sentence
context is the crucial parameter).
Experiment 3: Word Pairs with an
Antonymy Judgment
Behavioral Data
The antonymy judgment task yielded the following
mean error rates and reaction times, respectively (stan-
dard deviations in parentheses): antonyms—1.69 (1.87),
457.25 (153.04); related—4.56 (4.78), 505.85 (146.72);
unrelated—0.29 (0.83), 444.14 (135.23).
The statistical analysis of the error rates for the antonym
judgment task revealed a significant main effect of
Type, F_Subj(2,32) = 10.34, p < .001; F_Item(2,158) = 14.76,
p < .001. Planned comparisons showed significant differences
between all pairs of conditions: antonyms versus
unrelated word pairs, F_Subj(1,16) = 8.14, p < .02;
F_Item(1,79) = 4.16, p < .05; antonyms versus related
word pairs, F_Subj(1,16) = 6.03, p < .03; F_Item(1,79) =
12.63, p < .002; related versus unrelated word pairs,
F_Subj(1,16) = 15.98, p < .002; F_Item(1,79) = 17.92,
p < .001.
The analysis of the reaction times again showed a significant
main effect of Type, F_Subj(2,32) = 15.77, p < .001;
F_Item(2,158) = 11.29, p < .001. Planned comparisons
revealed the following effects: antonyms versus unrelated
word pairs (p_Subj > .3; F_Item < 1); antonyms versus related
word pairs, F_Subj(1,16) = 13.93, p < .003; F_Item(1,79) =
10.96, p < .002; related versus unrelated word pairs,
F_Subj(1,16) = 29.00, p < .001; F_Item(1,79) = 20.49, p < .001.
ERP Data
Grand average ERPs for the three critical conditions in
Experiment 3 are shown in Figure 4. The figure suggests
that the pattern in the present experiment is rather
similar to that observed in Experiment 1: There is a
graded N400 effect (unrelated > related > antonyms) in
combination with a positive deflection for the antonym
Figure 3. Grand average ERPs
for pseudowords in second
position, antonyms, and
the unrelated condition in
Experiment 2 (onset at the
vertical bar). Negativity is
plotted upward.
condition within the same time window as well as a late
positivity for the two nonantonym conditions.
These observations were confirmed by the statistical
analysis for the N400 and late positivity time windows,
respectively. ANOVAs for the N400 time window (270–
470 msec) revealed highly reliable main effects of Type
and Type × ROI interactions at both lateral, Type:
F(2,32) = 3.75, p < .01; Type × ROI: F(6,96) = 12.39,
p < .001, and midline electrode sites, Type: F(2,32) =
44.86, p < .001; Type × ROI: F(6,96) = 14.44, p < .001.
Further analyses of Type within each ROI showed re-
liable effects in all regions: minimal F(2,32) = 9.06 for
the left anterior ROI, maximal F(2,32) = 59.28 at Oz.
The planned comparisons for antonyms versus unrelated
word pairs also reached significance in all regions,
minimal F(1,16) = 12.00 for the left anterior
ROI, maximal F(1,16) = 71.85 at Oz, as did the planned
comparisons for related versus unrelated word pairs,
minimal F(1,16) = 8.23 for the left anterior ROI, maximal
F(1,16) = 42.24 at Oz. The planned comparisons for
antonyms versus related word pairs were significant at
all sites but one, left anterior ROI; minimal significant
F(1,16) = 8.77 at Fz and maximal F(1,16) = 45.67 at
Oz. ANOVAs for the late positivity time window (510–
760 msec) revealed significant effects of Type and Type ×
ROI for the lateral electrodes and a significant Type ×
ROI interaction for the midline electrodes. The analyses
of Type in each ROI showed reliable differences in all
regions except the left anterior ROI, Cz, Pz, and Oz,
minimal significant F(2,32) = 3.14 for the right anterior
ROI, maximal F(2,32) = 13.45 for the left posterior ROI.
Planned comparisons for the significant regions showed
significant differences between antonyms and unrelated
words only in the left posterior ROI, F(1,16) = 29.60,
and the right posterior ROI, F(1,16) = 7.62. Further-
more, the planned comparisons for antonyms versus re-
lated word pairs only revealed significant effects in the
left posterior ROI, F(1,16) = 11.29, and at Fz, F(1,16) =
6.10. Finally, planned comparisons for related versus un-
related word pairs reached significance within the right
anterior ROI, F(1,16) = 6.27, and at Fz, F(1,16) = 6.78.
Thus, although all of the effects observed via visual in-
spection reached statistical significance, the overall data
pattern appears to suggest that the positivity effects were
somewhat weaker than those reported for Experiment 1.
In order to investigate more closely why the late posi-
tivity may have been somewhat attenuated in the present
experiment in comparison to Experiment 1, we examined
the individual participant averages for the three critical
conditions. The rationale for this post hoc comparison
was based on the assumption that individual participants
may have adopted different strategies in processing the
Figure 4. Grand average ERPs
for antonyms and related
and unrelated conditions in
Experiment 3 (onset at the
vertical bar). Negativity is
plotted upward.
stimuli of Experiment 3, thus leading them to show either
a similar pattern to Experiment 1 or to Experiment 2. We
thus classified participants according to whether they
showed a difference between the two nonantonym con-
ditions and the antonym condition in the late positivity
time window or not. The result of this classification is
shown in Figure 5.
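The text does not state a numerical criterion for this classification. Purely as an illustration of the logic, the split could be sketched as follows; the function names, the 0 µV threshold, and the data layout (one averaged waveform per condition and participant) are assumptions, not the procedure actually used.

```python
from statistics import mean

# Late positivity window reported for Experiment 3 (msec).
LATE_WINDOW_MS = (510, 760)

def window_mean(wave, times_ms, window):
    """Mean amplitude (µV) over the samples whose latency lies in the window."""
    lo, hi = window
    return mean(v for t, v in zip(times_ms, wave) if lo <= t <= hi)

def shows_late_positivity(erps, times_ms, threshold_uv=0.0):
    """Hypothetical criterion: both nonantonym conditions are more positive
    than the antonym condition in the late window."""
    ant = window_mean(erps["antonym"], times_ms, LATE_WINDOW_MS)
    rel = window_mean(erps["related"], times_ms, LATE_WINDOW_MS)
    unr = window_mean(erps["unrelated"], times_ms, LATE_WINDOW_MS)
    return (rel - ant) > threshold_uv and (unr - ant) > threshold_uv

# Toy data: flat waveforms sampled every 10 msec (illustrative values only).
times = list(range(0, 1000, 10))
flat = lambda v: [v] * len(times)
participant_p = {"antonym": flat(1.0), "related": flat(3.0), "unrelated": flat(4.0)}
participant_n = {"antonym": flat(2.0), "related": flat(2.0), "unrelated": flat(1.5)}
```

Under this assumed criterion, `participant_p` would fall into the positivity group and `participant_n` into the nonpositivity group.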
Strikingly, the group split based on the late positivity
showed a further and unexpected result: As is immediately
apparent from Figure 5, only the subset of participants
that showed a late positivity for the two nonantonym
conditions also showed a positive deflection for the anto-
nyms in the N400 time window. By contrast, the second
participant group showed no positivity effects whatso-
ever. To confirm this descriptive result, we carried out
an additional statistical analysis, in which we compared
difference waves between antonyms and related words
(henceforth referred to as ‘‘A’’), and unrelated and related
words (‘‘N’’), including Group as a between-participants
factor. Difference waves were used to factor out inherent
differences between the two groups. Note that, in the
following analysis, we only report interactions with and
effects of Group for the sake of brevity. Furthermore, the
additional analysis was only conducted for the N400 time
window, as the presence or absence of a late positivity
was the criterion for the group split (thus rendering the
comparison trivial).
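The two contrasts entering this analysis can be written out as a short sketch. It assumes one averaged waveform per condition and participant and uses mean amplitude in the N400 window as the dependent measure; the helper names and the toy values are hypothetical.

```python
from statistics import mean

# N400 time window used in the analyses (msec).
N400_WINDOW_MS = (270, 470)

def difference_wave(a, b):
    """Pointwise difference a - b between two equally sampled waveforms."""
    return [x - y for x, y in zip(a, b)]

def window_mean(wave, times_ms, window):
    """Mean amplitude (µV) over the samples whose latency lies in the window."""
    lo, hi = window
    return mean(v for t, v in zip(times_ms, wave) if lo <= t <= hi)

def n400_contrasts(erps, times_ms):
    """A = antonyms - related; N = unrelated - related (as defined in the text)."""
    a_wave = difference_wave(erps["antonym"], erps["related"])
    n_wave = difference_wave(erps["unrelated"], erps["related"])
    return {
        "A": window_mean(a_wave, times_ms, N400_WINDOW_MS),
        "N": window_mean(n_wave, times_ms, N400_WINDOW_MS),
    }

# Toy participant with flat waveforms (illustrative values only).
times = list(range(0, 1000, 10))
erps = {
    "antonym": [2.0] * len(times),
    "related": [-1.0] * len(times),
    "unrelated": [-3.0] * len(times),
}
result = n400_contrasts(erps, times)
```

The per-participant A and N values would then enter the analysis with Group as a between-participants factor.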
Group Split (N400 Time Window: 270–470 msec)
In the N400 time window, the statistical analysis revealed
an ROI × Type × Group interaction, F(3,45) = 3.47,
p < .04. Separate analyses within each ROI showed a
significant Type × Group interaction only in the left
posterior region, F(1,15) = 6.07, p < .027. This interac-
tion was resolved by Type in order to examine whether
the two groups would indeed differ for the A contrast
but not for the N contrast. As predicted, there was a
significant effect of Group for A, F(1,15) = 4.82, p < .045,
but not for N ( p > .29).
This critical difference between the groups with re-
spect to the A contrast was further supported by the
analysis of the midline electrodes. Here, the interaction
Electrode × Type × Group was marginally significant,
F(3,45) = 2.55, p < .09. Analyses for each electrode
showed that the interaction Type × Group was marginal
at CZ, F(1,15) = 4.14, p < .06, and significant at PZ,
F(1,15) = 5.91, p < .03, and OZ, F(1,15) = 6.73, p < .025.
In all cases, the A contrast yielded significant Group
differences, CZ: F(1,15) = 5.03, p < .041; PZ: F(1,15) = 7.90,
p < .02; OZ: F(1,15) = 22.74, p < .01, whereas there were
no effects of Group for the N contrast (all Fs < 1).
The results of the group split thus suggest that there
is a correlation between the early positivity effect for the
antonyms and the late positivities for the nonantonym
conditions: only the group of participants who were se-
lected for showing late positivity effects also showed the
early effect. By contrast, the relative effect between the
related and unrelated conditions did not differ between
the two groups. If, as suggested above, this overall pat-
tern results from strategic differences between the two
groups—with the ‘‘positivity group’’ behaving similarly
to the participants in Experiment 1 and the ‘‘nonposi-
tivity group’’ behaving similarly to the participants in
Figure 5. Two groups of
grand averaged ERPs for
antonyms and related and
unrelated conditions in
Experiment 3 (onset at the
vertical bar). (A) Grand
averaged ERPs for subjects
(n = 9) who showed no late
positivity. (B) Grand averaged
ERPs for subjects (n = 8)
who clearly showed a late
positivity. The results revealed
that subjects who showed a
late positivity also showed
an early positive peak for
antonyms. Negativity is
plotted upward.
Experiment 2—we should expect to see very similar ef-
fects in a direct comparison of our first two experiments.
Thus, we should expect to find (a) a difference between
the relative effect for the antonym versus the related
condition (A) in the N400 time window, (b) a corre-
sponding difference in the late positivity time window,
and (c) no difference across experiments for the N400
contrast between related and unrelated stimuli (N).
Comparison of Experiments 1 and 2
N400 Time Window
Difference waves for the N400 were calculated as for the
group split in Experiment 3 (A = ANT − REL; N = NON −
REL). In the N400 time window, the statistical analysis
revealed an ROI × Type × Group interaction, F(3,96) =
8.40, p < .002. Separate analyses within each ROI showed
a significant Type × Group interaction only in the posterior
regions: left posterior: F(1,32) = 18.38, p < .0001;
right posterior: F(1,32) = 23.90, p < .0001. The interactions
were resolved by Type, in order to examine
whether the two groups would indeed differ for the A
contrast but not for the N contrast. As predicted, there
was a significant effect of Group for A, left posterior:
F(1,16) = 11.11, p < .003; right posterior: F(1,16) = 19.59,
p < .0001, but not for N (left posterior: p > .24; right
posterior: p > .07).
This critical difference between the groups with re-
spect to the A contrast was further supported by the
analysis of the midline electrodes. As for the lateral
electrodes, the interaction Electrode × Type × Group
was significant, F(3,96) = 5.57, p < .015. Analyses for
each electrode showed that the interaction Type ×
Group was significant at CZ, F(1,32) = 7.13, p < .018,
PZ, F(1,32) = 20.49, p < .0001, and OZ, F(1,32) = 36.59,
p < .0001. In all cases, the A contrast yielded significant
Group differences, CZ: F(1,16) = 6.36, p < .018; PZ:
F(1,16) = 21.21, p < .0001; OZ: F(1,16) = 26.43, p <
.0001, whereas there were no effects of Group for the
N contrast (CZ: F < 1; PZ: p > .18; OZ: p > .08).
Late Positivity
Difference waves for the P600 time window were calculated
between related words and antonyms (A = REL −
ANT), and unrelated words and antonyms (U = NON −
ANT). In the P600 time window, the statistical analysis
revealed an ROI × Type × Group interaction, F(3,96) =
9.33, p < .0001. Separate analyses within each ROI
showed a significant Type × Group interaction only in
the left posterior region, F(1,32) = 10.12, p < .005. The
interaction was resolved by Type, in order to examine
whether the two groups would indeed differ for the A
contrast as well as for the U contrast. As predicted, there
was a significant effect of Group for A, F(1,16) = 11.37,
p < .003, and U, F(1,16) = 36.71, p < .0001.
The direct comparison of Experiments 1 and 2 showed
that, in the N400 time window, the A contrast differed
between experiments, whereas the N contrast did not. In
the late positivity time window, there were corresponding
differences between the A and the U contrasts between
the two studies. The contrast between Experiments 1 and
2 is thus fully equivalent to the difference between the
two groups in Experiment 3 (see Figure 6).
Figure 6. Comparison
of difference waves for
Experiments 1 and 2 and
for the two subgroups of
Experiment 3 (Exp3/P =
positivity group; Exp3/N =
nonpositivity group). (A)
Difference waves for antonyms
versus related conditions
showed an early as well as a
late positivity for Experiment 1
and the positivity group of
Experiment 3 in comparison
to Experiment 2 and the
nonpositivity group of
Experiment 3. The small
positive dip for the latter
two experiments is due to an
additional semantic priming
effect for antonyms relative to
related conditions and thus
reflects a decreased N400. (B)
Difference waves for unrelated
versus related conditions
showed no difference between
experiments and/or groups.
Negativity is plotted upward.
Taken together, these findings provide strong converg-
ing support for the assumption that the N400 effect in
Experiment 1 indeed results from the superposition of
two distinct components: an N400 and a P300. Whereas
the N400 reduction between the critical stimulus types re-
mains unaffected by the experimental context (as shown
by the absence of a group effect for the N contrast in
the N400 time window; see Figure 6B), the ‘‘embedded
positivity’’ for the antonyms is modulated by task require-
ments and strategic factors (see Figure 6A).
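The superposition argument can be illustrated with a small arithmetic sketch. It assumes strict additivity of the two components and uses invented amplitudes; none of the numbers are estimates from the present data.

```python
def observed(n400_uv, p300_uv=0.0):
    """Additive superposition: surface amplitude = N400 part + P300 part (µV)."""
    return n400_uv + p300_uv

# Hypothetical N400 amplitudes (µV; more negative = larger N400).
N400_UV = {"antonym": -1.0, "related": -3.0, "unrelated": -5.0}

def contrasts(p300_for_antonyms_uv):
    """Surface contrasts A = ANT - REL and N = NON - REL in the N400 window."""
    ant = observed(N400_UV["antonym"], p300_for_antonyms_uv)
    rel = observed(N400_UV["related"])
    unr = observed(N400_UV["unrelated"])
    return {"A": ant - rel, "N": unr - rel}

# Antonym relation task-irrelevant: no target-related P300 is assumed.
without_p300 = contrasts(0.0)
# Antonym is the predicted target: an assumed 4 µV P300 is added.
with_p300 = contrasts(4.0)
```

In this toy case, the N contrast is identical in both settings, whereas the A contrast grows by exactly the size of the added positivity. On the surface this looks like a much stronger N400 reduction for antonyms, although the underlying N400 values were held constant.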
DISCUSSION
We have presented three ERP experiments examining
the interplay between the N400 and the P300 in the pro-
cessing of relational semantic information. Experiment 1
compared electrophysiological responses to antonym
relations, semantically related words, and unrelated
words in a sentence context and using a sensicality judg-
ment task, whereas Experiments 2 and 3 presented the
critical stimuli as word pairs. In Experiment 2, the critical
conditions were rendered irrelevant to the task by in-
cluding further stimulus pairs with a pseudoword in first
or second position and using a lexical decision task. By
contrast, Experiment 3 required participants to perform
an antonymy judgment task. In all three experimental
environments, the critical stimuli engendered a graded
N400 response of the following form: unrelated words >
related words > antonyms. In addition, we observed
several positivities that were modulated by the experi-
mental context: in Experiment 1 and in a subgroup of
participants in Experiment 3, the antonym condition
elicited a positive deflection in the N400 time window
and the two nonantonym conditions engendered a late
positivity. We hypothesized that the group differences
in Experiment 3 may have been due to differences in
individual processing strategy, with several participants
adopting a strategy analogous to that in Experiment 1
and others performing as in Experiment 2. Converging
support for this assumption stems from the direct com-
parison of the first two experiments, which revealed a
cross-experimental difference for the contrast between
antonyms and related words in the N400 time window,
but no such difference for the contrast between related
and unrelated words in the same time window. Corre-
sponding cross-experimental differences were observed
with respect to the late positivities for the nonantonym
conditions. Thus, the group split in Experiment 3 indeed
revealed the identical pattern to the comparison be-
tween Experiments 1 and 2.
Semantic Association: Context-independent
N400 Differences
In combination, the three experiments show that the
effects of semantic relatedness within the N400 are in-
dependent of the experimental environment in which
the critical stimuli are presented. This finding is in line
with a large number of previous results indicating that
N400 effects of semantic association can be evoked with-
out a task focusing on relatedness and even via unat-
tended stimuli (e.g., Nunez-Pena & Honrubia-Serrano,
2005; Perrin & Garcia-Larrea, 2003; Rolke, Heil, Streb, &
Hennighausen, 2001; Kutas & Federmeier, 2000; Kutas &
Hillyard, 1984). It therefore appears that a word’s asso-
ciations within a semantic network are activated auto-
matically whenever that word is processed, irrespective
of task and experimental context. We were able to show
this directly for the unrelated versus related word pairs
by means of the interexperimental comparison between
Experiments 1 and 2, which revealed a highly compara-
ble difference between the two conditions in the two
studies. For the comparison between antonyms versus
related word pairs, the same argument can be made im-
plicitly, because both Experiments 1 and 2 showed a
significant difference between these two conditions with-
in the N400 time window, although the size of this dif-
ference varied between the experiments.
As a final remark on the N400 effects in the present
experiments, it is interesting that there appears to be no
inherent advantage for the processing of real words as
opposed to pseudowords with respect to the processing
effort reflected by the N400 component. Recall that the
N400 effect for the pseudowords was indistinguishable
from that for the unrelated words in Experiment 2, thus
indicating that there is no categorical difference between
the two types of stimuli. This appears plausible in view
of the fact that the transition between real words and
pseudowords is not completely sharp: very low frequen-
cy words (e.g., oologist, an expert on birds’ eggs) might
well be classified as pseudowords under a number of cir-
cumstances. Conversely, pseudowords can be endowed
with a meaning relatively effortlessly given a particular
context (cf. the conceptual associations evoked by Lewis
Carroll’s Jabberwocky).
Prediction, Experimental Environment, and
Processing Strategy: Experiment-specific
Positivity Effects
More importantly, for the purposes of the present ar-
ticle, our experiments showed an early positive response
to the antonyms (within the N400 time window and aug-
menting the overall N400 reduction) in Experiment 1
and in a subgroup of the participants of Experiment 3. In
contrast to the effects of semantic relatedness, which did
not differ across the three experiments (see the previous
section), the early positivity was crucially modulated as
a function of the experimental environment (task and
mode of stimulus presentation).
For the antonym condition in Experiment 1, the task-
relevant last word of the sentence fulfills the clear and un-
ambiguous expectation set up by the sentence context
(The opposite of black is ...). It therefore elicits a target-
related P300 effect, which occurs in the same time range
as the N400 because the correct identification of the pre-
dicted word (antonym) does not require a lexical search
(there is a unique prediction that may either be fulfilled
or not). By contrast, no early positivity was observed when
the antonym relation was task irrelevant (Experiment 2).
This difference between the two studies was confirmed
by the direct statistical comparison.
1
The results of Experiment 3 further suggest that modu-
lations of P300 effects within the N400 time window may
even go beyond simple task effects. In this study, half of
the participants showed exactly the same component
pattern as that observed in Experiment 1, whereas the
other half showed an analogous ERP response to that
in Experiment 2. This descriptive assumption was sup-
ported by the statistical comparison of the two groups,
which yielded identical results to the direct comparison
between Experiments 1 and 2. These findings suggest that
interindividual differences in processing strategy may have
a significant impact upon the neurophysiological patterns
observed.
With respect to the source of the strategic differences
between the two participant groups in Experiment 3,
we can only speculate. Neither could the group split
be traced back to any systematic parameter nor were
there significant differences in the behavioral perform-
ance of the two groups. There was, however, a nu-
merical trend toward faster reaction times (in the
range of approximately 90 msec) for the positivity group
(M = 425.89 msec, SD = 91.96 msec; nonpositivity group:
M = 516.59 msec, SD = 181.04 msec), although this group
difference did not reach significance. Nonetheless, it
appears plausible to assume that, by adopting a strategy
analogous to that forced upon the processing system by
a constraining sentence context, the positivity group
may have been able to perform the task more effectively.
Perhaps a behavioral advantage would indeed have been
brought out more clearly by a more demanding task or
by time pressure.
Consequences for the Interpretation of ERP
Differences within the N400 Time Window
The comparison of the three experiments presented
here revealed that functionally distinct ERP components
may be superimposed within the N400 time window.
This constellation yields what appears to be an ‘‘N400
difference’’ on the surface but, in fact, requires a
more complex explanation. The present findings thus go
beyond previous discussions in the literature that were
concerned with the effects of possible P300 latency shifts
and their consequences for the interpretation of ERP
differences within the N400 time window. Rather, they
show additivity between a ‘‘true’’ N400 effect and a task-
or strategy-related P300 effect within this time range.
As an example of the consequences of this result for
the interpretation of what appear to be N400 reductions,
recall the results of the experiment by Kutas and Iragui
(1998), which were already discussed in the Introduc-
tion. Having shown that the comparison between anto-
nyms and unrelated word pairs engenders an ERP
difference that results from the superposition of two
effects (an N400 reduction due to semantic relatedness
and a P300 augmentation due to the experimental en-
vironment), it is no longer possible to provide a clear-
cut interpretation of the age-related reduction of the
electrophysiological difference between these stimulus
types. Thus, the changes observed as a function of age
may result either from a reduction of the N400 proper,
from a reduction of the P300 effect, or from a reduction
of both.
More generally, these observations indicate that the
results of ERP studies on language processing require
a much more detailed screening for possible task- or
strategy-related positivity effects than previously as-
sumed. Notably, this issue goes beyond the question
of whether late language-related positivities should be
regarded as P300s (e.g., Frisch, Kotz, von Cramon, &
Friederici, 2003; Osterhout & Hagoort, 1999; Coulson
et al., 1998), but rather concerns the possible occur-
rence of P300 effects within the N400 time range. As
shown in the present studies, these effects may be rela-
tively difficult to detect because of component overlap.
Nonetheless, they clearly cannot be neglected because
surface ERP differences call for a fundamentally different
interpretation in their presence. Crucially, the occurrence
of ‘‘embedded P300s’’ is not restricted to the
lexical–semantic domain as illustrated in the present ar-
ticle, but can also be observed in syntactic manipulations
at the sentence level. A case in point is provided by ex-
periments on the reanalysis of subject–object ambigui-
ties in German. Within this domain, several studies have
reported early positive effects for a (dispreferred) disam-
biguation toward an object-first order at the clause-final
position (Friederici, Steinhauer, Mecklinger, & Meyer,
1998; Mecklinger, Schriefers, Steinhauer, & Friederici,
1995). Strikingly, this positivity has also been observed at
the clause-final auxiliary (which completes the proposi-
tion expressed by the sentence) even when word or-
der disambiguation was effected at an earlier point in
the sentence (see Friederici et al., 1998; Figure 4). This
finding suggests that the early positivity may not be a
correlate of grammatical function reanalysis per se, but
that it may rather be related to the experimental task
demands. In the studies in question, participants per-
formed a comprehension task, which always focused
upon the object of the critical experimental sentences,
hence leading to a target effect at the element complet-
ing the proposition in the object-first sentences. Con-
verging support for an interpretation along these lines
stems from a recent ERP study examining the processing
of similar ambiguities under different task requirements
(comprehension question vs. binary acceptability judgment)
(Gaermer, Schlesewsky, Roehm, Friederici, & Bornkessel-
Schlesewsky, submitted). Gaermer et al. (submitted) ob-
served a P300 modulation at the sentence-final auxiliary,
which occurred as a function of task demands and was
measurable even in subject-initial sentences. It therefore
appears highly plausible that the overlap between P300
effects in the N400 time window and language-related ERP
effects is not restricted to the lexical–semantic domain, but
that it rather constitutes a more general phenomenon in
experiments on sentence processing.
Conclusion
The present experiments revealed that the extreme
N400 reduction observable in highly predictable lexical–
semantic settings (antonym relations) results from the
superposition of two independent ERP effects: a reduced
N400 due to semantic relatedness, and an increased P300
due to target predictability. Moreover, the P300 effects
observed were not only attributable to concrete task
demands, but were also influenced by the individual
processing strategies used to achieve successful task per-
formance. The finding that P300 effects can occur in par-
allel with functionally distinct effects in ERP studies on
language comprehension calls for a more cautious inter-
pretation of language-related ERP differences.
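The superposition account summarized above can be illustrated with a minimal numerical sketch. It is a toy model and not the analysis used in the present experiments: each component is approximated by a Gaussian, and all amplitudes, peak latencies, widths, and the 300–500 msec window are assumed values chosen for illustration only. The sketch shows that adding a P300 to an already reduced N400 shifts the mean amplitude in the N400 window further in the positive direction than the N400 reduction alone would.

```python
import math

def gaussian(t_ms, peak_ms, width_ms):
    """Unit-height Gaussian used as a toy ERP component shape."""
    return math.exp(-0.5 * ((t_ms - peak_ms) / width_ms) ** 2)

def surface_erp(t_ms, n400_amp, p300_amp):
    """Surface voltage (microvolts) as the sum of two independent components.

    Peak latencies (400 and 350 msec) and widths (60 msec) are assumptions.
    """
    n400 = n400_amp * gaussian(t_ms, 400.0, 60.0)
    p300 = p300_amp * gaussian(t_ms, 350.0, 60.0)
    return n400 + p300

def mean_amplitude(n400_amp, p300_amp, start_ms=300, end_ms=500, step_ms=4):
    """Mean surface voltage in an assumed N400 analysis window."""
    samples = [surface_erp(t, n400_amp, p300_amp)
               for t in range(start_ms, end_ms + 1, step_ms)]
    return sum(samples) / len(samples)

# Hypothetical (N400 amplitude, P300 amplitude) per condition, in microvolts.
# The N400 is graded by relatedness; only the antonym target elicits a P300.
CONDITIONS = {
    "unrelated": (-6.0, 0.0),
    "related": (-4.0, 0.0),
    "antonym": (-2.0, 5.0),
}

window_means = {name: mean_amplitude(n400, p300)
                for name, (n400, p300) in CONDITIONS.items()}

# What the antonym condition would look like with the N400 reduction alone.
antonym_n400_only = mean_amplitude(-2.0, 0.0)

for name, value in window_means.items():
    print(f"{name:>9}: {value:+.2f} microvolts")
print(f"antonym without P300: {antonym_n400_only:+.2f} microvolts")
```

Under these assumed parameters, the surface measure for antonyms is more positive than the underlying N400 reduction alone would predict, which is the pattern that could be mistaken for an extremely reduced N400.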
Acknowledgments
The present research was supported by the Austrian “Fond zur
Förderung der wissenschaftlichen Forschung” Project P16281-G03
as well as by the DFG-funded projects SCHL544/2-1 and
BO2471/2-1. This research was conducted while the first author
was a member of the Research Group Neurolinguistics at the
Philipps University Marburg.
Reprint requests should be sent to Dietmar Roehm, Junior
Research Group Neurotypology, Max Planck Institute for Hu-
man Cognitive and Brain Sciences, Stephanstraße 1A, 04103
Leipzig, Germany, or via e-mail: roehm@cbs.mpg.de.
Note
1. A similar explanation applies to the positivity observed in
response to the pseudowords in Experiment 2. Like
the antonyms in Experiment 1, these are also target stimuli
because of the lexical decision task. However, to be correctly
identified as such, they require a lexical search, which is re-
flected in the increased N400 in comparison to related words
and antonyms. Therefore, the P300 is delayed to the post-N400
time range, a finding that has been observed in a number of
experiments examining the processing of pseudowords (e.g.,
Bentin, Mouchetant-Rostaing, Giard, Echallier, & Pernier, 1999;
Bentin et al., 1985). In this way, the early positivity for anto-
nyms and the late positivity for pseudowords appear to be
amenable to a very similar functional interpretation: both of
these stimulus types are overtly predicted because they are
targets in the context of the experimental task. They differ only
with respect to the processes that are prerequisites for target
identification, hence leading to differences in P300 latency.
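The latency argument in this note can be sketched in the same toy fashion. The following is an illustration under assumed parameters, not a model fitted to the data: both stimulus types receive a P300 of equal size, but the P300 peak is placed at a hypothetical 350 msec for antonyms and 600 msec for pseudowords, and the pseudowords additionally receive a larger N400. The sketch locates the most positive point of the summed waveform in each case.

```python
import math

def component(t_ms, amp_uv, peak_ms, width_ms=60.0):
    """Gaussian-shaped toy ERP component with amplitude in microvolts."""
    return amp_uv * math.exp(-0.5 * ((t_ms - peak_ms) / width_ms) ** 2)

def surface_erp(t_ms, n400_amp, p300_amp, p300_peak_ms):
    """Sum of an N400 (assumed peak at 400 msec) and a P300 of variable latency."""
    return (component(t_ms, n400_amp, 400.0)
            + component(t_ms, p300_amp, p300_peak_ms))

def positive_peak_latency(n400_amp, p300_amp, p300_peak_ms):
    """Time point (msec) at which the summed waveform is most positive."""
    times = range(0, 1001, 2)
    return max(times,
               key=lambda t: surface_erp(t, n400_amp, p300_amp, p300_peak_ms))

# Hypothetical values: small N400 and early P300 for antonyms,
# large N400 and delayed P300 for pseudowords.
antonym_latency = positive_peak_latency(-2.0, 5.0, 350.0)
pseudoword_latency = positive_peak_latency(-6.0, 5.0, 600.0)

print(f"antonym positivity peaks at {antonym_latency} msec")
print(f"pseudoword positivity peaks at {pseudoword_latency} msec")
```

With these assumptions, the antonym positivity falls within the N400 time range, whereas the pseudoword positivity emerges only after it, although both are generated by the same kind of component.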
REFERENCES
Bentin, S. (1987). Event-related potentials, semantic
processes, and expectancy factors in word recognition.
Brain and Language, 31, 308–327.
Bentin, S., McCarthy, G., & Wood, C. C. (1985). Event-related
potentials, lexical decision, and semantic priming.
Electroencephalography and Clinical Neurophysiology,
60, 343–355.
Bentin, S., Mouchetant-Rostaing, Y., Giard, M. H., Echallier,
J. F., & Pernier, J. (1999). ERP manifestations of processing
printed words at different psycholinguistic levels: Time
course and scalp distribution. Journal of Cognitive
Neuroscience, 11, 235–260.
Chwilla, D. J., Kolk, H. H. J., & Mulder, G. (2000). Mediated
priming in the lexical decision task: Evidence from
event-related brain potentials and reaction time.
Journal of Memory and Language, 42, 314–341.
Connolly, J. F., & Phillips, N. A. (1994). Event-related potential
components reflect phonological and semantic processing
of the terminal word of spoken sentences. Journal of
Cognitive Neuroscience, 6, 256–266.
Coulson, S., King, J. W., & Kutas, M. (1998). ERPs and
domain specificity: Beating a straw horse. Language
and Cognitive Processes, 13, 653–672.
Curran, T., Tucker, D. M., Kutas, M., & Posner, M. I. (1993).
Topography of the N400: Brain electrical activity reflecting
semantic expectancy. Electroencephalography and
Clinical Neurophysiology, 88, 188–209.
Donchin, E., & Coles, M. G. H. (1988). Is the P300 component
a manifestation of context-updating? Behavioral and
Brain Sciences, 11, 355–372.
Federmeier, K. D., & Kutas, M. (1999). A rose by any
other name: Long-term memory structure and sentence
processing. Journal of Memory and Language, 41,
469–495.
Friederici, A. D., Steinhauer, K., Mecklinger, A., & Meyer, M.
(1998). Working memory constraints on syntactic
ambiguity resolution as revealed by electrical brain
responses. Biological Psychology, 47, 193–221.
Frisch, S., Kotz, S. A., von Cramon, D. Y., & Friederici, A. D.
(2003). Why the P600 is not just a P300: The role of the
basal ganglia. Clinical Neurophysiology, 114, 336–340.
Gaermer, F. S., Schlesewsky, M., Roehm, D., Friederici, A. D.,
& Bornkessel-Schlesewsky, I. (submitted). The status of
subject–object reanalyses in the language comprehension
architecture.
Gibson, E. (1998). Linguistic complexity: Locality of syntactic
dependencies. Cognition, 68, 1–76.
Hagoort, P., Hald, L., Bastiaansen, M., & Petersson, K. M.
(2004). Integration of word meaning and world
knowledge in language comprehension. Science,
304, 438–441.
Holcomb, P. (1988). Automatic and attentional processing:
An event-related brain potential analysis of semantic
priming. Brain and Language, 35, 66–85.
Huynh, H., & Feldt, L. S. (1970). Conditions under which
the mean-square ratios in repeated measurement designs
have exact F-distributions. Journal of the American
Statistical Association, 65, 1582–1589.
Keppel, G. (1991). Design and analysis (3rd ed.).
Englewood Cliffs, NJ: Prentice Hall.
Kutas, M., & Federmeier, K. D. (2000). Electrophysiology
reveals semantic memory use in language comprehension.
Trends in Cognitive Sciences, 4, 463–469.
Kutas, M., & Hillyard, S. A. (1980). Reading senseless
sentences: Brain potentials reflect semantic incongruity.
Science, 207, 203–205.
Kutas, M., & Hillyard, S. A. (1984). Brain potentials during
reading reflect word expectancy and semantic association.
Nature, 307, 161–163.
Kutas, M., & Iragui, V. (1998). The N400 in a semantic
categorization task across 6 decades. Electroencephalography
and Clinical Neurophysiology, 108, 456–471.
Mecklinger, A., Schriefers, H., Steinhauer, K., & Friederici,
A. D. (1995). Processing relative clauses varying on syntactic
and semantic dimensions: An analysis with event-related
potentials. Memory and Cognition, 23, 477–494.
Nunez-Pena, M. I., & Honrubia-Serrano, M. L. (2005). N400
and category exemplar associative strength. International
Journal of Psychophysiology, 56, 45–54.
Oldfield, R. C. (1971). The assessment and analysis of
handedness: The Edinburgh inventory. Neuropsychologia,
9, 97–113.
Osterhout, L., & Hagoort, P. (1999). A superficial resemblance
does not necessarily mean you are part of the family:
Counterarguments to Coulson, King and Kutas (1998)
in the P600/SPS-P300 debate. Language and Cognitive
Processes, 14, 1–14.
Perrin, F., & Garcia-Larrea, L. (2003). Modulation of the
N400 potential during auditory phonological/semantic
interaction. Cognitive Brain Research, 17, 36–47.
Picton, T. W. (1993). The P300 wave of the human
event-related brain potential. Journal of Clinical
Neurophysiology, 9, 456–479.
Polich, J. (1985). Semantic categorization and event-related
potentials. Brain and Language, 26, 304–321.
Polich, J. (2004). Neuropsychology of the P3a and P3b:
A theoretical overview. In C. Moore & K. Arikan (Eds.),
Brainwaves and mind: Recent developments (pp. 15–29).
Wheaton, IL: Kjelberg.
Rolke, B., Heil, M., Streb, J., & Hennighausen, E. (2001).
Missed prime words within the attentional blink evoke
an N400 semantic priming effect. Psychophysiology, 38,
165–174.
Rugg, M. D. (1990). Event-related brain potentials dissociate
repetition effects of high- and low-frequency words.
Memory and Cognition, 18, 367–379.
Stabler, E. (1994). The finite connectivity of linguistic
structure. In C. Clifton, Jr., L. Frazier, & K. Rayner (Eds.),
Perspectives on sentence processing (pp. 303–336).
Hillsdale, NJ: Erlbaum.
van Berkum, J. J. A., Hagoort, P., & Brown, C. (1999).
Semantic integration in discourse: Evidence from the
N400. Journal of Cognitive Neuroscience, 11, 657–671.
Van Petten, C. (1993). A comparison of lexical and
sentence-level context effects in event-related brain
potentials. Language and Cognitive Processes, 8,
485–531.
Verleger, R. (1988). Event-related potentials and cognition:
A critique of the context-updating hypothesis and an
alternative interpretation of the P3. Behavioral and
Brain Sciences, 11, 343–427.
Yamaguchi, S., & Knight, R. T. (1991). Age effects on the P300
to novel somatosensory stimuli. Electroencephalography
and Clinical Neurophysiology, 78, 297–301.
1274 Journal of Cognitive Neuroscience Volume 19, Number 8