Article

Monitoring sentence comprehension

... These data provide the appropriate background for both a re-analysis of past results obtained with this technique and a discussion of the paradigm's strengths and weaknesses, both of which we undertake throughout the paper. Cutler and Norris (1979) offer a thorough discussion of three kinds of detection tasks employed in psycholinguistics and argue that phoneme- and word-monitoring tasks diverge from a tone-monitoring task in that, inter alia, the former exhibit a general decrease in RTs across a sentence, an effect Cutler and Norris deny to the tone-monitoring experiments of Abrams and Bever (1969) and Holmes and Forster (1970). This conclusion, however, is based on a less-than-careful analysis of the data that Abrams and Bever (1969), in particular, report. ...
... As mentioned in the previous section, these authors established three different click positions in sentences such as since she was free that | day | her | friends asked her to come (before the main clause break, in the clause break, and right after the clause break, all marked with |), and the RTs they obtained certainly exhibit a decrease: 243, 230, and 216 ms. So why do Cutler and Norris (1979) conclude otherwise? Abrams and Bever (1969) exposed their participants to repeated presentations of the same material, and participants' performance progressively improved. ...
... Abrams and Bever (1969) exposed their participants to repeated presentations of the same material, and participants' performance progressively improved. The RTs we have just provided were those of the first presentation, and it is only in the next two presentations that the linear decrease in RTs from the first to the third position disappears, which is the pattern that Cutler and Norris (1979) focus on. Given that these participants were reacting to familiar sentences and click positions in the second and third presentations, those responses are not comparable to those of the other monitoring tasks Cutler and Norris (1979) discuss (or other click-detection studies). ...
Article
Franco, Gaillard, Cleeremans, and Destrebecqz (Behavior Research Methods, 47, 1393-1403, 2015), in a study on statistical learning employing the click-detection paradigm, conclude that more needs to be known about how this paradigm interacts with statistical learning and speech perception. Past results with this monitoring technique have pointed to an end-of-clause effect in parsing (a structural effect), but we show here that the issues are a bit more nuanced. First, we report two experiments (1a and 1b), which show that reaction times (RTs) are affected by two factors: (a) processing load, resulting in a tendency for RTs to decrease across a sentence, and (b) a perceptual effect which adds to this tendency and moreover helps neutralize differences between sentences with slightly different structures. These two factors are then successfully discriminated by registering event-related brain potentials (ERPs) during a monitoring task, with Experiment 2 establishing that the amplitudes of the N1 and P3 components (the first associated with temporal uncertainty, the second with processing load in dual tasks) correlate with RTs. Finally, Experiment 3 behaviorally segregates the two factors by placing the last tone at the end of sentences, activating a wrap-up operation and thereby both disrupting the decreasing tendency and highlighting structural effects. Our overall results suggest that much care needs to be employed in designing click-detection tasks if structural effects are sought, and some of the now-classic data need to be reconsidered.
... Each occurred as a word-initial singleton in twelve target words. Manner of articulation, word position, and singleton vs. cluster effects on phoneme monitoring RT were thus controlled [16,17,18]. All target words were familiar to children, with a log frequency greater than 3.00 in the CBBC section of the SUBTLEX database [19]. ...
... The target phoneme always occurred in the fifth or sixth syllable of the sentence, to control for sentence-position effects on RT [18]. When recorded, each sentence was produced with the target word prosodically focused, to facilitate phoneme monitoring and control for prosodic effects on RT [16]. Sixteen catch sentences were also created, containing no instances of any target phonemes but matching the test sentences in length and prosodic structure. ...
... Experimental methods can be characterised in many ways, one of which is the division into global and local tasks (see Levelt, 1978; Cutler & Norris, 1979, for similar distinctions). ...
... Except for a few studies (Dell & Newman, 1980; Eimas & Nygaard, 1992), the contextual predictability of a critical word was assessed through monitoring responses to the initial phoneme of the word directly following it. With monitoring times to the initial phoneme of the critical word itself, it is uncertain whether responses are made on the basis of lexical or sublexical levels of representation (see Cutler & Norris, 1979; Cutler, Mehler, Norris, & Segui, 1987). When responses are made to the first phoneme of the next word, this does not apply. ...
Chapter
Listeners sometimes have the impression that they know exactly which word a speaker is going to say next. However real such observations are, they are the exception rather than the rule. More than forty years of research has shown that recognizing spoken words is not a guessing game. Normally, word recognition is an extremely fast and efficient process that rarely reaches conscious awareness. Fluent speech is uttered at a rate of two to three words per second, and an adult language user has a mental lexicon in which the knowledge of about 30 000 to 50 000 words is stored (Aitchison, 1994). This implies that a listener has about one third of a second to select one word from this huge mental data base.
... The top-down interpretation of this effect is that when the lexical representation of peel is activated, it correspondingly biases activity to the phonemes within that word. However, Cutler, Norris, and colleagues have argued that these results can be explained without resort to top-down mechanisms (Cutler & Norris, 1979; Norris, McQueen, & Cutler, 2000). Cutler and Norris (1979) suggest that there are two levels of phonemic representation, one that represents the input and one that represents the system's 'guess' based on both the input and the context. ...
... However, Cutler, Norris, and colleagues have argued that these results can be explained without resort to top-down mechanisms (Cutler & Norris, 1979; Norris, McQueen, & Cutler, 2000). Cutler and Norris (1979) suggest that there are two levels of phonemic representation, one that represents the input and one that represents the system's 'guess' based on both the input and the context. Norris, McQueen, and Cutler (2000) suggest that there is a level of representation essentially dedicated to phoneme identification, and that it is activity at this level and not at the phoneme level that receives input from the context. ...
... According to the autonomous race model (Cutler & Norris, 1979), subjects performing the phoneme monitoring task respond on the basis of either a prelexical or a lexical code, depending on which code becomes available first. The prelexical code becomes available from the speech signal, whereas the lexical code is obtained through accessing a particular lexical entry. ...
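The race architecture described in this passage lends itself to a compact simulation. The sketch below is only illustrative, not the authors' implementation: the timing parameters and Gaussian noise are invented, and the point is simply that two routes run in parallel and the faster one determines the monitoring response, which lets words be detected faster than nonwords without any feedback.

```python
import random

def route_time(mean_ms, sd_ms=30.0):
    """Finishing time of one route, in ms (illustrative Gaussian noise)."""
    return random.gauss(mean_ms, sd_ms)

def race_trial(prelexical_mean=450.0, lexical_mean=None):
    """One phoneme-monitoring trial under a race architecture: the
    prelexical route always runs; the lexical route runs only when a
    lexical entry exists (lexical_mean=None models a nonword). The
    faster route drives the detection response."""
    times = [route_time(prelexical_mean)]
    if lexical_mean is not None:
        times.append(route_time(lexical_mean))
    return min(times)

random.seed(1)
word_rts = [race_trial(lexical_mean=400.0) for _ in range(10_000)]
nonword_rts = [race_trial() for _ in range(10_000)]
print(f"mean RT, words:    {sum(word_rts) / len(word_rts):.0f} ms")
print(f"mean RT, nonwords: {sum(nonword_rts) / len(nonword_rts):.0f} ms")
```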
... The TRACE model (McClelland & Elman, 1986) must assume that there is a fast spreading of activation from a lexical representation to a graphemic representation that can affect the phoneme detection response. The race model (Cutler & Norris, 1979) must adopt the assumption that auditory lexical access also involves the retrieval of orthographic representations, which can contribute to the detection decision in addition to phonological representations. ...
Article
A widely used task in the research on spoken word recognition is phoneme monitoring, in which subjects have to detect phonemes in spoken words. It is generally assumed that this task is performed using phonetic or phonological representations of words only. To test whether an orthographic representation of the words is employed as well, an experiment was conducted in which Dutch subjects monitored for phonemes with either a primary or secondary spelling in phonologically matched spoken words and nonwords. Phoneme monitoring times were slower when the phoneme had a secondary spelling than when it had a primary spelling. The effect was greater after than before the uniqueness point of the word, and monitoring times were faster for words than for nonwords. These findings indicate that an orthographic representation of words is engaged in phoneme monitoring.
... (The concept of these two levels of processing has, of course, a long history in the field. For previous similar proposals regarding levels of processing in spoken word recognition, see Cutler & Norris, 1979; Foss & Blank, 1980; McClelland & Elman, 1986; Norris, 1994; Radeau, Morais, & Segui, 1995; Slowiaczek & Hamburger, 1992.) In particular, Vitevitch and Luce (1998) suggested that facilitatory effects of probabilistic phonotactics might reflect differences among activation levels of sublexical units, whereas effects of similarity neighborhoods may arise from competition among lexical representations. ...
... The results of a series of experiments using several different tasks and types of stimuli are accounted for by an adaptive resonance framework for spoken word recognition that embodies two levels of representation (a lexical level and a sublexical level). The hypothesis of two levels of representation with dissociable and distinct effects on processing reveals, in part, the complexity of the recognition process: Predicting processing of spoken words involves simultaneous consideration of the nature of the task used to interrogate the recognition process, the level of representation that dominates the response (Cutler & Norris, 1979; Foss & Blank, 1980), and the probabilistic phonotactics and similarity neighborhood structure of the spoken stimulus. ...
Article
Recent work (Vitevitch & Luce, 1998) investigating the role of phonotactic information in spoken word recognition suggests the operation of two levels of representation, each having distinctly different consequences for processing. The lexical level is marked by competitive effects associated with similarity neighborhood activation, whereas increased probabilities of segments and sequences of segments facilitate processing at the sublexical level. We investigated the two proposed levels in six experiments using monosyllabic and specially constructed bisyllabic words and nonwords. The results of these studies provide further support for the hypothesis that the processing of spoken stimuli is a function of both facilitatory effects associated with increased phonotactic probabilities and competitive effects associated with the activation of similarity neighborhoods. We interpret these findings in the context of Grossberg, Boardman, and Cohen's (1997) adaptive resonance theory of speech perception.
... Segmental nodes may then in turn share their activation advantages with the lexical nodes to which they are connected. Or, alternatively, responses may be driven off of either the lexical or the segmental levels, depending on current activation values at either level (see Cutler & Norris, 1979). If segmental activation dominates early in processing, the advantage afforded by lexical embeddedness may result, at least early on, in faster processing for carrier words with initially embedded items. ...
Article
Full-text available
A large number of multisyllabic words contain syllables that are themselves words. Previous research using cross-modal priming and word-spotting tasks suggests that embedded words may be activated when the carrier word is heard. To determine the effects of an embedded word on processing of the larger word, processing times for matched pairs of bisyllabic words were examined to contrast the effects of the presence or absence of embedded words in both 1st- and 2nd-syllable positions. Results from auditory lexical decision and single-word shadowing demonstrate that the presence of an embedded word in the 1st-syllable position speeds processing times for the carrier word. The presence of an embedded word in the 2nd syllable has no demonstrable effect.
... Two models that are distinguished by whether they allow interaction between the lexical and phonemic level are the TRACE model and the autonomous race model (Cutler, Mehler, Norris, & Segui, 1987; Cutler & Norris, 1979). In the TRACE model, there are three levels: the feature level, the phoneme level, and the lexical level. ...
Article
Full-text available
A series of experiments was conducted to determine whether the effects of lexical status on phonetic categorization were influenced by stimulus naturalness (replicating M. W. Burton, S. R. Baum, & S. E. Blumstein, 1989, who manipulated the intrinsic properties of the stimuli) and by stimulus quality (presenting the stimuli in white noise). The experiments compared continua varying in voice onset time (VOT) only to continua covarying VOT and amplitude of the burst and aspiration noise in no-noise and noise conditions. Results overall showed that the emergence of a lexical effect was influenced by stimulus quality but not by stimulus naturalness. Contrary to previous findings, significant lexical effects failed to emerge in the slower reaction time ranges. These results suggest that stimulus quality contributes to lexical effects on phonetic categorization, whereas stimulus naturalness does not.
... For instance, most abstractionist models do not allow top-down feedback (e.g., Cohort, FLMP; Oden & Massaro, 1978; Massaro & Oden, 1980). Other models like RACE (Cutler & Norris, 1979; Cutler, Mehler, Norris, & Segui, 1987) and Shortlist (both also autonomous) propose parallel and independent phonemic and lexical processing routes, whilst a minority implements top-down feedback directly (TRACE). Still others like Merge (Norris, McQueen, & Cutler, 2000) also have two parallel and independent processing routes like RACE, from prelexical processing units to lexical units, and from prelexical processing units to phoneme decision nodes, but also feedback from the lexical level to phoneme decision nodes (but not to prelexical acoustic processing nodes). ...
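The connectivity that distinguishes Merge from both RACE and TRACE can be made explicit in a few lines. The following sketch is a schematic under our own assumptions (the activation values and weights are invented); its only point is the direction of information flow: lexical activation reaches the phoneme decision nodes but never the prelexical units themselves.

```python
def merge_decision(prelexical_act, lexical_act, w_pre=0.7, w_lex=0.3):
    """Phoneme decision nodes in a Merge-style architecture combine
    bottom-up (prelexical) evidence with lexical evidence. Lexical
    activation feeds the decision nodes only; the prelexical
    representation is read, never written."""
    return w_pre * prelexical_act + w_lex * lexical_act

# A Ganong-style case: ambiguous bottom-up evidence for /g/ at the
# onset of "?ift", where the lexicon supports "gift" but not "kift".
prelexical_g = 0.5        # ambiguous signal evidence for /g/
lexical_support_g = 0.9   # lexical evidence favouring /g/

print(merge_decision(prelexical_g, lexical_support_g))  # biased toward /g/
print(prelexical_g)  # unchanged: no feedback to prelexical units
```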
Chapter
Chilean Spanish is special in that it displays particularly high degrees of lenition and elision of [β̞], [ð̞] and [ɣ̞] (Pérez, 2007). Interestingly, Chilean Spanish listeners can recover elided units effortlessly, which challenges the assumptions of some lexical access models, such as strong bottom-up abstractionist models (Mitterer & Ernestus, 2006). This proposal reports on a series of perception experiments in which synthetic continua from full approximants to elided variants were presented in several informational conditions. Results showed that increasing the amount of acoustic information and the number of semantic cues had a significant effect on listeners' responses, enabling lexical effects and minimizing phonological recovery. Moreover, these effects were different for the three consonants being tested, probably due to existing links between production and perception. These findings are discussed in light of previous research on lexical effects and recovery, and lexical access models in general.
... A dual-representation model of phonology is also consistent with several strands of thinking in psycholinguistics. For example, Cutler and Norris's (1979) dual-route model of phoneme monitoring (as implemented in Norris 1994) holds that phonemes may be detected by a phonetic route, in a speech mode of listening, or via a lexical route where the presence of the phoneme is deduced from the fact that a word containing the phoneme has just been detected. They identified a number of factors that influence which of these two routes will be fastest. ...
... In a recent study with the tone-monitoring technique, Lobina, Demestre, and García-Albea (2018) point out that the data reported in Abrams and Bever (1969) and Bever and Hurtig (1975) may not be as robust as commonly thought, as some important factors were not considered at the time, and as a result, the experimental manipulation then used may have been confounded. Monitoring tasks, in general, exhibit a tendency for RTs to decrease across a sentence (Cutler & Norris, 1979), a factor that was not controlled for by Bever et al. In fact, the tone both Abrams and Bever (1969) and Bever and Hurtig (1975) placed at the end of clauses was typically the first tone in a series, and their data clearly show a decrease in RTs from the first to the last tone position. ...
Article
Full-text available
Monitoring tasks have long been employed in psycholinguistics, and the end-of-clause effect is possibly the best-known result of using this technique in the study of parsing. Recent results with the tone-monitoring task suggest that tone position modulates cognitive load, as reflected in reaction times (RTs): the earlier the tone appears in a sentence, the longer the RTs. In this study, we show that verb position is also an important factor. In particular, changing the time/location at which verb–noun(s) dependencies are computed during the processing of a sentence has a clear effect on cognitive load and, as a result, on the resources that can be devoted to monitoring and responding to a tone. This study is based on two pieces of evidence. We first report the acceptability ratings of six word orders in Spanish and then present monitoring data with three of these different word orders. Our results suggest that RTs tend to be longer if the verb is yet to be processed, pointing to the centrality of a sentence's main verb in parsing in general.
... What we consciously perceive is the result of a cognitive construction, influenced to a non-negligible extent by our knowledge and beliefs; the central hypothesis of the modularity thesis is that sensory information takes priority in the first stages of processing, and that these stages are little affected by our knowledge or beliefs. The results of Warren and of Ganong (see above) can be accounted for by a non-interactive processing model by assuming that subjects base their response not only on a phonemic representation extracted from the signal, but also on a post-lexical representation available once the carrier word has been recognized (Foss & Blank, 1980; Cutler & Norris, 1979; Cutler, Mehler, Norris, & Segui, 1987). Ultimately, the distinction between interactive and modular turns on the question of whether a higher level (e.g. ...
Thesis
We present a series of experiments showing that the perceptual system automatically reconstructs syllabic structure during speech perception. Our results reject the notion that syllables are atoms of speech perception, and show instead that syllabic structure acts as an organizing frame for the decoding of the speech signal.
... In this case, participants are asked to decide whether a given acoustic stimulus contains the segment [b] or not by pressing the 'b' or 'no b' key after the presentation of each stimulus. This paradigm is superior to traditional phoneme detection (Cutler & Norris, 1979; Cutler et al., 1987), in which participants are asked to press a key only when the target sound is recognized. When a participant presses no key, it is unclear whether this is because the listener judged the sound to be absent, or simply because their reaction time was too long and they failed to press the key in time. ...
Article
Full-text available
Linguistic perception is conditioned by phonology. In this study, this claim is tested on empirical data coming from a set of Emilia dialects (Italy), which are reported to show regressive voicing assimilation (RVA). The hypothesis is that, if RVA is part of the phonological competence of the speakers, consonant clusters whose segments display opposite voicing specifications are misperceived as showing the same specification. The analysis of the production results shows that RVA is systematic, although partial (de)voicing can be found too and sometimes the process of RVA is not applied, presumably under the influence of Standard Italian. Similarly, RVA is shown to variably constrain perception. Taken together, the production and perception data suggest that RVA, rather than a fully systematic phonological process, should be considered a phonetic implementation process applying at the phonetics-phonology interface.
... In autonomous models (e.g., Cutler & Norris, 1979; Norris, McQueen, & Cutler, 2000), on the other hand, phonemic activation itself is uninfluenced by linguistic knowledge. In the Merge model (Norris et al., 2000), the Ganong effect is attributed to a combination (or merging) of the effects of phonemic activation and lexical activation in the service of making the phonemic decisions required in categorization experiments. ...
Article
Full-text available
Listeners tend to categorize an ambiguous speech sound so that it forms a word with its context (Ganong, 1980). This effect could reflect feedback from the lexicon to phonemic activation (McClelland & Elman, 1986), or the operation of a task-specific phonemic decision system (Norris, McQueen, & Cutler, 2000). Because the former account involves feedback between lexical and phonemic levels, it predicts that the lexicon’s influence on phonemic decisions should be delayed and should gradually increase in strength. Previous response time experiments have not delivered a clear verdict as to whether this is the case, however. In 2 experiments, listeners’ eye movements were tracked as they categorized phonemes using visually displayed response options. Lexically relevant information in the signal, the timing of which was confirmed by separate gating experiments, immediately increased eye movements toward the lexically supported response. This effect on eye movements then diminished over the course of the trial rather than continuing to increase. These results challenge the lexical feedback account. The present work also introduces a novel method for analyzing data from ‘visual-world’ type tasks, designed to assess when an experimental manipulation influences the probability of an eye movement toward the target.
... However, another class of models explains lexical effects without lexical feedback. In an autonomous model such as Race (Cutler & Norris, 1979), phoneme identification is performed in parallel along two different routes, phonemic or lexical. Phoneme identification along the phonemic route is based on phonetic information, whereas the lexical route is only available after word recognition since it relies upon phonological descriptions stored in the lexicon. ...
Article
Full-text available
Lexical effects on speech perception are not very reliable and they have been shown to depend on various factors, among which word length. In the current models of phonemic decision, lexical effects are conceived as arising from top-down processing, with or without feedback, depending on the model. Lexical effects tend to be stronger in longer words, which can be ascribed to an increase in the amount of lexical evidence. The present study was aimed at collecting further evidence on this point. The existence of lexical effects was confirmed in a series of two experiments on voicing identification in French initial stops. The effects were present for stops in monosyllables and polysyllables whereas they were almost absent in bisyllables. We tentatively explain the U-shaped relationship between lexical evidence and phonemic identification by two different mechanisms which would both be weakly effective with moderate amounts of lexical evidence (in bisyllables). With fairly large amounts of lexical evidence (in polysyllables) the lexical effect would be due to the fairly complex top-down processes postulated in the literature. With low amounts of lexical evidence (in monosyllables), a much simpler mechanism based on a re-analysis of the acoustic input would be at work.
... The second is the theoretical assumption of interaction (lexical-sublexical feedback, which we discuss in detail in Section 4). Shortlist (Norris, 1994; Norris, McQueen, & Cutler, 1995) is a fundamentalist simulation model that combines aspects of autonomous, feedforward models like Race (Cutler & Norris, 1979) and Cohort II with the competition dynamics of TRACE. A primary motivation in the development of this model was to keep positive characteristics of TRACE (e.g., competition dynamics) while avoiding weaknesses (e.g., the large number of nodes and connections due to reduplication of nodes over time). ...
... Note that the originally posited route from the speech signal to phonemic perception units should be retained so that the production-perception asymmetry would be explained. If the latter route (i.e., from the speech signal to the phonemic perception units) is also part of an 'indirect' route for lexical access, leading from the speech signal to the lexicon through phonemic perception units, the resulting model would have dual routes for lexical access, as in the Race model (Cutler & Norris, 1979, cited by Norris et al., 2000). ...
Thesis
Phonemic perception exhibits coarticulation sensitivity, phonotactic sensitivity and lexical sensitivity. Three kinds of models of speech perception are found in the literature, which embody different answers to the question of how the three kinds of sensitivity are related to each other: two-step models, one-step models and lexicalist models. In two-step models (Church, 1987), phonemes are first extracted, and phonotactic repairs are subsequently made on the obtained phoneme string; both phonemic categorization and phonotactic repair are sublexical, and coarticulation sensitivity should only affect initial (pre-phonotactic) phonemic categorization. In one-step models (Dehaene-Lambertz et al., 2000; Dupoux et al., 2011; Mehler et al., 1990), phonemic categorization and phonotactic repair are sublexical and simultaneous; phonotactic repairs themselves depend on coarticulation cues. Such models can be implemented in two different versions: suprasegmental matching, according to which a speech signal is matched against phonotactics-respecting suprasegmental units (such as syllables), rather than phonemes, and slot filling, according to which a speech signal is matched against phonemes as fillers for slots in phonotactics-respecting suprasegmental units. In lexicalist models (Cutler et al., 2009; McClelland & Elman, 1986), coarticulation sensitivity and/or phonotactic sensitivity reduce to lexical sensitivity. McClelland & Elman (1986) claim a lexicalist reduction of phonotactic sensitivity; Cutler et al. (2009) make a claim implying lexicalist reductions both of phonotactic sensitivity and of coarticulation sensitivity. This thesis attempts to distinguish among those models. Since different perceptual processes are assumed in these three models (whether sublexical units are perceived, or how many stages are involved in perceptual processing), our understanding of how speech perception works crucially depends on the relative superiority of those three kinds of models. Based on the results available in the past literature on the one hand, and on the results of perceptual experiments with Japanese listeners testing their coarticulation sensitivity in different settings on the other, this thesis argues for the superiority of the slot filling version of one-step models over the others. According to this conclusion, phonemic parsing (categorization) and phonotactic parsing (repair) are separate but parallel sublexical processes.
... Early models argued that this competition can be implemented without any interaction between words. In these models, as time passes and more information becomes available, previously compatible candidates become incompatible and thus fall out of competition (Cutler & Norris, 1979; Marslen-Wilson, 1987). However, later evidence for some form of interference between active lexical candidates suggested that this kind of race process is not sufficient for describing how words are recognized; there is an additional need for some form of interaction or inhibition among words. ...
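The two schemes contrasted in this passage, pure candidate elimination versus interaction among active candidates, can be sketched side by side. The word set, activation values, and inhibition rule below are all invented for illustration:

```python
def eliminate(candidates, heard_so_far):
    """Race-style competition: candidates that no longer match the
    input simply drop out; the survivors never affect one another."""
    return {w: a for w, a in candidates.items() if w.startswith(heard_so_far)}

def inhibit(candidates, strength=0.1):
    """Interaction-style competition: one update step in which each
    active word is suppressed in proportion to the total activation
    of its competitors."""
    total = sum(candidates.values())
    return {w: max(0.0, a - strength * (total - a))
            for w, a in candidates.items()}

cohort = {"net": 0.6, "neck": 0.5, "nest": 0.4, "map": 0.3}
survivors = eliminate(cohort, "ne")  # 'map' falls out of competition
print(survivors)
print(inhibit(survivors))  # survivors additionally suppress one another
```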
Article
Full-text available
Language learning is generally described as a problem of acquiring new information (e.g., new words). However, equally important are changes in how the system processes known information. For example, a wealth of studies has suggested dramatic changes over development in how efficiently children recognize familiar words, but it is unknown what kind of experience-dependent mechanisms of plasticity give rise to such changes in real-time processing. We examined the plasticity of the language processing system by testing whether a fundamental aspect of spoken word recognition, lexical interference, can be altered by experience. Adult participants were trained on a set of familiar words over a series of 4 tasks. In the high-competition (HC) condition, tasks were designed to encourage coactivation of similar words (e.g., net and neck) and to require listeners to resolve this competition. Tasks were similar in the low-competition (LC) condition, but did not enhance this competition. Immediately after training, interlexical interference was tested using a visual world paradigm task. Participants in the HC group resolved interference to a fuller degree than those in the LC group, demonstrating that experience can shape the way competition between words is resolved. TRACE simulations showed that the observed late differences in the pattern of interference resolution can be attributed to differences in the strength of lexical inhibition. These findings inform cognitive models in many domains that involve competition/interference processes, and suggest an experience-dependent mechanism of plasticity that may underlie longer term changes in processing efficiency associated with both typical and atypical development.
... Thus, under the autonomous view, the observed biases reflect the integration of multiple information sources, but, crucially, this integration does not affect lower level phonetic processing itself (see e.g., Norris, McQueen, & Cutler, 2000; McQueen, Jesse, & Norris, 2009). A succession of autonomous models has been proposed in the literature, including Race (Cutler & Norris, 1979; Cutler, Mehler, Norris, & Segui, 1987), Shortlist (Norris, 1994), Merge (Norris et al., 2000), and Shortlist's Bayesian implementation (Norris & McQueen, 2008). Although each varies in its details, none allows for higher level modulation of phonetic processing through feedback (McQueen et al., 2006; see also the Fuzzy Logical Model of Speech Perception; Massaro, 1989). ...
Article
Full-text available
Although much evidence suggests that the identification of phonetically ambiguous target words can be biased by preceding sentential context, interactive and autonomous models of speech perception disagree as to the mechanism by which higher level information affects subjects' responses. Some have suggested that the time course of context effects is incompatible with interactive models (e.g., TRACE). Two experiments examine this issue. In Experiment 1, subjects heard noun- and verb-biasing sentence contexts (e.g., Valerie hated the . . . vs. Brett hated to . . .), followed by stimuli from 2 voice-onset time continua: bay-pay (noun-verb) versus buy-pie (verb-noun). Consistent with prior research, identification of phonetically ambiguous targets was biased by the preceding context, and the size of this bias diminished in slower compared with faster responses. In Experiment 2, tokens from the same continua were embedded among filler target words beginning with /b/ or /p/ to elicit phonemically driven identification decisions and discourage word-level strategies. Results again revealed contextually biased responding, but this bias was as strong in slow as in fast responses. Together, these results suggest that phoneme identification decisions reflect robust, lasting top-down effects of lexical feedback on prelexical representations, as predicted by interactive models of speech perception.
... The absence of interaction between the phonological information and the corresponding orthographic information is inherent to autonomous models such as RACE or MERGE (Cutler & Norris, 1979; Norris, McQueen, & Cutler, 2000). But, strangely, the role of orthography is not mentioned either in a highly interactive model such as TRACE (McClelland & Elman, 1986), although this could easily accommodate the impact of orthography on spoken word recognition by means of interactions between the representations activated at different levels (i.e., features, phonemes, and words). ...
Article
Full-text available
http://dx.doi.org/10.5007/2175-8026.2012n63p161 The levels-of-processing approach to speech processing (cf. Kolinsky, 1998) distinguishes three levels, from bottom to top: perception, recognition (which involves activation of stored knowledge) and formal explicit analysis or comparison (which belongs to metalinguistic ability), and assumes that only the former is immune to literacy-dependent knowledge. In this contribution, we first briefly review the main ideas and evidence supporting the role of learning to read in the alphabetic system in the development of conscious representations of phonemes, and we contrast conscious and unconscious representations of phonemes. Then, we examine in detail recent compelling behavioral and neuroscientific evidence for the involvement of orthographic representation in the recognition of spoken words. We conclude by arguing that there is a strong need for theoretical re-elaboration of the models of speech recognition, which typically have ignored the influence of reading acquisition.
... On this view, if top-down information is integrated directly with sensory information, an organism ipso facto loses the possibility of veridical perception, as there is no distinction between information in the environment and information in the organism. Autonomous models account for lexical effects on sublexical tasks by proposing parallel, competing lexical and sublexical routes (as in the Race model; Cutler & Norris, 1979), or that the locus of sublexical decisions is, counterintuitively, post-lexical. In the Merge model (Norris et al., 2000), for example, there are two banks of phoneme units. ...
Chapter
Full-text available
Spoken word recognition is a distinct subsystem providing the interface among low-level perception and cognitive processes of retrieval, parsing, and interpretation. The narrowest conception of the process of recognizing a spoken word is that it starts from a string of phonemes, establishes how these phonemes should be grouped to form words, and passes these words onto the next level of processing. Some theories, though, take a broader view and blur the distinctions among speech perception, spoken word recognition, and sentence processing. The broader view of spoken word recognition has empirical and theoretical motivations. One consideration is that by assuming that the input to spoken word recognition is a string of abstract, phonemic category labels, one implicitly assumes that the nonphonemic variability carried on the speech signal is not relevant for spoken word recognition and higher levels of processing. However, if this variability and detail is not random but is lawfully related to linguistic categories, the simplifying assumption that the output of speech perception is a string of phonemes may actually be a complicating assumption.
... However, autonomous models, which do not allow top-down processes (an effect of word-knowledge on phoneme perception is one example of such a process), have had some success in accounting for such findings in other ways. One autonomous model is the race model of Cutler and Norris (1979) and its descendants (e.g., Norris, McQueen, & Cutler, 2000). The model has two routes that operate in parallel. ...
Chapter
To produce and comprehend words and sentences, people use their knowledge of language structure; their knowledge of the situation they are in, including the previous discourse and the local situation; and their cognitive abilities, including memory, attention, and motor control. In this chapter, we explore how competent adult language users bring such knowledge and abilities to bear on the tasks of comprehending spoken and written language and producing spoken language. We emphasize experimental data collected using the tools of cognitive psychology, touching only briefly on language development, disordered language, and the neural basis of language. We also review some of the major theoretical controversies that have occupied the field of psycholinguistics, including the role that linguistic analyses of language structure should play and the debate between modular and interactive views. We also present some of the theoretical positions that have proven successful in guiding our understanding of language processing. We conclude by discussing the need to integrate studies of language comprehension and language production and pointing to emerging research topics. Keywords: psycholinguistics; auditory word recognition; reading; lexical access; sentence comprehension; word production; sentence production
... The locus of the observed effects is an issue of contention between the proponents of autonomous and interactive models of speech processing. The "race" model is an autonomous processing model that was proposed by Cutler and Norris (1979) to account for a diverse set of findings on phoneme monitoring using words and nonwords. Cutler and Norris argued that a dual-outlet model, in which a prelexical phonetic route competes (i.e., races, hence the model's name) with a lexically derived phonemic route, provided the best account of a host of seemingly contradictory findings (reviewed therein). ...
... Logan 1985 and Strange 1995) provide evidence for the language-specificity of the mapping from perceptual information onto phonological form. In psycholinguistics, this mapping is referred to as prelexical speech perception, and complemented by word recognition, where the phonological form is mapped onto a form in the lexicon, see Cutler and Norris (1979), Cutler et al. (1987), and McQueen and Cutler (1997). Boersma (2006a et seq.) employed such a two-staged speech recognition in the BiPhon model (but see already Boersma 1998), where the mapping between forms is formalized with OT constraints and occurs in parallel. ...
Article
Full-text available
In this paper, I argue that a regular diachronic sound change is the result of a different interpretation of the same auditory information, as put forward by Ohala (1981 et seq.). Whereas Ohala describes such an account of sound change as purely phonetic, I show that it involves phonological knowledge, namely the language-specific use of auditory cues and their mapping onto language-specific phonological categories. Two diachronic developments of retroflex segments, namely retroflexion of rhotic plus coronal consonant sequences in Norwegian and retroflexion of labialised coronal obstruents in Minto-Nenana, illustrate these assumptions. For both, the differences across generations are modelled in Optimality Theory with the help of language-specific cue constraints in a perception grammar (following Boersma 1997 et seq.). This approach is shown to be superior to the descriptive approach of cue re-association proposed by Ohala because it provides a formal account that includes differences in cue weighting (especially the disregard of cues that became unreliable) and differences in emergent phonological categories.
... Proponents of interactive models have pointed out that lexical feedback is in line with research showing that lexical knowledge allows listeners to quickly adapt to speakers with unfamiliar pronunciation [54], but proponents of feed-forward models have countered that feedback for perceptual learning is different from online feedback as is implemented in TRACE [55,56]. Shortlist [50] was developed in response to the criticism of duplication and lexical feedback in TRACE, and combines aspects of feed-forward models, such as the phoneme decision model Race [57] and Cohort II, with the competition mechanism of TRACE. The duplication of the entire network for each input feature in TRACE is avoided by implementing Shortlist as a two-stage model in which the generation of lexical candidates and the competition process are separated (see Figure 3). ...
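The two-stage design described here separates a cheap feed-forward search from a small competition network. A minimal sketch under our own assumptions (toy lexicon, letter-overlap matching instead of incremental phoneme-by-phoneme evaluation, and an invented inhibition rule) might look like this:

```python
LEXICON = ["ship", "shipment", "chip", "sheep", "mint"]

def bottom_up_match(word, input_form):
    """Crude bottom-up goodness of fit: proportion of aligned letters.
    A real model would score phonemes incrementally over time."""
    overlap = sum(a == b for a, b in zip(word, input_form))
    return overlap / max(len(word), len(input_form))

def generate_shortlist(input_form, max_candidates=3):
    """Stage 1 (feed-forward): keep only the few best-matching words,
    avoiding TRACE-style duplication of the whole network over time."""
    scored = {w: bottom_up_match(w, input_form) for w in LEXICON}
    best = sorted(scored, key=scored.get, reverse=True)[:max_candidates]
    return {w: scored[w] for w in best}

def compete(acts, steps=5, inhibition=0.15):
    """Stage 2: shortlisted candidates inhibit one another in
    proportion to their activation until one dominates."""
    for _ in range(steps):
        acts = {w: max(0.0, a - inhibition * (sum(acts.values()) - a))
                for w, a in acts.items()}
    return max(acts, key=acts.get)

shortlist = generate_shortlist("shipment")
print(shortlist, "->", compete(shortlist))
```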
Article
Full-text available
All words of the languages we know are stored in the mental lexicon. Psycholinguistic models describe in which format lexical knowledge is stored and how it is accessed when needed for language use. The present article summarizes key findings in spoken-word recognition by humans and describes how models of spoken-word recognition account for them. Although current models of spoken-word recognition differ considerably in the details of implementation, there is general consensus among them on at least three aspects: multiple word candidates are activated in parallel as a word is being heard, activation of word candidates varies with the degree of match between the speech signal and stored lexical representations, and activated candidate words compete for recognition. No consensus has been reached on other aspects such as the flow of information between different processing levels, and the format of stored prelexical and lexical representations. WIREs Cogn Sci 2012, 3:387-401. doi: 10.1002/wcs.1178
Article
Spoken‐word recognition is an efficient and generally error‐free process that occurs under a variety of speaking and listening conditions. The talk will focus on the mapping process between the speech signal and access of form and meaning. The nature of the representation that supports spoken‐word recognition will be discussed with a focus on the consequence of ambiguity and mismatching information. Research has been conducted in the past few years suggesting that activation of lexical representations is accomplished via feature mapping. It is argued that this architecture permits lexical activation given incomplete or erroneous input. Phonological variation and some recent work concerning representation and processing of common variants will also be discussed.
... This phoneme-monitoring task is a speeded-response task as processing time is limited (Connine & Titone, 1996). Despite its name, this task reflects lexical processing when meaningful sentences are monitored (Cutler, Mehler, Norris, & Segui, 1987; Cutler & Norris, 1979; Mirman, McClelland, Holt, & Magnuson, 2008). Phoneme monitoring can thus be used to measure the effect of context on the speed of lexical access. ...
Article
Full-text available
Many older listeners report difficulties in understanding speech in noisy situations. Working memory and other cognitive skills may modulate older listeners' ability to use context information to alleviate the effects of noise on spoken-word recognition. In the present study, we investigated whether verbal working memory predicts older adults' ability to immediately use context information in the recognition of words embedded in sentences, presented in different listening conditions. In a phoneme-monitoring task, older adults were asked to detect as fast and as accurately as possible target phonemes in sentences spoken by a target speaker. Target speech was presented without noise, with fluctuating speech-shaped noise, or with competing speech from a single distractor speaker. The gradient measure of contextual probability (derived from a separate offline rating study) affected the speed of recognition. Contextual facilitation was modulated by older listeners' verbal working memory (measured with a backward digit span task) and age across listening conditions. Working memory and age, as well as hearing loss, were also the most consistent predictors of overall listening performance. Older listeners' immediate benefit from context in spoken-word recognition thus relates to their ability to keep and update a semantic representation of the sentence content in working memory.
... Frauenfelder et al.'s data, likewise, are problematic for an interactive model, but not for an autonomous model, because autonomous theory predicts that the perception of nonwords should be insulated from lexical processing. The autonomous race model (Cutler & Norris, 1979), as an example, predicts the pattern of RT range results that McQueen (1991) found for word-final categorisation. In this model, lexical and pre-lexical phoneme identification procedures are considered to race; the procedure which more rapidly produces an output on a particular trial will be responsible for the phonetic decision on that trial. ...
... Some researchers have argued that some words are decomposed while others are not. Wurm (1997) proposed a dual-route model based on the idea of parallel, competing processes [cf. the Race model of Cutler and Norris (1979)]. In his model, morphologically complex words are processed simultaneously as full forms and as analyzed constituent morphemes. ...
Article
This study compared models of auditory word recognition as they relate to the processing of polymorphemic pseudowords. Semantic transparency ratings were obtained in a preliminary rating study. The effects of morphological structure, semantic transparency, prefix likelihood, and morphemic frequency measures were examined in a lexical decision experiment. Reaction times and errors were greater for pseudowords carrying a genuine prefix, and this effect was largest for pseudowords that also carried a genuine root. While results were grossly similar for bound and free root types, there were also some important differences. Regression analyses provided additional support for decompositional models: semantic transparency, prefix likelihood, prefix frequency, and root frequency all affected pseudoword rejection times. The results are most compatible with a modification of Taft's (1994) interactive-activation model or a dual-route model.
... By contrast, in order to capture context effects, bottom-up models necessarily allow two routes which can influence phonemic judgments: via phonemic and lexical representations. Various models, both non-connectionist (Cutler & Norris, 1979) and connectionist (Norris, McQueen & Cutler, in press), have been proposed in opposition to TRACE. These models exploit two routes, and hence allow for the possibility of "attentional" switching between them. ...
... Yet, since any phoneme or fragment detection task requires intentional matching between the target and the carrier, it is also possible that at least part of the syllabic effect results from a syllable-structured post-lexical phonological representation. For example, an alternative account of the attentional allocation results reported in the previous section is that, as in other phoneme detection tasks, listeners base their responses on either lexical or pre-lexical information, depending on the experimental conditions (the 'dual code hypothesis'; Cutler & Norris, 1979). As regards the original fragment detection task, both Mehler et al. (1981) and Cutler et al. (1986) assumed that it taps pre-lexical processing. ...
Article
Full-text available
In recognising spoken words, the retrieval of a unique mental representation from the speech input involves the exceedingly difficult task of locating word boundaries in a quasi-continuous stimulus and of finding the single representation that corresponds to highly variable acoustical forms. Many cognitive psycholinguists have proposed that these segmentation and categorisation problems are easier to solve at the sublexical than at the word level: Some sublexical representation would mediate the mapping between the acoustic signal and the mental lexicon. Accordingly, much effort has gone into disclosing a hypothesised universal perceptual building block, for example the syllable. More recent advances in speech processing research indicate, however, that speakers of different languages process speech by relying on units or segmentation strategies that are appropriate to the phonological properties of their maternal tongue. Recent data on this topic will be reviewed, with special emphasis on the stage-processing approach of the experimental situations and phenomena reported. For example, the syllabic effects observed in fragment detection and the phenomenon of blending dichotically presented words are discussed. It will be argued that although there is a strong case for language specificity in listeners' intuitions about the phonological structure of their language as well as in word recognition, less evidence is available regarding the early perceptual stages.
... Ce que nous percevons consciemment est le résultat d'une construction cognitive, influencée en partie non négligeable par nos connaissances et nos croyances ; l'hypothèse essentielle de la thèse de la modularité est qu'il y a une priorité aux informations sensorielles pour ce qui concerne les premières étapes de traitement, et que celles-ci sont peu affectées par nos connaissances ou croyances. On peut rendre compte des résultats de Warren et de Ganong (cf supra) avec un modèle de traitement non interactif en supposant que le sujet fonde sa réponse, non seulement sur une représentation phonémique extraite du signal, mais aussi sur une représentation post-lexicale après que le mot porteur ait été reconnu (Foss & Blank, 1980;Cutler & Norris, 1979;Cutler, Mehler, Norris, & Segui, 1987). Finalement, la distinction entre interactif et modulaire porte sur la question de savoir si un niveau supérieur (p.ex. ...
Article
Psycholinguists define spoken word recognition (SWR) as, roughly, the processes intervening between speech perception and sentence processing, whereby a sequence of speech elements is mapped to a phonological wordform. After reviewing points of consensus and contention in SWR, we turn to the focus of this review: considering the limitations of theoretical views that implicitly assume an idealized (neurotypical, monolingual adult) and static perceiver. In contrast to this assumption, we review evidence that SWR is plastic throughout the life span and changes as a function of cognitive and sensory changes, modulated by the language(s) someone knows. In highlighting instances of plasticity at multiple timescales, we are confronted with the question of whether these effects reflect changes in content or in processes, and we consider the possibility that the two are inseparable. We close with a brief discussion of the challenges that plasticity poses for developing comprehensive theories of spoken language processing.
Article
Processing speech can be slow and effortful for children, especially in adverse listening conditions, such as the classroom. This can have detrimental effects on children’s academic achievement. We therefore asked whether primary school children’s speech processing could be made faster and less effortful via the presentation of visual speech cues (speaker’s facial movements), and whether any audio-visual benefit would be modulated by the presence of noise or by characteristics of individual participants. A phoneme monitoring task with concurrent pupillometry was used to measure 7- to 11-year-old children’s speech processing speed and effort, with and without visual cues, in both quiet and noise. Results demonstrated that visual cues to speech can facilitate children’s speech processing, but that these benefits may also be subject to variability according to children’s motivation. Children showed faster processing and reduced effort when visual cues were available, regardless of listening condition. However, examination of individual variability revealed that the reduction in effort was driven by the children who performed better on a measure of phoneme isolation (used to quantify how difficult they found the phoneme monitoring task).
Article
The interaction of contextual, high-level linguistic knowledge and the listener’s attention to low-level phonetic details has been the subject of a large body of research in speech perception for several decades. In the current paper, I investigate this interaction by considering the specific phenomenon of word predictability and its role in modulating the listener’s attention to subphonemic details of the acoustic signal. In the first experiment, subjects are presented with a discrimination task in which target words are presented in either predictable or unpredictable sentential context and then repeated in isolation, being either acoustically identical or subtly different. The subjects more accurately discriminate contextually unpredictable words, suggesting more attention to the phonetic details of words in unpredictable contexts. In the second experiment, considering the predictions of exemplar theory, I test whether this perceptual bias could result in changes in production. In this experiment, in which subjects heard and repeated sentences, I find a significant effect of word predictability on how close the subjects’ productions were to the model’s, which suggests a role of predictability on phonetic accommodation. The results of these experiments contribute to our understanding of stored exemplars and suggest the influence of contextual predictability in sound change.
Article
High-variability phonetic training is effective in the acquisition of foreign language sounds. Previous studies have largely focused on small sets of contrasts, and have not controlled for the quantity of prior or simultaneous exposure to new sounds. The current study examined the effectiveness of phonetic training in full-inventory foreign language consonant acquisition by listeners with no previous exposure to the language. Chinese adult listeners underwent an intensive training programme, bracketed by tests that measured both assimilation of foreign sounds to native categories, and foreign category identification rates and confusions. Very rapid learning was evident in the results, with initial misidentification rates halving by the time of the mid-test, and continuing to fall in subsequent training sessions. Changes as a result of training in perceptual assimilation together with improved identifications and reduced response dispersion suggest an expansion of listeners’ native categories to accommodate the foreign sounds and an incipient process of foreign language category formation.
Chapter
Syllables are identified faster than features or phonemes, at least in the initial position of the target item (Savin & Bever, 1970). This demonstration has led many authors to suggest that the syllable is the basic segment in speech perception and that phonemes can only be derived from the analysis of the perceptually primary segment, namely, the syllable. Considerable disagreement remains as to how this observation ought to be interpreted. Savin & Bever considered that the phoneme had linguistic rather than psychological reality. Their interpretation, however, came up against considerable criticism from several authors, including McNeill & Lindig (1973), Healy & Cutting (1976), and Foss & Swinney (1973).
Article
We present an auditory presentation technique called segmented binaural presentation. The technique builds on the dichotic listening paradigm (Shankweiler & Studdert-Kennedy, 1967; Studdert-Kennedy & Shankweiler, 1970) and segmented lexical presentation (Libben, 2003; Bertram, Kuperman, Baayen, & Hyönä, 2011). The technique allows the first part of a word to be presented to one ear and the second part of the word to be presented to the other ear. The experimenter may thus manipulate whether a stimulus is segmented in this binaural manner and, if it is segmented, the location of the binaural segmentation within the word. We discuss how the technique may be implemented on the Macintosh platform, using PsyScope and freely available software for audio file creation. We also report on a test implementation of the technique using suffixed and compound English words in a lexical decision task. Results suggest that the technique differentiates between segmentation that occurs within and between compound constituents.
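To make the segmentation manipulation concrete, here is a minimal sketch (in Python rather than the authors' PsyScope setup; the file names, the word, and the split point are hypothetical) that builds a stereo WAV file in which the first part of a word plays in the left ear and the remainder in the right ear.

```python
# Minimal sketch of a binaurally segmented stimulus (hypothetical example,
# not the authors' PsyScope implementation). Assumes a mono, 16-bit WAV
# recording of the compound "teacup" and a split point at the constituent
# boundary.
import wave
import numpy as np

SPLIT_S = 0.42  # hypothetical boundary between "tea" and "cup", in seconds

with wave.open("teacup_mono.wav", "rb") as f:  # hypothetical input file
    rate = f.getframerate()
    samples = np.frombuffer(f.readframes(f.getnframes()), dtype=np.int16)

split = int(SPLIT_S * rate)
left, right = samples.copy(), samples.copy()
left[split:] = 0   # left ear hears only the first constituent
right[:split] = 0  # right ear hears only the second constituent

with wave.open("teacup_binaural.wav", "wb") as f:
    f.setnchannels(2)
    f.setsampwidth(2)  # 16-bit samples
    f.setframerate(rate)
    f.writeframes(np.column_stack((left, right)).tobytes())
```

Moving SPLIT_S within or across the constituent boundary would implement the within- versus between-constituent contrast the abstract describes.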
Article
The process of mapping acoustic-phonetic level input to a lexical representation is multi-faceted. Models of spoken word recognition provide a variety of processing architectures and make different assumptions regarding the unit(s) of representation used in the exchange of information from signal to word and the nature of information flow through the system. The current models provide a backdrop for a discussion of some of the advances and debates in the field. Some of the issues considered are: early versus delayed commitment to a lexical hypothesis, consequences of multiple activation, segmentation and lexical access, the processing and representation of phonological variants, and the role of attention in spoken word recognition.
Article
Phoneme monitoring studies from 1969 to 1996 are reviewed and grouped in terms of issues that have been addressed with the task. These issues include the contribution of the lexicon to speech perception, processing complexity, attention, contribution of prosodic information, and the basic unit of speech perception. Within each issue, task demands and artifactual variables have been identified and highlighted.
Article
Older listeners are more affected than younger listeners in their recognition of speech in adverse conditions, such as when they also hear a single competing speaker. In the present study, we investigated with a speeded response task whether older listeners with various degrees of hearing loss benefit under such conditions from also seeing the speaker they intend to listen to. We also tested, at the same time, whether older adults need postperceptual processing to obtain an audiovisual benefit. When tested in a phoneme-monitoring task with single-talker noise present, older (and younger) listeners detected target phonemes more reliably and more rapidly in meaningful sentences uttered by the target speaker when they also saw the target speaker. This suggests that older adults processed audiovisual speech rapidly and efficiently enough to benefit during spoken sentence processing itself. Audiovisual benefits for older adults were similar in size to those observed for younger adults in terms of response latencies, but smaller for detection accuracy. Older adults with more hearing loss showed larger audiovisual benefits. Attentional abilities predicted the size of audiovisual response time benefits in both age groups. Audiovisual benefits were found in both age groups when monitoring for the visually highly distinct phoneme /p/ and when monitoring for the visually less distinct phoneme /k/. Visual speech thus provides segmental information about the target phoneme, but also more global contextual information that helps both older and younger adults in this adverse listening situation.
Article
Two experiments compared continuous and discontinuous models of word recognition. Participants heard prefixed words whose full-form and root uniqueness points (UPs) differed, in either a gating or lexical decision paradigm. Identification points and reaction times were analyzed using multiple regression. Full-form UPs predicted performance better than root UPs did. Full-form frequency measures had reliable facilitative relationships with performance while root frequency measures were not consistently significant. Prefix frequency had a reliable, inhibitory effect. Judged prefixedness, semantic transparency, and prefix likelihood were related to performance, alone or in interaction. The results provide evidence for both kinds of word recognition procedures. A model is proposed with two parallel recognition routines: a whole-word routine and a decompositional routine that considers only unbound roots that can combine with the prefix in question. A preliminary rating study provides stimulus values on several dimensions and can be used as a database by other researchers.
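The analytic logic of the abstract above can be sketched in a few lines (the column names and input file are hypothetical; the authors' rating study would supply the real item-level values): regress recognition latencies on full-form and root uniqueness points together with the frequency measures, then compare the coefficients.

```python
# Sketch of the regression logic (hypothetical column names and file; the
# authors' stimulus database would supply the real predictors and measures).
import pandas as pd
import statsmodels.formula.api as smf

items = pd.read_csv("prefixed_items.csv")  # hypothetical item-level data

# Lexical decision RT as a function of full-form vs. root uniqueness points
# (UPs) and the frequency measures; the pattern reported above is a reliable
# full-form UP/frequency effect alongside inconsistent root effects and an
# inhibitory prefix-frequency effect.
model = smf.ols(
    "rt ~ fullform_up + root_up + fullform_freq + root_freq + prefix_freq",
    data=items,
).fit()
print(model.summary())
```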
Article
An empirical account is offered of some of the constants that infants and adults appear to use in processing speech-like stimuli. From investigations carried out in recent years, it seems that syllable-like sequences act as minimal accessing devices in speech processing. Subjects are aware in real time of syllabic structure in words and respond differently to words with the same initial three phonemes if the syllabification of one is CV/... and that of the other is CVC/.... Likewise, infants seem to be aware that a "good" syllable must have at least one alternation if it is composed of more than one phoneme. When the segment is only one phoneme long, its status is necessarily somewhere between that of the phoneme and the syllable. An important problem that arises with the syllable is that it is an unlikely device for speech acquisition. Indeed, there are a few thousand syllables, and the attribution of a given token to a type is far from obvious. Even if physical invariants for syllables in context were to be found, the task facing the child would still remain one of sorting thousands of types from many more tokens. Issues concerning acquisition versus stable performance are addressed to further constrain possible models. In addition, I try to show that even though information-processing models are useful tools for describing synchronic sections of organisms, the elements that can account for development will have to be uncovered in neighbouring branches.
Article
Using gender decision and shadowing tasks, we compared recognition of French nouns with early or late uniqueness points (UP) that were articulated at three different rates. With gender decision, the medium rate (3.6 syllables (syll)/s), which is close to that used by Radeau, Mousty, and Bertelson (1989), gave rise to a comparable UP location effect. The effect increased at the slower rate (2.2 syll/s), but disappeared at the faster rate (5.6 syll/s). With shadowing, only the slow rate gave rise to a UP effect. A similar pattern of results was found using speech that was linearly compressed or expanded. Because the fast rate is close to that typical of conversational speech, the present results cast doubt on the relevance of the UP in the processing of fluent speech. The implications of rate effects for models of spoken word recognition are discussed.
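As a rough illustration of the rate manipulation, the following sketch time-stretches a recording without shifting its pitch, using a phase-vocoder routine. This is one plausible way to realize linear compression and expansion, not necessarily the method the authors used; the file names are hypothetical, and the stretch factors are derived from the reported 2.2, 3.6, and 5.6 syll/s rates.

```python
# Sketch of pitch-preserving rate manipulation (a phase-vocoder
# approximation; the authors' exact compression/expansion method may
# differ). File names are hypothetical.
import librosa
import soundfile as sf

# A noun recorded at the medium rate of 3.6 syllables per second.
y, sr = librosa.load("noun_medium_rate.wav", sr=None)

# rate > 1 compresses (faster speech); rate < 1 expands (slower speech).
for label, factor in [("fast", 5.6 / 3.6), ("medium", 1.0), ("slow", 2.2 / 3.6)]:
    stretched = librosa.effects.time_stretch(y, rate=factor)
    sf.write(f"noun_{label}.wav", stretched, sr)
```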
Article
This article reports two experiments which examined the utility of the phoneme monitoring technique for studying syntactic processing of sentences. In French, by using self-embedded relative clauses, it is possible to isolate and examine the effect of a syntactic cue while controlling the factors known to affect phoneme detection times. Monitoring within and after the relative clause led to significant differences in phoneme detection times for reversible subject and object relatives only after the clause boundary. These results demonstrate the sensitivity of the phoneme monitoring task to syntactic processing and are taken to reflect structural calculations of the underlying grammatical relations for the reversible object relatives. When lexical information was introduced with nonreversible relatives, there was no longer a difference between the detection times for subject and object relatives after the clause boundary. Thus, it appears that lexical information can be used in the attribution of underlying grammatical roles.
Article
From previous research we know that prosodic features are perceptually effective in marking boundaries and that a suitable implementation of these features improves the quality of synthetic speech in terms of acceptability. It can further be assumed that listeners use the perceived prosodic information to compute the meaning of the input speech. This paper, therefore, investigates whether a well-phrased utterance (that is, an utterance with prosodic boundaries in appropriate positions and with appropriate realizations) is easier to comprehend than a poorly phrased one. To measure this, we designed a method in which a kind of verification task is combined with a question-answering task ("monitoring for the answer"). The stimulus set consisted of structurally ambiguous sentences. The expectation was that when listeners hear a question followed by an appropriately phrased utterance, they will react more rapidly than when the question is followed by an utterance with neutral phrasing. It was also expected that in the latter situation reaction times (RTs) would be shorter than if an inappropriately phrased utterance were presented. The results confirmed the expectations: an appropriately phrased utterance always produced the fastest RTs.