Fig. 1. A conceptual summary of a working memory model for poorly specified linguistic input. The model has three parts: amodal constraints, semi-abstract phonological processing, and multimodal input.


Source publication
Article
Full-text available
This paper is an overview of the research we have carried out at Linköping University on cognitive aspects of visual language processing and related communicative forms. In the first section, a cognitive individual difference perspective on speechreading is introduced. In the second, compensatory perceptual mechanisms are discussed on the basis of...

Citations

... It was observed that cognitive functions like lexical access speed (Rönnberg, 1990), executive functions (Andersson and Lidestam, 2005), and inference-making capacity (Lyxell and Rönnberg, 1987) were associated with speech perception and understanding (reviewed in Rönnberg et al., 1998, 2021; Rönnberg, 2003). The data pattern withstood many experimental variations, especially in difficult speech-in-noise conditions (reviewed by Lyxell et al., 1996; Gatehouse et al., 2003, 2006; Akeroyd, 2008; Arlinger et al., 2009; Lunner et al., 2009; Besser et al., 2013), where WMC played the dominating role. ...
Article
Full-text available
The review gives an introductory description of the successive development of data patterns based on comparisons between hearing-impaired and normal-hearing participants' speech understanding skills, later prompting the formulation of the Ease of Language Understanding (ELU) model. The model builds on the interaction between an input buffer (RAMBPHO, Rapid Automatic Multimodal Binding of PHOnology) and three memory systems: working memory (WM), semantic long-term memory (SLTM), and episodic long-term memory (ELTM). RAMBPHO input may either match or mismatch multimodal SLTM representations. Given a match, lexical access is accomplished rapidly and implicitly within approximately 100–400 ms. Given a mismatch, the prediction is that WM is engaged explicitly to repair the meaning of the input – in interaction with SLTM and ELTM – taking seconds rather than milliseconds. The multimodal and multilevel nature of representations held in WM and LTM is at the center of the review, these representations being integral parts of the prediction and postdiction components of language understanding. Finally, some hypotheses based on a selective use-disuse of memory systems mechanism are described in relation to mild cognitive impairment and dementia. Alternative speech perception and WM models are evaluated, and recent developments and generalisations, ELU model tests, and boundaries are discussed.
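To make the match/mismatch mechanism described above concrete, the short Python sketch below caricatures the ELU gate: a RAMBPHO-like phonological input is compared against SLTM entries, and the outcome routes processing either through the fast implicit path or the slow explicit one. The similarity metric, the 0.75 threshold, the latency figures, and all names in the sketch are illustrative assumptions, not the published model.

```python
from dataclasses import dataclass

# Hypothetical illustration of the ELU match/mismatch gate.
# Similarity metric, threshold, and timings are assumptions, not the published model.

@dataclass
class LexicalOutcome:
    route: str            # "implicit" or "explicit"
    est_latency_s: float  # rough time scale, illustrative only
    meaning: str

def phonological_similarity(rambpho_input: set, sltm_entry: set) -> float:
    """Overlap between multimodally bound phonological features (RAMBPHO-like input)
    and one stored SLTM representation (Jaccard overlap as a stand-in metric)."""
    if not rambpho_input and not sltm_entry:
        return 1.0
    return len(rambpho_input & sltm_entry) / len(rambpho_input | sltm_entry)

def elu_gate(rambpho_input: set, sltm: dict, threshold: float = 0.75) -> LexicalOutcome:
    """Match -> fast, implicit lexical access (~0.1-0.4 s).
    Mismatch -> explicit working-memory repair with SLTM/ELTM support (seconds)."""
    best_word, best_sim = max(
        ((word, phonological_similarity(rambpho_input, features))
         for word, features in sltm.items()),
        key=lambda pair: pair[1],
    )
    if best_sim >= threshold:
        return LexicalOutcome("implicit", 0.25, best_word)
    # Mismatch: working memory reconstructs meaning from partial evidence and context.
    return LexicalOutcome("explicit", 2.0, f"reconstructed candidate: {best_word}")

if __name__ == "__main__":
    sltm = {"boat": {"b", "oa", "t"}, "coat": {"k", "oa", "t"}}
    print(elu_gate({"b", "oa", "t"}, sltm))  # clear input    -> implicit route
    print(elu_gate({"oa"}, sltm))            # degraded input -> explicit route
```

Running the example routes the intact input through the implicit path and the degraded input through the explicit path, mirroring the approximately 100–400 ms versus seconds distinction in the abstract.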
... When we first proposed the ELU model (Rönnberg, 2003; Rönnberg et al., 1998, 2008; Rudner et al., 2008, 2009), we were interested in describing a mechanism that explains why language understanding in some conditions demands extra allocation of cognitive resources, while in other conditions language processing takes place smoothly and effortlessly. To do this, we relied on the three memory systems briefly described above. ...
Article
Full-text available
Purpose The purpose of this study was to conceptualize the subtle balancing act between language input and prediction (cognitive priming of future input) to achieve understanding of communicated content. When understanding fails, reconstructive postdiction is initiated. Three memory systems play important roles: working memory (WM), episodic long-term memory (ELTM), and semantic long-term memory (SLTM). The axiom of the Ease of Language Understanding (ELU) model is that explicit WM resources are invoked by a mismatch between language input—in the form of rapid automatic multimodal binding of phonology—and multimodal phonological and lexical representations in SLTM. However, if there is a match between rapid automatic multimodal binding of phonology output and SLTM/ELTM representations, language processing continues rapidly and implicitly. Method and Results In our first ELU approach, we focused on experimental manipulations of signal processing in hearing aids and background noise to cause a mismatch with LTM representations; both resulted in increased dependence on WM. Our second approach, the main one relevant for this review article, focuses on the relative effects of age-related hearing loss on the three memory systems. According to the ELU, WM is predicted to be frequently occupied with reconstruction of what was actually heard, resulting in a relative disuse of phonological/lexical representations in the ELTM and SLTM systems. The prediction and results do not depend on test modality per se but rather on the particular memory system. This will be further discussed. Conclusions Related to the literature on ELTM decline as a precursor of dementia and the fact that the risk for Alzheimer's disease increases substantially over time due to hearing loss, there is a possibility that lowered ELTM due to hearing loss and disuse may be part of the causal chain linking hearing loss and dementia. Future ELU research will focus on this possibility.
... sentences were presented at 5, 10, and 15 dB SL in the AV condition. Jonghyun Kwak, et al. • Sentence Recognition in Visual-Only and Auditory-Visual Conditions (Grant & Seitz, 2000; Lansing & Helgeson, 1995; Marslen-Wilson, Moss, & van Halen, 1996; Rönnberg et al., 1998) ...
... Deprived of auditory input, the linguistic channels of LSE participants are limited to lip movements and signs. Broadly speaking, lipreading accuracy is estimated to range from 5% to 45%, and it decreases to 5% for words in sentences without appropriate contextual constraints (Rönnberg, Andersson, et al., 1998). In our study, participants in the LSE group had poorer lipreading skills, significantly worse than those of the CI group. ...
Article
Purpose The use of sign-supported speech (SSS) in the education of deaf students has been recently discussed in relation to its usefulness with deaf children using cochlear implants. To clarify the benefits of SSS for comprehension, 2 eye-tracking experiments aimed to detect the extent to which signs are actively processed in this mode of communication. Method Participants were 36 deaf adolescents, including cochlear implant users and native deaf signers. Experiment 1 attempted to shift observers' foveal attention to the linguistic source in SSS from which most information is extracted, lip movements or signs, by magnifying the face area, thus modifying the perceptual accessibility of lip movements (magnified condition), and by constraining the visual field to either the face or the sign through a moving window paradigm (gaze contingent condition). Experiment 2 aimed to explore the reliance on signs in SSS by occasionally producing a mismatch between sign and speech. Participants were required to concentrate upon the orally transmitted message. Results In Experiment 1, analyses revealed a greater number of fixations toward the signs and a reduction in accuracy in the gaze contingent condition across all participants. Fixations toward signs were also increased in the magnified condition. In Experiment 2, results indicated less accuracy in the mismatching condition across all participants. Participants looked more at the sign when it was inconsistent with speech. Conclusions All participants, even those with residual hearing, rely on signs when attending to SSS, either peripherally or through overt attention, depending on the perceptual conditions. Supplemental Material https://doi.org/10.23641/asha.8121191
... These data prompted us to look for other ways to try to explain at least parts of the large variability in speech understanding observed across individuals (Rönnberg et al., 1998a, for an overview). In a set of studies, we identified the following predictor variables: verbal inference-making (sentence completion, Lyxell and Rönnberg, 1987, 1989), context-free word decoding (Lyxell and Rönnberg, 1991), and information processing speed that relies on semantic long-term memory (LTM; e.g., lexical access speed, Rönnberg, 1990; as well as rhyme decision speed, Lyxell et al., 1994; Rönnberg et al., 1998b). ...
Article
Full-text available
Working memory is important for online language processing during conversation. We use it to maintain relevant information, to inhibit or ignore irrelevant information, and to attend to conversation selectively. Working memory helps us to keep track of and actively participate in conversation, including taking turns and following the gist. This paper examines the Ease of Language Understanding model (i.e., the ELU model, Rönnberg, 2003; Rönnberg et al., 2008) in light of new behavioral and neural findings concerning the role of working memory capacity (WMC) in uni-modal and bimodal language processing. The new ELU model is a meaning prediction system that depends on phonological and semantic interactions in rapid implicit and slower explicit processing mechanisms that both depend on WMC albeit in different ways. It is based on findings that address the relationship between WMC and (a) early attention processes in listening to speech, (b) signal processing in hearing aids and its effects on short-term memory, (c) inhibition of speech maskers and its effect on episodic long-term memory, (d) the effects of hearing impairment on episodic and semantic long-term memory, and finally, (e) listening effort. New predictions and clinical implications are outlined. Comparisons with other WMC and speech perception models are made.
... It has been suggested that the relevance of working memory in speech understanding increases as the signal is more degraded by background noise (Pichora-Fuller et al., 1995;Rönnberg, 2003). In addition, an earlier study has indicated that cognitive compensatory mechanisms enhancing visual speech understanding may particularly support individuals with an early onset of hearing loss (Rönnberg et al., 1998). The onset of hearing loss and its potential modulating effect on the relationship between hearing loss and the performance on tests of memory and attention was not investigated in the current study. ...
Article
Full-text available
This explorative study investigated the relationship between auditory and cognitive abilities and self-reported hearing disability. Thirty-two adults with mild to moderate hearing loss completed the Amsterdam Inventory of Auditory Disability and Handicap (AIADH) and performed the Text Reception Threshold (TRT) test, a Spatial Working Memory (SWM) test, and a test of visual sustained attention. Regression analyses examined the predictive value of age, hearing thresholds (PTA), speech perception in noise (SRTN) and the cognitive tests for the five Amsterdam Inventory factors. Besides the variance explained by age, PTA and SRTN, cognitive abilities were related to each hearing factor. The reported difficulties with sound detection and speech perception in quiet were less severe for subjects with higher age, lower PTAs and better TRTs. Fewer sound localization and speech perception in noise problems were reported by subjects with better SRTNs and smaller SWM. Fewer sound discrimination difficulties were reported by subjects with better SRTNs, TRTs and smaller SWM. The results suggest a general role of the ability to read partly masked text in subjective hearing. Large working memory was associated with more reported hearing difficulties. This study shows that besides auditory variables and age, cognitive abilities are related to self-reported hearing disability.
... A second major effort to explain differences in speech-recognition performance in noise for listeners with similar audiograms has focused on the effects of aging, in general, and cognitive function related to working memory, attention, and speed of processing, in particular (e.g., van Rooij and Plomp, 1990; Gordon-Salant and Fitzgibbons, 1997; Rönnberg et al., 1998; Lunner, 2003; Rönnberg et al., 2008). Listening in complex backgrounds composed of several noise sources or more than one talker requires that the audible portions of the target speech signal be combined to identify the actual message. ...
... Further, because the typical conversational speech rate is roughly six syllables per second (Yavaş, 2011), whatever auditory and cognitive processing of the message is required to reconstruct the missing (or misheard) bits of speech must be done quickly enough so as not to fall behind as the talker continues to speak. Given these basic requirements, cognitive abilities related to working memory, attention, speed of processing, use of context and linguistic knowledge will all be important in predicting speech-recognition performance in noise and patient satisfaction with hearing aids (Rönnberg et al., 1998, 2008). ...
Article
Full-text available
Background: Hearing-impaired (HI) individuals with similar ages and audiograms often demonstrate substantial differences in speech-reception performance in noise. Traditional models of speech intelligibility focus primarily on average performance for a given audiogram, failing to account for differences between listeners with similar audiograms. Improved prediction accuracy might be achieved by simulating differences in the distortion that speech may undergo when processed through an impaired ear. Although some attempts to model particular suprathreshold distortions can explain general speech-reception deficits not accounted for by audibility limitations, little has been done to model suprathreshold distortion and predict speech-reception performance for individual HI listeners. Auditory-processing models incorporating individualized measures of auditory distortion, along with audiometric thresholds, could provide a more complete understanding of speech-reception deficits by HI individuals. A computational model capable of predicting individual differences in speech-recognition performance would be a valuable tool in the development and evaluation of hearing-aid signal-processing algorithms for enhancing speech intelligibility. Purpose: This study investigated whether biologically inspired models simulating peripheral auditory processing for individual HI listeners produce more accurate predictions of speech-recognition performance than audiogram-based models. Research design: Psychophysical data on spectral and temporal acuity were incorporated into individualized auditory-processing models consisting of three stages: a peripheral stage, customized to reflect individual audiograms and spectral and temporal acuity; a cortical stage, which extracts spectral and temporal modulations relevant to speech; and an evaluation stage, which predicts speech-recognition performance by comparing the modulation content of clean and noisy speech. To investigate the impact of different aspects of peripheral processing on speech predictions, individualized details (absolute thresholds, frequency selectivity, spectrotemporal modulation [STM] sensitivity, compression) were incorporated progressively, culminating in a model simulating level-dependent spectral resolution and dynamic-range compression. Study sample: Psychophysical and speech-reception data from 11 HI and six normal-hearing listeners were used to develop the models. Data collection and analysis: Eleven individualized HI models were constructed and validated against psychophysical measures of threshold, frequency resolution, compression, and STM sensitivity. Speech-intelligibility predictions were compared with measured performance in stationary speech-shaped noise at signal-to-noise ratios (SNRs) of -6, -3, 0, and 3 dB. Prediction accuracy for the individualized HI models was compared to the traditional audibility-based Speech Intelligibility Index (SII). Results: Models incorporating individualized measures of STM sensitivity yielded significantly more accurate within-SNR predictions than the SII. Additional individualized characteristics (frequency selectivity, compression) improved the predictions only marginally. A nonlinear model including individualized level-dependent cochlear-filter bandwidths, dynamic-range compression, and STM sensitivity predicted performance more accurately than the SII but was no more accurate than a simpler linear model. 
Predictions of speech-recognition performance simultaneously across SNRs and individuals were also significantly better for some of the auditory-processing models than for the SII. Conclusions: A computational model simulating individualized suprathreshold auditory-processing abilities produced more accurate speech-intelligibility predictions than the audibility-based SII. Most of this advantage was realized by a linear model incorporating audiometric and STM-sensitivity information. Although more consistent with known physiological aspects of auditory processing, modeling level-dependent changes in frequency selectivity and gain did not result in more accurate predictions of speech-reception performance.
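The three-stage architecture summarised above (a peripheral stage, a cortical stage extracting speech-relevant modulations, and an evaluation stage comparing the modulation content of clean and noisy speech) can be sketched in a few lines of Python. The filter choices, band edges, 16 Hz modulation cutoff, correlation-based score, and toy test signal below are all hypothetical simplifications, not the individualized models or the SII evaluated in the study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

# Simplified, hypothetical sketch of a three-stage intelligibility predictor:
# peripheral filterbank -> modulation analysis -> clean/noisy comparison.
# All parameter values are illustrative assumptions.

def peripheral_stage(x, fs, bands=((100, 500), (500, 1500), (1500, 4000))):
    """Band-pass filterbank followed by envelope extraction (crude cochlear stand-in)."""
    envelopes = []
    for lo, hi in bands:
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)
        envelopes.append(np.abs(hilbert(band)))
    return np.array(envelopes)

def cortical_stage(envelopes, fs, mod_cutoff=16.0):
    """Keep the slow temporal modulations (< ~16 Hz) that carry most speech information."""
    sos = butter(2, mod_cutoff, btype="lowpass", fs=fs, output="sos")
    return sosfiltfilt(sos, envelopes, axis=1)

def evaluation_stage(clean_mod, noisy_mod):
    """Score = mean per-band correlation between clean and noisy modulation patterns."""
    scores = []
    for c, n in zip(clean_mod, noisy_mod):
        c, n = c - c.mean(), n - n.mean()
        denom = np.sqrt((c ** 2).sum() * (n ** 2).sum())
        scores.append((c * n).sum() / denom if denom > 0 else 0.0)
    return float(np.mean(scores))

if __name__ == "__main__":
    fs, dur = 16000, 1.0
    t = np.arange(int(fs * dur)) / fs
    # Toy "speech": a carrier with a 4 Hz amplitude modulation, plus additive noise.
    clean = np.sin(2 * np.pi * 300 * t) * (1 + 0.5 * np.sin(2 * np.pi * 4 * t))
    noisy = clean + 0.8 * np.random.default_rng(0).standard_normal(clean.size)
    clean_mod = cortical_stage(peripheral_stage(clean, fs), fs)
    noisy_mod = cortical_stage(peripheral_stage(noisy, fs), fs)
    print(f"predicted intelligibility proxy: {evaluation_stage(clean_mod, noisy_mod):.2f}")
```

In this toy version the score falls toward zero as the noise level rises, which is the qualitative behaviour an intelligibility predictor needs before any individualized peripheral parameters (thresholds, frequency selectivity, compression, STM sensitivity) are layered on top.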
... A second major effort to explain differences in speech-recognition performance in noise for listeners with similar audiograms has focused on the effects of aging, in general, and cognitive function related to working memory, attention, and speed of processing, in particular (e.g., van Rooij and Plomp, 1990; Gordon-Salant and Fitzgibbons, 1997; Rönnberg et al., 1998; Lunner, 2003; Rönnberg et al., 2008). Listening in complex backgrounds composed of several noise sources or more than one talker requires that the audible portions of the target speech signal be combined to identify the actual message. ...
... Further, because the typical conversational speech rate is roughly six syllables per second (Yavaş, 2011), whatever auditory and cognitive processing of the message is required to reconstruct the missing (or misheard) bits of speech must be done quickly enough so as not to fall behind as the talker continues to speak. Given these basic requirements, cognitive abilities related to working memory, attention, speed of processing, use of context and linguistic knowledge will all be important in predicting speech-recognition performance in noise and patient satisfaction with hearing aids (Rönnberg et al., 1998, 2008). ...
Presentation
Full-text available
Hearing technology has advanced to where it is now reasonable to ask whether signal processing algorithms can be developed to compensate for an individual's hearing loss, thus allowing them to hear functionally in a manner similar to persons with normal hearing. Clinically, the pure-tone audiogram is the primary tool used to represent the patient's hearing status. However, it has been well established, experimentally and theoretically, that the audiogram cannot reflect fully all aspects of the hearing loss, most notably that part which pertains to suprathreshold distortion. Much has been written about the distortion component of sensorineural hearing loss, yet there is little agreement on estimating its importance to speech recognition, nor much consensus on which hearing factors (e.g., spectral and/or temporal resolution) most accurately represent the distortion. Recent attempts to use biologically inspired models of auditory processing to represent a patient's internal auditory experiences have shown promise as a way to understand the role that suprathreshold distortions might play in speech recognition. This talk reviews recent work to develop auditory models of individual hearing-impaired listeners to predict differences in the perception of speech and non-speech signals not readily explained by audiometric thresholds. [Work supported by the Oticon Foundation.]
... These studies primarily addressed cognitive predictions and neural correlates of speech and language understanding (see Rönnberg, 1995; Rönnberg, 2003b; Rönnberg et al., 1998a for summaries; but see Pichora-Fuller, 2007; and Pichora-Fuller & Singh, 2006 for overviews of the general area of cognition and audiology). Three components have proven to be direct predictors of speech understanding, across degree of hearing loss and type of hearing impairment: (1) verbal inference-making; (2) word decoding (Lyxell & Rönnberg, 1991); and (3) speed of lexical access (Larsby et al., this issue; Rönnberg, 1990; Lyxell et al., 1994; Rönnberg et al., 1998a). Indirect predictors include: visual evoked potentials (the amplitude of which was related to word decoding skill, Rönnberg et al., 1989), complex WM span, and verbal ability (Lyxell & Rönnberg, 1992); the latter two being related to verbal inference making. ...
Article
Full-text available
A general working memory system for ease of language understanding (ELU, Rönnberg, 2003a) is presented. The purpose of the system is to describe and predict the dynamic interplay between explicit and implicit cognitive functions, especially in conditions of poorly perceived or poorly specified linguistic signals. In relation to speech understanding, the system is based on (1) the quality and precision of phonological representations in long-term memory, (2) phonologically mediated lexical access speed, and (3) explicit storage and processing resources. If there is a mismatch between phonological information extracted from the speech signal and the phonological information represented in long-term memory, the system is assumed to produce a mismatch signal that invokes explicit processing resources. In the present paper, we focus on four aspects of the model which have led to the current, updated version: the language generality assumption; the mismatch assumption; chronological age; and the episodic buffer function of rapid, automatic multimodal binding of phonology (RAMBPHO). We evaluate the language generality assumption in relation to sign language and speech, and the mismatch assumption in relation to signal processing in hearing aids. Further, we discuss the effects of chronological age and the implications of RAMBPHO.
... It has been suggested that the relevance of working memory in speech understanding increases as the signal is more degraded by background noise (Pichora-Fuller et al., 1995;Rönnberg, 2003). In addition, an earlier study has indicated that cognitive compensatory mechanisms enhancing visual speech understanding may particularly support individuals with an early onset of hearing loss (Rönnberg et al., 1998). The onset of hearing loss and its potential modulating effect on the relationship between hearing loss and the performance on tests of memory and attention was not investigated in the current study. ...
Article
Full-text available
This study investigated the relationship between hearing loss and memory and attention when nonverbal, visually presented cognitive tests are used. Hearing loss (pure-tone audiometry) and IQ were measured in 30 participants with mild to severe hearing loss. Participants performed cognitive tests of pattern recognition memory, sustained visual attention, and spatial working memory. All cognitive tests were selected from the Cambridge Neuropsychological Test Automated Battery (CANTAB expedio; Cambridge Cognition Ltd., 2002). Regression analyses were performed to examine the relationship between hearing loss and these cognitive measures of memory and attention when controlling for age and IQ. The data indicate that hearing loss was not associated with decreased performance on the memory and attention tests. In contrast, participants with more severe hearing loss made more use of an efficient strategy during performance on the spatial working memory subtest. This result might reflect the more extensive use of working memory in daily life to compensate for the loss of speech information. The authors conclude that the use of nonverbal tests is essential when testing cognitive functions of individuals with hearing loss.