Language-specific phoneme representations revealed by electric and magnetic brain responses

Abstract

There is considerable debate about whether the early processing of sounds depends on whether they form part of speech. Proponents of such speech specificity postulate the existence of language-dependent memory traces, which are activated in the processing of speech but not when equally complex, acoustic non-speech stimuli are processed. Here we report the existence of these traces in the human brain. We presented to Finnish subjects the Finnish phoneme prototype /e/ as the frequent stimulus, and other Finnish phoneme prototypes or a non-prototype (the Estonian prototype /õ/) as the infrequent stimulus. We found that the brain's automatic change-detection response, reflected electrically as the mismatch negativity (MMN), was enhanced when the infrequent, deviant stimulus was a prototype (the Finnish /ö/) relative to when it was a non-prototype (the Estonian /õ/). These phonemic traces, revealed by MMN, are language-specific, as /õ/ caused enhancement of MMN in Estonians. Whole-head magnetic recordings located the source of this native-language, phoneme-related response enhancement, and thus the language-specific memory traces, in the auditory cortex of the left hemisphere.
... At 12 months of age, the latency may be slightly shorter due to maturation. IV) Because AD is an actual word that could have been learned by the infants in their normal language environment, AD may elicit an enhanced response representing word or word-form recognition (Pulvermüller et al., 2001; for the long-term memory contribution to the MMN, see also Näätänen et al., 1997; Winkler et al., 1999; Ylinen et al., 2010). Based on our previous study in the same age group delivering the same stimulus as in the current study (but with a different kind of context in the sound sequence; Ylinen et al., 2017), we expected the word recognition response to be of negative polarity. ...
Article
During the first year of life, infants start to learn the lexicon of their native language. Word learning includes the establishment of longer-term representations for the phonological form and the meaning of the word in the brain, as well as the link between them. However, it is not known how the brain processes word forms immediately after they have been learned. We familiarized 12-month-old infants (N = 52) with two pseudowords and studied their neural signatures. Specifically, we determined whether a newly learned word form elicits neural signatures similar to those observed when a known word is recognized (i.e., when a well-established word representation is activated, eliciting enhanced mismatch responses) or whether the processing of a newly learned word form shows the suppression of the neural response along with the principles of predictive coding of a learned rule (i.e., the order of the syllables of the new word form). The pattern of results obtained in the current study suggests that recognized word forms elicit a mismatch response of negative polarity, similar to newly learned and previously known words with an established representation in long-term memory. In contrast, prediction errors caused by acoustic novelty or deviation from the expected order in a sequence of (pseudo)words elicit responses of positive polarity. This suggests that electric brain activity is not fully explained by the predictive coding framework.
... We designed a simple AX auditory language discrimination task that relied on an automatic linguistic process rather than explicit learning [62,63]. We focused on a low-level language component, phonological processing, in a task optimized to induce performance differences between controls and HD mutation carriers. ...
Article
Cognitive reserve is the ability to actively cope with brain deterioration and delay cognitive decline in neurodegenerative diseases. It operates by optimizing performance through differential recruitment of brain networks or alternative cognitive strategies. We investigated cognitive reserve using Huntington’s disease (HD) as a genetic model of neurodegeneration to compare premanifest HD, manifest HD, and controls. Contrary to manifest HD, premanifest HD behave as controls despite neurodegeneration. By decomposing the cognitive processes underlying decision making, drift diffusion models revealed a response profile that differs progressively from controls to premanifest and manifest HD. Here, we show that cognitive reserve in premanifest HD is supported by an increased rate of evidence accumulation compensating for the abnormal increase in the amount of evidence needed to make a decision. This higher rate is associated with left superior parietal and hippocampal hypertrophy, and exhibits a bell shape over the course of disease progression, characteristic of compensation.
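The compensation described above can be illustrated with the textbook closed-form results for a simple, unbiased drift diffusion model. This is only a sketch and not the authors' fitted model: the function names, the parameter values, and the three "profiles" are hypothetical, the noise level is fixed at 1, and non-decision time is ignored.

```python
from math import exp, tanh


def mean_decision_time(drift, threshold, noise=1.0):
    """Mean decision time of an unbiased diffusion process.

    Assumes absorbing bounds at 0 and `threshold`, a start point at
    threshold / 2, constant drift, and Gaussian noise with scale `noise`.
    """
    return (threshold / (2 * drift)) * tanh(drift * threshold / (2 * noise ** 2))


def accuracy(drift, threshold, noise=1.0):
    """Probability of reaching the bound that the drift points toward."""
    return 1 / (1 + exp(-drift * threshold / noise ** 2))


# Hypothetical parameter sets (drift rate, decision threshold); these are
# illustrative numbers, not estimates from the study.
profiles = {
    "baseline": (1.0, 1.0),
    "raised threshold only": (1.0, 1.5),
    "raised threshold and drift": (2.0, 1.5),
}

for label, (drift, threshold) in profiles.items():
    t = mean_decision_time(drift, threshold)
    p = accuracy(drift, threshold)
    print(f"{label}: mean decision time = {t:.3f}, accuracy = {p:.3f}")
```

With these made-up values, raising only the threshold (more evidence required) lengthens the mean decision time, while raising the drift rate as well shortens it again, which is the qualitative pattern the abstract describes as compensation.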
... The larger MMN to the language-specific across-category difference suggests that the processing system considers not only the physical differences between the speech sounds, but also the native long-term linguistic representations. In their seminal paper, Näätänen et al. (1997) investigated the phoneme representations more closely in another cross-linguistic MMN study. Using MEG, the authors studied the processing of prototypical speech sounds in Finnish and Estonian. ...
Article
The speech multi-feature MMN (Mismatch Negativity) offers a means to explore the neurocognitive background of the processing of multiple speech features in a short time, by capturing the time-locked electrophysiological activity of the brain known as event-related brain potentials (ERPs). Originating from the pioneering work of Näätänen et al. (Clin Neurophysiol 115:140–144, 2004), this paradigm introduces several infrequent deviant stimuli alongside standard ones, each differing in various speech features. In this study, we aimed to refine the multi-feature MMN paradigm used previously to encompass both segmental and suprasegmental (prosodic) features of speech. In the experiment, a two-syllable pseudoword was presented as a standard, and the deviant stimuli included alterations in consonants (deviation by place or place and mode of articulation), vowels (deviation by place or mode of articulation), and stress pattern in the first syllable of the pseudoword. Results indicated the emergence of MMN components across all segmental and prosodic contrasts, with the expected fronto-central amplitude distribution. Subsequent analyses revealed subtle differences in MMN responses to the deviants, suggesting varying sensitivity to phonetic contrasts. Furthermore, individual differences in MMN amplitudes were noted, partially attributable to participants’ musical and language backgrounds. These findings underscore the utility of the multi-feature MMN paradigm for rapid and efficient investigation of the neurocognitive mechanisms underlying speech processing. Moreover, the paradigm showed potential for use in further research on speech processing abilities in various populations.
... Exposure to a specific data set alters the brain by establishing neural connections that commit the brain to processing information in an ideal way for that particular input (e.g., one's first language). Neural commitment functions as a filter that affects future processing (Cheour et al., 1998; Kuhl, 1991; Kuhl, Williams, Lacerda, Stevens, et al., 1992; Naatanen, ...
Article
When listeners hear a voice, they rapidly form a complex first impression of who the person behind that voice might be. We characterize how these multivariate first impressions from voices emerge over time across different levels of abstraction using electroencephalography and representational similarity analysis. We find that for eight perceived physical (gender, age, and health), trait (attractiveness, dominance, and trustworthiness), and social characteristics (educatedness and professionalism), representations emerge early (~80 ms after stimulus onset), with voice acoustics contributing to those representations between ~100 ms and 400 ms. While impressions of person characteristics are highly correlated, we can find evidence for highly abstracted, independent representations of individual person characteristics. These abstracted representations emerge gradually over time. That is, representations of physical characteristics (age, gender) arise early (from ~120 ms), while representations of some trait and social characteristics emerge later (~360 ms onward). The findings align with recent theoretical models and shed light on the computations underpinning person perception from voices.
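The core step of representational similarity analysis can be sketched as follows, assuming one neural pattern per voice at a single time point and one rating per voice. The data, the function names, and the choice of 1 minus Pearson correlation as the neural dissimilarity measure are illustrative assumptions, not details taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def neural_rdm(patterns):
    """Upper triangle of pairwise dissimilarities (1 - Pearson r)."""
    pairs = combinations(range(len(patterns)), 2)
    return [1 - pearson(patterns[i], patterns[j]) for i, j in pairs]


def model_rdm(ratings):
    """Upper triangle of absolute rating differences between stimuli."""
    pairs = combinations(range(len(ratings)), 2)
    return [abs(ratings[i] - ratings[j]) for i, j in pairs]


# Invented toy data: four voices, four channels, one time point.
patterns = [
    [1.0, 2.0, 3.0, 1.5],
    [1.1, 2.1, 2.9, 1.4],
    [3.0, 1.0, 2.0, 2.5],
    [2.9, 1.2, 1.8, 2.6],
]
perceived_age = [20, 25, 60, 65]  # hypothetical ratings

score = spearman(neural_rdm(patterns), model_rdm(perceived_age))
print(f"neural-model RDM correlation: {score:.3f}")
```

Repeating this comparison at every time point of the epoch would give a time course of how strongly the neural patterns reflect a given characteristic, which is the kind of quantity the onset latencies above refer to.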
Article
Subjects can discriminate phonemes presented singly and in random order. They discriminated better between speech sounds to which they attach different phonemic labels than between sounds which they normally put in the same phoneme class.
Article
A 122-channel neuromagnetometer with a helmet-shaped detector array covering the entire head allows simultaneous recording of magnetic fields over the whole cortex. The instrument has 122 planar first-order gradiometers in dual units at 61 measurement sites. The SQUIDs are directly coupled to the read-out electronics, with amplifier noise cancellation to eliminate the need for separate preamplifiers inside the magnetically shielded room. The authors analyze the performance of the device and compare it with traditional axial gradiometer arrays by considering signal-to-noise ratios, spatial sampling theory, confidence intervals for equivalent current dipole fits, and information-theoretical channel capacity. The analysis includes the fact that instrument noise is smaller than the background activity of the brain; the signal-to-noise ratio and the resolution of the planar array are in that case equal to or better than that of an axial array. The number of channels and their spacing are very suitable for neuromagnetic measurements.
Article
Magnetoencephalography (MEG) is a noninvasive technique for investigating neuronal activity in the living human brain. The time resolution of the method is better than 1 ms and the spatial discrimination is, under favorable circumstances, 2-3 mm for sources in the cerebral cortex. In MEG studies, the weak 10 fT-1 pT magnetic fields produced by electric currents flowing in neurons are measured with multichannel SQUID (superconducting quantum interference device) gradiometers. The sites in the cerebral cortex that are activated by a stimulus can be found from the detected magnetic-field distribution, provided that appropriate assumptions about the source render the solution of the inverse problem unique. Many interesting properties of the working human brain can be studied, including spontaneous activity and signal processing following external stimuli. For clinical purposes, determination of the locations of epileptic foci is of interest. The authors begin with a general introduction and a short discussion of the neural basis of MEG. The mathematical theory of the method is then explained in detail, followed by a thorough description of MEG instrumentation, data analysis, and practical construction of multi-SQUID devices. Finally, several MEG experiments performed in the authors' laboratory are described, covering studies of evoked responses and of spontaneous activity in both healthy and diseased brains. Many MEG studies by other groups are discussed briefly as well.
Article
Linguistic experience affects phonetic perception. However, the critical period during which experience affects perception and the mechanism responsible for these effects are unknown. This study of 6-month-old infants from two countries, the United States and Sweden, shows that exposure to a specific language in the first half year of life alters infants' phonetic perception.
Article
The present study analyzed the neural correlates of acoustic stimulus representation in echoic sensory memory. The neural traces of auditory sensory memory were indirectly studied by using the mismatch negativity (MMN), an event-related potential component elicited by a change in a repetitive sound. The MMN is assumed to reflect change detection in a comparison process between the sensory input from a deviant stimulus and the neural representation of repetitive stimuli in echoic memory. The scalp topographies of the MMNs elicited by pure tones deviating from standard tones by either frequency, intensity, or duration varied according to the type of stimulus deviance, indicating that the MMNs for different attributes originate, at least in part, from distinct neural populations in the auditory cortex. This result was supported by dipole-model analysis. If the MMN generator process occurs where the stimulus information is stored, these findings strongly suggest that the frequency, intensity, and duration of acoustic stimuli have a separate neural representation in sensory memory.
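In practice the MMN is commonly quantified as a difference wave: the averaged response to the repetitive standard is subtracted from the averaged response to the deviant, and the most negative point in a latency window is taken as the peak. The sketch below assumes that convention; the function names, the toy amplitudes, and the 100 to 250 ms search window are illustrative choices and are not taken from the study.

```python
from statistics import fmean


def average_epochs(epochs):
    """Average equal-length epochs sample by sample."""
    return [fmean(samples) for samples in zip(*epochs)]


def difference_wave(standard_epochs, deviant_epochs):
    """Deviant average minus standard average, sample by sample."""
    standard = average_epochs(standard_epochs)
    deviant = average_epochs(deviant_epochs)
    return [d - s for d, s in zip(deviant, standard)]


def peak_negativity(wave, times_ms, window=(100, 250)):
    """Return (amplitude, latency_ms) of the most negative sample in the window."""
    start, end = window
    candidates = [(v, t) for v, t in zip(wave, times_ms) if start <= t <= end]
    return min(candidates)


# Invented single-channel toy data (arbitrary units), sampled every 50 ms.
times_ms = [0, 50, 100, 150, 200, 250, 300]
standard_epochs = [
    [0.0, 1.0, -2.0, -1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, -2.0, -1.0, 0.0, 0.0, 0.0],
]
deviant_epochs = [
    [0.0, 1.0, -2.0, -3.0, -2.0, 0.0, 0.0],
    [0.0, 1.0, -2.0, -4.0, -2.0, 1.0, 0.0],
]

mmn = difference_wave(standard_epochs, deviant_epochs)
amplitude, latency = peak_negativity(mmn, times_ms)
print(f"difference-wave peak: {amplitude} at {latency} ms")
```

Computing such a difference wave separately for each deviant type and each electrode is what makes it possible to compare scalp topographies across frequency, intensity, and duration deviants.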
Article
A new glottal wave analysis method, Pitch Synchronous Iterative Adaptive Inverse Filtering (PSIAIF), is presented. The algorithm is based on a previously developed method, Iterative Adaptive Inverse Filtering (IAIF). In the IAIF method the glottal contribution to the speech spectrum is first estimated with an iterative structure. The vocal tract transfer function is modeled after eliminating the average glottal contribution. The glottal excitation is obtained by cancelling the effects of the vocal tract and lip radiation by inverse filtering. In the new PSIAIF method the glottal pulseform is computed by applying the IAIF algorithm twice to the same signal. The first IAIF analysis yields a glottal excitation that spans several pitch periods. This pulseform is used to determine the positions and lengths of frames for the pitch-synchronous analysis. The final result is obtained by analysing the original speech signal with the IAIF algorithm one fundamental period at a time. The PSIAIF algorithm was applied in glottal wave analysis using both synthetic and natural vowels. The results show that the method gives a fairly accurate estimate of the glottal flow, except for vowels with a low first formant that are produced with a pressed phonation type.
Article
In a dichotic listening situation, stimuli were presented one at a time and at random to either ear of the subject at constant inter-stimulus intervals of 800 msec. The subject's task was to detect and count occasional slightly different stimuli in one ear. In Experiment 1, these ‘signal’ stimuli were slightly louder, and in Experiment 2 they had a slightly higher pitch, than the much more frequent, ‘standard’, stimuli. In both experiments signals occurred randomly at either ear. Separate evoked potentials from three different locations were recorded for each of the four kinds of stimuli (attended signals, unattended signals, attended standards, unattended standards). Contrary to Hillyard et al. (1973), no early (N1 component) evoked-potential enhancement was observed to stimuli to the attended ear as compared with those to the unattended ear, but there was a later negative shift superimposed on potentials elicited by the former stimuli. This negative shift was considered identical to the N1 enhancement of Hillyard and his colleagues, which in the present study was forced, by the longer inter-stimulus interval used, to demonstrate temporal dissociation from the N1 component. The ‘Hillyard effect’ was, consequently, explained as being caused by a superimposition of a CNV kind of negative shift on the evoked potential to the attended stimuli rather than by a growth of the ‘real’ N1 component of the evoked potential.