Article

Sound Symbolism Scaffolds Language Development in Preverbal Infants


Abstract and Figures

A fundamental question in language development is how infants start to assign meaning to words. Here, using three EEG-based measures of brain activity, we establish that preverbal 11-month-old infants are sensitive to the non-arbitrary correspondences between language sounds and concepts, that is, to sound symbolism. In each trial, infant participants were presented with a visual stimulus (e.g., a round shape) followed by a novel spoken word that either sound-symbolically matched ("moma") or mismatched ("kipi") the shape. Amplitude increase in the gamma band showed perceptual integration of visual and auditory stimuli in the match condition within 300 milliseconds of word onset. Furthermore, phase synchronization between electrodes at around 400 milliseconds revealed intensified large-scale, left-hemispheric communication between brain regions in the mismatch condition as compared to the match condition, indicating heightened processing effort when integration was more demanding. Finally, event-related brain potentials showed an increased adult-like N400 response, an index of semantic integration difficulty, in the mismatch as compared to the match condition. Together, these findings suggest that 11-month-old infants spontaneously map auditory language onto visual experience by recruiting a cross-modal perceptual processing system and a nascent semantic network within the first year of life.
(a) Time-frequency diagrams of grand average AMPz (in standard deviation units). AMPz was averaged across all 9 electrodes and 19 infants for the match (a, left) and mismatch (a, middle) conditions. Positive AMPz (warm colours) indicates standardized changes in the direction of increased amplitude and negative AMPz (cold colours) indicates decreased amplitude. (a, right) Difference in grand average AMPz between the match and mismatch conditions. Note that soon after the auditory onset (<300 msec), the grand average AMPz for the match condition was higher than that for the mismatch condition in the gamma frequency range (around 35 Hz). (b) Topographic maps of significant differences in AMPz between the match and mismatch conditions. The solid-line circles and the dotted-line circles show significantly (p < .05, FDR corrected, N = 19) higher AMPz in the match than in the mismatch condition and in the mismatch than in the match condition, respectively. (c) Topographic maps of significant differences in PLVz between the match and mismatch conditions. The solid lines and the dotted line show significantly (p < .05, FDR corrected, N = 19) higher PLVz in the match than in the mismatch condition and in the mismatch than in the match condition, respectively. (d) Time-frequency diagrams of grand average PLVz (in standard deviation units). PLVz was averaged across all 36 electrode pairs and 19 infants for the match (d, left) and mismatch (d, middle) conditions. Positive PLVz (warm colours) indicates standardized changes in the direction of increased synchronization and negative PLVz (cold colours) indicates decreased synchronization. (d, right) Difference in grand average PLVz between the match and mismatch conditions. Note that after the auditory onset, the grand average PLVz for the mismatch condition displays a more sustained pattern than that for the match condition in the alpha-to-beta frequency range.
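The PLVz measure in the caption standardizes the phase-locking value (PLV), which quantifies how consistently the phase difference between two electrodes repeats across trials. A minimal sketch of the underlying computation (not the authors' pipeline; the frequency, trial count, and synthetic signals below are placeholder assumptions):

```python
import numpy as np
from scipy.signal import hilbert

def plv(x, y):
    """Phase-locking value between two band-limited signals.

    x, y: arrays of shape (n_trials, n_samples).
    Returns the PLV per sample, in [0, 1]: 1 means the phase lag
    between the two signals is identical on every trial.
    """
    dphi = np.angle(hilbert(x, axis=1)) - np.angle(hilbert(y, axis=1))
    return np.abs(np.mean(np.exp(1j * dphi), axis=0))

# Synthetic demo: 20 trials of 10 Hz signals, sampled at 250 Hz.
rng = np.random.default_rng(0)
t = np.arange(0, 1, 1 / 250)
x = np.tile(np.sin(2 * np.pi * 10 * t), (20, 1))
locked = np.tile(np.sin(2 * np.pi * 10 * t + 0.5), (20, 1))       # fixed lag
jittered = np.stack([np.sin(2 * np.pi * 10 * t + rng.uniform(0, 2 * np.pi))
                     for _ in range(20)])                          # random lag per trial

print(plv(x, locked).mean())    # ≈ 1: lag is consistent across trials
print(plv(x, jittered).mean())  # much lower: lag varies across trials
```

In practice the signals would first be band-pass filtered (e.g., to the alpha-to-beta range discussed in the caption) and the resulting PLV z-scored against a baseline window, which is what the z in PLVz denotes.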
... Conversely, voiceless stops (e.g., /k/), affricates (e.g., /tʃ/) and front-unrounded vowels (e.g., /i/) tend to be associated with spiky shapes [10]. These associations emerge on forced choice tasks in a bias to pair nonwords with certain shapes (e.g., [12]), on implicit tasks in facilitated responses to congruent nonword-shape pairings (e.g., [13]), and in different neural responses to congruent vs. incongruent pairings (e.g., [14]). In this study we address a pair of unanswered questions. ...
... There is also neuroimaging evidence of sensitivity to sound symbolism emerging by 12 months. Asano et al. [14] studied 49 11-month-olds with electroencephalogram (EEG) based measures of brain activity. Participants were first shown a shape and then heard a congruent or incongruent nonword. ...
... Although we only observed marginal sensitivity to sound symbolism at 12 months, other studies have found an effect in infants of that age [14,31,32]. If this is indeed the age by which sound symbolism emerges, what factors might contribute to its development? ...
Article
Full-text available
The maluma/takete effect refers to an association between certain language sounds (e.g., /m/ and /o/) and round shapes, and other language sounds (e.g., /t/ and /i/) and spiky shapes. This is an example of sound symbolism and stands in opposition to arbitrariness of language. It is still unknown when sensitivity to sound symbolism emerges. In the present series of studies, we first confirmed that the classic maluma/takete effect would be observed in adults using our novel 3-D object stimuli (Experiments 1a and 1b). We then conducted the first longitudinal test of the maluma/takete effect, testing infants at 4, 8 and 12 months of age (Experiment 2). Sensitivity to sound symbolism was measured with a looking time preference task, in which infants were shown images of a round and a spiky 3-D object while hearing either a round- or spiky-sounding nonword. We did not detect a significant difference in looking time based on nonword type. We also collected a series of individual difference measures including measures of vocabulary, movement ability and babbling. Analyses of these measures revealed that 12-month-olds who babbled more showed a greater sensitivity to sound symbolism. Finally, in Experiment 3, we had parents take home round or spiky 3-D printed objects, to present to 7- to 8-month-old infants paired with either congruent or incongruent nonwords. This language experience had no effect on subsequent measures of sound symbolism sensitivity. Taken together these studies demonstrate that sound symbolism is elusive in the first year, and shed light on the mechanisms that may contribute to its eventual emergence.
... In past studies, the interpretation of PMN and N400 was dependent on the nature of the task and stimulus words. Moreover, previous studies on the sound symbolism of pseudowords mostly involved judgment tasks that use simple pictures of round or angular shapes with no deeper meaning (Kovic et al., 2010; Asano et al., 2015; Sučević et al., 2015). In contrast, the task setting in this study was designed to reflect semantic processing by asking the participants to determine whether the depicted situation or object matched or mismatched the sound stimuli (existing sound symbolic words or pseudowords). ...
... Therefore, the maximum amplitude between 250 and 350 ms after the sound onset was defined as PMN amplitude (Lee et al., 2012) and its latency was defined as PMN latency. The average amplitude between 350 and 500 ms after the sound onset was defined as N400 amplitude (D'Arcy et al., 2004;Asano et al., 2015;Manfredi et al., 2017). We analyzed PMN and N400 using Matlab R2022b (MathWorks, Inc., USA). ...
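The window definitions quoted above translate directly into a peak-and-mean extraction. A minimal sketch in Python rather than Matlab, assuming a single-channel ERP and treating the PMN "maximum amplitude" as the peak negativity in its window (an interpretive assumption, since PMN is a negative-going component):

```python
import numpy as np

def erp_measures(erp, times):
    """Extract PMN and N400 measures from a single-channel ERP.

    erp:   1-D array of amplitudes (µV), one value per sample.
    times: 1-D array of sample times in ms relative to sound onset.
    PMN: peak (most negative) amplitude and its latency in 250-350 ms.
    N400: mean amplitude in 350-500 ms.
    """
    pmn_win = (times >= 250) & (times <= 350)
    n400_win = (times >= 350) & (times <= 500)
    pmn_idx = np.argmin(erp[pmn_win])            # most negative sample
    pmn_amp = erp[pmn_win][pmn_idx]
    pmn_lat = times[pmn_win][pmn_idx]
    n400_amp = erp[n400_win].mean()
    return pmn_amp, pmn_lat, n400_amp

# Toy ERP sampled at 250 Hz: a negative deflection peaking at 300 ms.
times = np.arange(0, 600, 4)
erp = -3.0 * np.exp(-((times - 300) ** 2) / (2 * 40 ** 2))
amp, lat, n400 = erp_measures(erp, times)
print(amp, lat, n400)   # peak of -3 µV at 300 ms; negative N400 mean
```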
Article
Full-text available
Introduction: Sound symbolism is the phenomenon of sounds having non-arbitrary meaning, and it has been demonstrated that pseudowords with sound symbolic elements have similar meaning to lexical words. It is unclear how the impression given by the sound symbolic elements is semantically processed, in contrast to lexical words with definite meanings. In event-related potential (ERP) studies, phonological mapping negativity (PMN) and N400 are often used as measures of phonological and semantic processing, respectively. Therefore, in this study, we analyze PMN and N400 to clarify the differences between existing sound symbolic words (onomatopoeia or ideophones) and pseudowords in terms of semantic and phonological processing. Methods: An existing sound symbolic word and pseudowords were presented as auditory stimuli in combination with a picture of an event, and PMN and N400 were measured while the subjects determined whether the sound stimuli and pictures matched or mismatched. Results: In both the existing word and pseudoword tasks, the amplitude of PMN and N400 increased when the picture of an event and the speech sound did not match. Additionally, compared to the existing words, the pseudowords elicited a greater amplitude for PMN and N400. In addition, PMN latency was delayed in the mismatch condition relative to the match condition for both existing sound symbolic words and pseudowords. Discussion: We concluded that established sound symbolic words and sound symbolic pseudowords undergo similar semantic processing. This finding suggests that sound symbolic pseudowords are not judged on a simple impression level (e.g., spiky/round) or activated by other words with similar spellings (phonological structures) in the lexicon, but are judged on a similar contextual basis as actual words.
... Although several studies have considered sound-shape correspondences between heard nonsense words (e.g., bouba/kiki) and seen abstract shapes in early development (Asano et al., 2015;Chow et al., 2021;Fort et al., 2018;Maurer et al., 2006;Ozturk et al., 2013;Pejovic & Molnar, 2017;Tzeng et al., 2017), less is known about sound-shape correspondences between heard nonsense words and abstract shapes that are touched but not seen, audio-tactile (AT) associations. Prior studies highlight the role of visual experience in establishing correspondences between other senses, namely AT associations. ...
... Studies on crossmodal correspondences constitute a fruitful area of investigation in which psychologists consider, inter alia, the influence of correspondences on behavioural performance (Brunetti et al. 2017), cultural differences in experienced correspondences (Wan et al. 2014), and the emergence of correspondences in ontogenetic development (Asano et al. 2015). It is believed that crossmodal correspondences occur mainly due to similarities in the way corresponding elements are represented, the learned statistical associations between corresponding elements, and possession of common semantic labels, as in the case of 'high' sounds and 'high' visual elements (Deroy and Spence 2016; Parise 2016; Spence 2011). ...
Chapter
Full-text available
It is common to characterize pain with touch-related terms, like ‘cutting’, ‘pressing’, ‘sharp’, and ‘pulsing’, or temperature-related terms, like ‘hot’ or ‘burning’. This suggests that many pains are phenomenally multimodal because they are experienced as having some tactile-like or thermal-like character. The goal of this chapter is to investigate the structure of phenomenally multimodal pain experiences. It is argued that the usual accounts of multimodal structure proposed in investigations regarding exteroceptive experiences cannot be plausibly applied to multimodal experiences of pain. Instead, an alternative framework is proposed which characterizes the structure of tactile-like and thermal-like pains by referring to the notion of crossmodal correspondences.
Article
This study investigates the crossmodal associations between naturally occurring sound textures and tactile textures. Previous research has demonstrated the association between low-level sensory features of sound and touch, as well as higher-level, cognitively mediated associations involving language, emotions, and metaphors. However, stimuli like textures, which are found in both modalities, have received less attention. In this study, we conducted two experiments: a free association task and a two-alternative forced-choice task using everyday tactile textures and sound textures selected from natural sound categories. The results revealed consistent crossmodal associations reported by participants between the textures of the two modalities. They tended to associate more sound textures (e.g., wood shavings and sandpaper) with tactile surfaces that were rated as harder, rougher, and intermediate on the sticky-slippery scale. While some participants based the auditory-tactile association on sensory features, others made the associations based on semantic relationships, co-occurrence in nature, and emotional mediation. Interestingly, the statistical features of the sound textures (mean, variance, kurtosis, power, autocorrelation, and correlation) did not show significant correlations with the crossmodal associations, indicating a higher-level association. This study provides insights into auditory-tactile associations by highlighting the role of sensory and emotional (or cognitive) factors in prompting these associations.
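The statistical features listed in that abstract (mean, variance, kurtosis, power, autocorrelation) are standard waveform summaries. A sketch of how such features might be computed from a sound texture waveform; the exact definitions (e.g., lag-1 autocorrelation, non-excess kurtosis) are assumptions here, not the study's specification:

```python
import numpy as np

def texture_stats(x):
    """Summary statistics of a sound-texture waveform.

    Returns mean, variance, (non-excess) kurtosis, mean power,
    and lag-1 autocorrelation of the samples in x.
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    var = x.var()
    kurt = np.mean((x - mean) ** 4) / var ** 2 if var > 0 else np.nan
    power = np.mean(x ** 2)
    xc = x - mean
    autocorr1 = np.dot(xc[:-1], xc[1:]) / np.dot(xc, xc)
    return {"mean": mean, "variance": var, "kurtosis": kurt,
            "power": power, "autocorr1": autocorr1}

# White Gaussian noise as a toy "texture": kurtosis ≈ 3,
# lag-1 autocorrelation ≈ 0.
rng = np.random.default_rng(1)
stats = texture_stats(rng.standard_normal(10000))
print(stats)
```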
Article
Previous studies showed that word learning is affected by children's existing knowledge. For instance, knowledge of semantic category aids word learning, whereas a dense phonological neighbourhood impedes learning of similar-sounding words. Here, we examined to what extent children associate similar-sounding words (e.g., rat and cat) with objects of the same semantic category (e.g., both are animals), that is, to what extent children assume meaning overlap given form overlap between two words. We tested this by first presenting children (N = 93, mean age = 22.4 months) with novel word-object associations. Then, we examined the extent to which children assume that a similar-sounding novel label, that is, a phonological neighbour, refers to a similar-looking object, that is, a likely semantic neighbour, as opposed to a dissimilar-looking object. Were children to preferentially fixate the similar-looking novel object, it would suggest that systematic word form-meaning relations aid referent selection in young children. While we did not find any evidence for such word form-meaning systematicity, we demonstrated that children showed robust learning for the trained novel word-object associations, and were able to discriminate between similar-sounding labels and also similar-looking objects. Thus, we argue that unlike iconicity, which appears early in vocabulary development, we find no evidence for systematicity in early referent selection.
Article
Kemmerer captured the drastic change in theories of word meaning representations, contrasting the view that word meaning representations are amodal and universal with the view that they are grounded and language-specific. However, he does not address how language can be simultaneously grounded and language-specific. Here, we approach this question from the perspective of language acquisition and evolution. We argue that adding a new element, iconicity, is critically beneficial and offer the iconicity ring hypothesis, which explains how language-specific, secondary iconicity might emerge from biologically grounded and universally shared iconicity in the course of language acquisition and evolution.
Article
Infants experience language in rich multisensory environments. For example, they may first be exposed to the word applesauce while touching, tasting, smelling, and seeing applesauce. In three experiments using different methods we asked whether the number of distinct senses linked with the semantic features of objects would impact word recognition and learning. Specifically, in Experiment 1 we asked whether words linked with more multisensory experiences were learned earlier than words linked with fewer multisensory experiences. In Experiment 2, we asked whether 2-year-olds' known words linked with more multisensory experiences were better recognized than those linked with fewer. Finally, in Experiment 3, we taught 2-year-olds labels for novel objects that were linked with either just visual or visual and tactile experiences and asked whether this impacted their ability to learn the new label-to-object mappings. Results converge to support an account in which richer multisensory experiences better support word learning. We discuss two pathways through which rich multisensory experiences might support word learning.
Article
Full-text available
We investigated grapheme-colour synaesthesia and found that: (1) the induced colours led to perceptual grouping and pop-out, (2) a grapheme rendered invisible through 'crowding' or lateral masking induced synaesthetic colours, a form of blindsight, and (3) peripherally presented graphemes did not induce colours even when they were clearly visible. Taken collectively, these and other experiments prove conclusively that synaesthesia is a genuine perceptual phenomenon, not an effect based on memory associations from childhood or on vague metaphorical speech. We identify different subtypes of number-colour synaesthesia and propose that they are caused by hyperconnectivity between colour and number areas at different stages in processing; lower synaesthetes may have cross-wiring (or cross-activation) within the fusiform gyrus, whereas higher synaesthetes may have cross-activation in the angular gyrus. This hyperconnectivity might be caused by a genetic mutation that causes defective pruning of connections between brain maps. The mutation may further be expressed selectively (due to transcription factors) in the fusiform or angular gyri, and this may explain the existence of different forms of synaesthesia. If expressed very diffusely, there may be extensive cross-wiring between brain regions that represent abstract concepts, which would explain the link between creativity, metaphor and synaesthesia (and the higher incidence of synaesthesia among artists and poets). Also, hyperconnectivity between the sensory cortex and amygdala would explain the heightened aversion synaesthetes experience when seeing numbers printed in the 'wrong' colour. Lastly, kindling (induced hyperconnectivity in the temporal lobes of temporal lobe epilepsy [TLE] patients) may explain the purp...
Article
Full-text available
Sound symbolism, or the nonarbitrary link between linguistic sound and meaning, has often been discussed in connection with language evolution, where the oral imitation of external events links phonetic forms with their referents (e.g., Ramachandran & Hubbard, 2001). In this research, we explore whether sound symbolism may also facilitate synchronic language learning in human infants. Sound symbolism may be a useful cue particularly at the earliest developmental stages of word learning, because it potentially provides a way of bootstrapping word meaning from perceptual information. Using an associative word learning paradigm, we demonstrated that 14-month-old infants could detect Köhler-type (1947) shape-sound symbolism, and could use this sensitivity in their effort to establish a word-referent association.
Article
Full-text available
Sound symbolism is a non-arbitrary relationship between speech sounds and meaning. We review evidence that, contrary to the traditional view in linguistics, sound symbolism is an important design feature of language, which affects online processing of language, and most importantly, language acquisition. We propose the sound symbolism bootstrapping hypothesis, claiming that (i) pre-verbal infants are sensitive to sound symbolism, due to a biologically endowed ability to map and integrate multi-modal input, (ii) sound symbolism helps infants gain referential insight for speech sounds, (iii) sound symbolism helps infants and toddlers associate speech sounds with their referents to establish a lexical representation and (iv) sound symbolism helps toddlers learn words by allowing them to focus on referents embedded in a complex scene, alleviating Quine's problem. We further explore the possibility that sound symbolism is deeply related to language evolution, drawing the parallel between historical development of language across generations and ontogenetic development within individuals. Finally, we suggest that sound symbolism bootstrapping is a part of a more general phenomenon of bootstrapping by means of iconic representations, drawing on similarities and close behavioural links between sound symbolism and speech-accompanying iconic gesture.
Article
The common approach to the multiplicity problem calls for controlling the familywise error rate (FWER). This approach, though, has faults, and we point out a few. A different approach to problems of multiple significance testing is presented. It calls for controlling the expected proportion of falsely rejected hypotheses: the false discovery rate. This error rate is equivalent to the FWER when all hypotheses are true but is smaller otherwise. Therefore, in problems where the control of the false discovery rate rather than that of the FWER is desired, there is potential for a gain in power. A simple sequential Bonferroni-type procedure is proved to control the false discovery rate for independent test statistics, and a simulation study shows that the gain in power is substantial. The use of the new procedure and the appropriateness of the criterion are illustrated with examples.
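The step-up procedure described here, now known as the Benjamini-Hochberg procedure (and the "FDR corrected" threshold cited in the figure caption above), sorts the m p-values, finds the largest k with p_(k) <= (k/m)·q, and rejects the hypotheses with the k smallest p-values. A compact sketch with illustrative p-values:

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at level q.

    Finds the largest k such that the k-th smallest p-value satisfies
    p_(k) <= (k / m) * q, and rejects hypotheses 1..k. Returns a boolean
    rejection mask in the original order of pvals.
    """
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # largest index meeting the bound
        reject[order[: k + 1]] = True
    return reject

# Eight hypothetical tests: two clearly significant, the rest not.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.74]
rejected = benjamini_hochberg(pvals, q=0.05)
print(rejected)   # only the first two survive the step-up bound
```

Note that p = 0.039 is below the unadjusted 0.05 level but fails the step-up bound (its threshold is 0.05 × 3/8 = 0.01875), which is exactly the gain in error control over naive per-test thresholds.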
Article
Language dominance and factors that influence language lateralization were investigated in right-handed, neurologically normal subjects (n = 100) and right-handed epilepsy patients (n = 50) using functional MRI. Increases in blood oxygenation-dependent signal during a semantic language activation task relative to a non-linguistic, auditory discrimination task provided an index of language system lateralization. As expected, the majority of both groups showed left hemisphere dominance, although a continuum of activation asymmetry was evident, with nearly all subjects showing some degree of right hemisphere activation. Using a categorical dominance classification, 94% of the normal subjects were considered left hemisphere dominant and 6% had bilateral, roughly symmetric language representation. None of the normal subjects had rightward dominance. There was greater variability of language dominance in the epilepsy group, with 78% showing left hemisphere dominance, 16% showing a symmetric pattern and 6% showing right hemisphere dominance. Atypical language dominance in the epilepsy group was associated with an earlier age of brain injury and with weaker right hand dominance. Language lateralization in the normal group was weakly related to age, but was not significantly related to sex, education, task performance or familial left-handedness.
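The study summarizes activation asymmetry as a continuous lateralization index. The exact formula is not given in this excerpt; a conventional form (an assumption here, not necessarily the authors' definition) divides the left-right difference by the total activation:

```python
def laterality_index(left, right):
    """Conventional laterality index LI = (L - R) / (L + R).

    left, right: activation measures (e.g., suprathreshold voxel
    counts) for homologous left- and right-hemisphere regions.
    +1 = fully left-lateralized, -1 = fully right-lateralized;
    values near 0 indicate roughly symmetric representation.
    """
    if left + right == 0:
        raise ValueError("no activation in either hemisphere")
    return (left - right) / (left + right)

# A hypothetical left-dominant pattern with some right-hemisphere activity:
print(laterality_index(820, 180))   # -> 0.64
```

A categorical dominance classification like the one reported (left, bilateral, right) then amounts to thresholding this continuous index.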