Towards a Native OPERA Hypothesis: Musicianship and English Stress Perception

October 2021
Language and Speech 65(1)

Author: William Choi, The University of Hong Kong

The hit and false alarm rates of musicians and non-musicians in the English stress …

The mean sensitivity index of musicians and non-musicians in the English stress …

The mean hit rate and response time of musicians and non-musicians for trochaic and …

Musical experience of the musicians

Comparison of age, non-verbal intelligence, and short-term memory between English musicians and non-musicians.

Running head: MUSIC-TO-LANGUAGE TRANSFER 1

Towards a Native OPERA Hypothesis:

Musicianship and English Stress Perception

William Choi

Academic Unit of Human Communication, Development, and Information Sciences,

The University of Hong Kong

Choi, W. (2021). Towards a native OPERA hypothesis: Musicianship and English stress

perception. Language and Speech. Advance online publication.

doi:10.1177/00238309211049458.

Address for correspondence: Room 765, Meng Wah Complex, The University of Hong Kong,

Pokfulam, Hong Kong; willchoi@hku.hk

Abstract

Musical experience facilitates speech perception. French musicians, to whom stress is foreign,

have been found to perceive English stress more accurately than French non-musicians. This

study investigated whether this musical advantage also applies to native listeners. English

musicians and non-musicians completed an English stress discrimination task and two control

tasks. With age, non-verbal intelligence and short-term memory controlled, the musicians

exhibited a perceptual advantage relative to the non-musicians. This perceptual advantage was

equally potent for both trochaic and iambic stress patterns. In terms of perceptual strategy, the two

groups showed differential use of acoustic cues for iambic but not trochaic stress. Collectively,

the results could be taken to suggest that musical experience enhances stress discrimination even

among native listeners. Remarkably, this musical advantage is highly consistent and does not

particularly favour either stress pattern. For iambic stress, the musical advantage appears to stem

from the differential use of acoustic cues by musicians. For trochaic stress, the musical

advantage may be rooted in enhanced durational sensitivity.

Keywords: Music-to-language transfer, stress, music, pitch, rhythm, OPERA

Towards a Native OPERA Hypothesis:

Musicianship and English Stress Perception

Long-term musical experience facilitates speech perception (Patel, 2011; 2014). Research

has frequently shown that musicians are better able to perceive lexical tones than non-musicians

(e.g., Alexander, Wong, & Bradlow, 2005; Choi, 2020; Kraus & Chandrasekaran, 2010; Zheng

& Samuel, 2018). These findings underpin contemporary theories of cross-domain transfer, the

most notable of which is the OPERA hypothesis (Patel, 2011; 2014). Unfortunately, most

research has only focused on lexical tone perception among non-native listeners. This has led to

research gaps concerning whether the OPERA hypothesis applies to native speech perception

and, more specifically, to other prosodic features such as stress. To ascertain the generalisability

of the OPERA hypothesis, this study investigates (i) whether English musicians outperform

English non-musicians on English stress discrimination. To provide additional insight into

music-to-language transfer, this study further examines (ii) whether the musical advantage is

selective about stress pattern, and (iii) the means by which musical experience enhances native

stress discrimination.

The OPERA hypothesis proposes that musical experience facilitates speech encoding

when five conditions are met: the neural networks for music and speech must overlap

anatomically (Overlap), and the music activities must entail more precise acoustic processing

than speech (Precision), bring about strong positive emotion (Emotion), repeat frequently

(Repetition), and require focused attention (Attention) (Patel, 2011; 2014). Patel’s hypothesis is

well supported by empirical studies, most of which have shown a musical advantage in lexical

tone perception (see Choi, 2020). For example, English musicians identified and discriminated

Mandarin tones more accurately than did English non-musicians (Alexander et al., 2005). Quite

surprisingly, the English musicians even performed on a par with native Mandarin listeners.

English musicians’ perceptual advantage in Mandarin tone discrimination was also evident at the

phrase level (Zheng & Samuel, 2018). Neurophysiologically, French musicians also showed a

larger P3b response to Mandarin tonal and segmental variations than French non-musicians

(Marie, Delogu, Lampis, Belardinelli, & Besson, 2011).

In addition to the apparent acoustic similarities between lexical tones and musical pitch,

stress patterns coincide with the metrical structures in music (Henrich, Alter, Wiese, & Domahs,

2014; Palmer & Kelly, 1992; Patel, 2003; Lerdahl, 2001; see Gandour, 1981; Tong, Choi, &

Man, 2018 for lexical tones). Lexical stress is the relative prominence assigned to a certain

syllable in a word (Teschner & Whitley, 2004). In English, stressed syllables are typically

associated with higher fundamental frequency (f0), longer duration, and higher intensity (Choi,

2021a; Choi, Tong, & Samuel, 2019; Choi, Tong, & Singh, 2017; Fry, 1958; Wang, 2008; Yu &

Andruski, 2010). Unstressed syllables typically exhibit a vowel quality change (e.g., the second

vowel in harmony /ˈhɑməni/ is reduced to /ə/), although this is not necessarily the case (e.g., the

second vowel in import /ˈɪmport/ is not reduced). Similar to English speech, music is

characterised by repeated sequences of stressed and unstressed beats (rhythm; Toussaint, 2005).

Analyses of English- (with stress) and French- (without stress) composed music revealed that the

metrical structures paralleled the composers’ spoken language (Patel & Daniele, 2003a; 2003b).

This finding, together with the commonalities between lexical stress and musical rhythm, gives

rise to the possibility of cross-domain transfer between these two features.

Indeed, musical experience facilitates English stress perception among French listeners.

Unlike English, French does not use lexical stress contrastively (it is a fixed stress language;

Garde, 1968). In an AX discrimination task, French listeners could discriminate lexical stress

contrasts with a very low error rate (3%; Dupoux, Pallier, Sebastian, & Mehler, 1997). In an

event-related potential study, French listeners also showed a mismatch negativity (MMN)

response to stress violations, reflecting pre-attentive stress discrimination (Aguilera, El Yagoubi,

Espesser, & Astésano, 2014). The lack of MMN response in the reverse oddball task further led

to the claim that French listeners had long-term memory traces of stress. Although French

listeners are behaviourally and neurophysiologically sensitive to stress, they struggle to perceive

stress at more abstract perceptual levels. Specifically, French listeners recall stress sequences

with very high error rates (49% and 73%; Dupoux, Peperkamp, & Sebastian-Galles, 2010). Of

direct relevance to the current study is that French listeners’ difficulties in recalling stress

sequences could be mitigated by musical experience (Kolinsky, Cuvelier, Goetry, Peretz, &

Morais, 2009). In particular, French musicians were better able to recall stress sequences than

French non-musicians. This musical advantage was evident at all sequence lengths, which

suggested enhanced perception rather than increased memory span.

Considering the above findings in light of the OPERA hypothesis, musical experience

does facilitate non-native speech perception (e.g., Alexander et al., 2005; Kolinsky et al., 2009;

Zheng & Samuel, 2018; cf. Schellenberg, 2015). Here, a critical question arises as to whether the

OPERA hypothesis is also applicable to native speech perception. Subcortically, English

musicians were more sensitive to English consonantal changes (/ba/, /da/, and /ga/) than English

non-musicians (Parbery-Clark, Tierney, Strait, & Kraus, 2012). In terms of speech prosody,

French musicians exhibited a larger P200 response to metrical violations in naturally produced

French (Marie, Magne, & Besson, 2010). Collectively, these results offer some support to the

notion that musical experience facilitates native speech perception. The current study

hypothesises that English musicians discriminate English stress more accurately than do English

non-musicians. English stress is chosen not only because of the formerly established non-native

musical advantage but also because English stress sensitivity contributes to reading

comprehension among English children (Holliman, Wood, & Sheehy, 2010; 2012; Kolinsky et

al., 2009).

Another way to extend the OPERA hypothesis is to examine the selectivity of musical

advantage. In a recent study, English musicians and non-musicians completed a Cantonese tone

discrimination task and a Cantonese tone sequence recall task (Choi, 2020). In both tasks, the

musicians outperformed the non-musicians in only half of the tonal contexts. This indicated that

musical experience only facilitated the perception of certain Cantonese tones. English stress

contains trochaic and iambic stress patterns (Ladd, 2008). In a trochaic stress pattern, a stressed

syllable precedes an unstressed one (e.g., CAmel) and vice-versa for an iambic stress pattern

(e.g., caNAL). Relative to the trochaic stress pattern, the iambic stress pattern is less common

and acquired later by English infants (Cutler, 2014; Jusczyk, Cutler, & Redanz, 1993). Thus, it is

possible that the musical advantage is more pronounced for iambic than trochaic stress patterns.

The current study tests this hypothesis.

A further goal of this study is to explore the means by which musical experience

facilitates native stress discrimination. One possibility is that musical experience alters listeners’

choice of acoustic cues, for which support is drawn from a tone perception study (Choi, 2020). In

the high-rising tone context for which musical advantage was shown, English musicians and

non-musicians attended to different acoustic cues (i.e., f0 contour and f0 onset, respectively).

However, in the low-rising tone context for which musical advantage was absent, the two groups

attended to the same acoustic cue (i.e., f0 contour). As mentioned above, stress is signalled by

f0, duration, and intensity (Choi et al., 2017; 2019; Fry, 1958; Wang, 2008; Yu & Andruski,

2010). It is possible that English musicians and non-musicians attend to different acoustic cues

for stress discrimination. It is also possible that musicians and non-musicians attend to the same

acoustic cues but with different relative weights assigned to each. Drawing on parallels with

cross-linguistic research, Russian and English listeners attended to the same set of acoustic cues

for English stress perception (Chrabaszcz, Winn, Lin, & Idsardi, 2014). However, the Russian

and English listeners showed different weighting patterns among f0, duration, and intensity cues:

the f0 cue was weighted most heavily by the English listeners (f0 > intensity > duration) but least

heavily by the Russian listeners (intensity > duration > f0). Based on the above findings, it is

possible that musical experience drives listeners to rely on a different set of acoustic cues or to

rely differently on the same set of acoustic cues for English stress discrimination.

The main theme of this study is music-to-language transfer. In the literature, correlational

and intervention designs have been frequently adopted. Correlational studies compare musicians

and non-musicians on variables of interest, such as lexical tone sensitivity (e.g., Alexander et al.,

2005; Choi, 2020; Kolinsky et al., 2009; Zheng & Samuel, 2018). As the groups are pre-defined,

the correlational design guarantees that musicians have many years of musical experience. This

is particularly useful for studying cross-domain transfer, as long-term musical experience

induces more prominent plastic changes than does short-term musical experience (Patel, 2011;

2014). However, the standard caveat of correlational design is weak causal inference (see

Corrigall, Schellenberg, & Misura, 2013; Schellenberg, 2015). Intervention studies typically

involve two or three groups, each of which receives music training, music-irrelevant training, or

no training (e.g., Moreno, Marques, Santos, Santos, Castro, & Besson, 2009; Nan et al., 2018).

Clearly, this design permits a stronger causal inference. Nevertheless, laboratory training only

lasts for weeks or months so this design reduces the possibility of studying the long-term effect

of musical experience. Correlational and intervention designs have their own merits and

limitations, which makes both types of research necessary. As long-term musical experience is

crucial for music-to-language transfer, the current study adopts a correlational design as a first

step.

The overarching goal of this study is to investigate (1) whether English musicians exhibit

a perceptual advantage in English stress discrimination. To elucidate the potential musical

advantage, this study further examines (2) whether the musical advantage is selective about

stress patterns, and (3) the means by which musical experience enhances English stress

discrimination. Given the possible influence of non-verbal intelligence and short-term memory

on English stress perception, these two constructs were controlled (Choi et al., 2019; see also

Asaridou, Hagoort, & McQueen, 2015; Bidelman, Hutka, & Moreno, 2013; Hutka, Bidelman, &

Moreno, 2015). To this end, participants were also tested on non-verbal intelligence and short-

term memory. To minimise testing time, I adopted two tasks that could provide quick and

reliable estimates of the above constructs among English listeners (Choi, 2020; Choi et al., 2019;

Zheng & Samuel, 2018).

Methods

Participants

Forty native English listeners were recruited at University College London through an

online participant recruitment system. Based on the criteria adopted in previous studies (Choi,

2020; 2021b; Tong et al., 2018), the listeners were assigned to the musician (n = 20) and non-

musician (n = 20) groups. All musicians had received at least seven years of continuous music

training and were able to play their instruments at the time of testing. All non-musicians had

received no more than two years of music training, if any. None of them had received any music

training in the preceding five years, and none could play any musical instrument at the time of

testing. Two non-musicians and one musician were excluded from the study due to no-show,

excessive music training (non-musician), and Mandarin learning experience. Thus, there were 19

musicians (5 male, 14 female; M_age = 26.63 years, SD = 5.89 years) and 18 non-musicians (8

male, 10 female; M_age = 32.67 years, SD = 11.60 years) in the final sample.

Table 1 summarises the musical experience of the musicians. On average, the musicians

had received 11.63 years of music training (SD = 3.90 years) with a mean onset age of 7.84 years

(SD = 2.89 years). The non-musicians had received 0.90 years of music training (SD = 1.56

years). For the non-musicians who had received music training, their mean onset age of music

training was 12.00 years (SD = 4.86 years). None of the participants in the study reported having

absolute pitch.

English Stress Discrimination Task

Stimuli. Four pairs of real English words, /ˈpɚmɪt - pɚˈmɪt/, /ˈsəspekt - səsˈpekt/, /ˈɪnsɚt

- ɪnˈsɚt/, and /ˈɪmpɔrt - ɪmˈpɔrt/ (permit, suspect, insert, import) were recorded at a sampling rate

of 48 kHz. All stimuli were naturally produced by two native English speakers (one male and

one female). The recording was made in a sound-shielded booth.

Material Presentation. An AX paradigm was adopted. In each trial, two real words were

audibly presented via Sennheiser HD280 PRO headphones. The inter-stimulus interval was 600

ms. The two real words either carried the same (e.g., /ˈɪnsɚt - ˈɪnsɚt/) or different stress (e.g.,

/ˈɪnsɚt - ɪnˈsɚt/). To prevent the listeners from adopting an ad-hoc acoustic strategy, the two real

words in each trial were produced by speakers of different genders. The voice order was random

within each trial.
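
The trial structure can be made concrete with a short sketch. It combines the AX design described here with the factor counts reported in the Procedure paragraph (8 stimuli × 2 speaker orders × 2 trial types × 3 repetitions). It assumes that the eight stimuli are the four words crossed with the two stress patterns, each serving as the first item of a trial; this is an interpretation and is not spelled out in the text. The dictionary layout and function names are hypothetical.

```python
from itertools import product

# Word and pattern labels follow the stimuli described above; the dictionary
# layout and function names are hypothetical.
WORDS = ["permit", "suspect", "insert", "import"]
PATTERNS = ["trochaic", "iambic"]
SPEAKER_ORDERS = [("male", "female"), ("female", "male")]
TRIAL_TYPES = ["same", "different"]
REPETITIONS = 3


def opposite(pattern):
    """Return the other stress pattern."""
    return "iambic" if pattern == "trochaic" else "trochaic"


def build_trials():
    """Cross 8 stimuli x 2 speaker orders x 2 trial types x 3 repetitions."""
    stimuli = list(product(WORDS, PATTERNS))  # assumed: 4 words x 2 patterns = 8
    trials = []
    for (word, first), voices, trial_type, rep in product(
        stimuli, SPEAKER_ORDERS, TRIAL_TYPES, range(REPETITIONS)
    ):
        # "same" trials repeat the stress pattern; "different" trials flip it.
        second = first if trial_type == "same" else opposite(first)
        trials.append(
            {
                "word": word,
                "first_pattern": first,
                "second_pattern": second,
                "voices": voices,  # the two items always differ in speaker gender
                "trial_type": trial_type,
                "repetition": rep,
            }
        )
    return trials


trials = build_trials()
print(len(trials))  # 96
```

Under these assumptions, half of the 96 trials (48) are different trials, which matches the number of different trials entered into the acoustic–behavioural analysis in the Results.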

Procedure. Listeners were asked to judge, as quickly as possible, whether the two real

words carried the same stress. They responded by pushing keyboard buttons ([f] for same, [j] for

different). The accuracy and response time were recorded for each trial. Prior to the experimental

trials, six practice trials with feedback were run. There were 96 trials (8 stimuli × 2 speaker

orders × 2 trial types × 3 repetitions). A sensitivity index (d’) was obtained based on the hits and

false alarms for the same and different trials (see Figure 1). The sample-specific reliabilities were

high (α = .87 for musicians, α = .90 for non-musicians). This task has also been used successfully to assess

English stress discrimination among English listeners in a previous study (Choi et al., 2019).
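
For readers unfamiliar with the sensitivity index, the following is a minimal sketch of how d’ can be derived from hits and false alarms. It rests on several assumptions that the text does not confirm: a "different" response on a different trial counts as a hit, a "different" response on a same trial counts as a false alarm, the simple yes/no formula d’ = z(hit rate) − z(false alarm rate) is used rather than a model specific to same–different designs, and a log-linear correction (adding 0.5 to each cell) guards against rates of 0 or 1. The function name and the example counts are hypothetical.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Yes/no d' with a log-linear correction (an assumed choice, see above)."""
    # Adding 0.5 to each count keeps both rates strictly between 0 and 1,
    # so the inverse normal transform below is always defined.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical listener: 40 of 48 different trials detected,
# 8 of 48 same trials wrongly judged "different".
example = d_prime(hits=40, misses=8, false_alarms=8, correct_rejections=40)
print(round(example, 2))
```

A listener who responds at chance obtains d’ = 0 under this formula, and larger values indicate better discrimination independent of response bias.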

Non-verbal Intelligence Task

This task consisted of 14 multiple-choice questions, all of which required participants to

organise pictures by a logical sequence under time pressure. In each trial, participants were given

30 seconds to choose the picture that best completed the visual pattern described in the question.

One point was awarded for each correct answer. This task has been used successfully in previous

studies to assess English listeners’ non-verbal intelligence (Choi, 2020; Choi et al., 2019; Zheng

& Samuel, 2018). The sample-specific reliabilities were moderate to high (α = .54 for musicians,

α = .79 for non-musicians).
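
The reliability coefficients reported for each task are presumably Cronbach's alpha; the text gives only the symbol α, so this is an assumption. As a sketch under that assumption, alpha can be computed from a participants-by-items score matrix as follows. The function name and the toy data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """scores: one list of item scores per participant (equal lengths)."""
    k = len(scores[0])  # number of items
    # Sample variance of each item across participants.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the participants' total scores.
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data: four participants, three items scored 0 or 1.
toy_scores = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
]
print(round(cronbach_alpha(toy_scores), 2))
```

Alpha approaches 1 when the items covary strongly, which is why the coefficient is computed separately for each group: a homogeneous group with little variance in total scores can yield a lower value even when the task itself is sound.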

Short-term Memory Task

This computerised task consisted of a plate displayed at the centre of a touchscreen. The

plate contained four coloured (red, green, blue, and yellow) wedges. On each trial, a sequence of

colours (e.g., yellow-blue-red) was presented. Following the presentation, the participants were

required to reproduce the colour sequence by tapping the corresponding wedges. The sequence

length started at one and increased by one after each correct response. The score started at zero,

and one point was awarded for each correctly reproduced sequence. For example, a participant

who correctly reproduced up to eight

sequences would score eight in that round. Each participant completed five rounds, from which

the median score was obtained. This task has also been used successfully in previous studies to

assess English listeners’ short-term memory (Choi, 2020; Choi et al., 2019; Zheng & Samuel,

2018). As in these previous studies, the sound was turned off so that the measure was

independent of auditory short-term memory. The sample-specific reliabilities were satisfactory to

high (α = .65 for musicians, α = .81 for non-musicians).
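
The scoring rule for this task can be sketched as follows. The sketch assumes that a round ends at the first incorrectly reproduced sequence, which the description implies but does not state outright. The function names and the example responses are hypothetical.

```python
from statistics import median


def score_round(responses_correct):
    """Count correctly reproduced sequences in one round.

    responses_correct holds one boolean per sequence, in order of
    increasing length (1, 2, 3, ...). The round is assumed to stop at
    the first error.
    """
    score = 0
    for correct in responses_correct:
        if not correct:
            break
        score += 1
    return score


def task_score(rounds):
    """Median round score across a participant's rounds."""
    return median(score_round(r) for r in rounds)


# Hypothetical participant with five rounds ending after 8, 6, 7, 5 and 9
# correct sequences respectively.
rounds = [[True] * n + [False] for n in (8, 6, 7, 5, 9)]
print(task_score(rounds))  # 7
```

Taking the median rather than the mean makes the final score less sensitive to a single unusually short or long round.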

Results

Musical Advantage in Stress Discrimination

To investigate whether the musicians exhibited a perceptual advantage in English stress

discrimination, a one-way analysis of covariance (ANCOVA) was conducted on d’ with group

(musician and non-musician) as the independent variable. Age, non-verbal intelligence, and

short-term memory were controlled (see Table 2; see also Appendix I). As expected, the

ANCOVA revealed a significant group difference, F(1, 32) = 9.62, p < .01, η² = .23, in which the

musicians discriminated English stress more accurately than did the non-musicians (see Figure

2). Correlational analyses further showed that d’ correlated significantly with years of music

training, r(35) = .37, p < .05, but not with onset age of music training, p = .398. This suggests

that for English stress discrimination, the amount of music training received matters more than

the age at which music training started.
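
As a consistency check on the reported statistics, the error degrees of freedom and the effect size can be recovered from the design. The sketch below assumes that the reported effect size is partial eta squared, computed as F × df_effect / (F × df_effect + df_error); the text does not say which variant was used. The function names are hypothetical.

```python
def ancova_error_df(n_participants, n_groups, n_covariates):
    """Error df for a one-way ANCOVA: N minus groups minus covariates."""
    return n_participants - n_groups - n_covariates


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio (assumed variant)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# 37 participants in the final sample, 2 groups, 3 covariates
# (age, non-verbal intelligence, short-term memory).
df_error = ancova_error_df(37, 2, 3)
print(df_error)  # 32

# Reported group effect: F(1, 32) = 9.62.
print(round(partial_eta_squared(9.62, 1, df_error), 2))  # 0.23
```

Both values agree with the reported F(1, 32) and effect size of .23, which is consistent with, though does not prove, the assumption that partial eta squared was reported.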

Selectivity of Musical Advantage

To evaluate the selectivity of the musical advantage, a two-way mixed ANCOVA was

conducted on hit rate with stress type (iambic and trochaic) as the within-subjects factor and

group (musician and non-musician) as the between-subjects factor. Age, non-verbal intelligence,

and short-term memory were also controlled. The ANCOVA revealed a significant main effect

of group, F(1, 32) = 5.94, p < .05, η² = .16 (see Figure 3). However, the main effect of stress

type, p = .171, and the interaction between stress type and group, p = .822, were non-significant.

Consistent with the earlier analysis, a clear musical advantage was found. Remarkably, this

musical advantage was highly consistent and did not particularly favour either stress pattern.

In terms of response time, the two-way mixed ANCOVA showed non-significant main

effects of group, p = .822, and stress type, p = .789. Their interaction effect was also non-

significant, p = .607. An analysis of the mean response time across all 96 trials yielded consistent

results (see Appendix II). The lack of a group difference in response time argues against a

speed–accuracy trade-off: the greater accuracy of the musicians over the non-musicians was not

because they had taken longer to respond.

Use of Acoustic Cues by Musicians and Non-musicians

All stimuli were analysed acoustically with Praat 6.0.50 (Institute of Phonetic Sciences,

University of Amsterdam, the Netherlands), yielding the set of acoustic parameters summarised

in Table 3 (see also Appendix III). For each stimulus, the f0, durational, and intensity ratios of

the first to second syllables were obtained (see Table 4).
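
The ratio computation can be illustrated with a brief sketch. The measurement units (Hz, ms, dB) and all numerical values below are assumptions made for illustration only; the actual measurements are those in Tables 3 and 4. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    """Acoustic measurements for one syllable (units are assumed)."""
    f0_hz: float
    duration_ms: float
    intensity_db: float


def cue_ratios(first, second):
    """First-to-second syllable ratio for each acoustic cue."""
    return {
        "f0": first.f0_hz / second.f0_hz,
        "duration": first.duration_ms / second.duration_ms,
        "intensity": first.intensity_db / second.intensity_db,
    }


# Hypothetical trochaic token: the first syllable is higher, longer and louder.
trochaic = cue_ratios(Syllable(220.0, 280.0, 72.0), Syllable(180.0, 240.0, 68.0))
print({cue: round(value, 2) for cue, value in trochaic.items()})
```

With this definition, a ratio above 1 means the first syllable carries more of that cue, as expected for trochaic stress, whereas iambic tokens would tend towards ratios below 1.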

To explore the use of acoustic cues by musicians and non-musicians, an acoustic–

behavioural correlational analysis was conducted. All different trials (N = 48) were extracted

from the dataset. Each different trial, as a single entry, contained eight variables: (1) the f0 ratio,

(2) durational ratio, and (3) intensity ratio of the trochaic stress stimulus; (4) the f0 ratio, (5)

durational ratio, and (6) intensity ratio of the iambic stress stimulus; (7) the trial-specific mean

accuracy of the musician group; and (8) the trial-specific mean accuracy of the non-musician

group.
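
The analysis described above amounts to correlating one acoustic column with one accuracy column across trials. A minimal sketch follows, assuming a Pearson correlation (the text does not name the coefficient). The field names and the four example trials are hypothetical and far fewer than the 48 trials actually analysed.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cue_accuracy_correlation(trials, cue_key, accuracy_key):
    """Correlate one acoustic ratio with one group's trial-level accuracy."""
    return pearson_r(
        [t[cue_key] for t in trials],
        [t[accuracy_key] for t in trials],
    )


# Hypothetical entries: one dict per different trial.
example_trials = [
    {"iambic_duration_ratio": 0.55, "musician_accuracy": 0.95},
    {"iambic_duration_ratio": 0.70, "musician_accuracy": 0.89},
    {"iambic_duration_ratio": 0.85, "musician_accuracy": 0.84},
    {"iambic_duration_ratio": 0.95, "musician_accuracy": 0.79},
]
r = cue_accuracy_correlation(
    example_trials, "iambic_duration_ratio", "musician_accuracy"
)
print(round(r, 2))
```

In this made-up example the correlation is negative: for iambic stimuli, a smaller first-to-second durational ratio means a more strongly marked second syllable, so accuracy rises as the ratio falls.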

Of interest to the study was whether the acoustic parameters (1–6) correlated with the

behavioural accuracies among the musicians and non-musicians (see Table 5). The mean

accuracies of the musicians correlated significantly with the f0 (r = .24, p < .05), durational (r =

-.38, p < .01), and intensity (r = .29, p < .05) ratios of the iambic stress stimuli. For the trochaic

stress stimuli, the mean accuracies of musicians correlated significantly with the durational ratio

(r = -.27, p < .05), but not with the f0 (p = .459) and intensity (r = .23, p = .062) ratios.

The mean accuracies of the non-musicians correlated significantly with the f0 (r = .46, p

< .01), durational (r = -.42, p < .01), and intensity (r = .28, p < .05) ratios of the iambic stress

stimuli. For the trochaic stress stimuli, the mean accuracies of non-musicians correlated

significantly with the durational ratio (r = -.29, p < .05), but not with the f0 (p = .361) and

intensity (r = .23, p = .055) ratios.

Taken together, both groups’ discriminatory abilities were related to (a) the degree of f0,

durational, and intensity variations among the iambic stress stimuli, and (b) the degree of

durational variations among the trochaic stress stimuli. For the trochaic stress stimuli, the

musicians and non-musicians attended mostly to duration. However, for the iambic stress

stimuli, the musicians attended mostly to duration whereas the non-musicians attended mostly to

f0 (see Table 5).
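
The inference about which cue each group "attended mostly to" rests on comparing the absolute size of the correlation coefficients. The sketch below reproduces that comparison from the coefficients reported in the text. The f0 coefficients for trochaic stimuli are omitted because only their p values are given. Comparing magnitudes in this way is descriptive; it does not test whether two coefficients differ significantly. The variable and function names are hypothetical.

```python
# Correlation coefficients as reported in the text above.
REPORTED_R = {
    ("musicians", "iambic"): {"f0": 0.24, "duration": -0.38, "intensity": 0.29},
    ("non-musicians", "iambic"): {"f0": 0.46, "duration": -0.42, "intensity": 0.28},
    # Trochaic f0 coefficients are not reported in the text, so they are left out.
    ("musicians", "trochaic"): {"duration": -0.27, "intensity": 0.23},
    ("non-musicians", "trochaic"): {"duration": -0.29, "intensity": 0.23},
}


def dominant_cue(coefficients):
    """Return the cue with the largest absolute correlation."""
    return max(coefficients, key=lambda cue: abs(coefficients[cue]))


for (group, pattern), coefficients in REPORTED_R.items():
    print(group, pattern, dominant_cue(coefficients))
```

Run on the reported values, the ranking gives duration for both groups on trochaic stimuli, duration for musicians on iambic stimuli, and f0 for non-musicians on iambic stimuli, in line with the summary above.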

Discussion

This study endeavoured to investigate (1) whether English musicians exhibit a perceptual

advantage in English stress discrimination, (2) whether the musical advantage is selective about

stress patterns, and (3) the means by which musical experience facilitates English stress

discrimination.

The core result was the presence of a musical advantage in English stress discrimination

among the English listeners. This fits the OPERA hypothesis well. In terms of precision, music

entails more precise metrical processing than speech. Provided that the other four conditions

(Overlap, Emotion, Repetition, and Attention) were met, musical experience enhanced the

English musicians’ sensitivity to English stress. As mentioned above, the OPERA hypothesis has

been widely applied to account for a musical advantage in non-native speech perception (e.g.,

Alexander et al., 2005; Choi, 2020; Kolinsky et al., 2009; Zheng & Samuel, 2018). Consistent

with the previous studies on native consonantal and metrical discrimination, the present result

supports the theoretical view that music-to-language transfer could also occur given relevant

linguistic experience (Marie et al., 2010; Parbery-Clark et al., 2012). Thus, it stands to reason

that the OPERA hypothesis applies to both non-native and native listeners. From a practical

perspective, this theoretical view points towards the potential use of music training to aid native

speech perception (e.g., Moreno et al., 2009; Vidal, Lousada, & Vigário, 2020). For example,

piano training enhanced Mandarin children’s behavioural sensitivities to Mandarin vowels and

neural sensitivities to Mandarin tones (Nan et al., 2018). For English children, English stress

sensitivity is essential for literacy development and poor readers often show deficits in stress

sensitivity (Holliman et al., 2010; 2012). Thus, with the musical advantage in English stress

perception now established, it is important to determine whether music training can improve

English children’s stress perception. Interestingly, English stress sensitivity also contributes to

second language English literacy development among Cantonese children (Choi, Tong, & Cain,

2016; Choi, Tong, & Deacon, 2018). As such, it is also worthwhile to investigate whether

Cantonese musicians show a perceptual advantage in English stress perception.

Remarkably, the musical advantage identified herein was highly consistent across stress

patterns. It was originally hypothesised that musical experience would exert differential effects on

trochaic and iambic stress perception. However, the results clearly showed that musical

experience did not particularly favour either stress pattern. This is in contrast to the recent

finding of a study on English listeners that musical advantage was selective about Cantonese

tones (Choi, 2020). These discrepancies may stem from acoustic or even linguistic differences.

Although Cantonese tones and English stress share f0 as a common acoustic cue, Cantonese

tones are signalled by f0 in a more fine-grained manner (Choi et al., 2019). Whereas Cantonese

tones are largely f0 variations, English stress has other acoustic cues, such as duration and

intensity (Choi, Tong, Gu, Tong, & Wong, 2017; Gandour, 1981; Ladd, 2008; Wang, 2008).

Conceivably, the differences in terms of the selectivity of musical advantage across Cantonese

tone and English stress perception might be due to the acoustic differences between the two

features. Hypothetically, the discrepancies might also arise from linguistic experience: Cantonese

tones were non-native to the musicians but English stress was native to them. As unlikely as this

may seem, future studies that include non-native listeners of English stress are needed to test

this hypothetical account.

Acoustically, the musicians appear to have adopted a different perceptual strategy for

iambic stress. Although the musicians and non-musicians attended to the same acoustic cues,

they relied on these acoustic cues differently. Specifically, non-musicians attended most heavily

to f0 whereas musicians attended most heavily to duration. This is somewhat reminiscent of a

recent finding that, unlike non-musicians who attended to a less effective cue (f0 onset),

musicians attended to a more effective cue (f0 contour) for high-level tone perception (Choi,

2020). Indeed, for the iambic stress stimuli, stressed and unstressed syllables did not differ

significantly in f0, making it a less effective cue than duration and intensity. Considering the

OPERA hypothesis in the current context, it is possible that musical experience had orientated

the musicians to attend more heavily to more effective cues (duration and intensity), thereby

facilitating iambic stress perception. Future studies can further validate these findings by testing

stress perception across different acoustic conditions, e.g., f0-only, duration-only, and intensity-

only (Choi et al., 2019).

By contrast, the musicians and non-musicians adopted the same perceptual strategy for

trochaic stress. Specifically, the acoustic–behavioural correlation analysis of trochaic stress

implied that the two groups attended mainly, if not only, to durational cues.

Neurophysiologically, changes in speech temporal structure elicited the P200 response among

musicians but not non-musicians (Marie et al., 2010). This suggests that musicians have stronger

automatic detection of syllable temporal structure than non-musicians. Based on the literature, it

is believed that long-term musical experience in discerning metrical structures sharpened the

English musicians’ sensitivity to duration (e.g., Skoe & Kraus, 2013). Thus, for trochaic stress,

one plausible explanation for the musical advantage is enhanced sensitivity to duration. This

proposed mechanism is also consistent with the OPERA hypothesis (Patel, 2011; 2014).

Collectively, musical advantage may stem from the differential use of acoustic cues (iambic

stress) and enhanced durational sensitivity (trochaic stress).

In terms of the study’s theoretical contribution, the current findings have several

implications for the OPERA hypothesis. Most importantly, the musical advantage in native stress

discrimination converges with studies on native consonantal and metrical discrimination (Marie

et al., 2010; Parbery-Clark et al., 2012). Taken together, these findings suggest that the OPERA

hypothesis also applies to native speech perception. In contrast to a previous finding on the

selectivity of musical advantage, the musical advantage identified herein was highly consistent

(Choi, 2020). Crucially, the current study further adds that musical advantage is not necessarily

selective, and the OPERA hypothesis can be revised to account for this. Although the OPERA

hypothesis argues that musical experience increases neuronal sensitivity to speech, the present

and previous studies further suggest that musical experience may also alter listeners’ perceptual

strategy (Choi, 2020; see Patel, 2011). This points to a need for the OPERA hypothesis to

incorporate new elements on how musical experience orients musicians to different acoustic

cues.

In terms of the methodological contribution, this study has demonstrated that stress is a

potent feature for investigating cross-domain transfer. As mentioned in the Introduction, most

studies on cross-domain transfer have focused on lexical tone perception, presumably because it

shares an acoustic cue (f0) with musical pitch (e.g., Choi, 2020; Cooper & Wang, 2012;

Marie et al., 2011; Zheng & Samuel, 2018). Crucially, stress has three acoustic correlates – f0,

duration, and intensity – that are used intensively for discerning metrical structures in music

(Henrich et al., 2014; Palmer & Kelly, 1992; Patel, 2003; Lerdahl, 2001). Indeed, the current

study has identified linkages between stress discrimination and musical experience, highlighting

a more potent candidate for studying music-to-language transfer and even language-to-music

transfer. In the latter direction, one interesting question is whether English non-musicians discern

metrical structures more accurately than do French non-musicians, given the presence of stress in

English.

Readers are cautioned that this study is correlational. Like the preponderance of studies

on music-to-language transfer, this study cannot rule out the possibility of gene–environment

interaction (e.g., Alexander et al., 2005; Choi, 2020; Cooper & Wang, 2012; Kolinsky et al.,

2009; Marie et al., 2011; Zheng & Samuel, 2018). Schellenberg (2015) argued that cognitive

abilities and socioeconomic status determine the likelihood of a child receiving music training.

More specifically, Corrigall and colleagues (2013) reasoned that high-functioning children from

high socioeconomic status families were more likely to take music lessons than other children.

As such, music training might only exaggerate pre-existing differences between musicians and

non-musicians. This is contrary to a widely adopted premise that musicians and non-musicians

do not differ systematically prior to musical experience (e.g., Francois & Schön, 2011; Fitzroy &

Sanders, 2013; Shook, Marian, Bartolotti, & Schroeder, 2013). Despite the English musicians

and non-musicians matching on non-verbal intelligence and short-term memory, they may still

have differed in other respects, such as learning motivation and personality traits, some of which

are difficult to control for. Intriguingly, there are numerous reports that musicians exhibit a

memory advantage (George & Coch, 2011; Roden, Grube, Bongard, & Kreutz, 2014; Schulze,

Dowling, & Tillmann, 2012). It might be that the English musicians in the current sample did not

possess this advantage; it is also possible that their cognitive difference was not captured by

the tasks. Ideally, each cognitive construct should have been measured with multiple tasks.

In conclusion, the present study has identified a musical advantage in native stress

discrimination. This finding adds to the body of evidence that musical experience facilitates

native speech perception, in turn suggesting that the OPERA hypothesis also applies to native

listeners (Marie et al., 2010; Parbery-Clark et al., 2012). The musical advantage identified herein

was highly consistent, suggesting that musical advantage is not necessarily selective. The present

results also imply that part of the musical advantage might arise from the differential use of

acoustic cues by the musicians. Despite the standard caveats of correlational studies, the current

study presents theoretically and practically significant findings that I believe will withstand

scrutiny by future intervention studies.

Acknowledgement

I wish to thank Mairéad MacSweeney for her dedicated support. I also thank Arthur

Samuel for recording the stimuli and sharing the control tasks. This research was supported by

the Croucher Postdoctoral Fellowship from the Croucher Foundation to William Choi. It was

also supported by the Start-up Research Fund from The University of Hong Kong to William

Choi.

References

Aguilera, M., El Yagoubi, R., Espesser, R., & Astésano, C. (2014). Event-related potential

investigation of initial accent processing in French. Proceedings of Speech Prosody, 383–

387.

Alexander, J., Wong, P. C. M., & Bradlow, A. R. (2005). Lexical tone perception in

musicians and nonmusicians. Paper presented at INTERSPEECH 2005 – Eurospeech,

9th European Conference on Speech Communication and Technology, Lisbon,

Portugal, September 4–8, 2005.

Asaridou, S. S., Hagoort, P., & McQueen, J. M. (2015). Effects of early bilingual experience

with a tone and a non-tone language on speech–music integration. PLoS ONE, 10(12),

e0144225.

Bidelman, G. M., Hutka, S., & Moreno, S. (2013). Tone language speakers and

musicians shared enhanced perceptual and cognitive abilities for musical pitch: Evidence

for bidirectionality between the domains of language and music. PLoS ONE, 8(4),

e60676.

Choi, W. (2020). The selectivity of musical advantage: Musicians exhibit perceptual

advantage for some but not all Cantonese tones. Music Perception, 37(5), 423–434.

Choi, W. (2021a). Cantonese advantage on English stress perception: Constraints and neural

underpinnings. Neuropsychologia, 158, 107888.

Choi, W. (2021b). Musicianship influences language effect on musical pitch perception.

Frontiers in Psychology, 12, 712753.

Choi, W., Tong, X., & Cain, K. (2016). Lexical prosody beyond first-language boundary:

Chinese lexical tone sensitivity predicts English reading comprehension. Journal of

Experimental Child Psychology, 148, 70–86.

Choi, W., Tong, X., & Deacon, H. (2017). Double dissociations in reading comprehension

difficulties among Chinese–English bilinguals and their association with tone awareness.

Journal of Research in Reading, 40(2), 184–198.

Choi, W., Tong, X., Gu, F., Tong, X., & Wong, L. (2017). On the early neural perceptual

integrality of tones and vowels. Journal of Neurolinguistics, 41, 11–23.

Choi, W., Tong, X., & Samuel, A. G. (2019). Better than native: Tone language

experience enhances English lexical stress discrimination in Cantonese–English

bilingual listeners. Cognition, 189, 188–192.

Choi, W., Tong, X., & Singh, L. (2017). From lexical tone to lexical stress: A cross-language

mediation model for Cantonese children learning English as a second language.

Frontiers in Psychology, 8, 492.

Chrabaszcz, A., Winn, M., Lin, C. Y., & Idsardi, W. J. (2014). Acoustic cues to

perception of word stress by English, Mandarin, and Russian speakers. Journal of

Speech, Language and Hearing Research, 57, 1468–1479.

Cooper, A., & Wang, Y. (2012). The influence of linguistic and musical experience on

Cantonese word learning. Journal of the Acoustical Society of America, 131(6), 4756–

4768.

Corrigall, K. A., Schellenberg, E. G., & Misura, N. M. (2013). Music training, cognition, and

personality. Frontiers in Psychology, 4, 222.

Cutler, A. (2014). Native Listening: Language Experience and the Recognition of Spoken

Words. Cambridge MA: MIT Press.

Dupoux, E., Pallier, C., Sebastian, N., & Mehler, J. (1997). A distressing “deafness” in French?

Journal of Memory and Language, 36, 406–421.

Dupoux, E., Peperkamp, S., & Sebastian-Galles, N. (2010). Limits on bilingualism

revisited: Stress ‘deafness’ in simultaneous French–Spanish bilinguals. Cognition, 114,

266–275.

Fitzroy, A. B., & Sanders, L. D. (2013). Musical expertise modulates early processing of

syntactic violations in language. Frontiers in Psychology, 3, e603.

Francois, C., & Schön, D. (2011). Musical expertise boosts implicit learning of both musical and

linguistic structures. Cerebral Cortex, 21(10), 2357–2365.

Fry, D. B. (1958). Experiments in the perception of stress. Language and Speech, 1,

205–213.

Gandour, J. (1981). Perceptual dimensions of tone: Evidence from Cantonese. Journal of

Chinese Linguistics, 9, 20–36.

Garde, O. (1968). L’accent. Paris: Presses Universitaires de France.

George, E. M., & Coch, D. (2011). Music training and working memory: An ERP study.

Neuropsychologia, 49(5), 1083–1094.

Henrich, K., Alter, K., Wiese, R., & Domahs, U. (2014). The relevance of rhythmical alternation

in language processing: An ERP study on English compounds. Brain and Language, 136,

19–30.

Holliman, A. J., Wood, C., & Sheehy, K. (2010). The contribution of sensitivity to speech

rhythm and non-speech rhythm to early reading development. Educational Psychology,

30(3), 247–267.

Holliman, A. J., Wood, C., & Sheehy, K. (2012). A cross-sectional study of prosodic sensitivity

and reading difficulties. Journal of Research in Reading, 35(1), 32–48.

Hutka, S., Bidelman, G. M., & Moreno, S. (2015). Pitch expertise is not created equal: Cross-

domain effects of musicianship and tone language experience on neural and behavioural

discrimination of speech and music. Neuropsychologia, 71, 52–63.

Jusczyk, P. W., Cutler, A., & Redanz, N. J. (1993). Infants’ preference for the predominant stress

patterns of English words. Child Development, 64(3), 675–687.

Kolinsky, R., Cuvelier, H., Goetry, V., Peretz, I., & Morais, J. (2009). Music training facilitates

lexical stress processing. Music Perception, 26(3), 235–246.

Kraus, N., & Chandrasekaran, B. (2010). Music training for developmental auditory skills.

Nature Reviews Neuroscience, 11(8), 599–605.

Ladd, D. R. (2008). Intonational Phonology. Cambridge: Cambridge University Press.

Lerdahl, F. (2001). Tonal Pitch Space. Oxford and New York: Oxford University Press.

Marie, C., Delogu, F., Lampis, G., Belardinelli, M. O., & Besson, M. (2011). Influence of

musical expertise on segmental and tonal processing in Mandarin Chinese. Journal of

Cognitive Neuroscience, 23(10), 2701–2715.

Marie, C., Magne, C., & Besson, M. (2010). Musicians and the metric structure of words.

Journal of Cognitive Neuroscience, 23(2), 294–305.

Moreno, S., Marques, C., Santos, A., Santos, M., Castro, S. L., & Besson, M. (2009). Musical

training influences linguistic abilities in 8-year-old children: More evidence for brain

plasticity. Cerebral Cortex, 19(3), 712–723.

Nan, Y., Liu, L., Geiser, E., Shu, H., Gong, C. C., Dong, Q., Gabrieli, J. D. E., & Desimone, R.

(2018). Piano training enhances the neural processing of pitch and improves speech

perception in Mandarin-speaking children. Proceedings of the National Academy of

Sciences, 115(28), 6630–6639.

Palmer, C., & Kelly, M. H. (1992). Linguistic prosody and musical meter in song. Journal of

Memory and Language, 31(4), 525–542.

Parbery-Clark, A., Tierney, A., Strait, D. L., & Kraus, N. (2012). Musicians have fine-tuned

neural distinction of speech syllables. Neuroscience, 219, 111–119.

Patel, A. D. (2003). Language, music, syntax and the brain. Nature Neuroscience, 6(7), 674–681.

Patel, A. D. (2011). Why would musical training benefit the neural encoding of

speech? The OPERA hypothesis. Frontiers in Psychology, 2, 142.

Patel, A. D. (2014). Can nonlinguistic musical training change the way the brain processes

speech? The expanded OPERA hypothesis. Hearing Research, 308, 98–108.

Patel, A. D., & Daniele, J. R. (2003a). An empirical comparison of rhythm in language and

music. Cognition, 87, B35–B45.

Patel, A. D., & Daniele, J. R. (2003b). Stress-timed vs. syllable-timed music? A comment on

Huron and Ollen (2003). Music Perception, 21, 273–276.

Roden, I., Grube, D., Bongard, S., & Kreutz, G. (2014). Does music training enhance working

memory performance? Findings from a quasi-experimental longitudinal study.

Psychology of Music, 42(2), 284–298.

Schellenberg, E. G. (2015). Music training and speech perception: A gene-environment

interaction. Annals of the New York Academy of Sciences, 1337, 170–177.

Schulze, K., Dowling, W. J., & Tillmann, B. (2012). Working memory for tonal and atonal

sequences during a forward and a backward recognition task. Music Perception, 29(3),

255–267.

Shook, A., Marian, V., Bartolotti, J., & Schroeder, S. R. (2013). Musical experience influences

statistical learning of a novel language. American Journal of Psychology, 126(1), 95–104.

Skoe, E., & Kraus, N. (2013). Musical training heightens auditory brainstem function during

sensitive periods in development. Frontiers in Psychology, 4, e622.

Teschner, R. V., & Whitley, S. M. (2004). Pronouncing English: A Stress-Based

Approach with CD-ROM. Washington DC: Georgetown University Press.

Tong, X., Choi, W., & Man, Y. Y. (2018). Tone language experience modulates the

effect of long-term musical training on musical pitch perception. Journal of the

Acoustical Society of America, 144(2), 690–697.

Toussaint, G. T. (2005). The geometry of musical rhythm. In M. Kano & X. Tan (Eds.),

Proceedings of the Japan Conference on Discrete and Computational Geometry, 3742,

198–212.

Vidal, M. M., Lousada, M., & Vigário, M. (2020). Music effects on phonological awareness

development in 3-year-old children. Applied Psycholinguistics, 41(2), 299–318.

Wang, Q. (2008). Perception of English stress by Mandarin Chinese learners of

English: An acoustic study (Unpublished doctoral dissertation). British Columbia:

University of Victoria.

Yu, V. Y., & Andruski, J. E. (2010). A cross-language study of perception of lexical

stress in English. Journal of Psycholinguistic Research, 39, 323–344.

Zheng, Y., & Samuel, A. G. (2018). The effects of ethnicity, musicianship, and tone

language experience on pitch perception. Quarterly Journal of Experimental

Psychology, 71(12), 2627–2642.

Table 1. Musical experience of the musicians

Participant

Onset age

(years)

Amount of music

training (years)

First

instrument

Second

instrument

Third

instrument

Piano

Oboe

Piano

Guitar

Keyboard

Piano

Guitar

Bass

Drums

Guitar

Piano

Bass

Piano

Flute

M10

Flute

M11

Piano

Ukulele

M12

Piano

Violin

M13

Clarinet

M14

Piano

M15

Piano

M16

Clarinet

M17

Piano

Guitar

Ukulele

M18

Violin

M19

Piano

Table 2. Comparison of age, non-verbal intelligence, and short-term memory between English musicians and non-musicians.

| Variable | Musicians | Non-musicians | Group difference (p value) |
|---|---|---|---|
| Chronological age in years (SD) | 26.63 (5.89) | 32.67 (11.60) | .052 |
| Non-verbal intelligence (SD) | 9.68 (2.21) | 8.83 (3.29) | .360 |
| Short-term memory (SD) | 7.84 (2.39) | 7.83 (3.76) | .993 |

Note. The maximum possible value of non-verbal intelligence is 14. There are no maximum possible values for age and short-term memory.

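Table 3. Fundamental frequency, duration, and intensity values of the stimuli.

| Speaker | Stimulus | First syllable F0 (Hz) | First syllable duration (ms) | First syllable intensity (dB) | Second syllable F0 (Hz) | Second syllable duration (ms) | Second syllable intensity (dB) |
|---|---|---|---|---|---|---|---|
| Male | ˈpɚmɪt | 168 | 155 | | 188 | 445 | |
| Male | pɚˈmɪt | 124 | 151 | | 127 | 449 | |
| Male | ˈsəspekt | 218 | 305 | | 483 | 295 | |
| Male | səsˈpekt | 127 | 300 | | 157 | 300 | |
| Male | ˈɪnsɚt | 168 | 124 | | 160 | 476 | |
| Male | ɪnˈsɚt | 117 | 139 | | 158 | 461 | |
| Male | ˈimpɔrt | 163 | 177 | | 103 | 423 | |
| Male | imˈpɔrt | 110 | 224 | | 123 | 376 | |
| Female | ˈpɚmɪt | 119 | 237 | | 209 | 363 | |
| Female | pɚˈmɪt | 228 | 209 | | 261 | 391 | |
| Female | ˈsəspekt | 130 | 332 | | 269 | 268 | |
| Female | səsˈpekt | 219 | 315 | | 245 | 285 | |
| Female | ˈɪnsɚt | 240 | 239 | | 209 | 361 | |
| Female | ɪnˈsɚt | 218 | 204 | | 311 | 396 | |
| Female | ˈimpɔrt | 262 | 290 | | 205 | 310 | |
| Female | imˈpɔrt | 222 | 250 | | 239 | 350 | |

Note. All values are rounded to the nearest integer.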

Table 4. Fundamental frequency, durational, and intensity ratios of the first to second syllables of the stimuli.

| Stimulus | F0 ratio (male) | Duration ratio (male) | Intensity ratio (male) | F0 ratio (female) | Duration ratio (female) | Intensity ratio (female) |
|---|---|---|---|---|---|---|
| ˈpɚmɪt | 0.89 | 0.35 | 1.21 | 0.57 | 0.65 | 1.19 |
| pɚˈmɪt | 0.97 | 0.34 | 1.09 | 0.87 | 0.53 | 1.08 |
| ˈsəspekt | 0.45 | 1.03 | 1.17 | 0.48 | 1.24 | 1.10 |
| səsˈpekt | 0.81 | 1.00 | 1.03 | 0.89 | 1.11 | 0.89 |
| ˈɪnsɚt | 1.05 | 0.26 | 1.12 | 1.15 | 0.66 | 1.17 |
| ɪnˈsɚt | 0.74 | 0.30 | 1.01 | 0.70 | 0.52 | 0.97 |
| ˈimpɔrt | 1.58 | 0.42 | 0.90 | 1.28 | 0.94 | 1.09 |
| imˈpɔrt | 0.89 | 0.60 | 0.86 | 0.93 | 0.71 | 0.93 |

Note. All values are rounded to two decimal places.
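The first-to-second syllable ratios in Table 4 can be recomputed from the F0 and duration values in Table 3. The following is a minimal sketch of that arithmetic for four of the male-produced stimuli; the function and variable names are illustrative and do not come from the study. Because Table 3 reports values rounded to the nearest integer, a recomputed ratio can differ from the tabulated one in the second decimal place, so the sketch checks agreement within a tolerance of 0.01 rather than exact equality.

```python
# Male-produced stimuli from Table 3, as
# (F0 syllable 1 in Hz, duration syllable 1 in ms,
#  F0 syllable 2 in Hz, duration syllable 2 in ms).
TABLE_3_MALE = {
    "ˈpɚmɪt": (168, 155, 188, 445),
    "pɚˈmɪt": (124, 151, 127, 449),
    "ˈsəspekt": (218, 305, 483, 295),
    "səsˈpekt": (127, 300, 157, 300),
}

# Corresponding (F0 ratio, duration ratio) entries from Table 4.
TABLE_4_MALE = {
    "ˈpɚmɪt": (0.89, 0.35),
    "pɚˈmɪt": (0.97, 0.34),
    "ˈsəspekt": (0.45, 1.03),
    "səsˈpekt": (0.81, 1.00),
}


def first_to_second_ratios(f0_1, dur_1, f0_2, dur_2):
    """Return (F0 ratio, duration ratio) of the first to the second syllable."""
    return f0_1 / f0_2, dur_1 / dur_2


def max_discrepancy():
    """Largest absolute gap between recomputed and tabulated ratios."""
    gaps = []
    for word, values in TABLE_3_MALE.items():
        f0_ratio, dur_ratio = first_to_second_ratios(*values)
        gaps.append(abs(f0_ratio - TABLE_4_MALE[word][0]))
        gaps.append(abs(dur_ratio - TABLE_4_MALE[word][1]))
    return max(gaps)


if __name__ == "__main__":
    for word, values in TABLE_3_MALE.items():
        f0_ratio, dur_ratio = first_to_second_ratios(*values)
        print(f"{word}: F0 ratio {f0_ratio:.2f}, duration ratio {dur_ratio:.2f}")
    print(f"largest gap from Table 4: {max_discrepancy():.4f}")
```

A ratio below 1 means the second syllable has the higher F0 or the longer duration, and a ratio above 1 means the first syllable does.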

Table 5. Correlations between the F0, duration, and intensity ratios and the trial-specific mean accuracies of musicians and non-musicians.

| Stimuli | Ratio | Musicians’ accuracy | Non-musicians’ accuracy |
|---|---|---|---|
| Iambic stress stimuli | F0 ratio | .24* | .46** |
| Iambic stress stimuli | Duration ratio | -.38** | -.42** |
| Iambic stress stimuli | Intensity ratio | .29* | .28* |
| Trochaic stress stimuli | F0 ratio | | |
| Trochaic stress stimuli | Duration ratio | -.27* | -.29* |
| Trochaic stress stimuli | Intensity ratio | .23† | .23‡ |

Note. All values are rounded to two decimal places. ** p < .01; * p < .05; † p = .062; ‡ p = .055.

Figure 1. The hit and false alarm rates of musicians and non-musicians in the English stress

discrimination task. Error bars represent 95% confidence intervals.

Figure 2. The mean sensitivity index of musicians and non-musicians in the English stress

discrimination task. Error bars represent 95% confidence intervals.
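For readers unfamiliar with the sensitivity index plotted in Figure 2, the standard signal-detection definition is d' = z(hit rate) − z(false alarm rate), where z is the inverse of the standard normal cumulative distribution function. The sketch below illustrates that textbook formula only. The log-linear adjustment for rates of exactly 0 or 1 is an assumption added here for illustration, and the trial counts in the usage example are hypothetical rather than data from the study.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity index: z(hit rate) minus z(false alarm rate)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


def adjusted_rate(count: int, n_trials: int) -> float:
    """Log-linear adjustment (an assumption, not the study's stated method).

    Keeps the rate strictly between 0 and 1 so that z() stays finite.
    """
    return (count + 0.5) / (n_trials + 1)


if __name__ == "__main__":
    # Hypothetical listener: 40 hits on 48 "different" trials and
    # 6 false alarms on 48 "same" trials.
    hits = adjusted_rate(40, 48)
    false_alarms = adjusted_rate(6, 48)
    print(f"d' = {d_prime(hits, false_alarms):.2f}")
```

A listener who responds "different" equally often on same and different trials obtains d' = 0, and larger values indicate better discrimination independent of response bias.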

Figure 3. The mean hit rate and response time of musicians and non-musicians for trochaic and

iambic stress. Error bars represent 95% confidence intervals.

Appendix I

Analysis of Age and Cognitive Profiles

Correlational analyses showed significant correlations between age and non-verbal

intelligence, r = -.39, p < .05, age and short-term memory, r = -.43, p < .01, and non-verbal

intelligence and short-term memory, r = .46, p < .01. Thus, multivariate analysis of variance

(MANOVA) was conducted to examine potential group differences. The MANOVA showed a

non-significant main effect of group, p = .181, implying that the groups were matched on these variables.

To be empirically stringent, independent-samples t-tests were also conducted, as MANOVA has

weak power for detecting differences. Consistent with the MANOVA, the two groups did not differ

significantly in non-verbal intelligence, t(35) = .93, p = .360, or short-term memory, t(35)

= .01, p = .993. However, the non-musicians were marginally older than the musicians,

t(35) = -2.01, p = .052, d = .66. As the perceptual differences between the musicians and non-musicians

were only meaningful if they remained evident when age and general cognitive abilities were

held constant, these three variables were controlled in the main analysis.
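The t and d values reported above can be checked against the means and standard deviations in Table 2. The sketch below uses the pooled-variance (Student) formulas, which is an assumption about how the tests were run. The group sizes are also inferred rather than stated in this appendix: 19 musicians from the participant labels in Table 1, and 18 non-musicians from the 35 degrees of freedom. Function names are illustrative.

```python
from math import sqrt

# Assumed group sizes: 19 musicians (Table 1 lists M1 to M19) and
# 18 non-musicians (implied by df = 19 + 18 - 2 = 35).
N_MUSICIANS = 19
N_NON_MUSICIANS = 18


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def t_and_d(mean1, sd1, n1, mean2, sd2, n2):
    """Return (Student's t, absolute Cohen's d) from summary statistics."""
    sp = pooled_sd(sd1, n1, sd2, n2)
    t = (mean1 - mean2) / (sp * sqrt(1 / n1 + 1 / n2))
    d = abs(mean1 - mean2) / sp
    return t, d


# Means and SDs from Table 2 (musicians first, then non-musicians).
t_age, d_age = t_and_d(26.63, 5.89, N_MUSICIANS, 32.67, 11.60, N_NON_MUSICIANS)
t_iq, _ = t_and_d(9.68, 2.21, N_MUSICIANS, 8.83, 3.29, N_NON_MUSICIANS)
t_stm, _ = t_and_d(7.84, 2.39, N_MUSICIANS, 7.83, 3.76, N_NON_MUSICIANS)

if __name__ == "__main__":
    print(f"age: t = {t_age:.2f}, d = {d_age:.2f}")
    print(f"non-verbal intelligence: t = {t_iq:.2f}")
    print(f"short-term memory: t = {t_stm:.2f}")
```

Under these assumptions the sketch reproduces the reported t(35) = -2.01 and d = .66 for age, and t(35) = .93 and .01 for non-verbal intelligence and short-term memory.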

Appendix II

Analysis of Response Time Across All Trials

Figure S1 shows the mean response time (collapsed across 96 trials) of musicians and

non-musicians. To examine whether musicians and non-musicians differed in mean response

time across all 96 trials, a one-way ANCOVA was conducted on mean response time with group

(musicians and non-musicians) as the independent variable. Age, non-verbal intelligence, and

short-term memory were controlled. ANCOVA showed no significant group difference in mean

response time, p = .837.

Figure S1. The mean response time (collapsed across 96 trials) of musicians and nonmusicians in

the English stress discrimination task. Error bars represent 95% confidence intervals.

Appendix III

Acoustic Analysis of Gender Differences

As f0, duration, and intensity did not correlate with each other, ps > .05, a separate

three-way ANOVA was conducted on each acoustic parameter, with stress pattern (iambic and

trochaic), syllable status (stressed and unstressed), and gender (male and female) as the

independent variables.

For f0, three-way ANOVA revealed a significant main effect of gender, F(1, 24) = 6.98,

p < .05, η2 = .23, but not syllable status, p = .922, and stress pattern, p = .411. The interaction

between stress pattern and gender was significant, F(1, 24) = 4.27, p = .05, η2 = .15, but not the

interactions between stress pattern and syllable status, p = .163, and between gender and syllable

status, p = .411. The three-way interaction was also non-significant, p = .687. For the interaction

between stress pattern and gender, pairwise comparisons showed that f0 differed marginally

between trochaics and iambics for the male-produced stimuli, p = .051, but not for the female-produced stimuli, p = .393.

For duration, three-way ANOVA showed non-significant main effects of gender, p =

1.00, stress pattern, p = 1.00, and syllable status, p = .714. The interaction between gender and

syllable status was also non-significant, p = .430. However, the interaction between stress pattern

and syllable status was significant, F(1, 24) = 43.03, p < .001, η2 = .64, as was the three-way

interaction between gender, stress pattern, and syllable status, F(1, 24) = 7.15, p < .05, η2 = .23.

Simple main effect analysis was conducted to unpack the three-way interaction. For male-

produced iambics and trochaics, stressed and unstressed syllables differed significantly in

duration, ps < .01. For female-produced iambics, stressed and unstressed syllables also differed

significantly in duration, p < .01. However, for female-produced trochaics, stressed and unstressed

syllables did not differ significantly in duration, p = .089.

For intensity, three-way ANOVA revealed a significant main effect of syllable status,

F(1, 24) = 4.44, p < .05, η2 = .16, in which stressed syllables had higher intensity than unstressed

syllables. All other main effects and interactions were not significant, ps > .05.
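The appendix does not state whether η2 denotes eta squared or partial eta squared. As a rough check, the sketch below converts each reported F(1, 24) to partial eta squared with the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). Reading the reported values as partial eta squared is an inference from this arithmetic, not something the text confirms, and the function name is illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (effect, reported F with df = 1 and 24, reported eta squared) from Appendix III.
REPORTED_EFFECTS = [
    ("f0: gender", 6.98, 0.23),
    ("f0: stress pattern x gender", 4.27, 0.15),
    ("duration: stress pattern x syllable status", 43.03, 0.64),
    ("duration: three-way interaction", 7.15, 0.23),
    ("intensity: syllable status", 4.44, 0.16),
]

if __name__ == "__main__":
    for label, f_value, reported in REPORTED_EFFECTS:
        computed = partial_eta_squared(f_value, 1, 24)
        print(f"{label}: computed {computed:.2f}, reported {reported:.2f}")
```

For all five effects the computed value rounds to the reported one, which is consistent with the partial form.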


Musicianship Influences Language Effect on Musical Pitch Perception

Article

Full-text available

Oct 2021

William Choi

The selectivity of musical advantage: Musicians exhibit perceptual advantage for some but not all Cantonese tones

Article

Full-text available

Jun 2020
MUSIC PERCEPT

William Choi

The OPERA Hypothesis theorizes how musical experience heightens perceptual acuity to lexical tones. One missing element in the hypothesis is whether musical advantage is general to all or specific to some lexical tones. To further extend the hypothesis, this study investigated whether English musicians consistently outperformed English nonmusicians in perceiving a variety of Cantonese tones. In an AXB discrimination task, the musicians exhibited superior discriminatory performance over the nonmusicians only in the high level, high rising, and mid-level tone contexts. Similarly, in a Cantonese tone sequence recall task, the musicians significantly outperformed the nonmusicians only in the contour tone context but not in the level tone context. Collectively, the results reflect the selectivity of musical advantage: musical experience is only advantageous to the perception of some but not all Cantonese tones, and elements of selectivity can be introduced to the OPERA hypothesis. Methodologically, the findings highlight the need to include a wide variety of lexical tone contrasts when studying music-to-language transfer.

Music effects on phonological awareness development in 3-year-old children

Article

Full-text available

Mar 2020

Music and language engage similar processing mechanisms, including auditory processing and higher cognitive functions, recruiting partially overlapping brain structures. It has been argued that both are related in child development and that linguistic functions can be positively influenced by music training in children older than 4 years. In this randomized controlled study, with a test-training-retest methodology, 44 children (3-4 years old) were assessed with a phonological awareness test, before and after an intervention period of a school year with weekly music classes (experimental group, n = 23) or visual arts classes (control group, n = 21) in kindergarten. In the preassessment there were no significant differences between groups. When comparing pre- and postassessment, results showed significant differences in both groups, but the students in music classes outperformed the control group, showing larger differences between the beginning and the end of the intervention. Improvement in both groups is expected due to general developmental reasons. However, the fact that children receiving music classes show greater improvement indicates that music lessons have influenced phonological awareness. Our results support the hypothesis that music training may promote language abilities, specifically phonological awareness, prior to the ages previously studied. In recent years, the relation between music and language development has attracted great attention in various research fields, and in particular in linguistics. According to Gerry, Unrau, and Trainor (2012), active music participation seems to influence child development as early as 6 months of age, as participating in classes with appropriate pedagogical methodologies improves social and communicative development between infants and their parents. Later in development, music training and/or music abilities have been found to predict literacy outcomes (phonological awareness [word, syllabic] and phonemic awareness).

Better than native: Tone language experience enhances English lexical stress discrimination in Cantonese-English bilingual listeners

Article

Full-text available

Apr 2019
COGNITION

While many second language (L2) listeners are known to struggle when discriminating non-native features absent in their first language (L1), no study has reported that L2 listeners perform better than native listeners in this regard. The present study tested whether Cantonese-English bilinguals were better in discriminating English lexical stress in individual words or pseudowords than native English listeners, even though lexical stress is absent in Cantonese. In experiments manipulating acoustic, phonotactic, and lexical cues, Cantonese-English bilingual adults exhibited superior performance in discriminating English lexical stress than native English listeners across all phonotactic/lexical conditions when the fundamental frequency (f0) cue to lexical stress was present. The findings underscore the facilitative effect of Cantonese tone language experience on English lexical stress discrimination.

Piano training enhances the neural processing of pitch and improves speech perception in Mandarin-speaking children

Article

Full-text available

Jun 2018

Significance: Musical training is beneficial to speech processing, but this transfer’s underlying brain mechanisms are unclear. Using pseudorandomized group assignments with 74 4- to 5-year-old Mandarin-speaking children, we showed that, relative to an active control group which underwent reading training and a no-contact control group, piano training uniquely enhanced cortical responses to pitch changes in music and speech (as lexical tones). These neural enhancements further generalized to early literacy skills: Compared with the controls, the piano-training group also improved behaviorally in auditory word discrimination, which was correlated with their enhanced neural sensitivities to musical pitch changes. Piano training thus improves children’s common sound processing, facilitating certain aspects of language development as much as, if not more than, reading instruction.

Author accepted manuscript: The Effects of Ethnicity, Musicianship, and Tone Language Experience on Pitch Perception

Article

Full-text available

Jan 2018

Language and music are intertwined: music training can facilitate language abilities, and language experiences can also help with some music tasks. Possible language-music transfer effects are explored in two experiments in this study. In Experiment 1, we tested native Mandarin, Korean, and English speakers on a pitch discrimination task with two types of sounds: speech sounds and fundamental frequency (F0) patterns derived from speech sounds. To control for factors that might influence participants' performance, we included cognitive ability tasks testing memory and intelligence. In addition, two music skill tasks were used to examine general transfer effects from language to music. Prior studies showing that tone language speakers have an advantage on pitch tasks have been taken as support for three alternative hypotheses: specific transfer effects, general transfer effects, and an ethnicity effect. In Experiment 1, musicians outperformed non-musicians on both speech and F0 sounds, suggesting a music-to-language transfer effect. Korean and Mandarin speakers performed similarly, and they both outperformed English speakers, providing some evidence for an ethnicity effect. Alternatively, this could be due to population selection bias. In Experiment 2, we recruited Chinese Americans approximating the native English speakers' language background to further test the ethnicity effect. Chinese Americans, regardless of their tone language experiences, performed similarly to their non-Asian American counterparts in all tasks. Therefore, although this study provides additional evidence of transfer effects across music and language, it casts doubt on the contribution of ethnicity to differences observed in pitch perception and general music abilities.

From Lexical Tone to Lexical Stress: A Cross-Language Mediation Model for Cantonese Children Learning English as a Second Language

Article

Full-text available

Mar 2017

This study investigated how Cantonese lexical tone sensitivity contributed to English lexical stress sensitivity among Cantonese children who learned English as a second language (ESL). Five-hundred-and-sixteen second-to-third grade Cantonese ESL children were tested on their Cantonese lexical tone sensitivity, English lexical stress sensitivity, general auditory sensitivity, and working memory. Structural equation modeling revealed that Cantonese lexical tone sensitivity contributed to English lexical stress sensitivity both directly, and indirectly through the mediation of general auditory sensitivity, in which the direct pathway had a larger relative contribution to English lexical stress sensitivity than the indirect pathway. These results suggest that the tone-stress association might be accounted for by joint phonological and acoustic processes that underlie lexical tone and lexical stress perception.

Tone language experience modulates the effect of long-term musical training on musical pitch perception

Article

Aug 2018

Long-term musical training is widely reported to enhance music pitch perception. However, it remains unclear whether tone language experience influences the effect of long-term musical training on musical pitch perception. The present study addressed this question by testing 30 Cantonese and 30 non-tonal language speakers, each divided equally into musician and non-musician groups, on pitch height and pitch interval discrimination. Musicians outperformed non-musicians among non-tonal language speakers, but not among Cantonese speakers on the pitch height discrimination task. However, musicians outperformed non-musicians among Cantonese speakers, but not among non-tonal language speakers on the pitch interval discrimination task. These results suggest that the effect of long-term musical training on musical pitch perception is shaped by tone language experience and varies across different pitch perception tasks.

On the early neural perceptual integrality of tones and vowels

Article

Feb 2017
J NEUROLINGUIST

The current study adopted the MMN additivity approach to examine the pre-attentive perceptual integration of vowels and tones. Twenty Cantonese listeners participated in the ERP experiment. Using the passive oddball paradigm, we elicited tone-MMN, vowel-MMN and double-MMN in the speech condition; and fundamental frequency-MMN, formant frequency-MMN and double-MMN in the non-speech condition. In both conditions, the double-MMNs were significantly smaller in amplitude than the sum of single feature MMNs. Morphological comparisons showed no significant difference in the latency and topographic patterns between vowel-MMN and tone-MMN, and marginally significant differences between formant frequency-MMN and fundamental frequency-MMN. Collectively, results reflect the perceptual integration of tones and vowels at the phonological level, and partial integration of fundamental frequency and formant frequency at the auditory level.
