The McGurk Effect: Audio-Visual Speech Perception Illusion

Key Takeaways

The McGurk effect occurs when the speech sound a person hears conflicts with the lip movements they see, and the two are combined into the perception of a different sound. It is a perceptual illusion that arises from the interaction of hearing and vision.
Cognitive psychologists Harry McGurk and John MacDonald introduced the concept of the McGurk effect in 1976 after accidentally discovering the phenomenon during an experiment.
External factors such as visual distraction, tactile diversion, familiarity, and syllable structure can impact the McGurk effect.
Brain damage, Alzheimer’s disease, Specific Language Impairment, and aphasia are some internal factors influencing the phenomenon.
The magnitude of the McGurk effect may vary across languages; however, such variation vanishes when the audio stimuli become unintelligible.

The McGurk effect is a perceptual phenomenon that occurs when the movement of a speaker’s lips does not match the sound the listener hears, which alters what the listener perceives (Boersma, 2012; Nath & Beauchamp, 2012).

In other words, an illusion occurs in the interaction between vision and hearing in the perception of speech.

Here, the visual component of one sound is paired with the auditory component of another. This pairing induces the perception of a third sound.

While a combination of poor auditory stimuli and high-quality visual information is a common source of the McGurk effect, many factors influence the magnitude of this phenomenon.

Origin and History

The McGurk effect was first introduced in 1976 by the cognitive psychologists Harry McGurk and John MacDonald in a paper titled Hearing Lips and Seeing Voices (McGurk & MacDonald, 1976).

This effect was accidentally discovered while McGurk and his assistant, MacDonald, were conducting a study on the perception of language by infants at various developmental stages.

During the study, they would, in one location, play the video of a mother speaking and, in another, the sound of her voice.

In the experiment, they directed a technician to dub, with the auditory syllable “ba,” a videotape with a visual of “ga.” When the dubbed tape was played, MacDonald and McGurk perceived “da”—a phoneme distinct from the dubbed audio and the visual.

Both McGurk and MacDonald were confused. Eventually, however, they realized that the phenomenon stemmed not from a mistake by the technician but from an idiosyncrasy in human perception.

External Factors Impacting the McGurk Effect

Visual Distraction

A study of the role of visual attention in audiovisual speech perception observed the McGurk effect in two different situations (Tiippana, Andersen & Sams, 2004).

In the first instance, the listener’s attention was focused on the face of the speaker, whereas in the second instance, the listener ignored the face by paying attention to a leaf that moved across the talker’s face.

The results demonstrated that the McGurk effect was weaker in the latter scenario. This outcome was attributed to visual attention’s modulation of audiovisual speech perception.

This modulation could have occurred either at an early, unisensory stage of processing or through changes at the stage where visual and auditory information are integrated.

Tactile Diversion

Research has recently challenged the popular assumption that audiovisual pairing transpires in an attention-free mode (Alsius, Navarra & Soto-Faraco, 2007).

The study investigated whether audiovisual speech integration is hindered when attentional resources are drawn away by demands in the tactile domain, which is not directly associated with speech perception.

The McGurk effect was measured in a dual-task paradigm involving a demanding tactile task. The results indicated that the proportion of visually influenced responses to audiovisual stimuli decreased when attention was diverted to the tactile task.

This outcome was attributed to the modulatory impact on audiovisual binding of speech, which is mediated by the limitations of supramodal attention.

These findings provide a glimpse of the dynamism and the extensiveness of the interactions between crossmodal binding mechanisms and the attentional system.
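
As a rough illustration of the measure described above, the sketch below tallies the proportion of visually influenced responses in a single-task and a dual-task condition. The trial data, the response labels, and the function name are hypothetical and are not taken from Alsius, Navarra & Soto-Faraco (2007). A response is treated as visually influenced whenever it differs from the syllable that was actually played, which is a simplifying assumption.

```python
def visually_influenced_proportion(responses, auditory_syllable):
    """Return the fraction of responses that differ from the played syllable.

    Assumption: any response other than the auditory syllable counts as
    visually influenced (for example, a fused "da" or a visual "ga").
    """
    if not responses:
        raise ValueError("responses must not be empty")
    influenced = sum(1 for r in responses if r != auditory_syllable)
    return influenced / len(responses)


# Hypothetical trials: auditory "ba" dubbed onto a visual "ga".
single_task = ["da", "da", "ba", "da", "ga", "da", "da", "ba", "da", "da"]
dual_task = ["ba", "da", "ba", "ba", "da", "ba", "ga", "ba", "ba", "da"]

single = visually_influenced_proportion(single_task, "ba")
dual = visually_influenced_proportion(dual_task, "ba")
print(f"single task: {single:.2f}, dual task: {dual:.2f}")
```

With these made-up trials, the proportion is lower in the dual-task condition, which is the direction of the effect the study reported.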

Familiarity

An experiment examined the claims for the independence of facial speech processing and facial identity (Walker, Bruce & O’Malley, 1995).

In the study, the faces used to create the McGurk effect stimuli were manipulated so that they were alien to some participants but familiar to others. Moreover, the voices and the faces utilized were either congruent (belonged to the same individual) or incongruent (belonged to different individuals).

The distinct participant groups were compared to each other to gauge their susceptibility to the McGurk illusion. The results indicated that when the voices and the faces were incongruent, the participants who were acquainted with the faces were less susceptible than those who were unacquainted with the faces.

This outcome implies that facial speech and facial identity are far from independent and that those familiar with the speakers’ faces are less likely than those unfamiliar to be swayed by the McGurk effect.

Syllable Structure

One study conducted four experiments to determine whether visual information influences judgments of acoustically specified nonspeech events as well as speech events (Brancazio, Best & Fowler, 2006).

The study employed click sounds that are perceived as nonspeech by many English listeners but function as consonants in certain African languages.

The results demonstrated a significant McGurk effect for isolated clicks. This effect, however, was notably smaller than that for the stop-consonant-vowel syllables.

Moreover, strong McGurk effects were discovered for click-vowel syllables, which were similar to those for English syllables.

On the other hand, weak McGurk effects were found for excised release bursts of stop consonants in isolation; these were similar to the effects for isolated clicks.

This outcome shows that the McGurk effect may occur even in non-speech settings.

Moreover, while phonological significance is not a prerequisite for the McGurk effect, it does seem to intensify it.

Internal Factors Impacting the McGurk Effect

Brain Damage

The hemispheres of the brain cooperate to integrate speech information received via the visual and auditory senses (Baynes, Funnell & Fowler, 1994).

Right-handed individuals, for whom words have privileged access to the left hemisphere and the face to the right hemisphere, are more likely to experience a McGurk effect.

Research also shows that the McGurk effect, though present, is significantly slower for those who have undergone callosotomy.

Moreover, visual stimuli strongly impact speech perception in individuals with lesions to the left hemisphere, who demonstrate a larger McGurk effect than the average person (Schmid, Thielmann & Ziegler, 2009).

However, they would be less likely to experience the McGurk effect if the damage to the left hemisphere had compromised their visual segmental speech perception (Nicholson, Baum, Cuddy & Munhall, 2002).

On the other hand, individuals whose right brain hemisphere has been damaged demonstrate impairment in visual-only as well as audio-visual integration functions.

Moreover, though these individuals’ integration of the information can produce a McGurk effect, such integration appears only if optic stimuli are utilized to enhance performance when the aural signal is poor.

Thus, although persons whose right hemisphere is damaged may exhibit a McGurk effect, it is not as strong as in a normal group.

Alzheimer’s Disease

A study that investigated brain connectivity in Alzheimer’s disease (AD) by evaluating a crossmodal effect examined the McGurk effect in people with the disease and in matched control participants (Delbeuck, Collette & Van der Linden, 2007).

The results revealed impaired crossmodal integration in speech perception in AD. However, this impairment was not associated with disruptions in the separate processing of visual and auditory speech stimuli.

This outcome implies that the specific aural-optic integration deficit in AD patients might result from a connectivity breakdown.

Specific Language Impairment

A study that examined 28 preschoolers with Specific Language Impairment (SLI) and 28 preschoolers without SLI sought to analyze their capacity for auditory-visual integration (Norrix, Plante, Vance & Boliek, 2007).

While both groups performed equivalently in the congruent audio-visual condition, in the incongruent audio-visual condition, the children with SLI demonstrated a McGurk effect that was weaker than that observed for those without SLI.

The results indicate that those with SLI experience a significantly lower McGurk effect than those without SLI.

While they may pay less attention to articulatory gestures and employ less visual information in perceiving speech, they encounter no significant challenges in perceiving exclusively aural cues.

Aphasia

An investigation of the ability of a person afflicted with mild aphasia to recognize tokens offered in visual-only, auditory-only, and audio-visual conditions yielded results meriting attention (Youse, Cienkowski & Coelho, 2004).

The hypotheses were that performance would be best in the bimodal condition and that the McGurk effect would demonstrate integration of speech information.

The results, however, did not support these hypotheses. Instead, they suggested that a perseverative response pattern limited the successful integration of audio-visual speech information and that the use of bisensory speech stimuli may be compromised in adults with aphasia.

The McGurk Effect in Different Languages

Regardless of the language being used, listeners generally depend, to some degree, on visual information in speech perception. However, the intensity of the McGurk effect varies across languages.

For instance, Spanish, Italian, Turkish, English, Dutch, and German listeners experience a stronger McGurk effect than Chinese and Japanese listeners (Sekiyama, 1997; Bovo, Ciorba, Prosser & Martini, 2009; Erdener, 2015).

The cultural practice of avoiding eye contact, along with tonal and syllabic linguistic structures, might account for this diminished effect among Japanese and Chinese listeners.

Research also shows that, unlike English children, Japanese children do not demonstrate a developmental advancement in visual influence following age six (Sekiyama & Burnham, 2008; Hisanaga, Sekiyama, Igasaki & Murayama, 2009).

However, Japanese listeners recognize the incompatibility between auditory and visual stimuli better than English listeners, perhaps because Japanese lacks consonant clusters (Sekiyama & Tohkura, 1991).

Notwithstanding these differences, listeners of all languages are compelled to rely on visual stimuli when audio stimuli are unintelligible. When this occurs, variation across languages disappears, and the McGurk effect appears to the same degree.

Is the McGurk Effect an Illusion?

Yes, the McGurk effect is considered an illusion. It occurs when the perception of speech is influenced by both auditory and visual information, leading to a perceptual experience where the visual cues impact the interpretation of the auditory speech sounds.

References

Alsius, A., Navarra, J., & Soto-Faraco, S. (2007). Attention to touch weakens audiovisual speech integration. Experimental Brain Research, 183 (3), 399-404.

Baynes, K., Funnell, M. G., & Fowler, C. A. (1994). Hemispheric contributions to the integration of visual and auditory information in speech perception. Perception & Psychophysics, 55 (6), 633-641.

Boersma, P. (2012). A constraint-based explanation of the McGurk effect. Phonological Architecture: Empirical, Theoretical and Conceptual Issues, 299-312.

Bovo, R., Ciorba, A., Prosser, S., & Martini, A. (2009). The McGurk phenomenon in Italian listeners. Acta Otorhinolaryngologica Italica, 29 (4), 203.

Brancazio, L., Best, C. T., & Fowler, C. A. (2006). Visual influences on perception of speech and nonspeech vocal-tract events. Language and speech, 49 (1), 21-53.

Delbeuck, X., Collette, F., & Van der Linden, M. (2007). Is Alzheimer’s disease a disconnection syndrome?: Evidence from a crossmodal audio-visual illusory experiment. Neuropsychologia, 45 (14), 3315-3323.

Erdener, D. (2015). The McGurk illusion in Turkish. Turkish Journal of Psychology, 30 (76), 19-31.

Hisanaga, S., Sekiyama, K., Igasaki, T., & Murayama, N. (2009). Audiovisual speech perception in Japanese and English: inter-language differences examined by event-related potentials. In AVSP (pp. 38-42).

Massaro, D. W., & Stork, D. G. (1998). Speech recognition and sensory integration: a 240-year-old theorem helps explain how people and machines can integrate auditory and visual information to understand speech. American Scientist, 86 (3), 236-244.

McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264 (5588), 746-748.

Nath, A. R., & Beauchamp, M. S. (2012). A neural basis for interindividual differences in the McGurk effect, a multisensory speech illusion. Neuroimage, 59 (1), 781-787.

Nicholson, K. G., Baum, S., Cuddy, L. L., & Munhall, K. G. (2002). A case of impaired auditory and visual speech prosody perception after right hemisphere damage. Neurocase, 8 (4), 314-322.

Norrix, L. W., Plante, E., Vance, R., & Boliek, C. A. (2007). Auditory-visual integration for speech by children with and without specific language impairment. Journal of Speech, Language, and Hearing Research, 50 (6), 1639–1651.

Schmid, G., Thielmann, A., & Ziegler, W. (2009). The influence of visual and auditory information on the perception of speech and non‐speech oral movements in patients with left hemisphere lesions. Clinical linguistics & phonetics, 23 (3), 208-221.

Sekiyama, K. (1997). Cultural and linguistic factors in audiovisual speech processing: The McGurk effect in Chinese participants. Perception & psychophysics, 59 (1), 73-80.

Sekiyama, K., & Burnham, D. (2008). Impact of language on development of auditory‐visual speech perception. Developmental science, 11 (2), 306-320.

Sekiyama, K., & Tohkura, Y. I. (1991). McGurk effect in non‐English listeners: Few visual effects for Japanese participants hearing Japanese syllables of high auditory intelligibility. The Journal of the Acoustical Society of America, 90 (4), 1797-1805.

Tiippana, K., Andersen, T. S., & Sams, M. (2004). Visual attention modulates audiovisual speech perception. European Journal of Cognitive Psychology, 16 (3), 457-472.

Walker, S., Bruce, V., & O’Malley, C. (1995). Facial identity and facial speech processing: Familiar faces and voices in the McGurk effect. Perception & Psychophysics, 57 (8), 1124-1133.

Youse, K. M., Cienkowski, K. M., & Coelho, C. A. (2004). Auditory-visual speech perception in an adult with aphasia. Brain injury, 18 (8), 825-834.
