By Ayesh Perera, published June 10, 2021
The McGurk effect is a perceptual phenomenon that occurs when the movement of another individual’s lips does not match the speech sound actually being produced, altering what the observer hears.
In other words, it is an illusion that arises from the interaction between vision and hearing in the perception of speech.
Here, the visual component of one sound is paired with the auditory component of another, and the pairing induces the perception of a third, distinct sound.
While a combination of degraded auditory stimuli and high-quality visual information is a common source of the McGurk effect, a multiplicity of factors influences the magnitude of this phenomenon.
The McGurk effect was first described in 1976 by the cognitive psychologists Harry McGurk and John MacDonald in a paper titled “Hearing Lips and Seeing Voices” (McGurk & MacDonald, 1976).
This effect was accidentally discovered while McGurk and his assistant, MacDonald, were conducting a study on the perception of language by infants at various developmental stages.
During the study, they would, in one location, play the video of a mother speaking, and in another location, play the sound of her voice.
In the experiment, they directed a technician to dub the auditory syllable “ba” onto a videotape of a speaker mouthing “ga.” When the dubbed tape was played, MacDonald and McGurk perceived “da,” a phoneme distinct from both the dubbed audio and the visual.
Both McGurk and MacDonald were confused. Eventually, however, they realized that the phenomenon stemmed not from a mistake by the technician but from an idiosyncrasy in human perception.
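The fusion that puzzled McGurk and MacDonald can be given a simple quantitative reading. One standard account, the fuzzy logical model of perception (see Massaro & Stork, 1998, in the references below), holds that each modality independently lends support to every candidate syllable, and that these supports are multiplied and normalized to yield response probabilities. The Python sketch below illustrates the idea; the support values are assumptions chosen for demonstration, not measured data.

```python
# A minimal sketch of audiovisual fusion in the style of the fuzzy logical
# model of perception (FLMP; cf. Massaro & Stork, 1998). All support values
# are invented for illustration: the audio track favors /ba/ (with /da/ an
# acoustic neighbor), while the mouthed /ga/ favors /ga/ (with /da/ a
# visual neighbor).

auditory_support = {"ba": 0.90, "da": 0.50, "ga": 0.05}  # heard: "ba"
visual_support = {"ba": 0.05, "da": 0.50, "ga": 0.90}    # seen: "ga"

def fuse(auditory, visual):
    """Multiply per-syllable supports and normalize into response probabilities."""
    combined = {s: auditory[s] * visual[s] for s in auditory}
    total = sum(combined.values())
    return {s: round(p / total, 3) for s, p in combined.items()}

print(fuse(auditory_support, visual_support))
# -> {'ba': 0.132, 'da': 0.735, 'ga': 0.132}
```

Because /da/ receives moderate support from both modalities while /ba/ and /ga/ each receive strong support from only one, the fused percept /da/ dominates, mirroring what McGurk and MacDonald heard.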
A study of the role of visual attention in audiovisual speech perception observed the McGurk effect in two different situations (Tiippana, Andersen & Sams, 2004).
In the first instance, the listener’s attention was focused on the speaker’s face, whereas in the second, the listener ignored the face by attending to a leaf that moved across it.
The results demonstrated that the McGurk effect was weaker in the latter scenario. This outcome was attributed to the modulation of audiovisual speech perception by visual attention.
This modulation could have occurred either at an early, unisensory stage of processing, or through changes at the stage where visual and auditory information are integrated.
Research has recently challenged the popular assumption that audiovisual integration occurs automatically, independently of attention (Alsius, Navarra & Soto-Faraco, 2007).
One study investigated whether audiovisual speech integration is hindered when visual and auditory attentional resources are depleted by imposing demands on the tactile domain, a modality not directly associated with speech perception.
The McGurk effect was measured in a dual-task paradigm involving a demanding tactile assignment. The results indicated that the proportion of visually influenced responses to audiovisual stimuli decreased as attention was diverted to the tactile task.
This outcome was attributed to a modulatory effect on the audiovisual binding of speech mediated by the limits of supramodal attention.
Together, these findings offer a glimpse of both the dynamism and the extent of the interactions between crossmodal binding mechanisms and the attentional system.
An experiment was conducted to examine claims that facial speech processing is independent of facial identity processing (Walker, Bruce & O'Malley, 1995).
In the study, the faces used to create the McGurk-effect stimuli were manipulated so that they were unfamiliar to some participants but familiar to others. Moreover, the voices and faces were either congruent (belonging to the same individual) or incongruent (belonging to different individuals).
The distinct participant groups were compared to each other to gauge their susceptibility to the McGurk illusion. The results indicated that when the voices and the faces were incongruent, the participants who were acquainted with the faces were less susceptible than those who were unacquainted with the faces.
This outcome implies that facial speech and facial identity are far from independent, and that those who are familiar with the faces of the speakers are less likely, than those who are unfamiliar, to be swayed by the McGurk effect.
One study conducted four experiments to determine whether visual information influences judgments of acoustically specified nonspeech events as well as speech events (Brancazio, Best & Fowler, 2006).
The study employed click sounds, which are perceived as nonspeech by many English listeners but function as consonants in certain African languages.
The results demonstrated a significant McGurk effect for isolated clicks. This effect, however, was notably smaller than that for stop-consonant-vowel syllables.
Moreover, for click-vowel syllables, strong McGurk effects were found, similar to those for English syllables. For excised release bursts of stop consonants presented in isolation, by contrast, the McGurk effects were weak, similar to those for isolated clicks.
This outcome shows that the McGurk effect may occur even in non-speech settings.
Moreover, while phonological significance is not a prerequisite for the McGurk effect, it does seem to intensify it.
The hemispheres of the brain cooperate to integrate speech information received through vision and hearing (Baynes, Funnell & Fowler, 1994).
Right-handed individuals, for whom words have privileged access to the left hemisphere and faces to the right hemisphere, are more likely to experience a McGurk effect.
Research also shows that the McGurk effect, though present, is significantly slower for those who have undergone callosotomy. Moreover, visual stimuli strongly impact speech perception in individuals with lesions to the left hemisphere, and they demonstrate a larger McGurk effect than the average person (Schmid, Thielmann & Ziegler, 2009).
However, such individuals would be less likely to experience the McGurk effect if the left-hemisphere damage had compromised their perception of visual speech segments (Nicholson, Baum, Cuddy & Munhall, 2002).
On the other hand, individuals whose right hemisphere has been damaged demonstrate impairment on visual-only as well as audio-visual integration tasks.
Moreover, although these individuals’ integration of the information can produce a McGurk effect, such integration appears only when visual stimuli are used to enhance performance in the face of a degraded auditory signal.
Thus, although persons whose right hemisphere is damaged may exhibit a McGurk effect, it is not as strong as it would be in a neurologically typical group.
A study investigating the brain connectivity changes associated with Alzheimer’s disease (AD) through a crossmodal effect examined the McGurk effect in those afflicted with the disease and in matched control participants (Delbeuck, Collette & Van der Linden, 2007).
The results revealed impaired crossmodal integration in speech perception in the AD patients. However, this impairment was not associated with disruptions in the separate processing of visual and auditory speech stimuli.
This outcome implies that the specific auditory-visual integration deficit in AD patients might result from a breakdown in connectivity.
A study which examined 28 preschoolers with Specific Language Impairment (SLI) and 28 preschoolers without SLI sought to analyze their capacity for auditory-visual integration (Norrix, Plante, Vance & Boliek, 2007).
While both groups performed equivalently in the congruent audio-visual condition, in the incongruent condition the children with SLI demonstrated a weaker McGurk effect than those without SLI.
The results indicate that children with SLI experience a significantly weaker McGurk effect than their typically developing peers.
While they may pay less attention to articulatory gestures and use less visual information in perceiving speech, they encounter no significant difficulty perceiving exclusively auditory cues.
An investigation of the ability of a person with mild aphasia to recognize tokens presented in visual-only, auditory-only, and audio-visual conditions yielded results meriting attention (Youse, Cienkowski & Coelho, 2004).
The hypothesis was that performance would be best in the bimodal condition, and that the McGurk effect would demonstrate the integration of speech information.
The results, however, did not support these hypotheses; they suggested instead that a perseverative response pattern limited the successful integration of audio-visual speech information, and that the use of bisensory speech stimuli may be compromised in adults with aphasia.
Regardless of the language being used, listeners generally depend, to some degree, on visual information in speech perception. However, the intensity of the McGurk effect varies across languages.
For instance, Spanish, Italian, Turkish, English, Dutch and German listeners experience a stronger McGurk effect than Chinese and Japanese listeners (Sekiyama, 1997; Bovo, Ciorba, Prosser & Martini, 2009; Erdener, 2015).
The cultural practice of avoiding eye contact, along with tonal and syllable-based linguistic structures, might account for this diminished effect among Japanese and Chinese listeners.
Research also shows that, unlike English children, Japanese children do not demonstrate a developmental increase in visual influence after age six (Sekiyama & Burnham, 2008; Hisanaga, Sekiyama, Igasaki & Murayama, 2009).
However, Japanese listeners recognize the incompatibility between auditory and visual stimuli better than English listeners do, perhaps because Japanese lacks consonant clusters (Sekiyama & Tohkura, 1991).
Notwithstanding these differences, listeners of all languages must rely on visual stimuli when auditory stimuli are unintelligible. When this occurs, the variation across languages disappears, and the McGurk effect emerges with comparable strength.
Perera, A. (2021, June 10). The McGurk effect. Simply Psychology. www.simplypsychology.org/mcgurk-effect.html
Alsius, A., Navarra, J., & Soto-Faraco, S. (2007). Attention to touch weakens audiovisual speech integration. Experimental Brain Research, 183(3), 399-404.
Baynes, K., Funnell, M. G., & Fowler, C. A. (1994). Hemispheric contributions to the integration of visual and auditory information in speech perception. Perception & Psychophysics, 55(6), 633-641.
Boersma, P. (2012). A constraint-based explanation of the McGurk effect. Phonological Architecture: Empirical, Theoretical and Conceptual Issues, 299-312.
Bovo, R., Ciorba, A., Prosser, S., & Martini, A. (2009). The McGurk phenomenon in Italian listeners. Acta Otorhinolaryngologica Italica, 29(4), 203.
Brancazio, L., Best, C. T., & Fowler, C. A. (2006). Visual influences on perception of speech and nonspeech vocal-tract events. Language and Speech, 49(1), 21-53.
Delbeuck, X., Collette, F., & Van der Linden, M. (2007). Is Alzheimer's disease a disconnection syndrome?: Evidence from a crossmodal audio-visual illusory experiment. Neuropsychologia, 45(14), 3315-3323.
Erdener, D. (2015). The McGurk illusion in Turkish. Turkish Journal of Psychology, 30(76), 19-31.
Hisanaga, S., Sekiyama, K., Igasaki, T., & Murayama, N. (2009). Audiovisual speech perception in Japanese and English: inter-language differences examined by event-related potentials. In AVSP (pp. 38-42).
Massaro, D. W., & Stork, D. G. (1998). Speech recognition and sensory integration: a 240-year-old theorem helps explain how people and machines can integrate auditory and visual information to understand speech. American Scientist, 86(3), 236-244.
McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264(5588), 746-748.
Nath, A. R., & Beauchamp, M. S. (2012). A neural basis for interindividual differences in the McGurk effect, a multisensory speech illusion. Neuroimage, 59(1), 781-787.
Nicholson, K. G., Baum, S., Cuddy, L. L., & Munhall, K. G. (2002). A case of impaired auditory and visual speech prosody perception after right hemisphere damage. Neurocase, 8(4), 314-322.
Norrix, L. W., Plante, E., Vance, R., & Boliek, C. A. (2007). Auditory-visual integration for speech by children with and without specific language impairment. Journal of Speech, Language, and Hearing Research, 50(6), 1639–1651.
Schmid, G., Thielmann, A., & Ziegler, W. (2009). The influence of visual and auditory information on the perception of speech and non-speech oral movements in patients with left hemisphere lesions. Clinical Linguistics & Phonetics, 23(3), 208-221.
Sekiyama, K. (1997). Cultural and linguistic factors in audiovisual speech processing: The McGurk effect in Chinese participants. Perception & Psychophysics, 59(1), 73-80.
Sekiyama, K., & Burnham, D. (2008). Impact of language on development of auditory-visual speech perception. Developmental Science, 11(2), 306-320.
Sekiyama, K., & Tohkura, Y. I. (1991). McGurk effect in non‐English listeners: Few visual effects for Japanese participants hearing Japanese syllables of high auditory intelligibility. The Journal of the Acoustical Society of America, 90(4), 1797-1805.
Tiippana, K., Andersen, T. S., & Sams, M. (2004). Visual attention modulates audiovisual speech perception. European Journal of Cognitive Psychology, 16(3), 457-472.
Walker, S., Bruce, V., & O’Malley, C. (1995). Facial identity and facial speech processing: Familiar faces and voices in the McGurk effect. Perception & Psychophysics, 57(8), 1124-1133.
Youse, K. M., Cienkowski, K. M., & Coelho, C. A. (2004). Auditory-visual speech perception in an adult with aphasia. Brain Injury, 18(8), 825-834.