The latency differentiates BOLD responses elicited by congruent and incongruent McGurk audio-visual stimulus pairs
Ren-Horng Wang1, Shu-Yu Huang1, Hsin-Ju Lee2,3, Wen-Jui Kuo3, and Fa-Hsuan Lin2,4,5

1Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan, 2National Taiwan University, Taipei, Taiwan, 3National Yang-Ming University, Taipei, Taiwan, 4Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada, 5Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland


We used fast fMRI sampled at 10 Hz to study fMRI timing in audiovisual integration. Using the McGurk protocol, we found that the superior temporal gyrus (STG) showed a significant difference in BOLD time-to-half (TTH) between McGurk and congruent audiovisual stimulus pairs. The TTH difference between the congruent and McGurk conditions became progressively more significant from posterior to anterior regions (p-values at the visual cortex, lateral occipital cortex, occipital parietal junction, STG, and auditory cortex were 0.13, 0.12, 0.05, 0.02, and 0.01, respectively), suggesting that incongruent audio-visual stimuli cause a more delayed brain response at regions closer to primary auditory processing.


Combining information from the auditory and visual modalities leads to better speech comprehension1,2. Yet an incongruent combination of auditory and visual information can cause an illusion. Specifically, the McGurk effect describes the auditory percept of the syllable /ta/ or /ha/ when viewing a video clip with the mouth movement of /ka/ and the sound of /pa/3,4. Previous studies show that the left superior temporal sulcus (STS) is a critically important brain area for integrating auditory and visual information5-10. People perceiving the McGurk illusion have increased BOLD signal in STS11,12. For McGurk perceivers, how the BOLD signal differs between congruent, incongruent, and McGurk stimuli has been less explored.

Most fMRI studies use the amplitude of the BOLD signal to correlate with behaviors or to contrast between experimental conditions. Yet we recently found that BOLD signal latency can be more sensitive than BOLD signal amplitude in differentiating between attentional states13. Accordingly, we hypothesized that BOLD signals differ in latency between congruent and McGurk stimuli.


Eleven participants took part in the study and provided written informed consent. The experiment was approved by the Institutional Review Board of National Taiwan University Hospital. Stimuli were short video clips with sound. The visual component of the video clips was the whole face of a man pronouncing the syllables /pa/ and /ka/; the auditory component was the same person uttering the syllables /pa/ and /ka/. Four stimulus conditions were generated by pairing visual and auditory components: congruent /pa/ (visual /pa/ + auditory /pa/), congruent /ka/ (visual /ka/ + auditory /ka/), incongruent non-McGurk (visual /pa/ + auditory /ka/), and incongruent McGurk (visual /ka/ + auditory /pa/). Each trial consisted of two utterances and lasted 1.8 s. Participants were asked to push the left button upon hearing /ka/ or /pa/ and the right button upon hearing /ha/, /ta/, or other percepts. Every run lasted 7 min, and each participant completed three to five runs. Functional MRI data were acquired with SMS-InI14 to sample the whole-brain hemodynamic response at a 10-Hz rate with 5-mm isotropic resolution. Structural MRI data were obtained using a T1-weighted 3D sequence. Data pre-processing included slice-timing correction, motion correction, co-registration between functional and anatomical data, spatial normalization to the MNI space, and spatial smoothing. All pre-processing steps were performed in SPM8 (Wellcome Department, University College London, UK). We used a general linear model with finite impulse response (FIR) basis functions to estimate the BOLD response. The timing of a BOLD waveform was quantified by its onset, time-to-half (TTH), time-to-peak (TTP), and time-to-half-off (TTHoff). The significance of each timing index was evaluated by a permutation test, in which the condition labels of the waveforms were randomly shuffled 1,000 times to create an empirical null distribution.
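The TTH index and the permutation test described above can be illustrated with a minimal Python sketch. It is not the analysis code used in this study. The function names (time_to_half, permutation_test_tth), the linear interpolation at the half-maximum crossing, the pooled shuffling of labels across all waveforms, and the two-sided p-value are assumptions made for illustration only; the abstract does not specify these details.

```python
import numpy as np


def time_to_half(waveform, dt=0.1):
    """Time (s) at which the rising edge first reaches half of the peak.

    Assumes a 10-Hz sampling rate (dt = 0.1 s) and a waveform that starts
    below half of its maximum; uses linear interpolation between samples.
    """
    w = np.asarray(waveform, dtype=float)
    peak_idx = int(np.argmax(w))
    half = 0.5 * w[peak_idx]
    rising = w[: peak_idx + 1]
    # Index of the first sample at or above the half-maximum level.
    k = int(np.argmax(rising >= half))
    if k == 0:
        return 0.0
    # Linear interpolation between samples k-1 (below half) and k.
    frac = (half - w[k - 1]) / (w[k] - w[k - 1])
    return (k - 1 + frac) * dt


def permutation_test_tth(cond_a, cond_b, n_perm=1000, dt=0.1, seed=0):
    """Permutation test on the TTH difference between two conditions.

    cond_a, cond_b: arrays of shape (n_waveforms, n_timepoints).
    Labels are shuffled across the pooled waveforms (an assumption).
    Returns the observed TTH difference (a minus b) and a two-sided p-value.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(cond_a, dtype=float)
    b = np.asarray(cond_b, dtype=float)

    def tth_difference(x, y):
        # TTH is computed on the condition-averaged waveform.
        return time_to_half(x.mean(axis=0), dt) - time_to_half(y.mean(axis=0), dt)

    observed = tth_difference(a, b)
    pooled = np.vstack([a, b])
    n_a = a.shape[0]
    null = np.empty(n_perm)
    for i in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        null[i] = tth_difference(pooled[idx[:n_a]], pooled[idx[n_a:]])
    # Add-one correction keeps the p-value strictly positive.
    p_value = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)
    return observed, p_value
```

Here each row of cond_a and cond_b would hold one FIR-estimated waveform (for example, one per participant) at a given ROI. A within-participant label swap would be a reasonable alternative to pooled shuffling if the comparison is treated as paired.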


Figure 1 shows the regions-of-interest (ROIs) in this study: the visual cortex (Vis), lateral occipital cortex (LOC), occipital parietal junction (OPJ), superior temporal gyrus (STG), and auditory cortex (Aud). These ROIs were defined from the group analysis of the BOLD signal. Figure 2 shows the BOLD waveforms for the congruent, incongruent, and McGurk conditions at STG. The incongruent waveform had a visibly larger amplitude than the congruent and McGurk waveforms. No clear difference was found between the congruent and McGurk waveforms. Normalizing all time courses to their maximal values suggested that the incongruent waveform had a later offset than the congruent and McGurk waveforms. Permutation test results on BOLD timing are shown in Figure 3. Only Aud, STG, and OPJ showed a significant TTH difference between the McGurk and congruent conditions. No significant TTH difference between these two conditions was found at LOC or Vis. The TTH difference between the congruent and McGurk conditions became progressively more significant from posterior to anterior ROIs. Specifically, the p-values at Vis, LOC, OPJ, STG, and Aud were 0.13, 0.12, 0.05, 0.02, and 0.01, respectively.


This study revealed that BOLD timing is sensitive to the difference between the congruent and McGurk conditions. This difference was not found in the comparison of BOLD signal amplitude. A fine difference in BOLD timing but not in amplitude was also reported in our previous fast fMRI study differentiating between attentional states13. Interestingly, we found that the timing difference became progressively more significant in regions further away from the visual cortex and closer to the auditory cortex, suggesting that incongruent audio-visual stimuli cause a more delayed brain response at regions closer to primary auditory processing. Further electrophysiological data are required to find the neuronal basis of region-dependent BOLD latencies between congruent, incongruent, and McGurk conditions.


This work was partially supported by the Ministry of Science and Technology, Taiwan (103-2628-B-002-002-MY3, 105-2221-E-002-104), the National Health Research Institutes, Taiwan (NHRI-EX107-10727EI), and the Academy of Finland (No. 298131).


1 Poeppel D., Idsardi W. J. & van Wassenhove V. Philos Trans R Soc Lond B Biol Sci. 2008; 363:1071-1086.

2 Sheppard J. P., Wang J. P. & Wong P. C. PLoS One. 2011; 6:e16510.

3 McGurk H. & MacDonald J. Nature. 1976; 264:746-748.

4 Tiippana K. Front Psychol. 2014; 5:725.

5 Beauchamp M. S., Lee K. E., Argall B. D. et al. Neuron. 2004; 41:809-823.

6 Callan D. E., Jones J. A., Munhall K. et al. J Cogn Neurosci. 2004; 16:805-816.

7 Calvert G. A., Campbell R. & Brammer M. J. Curr Biol. 2000; 10:649-657.

8 Dahl C. D., Logothetis N. K. & Kayser C. J Neurosci. 2009; 29:11924-11932.

9 Miller L. M. & D'Esposito M. J Neurosci. 2005; 25:5884-5893.

10 Noesselt T., Rieger J. W., Schoenfeld M. A. et al. J Neurosci. 2007; 27:11431-11441.

11 Benoit M. M., Raij T., Lin F. H. et al. Hum Brain Mapp. 2010; 31:526-538.

12 Nath A. R. & Beauchamp M. S. Neuroimage. 2012; 59:781-787.

13 Chu Y.-H., Lin J.-F., Wu P.-Y. et al. Proc Intl Soc Magn Reson Med. 2017; 5251.

14 Hsu Y. C., Chu Y. H., Tsai S. Y. et al. Sci Rep. 2017; 7:17019.


Figure 1. Regions-of-interest in this study include the visual cortex (Vis), lateral occipital cortex (LOC), occipital parietal junction (OPJ), superior temporal gyrus (STG), and auditory cortex (Aud).

Figure 2. The BOLD waveforms at STG before (top) and after (bottom) normalizing the maximum to 1. The incongruent condition had a visibly larger amplitude than the congruent and McGurk conditions. After amplitude normalization, the incongruent condition visibly had a later offset than the congruent and McGurk conditions.

Figure 3. The BOLD timings at different ROIs. There was no significant difference in BOLD amplitude between the congruent (Con) and McGurk conditions. However, BOLD time-to-half (TTH) differed significantly between the Con and McGurk conditions at the occipital parietal junction (OPJ), superior temporal gyrus (STG), and auditory cortex (Aud), but not at the lateral occipital cortex (LOC) or visual cortex (Vis).

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)