Congruence-based contextual plausibility modulates cortical activity during vibrotactile perception in virtual multisensory environments | Communications Biology – Nature.com

Communications Biology volume 5, Article number: 1360 (2022)
How congruence cues and congruence-based expectations may together shape perception in virtual reality (VR) still needs to be unravelled. We linked the concept of plausibility used in VR research with congruence-based modulation by assessing brain responses while participants experienced vehicle riding in VR scenarios. Perceptual plausibility was manipulated by sensory congruence, with multisensory stimulations that conformed to common expectations of road scenes being plausible. We hypothesized that plausible scenarios would elicit greater cortical responses. The results showed that: (i) vibrotactile stimulations at expected intensities, given embedded audio-visual information, engaged greater cortical activities in frontal and sensorimotor regions; (ii) weaker plausible stimulations resulted in greater responses in the sensorimotor cortex than stronger but implausible stimulations; (iii) frontal activities under plausible scenarios negatively correlated with plausibility violation costs in the sensorimotor cortex. These results potentially indicate frontal regulation of sensory processing and extend previous evidence of contextual modulation to the tactile sense.
Human behaviours are embodied through the senses and embedded in a system of contexts1,2. Moreover, humans also actively construct and select contexts for their behaviours and adapt to them3,4. Thus, other than the physical and social environments, breakthroughs in computer science as well as in communication and other digital technologies in the past decades have created new digital environments for human behaviours and the research about them. Specifically, virtual reality (VR) technologies offer a broad spectrum of experiential contexts that are applicable in many domains, ranging from industry to education and medicine5,6. Within the communities of psychological and cognitive neuroscience research, VR technologies have been increasingly used in lab-based experiments to enable more naturalistic studies of human behaviours7,8,9,10. There are, however, still considerable concerns about the discrepancies between experiences in the real and virtual worlds11,12,13,14. Such gaps arise, in part, from the still rather limited use of multisensory inputs in VR, which usually do not go beyond vision and hearing.
Despite their widespread use, most VR applications rely only on visual and auditory information15. The inclusion of other sensory modalities, such as touch, into VR technologies is currently being developed and evaluated15,16. Towards this goal, recent developments in digital, telecommunication, as well as sensor and actuator technologies have joined forces to establish a type of digital communication infrastructure, known as the Tactile Internet17,18,19, for humans to remotely access, perceive, and manipulate real or virtual objects. These technologies could provide multisensory avenues for humans to experience virtual (or remote) environments through digitalized tactile and kinesthetic information, besides visual and auditory signals. The tactile sense operates with a more precise temporal resolution20 and develops earlier in life than hearing and vision21. Therefore, the inclusion of tactile information could further enhance multisensory perception in virtual environments. En route to developing digital infrastructures for humans to behave in virtual or remote settings, it is crucial to understand neurocognitive mechanisms underlying multisensory perception22,23. Thus, we investigated cortical processes for the combined influence of bottom-up sensory congruence cues and top-down experience-based expectations on tactile perception in virtual multisensory environments.
In VR research, the plausibility of external events in the virtual environment is considered one of the key factors for constructing virtual experiences that could be perceived as sufficiently realistic24,25. Plausibility in this regard refers to correspondences between sensory events in VR. Furthermore, the credibility of plausible external events can be maintained if they conform to what normally would be expected in the rendered circumstances (cf. 24). This notion of perceived perceptual plausibility in VR being affirmed through conformity with what would be expected in normal circumstances is in line with Helmholtz’s classical view (1857)26 and the more recent Bayesian approaches of perception2,27,28,29. According to these theories, perception is guided, in part, by expectations that are based on the individual’s past experiences or prior knowledge. Such top-down influences act together with bottom-up congruency cues that are driven by low-level stimulus properties (e.g., spatial or cross-modal congruency) for humans to form coherent and robust multisensory perception27,28,29,30,31,32. Regarding brain substrates for such processes, the inferior frontal lobe plays a role in combining top-down congruency expectations with bottom-up congruency cues to adjust information integration and segregation during multisensory perception29. A general aim of this study is to experimentally relate the concept of plausibility used in developing VR technologies with neurocognitive studies of multisensory perception to facilitate interdisciplinary research.
Thus far, studies on virtual realism have primarily evaluated the plausibility of external events in VR by subjective ratings33,34 or behavioural measures35. However, subjective ratings are known for their methodological limitations on validity34. Besides, behavioural observations alone could not elucidate neurocognitive processes involved in the interplay between sensory congruence and contextual expectation that together may affect the perceived plausibility of multisensory experiences in VR. Knowledge about these processes can shed light on how congruence-based contextual plausibility may modulate multisensory perception in more naturalistic settings and inform engineering solutions for plausible virtual multisensory experiences.
Empirical findings from primate and human studies have identified several relevant brain areas for tactile and multisensory perception. For instance, neuronal activity recorded in awake monkeys trained to detect vibrotactile stimuli showed that neuronal representations associated with vibrotactile stimulations unfold in time across several cortical regions, ranging from the earlier somatosensory cortex over the motor cortex to the later dorsal and ventral premotor cortex36,37. Moreover, in situations with substantial sensory uncertainty, such as when processing near-threshold vibrotactile stimuli, the subjective experience of signal detection was found to be correlated with activities in the frontal regions (e.g., premotor cortex) known to support top-down attentional control and action planning36. Besides, the parietal cortex also plays important roles in multisensory perception of trimodal stimuli which encompasses vision, hearing, and touch38. Furthermore, results from human studies of visual-auditory illusions39 showed that brain activities in the inferior frontal lobe contribute to incongruency effects when bottom-up sensory congruency cues conflicted with top-down contextual expectations29,40.
Along with the aforementioned general goal, our specific aim is to investigate how perceived perceptual plausibility that is jointly influenced by sensory congruence cues and congruency expectations may modulate cortical activities in frontal and sensorimotor regions, as well as their relations during vibrotactile perception in virtual multisensory environments. Furthermore, we were also interested in finding out whether congruence-based expectation would directly impact cortical responses in the sensorimotor cortex, beyond the effects of vibrotactile stimulation intensity. Previous studies showed that the effects of congruence-based expectation on perception can either be facilitative or inhibitory, depending on task-specific demands or response strategies41,42. Whereas task-specific response strategies may reflect perceptual decision biases, multisensory tasks without explicit response requirements are more suitable in capturing automatic perceptual processes that occur in many real-life situations41,43.
Regarding the first question, previous findings from studies on prior expectations based on semantic contexts showed that, when no explicit perceptual decision or response are involved, perceptual processing favours sensory inputs that are congruent (confirmed) over those that are incongruent (disconfirmed) with what would be expected based on prior experiences or knowledge31,44, and engages more neural activities in the frontal45 and sensory41 cortices. We thus hypothesized greater hemodynamic responses during the high in comparison to the low plausibility scenarios. As for the second question, past studies also indicated that congruence-based expectations could induce substantial changes in perceptual representations beyond just biasing perceptual selections. Such effects have previously been shown in audio-visual27, taste46,47, and pain48. We, therefore, also expected that the responses of the sensorimotor cortex to vibrotactile stimulations of a given intensity would be modulated by multisensory contextual congruency in the virtual environment. Specifically, we hypothesized that experiencing a weaker vibrotactile stimulation may nonetheless elicit a larger response in the sensorimotor cortex than a stronger stimulation, if the vibrotactile signal of a lower intensity would be more in line with the participant’s congruency expectation given by the audio-visual information embedded in the virtual scenario. We also explored the relation between the effects of congruence-based plausibility on frontal cortical responses and subjective ratings of vibrotactile stimulation plausibility.
To test these hypotheses, we created several virtual scenarios of front-row vehicle riding to investigate the underlying neurocognitive mechanisms. The plausibility level of vibrotactile stimulations was experimentally operationalized by manipulating the congruence (match) or incongruence (mismatch) between stimulation intensity and the contextual audio-visual information of different virtual vehicle riding scenarios. The daily experiences of being a passenger riding in a car that moves through roads with different surface conditions are common for adults. Acquired prior knowledge that is based on such common experiences serves as a basis for expecting weaker vibrations when riding on smooth roads and stronger vibrations when confronted with rough road situations. In this way, the sensory (in)congruence between vibrotactile stimulation intensity and the audio-visual information embedded in the virtual scenarios of moving through roads of different surface types allowed us to vary the degrees of conformity between the experienced multisensory congruence cues in VR and the participant’s expectations about vibrations they would normally expect for a given road type.
Specifically, in each of the virtual scenarios, a vibrotactile stimulation of high (e.g., 36 dB above perceptual threshold) or low (e.g., 10 dB above perceptual threshold) intensity from a car seat was presented concurrently with the audio-visual information for a given road scene. The vibration was delivered through the seat that was securely mounted on a hydraulic platform with an electrodynamic shaker. The scenes displayed views from the front passenger row of a vehicle moving through rougher (e.g., cobblestone) or smoother (highway) road surfaces that are, respectively, accompanied by congruent louder or quieter audio sound effects recorded from the corresponding situations (see “Methods” for details). The road scenes were carefully selected to be representative of daily experiences. Thus, when participants were exposed to the audio-visual contexts of the various scenarios, they would have general expectations about the vibration strengths to be felt from the seat, which would normally be associated with the respective audio-visual information of the road scenes. The different intensity levels of vibrotactile stimulations were crossed with the audio-visual information for the different road scenes, which allowed us to vary street scene-based contextual congruency expectation29 about the vibrotactile experience for each of the scenarios. Accordingly, the experiment included virtual scenarios of high plausibility (i.e., vibrotactile intensity conformed to the contextual expectation given by the corresponding audio-visual information displayed for a road scene) or low plausibility (vibrotactile intensity violated the contextual expectation).
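The crossing of road scene and vibration intensity described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary `EXPECTED_DB`, the function `plausibility`, and the restriction to the two extreme scenes are hypothetical choices made here, and the intensities are the example values (36 dB and 10 dB above perceptual threshold) given in the text, not the full stimulus set.

```python
from itertools import product

# Assumption: the intensity a participant would expect for each scene,
# in dB above perceptual threshold (example values from the text).
EXPECTED_DB = {"cobblestone": 36, "highway": 10}


def plausibility(scene: str, intensity_db: int) -> str:
    """Label a scenario 'high' if the vibration matches the scene-based
    expectation, and 'low' if it mismatches."""
    return "high" if EXPECTED_DB[scene] == intensity_db else "low"


# Cross every scene with every intensity level (a 2 x 2 design here).
design = {
    (scene, db): plausibility(scene, db)
    for scene, db in product(EXPECTED_DB, (10, 36))
}

for (scene, db), level in sorted(design.items()):
    print(f"{scene:12s} {db:2d} dB -> {level} plausibility")
```

In this sketch the same physical intensity is plausible in one scene and implausible in the other, which is what allows stimulation intensity and contextual expectation to be separated in the analyses.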
We measured brain hemodynamic responses using functional Near-Infrared Spectroscopy (fNIRS) while human participants were exposed to the virtual scenarios of vehicle riding. Given previous findings on brain regions relevant for vibrotactile and trimodal perception, we used a montage with the NIRS sources and detectors covering the dorsolateral prefrontal (dlPFC), premotor (including the supplementary motor area) and the sensorimotor regions. While being on the virtual rides, the participants were asked to simply experience the situations and reflect about the plausibility of the vibrations felt through the seat. No explicit responses were required during the main experiment. Besides the main experiment, subjective plausibility ratings of vibrotactile stimulations over a wider range of intensity levels were also obtained from the participants (see “Methods” for details).
To anticipate, results of the present study indicate that plausible vibrotactile stimulations which conformed to the participants’ congruency expectations that are probed by the contextual audio-visual information embedded in the virtual scenarios elicited greater brain hemodynamic responses than stimulations of low plausibility. Notably, other than engaging a higher level of frontal cortical activity, congruence-based expectations about the intensities of vibrotactile stimulation also impacted perceptual representations in the sensorimotor cortex. Moreover, we observed negative relations between individual differences in frontal activities observed during high plausibility scenarios and the congruence-based plausibility violation costs measured in the sensorimotor cortex.
We first examined the impact of congruence-based plausibility on cortical activities when participants passively experienced virtual scenarios with audio-visual scenes of roads (e.g., see Fig. 1a, b) of different surface roughness (i.e., cobblestone, fine cobblestone, tarmac and highway roads) that were combined with vibrotactile stimulations of different intensities (the videos, sound files, and digital files of high as well as low plausibility vibrotactile stimulations for the virtual scenarios are available through the link provided in the “Data availability” statement). This manipulation jointly varied bottom-up multisensory congruence cues and top-down congruence expectations, thus yielding scenarios of high or low plausibility, depending on congruency or incongruency between the audio-visual scenes and the intensities of vibrotactile stimulations.
a Vertical vibrotactile stimulation was delivered through a seat mounted on a hydraulic platform and an electrodynamic shaker. b Videos of moving road scenes were displayed on a wall-sized screen together with the corresponding audio recordings reproduced with a wavefield synthesis system with 464 loudspeakers in the room to provide contextual visual-auditory information for the vibrotactile stimulation. A participant wearing the NIRS cap sat on a seat in the virtual environment during the experiment. c The layout of the NIRS montage used in this study covered the dorsolateral prefrontal cortex (dlPFC, dark yellow), premotor cortex with supplementary motor area (light yellow), primary motor, and primary somatosensory cortex (pink). d The sensitivity profile of the montage indicates high sensitivity to optical density changes measured by each of the source-detector pairs in brain regions of interest (values shown here are on the log10 scale of sensitivity values from 0.01 to 1.0).
The participants’ subjective plausibility ratings of vibrations across a larger range of intensity levels were obtained from a separate phase of the study. These ratings confirmed that the vibrotactile stimulations used in the high plausibility scenarios were rated as more plausible (mean ratings ranged from 59.93 to 79.12) than those used in the low plausibility scenarios (mean ratings ranged from 4.46 to 56.05), F(1, 290) = 280.35, p < 0.001, ηp2 = 0.49.
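As a quick check on how the reported effect size relates to the F statistic, the standard conversion from an F value and its degrees of freedom to partial eta squared can be sketched as follows. The function name is hypothetical, and the sketch assumes the reported value follows the usual identity ηp2 = (F × df_effect) / (F × df_effect + df_error); the authors' software may have computed it differently.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Convert an F statistic to partial eta squared using
    eta_p^2 = (F * df_effect) / (F * df_effect + df_error)."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Values reported for the plausibility ratings: F(1, 290) = 280.35
eta = partial_eta_squared(280.35, 1, 290)
print(round(eta, 2))  # 0.49
```

The result agrees with the reported ηp2 = 0.49.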
We used linear mixed-effects models to analyze data collected from young adult participants (N = 36), with experimental factors (road scene and plausibility) as fixed effects, and random intercepts for participants with NIRS channels nested in participants (the relatively large number of observations from each participant, i.e., on average 36 NIRS channels for each of the 8 scenarios, resulted in large degrees of freedom that influenced the calculation of effect sizes49; see “Methods” for detailed descriptions of the participants, effective sample size, and further information about the linear mixed-effects models). Results from the overall analysis of concentration levels of oxygenated haemoglobin (HbO) measured from all NIRS channels (see “Methods” for procedures of data pre-processing and motion artifact correction) revealed significant main effects of road scene (F (3, 8617) = 12.79, P < 0.0001, ηp2 = 0.004) and plausibility (F (1, 8617) = 79.30, P < 0.0001, ηp2 = 0.009). This pattern of results indicates that the levels of HbO concentration differed between the virtual scenarios as a function of contextual expectancy (Fig. 2). Across scenes of all road types, HbO concentration was higher in virtual scenarios of high than low plausibility, suggesting that contextual expectancy modulates cortical activity. Of note, these effects were specific for HbO concentration and were not observed for deoxygenated haemoglobin (HbR; P = 0.18, ηp2 = 0.0006 and P = 0.17, ηp2 = 0.0002 for the main effects of road scene and plausibility, respectively). Furthermore, previous research also showed that changes in HbO levels are more closely associated with task-induced cortical responses than are HbR concentration levels50 (see also further information in “Methods”). Thus, below we only focus on results based on HbO.
Besides the main effects, the scene by plausibility interaction was also significant (F (3, 8617) = 7.93, P < 0.0001, ηp2 = 0.003), indicating that the effect of contextual expectancy on HbO concentration was larger in scenes with extreme roughness (cobblestone, t (8617) = 5.26, 95% CI [0.000002 0.000007], P < 0.0001, d = 0.21) or an extremely smooth surface (highway, t (8617) = 8.09, 95% CI [0.000004 0.0000096], P < 0.0001, d = 0.33).
The different columns correspond to the four different road scenes. The top and middle rows show results, respectively, from scenarios with vibrotactile stimulations at intensity levels of high and low plausibility. The bottom row shows contrast plots (high–low plausibility).
Following up the results across all NIRS channels and road scenes, we next conducted separate analyses for the scenes with extreme roughness (cobblestone) or smoothness (highway), which also had the largest intensity difference (26 dB) between high and low plausibility conditions. We compared HbO concentrations in the sensorimotor cortex with those in the dorsolateral prefrontal and premotor regions, which are known to be involved in processes of cognitive control, planning, and conceptual expectations45,51,52 as well as in combined effects of top-down congruency expectation and bottom-up congruence cues during multisensory perception29,40. The results (Fig. 3) again revealed the main effect of plausibility in each of the two scenarios (cobblestone: F (1, 1230) = 22.50, P < 0.0001, ηp2 = 0.02; highway: F (1, 1230) = 52.68, P < 0.0001, ηp2 = 0.04). The main effect of brain regions (sensorimotor vs. dlPFC & premotor) was also significant in each of the two analyses: the HbO concentration was higher in the sensorimotor cortex than in the frontal regions (cobblestone: F (1, 1195) = 4.92, P = 0.027, ηp2 = 0.004; highway: F (1, 1195) = 4.48, P = 0.03, ηp2 = 0.004). The brain region × plausibility interaction was significant only in the road scene with extreme roughness (cobblestone: F (1, 1230) = 3.95, P = 0.047, ηp2 = 0.003; highway: F (1, 1230) = 0.49, P = 0.48, ηp2 = 0.0004), due to the relatively lower HbO concentration level in the frontal regions in the low plausibility scenario in the cobblestone scene.
a Effects of congruence-based plausibility and cortical regions for the scenes of cobblestone road. b Effects of congruence-based plausibility and cortical regions for the scenes of highway. Error bars represent ±1 standard error of the mean.
Plausibility in the current study was experimentally manipulated by crossing scenes of rough (cobblestone) or smooth (highway) roads with vibrotactile stimulations of high (36 dB above perceptual threshold) or low (10 dB above perceptual threshold) vibration intensities. This allowed us to conduct further follow-up analyses to examine the nature of the effects of contextual expectancy. First, we compared effects of vibrotactile stimulation of a given intensity in the sensorimotor cortex between scenes (highway vs. cobblestone). On the one hand, when contrasting HbO concentration in the sensorimotor cortex under the 10 dB stimulation in the cobblestone scene with that in the highway scene (cobblestone10dB − highway10dB), we observed a significantly lower HbO level for the same stimulation intensity (t (1131) = −3.48, 95% CI [−0.000011 −0.0000015], P = 0.0005, (Bonferroni-)Adjusted P = 0.003, d = 0.23) in the scene with lower multisensory plausibility. For the scene with the rough road surface (i.e., cobblestone), this difference reflected an effect of negative expectancy violation cost, since 10 dB fell short of the expectation implied by the audio-visual information in the virtual scenario of the cobblestone road. On the other hand, when contrasting HbO concentration in the sensorimotor cortex under the 36 dB stimulation in the cobblestone scene with that in the highway scene (cobblestone36dB − highway36dB), we observed a marginally higher HbO level for the same stimulation intensity (t (1131) = 1.744, 95% CI [−0.0000017 0.0000080], P = 0.08, Adjusted P = 0.488, d = 0.12) in the scene with higher multisensory plausibility. For the scene with smooth road, this reflected an effect of positive expectancy violation cost, since 36 dB surpassed the expectation implied by the audio-visual contextual information in the virtual scenario of the highway.
We then compared effects of vibrotactile stimulations of different intensities in the same scene as a function of expectation congruence or violation. In the scene of smooth road (highway), a lower intensity (10 dB) of vibrotactile stimulation that was congruent with the expectation given by the audio-visual information of the scene elicited a higher HbO level in the sensorimotor cortex than a stronger (36 dB) vibrotactile stimulation that violated the contextual expectation (t (1230) = 4.61, 95% CI [0.0000034 0.0000124], P < 0.0001, Adjusted P < 0.0001, d = 0.30). An analysis of HbO levels associated with the same scene (highway) in the frontal regions revealed a similar effect (t (1230) = 5.65, 95% CI [0.0000034 0.0000095], P < 0.0001, Adjusted P < 0.0001, d = 0.25). In the scene of rough road (cobblestone), similar patterns of results were only observed in the frontal regions (t (1230) = 5.05, 95% CI [0.0000027 0.0000088], P < 0.0001, Adjusted P < 0.0001, d = 0.22), but not in the sensorimotor cortex (t (1230) = 0.97, 95% CI [−0.0000029 0.0000062], P = 0.33, Adjusted P > 0.99, d = 0.06).
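The adjusted P values reported for the follow-up contrasts above are consistent with a Bonferroni correction, in which each raw P value is multiplied by the number of comparisons and capped at 1. The sketch below illustrates this; the number of comparisons (`M_ASSUMED = 6`) is inferred here from the reported pair 0.0005 and 0.003 and is not stated in this section, so it should be read as an assumption.

```python
def bonferroni(p_value: float, n_comparisons: int) -> float:
    """Bonferroni-adjusted P value: multiply by the number of
    comparisons and cap the result at 1.0."""
    return min(1.0, p_value * n_comparisons)


# Assumption: six comparisons, inferred from 0.0005 -> 0.003 in the text.
M_ASSUMED = 6

for raw_p in (0.0005, 0.33):
    print(raw_p, "->", bonferroni(raw_p, M_ASSUMED))
```

Under this assumption a raw P of 0.33 is capped at 1, which matches the reported "Adjusted P > 0.99". The reported adjusted value of 0.488 for P = 0.08 suggests that the adjustment was applied to the unrounded P value.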
As a next step, we conducted correlational analyses to examine how congruence-based plausibility may modulate functional relations between cortical activities in the frontal and sensorimotor regions during multisensory perceptual experiences in VR scenarios. Given that individual differences in the signal-to-noise ratios of the NIRS signals may confound between-person correlations, baseline correction was performed for each participant (see “Methods” for details). Furthermore, we computed plausibility violation cost scores for each of the participants, based on the signal amplitude of an individual’s cortical activities in the low plausibility condition as his or her own control. Lastly, we also conducted a further control analysis by checking correlations based on ratio scores for each participant (i.e., normalizing plausibility violation costs through dividing them by the means of cortical activities of both conditions at the individual level). Since the ratio scores were not normally distributed, we conducted Spearman’s correlation for the ratio scores.
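The cost and ratio scores described above, and the rank-based correlation applied to the ratio scores, can be sketched in plain Python. All function names and the example numbers are hypothetical. The sign convention (low plausibility minus high plausibility) is an assumption made for this sketch; the exact computation is described in the authors' “Methods”.

```python
from statistics import mean


def violation_cost(low: float, high: float) -> float:
    """Difference score between low and high plausibility conditions
    (sign convention assumed for this sketch)."""
    return low - high


def ratio_score(low: float, high: float) -> float:
    """Cost normalized by the individual's mean over both conditions.
    Unstable when that mean is close to zero."""
    return violation_cost(low, high) / mean([low, high])


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-participant HbO values (arbitrary units), as (low, high).
frontal = [(0.8, 1.0), (1.1, 1.6), (0.5, 0.9), (1.4, 1.5)]
sensorimotor = [(0.9, 1.2), (1.0, 1.7), (0.6, 0.8), (1.3, 1.5)]

frontal_ratio = [ratio_score(lo, hi) for lo, hi in frontal]
sensorimotor_ratio = [ratio_score(lo, hi) for lo, hi in sensorimotor]
print(spearman(frontal_ratio, sensorimotor_ratio))
```

Because Spearman's rho depends only on rank order, it does not require the ratio scores to be normally distributed, which is the reason given above for using it.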
We first correlated the magnitudes of expectancy modulation of HbO concentrations in the frontal regions with those in the sensorimotor cortex separately for the modulation effects involving negative or positive expectancy violations. Independent of valence (negative or positive), greater magnitudes of expectancy modulation in the frontal regions are highly correlated with greater modulation effects in the sensorimotor region across individuals (Fig. 4a: negative plausibility violation cost: rcost (33) = 0.76, P < 0.0001, N = 35; Fig. 4b: positive plausibility violation cost: rcost (34) = 0.89, P < 0.0001, N = 36). Control analyses computed based on ratio scores that were normalized by individual means of HbO concentrations of both plausibility conditions also yielded significant, albeit attenuated, effects (negative: rhoratio (33) = 0.68, P < 0.0001, N = 35; positive: rhoratio (32) = 0.43, P = 0.01, N = 34). These results indicate that congruence-based plausibility systematically modulates cortical activities in the frontal and sensorimotor regions.
a Results regarding negative expectancy cost which was computed as the difference in HbO concentration between the scenarios of cobblestone10dB and highway10dB (n = 35 young adults). b Results regarding positive expectancy cost which was computed as the difference in HbO concentration between the scenarios of highway36dB and cobblestone36dB. Shaded area represents the 95% confidence interval (n = 36 young adults).
Next, we examined the relations between levels of HbO in the frontal regions under scenarios of high multisensory plausibility (i.e., the scenes with highway or cobblestone road that were paired with the 10 dB or 36 dB vibrotactile stimulations, respectively) and the magnitudes of expectancy violation costs in the sensorimotor cortex. Again, independent of the valence of expectancy violation, we observed negative correlations between levels of HbO concentration in the dlPFC and premotor regions under high plausibility scenarios and the magnitude of expectancy violation costs in the sensorimotor cortex (Fig. 5a: high plausibility virtual scenario with 10 dB stimulation, rcost (33) = −0.56, P < 0.001, N = 35; Fig. 5b: high plausibility virtual scenario with 36 dB stimulation, rcost (33) = −0.60, P < 0.001, N = 35). Validating these results by control analyses computed based on ratio scores also yielded comparable significant effects (10 dB stimulation: rhoratio (33) = −0.55, P < 0.001, N = 35; 36 dB stimulation: rhoratio (31) = −0.75, P < 0.001, N = 33). These correlations indicate that individual differences in engaging activities in the dlPFC and premotor cortex when processing contextually congruent multisensory information are negatively associated with individual differences in recruiting cortical activities in sensorimotor cortex when contextual expectations were violated.
a Results regarding the scenario of high plausibility with 10 dB vibrotactile stimulation (n = 35 young adults). b Results regarding the scenario of high plausibility with 36 dB vibrotactile stimulation. Shaded area represents the 95% confidence interval (n = 35 young adults).
Lastly, as an exploratory analysis we correlated individual subjective ratings of perceived plausibility of the vibrotactile stimulations, which were collected in addition to the main experimental task, separately with HbO concentrations in the frontal regions and in the sensorimotor cortex. Significant negative correlations between individual differences in the sensitivity of subjective ratings (i.e., the difference between ratings of high vs. low plausibility scenarios divided by the rating of low plausibility scenario) and HbO concentrations were observed in the frontal regions (rho (34) = −0.45, P = 0.006, N = 36; see Fig. 6a) and in the sensorimotor cortex (rho (34) = −0.34, P = 0.04, N = 36; see Fig. 6b) in the low plausibility scenario with the scene of smooth road surface (highway). No such correlations were observed for the high plausibility scenario with the highway scene or scenarios with the rougher surfaces.
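The rating sensitivity index defined above can be written as a one-line function. The function name and the example ratings are hypothetical and serve only to illustrate the formula (difference between high and low plausibility ratings, divided by the low plausibility rating).

```python
def rating_sensitivity(high_rating: float, low_rating: float) -> float:
    """Relative increase of the high-plausibility rating over the
    low-plausibility rating: (high - low) / low."""
    if low_rating == 0:
        raise ValueError("low_rating must be non-zero")
    return (high_rating - low_rating) / low_rating


# Hypothetical ratings for one participant.
print(rating_sensitivity(70.0, 35.0))  # 1.0
```

A value of 1.0 means the high plausibility scenario was rated twice as plausible as the low plausibility one. Dividing by the low rating makes the index large for participants who rate implausible scenarios close to zero.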
a Results showing HbO concentration in the low plausibility scenario with smooth road surface (highway) in the frontal and premotor regions (n = 35 young adults). b Results showing HbO concentration in the low plausibility scenario with smooth road surface (highway) in the sensorimotor cortex (n = 35 young adults). Shaded area represents the 95% confidence interval.
Combining a naturalistic setup for vehicle riding experiences in virtual multisensory scenarios with assessing brain hemodynamic responses using fNIRS, we identified brain mechanisms associated with the plausibility principle of virtual realism24,25, which operated through multisensory congruence cues and congruency expectations26,27,28,29,53 that were contextually embedded in the virtual environments. Congruency between the experienced intensity of vibrotactile stimulations and the audio-visual information available in the virtual scenarios that also conformed to the individual’s contextual expectation based on prior experiences with similar road types renders perception in VR more plausible. Such confirmed congruence-based plausibility engaged greater activities in cortical regions (dlPFC and premotor cortex) which are important for top-down congruency expectation modulation and planning of cognition and behaviour54, as well as activities in the sensorimotor cortex implicating sensory and motor processing55. Of note, independent of the actual stimulation intensity, greater cortical activities were observed in the sensorimotor cortex for vibrotactile events with congruent audio-visual sensory cues and congruent road scene-based expectations. These results reveal that congruence-based contextual plausibility modulates vibrotactile perceptual representations in the sensorimotor region (see Figs. 2 and 3). Furthermore, negative correlations between individual differences in frontal activities under virtual scenarios supporting multisensory expectancy confirmation and the magnitude of plausibility violation costs in the sensorimotor cortex (see Fig. 5) indicate that those individuals who were more sensitive in engaging frontal activities during confirmed multisensory contextual expectations also yielded weaker perceptual representations in the sensorimotor cortex when the experienced vibrotactile stimulations were of low congruence-based plausibility.
Results from the exploratory analyses revealed correlations between cortical activities and individual differences in subjective ratings of perceptual plausibility. Individuals who showed a greater sensitivity in subjective plausibility ratings also engaged weaker frontal and sensorimotor activities when the sensory events experienced in the VR environment were low in plausibility. This finding, however, was only observed for the road scene with a smooth surface, indicating that while subjective ratings may be related to underlying neurocognitive processes, they discriminate less well between individuals in high plausibility scenarios or on rough road surfaces, where the normally expected stimulation intensities may tend to be high.
Taken together, findings from this study empirically link the construct of plausibility in the research on the design and evaluation of VR technologies24,25 with established psychological and neurocognitive research on mechanisms and effects of expectations on multisensory perception27,28,29,53. However, whether current results can be generalized to more immersive settings that use head-mounted displays in VR space with visual projections on several walls (known as CAVE technologies) still needs to be investigated in future research. Whereas many previous studies on brain mechanisms of expectancy modulation have focused mainly on auditory and visual modalities29,31,41,44,45, the findings of the present study extend these mechanisms to the tactile sense. This study also lends further support to the Bayesian inference framework of embodied and contextually embedded perception2,27,28,29, which formalizes contextual expectations as statistical priors about the properties of the multisensory environment during perception and action.
Evidence from previous studies that used auditory-visual illusions39 to investigate mechanisms of expectations on multisensory perception shows that blood-oxygen-level-dependent (BOLD) responses in the inferior frontal gyrus40 and inferior frontal sulcus29 are sensitive to sensory congruency. The current results of confirmed contextual expectations modulating HbO concentrations in the frontal and sensorimotor regions are only partly consistent with the earlier evidence, since instead of an incongruency-related upregulation of cortical activities, here we observed increased HbO responses during conditions of confirmed contextual expectations. It is important to note, however, that perceptual illusion tasks usually involve competing responses (illusory or non-illusory percept). The potential perceptual and response conflicts in the non-congruent conditions engaged more, instead of less, frontal activities relative to the congruent conditions in such settings. Thus, depending on task-specific demands and response strategies, sensory stimulations that are either congruent or incongruent with expectations could both trigger greater brain responses41,42. Our finding that vibrotactile stimulations congruent with the expectations given by the audio-visual information engaged greater cortical activities is in good agreement with previous results, further reiterating that passive perceptual processing favours congruent over incongruent information29,44 and engages more neural activities in the frontal45 and sensory41 cortices. Furthermore, previous research on visual scene contextual facilitation of object perception yielded findings that are in line with the direction of the effects observed here. Neural responses in the visual cortex during visual object processing were found to be enhanced by embedding objects in expected visual scenes56.
Other than the effects on frontal regions, we also observed that the levels of HbO concentrations in the sensorimotor cortex were higher when the intensities of vibrotactile stimulations felt through the seat were as expected based on the audio-visual information in the virtual environments. Moreover, the plausibility-related upregulation of activities in the sensorimotor cortex was positively correlated with plausibility-enhancing effects in the frontal and premotor regions. This finding parallels a previous result from the visual modality, which revealed correlated cortical effects of scene contextual facilitation in brain regions of visual object processing and in the areas of higher-level expectation-derived scene processing, including the parahippocampal place area and the retrosplenial cortex56. The mere intensity of the stimulation did not directly affect perceptual representations in the sensorimotor cortex. A stronger stimulation level (36 dB) that violated the contextual expectation associated with riding over a smooth road, given the audio-visual information in the virtual scenario of a highway, resulted in a 7-fold weaker response in the sensorimotor cortex compared to the expected weaker stimulation (10 dB). This result extends previous findings on the powerful effects of contextual congruency in changing perceptual representations that were found in auditory, visual, taste, and pain perception27,46,47,48.
The role of the prefrontal cortex in flexibly gating sensory processing for context-dependent or goal-directed behaviour has been well established in animal and human studies57. Here, the observed relationship between frontal activities during virtual scenarios with vibration intensities that conformed with contextual expectations and the expectancy violation costs, expressed as reduced responses in the sensorimotor cortex, is in line with previous research. There is anatomical and functional connectivity that could support frontal regulation of sensory processing. For instance, frontal regulation of sensorimotor processing could be channelled through the cortico-cortical pathway, running from the dlPFC via the premotor cortex and supplementary motor area to the sensorimotor cortex and parietal regions58. Although the PFC does not project directly to the modality-specific sensory regions in the thalamus, it may filter sensory signals indirectly by regulating the basal-ganglia-thalamus pathways59. Early human brain lesion research indicates that the prefrontal cortex may gate somatosensory inputs through a frontal-parietal pathway, as demonstrated by a reduction in somatosensory evoked potentials observed in patients with PFC lesions60. Using a non-invasive brain stimulation technique, a recent study also showed that high-frequency repetitive transcranial stimulation over the dlPFC modulates the sensorimotor cortex’s adaptation to pain perception61.
Thus far, the interdisciplinary exchanges between cognitive neuroscience and virtual reality (VR) technologies have mainly focused on using VR as a means to enhance ecological and dynamic features of experiments. This direction of exchange has helped researchers to set up more naturalistic and well-controlled psychological and neuroscientific experimental studies in the lab7,10,13,14. However, the other direction of exchange, i.e., using understandings about psychological and neurocognitive mechanisms of embodied and embedded multisensory perception to guide the design and construction of VR techniques, is an equally important but still much neglected task. Results from this study along with previous findings from the research on human perception underscore the power of contextual expectations. We perceive what we expect to experience, be it in reality or in VR. Thus, besides focusing on improving engineering solutions for sensor/actuator technologies and software, the effects of contextual expectations on multisensory processing at the perceptual and brain level are other avenues that can be leveraged to optimize VR technologies.
Forty-three healthy young adults (24 males, mean age = 23.86 years, range: 18–30 years) participated in the study. Thirty-eight participants were right-handed as measured by the Edinburgh Handedness Inventory62. All participants provided informed consent before the study and were compensated for their participation. The study was approved by the Ethics Committee of the Technische Universität Dresden (SR-EK-5012021).
The stimuli were chosen to represent naturalistic vibration exposure in daily car riding situations. Therefore, four scenes of a common vehicle driving over different road surfaces (cobblestone, fine cobblestone, tarmac, highway) were recorded at a speed of 50 kph from the perspective of the front seat. The vertical seat vibrations were recorded with a seat pad accelerometer (B&K 4515B) and low-frequency vibrations with a Kistler 8305B10 sensor. Videos were recorded with a Canon EOS 600D camera with an optical image stabilizer lens. Sound effects at the ear of the driver/passenger were recorded with two B&K 2671 microphones attached to the head rest of the seat.
All virtual multisensory scenarios used in the experiment were based on these multimodal scene recordings. The audio-visual recordings originally recorded from the scenes were used, whereas synthesized vibrations (sinusoidal, amplitude-modulated sinusoidal, band-limited white Gaussian noise) were utilized according to previous results33. The synthesized vibrations63 had been previously rated by human users to be of equal plausibility as the recorded vibrations. It is known that vibration levels affect perceived plausibility of audio-visual scenes64. Thus, for each road scene, the acceleration level of the synthesized vibrations was set to 10, 13.25, 16.50, 19.75, 23, 26.25, 29.50, 32.75, and 36 dB in sensation level (i.e., relative to the perceptual threshold of whole-body vibration65) for the subjective rating phase of the study. The maximum acceleration level was constrained by the reproduction system, while the minimum acceleration level was chosen to be clearly perceivable (i.e., above the perceptual threshold). Table 1 shows only the parameters of the synthesized vibrations at the sensation levels used for the virtual scenarios of the main experimental task.
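The nine sensation levels listed above form an evenly spaced grid from 10 to 36 dB in steps of 3.25 dB. The following minimal Python sketch reproduces that grid. The conversion to a linear amplitude ratio assumes the usual 20·log10 convention for acceleration levels, which the text does not state explicitly, and the function names are illustrative only.

```python
def sensation_levels(lo_db=10.0, hi_db=36.0, n=9):
    """Evenly spaced sensation levels in dB relative to the perceptual threshold."""
    step = (hi_db - lo_db) / (n - 1)
    return [round(lo_db + i * step, 2) for i in range(n)]


def amplitude_ratio(level_db):
    """Linear acceleration amplitude relative to the threshold amplitude.

    Assumption: level_db = 20 * log10(a / a_threshold).
    """
    return 10 ** (level_db / 20.0)


levels = sensation_levels()
print(levels)
# Ratio between the strongest and weakest level under the stated assumption.
print(amplitude_ratio(levels[-1] - levels[0]))
```

Under that assumption, the 26 dB span between the weakest and strongest level corresponds to roughly a 20-fold difference in acceleration amplitude.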
We experimentally manipulated the plausibility of vibrotactile stimulations in virtual multisensory road scenes by crossing the vibration levels with the four road scenes of different surface smoothness (the audio-visual scenes of the four virtual scenarios are available in the link provided in the “Data availability” statement). Table 2 shows the levels of the vibrotactile stimulations in the high and low plausibility scenarios for the four road scenes in the main experimental task. The main experimental task thus had a 4 × 2 (Scene × Plausibility) within-subject design. During the main experiment, 15 repetitions of each of the eight plausibility by scene combinations were presented and the participants were instructed to passively perceive the multisensory information in the virtual scenario and reflect on the plausibility of the experienced vibration from the car seat. No explicit response was required.
Aside from the main experimental phase, there was a separate phase of subjective rating. During the rating phase, participants were asked to explicitly rate the plausibility of the vibrotactile stimulations across 9 intensity levels (10, 13.25, 16.50, 19.75, 23, 26.25, 29.50, 32.75, and 36 dB) that were paired only once with each of the four road scenes, resulting in 36 scenarios for ratings. The broad range of vibration levels was used for the rating phase to allow assessments of incremental changes in the subjective rating. The participants provided verbal ratings on a quasi-continuous Rohrmann scale with possible values from 0 to 100 (refs. 33,63,64,66) with equidistant verbal anchors at 0 (“not at all” plausible), 25 (“slightly” plausible), 50 (“moderately” plausible), 75 (“very” plausible) and 100 (“extremely” plausible). The subjective ratings validated the anticipated difference between the high and low plausibility conditions.
The audio-visual-tactile stimuli described in the previous section were presented as multimodal virtual scenarios in the multimodal measurement laboratory described in a previous study67. Optical reproduction was achieved with a Full-HD projector on a screen with a diagonal of 300 cm at a distance of 340 cm from the participants. The acoustic reproduction was realized with a wavefield synthesis system consisting of 464 individually controllable speakers. Such a system recreates the wavefield originally produced by a recorded sound source. Compared to headphone reproduction, it does not have shortcomings such as in-head localization of sounds, which could potentially break immersion. Furthermore, it does not rely on the phantom source effect as utilized in stereo setups. Thus, the audio reproduction of road scene recordings is insensitive to unwanted changes in the perceived direction of sounds due to head movements. Finally, vibrations of the seat were reproduced by two systems to cover the entire perceivable frequency range of everyday life vibrations, spanning from 1 to 500 Hz. Low-frequency vibrations below 15 Hz were presented with a hydraulic motion platform and high-frequency vibrations with an electrodynamic shaker attached to the surface of the seat. Due to the current study’s focus on vibration, reproduction was calibrated for each participant to take into account individual differences in weight or body height. The transfer function for each participant was measured prior to the experiment. Subsequently, it was compensated with an FIR filter68 to ensure identical vibration reproduction for each participant. The multimodal reproduction system used as the virtual environment in our experiment is shown in Fig. 1a, b.
All participants had normal or corrected-to-normal vision and handedness was assessed using the Edinburgh Handedness Inventory62. Before the study, two commonly used psychometric tests, i.e., the Identical Pictures Test69 and the Spot-the-Word Test70, were used to assess participants’ basic cognitive speed and verbal ability. Afterwards, all participants underwent the main experimental phase and the subjective rating phase. The order of these two parts of the study was counterbalanced across participants. In the main experiment, an event-related study design was used to measure brain hemodynamic responses using fNIRS. Participants passively experienced the multisensory virtual scenarios of car riding on four different road surfaces with concurrent vibrotactile stimulations (vibrations of the seat) of different intensities (see Table 2). Each of the eight virtual scenarios was displayed for a duration of 4 s, while the inter-trial interval (ITI) was jittered according to the following formula:
In this equation, TLoad refers to the time it takes for a given scene to load, TScene refers to the duration of the scene, TTransition refers to the time taken to progress to the next stimulus and TRandom(Geometric) refers to a random number generated according to a geometric distribution. The mean inter-trial interval (ITI) was 14.60 s (ranging from 13.65 to 17.08 s). Each of the 8 virtual scenarios (Table 2) was presented 15 times, amounting to a total of 120 trials. The presentation order of virtual scenarios was randomized for each participant.
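The trial structure described here (8 scenarios, 15 repetitions each, randomized per participant, with a geometrically distributed jitter term) can be illustrated with a small Python sketch. The seed handling and the jitter parameters `p` and `step_s` are hypothetical placeholders, since the text does not report how the randomization or the geometric draw was implemented.

```python
import random

SCENES = ["cobblestone", "fine cobblestone", "tarmac", "highway"]
PLAUSIBILITY = ["high", "low"]
REPETITIONS = 15


def build_trial_order(seed):
    """Return a shuffled list of (scene, plausibility) trials for one participant."""
    trials = [(s, p) for s in SCENES for p in PLAUSIBILITY] * REPETITIONS
    rng = random.Random(seed)  # hypothetical per-participant seed
    rng.shuffle(trials)
    return trials


def geometric_jitter(rng, p=0.5, step_s=1.0):
    """Draw a jitter duration in seconds from a geometric distribution.

    p and step_s are placeholders; the study does not report them here.
    """
    k = 1  # number of Bernoulli draws until the first success
    while rng.random() >= p:
        k += 1
    return k * step_s


order = build_trial_order(seed=1)
print(len(order))  # 120 trials in total
```

Using a dedicated `random.Random` instance per participant keeps each order reproducible without touching the global random state.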
The concentrations of oxygenated and deoxygenated haemoglobin (HbO and HbR, respectively) were collected using the continuous-wave, battery-operated fNIRS system NIRSport 2 (NIRx Medical Technologies, LLC, USA). This system employs two distinct wavelengths (i.e., 760 and 850 nm) with a sampling rate of 4.98 Hz. The NIRS probe was designed using the fNIRS Optodes Location Decider (fOLD v2.2; ref. 71), which is a MATLAB-based toolbox that computes optimal optode placement in the 10-10 system in relation to specific brain areas. The Brodmann anatomical atlas was used to localize the optodes to brain regions corresponding to the dorsolateral prefrontal cortex (BA 9), premotor and supplementary motor cortex (BA 6), primary somatosensory cortex (BA 1, BA 2, BA 3), and primary motor cortex (BA 4). For coverage of these areas, the source positions were located at positions AF3, AF4, F3, F2, F4, FC5, FC1, FC2, FC6, C3, Cz, C4, CP1, CP2 while the detector positions were located at AFz, F1, F2, FT7, FC3, FCz, FC4, FT8, C5, C1, C2, C6, CP3, CPz, and CP4 (see Fig. 1c for the probe arrangement). The sensitivity profile of the measurement channels (Fig. 1d) shows visually whether the optical density changes measured by each source-detector pair of our probe design (montage) are sufficiently sensitive to changes in the absorption coefficients of the cortex in the regions of interest. As shown in Fig. 1d, our montage resulted in a high sensitivity profile that covers the brain regions of interest.
The fNIRS montage we used consisted of 36 long source-detector separation channels (with an inter-optode distance of approximately 30 mm, maintained with linked optode holders). Of the 36 channels, 25 recorded brain hemodynamic responses from the prefrontal cortex and premotor regions, while the remaining 11 recorded activity over the motor and somatosensory cortex. No short-separation channels were used in this study, because the whole-body vibration experienced by participants might lead to artifactual short-distance measurements that could transfer noise into the signal of interest72,73. We acknowledge, however, that unwanted physiological confounds, potentially due to movement, may be present; these were removed using principal component analysis, which has been shown to yield results comparable to short-channel separation techniques74 (see below for details on pre-processing).
Out of the 43 participants who participated in the study, fNIRS data from 4 participants were not obtained due to technical difficulties. Pre-processing was conducted using the Homer3 Toolbox75 (BUNPC), while the hemodynamic data were reconstructed using the AtlasViewer Toolbox76. During pre-processing, bad channels were first identified and excluded using the hmrR_PruneChannels function with the signal-to-noise threshold set to the common criterion of 6.67 (equivalent to a coefficient of variation (CV) of 15%; SNR = (1/CV) × 100). Datasets that contained more than 25% bad channels (i.e., channels failing the CV criterion of 15%; refs. 77,78) were removed from further analysis (N = 2). The raw signals of light intensity were converted into changes in optical density using the hmrR_Intensity2OD function. Afterwards, using the hmrR_MotionArtifactbyChannel function, motion artefacts were identified as signal changes in optical density units greater than an amplitude of 0.3 over half a second and marked for one second. Outliers due to motion artefacts, identified as wavelet coefficients exceeding 1.5 times the interquartile range, were then corrected using the hmrR_MotionCorrectWavelet function79. Wavelet filtering has been reported to be among the most effective approaches for correcting motion artefacts, reducing them in up to 93% of cases80,81. Trials that still contained uncorrected motion artefacts in the time range from 2 s before to 4 s after stimulus onset were rejected using the hmrR_StimRejection function. Datasets with more than 20% of rejected trials were removed from the analyses (N = 1). Thus, altogether, the effective dataset for analyses of fNIRS data contained data from 36 participants.
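The channel-pruning criterion can be made concrete with a short Python sketch. It only illustrates the stated relation between SNR and CV (with CV in percent, SNR = 100/CV = mean/SD) and the 25% dataset-exclusion rule; it does not reproduce the internals of Homer3's hmrR_PruneChannels, and the function and variable names are illustrative.

```python
from statistics import mean, stdev

SNR_THRESHOLD = 6.67      # equivalent to CV = 15%
MAX_BAD_FRACTION = 0.25   # datasets above this fraction are excluded


def snr(intensity):
    """SNR as defined in the text: 100 / CV(%), which equals mean / SD."""
    return mean(intensity) / stdev(intensity)


def prune(channels):
    """Return (bad_channel_names, keep_dataset) for a dict of name -> samples."""
    bad = [name for name, x in channels.items() if snr(x) < SNR_THRESHOLD]
    keep = len(bad) / len(channels) <= MAX_BAD_FRACTION
    return bad, keep


# Toy data: four stable channels and one noisy channel (values are made up).
stable = [1.00, 1.01, 0.99, 1.00, 1.02]
noisy = [1.0, 0.5, 1.5, 0.7, 1.3]
channels = {"ch1": stable, "ch2": stable, "ch3": stable, "ch4": stable, "ch5": noisy}
bad, keep = prune(channels)
print(bad, keep)
```

In this toy case one of five channels is flagged (20%), so the dataset would be retained under the 25% rule.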
Subsequently, a low-pass filter with a cutoff of 0.5 Hz was used to remove high-frequency components (e.g., instrument noise) yet retain the brain response of interest82. This frequency range, however, may still include physiological artifacts (e.g., those induced by respiration, at around 0.3 Hz, and spontaneous oscillations in arterial blood pressure at around 0.1 Hz, known as Mayer waves). Since our setup only consisted of long-separation channels, principal component analysis (hmrR_PCAFilter, nSV = 1.0) was used to mitigate physiological confounds and interferences from superficial layers of the scalp, skin, and skull83,84. In setups without short-separation channels, principal component analysis can yield results comparable to those obtained with short-separation channels74. In the next step, the pre-processed and filtered optical density was converted into hemodynamic concentration values using the hmrR_OD2Conc function based on the modified Beer-Lambert law. Finally, the hemodynamic response function (HRF) was estimated with a general linear model (GLM) using ordinary least squares85. The HRF was modelled with a consecutive sequence of Gaussian functions with a standard deviation of 0.5 s and means separated by 0.5 s (glmSolveMethod = 1, idxBasis = 1, paramsBasis = [0.5, 0.5]). The regression time ranged from −2 to 12 s, using the pre-stimulus time of 2 s for baseline correction to account for intra- and inter-individual differences in time-dependent changes in cerebral oxygenation. Baseline drift was modelled using a third-order polynomial fit86,87. For the purpose of further statistical analysis, baseline-corrected mean concentrations of HbO and HbR from 1 to 5 s after trial onset were extracted. This time window was selected because our stimuli (i.e., the multisensory virtual scenarios) lasted for 4 s and because it includes the peak of the HRF, which typically occurs around 4 to 5 s.
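To illustrate the final extraction step only, the sketch below averages a single-trial time course over the 1 to 5 s window and subtracts the mean of the 2 s pre-stimulus baseline. It is a simplified stand-in, not the GLM-based HRF estimate produced by Homer3; the function names and the assumption that the time course starts at −2 s are illustrative.

```python
FS_HZ = 4.98  # sampling rate reported for the fNIRS system


def window_mean(signal, t0_s, start_s, end_s, fs=FS_HZ):
    """Mean of the samples whose time lies in [start_s, end_s].

    signal[i] is taken to be sampled at t0_s + i / fs seconds
    relative to trial onset.
    """
    values = [v for i, v in enumerate(signal)
              if start_s <= t0_s + i / fs <= end_s]
    return sum(values) / len(values)


def baseline_corrected_mean(signal, t0_s=-2.0):
    """Mean in the 1-5 s window minus the mean of the 2 s before onset."""
    baseline = window_mean(signal, t0_s, -2.0, 0.0)
    return window_mean(signal, t0_s, 1.0, 5.0) - baseline
```

Selecting samples by their time stamp rather than by a fixed index range avoids off-by-one errors when the sampling rate is not an integer.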
Since the concentration of HbO has been consistently shown to be more reliable than HbR in reflecting task-induced, event-related cortical responses50,88,89,90, below we only focused on reporting results of HbO concentrations in the main section of this paper.
In order to obtain the cortical topography of brain activity, an atlas head model was registered to the participant’s head via digitized points of sources and detectors, which allows for a more accurate estimation of the location of brain activation. Image reconstruction of the changes in absorption coefficient in the cortex is possible given experimentally measured changes in optical density and the sensitivity profile (i.e., forward matrix)76. Briefly, the inverse problem can be solved by inverting the forward matrix. Thus, image reconstruction can be accomplished using the following equation:
In this equation, x refers to the spatial distribution of HbO or HbR absorption perturbation, A refers to the forward matrix/sensitivity profile of the registered atlas, which is obtained using Monte Carlo photon migration simulation, λ refers to the scalar regularization parameter (for which we used the default value of α = 0.01), I refers to the identity matrix, and y refers to the vector of measurements, which is provided as optical density changes.
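For orientation, a regularized inverse that is consistent with the symbols defined here (x, A, λ, I, y) can be written in the standard Tikhonov form below. This is an assumed textbook form given for illustration, not necessarily the exact expression used by the authors; how the default α = 0.01 is mapped onto λ is not specified in the text and is left open.

```latex
x = A^{\mathsf{T}} \left( A A^{\mathsf{T}} + \lambda I \right)^{-1} y
```

With λ = 0 and A A^T invertible this reduces to the minimum-norm solution of y = A x; a positive λ stabilizes the inversion against measurement noise at the cost of spatial smoothing.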
Statistical analysis of the fNIRS data was conducted using MATLAB R2018b and RStudio (R 4.1.1). HbO concentration changes were analyzed with linear mixed-effects models using the lme function from the nlme package in R. To investigate whether HbO concentration changed with regard to the different road scenes and plausibility levels, linear mixed-effects models were calculated with maximum likelihood estimation, with Scene (cobblestone, fine cobblestone, tarmac, highway) and Plausibility (low, high) as within-subjects fixed effects; participants were included as random intercepts with the NIRS channels nested in participants to account for between-subject variability in hemodynamic concentration changes across the channels78,91,92,93.
To investigate contextual expectancy modulation of cortical activities during extremely rough or smooth road scenes with high (36 dB) or low (10 dB) levels of vibrotactile stimulation, we then focused on the scenes with the largest difference in sensation level between the most plausible and least plausible scenarios, namely, the Highway and Cobblestone scenes. For each of the two scenes, linear mixed-effects models were analyzed with Region of interest (two levels: the frontal attentional control and action planning region (dlPFC & premotor) and the sensorimotor region (somatosensory & motor)) and Plausibility (High, Low) as fixed effects and subjects (with channels nested in subjects) as random effects. To compare the effects of contextual expectancy in modulating perceptual representations of vibrotactile stimulation across scenes, linear mixed-effects models were analyzed using Scene (Highway, Cobblestone) and Intensity (10 dB and 36 dB) as fixed effects and subjects (with channels nested in subjects) as random effects.
For all the linear mixed-effects model analyses above, the normality of residuals was inspected using the Kolmogorov–Smirnov test. Where the residuals were not normally distributed, robust permutation tests were carried out in the following way. First, a linear mixed-effects model was fitted using the lmer function from the lme4 package94. If there were convergence issues, the random effects structure was simplified. Permutation tests were conducted on these models using PERMANOVA (number of permutations = 5000) from the ‘predictmeans’ package in R95. Since the initial models revealed similar results to the findings from models with the permutation test, we report findings from the initial standard linear mixed-effects models, with the factors ‘Scene’ and ‘Plausibility’ as fixed effects and participants as random intercepts with NIRS channels nested in participants. For the main effects and interactions, we report partial eta-squared (ηp2) for effect size96,97 with the following interpretation: ηp2 = 0.01 (small), ηp2 = 0.06 (medium), ηp2 = 0.14 (large). Main effects and interaction effects were followed up by post hoc multiple comparisons with Bonferroni correction using the emmeans package in R98. For these post hoc analyses, adjusted P-values are reported, and Cohen’s d is reported as effect size with the following interpretation: d = 0.2 (small), d = 0.5 (medium), d = 0.8 (large).
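The Bonferroni adjustment and the verbal effect-size labels quoted above can be sketched in a few lines of Python. This is an illustration of the conventions only, not the emmeans implementation; the "negligible" label for values below the "small" benchmark is an addition made here for completeness.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def label_effect(value, small, medium, large):
    """Map an effect size onto the verbal labels quoted in the text."""
    size = abs(value)
    if size >= large:
        return "large"
    if size >= medium:
        return "medium"
    if size >= small:
        return "small"
    return "negligible"  # label not used in the text; added for completeness


def label_eta_p2(eta_p2):
    """Partial eta-squared benchmarks: 0.01, 0.06, 0.14."""
    return label_effect(eta_p2, 0.01, 0.06, 0.14)


def label_cohen_d(d):
    """Cohen's d benchmarks: 0.2, 0.5, 0.8."""
    return label_effect(d, 0.2, 0.5, 0.8)


adjusted = bonferroni([0.01, 0.04, 0.5])
print(adjusted)
```

Taking the absolute value before labelling means that the sign of Cohen's d, which only encodes the direction of the contrast, does not affect the label.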
To account for potential order effects (main experiment phase before rating phase, or the other way around), we also included order as a covariate in our analyses; however, using order as a covariate did not change the results. Therefore, we report findings from the original analyses without any covariate.
Negative expectancy violation cost was calculated by subtracting HbO concentration in the Highway High Plausibility condition from the HbO concentration in the Cobblestone Low Plausibility condition (i.e., HbO(Cobblestone, Low Plausibility, 10 dB) − HbO(Highway, High Plausibility, 10 dB)), whereas positive expectancy violation cost was calculated by subtracting HbO concentration in the Cobblestone High Plausibility condition from the HbO concentration in the Highway Low Plausibility condition (i.e., HbO(Highway, Low Plausibility, 36 dB) − HbO(Cobblestone, High Plausibility, 36 dB)). For the correlational analyses, HbO concentration changes were averaged across all channels in the frontal attentional control and action planning regions (dlPFC & premotor) and in the sensorimotor regions (somatosensory & motor). Outliers that were 3 standard deviations away from the mean were removed to avoid spurious correlations (N = 35 for the correlation between negative plausibility violation cost in the frontal and negative plausibility violation cost in the sensorimotor regions (Fig. 4a); N = 35 for the correlation between negative plausibility violation cost in the sensorimotor cortex and average HbO concentration in the frontal regions of the high plausibility highway scenario (Fig. 5a); N = 35 for the correlation between positive plausibility violation cost in the sensorimotor cortex and average HbO concentration in the frontal regions of the high plausibility cobblestone scenario (Fig. 5b)). Correlational analyses were conducted using Pearson’s correlation. Where the data were not normally distributed as indicated by the Shapiro–Wilk test of normality, Spearman’s correlation was used.
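The two violation costs and the subsequent outlier screening and correlation can be sketched as follows. This is a minimal Python illustration under stated assumptions: the dictionary layout and function names are hypothetical, outliers are dropped pairwise, and only the Pearson case is shown (the normality check and the Spearman fallback are omitted).

```python
from statistics import mean, stdev


def violation_costs(hbo):
    """hbo maps (scene, plausibility) -> mean HbO for one participant."""
    # Both terms of the negative cost were stimulated at 10 dB.
    negative = hbo[("cobblestone", "low")] - hbo[("highway", "high")]
    # Both terms of the positive cost were stimulated at 36 dB.
    positive = hbo[("highway", "low")] - hbo[("cobblestone", "high")]
    return negative, positive


def drop_outlier_pairs(xs, ys, n_sd=3.0):
    """Keep pairs in which both values lie within n_sd SDs of their mean."""
    mx, sx, my, sy = mean(xs), stdev(xs), mean(ys), stdev(ys)
    return [(x, y) for x, y in zip(xs, ys)
            if abs(x - mx) <= n_sd * sx and abs(y - my) <= n_sd * sy]


def pearson_r(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in pairs)
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den
```

Because each cost contrasts two conditions with the same physical intensity, any difference reflects the scene context rather than the stimulation level itself.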
The participants’ subjective ratings were collected across 9 vibration levels (10, 13.25, 16.50, 19.75, 23, 26.25, 29.50, 32.75, and 36 dB) that were each paired once with each of the four road scenes. The ratings showed that indeed across the four scenes participants felt the vibrotactile stimulations to be more plausible in the scenarios experimentally defined to be of high plausibility (mean ratings ranged from 59.93 to 79.12) than in the low-plausibility scenarios (mean ratings ranged from 4.46 to 56.05). The plausibility ratio for each scene was calculated by dividing the difference between the high and low plausibility ratings for that scene by the low plausibility rating. Because ratings could be 0 on the 0-to-100 scale, a value of 1 was added to the denominator to avoid division by zero. Thus, the plausibility ratings ratio was calculated as follows: (Rating_high − Rating_low)/(Rating_low + 1). Outliers that were 3 standard deviations away from the mean were removed. To validate whether our highly plausible scenes were indeed perceived to be more plausible than the least plausible scenes, we used linear mixed-effects models to analyze participants’ plausibility ratings data with Scene (Cobblestone, Fine Cobblestone, Tarmac, Highway) and Plausibility (Low, High) as within-subjects fixed effects and subjects as random effects. As above, where residuals were not normally distributed, ratings data were analyzed using robust permutation tests.
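The plausibility ratio defined above is simple enough to check by hand; the short Python sketch below encodes it directly. The example ratings are made up for illustration and are not taken from the study.

```python
def plausibility_ratio(rating_high, rating_low):
    """(high - low) / (low + 1); the +1 keeps the ratio finite when low is 0."""
    return (rating_high - rating_low) / (rating_low + 1)


# Hypothetical ratings on the 0-100 scale.
print(plausibility_ratio(75, 24))  # 51 / 25
print(plausibility_ratio(80, 0))   # 80 / 1
```

A larger ratio indicates a participant whose ratings separate the high and low plausibility scenarios more sharply relative to the low plausibility rating.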
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
The anonymised data analyzed in this paper and stimuli are available at the following Open Science Framework repository link: https://osf.io/wpn6e/?view_only=cbcd4489f4b847a6a0fd642ef999f3e2.
The code used for analyzing the data is available at the Open Science Framework repository link: https://osf.io/wpn6e/?view_only=cbcd4489f4b847a6a0fd642ef999f3e2.
Clark, A. The Cambridge Handbook of Cognitive Sciences (eds Frank, K. & Ramsey, W.) 275–291 (Cambridge University Press, 2012).
Linson, A., Clark, A., Ramamoorthy, S. & Friston, K. The active inference approach to ecological perception: general information dynamics for natural and artificial embodied cognition. Front. Robot. AI 5, 21 (2018).
Quartz, S. R. The constructivist brain. Trends Cogn. Sci. 3, 48–57 (1999).
Li, S.-C. Biocultural orchestration of developmental plasticity across levels: the interplay of biology and culture in shaping the mind and behavior across the life span. Psychol. Bull. 129, 171–194 (2003).
Eden, J. et al. Principles of human movement augmentation and the challenges in making it a reality. Nat. Commun. 13, 1–13 (2022).
O’Connor, M. et al. Sampling molecular conformations and dynamics in a multiuser virtual reality framework. Sci. Adv. 4, eaat2731 (2018).
Bellmund, J. L. et al. Deforming the metric of cognitive maps distorts memory. Nat. Hum. Behav. 4, 177–188 (2020).
Brookes, J., Warburton, M., Alghadier, M., Mon-Williams, M. & Mushtaq, F. Studying human behavior with virtual reality: the Unity Experiment Framework. Behav. Res. Methods 52, 455–463 (2020).
Draschkow, D., Nobre, A. C. & van Ede, F. Multiple spatial frames for immersive working memory. Nat. Hum. Behav. 6, 536–544 (2022).
Hofmann, S. M. et al. Decoding subjective emotional arousal from EEG during an immersive virtual reality experience. Elife 10, e64812 (2021).
Bohbot, V. D., Copara, M. S., Gotman, J. & Ekstrom, A. D. Low-frequency theta oscillations in the human hippocampus during real-world and virtual navigation. Nat. Commun. 8, 1–7 (2017).
Donato, F. & Moser, E. I. A world away from reality. Nature 533, 325–326 (2016).
Matusz, P. J., Dikker, S., Huth, A. G. & Perrodin, C. Are we ready for real-world neuroscience? J. Cogn. Neurosci. 31, 327–338 (2019).
Shamay-Tsoory, S. G. & Mendelsohn, A. Real-life neuroscience: an ecological approach to brain and behavior research. Perspect. Psychol. Sci. 14, 841–859 (2019).
Obrist, M., Ranasinghe, N. & Spence, C. Multisensory human–computer interaction. Int. J. Hum. Comput. Stud. 107, 1–4 (2017).
Melo, M. et al. Immersive multisensory virtual reality technologies for virtual tourism. Multimed. Syst. 28, 1027–1037 (2022).
Aijaz, A., Simsek, M., Dohler, M. & Fettweis, G. 5G Mobile Communications 677–691 (Springer, 2017).
Fitzek, F. H. et al. Tactile internet: With Human-in-the-Loop (Academic Press, 2021).
Muschter, E. et al. Perceptual quality assessment of compressed vibrotactile signals through comparative judgment. IEEE Trans. Haptics 14, 291–296 (2021).
Yang, Y. & Zador, A. M. Differences in sensitivity to neural timing among cortical areas. J. Neurosci. 32, 15142–15147 (2012).
Stein, B. E., Stanford, T. R. & Rowland, B. A. Development of multisensory integration from the perspective of the individual neuron. Nat. Rev. Neurosci. 15, 520–535 (2014).
Li, S.-C., Muschter, E., Limanowski, J. & Hazipanayioti, A. Human perception and neurocognitive development across the lifespan. In Tactile Internet: with Human in the Loop (eds. Fitzek, F. H. et al.) 199–221 (Academic Press, 2021).
Schirner, G., Erdogmus, D., Chowdhury, K. & Padir, T. The future of human-in-the-loop cyber-physical systems. Computer 46, 36–45 (2013).
Slater, M. Place illusion and plausibility can lead to realistic behaviour in immersive virtual environments. Philos. Trans. R. Soc. B Biol. Sci. 364, 3549–3557 (2009).
Slater, M. & Sanchez-Vives, M. V. Transcending the self in immersive virtual reality. Computer 47, 24–30 (2014).
Helmholtz, H. (eds Warren, R. M. & Warren, R. P.) Helmholtz on Perception: Its Physiology and Development 49 (John Wiley & Sons, 1968).
Ernst, M. O. & Bülthoff, H. H. Merging the senses into a robust percept. Trends Cogn. Sci. 8, 162–169 (2004).
de Lange, F. P., Heilbron, M. & Kok, P. How do expectations shape perception? Trends Cogn. Sci. 22, 764–779 (2018).
Gau, R. & Noppeney, U. How prior expectations shape multisensory perception. Neuroimage 124, 876–886 (2016).
Chen, Y. C. & Spence, C. When hearing the bark helps to identify the dog: Semantically-congruent sounds modulate the identification of masked pictures. Cognition 114, 389–404 (2010).
Doehrmann, O. & Naumer, M. J. Semantics and the multisensory brain: how meaning modulates processes of audio-visual integration. Brain Res. 1242, 136–150 (2008).
Spence, C. Multisensory flavor perception. Cell 161, 24–35 (2015).
Rosenkranz, R. & Altinsoy, M. E. Tactile design: Translating user expectations into vibration for plausible virtual environments. In IEEE World Haptics Conference (WHC) 307–312 (2019).
Yannakakis, G. N. & Martínez, H. P. Ratings are overrated! Front. ICT 2, 13 (2015).
Skarbez, R., Neyret, S., Brooks, F. P., Slater, M. & Whitton, M. C. A psychophysical experiment regarding components of the plausibility illusion. IEEE Trans. Vis. Comput. Graph. 23, 1369–1378 (2017).
de Lafuente, V. & Romo, R. Neural correlate of subjective sensory experience gradually builds up across cortical areas. Proc. Natl Acad. Sci. USA 103, 14266–14271 (2006).
Romo, R. & Rossi-Pool, R. Turning touch into perception. Neuron 105, 16–33 (2020).
Driver, J. & Noesselt, T. Multisensory interplay reveals crossmodal influences on ‘sensory-specific’ brain regions, neural responses, and judgments. Neuron 57, 11–23 (2008).
McGurk, H. & MacDonald, J. Hearing lips and seeing voices. Nature 264, 746–748 (1976).
Nath, A. R. & Beauchamp, M. S. A neural basis for interindividual differences in the McGurk effect, a multisensory speech illusion. Neuroimage 59, 781–787 (2012).
van Atteveldt, N. M., Formisano, E., Goebel, R. & Blomert, L. Top–down task effects overrule automatic multisensory responses to letter–sound pairs in auditory association cortex. Neuroimage 36, 1345–1360 (2007).
Diaconescu, A. O., Alain, C. & McIntosh, A. R. The co-occurrence of multisensory facilitation and cross-modal conflict in the human brain. J. Neurophysiol. 106, 2896–2909 (2011).
de Gelder, B. & Bertelson, P. Multisensory integration, perception, and ecological validity. Trends Cogn. Sci. 7, 460–467 (2003).
Laurienti, P. J., Kraft, R. A., Maldjian, J. A., Burdette, J. H. & Wallace, M. T. Semantic congruence is a critical factor in multisensory behavioral performance. Exp. Brain Res. 158, 405–414 (2004).
Laurienti, P. J. et al. Cross-modal sensory processing in the anterior cingulate and medial prefrontal cortices. Hum. Brain Mapp. 19, 213–223 (2003).
Spence, C., Levitan, C. A., Shankar, M. U. & Zampini, M. Does food color influence taste and flavor perception in humans? Chemosens. Percept. 3, 68–84 (2010).
Woods, A. T. et al. Expected taste intensity affects response to sweet drinks in primary taste cortex. Neuroreport 22, 365–369 (2011).
Wager, T. D. et al. Placebo-induced changes in fMRI in the anticipation and experience of pain. Science 303, 1162–1167 (2004).
Lakens, D. Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Front. Psychol. 4, 863 (2013).
Plichta, M. M. et al. Event-related functional near-infrared spectroscopy (fNIRS): are the measurements reliable? Neuroimage 31, 116–124 (2006).
Koechlin, E., Corrado, G., Pietrini, P. & Grafman, J. Dissociating the role of the medial and lateral anterior prefrontal cortex in human planning. Proc. Natl Acad. Sci. USA 97, 7651–7656 (2000).
Noppeney, U., Josephs, O., Hocking, J., Price, C. J. & Friston, K. J. The effect of prior visual information on recognition of speech and sounds. Cereb. Cortex 18, 598–609 (2008).
Deroy, O., Spence, C. & Noppeney, U. Metacognition in multisensory perception. Trends Cogn. Sci. 20, 736–747 (2016).
Badre, D. & Nee, D. E. Frontal cortex and the hierarchical control of behavior. Trends Cogn. Sci. 22, 170–188 (2018).
Melnik, A., Hairston, W. D., Ferris, D. P. & König, P. EEG correlates of sensorimotor processing: independent components involved in sensory and motor processing. Sci. Rep. 7, 1–15 (2017).
Brandman, T. & Peelen, M. V. Interaction between scene and object processing revealed by human fMRI and MEG decoding. J. Neurosci. 37, 7700–7710 (2017).
Miller, E. K. & Cohen, J. D. An integrative theory of prefrontal cortex function. Annu. Rev. Neurosci. 24, 167–202 (2001).
Genon, S. et al. The right dorsal premotor mosaic: organization, functions, and connectivity. Cereb. Cortex 27, 2095–2110 (2017).
Nakajima, M., Schmitt, L. I. & Halassa, M. M. Prefrontal cortex regulates sensory filtering through a basal ganglia-to-thalamus pathway. Neuron 103, 445–458 (2019).
Yamaguchi, S. & Knight, R. Gating of somatosensory input by human prefrontal cortex. Brain Res. 521, 281–288 (1990).
de Martino, E., Seminowicz, D. A., Schabrun, S. M., Petrini, L. & Graven-Nielsen, T. High frequency repetitive transcranial magnetic stimulation to the left dorsolateral prefrontal cortex modulates sensorimotor cortex function in the transition to sustained muscle pain. NeuroImage 186, 93–102 (2019).
Oldfield, R. C. The assessment and analysis of handedness: the Edinburgh inventory. Neuropsychologia 9, 97–113 (1971).
Rosenkranz, R. & Altinsoy, M. E. Mapping the sensory-perceptual space of vibration for user-centered intuitive tactile design. IEEE Trans. Haptics 14, 95–108 (2020).
Sakamoto, S., Ohtani, T., Suzuki, Y. & Gyoba, J. Effects of vibration information on the senses of presence and verisimilitude of audio-visual scenes. In INTER-NOISE and NOISE-CON Congress and Conference Proceedings Vol. 253 4890–4895 (Institute of Noise Control Engineering, 2016).
Morioka, M. & Griffin, M. J. Absolute thresholds for the perception of fore-and-aft, lateral, and vertical vibration at the hand, the seat, and the foot. J. Sound Vib. 314, 357–370 (2008).
Rohrmann, B. Verbal Qualifiers for Rating Scales: Sociolinguistic Considerations and Psychometric Data. Project Report, University of Melbourne, Australia. www.rohrmannresearch.net/pdfs/rohrmann-vqs-report.pdf (2007).
Altinsoy, M. E., Jekosch, U., Landgraf, J. & Merchel, S. Progress in Auditory Perception Research Laboratories—Multimodal Measurement Laboratory of Dresden University of Technology. In Audio Engineering Society Convention 129 (Audio Engineering Society, 2010).
Altinsoy, M. E. & Merchel, S. BRTF (body-related transfer function) and whole-body vibration reproduction systems. In Audio Engineering Society Convention 130 (Audio Engineering Society, 2011).
Lindenberger, U. & Baltes, P. B. Intellectual functioning in old and very old age: cross-sectional results from the Berlin Aging Study. Psychol. Aging 12, 410 (1997).
Baddeley, A., Emslie, H. & Nimmo-Smith, I. The spot-the-word test: a robust estimate of verbal intelligence based on lexical decision. Br. J. Clin. Psychol. 32, 55–65 (1993).
Zimeo Morais, G. A., Balardin, J. B. & Sato, J. R. fNIRS Optodes’ Location Decider (fOLD): a toolbox for probe arrangement guided by brain regions-of-interest. Sci. Rep. 8, 1–11 (2018).
Santosa, H., Aarabi, A., Perlman, S. B. & Huppert, T. Characterization and correction of the false-discovery rates in resting state connectivity using functional near-infrared spectroscopy. J. Biomed. Opt. 22, 055002 (2017).
Zhou, X., Sobczak, G., McKay, C. M. & Litovsky, R. Y. Comparing fNIRS signal qualities between approaches with and without short channels. PLoS One 15, e0244186 (2020).
Noah, J. A. et al. Comparison of short-channel separation and spatial domain filtering for removal of non-neural components in functional near-infrared spectroscopy signals. Neurophotonics 8, 015004 (2021).
Huppert, T. J., Diamond, S. G., Franceschini, M. A. & Boas, D. A. HomER: a review of time-series analysis methods for near-infrared spectroscopy of the brain. Appl. Opt. 48, D280–D298 (2009).
Aasted, C. M. et al. Anatomical guidance for functional near-infrared spectroscopy: AtlasViewer tutorial. Neurophotonics 2, 020801 (2015).
Piper, S. K. et al. A wearable multi-channel fNIRS system for brain imaging in freely moving subjects. Neuroimage 85, 64–71 (2014).
Schommartz, I., Dix, A., Passow, S. & Li, S.-C. Functional effects of bilateral dorsolateral prefrontal cortex modulation during sequential decision-making: a functional near-infrared spectroscopy study with offline transcranial direct current stimulation. Front. Hum. Neurosci. 14, 619 (2021).
Molavi, B. & Dumont, G. A. Wavelet-based motion artifact removal for functional near-infrared spectroscopy. Physiol. Meas. 33, 259 (2012).
Cooper, R. et al. A systematic comparison of motion artifact correction techniques for functional near-infrared spectroscopy. Front. Neurosci. 6, 147 (2012).
Brigadoi, S. et al. Motion artifacts in functional near-infrared spectroscopy: a comparison of motion correction techniques applied to real cognitive data. Neuroimage 85, 181–191 (2014).
Yücel, M. A. et al. Best practices for fNIRS publications. Neurophotonics 8, 012101 (2021).
Virtanen, J., Noponen, T. E. & Meriläinen, P. Comparison of principal and independent component analysis in removing extracerebral interference from near-infrared spectroscopy signals. J. Biomed. Opt. 14, 054032 (2009).
Zhang, Y., Brooks, D. H., Franceschini, M. A. & Boas, D. A. Eigenvector-based spatial filtering for reduction of physiological interference in diffuse optical imaging. J. Biomed. Opt. 10, 011014 (2005).
Ye, J. C., Tak, S., Jang, K. E., Jung, J. & Jang, J. NIRS-SPM: statistical parametric mapping for near-infrared spectroscopy. Neuroimage 44, 428–447 (2009).
von Lühmann, A., Li, X., Müller, K.-R., Boas, D. A. & Yücel, M. A. Improved physiological noise regression in fNIRS: a multimodal extension of the general linear model using temporally embedded canonical correlation analysis. NeuroImage 208, 116472 (2020).
Jahani, S., Setarehdan, S. K., Boas, D. A. & Yücel, M. A. Motion artifact detection and correction in functional near-infrared spectroscopy: a new hybrid method based on spline interpolation method and Savitzky–Golay filtering. Neurophotonics 5, 015003 (2018).
Huppert, T. J., Hoge, R. D., Diamond, S. G., Franceschini, M. A. & Boas, D. A. A temporal comparison of BOLD, ASL, and NIRS hemodynamic responses to motor stimuli in adult humans. Neuroimage 29, 368–382 (2006).
Hoge, R. D. et al. Simultaneous recording of task-induced changes in blood oxygenation, volume, and flow using diffuse optical imaging and arterial spin-labeling MRI. Neuroimage 25, 701–707 (2005).
Mihara, M. & Miyai, I. Review of functional near-infrared spectroscopy in neurorehabilitation. Neurophotonics 3, 031414 (2016).
Jasinska, K. K. & Petitto, L.-A. How age of bilingual exposure can change the neural systems for language in the developing brain: A functional near infrared spectroscopy investigation of syntactic processing in monolingual and bilingual children. Dev. Cogn. Neurosci. 6, 87–101 (2013).
Vassena, E., Gerrits, R., Demanet, J., Verguts, T. & Siugzdaite, R. Anticipation of a mentally effortful task recruits Dorsolateral Prefrontal Cortex: An fNIRS validation study. Neuropsychologia 123, 106–115 (2019).
Wyser, D. G. et al. Characterizing reproducibility of cerebral hemodynamic responses when applying short-channel regression in functional near-infrared spectroscopy. Neurophotonics 9, 015004 (2022).
Bates, D. et al. Package ‘lme4’. Linear mixed-effects models using S4 classes. http://cran.r-project.org/web/packages/lme4 (2011).
Luo, D., Ganesh, S. & Koolaard, J. predictmeans: Calculate predicted means for linear models. http://cran.r-project.org/package=predictmeans (2014).
Fern, E. F. & Monroe, K. B. Effect-size estimates: issues and problems in interpretation. J. Consum. Res. 23, 89–105 (1996).
Cohen, J. Eta-squared and partial eta-squared in fixed factor ANOVA designs. Educ. Psychol. Meas. 33, 107–112 (1973).
Lenth, R., Singmann, H., Love, J., Buerkner, P. & Herve, M. Emmeans: Estimated marginal means, aka least-squares means (R package Version 1.3.0) [Computer software]. https://cran.r-project.org/web/packages/emmeans/index.html (2018).
This work was funded by the German Research Foundation (DFG, Deutsche Forschungsgemeinschaft) as part of Germany’s Excellence Strategy – EXC 2050/1 – Project ID 390696704 – Cluster of Excellence “Centre for Tactile Internet with Human-in-the-Loop” (CeTI) of Technische Universität Dresden. Open access funding provided by Technische Universität Dresden.
Open Access funding enabled and organized by Projekt DEAL.
These authors contributed equally: Kathleen Kang, Robert Rosenkranz.
Chair of Lifespan Developmental Neuroscience, Technische Universität Dresden, Dresden, Germany
Kathleen Kang, Kaan Karan & Shu-Chen Li
Centre for Tactile Internet with Human-in-the-Loop (CeTI), Technische Universität Dresden, Dresden, Germany
Robert Rosenkranz, Kaan Karan, Ercan Altinsoy & Shu-Chen Li
Chair of Acoustics and Haptics, Technische Universität Dresden, Dresden, Germany
Robert Rosenkranz & Ercan Altinsoy
S.C.L., K. Kang, and R.R. designed the research. R.R. and E.A. designed and provided the virtual reality environment. K. Kang and K. Karan collected the data. K. Kang and S.C.L. planned the data analyses. K. Kang conducted the analyses. S.C.L., K. Kang, and R.R. wrote the initial draft of the manuscript. All authors contributed to further iterations of manuscript editing.
Correspondence to Kathleen Kang or Shu-Chen Li.
The authors declare no competing interests.
Communications Biology thanks Marta Gomez and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Daniel Bendor and Luke R. Grinham.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
Kang, K., Rosenkranz, R., Karan, K. et al. Congruence-based contextual plausibility modulates cortical activity during vibrotactile perception in virtual multisensory environments. Commun Biol 5, 1360 (2022). https://doi.org/10.1038/s42003-022-04318-4
DOI: https://doi.org/10.1038/s42003-022-04318-4
Communications Biology (Commun Biol) ISSN 2399-3642 (online)
© 2022 Springer Nature Limited