The rubber hand “illusion” (RHI), in which participants report experiences of ownership over a fake hand, appears to demonstrate that subjective ownership over one’s body can be easily disrupted. It was recently shown that existing methods of controlling for suggestion effects in RHI responding are invalid. It was also shown that propensity to agree with RHI ownership statements is correlated with trait phenomenological control (response to imaginative suggestion). There is currently disagreement regarding the extent to which this relationship may confound interpretation of RHI measures. Here we present the results of simulated experiments to demonstrate that a relationship between trait phenomenological control and RHI responding of the size reported would fundamentally change the way existing RHI results must be interpreted. Using real participant data, each simulated experiment used a sample biased in selection for trait phenomenological control. We find that using experiment samples comprised only of participants higher in trait phenomenological control almost guarantees that an experiment provides evidence consistent with RHI. By contrast, samples comprised of only participants lower in trait phenomenological control find evidence for RHI only around half the time – and of greater concern, evidence specifically for “ownership” experience just 4% of the time. These findings clearly contradict claims that the magnitude of relationship between phenomenological control and RHI responding is a minor concern, demonstrating that the presence of participants higher in trait phenomenological control in a given RHI experiment sample is critical for finding evidence consistent with RHI. Further study and theorising regarding RHI (and related effects) must take into account the role that trait phenomenological control plays in participant experience and responses during RHI experiments.
Since its report over 20 years ago, the rubber hand “illusion” (RHI) has been used extensively to investigate human (and animal) sense of body ownership (Botvinick & Cohen, 1998; Braun et al., 2018; Riemer et al., 2019). The apparent success of the original paradigm has led to analogous versions being deployed in many contexts such as ownership for other limbs (Lenggenhager et al., 2015), whole body (Lenggenhager et al., 2007), or even others’ faces (Sforza et al., 2010) or bodies (Petkova & Ehrsson, 2008), or objects (Armel & Ramachandran, 2003). While there are ‘implicit’ measures of ownership – implicit because they do not require explicit report of body ownership – the claim that RHI can be informative about body ownership rests on two aspects: 1) the strong qualitative experience that some people describe having (and long-form reports of such) and 2) reported agreement with a series of statements describing anomalous experience during the experimental procedure. Because existing control methods for these questions are confounded by hypothesis awareness (Lush, 2020; Lush, Seth, et al., 2021; Reader, 2021), the degree to which any existing RHI report reflects hypothesis awareness effects (e.g., faking, intentional acts of imagination or phenomenological control) is not currently known (Corneille & Lush, 2021).
It has been proposed that imaginative suggestion effects may confound psychological experiments (Kirsch & Council, 1989; Michael et al., 2012). Recently (Lush et al., 2020), it was reported that subjective response in typical RHI paradigms is correlated with hypnotisability. Based on this result, it was suggested that RHI responding at least partially reflects top-down phenomenological control to meet expectancies arising from demand characteristics (Dienes, Lush, et al., 2020; Dienes, Palfi, et al., 2020; Lush et al., 2020). Phenomenological control is the trait ability to generate experience to meet expectancies in a range of contexts (including the hypnotic context; see Dienes, Lush, et al., 2020; Dienes, Palfi, et al., 2020 for reviews). There is disagreement regarding the degree to which this may confound existing interpretations of RHI studies. While Lush et al. (2020) interpret linear models which show substantial relationships, others focus on the standardized correlation coefficients which they interpret as weak and therefore of little consequence (see e.g. Ehrsson et al., 2021). The present study is intended to address such interpretations of a weak relationship by providing concrete demonstrations of the potential implications of the relationship for interpretation of RHI studies, framed in terms of whether a study may have reported evidence in favour of the RHI. This study is therefore intended to support researchers in forming intuitions regarding the strength of the reported relationships as regards the RHI and hypnotisability but may also be of benefit for other effects for which relationships with trait phenomenological control have been reported (e.g. mirror touch synaesthesia, vicarious pain, and visually evoked auditory response; Lush et al., 2020; Lush, Dienes, Seth, et al., 2021).
Using data from the largest existing single dataset on the RHI (Lush et al., 2020), we conducted a series of simulated experiments to mimic the output of a whole field of RHI studies. In these experiments, we explicitly included or excluded participants based on whether they were higher in trait phenomenological control. If the relationship between RHI responding and trait phenomenological control is weak, this manipulation should have little to no influence on whether any part of this simulated field of studies would conclude in favour of their participants demonstrating the RHI. However, if trait phenomenological control and RHI responding are strongly related, as reported by Lush et al. (2020), then the outcomes of the simulated field of RHI studies will be very different depending on which participants are included: the simulated experiments wherein only participants lower in trait phenomenological control are included will be unlikely to conclude in favour of evidence for the RHI, while experiments including only those higher in trait phenomenological control will be very likely to find evidence for the RHI.
Before considering a potential relationship between trait phenomenological control and RHI responding, we must first establish the methods by which evidence for “ownership” over a rubber hand is established in the existing RHI literature. This is a difficult task given that such a wide variety of methods have been used (Reader et al., 2021a; Riemer et al., 2019). Here we attempt to reduce this issue to something that is rational, justifiable, and can be worked with in practice.
Quantifying change in subjective experience in RHI
RHI experiments typically attempt to bring the sense of ownership into the domain of quantitative science by asking participants to rate the degree to which they agree or disagree (on a scale from -3 to 3) with a series of statements that quantify their subjective experiences (Table 1), introduced by Botvinick & Cohen (1998). Since their original use, these questions, or adaptations thereof, have been repeatedly interpreted as a method to quantify participants’ agreement with the sentiment that the rubber hand begins to feel as though it belongs to them. As can be seen from the statements, only S3 can reasonably be considered to specifically reflect on ownership, while S1 and S2 reflect instead the degree to which the RHI experience is causing confusion about the relationship between objects, the rubber hand, and the person’s own body. Note that S1 is phrased in a way that makes this judgement ambiguous. It’s not clear whether the “…location where I saw the rubber hand touched” refers to the location on the rubber hand, or the analogous location on the participant’s own hand, making the degree to which the question is fit for purpose unclear, because it’s likely to produce a mixture of responses following each of these distinct interpretations (Wu, forthcoming). Anecdotally in our experiments and elsewhere (see Wu, forthcoming), some participants report interpreting this statement as a check that the experimenter was doing a good job of matching brush strokes across real and fake hands. Indeed, it is not uncommon for participants to report agreement for S1, but disagreement with S2, even though both are supposed to reflect an experience of referred touch (~12% of participants in the sample from Lush et al., 2020; see Supplementary Results). It is likely that this contributes to the relatively high agreement for S1 (1.94) compared to S2 (1.15; see Supplementary Results).
Statements |
---|
S1. It seemed as if I were feeling the touch of the paintbrush in the location where I saw the rubber hand touched |
S2. It seemed as though the touch I felt was caused by the paintbrush touching the rubber hand |
S3. I felt as if the rubber hand were my hand |
Statement S1 has also been found to not load on any factors identified for RHI reports (for example, ownership or location; Longo et al., 2008). Some studies report maximum agreement scores for all participants for this statement (so that there is no variance in participant report; Botvinick & Cohen, 1998; Rohde et al., 2011), even whilst other similar statements, supposedly measuring the same experience, generate wide variation in response. Strong agreement with the statement by all participants would not be expected if such agreement only reflects experience of the RHI, as there is consensus that a sizeable minority of people do not experience the RHI (Riemer et al., 2019).
It is common practice to combine S1-S3 (or other combinations, e.g. S1 and S3 only) into an overall measure (e.g., Kaplan et al., 2014; Peled et al., 2003; see Reader et al., 2021a). There are obvious issues with interpreting this combined measure as reflecting ownership experience because of the combination of two ambiguous referred touch questions (S1 and S2) with a single question that is clearly the only one about ownership (S3) (see also Kalckert et al., 2019; Reader et al., 2021a). There is also the question of whether S3 is phrased in such a way as to render agreement with it uninterpretable (Wu, forthcoming). Putting aside these legitimate concerns, if we take at face value that the questions are broadly indicating “ownership”, there is still a serious issue with interpreting results based on these reports as indicating important features of human body ownership.
Rather than using the ratings in only the synchronous condition, the difference in subjective rating between synchronous and asynchronous stroking conditions is sometimes claimed to be the true evidence for the RHI (e.g. Ehrsson et al., 2021). However, use of the asynchronous control condition, and therefore any difference from it, is invalid because the difference between synchronous and asynchronous conditions is confounded by participant expectations (Lush, 2020). The results of the study by Lush (2020) revealed that when participants with no prior experience of the RHI were merely provided with information about the RHI procedure, they reported expectancies which matched the magnitude and direction of reports in illusion and control conditions in RHI studies. That is, participants in RHI studies are, on average, hypothesis aware regarding both the difference between synchronous and asynchronous stroking conditions and the difference between control and illusion statements (Lush, 2020; Lush, Seth, et al., 2021; Reader, 2021). We therefore do not interpret difference measures between synchronous and asynchronous conditions here, and further, we caution against their continued use (e.g. Abdulkarim et al., 2021; Reader et al., 2021b) given the now well-established issues (see also Kalckert et al., 2019 for a further argument against interpreting a difference between synchronous and asynchronous conditions as evidence of illusion experience).
Strength of RHI subjective reports is related to trait phenomenological control
For the purposes of the present study, the key finding reported previously (Lush et al., 2020) indicated that participants’ subjective SWASH ratings (their trait phenomenological control measured by the Sussex Waterloo Scale of Hypnotisability; Lush et al., 2018) were significantly correlated with a propensity to agree with the average of the three subjective statements (rs = 0.26, 95% CI [0.16, 0.37]; see Figure 3a in Lush et al., 2020). Figure 1 depicts a scatter plot of (351) participants’ Subjective SWASH rating versus their average (dis)agreement scores following the synchronous stroking condition of the RHI experiment (this is equivalent to Figure 3a in Lush et al., 2020; see also Supplementary Figure 1 in the Supplementary Results for the distribution of SWASH scores). A linear regression suggests that for every one-unit change in SWASH rating, participants’ average agreement score (S1-S3) increased by 0.578 units. Propensity to agree with statements S1-S3 is highly related to the ability to respond to suggestion across the cases measured by SWASH. Inferential statistics and further analysis of this relationship are covered in Lush et al. (2020) and the meaning of this relationship in a broader sense has been previously discussed at length (Dienes, Lush, et al., 2020; Lush, Dienes, & Seth, 2021). The purpose of this study is not to duplicate these discussions, but to build on them by demonstrating precisely the effect that this relationship has on a typically conducted RHI experiment.
The results of a previous study (Walsh et al., 2015) might be taken to contradict the claim that there is a relationship between hypnotisability and subjective RHI responding. As discussed in Lush et al. (2020), Walsh et al. (2015) concluded that there was no evidence for a relationship on the basis of a non-significant correlation in 23 participants. This study would only be powered to detect very large relationships between these measures. Walsh et al. (2015) report Pearson’s r = 0.20, p = 0.371 (n = 23), 95% CI [-0.2313, 0.5656]. Given that it is impossible to conclude in favour of the null hypothesis (no relationship) with the methods employed by Walsh et al. (2015), the results of that study do not provide counterevidence to the findings in Lush et al. (2020).
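To make the width of that interval concrete, the following is a minimal sketch of one common way to obtain a confidence interval for a Pearson correlation, the Fisher z-transformation. That this is the method behind the interval quoted above is an assumption; the function name `pearson_ci` is hypothetical and introduced only for illustration.

```python
import math
from statistics import NormalDist


def pearson_ci(r, n, level=0.95):
    """Approximate CI for a Pearson correlation via the Fisher z-transformation.

    Assumption: this is a standard textbook method, not necessarily the one
    used to produce the interval reported in the text.
    """
    z = math.atanh(r)                      # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)            # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)  # e.g. about 1.96 for 95%
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


lower, upper = pearson_ci(0.20, 23)
print(f"95% CI: [{lower:.4f}, {upper:.4f}]")
```

With r = 0.20 and n = 23 the interval spans from a moderate negative to a large positive correlation, which illustrates why a sample of this size cannot discriminate between no relationship and a relationship of the size reported by Lush et al. (2020).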
Outside of a potential relationship with hypnotisability (or phenomenological control), issues with task demands in RHI paradigms, and interpretation of RHI reports in general, have been noted previously. Consistent with the proposal that reports in RHI studies are confounded by suggestion effects, the phrasing of experimental instructions (referring to feelings or beliefs) has been shown to bias responses (Tamè et al., 2018). Regarding the putative basis of the illusion itself, Alsmith (2015) argued that response to the RHI may be “spontaneous imagining”. Dieguez (2018) goes so far as to suggest that “bodily-self researchers might have been operating under an ‘illusion illusion,’ that is the illusion that they were studying an illusion all along”, a sentiment with which we broadly agree. Leaving aside any specific appeal to an interpretation based on phenomenological control, the idea that there is something problematic in the RHI paradigm that requires additional explanation beyond a “multisensory illusion” interpretation has clearly been noted several times.
Practical consequences of relationship between RHI responding and phenomenological control: a case study
Here, we interpret agreement in the synchronous condition, rather than the difference between synchronous and asynchronous condition reports, because this difference is confounded by demand characteristics and therefore uninterpretable (Lush, 2020). As mentioned, propensity to agree with the three statements is taken as evidence that participants have a changed subjective impression induced by the RHI setup (Botvinick & Cohen, 1998). This is typically not explicitly tested in RHI studies, though some researchers interpret agreement scores above 1 as evidence of such change when distinguishing responders from non-responders (Ehrsson et al., 2007); note that more stringent cut-offs have been used, e.g. 2 or greater (Lloyd, 2007). A simple way to quantify agreement is to compare whether mean agreement across the three statements is significantly greater than 0 (demarcating agreement and disagreement). Scores less than 0 indicate disagreement with the idea that the rubber hand appears to belong to the participant. Scores greater than 0 would indicate that the rubber hand appears to belong to the participant. To first confirm that the sample from Lush et al. (2020) was consistent with evidence for changes in subjective experience when using this criterion (0), we conducted a one sample t-test, comparing the mean subjective rating for S1, S2 and S3 in the synchronous stroking condition across the entire sample against a rating of 0. This revealed evidence in favour of a difference from 0, with a mean rating of 1.27 (SEM = 0.08), t(350) = 15.42, p < 0.001. This can be taken as evidence for significantly positive ratings of the three statements following synchronous stroking, over the entire sample of 351 participants.
If an experimenter follows typical practice and, after testing for a difference between synchronous and asynchronous conditions, proceeds to interpret the synchronous condition, taking at face value that the average of ratings for the three subjective statements does indeed reflect subjective ownership and that a criterion of 0 separates agreement from disagreement, that experimenter might conclude that a change in experience of body ownership attributable to multisensory integration mechanisms had occurred.
Simulated experiments with and without participants higher in trait phenomenological control
Analysis methods
To specifically examine how the presence of participants higher or lower in trait phenomenological control impacts inferences that could be taken from a RHI experiment, we simulated several experiments with different combinations of participants and examined the pattern of statistical inferences that was produced. To simulate each experiment, we randomly sampled a number of participants from the overall population of 351 participants taken from Lush et al. (2020). Two participants were excluded from the analyses presented there because they contained a missing value in at least one of the variables of interest. In the main analyses presented here, we used 20 participants for each experiment. This number was chosen because it is the mean number of participants that were used in the top 30 cited publications on the RHI (median = 15; see Table S1 in Supplementary Materials), therefore representing the standard practice in the field. The same experiments were repeated with samples of 10 or 50 participants (see Supplementary Results 10[50] participants). Each ‘experiment’ was simulated 10 000 times, with each iteration comprised of a random sample of 20 participants drawn from the pool of 351. In each repeat of the experiment, there were no duplicates of participant data – each of the 20 participants was unique in that sample and their reports were the real reports from the study reported in Lush et al. (2020). No attempt was made to prevent similarity of participant samples across the 10 000 simulations (constrained random selection rather than n-choose-k). The smallest cell sampled from contained 175 participants – the number of unique combinations (n-choose-k) of 20 participants drawn from 175 is 9.65 × 10^25, meaning that the 10 000 repeats did not saturate the combinations available in the dataset.
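The size of that combination space can be checked directly. The short sketch below assumes that the n-choose-k figure refers to drawing 20 participants from the smallest cell of 175; the variable names are illustrative only.

```python
import math

# Number of distinct 20-participant samples that can be drawn from 175 people.
n_combinations = math.comb(175, 20)
n_simulations = 10_000

print(f"{n_combinations:.3e} possible samples")
print(f"fraction covered by the simulations: {n_simulations / n_combinations:.1e}")
```

Because the 10 000 simulated experiments cover only a vanishing fraction of the possible samples, repeated or near-identical samples are not expected to drive the results.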
For each experiment, we determined whether the mean agreement rating across participants was greater or less than 0 (agreement or disagreement) and whether a one-sample t-test (two-tailed) indicated that the mean rating was significantly different from 0. Consequently, we have two measures of interest – how frequently the mean agreement rating was in agreement, disagreement, or neutral (defined as 0) and whether that mean (positive or negative) was significantly different from 0 (tests only for agreement or disagreement, not neutral), or did not provide evidence in favour of a difference (not statistically significant). These simulations act as proxy for a field of researchers conducting 10 000 RHI experiments with participants like those from Lush et al. (2020), under different possible biases in participant sampling. The full analysis scripts and data for these experiments are available at OSF (https://osf.io/d5hjp/).
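The procedure described above can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions, not the analysis script available at OSF: the pool of ratings is synthetic, the function name `simulate_field` is hypothetical, and significance is judged against the tabulated two-tailed 5% critical value of Student's t for 19 degrees of freedom (so the default only applies to samples of 20).

```python
import math
import random
import statistics

# Two-tailed 5% critical value of Student's t with 19 degrees of freedom;
# valid only for experiments of exactly 20 participants.
T_CRIT_DF19 = 2.093


def simulate_field(pool, n_per_experiment=20, n_experiments=10_000,
                   t_crit=T_CRIT_DF19, seed=0):
    """Tally the sign and significance of the mean rating across experiments."""
    rng = random.Random(seed)
    counts = {"agree": 0, "disagree": 0, "neutral": 0, "significant": 0}
    for _ in range(n_experiments):
        # Draw unique participants (no duplicates within one experiment).
        sample = rng.sample(pool, n_per_experiment)
        mean = statistics.fmean(sample)
        sd = statistics.stdev(sample)
        if mean > 0:
            counts["agree"] += 1
        elif mean < 0:
            counts["disagree"] += 1
        else:
            counts["neutral"] += 1
        # One-sample t statistic against 0 (two-tailed); zero-variance
        # samples are skipped because t is undefined for them.
        if sd > 0 and abs(mean / (sd / math.sqrt(n_per_experiment))) > t_crit:
            counts["significant"] += 1
    return counts


# Hypothetical pool: 351 synthetic mean ratings on the -3..3 scale.
# These are NOT the ratings from Lush et al. (2020).
_rng = random.Random(42)
pool = [
    statistics.fmean(_rng.choices(range(-3, 4), weights=[1, 1, 1, 2, 3, 4, 3], k=3))
    for _ in range(351)
]

counts = simulate_field(pool, n_experiments=1_000)
print(counts)
```

The demonstration call uses 1 000 repeats to keep the run short; passing `n_experiments=10_000` matches the number of repeats described in the text.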
Validating the sample of participants from Lush et al. (2020) with criterion of 0
To confirm that the sample from Lush et al. (2020) produced results consistent with those reported by the field, we first checked what happened if we simulated 10 000 experiments containing 20 participants each, without any bias in sampling (all 351 participants could be sampled from). Depicted in Figure 2A, we found that in almost all of the 10 000 simulations (9996), the mean rating of statements S1-S3 was positive (agreement) and that on 8315 of the 10 000 simulations this mean rating of agreement would have been statistically significant at p < 0.05. Therefore, we might expect that around 83% of experiments conducted under conditions and with participants like those from Lush et al. (2020) would produce evidence consistent with agreement in the averaged subjective statements. Standing in as representative of a field of studies, these simulated experiments show that the majority of studies using these subjective measures would report in favour of participants agreeing that they experienced the RHI. The participants in the sample from Lush et al. (2020) are therefore broadly consistent with participants from previous studies across the field of RHI research.
Experiments with and without participants higher in trait phenomenological control produce vastly different outcomes
Average of statements S1-S3
To provide evidence that the magnitude of relationship between RHI responding and trait phenomenological control is indeed substantial and of considerable concern when interpreting RHI studies, in the next experiments the participant sample (20) was constrained to include only participants below median phenomenological control – participants for these 10 000 experiments could only be sampled from the bottom half of participants in Subjective SWASH. In these experiments using only participants below the median in phenomenological control (Figure 2B), we again find that the vast majority of all (9932) experiments result in a mean report of agreement with S1-S3 (distribution of agreement across experiments in Figure 3). However, now only 51% of the experiments result in an inference that this mean agreement is greater than 0 (49% not statistically significant).
In the opposite experiment, we sampled (20) participants from only the top half of participants on Subjective SWASH. By extreme contrast with the results using only participants lower in phenomenological control (Figure 2B), here all 10 000 experiments produced mean reports of agreement (Figure 2C; Figure 3), with over 98% of experiments concluding in favour of this positive agreement being statistically greater than 0.
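The biased sampling used in these two sets of experiments amounts to a median split on Subjective SWASH followed by random draws from one half. The sketch below illustrates that step on synthetic data; the records, the built-in relationship between score and rating, the function name `median_split`, and the choice to drop participants exactly at the median are all assumptions made for illustration.

```python
import random
import statistics


def median_split(records):
    """Split (swash, rating) pairs into below- and above-median SWASH groups.

    Assumption: participants exactly at the median are dropped from both groups.
    """
    median = statistics.median([swash for swash, _ in records])
    below = [rating for swash, rating in records if swash < median]
    above = [rating for swash, rating in records if swash > median]
    return below, above


# Hypothetical records: synthetic SWASH scores (0..5) and ratings (-3..3)
# with a built-in positive relationship. NOT the data from Lush et al. (2020).
rng = random.Random(7)
records = []
for _ in range(351):
    swash = rng.uniform(0, 5)
    rating = max(-3.0, min(3.0, 0.5 * swash + rng.gauss(0, 1.5)))
    records.append((swash, rating))

below, above = median_split(records)

# One simulated experiment from each half: 20 unique participants per sample.
low_sample = rng.sample(below, 20)
high_sample = rng.sample(above, 20)
print(len(below), len(above))
print(statistics.fmean(low_sample), statistics.fmean(high_sample))
```

Repeating the final sampling step many times for each half, and testing each sample mean against 0, gives the kind of comparison reported for the below-median and above-median experiments.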
Considering statements S1-S3 separately
Given the issues identified with ambiguity of the different statements, and their combination (see Quantifying change in subjective experience in RHI above; Reader et al., 2021a), we next looked at ratings for the individual statements (S1-S3) separately. Here, the pattern becomes even more problematic for interpreting these statements as related to body ownership. Splitting the data by presence of participants (20 in each case) higher in phenomenological control, only ratings regarding S1 (related to referred touch) are not strongly affected – the mean rating for each of the 10 000 experiments was in agreement for both samples, with ~95% (Below Median) and > 99% (Above Median) of experiments concluding in favour of this agreement being statistically greater than 0 (Figure 4A-B). As already mentioned, S1 is a semantically ambiguous statement, so it is possible that the distribution of answers being made in reference to the rubber hand or the person’s own hand may protect it from being driven by phenomenological control.
Ratings of statement S2, which also describes experience of referred touch, have a very similar pattern to that seen for the average of the three statements overall (Figure 4C-D). Mean report was frequently in agreement for both samples (9420, only lows; 10000, only highs) but experiments that included only (20) participants below the median in phenomenological control would find evidence in favour of agreement ratings being greater than 0 only ~24% of the time, while experiments including only (20) highs concluded in favour of greater than 0 agreement in ~95% of experiments.
Statement S3 is the only one of the three subjective statements which could be considered to describe an experience of ownership of the rubber hand (though see Wu, forthcoming, for discussion of whether even this is reasonable). For S3 (Figure 3, Figure 4E-F), experiments including only (20) participants lower in trait phenomenological control produce a much lower number of cases where the mean rating was agreement (7151), and unlike the other statements, include a substantial number (2485) of experiments wherein the mean rating was in disagreement with the statement. The proportion of experiments that would conclude in favour of mean agreement with the statement “I felt as if the rubber hand were my hand” being greater than 0 was only ~4% and, as depicted in Figure 5, the probability of obtaining inferential support (p < 0.05) for agreement with statement S3 is roughly the same as the probability of obtaining a p-value in any other interval of width 0.05 (e.g. >= 0.1 and < 0.15); that is, the p-values are approximately uniformly distributed. Contrast these results with Figure 2D wherein, even for the experiments that included only participants lower in SWASH, there was still a tendency to find inferential support for RHI. The difference between these results (Figure 2D and Figure 5) highlights the role of the ambiguous S1 in producing support for RHI.
Discussion
Quantitative study of subjective body ownership in the RHI is accomplished using participant ratings of (dis)agreement with a series of subjective statements about referred touch (S1 and S2) and (putatively) ownership (S3). Propensity to agree with these statements has previously been shown to be related to trait phenomenological control – the domain general ability to meet expectancies arising from direct or implicit imaginative suggestion (including demand characteristics; Dienes, Lush, et al., 2020; Lush et al., 2020). It has been claimed that this reported relationship is too small to be of concern (Ehrsson et al., 2021; Fan et al., 2021). To concretely demonstrate the potential influence of this relationship, here we simulated a series of experiments with different degrees of sampling bias, selecting a disproportionate number of participants with higher trait phenomenological control. Given the relationship between subjective agreement and phenomenological control, this induced sampling bias may be analogous to the deliberate exclusion of participants (e.g. Chancel & Ehrsson, 2020; Ehrsson et al., 2005) who don’t report a strong experience of the RHI (up to ~30% of people; Riemer et al., 2019). We show that the frequency with which a RHI experiment will provide evidence consistent with subjective agreement for a RHI is directly related to the proportion of participants in the experiment sample that are higher in phenomenological control. Put more concretely - only experiments run with a large proportion of participants higher in phenomenological control will provide evidence for the RHI. This finding directly contradicts the statement that the reported relationship between RHI responding and trait phenomenological control is too small to be of concern.
Why does selecting participants based on trait phenomenological control affect whether a RHI study will conclude in favour of evidence for RHI or not? It can only be because they are substantially related – reiterating the result reported by Lush et al. (2020). Why is it a problem if the same people who are likely to report agreement with RHI questions also happen to be higher in phenomenological control? To interpret the clearly substantial relationship, we must first consider the nature of the constructs these measures are supposed to represent – body ownership for RHI responding; and propensity to respond to suggestion, implicit or explicit, with imagination for phenomenological control. Phenomenological control is a domain general ability (Dienes, Lush, et al., 2020; Lush et al., 2020), relating to sensory and decision processes over and above those specifically related to experiences of body ownership or multisensory experiences of bodily sensation. This is evident both in the measure of phenomenological control used in Lush et al. (2020; SWASH), which interrogates a variety of possible experiences, not just those related to bodily experience or ownership (e.g. abilities related to amnesia or musical and visual hallucination, in response to suggestion) and in several recent findings: Relationships between trait phenomenological control and anomalous experiences when demand characteristics (and therefore participant hypothesis awareness) have not been adequately controlled have been reported (Lush et al., 2020) for other body-related effects including vicarious pain (reports of pain in response to seeing people in apparently pain-inducing situations) and mirror-touch synaesthesia (reports of felt touch in response to seeing people touched), and also for non-body related experiences like the visually-evoked auditory response (Lush, Dienes, Seth, et al., 2021; reports of sound experienced when watching silent videos). 
The ability to control experience in response to suggestion is evident across perceptual and decision domains and there is much evidence that imaginative suggestion effects can be experienced as subjectively ‘real’ (see Dienes, Palfi, et al., 2020; Lynn et al., 2020; McConkey, 2008), and that hypnotisability is distinct from social compliance (e.g. see Moore, 1964; Tasso et al., 2020; but see Polczyk & Pasek, 2006). In an explanatory framework, a domain general process subsumes domain specific demonstrations of its operation. Therefore, we are left with limited options for interpretation of the result from Lush et al. (2020):
1) participants who can control their experience across sensory and decision domains through phenomenological control are also controlling their experience in RHI experiments (as suggested by Lush et al., 2020)
2) participants who can control their experience in other domains through phenomenological control happen to be the same participants who can experience the RHI, but for distinct reasons (e.g. pure multisensory integration account)
3) the same participants who can control their experience in other domains don’t control their experience in RHI but respond as though they do for other unspecified reasons – perhaps a response bias induced by social context or pure confabulation just for the RHI case
The first interpretation is perhaps the most parsimonious in that it requires only that we reconsider the RHI as an act of phenomenological control in response to suggestion and that it allows for group level report of RHI ownership experience to reflect genuine experience. It is important to note that this position does not rule out a multisensory contribution to the RHI, but rather that another cognitive process is in play. This is an important distinction. If we were to say that there is no problem treating an act of phenomenological control (which is necessarily creative and interpretative) as evidence for a sense of ownership, we might also study visual perception through phenomenological control in response to imaginative suggestion. If people report the experience of “seeing” a unicorn in response to a direct or indirect imaginative suggestion, we would be obliged to take this as evidence for the existence of unicorns on the same basis that RHI researchers claim that reports of the RHI are evidence for a sense of body ownership. Certainly, visual imagery is constrained by the properties of visual processing and previous perceptual experience, as is the RHI likely constrained by the properties of (multi)sensory processing and previous experience. The existence of such constraints is not in contention. But the interpretation of imaginatively suggested experience is not the same as studying the properties of multisensory perception, as is the common interpretation of RHI studies (e.g. Ehrsson et al., 2005). Saying that you are “studying the rubber hand instantiation of the influence of phenomenological control on perception and decision making” is perfectly reasonable. But this statement is not consistent with the broader claims made in most papers regarding RHI where it is stated that the RHI represents a fundamental case study in understanding how subjective body ownership is determined in humans.
The second interpretation, while less parsimonious than the first, may still be possible and is not explicitly ruled out by the results presented here or previously. However, given that hypothesis awareness is not controlled for in the existing RHI literature (Lush, 2020; Lush, Seth, et al., 2021; Reader, 2021), evidence for this interpretation cannot depend on that existing literature until the possible influences of phenomenological control are ruled out. There are methods by which phenomenological control effects may be controlled; control conditions which are not confounded by demand characteristics could be developed by a two-step procedure which takes into account both expectancies and differences in the relative difficulty of suggestion effects (see Lush, 2020 for details). This requires a difficult process of matching expectancies and direct imaginative suggestion response across candidate illusion and control measures. Note that while this makes the study of embodiment illusions more challenging than has been the tradition, the combined evidence from many small sample studies in which demand characteristics were uncontrolled has only generated illusory evidence which is of no value to theories of embodiment (except perhaps as a cautionary tale). Similar issues have been dealt with successfully in other fields. For example, the suggested approach has much in common with the use of placebo controls in clinical trials, which, although it adds considerable cost to experiments, has been standard practice for decades (Beecher, 1955).
The third interpretation should be considered in light of evidence that phenomenological control may confound interpretation of other effects. The RHI is one of four distinct effects for which such relationships have now been shown (Lush, Dienes, Seth, et al., 2021; Lush et al., 2020), and these effects involve a range of modalities. Correlation is not causation, of course, and relationships between the RHI and phenomenological control may be attributable to some unknown cause (perhaps some difference in multisensory integration mechanisms common to RHI response and phenomenological control – even for cases of phenomenological control that require no multisensory percept). However, given the simplicity of the proposal that demand characteristics can act as implicit imaginative suggestions in measures of experience, and the growing evidence that this occurs in a range of effects, it is a more parsimonious explanation that all of these effects are confounded by phenomenological control than that some other explanation is in play for each individual case.
Note that, whenever expectancies are consistent with experimental aims (e.g. participants are hypothesis aware), any given result may be attributable to hypothesis awareness effects other than phenomenological control (e.g., faking or imagination; see Corneille & Lush, 2021). However, for both the RHI and response to imaginative suggestion, there is extensive evidence that at least some reports reflect genuine experience (see Dienes, Palfi, et al., 2020 for a discussion of this evidence as it relates to phenomenological control). Of course, if phenomenological control scales were to reflect trait differences in faking (for example), this would be an important consideration for interpretation of the RHI (and other reported experiences which correlate with trait phenomenological control).
Researchers familiar with the RHI will likely note that these results (and those of Lush et al., 2020) only relate to the subjective statements, while there is extensive evidence from ‘implicit’ measures such as proprioceptive drift, and convergent evidence from neuroimaging and animal studies. As mentioned previously, there are only two aspects linking RHI to phenomenological change – extended report and subjective ratings. All other findings, such as proprioceptive drift or changes in measured blood flow (fMRI) or scalp potential (EEG), are linked to changes in subjective ownership only through broad correspondence or co-occurrence with these subjective reports. If human participants’ ratings on the subjective statements are predominantly related to their trait phenomenological control, then proprioceptive drift (or neuroimaging) results are correlating with phenomenological control, not specifically rubber hand subjective ownership. Again, this is because phenomenological control is a domain general ability (Dienes, Lush, et al., 2020; Lush et al., 2020), relating to sensory and decision processes over and above those specifically related to experiences of body ownership or multisensory experiences of bodily sensation. Note that the asynchronous control measure is also employed for ‘implicit’ RHI measures (e.g., proprioceptive drift), and participants show hypothesis awareness for these effects (Lush, Seth, et al., 2021).
Regarding animal studies (e.g. Fang et al., 2019; Wada et al., 2016), while it may be completely reasonable to posit that, for instance, a rat may have a sense of body ownership, the reverse inference on which such claims rely when referencing RHI is problematic. Because it is not possible to simply ask an animal about its subjective experience, to conclude that an animal has a change in subjective experience of body ownership during, for example, a “rubber tail illusion” paradigm (Wada et al., 2016), we start with a potential relationship in human participants between a behavioural observation and subjective reports (e.g. correlating a proprioceptive drift measure in the hand with the magnitude or time course of change in subjective ownership reports). We then attempt to obtain observations of a similar behaviour in rodents (e.g. something like proprioceptive drift of the tail). We must then make the conceptual leap that the observed behavioural phenomena (proprioceptive drift in humans and something like it in rodents) are equivalent. By combining a directly observed relationship involving subjective experience of ownership (subjective ownership responding and proprioceptive drift are related in humans) with the observation of a single phenomenon in rodents (behaviour with similar properties to proprioceptive drift) that we have assumed to be equivalent to the human observation (hand proprioceptive drift), we can end up with the reverse inference that the rodent also has a change in subjective experience of ownership. Importantly, there is, and can be, no direct evidence to support this claim. It is important to also note that the same issue of reverse inference is present in any study using implicit measures that doesn’t provide evidence for concurrent changes in subjective experience (e.g. Tsakiris & Haggard, 2005).
If the foundational statement about human subjective body ownership on which this set of inferences is based instead reflects some other ability (e.g. to control experience in response to suggestion – phenomenological control), then it is clear that all subsequent inferences are questionable. This is not to say that nothing can be discovered about multisensory perception of the body using neuroimaging or animal models. Our claim in no way invalidates behaviourist approaches that seek to map body perception to neural function through observed behaviour. However, any instances wherein these findings are related to subjective ownership through their connection with RHI-like paradigms are clearly problematic if the original basis for claims of changes in subjective body ownership during rubber hand exposure is not being correctly interpreted.
It is sometimes claimed (e.g. Ehrsson et al., 2021; Tsakiris, 2010) that RHI effects reflect a combination of top-down and bottom-up processes. However, these top-down effects are believed to be distinct from demand characteristic effects and RHI response is considered to be “resistant to suggestions, thoughts, and high level conceptual knowledge” (Ehrsson et al., 2021). Only a few researchers have stated theoretical positions consistent with the theory that demand characteristics play an important role in these effects (Alsmith, 2015; Dieguez, 2018). An influential theory proposes that top-down effects arise when internal representations of body image are matched against incoming sensory signals (Tsakiris, 2010; though see Chancel & Ehrsson, 2020 for a counter-argument that apparently top-down effects are attributable to multisensory congruency). However, given that demand characteristics have not been controlled in existing RHI studies, it is premature to consider theoretical proposals of top-down effects which are not related to demand characteristics.
In sum, agreement with three different subjective statements is the standard basis for linking all aspects of RHI phenomena with body ownership. Propensity to agree with these reports is strongly related to the domain general ability for phenomenological control. Experiments wishing to link RHI experience specifically with subjective body ownership must take this relationship into account or their results cannot be taken to indicate general properties of human body ownership.
Acknowledgments
We would like to thank Anil Seth and Maxine Sherman for comments on early versions of the paper. We also thank Alex Holcombe, Sebastian Dieguez, and Martin Riemer for their time and effort to improve our submission as editor and reviewers. Finally, the authors are grateful to the Dr. Mortimer and Theresa Sackler Foundation.
Ethics
The data used in the present study were from Lush et al. (2020) and retrieved from https://osf.io/huwxd/.
Data Accessibility Statement
All data and analysis scripts are available at https://osf.io/d5hjp/.
Competing Interests
The authors declare no conflicts of interest.
Footnotes
Previous versions of this paper referred to a supplementary table that described the top 20 or 30 cited papers in the RHI field, according to certain stated selection criteria. A reviewer initially rejected the table (of 20) as invalid because it only included studies that used subjective ratings, in line with the stated selection criteria. An updated version (of 30) included top cited studies not limited to those that used subjective ratings. This new table was also rejected by the reviewer as being an inaccurate review of the literature because it didn’t include recent review papers, again ignoring the stated criteria on which the papers were selected. We no longer refer to this table here as it was a distraction from the simpler point that the asynchronous stroking condition is invalid as a control condition.