Social media is a routine part of everyday life for millions of people worldwide. How does engaging with social media shape enduring memories of that experience? This question is important given the popularity of certain types of content on social media platforms, such as content widely known as “fitspiration”. Two experiments involving 510 US adults (mean age = 36.82) examined memory for food- and fitness-related social media images that individuals write comments about, as well as memory for other images in the context. We demonstrate that commenting on social media images boosts memory for them and weakly affects memory for conceptually related images in the same context. Exploratory analyses revealed correlations between self-reported disordered eating symptomology and the effects of commenting on memory. These findings demonstrate that how people engage with social media has implications for enduring memories of that content and may relate to behaviors and attitudes in their offline lives, such as eating and body image.
Over the past decade, social media platforms have evolved rapidly. Instagram launched ten years ago and now has over one billion users (Instagram, 2020). This rapid growth may relate to social pressure to be active on social media (Baumer et al., 2013; Brody, 2018; Pennington, 2020). While many current social media platforms are popular and growing quickly, the platforms differ in levels of engagement and user demographics. For example, Facebook has more users (3 billion) than Instagram (1 billion) (Facebook, 2020; Instagram, 2020), and the two platforms have been shown to differ in levels of engagement, motivations for use, and user demographics (Mittal et al., 2017; Voorveld et al., 2018). Here we focus on popular social media content that features dieting and fitness inspiration, widely known as “fitspiration” or “fitspo”. Instagram in particular is thought to be preferred among young people (Shane-Simpson et al., 2018), and the “fitspiration” hashtag is more prevalent on Instagram than on Facebook: as of this writing, a Facebook search for “#fitspiration” returns 93,000 posts, compared to 19.6 million posts on Instagram.
Many aspects of popular Western and, increasingly, non-Western cultures have idealized thin body types through media (Becker et al., 2011; Nasser, 2009; Pavlova et al., 2010). Recently, a movement has emerged to encourage media consumers to shift the emphasis from being thin to being “fit,” “toned,” “strong,” and “healthy” in order to motivate people to adopt a healthier lifestyle (Abena, 2019; Tiggemann & Zaccardo, 2018). This movement has been called “fitspiration,” or “fitspo” for short, as it encompasses content that is intended to inspire people to become fit through behaviors such as exercise and healthy eating rather than trying to attain low body weight. This “fitspiration” content typically features adults with toned and muscular bodies engaging in exercise and meal-prepping for workout regimens. Fitspiration is also very popular: a search on Instagram using the “fitspo” hashtag reveals over 74.8 million associated posts. Related hashtags, including #fitnessgoals (tagged on over 13.6 million posts), #fitnessmotivation (tagged on over 121.7 million posts), and #fitnessmodel (tagged on over 58.9 million posts), are similarly popular.
While “fitspiration” is often promoted as positive motivation to live a healthy lifestyle (Easton et al., 2018; Ghaznavi & Taylor, 2015; Tiggemann & Zaccardo, 2018; see also Abena, 2019), it has been shown to contain harmful messages and content surrounding thinness, weight, food restriction, and fat-stigmatization (Boepple & Thompson, 2016). Prior studies reveal links between engaging with fitspiration content and disordered eating, compulsive exercise, negative affect, and body dissatisfaction (Fardouly et al., 2018; Griffiths & Stefanovski, 2019; Holland & Tiggemann, 2017; Prichard et al., 2018, 2020; Robinson et al., 2017; Tiggemann & Zaccardo, 2015).
Generally, social media platforms include features that encourage users to engage with the content, for example through commenting, sharing, or liking images. In the next section we provide background on the relationship between talking about, or commenting on, the things we see and do in everyday life, including social media content, and subsequent memory for those things. This sets the stage for the present research examining the impact of how we engage with food and fitness related social media content on later memory for that content.
Memory for referents and contexts
Language is shaped by the things in the world around us. In particular, the way speakers refer to different entities in the world is shaped both by the properties of the intended referent and by other candidate referents in the local context (Chambers et al., 2002; Landragin, 2006; Olson, 1970; Osgood, 1971; Pechmann, 1989). Determinations about what information is and is not relevant are influenced by a variety of factors, including relevance to the task at hand, whether the information is known to both speaker and addressee, and spatial proximity to the intended referent (Brown-Schmidt & Tanenhaus, 2008; Chambers et al., 2004; Heller et al., 2016; Nadig & Sedivy, 2002).
It is well known from work in the memory literature that the act of generating and producing information improves memory over passively receiving information. For example, saying words aloud produces better memory for those words than simply reading them (Jacoby, 1978; Macleod et al., 2010; Slamecka & Graf, 1978). Additionally, studies of conversation show that conversational partners are more likely to recall, and also repeat, what they said themselves in conversation compared to what was said to them (Knutsen & Le Bigot, 2014; Ross & Sicoly, 1979). Likewise, measures of memory for what was talked about, such as for images that speakers refer to in a communicative task, show that speakers are more likely to correctly recognize those pictures after a delay compared to listeners (McKinley et al., 2017; Yoon et al., 2016; Zormpa et al., 2019). Other evidence suggests that both speakers and listeners are more likely to correctly recognize information from the context when it is related to what was discussed. Yoon, Benjamin, and Brown-Schmidt (2021) asked pairs of participants to play a game where they took turns asking each other to click on images, e.g. “Click on the striped sock”, and later tested memory for both the referent (striped sock) as well as unmentioned items in the context that were related to the target (dotted sock) or not related (wooden chair). They found that both speakers and listeners had better memory for the related over unrelated items in the context, indicating that referring to an item boosts memory not only for that item but also for other related items that were not talked about. This memory boost for speakers may reflect the fact that designing an appropriate referring expression requires attending both to the to-be-referenced item and to the other items in the context (Brown-Schmidt & Tanenhaus, 2006; Pechmann, 1989). 
Conversely, the boost for listeners may reflect the fact that the process of interpreting a referring expression as it unfolds in time requires considering multiple candidate referents in the context that are temporarily consistent with the unfolding referring expression (Eberhard et al., 1995).
Memory for social media and the present work
Extending the insights from this prior work in laboratory-based referential communication tasks to the domain of social media raises the question of how typical behaviors within social media apps, such as viewing and commenting, might affect subsequent memory. Using real Instagram posts as a test case, Zimmerman and Brown-Schmidt (2020) examined how the act of commenting on a social media image affected subsequent memory for that image. Participants were asked to view a series of posts, one by one, over a series of trials. For half of the posts, participants were prompted to write a comment about the post, and for the other half, they were asked to simply view the posts as they would their own feed. A subsequent recognition memory test revealed significantly better memory for the posts that participants had generated comments about. This commenting-related memory boost held for all of the image types tested, including posts about ostensibly “healthy” food (e.g., salad and fish), “unhealthy” food (e.g., burgers and cake), cats, dogs, and nature. Moreover, correct recognition increased with the length of the generated comment. The motivation for testing memory for food-related images originated in an interest in understanding the relationship between social media posts about food and self-reported eating behavior. While an analysis of individual differences revealed no systematic relationship between memory for Instagram posts and eating behavior, the findings were clear that engaging with this form of social media impacts enduring memory for that content, whatever that content may be.
In the present study, we sought to build upon these findings in two ways. First, we investigated how the act of commenting on one social media image affects memory for other images in the immediate context. Consider that certain features of Instagram, such as the “explore” tab and user profiles, allow the viewer to see multiple images at once, arranged in a grid. Here, we presented images in a similar layout and asked participants to generate a comment about one of them. After a brief delay, a subsequent memory test probed memory for both the commented-upon images and the other images in the same array. Second, we created a stimulus set that included two highly popular types of imagery seen on social media platforms such as Instagram: food-related posts, and fitness-related, or “fitspiration”, posts. Because food and fitness are often inextricably linked in modern culture and social media, and because prior work has reported biased cognitive processing of both food and body shape-related material among individuals with eating disorders (Aspen et al., 2013; Leehr et al., 2016; Morris et al., 2001; Parasecoli, 2005; Paslakis et al., 2017; Placanica et al., 2002; Shafran et al., 2007), we explored links between memory for food and fitness related posts.
Experiment 1
Methods
This experiment was preregistered on the Open Science Framework (https://osf.io/5s8kv).
Participants
Participants were recruited and compensated through an online recruiting platform (Qualtrics Panels). Criteria for inclusion were that the participant was a self-reported native English speaker or reported being fully competent in speaking, reading, and writing English, had familiarity and/or experience with Instagram, and was between the ages of 18 and 65. We intentionally used a broad age range in order to have a sample that would generalize to a wide variety of internet users. Participants were compensated by the panel provider directly. Qualtrics Panels removed nonsensical responses in order to achieve the planned final sample size of 200 quality responses.
Due to oversampling, the final sample included 210 participants. They reported an average age of 38 years (range: 18-65) and that they were either native English speakers (n=171) or fully competent but not native (n=39). One-hundred and nine participants reported their gender as female, 95 as male, and 6 did not report their gender. When asked about their familiarity with social media platforms, all participants reported familiarity with either Instagram or Facebook, and 167 participants (80%) reported having an Instagram account and frequently using it.
Materials and Procedure
The materials were images assembled for this purpose by the first author from a large number of posts on existing Instagram accounts.
In the first phase of the task (study phase), the images were presented to participants in 3 x 3 arrays, mimicking the layout seen on Instagram in the “explore” tab or in an individual’s profile (Figure 1). Over a series of 30 trials, participants saw a 3 x 3 array with 9 images on each trial. On each of the 30 trials, one of the images in the array (the target) was indicated by a red box. The location of the 9 images within each grid was randomized such that the target and other images could appear in any of the 9 spots in the grid. Thus, there was no systematic relationship between the location of the target and the location of the other images. A response box appeared below the 3 x 3 array, and participants were asked to type a comment about the indicated target picture. A response was required on every trial to ensure participants adhered to the instructions, though there was no minimum comment length. To elicit realistic commenting behavior, participants were instructed to treat the experience as they would when browsing pictures on their own accounts. Once the participant typed the comment for the target, they clicked a button to proceed to the next trial, in which they saw a new grid of images. Note that unlike a real social media feed, where one chooses which posts to comment on, we indicated to participants which image to generate a comment about. This design choice was important because it provided the experimental control needed to determine which pictures were commented on, thus controlling the number of trials in each condition of interest.
The images were from one of five categories of Instagram posts (dogs, cats, nature, food, fitness). The posts featuring dogs, cats, and nature1 served as control images. Posts featuring food and posts featuring women and men engaging in physical fitness activities, including exercising or spending time at the gym, served as critical stimuli. We focused on healthy food and physical fitness activities to emulate “fitspiration” content as closely as possible, with the typical posts featuring images of meals (e.g., fish and salad bowls) and adults doing yoga or lifting weights. In order to mirror the typical “fitspiration” content that one is likely to see on social media, we selected images that depicted men and women with fit body types. Hashtags searched to gather such material included #fitspiration, #fitnessgoals, #fitnessmotivation, #cleaneating, and other related hashtags. Note that, as in Zimmerman and Brown-Schmidt (2020), determination of what was considered to be an image of “healthy” food and physical fitness activities was based on group discussion and popular cultural beliefs rather than quantitative analysis of the nutritional content (for food images) or of the fitness activity (for fitness images). Please see Appendix for a description of the critical images.
In sum, during the study phase of the task each participant viewed a series of 30 trials, each of which featured a 3x3 array of photographic images. Trials were presented in a random order, and the participants generated a comment about one of the images on each of the 30 trials (12 food, 12 fitness, 6 control). Across the 30 study trials, participants saw a total of 150 control, 60 food and 60 fitness images. We intentionally used a 5:4 ratio of control to food and fitness images to ensure variety in the types of images, similar to a typical experience browsing social media.
Following the first phase of the task, participants completed 17 math problems that involved multiplication, division, and addition as a distractor task. The aim of the distractor task was to bring memory performance off-ceiling as memory for images tends to be quite good (Shepard, 1967). The final phase of the task was a recognition memory test for the food and fitness images that had appeared in the first phase of the task; we did not test memory for the control images because the focus of the work was on how “fitspo” related images are remembered. In the recognition memory test, participants were presented with a series of 240 images, one at a time, and were asked to indicate whether each image was “old” (having appeared in the first phase of the task) or “new” (having not appeared in the first phase of the task). Half of the items seen at test were old, and half were new. Critically, the 120 new images were matched pairwise to each of the 120 old items such that the new and old items were similar (e.g., two different images of a bowl of almonds). The 240 images were presented in a random order.
In total, the materials used in this study consisted of 390 Instagram images, 120 of which were food images, 120 were fitness images, and 150 were control images. The critical images were counterbalanced across ten lists, rotating which food and fitness images were shown to participants in the exposure phase (“old” images at test) and which were not shown to participants during exposure (“new” images at test). For old images, we also counterbalanced across lists whether the participant commented on them or not.
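For illustration, the rotation of images through conditions across lists could be implemented with simple modular arithmetic. The following Python sketch is hypothetical (the function and its role coding are illustrative, not the actual materials-preparation scripts): it rotates each of the matched image pairs through old/new status and, for old images, commented/viewed status across lists.

```python
# Hypothetical sketch of counterbalancing matched image pairs across lists.
# Each pair contains two similar images (members 0 and 1); across lists we
# rotate which member appears at study ("old") and whether the old member
# is a commented-upon target or a merely viewed image.
def assign_roles(pair_index, list_number, n_lists=10):
    """Return (old_member, commented) for one matched pair on one list."""
    rotation = (pair_index + list_number) % n_lists
    old_member = rotation % 2              # which member of the pair is shown at study
    commented = (rotation // 2) % 2 == 0   # whether the shown member is a comment target
    return old_member, commented
```

Across the ten lists, each pair cycles through all four combinations of old member and commenting status, which is the essential property a counterbalancing scheme of this kind must satisfy.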
Additional measures
Following the recognition memory task, participants completed the Eating Disorder Examination Questionnaire, EDE-Q (Fairburn et al., 2014). The EDE-Q is a self-report, shortened version of the Eating Disorder Questionnaire, which is used to measure both the frequency and severity of the behavioral and psychopathological features of eating disorders. The EDE-Q provides a global score of severity along with four subscale scores that target certain aspects of eating disorder psychopathology, such as restraint, eating concern, shape concern, and weight concern. The range of ratings for each item is 0 to 6, with 6 being the most frequent or severe. Subscale scores are calculated by averaging the ratings for the corresponding items, and a global score is calculated by dividing the sum of the four subscale scores by 4 (Fairburn et al., 2014). The EDE-Q is shown to be a valid measure of eating disorder symptomology (Aardoom et al., 2012; Mond et al., 2004) and has high internal consistency and test-retest reliability (Luce & Crowther, 1999; Rizvi et al., 2000), with higher scores reflecting higher symptomology.
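As a concrete illustration of the scoring rules just described, the following Python sketch computes subscale and global scores from item ratings. The subscale-to-item groupings passed in are placeholders; the actual EDE-Q assigns specific questionnaire items to each subscale.

```python
# Illustrative EDE-Q scoring sketch. Each subscale score is the mean of its
# item ratings (each rated 0-6); the global score is the sum of the four
# subscale scores divided by 4 (Fairburn et al., 2014).
def ede_q_scores(ratings_by_subscale):
    """ratings_by_subscale maps each subscale name ('restraint', 'eating',
    'shape', 'weight') to a list of 0-6 item ratings."""
    scores = {name: sum(items) / len(items)
              for name, items in ratings_by_subscale.items()}
    scores["global"] = sum(scores[k] for k in
                           ("restraint", "eating", "shape", "weight")) / 4
    return scores
```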
Lastly, participants were also asked to report their age, gender, ability to read and write in the English language, education level, and familiarity with social media platforms such as Instagram.
Predictions
As described in our pre-registration (https://osf.io/5s8kv), the primary analyses focused on the food and fitness related images, based on our interest in fitspiration-related content. We also included the number of words used to describe the target object as an exploratory predictor variable.
Recall that prior findings demonstrated that commenting on social media posts boosts memory for those posts and that the correct recognition rate increases with the number of words in the comment (Zimmerman & Brown-Schmidt, 2020). If these findings extend to the format used in the present study, where images are presented in groups, as in the “explore” feature on Instagram, we would predict that participants would be more likely to accurately identify those images that they had commented upon and that accuracy would increase with comment length. Our second set of predictions concerned memory for the context. Based on prior findings that describing a focal target image in a conversational setting promotes memory for unmentioned, but related, images in the context (Yoon et al., 2016, 2021), we predicted better memory for non-referenced items in the context if they were from the same category as the target or from a related category. Thus, we expected that memory for non-referenced food (or fitness) images would be better if the participant had generated a comment for a different food (or fitness) image in that same scene. Similarly, if social media users consider food and fitness related images to be conceptually similar, we predicted that participants would be more likely to correctly recognize food items if the referenced item was a fitness-related image, compared to a control image (and vice versa). Lastly, by using complex scenes with multiple images, this study allows us to examine the relationship between comment length and memory for the context. Based on prior work, we expect longer comments to boost memory for the target; if this comment-length memory boost relates to processes of comparing that target to other entities in the context in order to characterize it with respect to the context (Brown-Schmidt & Tanenhaus, 2006; Pechmann, 1989), we would expect longer comments to boost context memory as well.
Finally, exploratory analyses addressed the relationship between memory for food and fitness related social media imagery and scores on the EDE-Q. Previous work indicates that individuals with eating disorders display biased behaviors such as approach (e.g., over-attention) and/or avoidance when exposed to disorder-salient stimuli (Aspen et al., 2013; Leehr et al., 2016; Morris et al., 2001; Parasecoli, 2005; Paslakis et al., 2017; Placanica et al., 2002; Shafran et al., 2007). These equivocal findings make predictions for the present study unclear. However, the present work is positioned to determine if stable individual differences in memory for food and fitness social media images exist, and if so, whether they relate to scores on the EDE-Q.
Analysis and Results
Descriptions of the target images during the study phase of the task were coded for length in terms of the number of words. On average, descriptions of the critical food and fitness items were 2.71 words long (min = 1, max = 19). Recall that each participant completed 240 old-new recognition memory trials where on each trial they saw one image and were asked to respond whether it was OLD (seen before in the first phase of the task) or NEW (not seen in the first phase of the task). Our analyses focus on recognition memory for the images (Figure 2). We use a signal-detection theoretic analysis of the response data (Wright et al., 2009), modeling the log odds of an “old” response in the memory task with a logit-link mixed effects model using the glmer function in the lme4 package in R (Bates et al., 2018). Participants (N=210) and items (240 unique images) were treated as random factors. We used the buildmer function in R (Voeten, 2020) to identify a parsimonious random effects structure for the model. The selected models included each of the fixed effects based on the experimental design, participants and items as random intercepts, and random slopes based on the model results using buildmer; these models are reported in the text.
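To make the signal detection framing concrete: with item type coded -.5 (new) vs. +.5 (old), the slope of the logit model plays the role of discrimination and the intercept the role of response bias. The following Python sketch (illustrative only; it ignores the random effects that the actual mixed model includes) shows how hit and false-alarm rates map onto these log-odds quantities.

```python
import math

def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1 - p))

def logit_sdt(hit_rate, false_alarm_rate):
    """Log-odds analogues of signal detection measures. With item type coded
    -.5 (new) vs. +.5 (old), the model slope equals logit(hit) - logit(FA)
    (discrimination) and the intercept equals their midpoint (bias; a
    negative value reflects a bias toward responding 'new')."""
    discrimination = logit(hit_rate) - logit(false_alarm_rate)
    bias = (logit(hit_rate) + logit(false_alarm_rate)) / 2
    return discrimination, bias
```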
The first model (Table 1) used weighted orthogonal Helmert contrasts (Cohen et al., 2002) given unequal numbers of old target, old non-target, and new images.2 The first contrast in the model tested for an effect of item type (old vs. new) on the likelihood of an “old” response. The second contrast specifically tested whether old images that the participant had commented upon were more likely to be successfully recognized than old images that were not commented on. The results of that model revealed a significant effect of item type, indicating that participants were significantly more likely to respond “old” if the image was in fact old, rather than new (b = 1.07, p<.0001). Additionally, participants were significantly more likely to correctly recognize images that they had commented upon, compared to ones that were viewed in the same array (b = 1.91, p<.0001). Lastly, the intercept term was negative (b= -1.17, p<.0001), indicating an overall response bias to respond new in the memory test, regardless of whether the image was old or new.
Fixed Effects | Estimate | SE | z-value | p-value |
(Intercept) | -1.173 | 0.100 | -11.670 | <.0001 |
Type (new = -.5, non-tgt = .5, target = .5) | 1.069 | 0.072 | 14.800 | <.0001 |
Comment (new = .15, non-tgt = -.35, target = .65) | 1.907 | 0.136 | 13.990 | <.0001 |
Random Effects | Variance | Std.Dev. | Corr | |
Item (intercept) | 0.11 | 0.34 | ||
Comment | 0.15 | 0.39 | -0.52 | |
Participant (intercept) | 1.97 | 1.40 | ||
Type | 0.91 | 0.95 | -0.60 | |
Comment | 3.38 | 1.84 | -0.62 | 0.98 |
Note. Experiment 1: Results of logistic mixed-effects model for old and new items by item type (new vs. old target and non-target images), and whether the participant generated a comment about that image. 210 participants; 240 items; 50,400 observations.
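The weighted contrast codes in Table 1 can be checked against the per-participant cell counts at test (120 new, 96 old non-target, and 24 old commented-target items, inferred from the design described above): weighting each code by its cell count, both contrasts sum to zero, i.e., each is centered despite the unequal cell sizes. A small Python check, with the counts as our inference rather than a reported figure:

```python
# Verify that the Table 1 contrast codes are centered given the unequal
# cell counts (counts inferred from the design: 240 test items per
# participant = 120 new + 96 old non-target + 24 old commented targets).
counts = {"new": 120, "non_target": 96, "target": 24}
type_contrast = {"new": -0.5, "non_target": 0.5, "target": 0.5}
comment_contrast = {"new": 0.15, "non_target": -0.35, "target": 0.65}

def weighted_sum(contrast, counts):
    """Sum of contrast codes weighted by cell counts; zero means centered."""
    return sum(counts[cell] * contrast[cell] for cell in counts)
```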
The second model examined the impact of generating a comment on memory for both the focus of the comment (the target) as well as non-focused elements in the context. This model examined old items only because the comment length variable is undefined for new items (Table 2). On 6 study trials, one participant did not type a coherent comment, thus the corresponding 24 test trials (1 target and 3 non-target test trials were associated with each study trial) were dropped from this analysis.3 The number of words used to describe the target during the first phase of the task was entered as a mean-centered exploratory predictor variable. For target images, the word count measure corresponded to the number of words that the participant used to describe that image in the first phase of the task. For non-target images, the word count measure corresponded to the number of words they had used to describe the target that was on the same screen as the non-referenced (non-target) image. In addition, the model contrasted four types of images as a function of their relationship to the image in the scene that the participant had commented upon. A dummy coding system was used, treating context items that were unrelated to the target as the reference level (i.e., memory for non-target food and fitness items when the target was a control image – dogs, cats or nature pictures). The Target fixed effect compares target memory to this baseline reference level. The Category match and Schema match fixed effects compare memory for non-target items from the same category as the target (Category-match), or from a schematically related category as the target (Schema-match) to the baseline.
Fixed Effects | Estimate | SE | z-value | p-value |
(Intercept) | -1.334 | 0.114 | -11.703 | <.0001 |
# of words | -0.012 | 0.021 | -0.582 | 0.560 |
Targets | 1.939 | 0.135 | 14.340 | <.0001 |
Category match | 0.138 | 0.050 | 2.748 | <.01 |
Schema match | 0.009 | 0.044 | 0.205 | 0.837 |
Target*words | 0.103 | 0.032 | 3.237 | <.01 |
Category*words | 0.027 | 0.024 | 1.124 | 0.261 |
Schema*words | 0.012 | 0.022 | 0.546 | 0.585 |
Random Effects | Variance | Std.Dev. | Corr | |
Item | 0.09805 | 0.3131 | ||
Participant | 2.30867 | 1.5194 | ||
Targets | 3.22811 | 1.7967 | -0.69 |
Note. Experiment 1: Results of logistic mixed-effects model for old items by the number of words used to describe the target image in the first phase of the task, and image type. Unrelated items coded as baseline; fixed effects of Targets, Category match and Schema match contrast correct recognition rate to baseline. 210 participants, 240 items, 25,176 observations.
Participants were significantly more likely to correctly recognize targets than baseline items (b = 1.94, p<.0001); the odds of correctly recognizing target images were estimated to be 6.95 times higher than for baseline (i.e., unrelated) images in the context. Participants were also more likely to correctly recognize non-targets from the same category as the named target (i.e., a food [or fitness] item when the participant had generated a comment about a different food [or fitness] item) compared to baseline (b = .14, p<.01), though this effect was much smaller, increasing the odds of correct recognition by only 1.15 times relative to baseline items. Non-targets from schematically related categories (i.e., a food [or fitness] item when the participant had generated a comment about a fitness [or food] item) were not remembered significantly better than baseline (b = .01, p=.84, OR=1.01). As unrelated images were entered as the reference level, the non-significant effect of the number of words (b = -0.01, p=.56) indicates that longer comments did not improve memory for these unrelated items in the context. Similarly, the lack of significant interactions between the number of words and the Category and Schema effects (ps>.2) indicates that longer comments did not significantly improve memory for category- or schema-matched items in the context. Memory for target images, however, was significantly related to comment length (b=.10, p<.01), evidenced by a larger target memory boost with longer comments.4
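The odds ratios reported above are simply the exponentiated model coefficients, since the model estimates effects on the log-odds scale:

```python
import math

def odds_ratio(log_odds_coefficient):
    """Convert a logit-model coefficient to an odds ratio."""
    return math.exp(log_odds_coefficient)

# e.g., the Table 2 target coefficient of 1.939 corresponds to an odds
# ratio of about 6.95, and the category-match coefficient of 0.138 to
# an odds ratio of about 1.15.
```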
Individual differences
The second model included a random intercept by participants, capturing individual variability in correct recognition of non-target food and fitness images when the target had been a control image (Table 2). The model also included a random slope by participants for Target memory, reflecting individual variability in the boost to memory for food and fitness targets over baseline. Internal reliability of these measures was assessed in two ways. First, we calculated the model-based reliability rho, which can be interpreted as the ratio of estimated over observed variance in theta, with values closer to 1.0 indicating better reliability (Cho et al., 2019). Rho for these two measures was good: .945 for the by-participant intercept reflecting Context memory, and .874 for the by-participant slope reflecting the boost to memory for Targets over non-targets. Second, we calculated split-half reliability by running the model in Table 2 on only the odd trials, and separately on only the even trials, and correlating the random by-participant effects. Split-half reliability was also high, with reven-odd = .932 for Context memory and reven-odd = .841 for Target memory. Exploratory bivariate correlations among these random by-participant effects and the covariates of participant age and EDE-Q scores are shown in Table 3 (for a graphical depiction of each of these relationships, see Appendix Figure A1).
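The split-half computation itself is straightforward once the per-participant random effects have been extracted from the odd-trial and even-trial models: the two vectors of estimates are simply correlated. A minimal sketch of that final step (the input vectors here are toy numbers, not the fitted estimates):

```python
# Sketch of the split-half reliability step: Pearson correlation between
# per-participant random effects estimated from odd vs. even trials.
def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5
```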
Restraint | Eating | Shape | Weight | Global | Context | Target | Age | |
Restraint | 1 | |||||||
Eating | 0.63 | 1 | ||||||
Shape | 0.48 | 0.59 | 1 | |||||
Weight | 0.61 | 0.75 | 0.71 | 1 | ||||
Global | 0.82 | 0.87 | 0.8 | 0.91 | 1 | |||
Context | 0.25 | 0.31 | 0.18 | 0.25 | 0.29 | 1 | ||
Target | -0.28 | -0.44 | -0.24 | -0.35 | -0.38 | -0.71 | 1 | |
Age | 0.04 | -0.18 | -0.09 | -0.11 | -0.09 | -0.09 | 0.21 | 1 |
Note. Experiment 1 bivariate correlations (N=210); bolded values indicate significant correlations at a corrected alpha level of .001. The dark box indicates subscales of the EDE-Q and the EDE-Q global score (see text for description). Context = correct recognition of unrelated items in context. Target = recognition boost for target over unrelated items.
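The split-half reliability computation described above reduces, at its final step, to correlating the per-participant random effects estimated from the odd and even halves. A minimal sketch of that step in Python (the analyses themselves were run in R; the effect values below are invented for illustration):

```python
import numpy as np

# Hypothetical by-participant random intercepts (context memory),
# estimated separately from odd- and even-numbered trials.
odd_half = np.array([0.42, -0.10, 0.88, -0.35, 0.15, 0.60])
even_half = np.array([0.40, -0.05, 0.80, -0.30, 0.20, 0.55])

# Split-half reliability: Pearson correlation of the two halves.
r_even_odd = np.corrcoef(odd_half, even_half)[0, 1]
print(round(r_even_odd, 3))
```

With the study's actual random-effect estimates in place of these invented values, this final correlation step yields the reported reven-odd figures.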
In addition to the expected correlations among the EDE-Q subscales and the EDE-Q global score, we note that context memory was positively associated with each subscale and the global score. By contrast, the relative increase in memory for target over context images was negatively related to each subscale and the global score. We also note that target and context memory were negatively associated (r = -0.71), indicating that participants who were more likely to correctly recognize the context images were less likely to show a boost in memory for the target over the context. This negative relationship may reflect the fact that the context memory measure is based on the intercept term in the model, which is influenced by the participant’s overall response bias; we will return to this point later.
Based on these exploratory analyses, we included the Global score on the EDE-Q as a participant covariate in a version of the model presented in Table 2 that included the three fixed effects for item type, the interaction of target memory and comment length, and interactions with the EDE-Q Global score. The results of this model were consistent with the bivariate correlations above: A significant effect of Global score indicated that context memory was positively related to the Global score on the EDE-Q (b = .39, z = 4.33, p<.0001). In addition, a significant interaction between Target memory and Global score (b = -.62, z = -5.82, p<.0001) indicated that higher EDE-Q Global scores were associated with a smaller memory boost for target over baseline images.
Discussion
The results of Experiment 1 reveal that when viewing social media images in arrays similar to the “explore” page and profile formats on Instagram, participants form memories of the images they view. Participants were more likely to correctly recognize the images that they had generated a comment on compared to images in the context that were viewed but not commented on. In addition, memory was positively related to comment length. These findings replicate and extend our prior finding of a commenting-related memory boost in a social media setting (Zimmerman & Brown-Schmidt, 2020) to a common viewing experience on social media where multiple images are seen at once. We also observed initial evidence that commenting on an image shapes memory not only for that image, but also for others in the context from the same category. This suggests that when a person does choose to comment on a social media image, they are not only changing the way that image is encoded in memory; this action also affects the way other conceptually related images in the scene are processed. Lastly, exploratory analyses of individual differences indicated that the severity of reported disordered eating behaviors was positively associated with memory for food and fitness related items in the context, but negatively related to the boost in memory that was observed for commenting on those images. This finding may indicate that participants who reported more frequent or severe disordered eating behaviors were more likely to distribute attention across the images within the array, rather than focusing more specifically on the image that they commented on.
Experiment 2
The purpose of Experiment 2 was to conceptually replicate Experiment 1 with a few methodological changes, detailed below. The experiment was pre-registered on the Open Science Framework (https://osf.io/9zmnq).
Methods
Experiment 2 was identical to Experiment 1, with three exceptions. First, in the first phase of the task the stimuli were presented as dynamic gif files (rather than static images as in Experiment 1), allowing presentation of the array of images first, followed by highlighting around the target image. This delayed indication of the target was intended to increase the likelihood that participants would notice the non-target images, similar to how a person browsing a social media feed may view multiple images before commenting upon one. Second, rather than an old/new recognition paradigm (as in Experiment 1), the test phase of the experiment was a two-alternative forced-choice (2AFC) task. The use of the 2AFC task allowed us to test the generalizability of the findings of Experiment 1 in a new task and had the added benefit of shortening the test phase of the study by half. Third, we used Mechanical Turk for participant recruitment, as this was the same recruitment platform used in our prior work (Zimmerman & Brown-Schmidt, 2020).
Participants
To determine the sample size, an a priori simulation-based power analysis using the simr package in R (Peter et al., 2019) was conducted, based on the estimated effect of a category match on memory in Experiment 1 (500 simulations, alpha = .05). That analysis revealed that a sample size of N=300 would achieve 90% power. Participants were recruited on Mechanical Turk and were paid $4.50 for their participation.
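The power analysis itself was run with simr in R; the logic of a simulation-based power analysis can be sketched generically in Python. The effect sizes, trial counts, and test used below are illustrative assumptions, not the study's actual model:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulated_power(n_participants, n_sims=500, z_crit=1.645,
                    p_baseline=.16, p_category=.18, trials_per_cell=36):
    """Monte Carlo power: simulate per-participant hit rates in two
    conditions and count one-sided z-tests that reach significance.
    All probabilities and trial counts here are illustrative."""
    significant = 0
    for _ in range(n_sims):
        base = rng.binomial(trials_per_cell, p_baseline, n_participants) / trials_per_cell
        cat = rng.binomial(trials_per_cell, p_category, n_participants) / trials_per_cell
        diff = cat - base
        z = diff.mean() / (diff.std(ddof=1) / np.sqrt(n_participants))
        if z > z_crit:
            significant += 1
    return significant / n_sims

print(simulated_power(300))
```

simr follows the same recipe, but simulates new responses from the fitted mixed-effects model and refits that model on each iteration.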
Inclusion criteria were identical to Experiment 1. The final sample of 300 participants reported an average age of 36 years (min=18, max=65), and participants reported they were either native English speakers (n=288) or fully competent but not native (n=12). One hundred reported their gender as female, 186 as male, 2 as non-binary, and 12 did not report their gender. When asked about their familiarity with social media platforms, all participants reported familiarity with either Instagram or Facebook, and 259 (86%) reported having an Instagram account and frequently using it.
Materials
The materials were identical to Experiment 1, except for the following changes:
During the first phase of the task, the 9-image arrays were formatted as gif files instead of static images. All 9 images first appeared together without cuing of the target; a green box appeared around the target after approximately 4 seconds. Participants were instructed to write a comment about the image in the green box using a comment box at the bottom of the screen.
In the second phase of the task, participants completed a series of 120 2AFC test trials. On each trial, participants were presented with two pictures: one “old” picture that they had seen during the first phase of the task, and one “new” picture. The pictures were selected to be visually similar, such as two different fruit bowls, or two different women leaping through the air (Figure 3). Participants were asked to click on the picture that they had seen before in the first phase of the task. After responding, the participant advanced to a new page and the next pair of pictures was presented. The materials were identical to those in Experiment 1, except that whereas at test in Experiment 1 participants saw the images one at a time (half old, half new), in Experiment 2 we paired similar images together on one screen, where one was always old and one was always new. As in Experiment 1, we only tested memory for the critical food and fitness-related items.
Predictions
If the findings of Experiment 1 generalize across the methodological changes made here, we would expect to replicate the finding from Experiment 1 and our prior work that commented-upon posts would be better remembered than posts that were passively viewed. Additionally, we would expect to replicate the finding that participants would be more likely to recognize context images if they were from the same category as the target. Lastly, exploratory analyses examine the relationship between memory and EDE-Q scores. Given the findings of Experiment 1, we hypothesize that higher symptomology in the EDE-Q will be positively related to context memory, but negatively related to the target memory boost.
Analysis and Results
Descriptions of the target images during the study phase of the task were coded for length in terms of the number of words (M = 4.66 words, range = 1:36), as this metric was positively related to memory for those images in our prior work.5 Recall that each participant completed 120 2AFC memory trials, where on each trial they saw two images (one old, one new) and were asked to select the old image (the one seen before in the first phase of the task).
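The comment-length coding is simply a word count, which is mean-centered before entering the models. A minimal sketch of that coding step, with invented comments:

```python
# Hypothetical comments; length = number of whitespace-separated words.
comments = [
    "Looks delicious",
    "I love this healthy bowl of fruit",
    "Great form",
]
counts = [len(c.split()) for c in comments]

# Mean-center the counts before entering them as a fixed effect.
mean_count = sum(counts) / len(counts)
centered = [n - mean_count for n in counts]
print(counts)  # [2, 7, 2]
```

Centering means the model's intercept and simple effects are evaluated at the average comment length rather than at zero words.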
Our analyses focus on accuracy in the memory test. We modeled the log odds of a correct response in the memory task using a logit-link mixed effects model. Participants (N=300) and items (240 unique images) were treated as random factors. As in Experiment 1, we used the buildmer function in R (Voeten, 2020) to identify a parsimonious random effects structure. The final models included each of the fixed effects based on the experimental design, participants and items as random intercepts, and random slopes based on the model results using buildmer; these models are reported in the text.
Accuracy for target images was 82% (SD of the by-participant means=16%), higher than accuracy for identifying non-targets that were from the same category as the target (55.5%, SD=12%), that were from a schematically related category (55.3%, SD=9%), and non-targets unrelated to the target image (54%, SD=11%). Analysis of the accuracy data included the number of words that the participant used to describe that image in the first phase of the task as a centered fixed effect. For non-target images, the word count measure corresponded to the number of words they had used to describe the target that was on the same screen as the non-target image. In addition, the model used three contrasts to capture the relationship of each image type to the image in the scene that the participant had commented upon. As in Experiment 1, context items that were unrelated to the target were coded as the reference level (i.e., memory for non-target food and fitness images when the target was a control image – dogs, cats or nature). The item type contrasts test whether memory for targets, non-targets from the same category as the target, and non-targets from a schematically related category differed from baseline.
The intercept term in the model measures memory for non-target images that were unrelated to the target image that the participant had commented on, as these unrelated items served as the baseline reference level in the contrast coding scheme. A significant intercept term in the model (Table 4) indicated that participants were more accurate than not when attempting to distinguish these unrelated non-target images from similar new images (b = .18, p<.0001). Participants were significantly more likely to correctly recognize targets than this baseline (b = 1.88, p<.0001); the odds of correctly recognizing target images were estimated to be 6.52 times higher than for unrelated images in the context. Unlike Experiment 1, accuracy for non-targets from the same category as the named target (i.e., memory for a food [or fitness] item when the participant had generated a comment about a food [or fitness] item) was not significantly higher compared to baseline (b = .075, p=.07). Likewise, non-targets from schematically related categories (i.e., memory for a food [or fitness] item when the participant had generated a comment about a fitness [or food] item) were not remembered significantly better compared to baseline (b = .052, p=.13). As unrelated images were entered as the reference level, the non-significant effect of the number of words (b = 0.003, p=.69) indicates that longer comments did not impact memory for unrelated items. The lack of significant interactions between the number of words and the category and schema effects (ps>.3) indicates that longer comments did not significantly improve memory for these related items in the context. The target memory boost, however, significantly interacted with comment length (b = .11, p<.0001), due to a larger target memory boost with longer comments.
Table 4. Accuracy on the 2AFC memory test in Experiment 2.

Fixed Effects | Estimate | SE | z-value | p-value |
(Intercept) | 0.177 | 0.049 | 3.589 | 0.000 | |
# of words | 0.003 | 0.009 | 0.394 | 0.694 | |
Targets | 1.875 | 0.085 | 21.970 | <.0001 | |
Category match | 0.075 | 0.041 | 1.819 | 0.069 | |
Schema match | 0.052 | 0.034 | 1.517 | 0.129 | |
Target*words | 0.110 | 0.019 | 5.644 | <.0001 | |
Category*words | -0.011 | 0.011 | -1.025 | 0.305 | |
Schema*words | -0.005 | 0.010 | -0.496 | 0.620 | |
Random Effects | Variance | Std.Dev. | Corr | ||
Participant | 0.07 | 0.27 | |||
Targets | 0.88 | 0.94 | 0.44 | ||
Item | 0.38 | 0.61 | |||
Targets | 0.38 | 0.61 | -0.42 | ||
Category match | 0.11 | 0.33 | -0.17 | -0.12 | |
Schema match | 0.06 | 0.24 | -0.48 | 0.09 | 0.60 |
Note. Accuracy on 2AFC memory test for 300 participants, 240 items, 36,000 data points. Bolded values indicate significant effects.
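The 6.52 odds ratio reported in the text follows directly from the Targets coefficient in Table 4: exponentiating a log-odds estimate gives an odds ratio, and the inverse logit of the intercept recovers the roughly 54% baseline accuracy. A quick check in Python (coefficient values taken from Table 4; the contrast dictionary merely illustrates the coding scheme described in the text):

```python
import math

# Unrelated context items are the reference level; each other image
# type gets an indicator contrast relative to that baseline.
contrasts = {
    "unrelated": (0, 0, 0),  # baseline reference level
    "target":    (1, 0, 0),
    "category":  (0, 1, 0),
    "schema":    (0, 0, 1),
}

b_intercept = 0.177  # baseline log-odds of a correct 2AFC response
b_target = 1.875     # log-odds boost for targets

odds_ratio = math.exp(b_target)
p_baseline = 1 / (1 + math.exp(-b_intercept))  # inverse logit

print(round(odds_ratio, 2))  # 6.52
print(round(p_baseline, 2))  # 0.54
```

The 54% figure lines up with the observed accuracy for unrelated non-targets reported above.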
Individual differences
Our primary analysis included a random intercept by participants, capturing individual variability in correct recognition of old, non-target images when the target had been a control image. The model also included a random slope by participants for target memory, reflecting individual variability in the boost to memory for target over baseline images. As in Experiment 1, we calculated the reliability of these measures in two ways: first with model-based reliability rho (Cho et al., 2019), and additionally with split-half reliability, calculated by running the model on the odd and even trials separately and correlating the random by-participant effects. For the by-participant intercept reflecting context memory, reliability was mediocre for both rho (.471) and split-half (reven-odd = .559). Similarly, for the by-participant slope for target over non-target memory, reliability was mediocre for both rho (.502) and split-half (reven-odd = .567). Planned exploratory bivariate correlations among these random by-participant effects, and covariates of participant age and their scores on the EDE-Q, are shown in Table 5 (for a graphical illustration of the bivariate correlations, see Appendix Figure A2). Due to the low reliability of the memory measures, these analyses should be interpreted with some caution.
Table 5. Experiment 2 bivariate correlations.

Restraint | Eating | Shape | Weight | Global | Context | Target | Age |
Restraint | 1 | |||||||
Eating | 0.68 | 1 | ||||||
Shape | 0.71 | 0.75 | 1 | |||||
Weight | 0.72 | 0.8 | 0.93 | 1 | ||||
Global | 0.86 | 0.88 | 0.94 | 0.95 | 1 | |||
Context | -0.04 | -0.17 | -0.05 | -0.1 | -0.1 | 1 | ||
Target | -0.18 | -0.39 | -0.19 | -0.26 | -0.27 | 0.65 | 1 | |
Age | -0.04 | -0.16 | -0.03 | -0.07 | -0.08 | 0.11 | 0.15 | 1 |
Note. Experiment 2 bivariate correlations (N=300); bolded values indicate significant correlations at a corrected alpha level of .001. The box indicates subscales of the EDE-Q and the EDE-Q global score (see text for description). Context = memory accuracy for unrelated items in context. Target = memory boost for target over unrelated items.
The expected correlations among the EDE-Q subscales and the EDE-Q global score are shown in the black box. Unlike Experiment 1, the bivariate correlations between context memory and the EDE-Q scores were not significant and, if anything, trended negative (in Experiment 1 these relationships were positive). Replicating the pattern of data in Experiment 1, however, the relative increase in target over context memory was negatively related to the EDE-Q subscales and the global score (r = -.27). Note that unlike Experiment 1, target and context memory were positively associated (r = .65). The positive correlation may reflect the fact that the 2AFC task measured the participants’ accuracy in discriminating old from new images, with discrimination abilities for target and context images tracking together. By contrast, because Experiment 1 used an old/new recognition task, the intercept term reflected a combination of overall response bias plus the recognition of context items.
Based on these exploratory analyses, we included the global score on the EDE-Q as a participant covariate in a version of the model presented in Table 4 that included the three fixed effects for item type and the interaction of target memory and comment length. The results of this model were consistent with the bivariate correlations above. The global score on the EDE-Q did not significantly predict baseline context memory (b = -0.004, z = -0.27, p=.79). However, the interaction between target memory and global score was significant (b = -0.28, z = -5.47, p<.0001): Participants who had a higher global score on the EDE-Q showed a smaller memory boost for target over baseline images.
Discussion
The results of Experiment 2 partially replicate Experiment 1. Using dynamic displays that indicated the target after a brief delay, and a two-alternative forced-choice memory task that required participants to directly compare previously viewed images with similar non-viewed images, we again found that participants were more likely to correctly recognize the images that they had generated a comment about. Moreover, memory for the commented-upon image was positively related to comment length. Unlike Experiment 1, memory was not significantly boosted for unmentioned but related items in the context. We note that the related context-memory boost observed in Experiment 1 was quite small. While the sample size for Experiment 2 was designed to achieve 90% power to replicate that effect, Experiment 1 may have overestimated the magnitude of that effect, if it in fact exists. Another possibility is that the methodological changes in Experiment 2 meaningfully altered the magnitude of the effect.
Regarding individual differences, while the severity of self-reported disordered eating behaviors was not significantly associated with memory for context images, we replicated the finding that the EDE-Q Global score was significantly, negatively related to the commenting-related memory boost.
General Discussion
A variety of laboratory findings show that people are more likely to remember things that they “generated” in conversation, compared to information that was heard (McKinley et al., 2017; Yoon et al., 2016; Zormpa et al., 2019), and that conversational participants also remember the context of everyday language use (Yoon et al., 2016, 2021). This memory boost for generated content, as well as for contextual information, may reflect the fact that uniquely specifying a referring expression requires attending both to the referent and to the contrasting item in the context (Brown-Schmidt & Tanenhaus, 2006). Here, we conceptually replicate this larger body of work, but pivot away from laboratory-based communication tasks to the ubiquitous phenomenon of social media communication. Examining enduring memory for social media content is of high social relevance because billions of people use platforms like Instagram and Facebook on a regular basis (Facebook, 2020; Instagram, 2020). We focus on “fitspiration” content due to its popularity and its alleged promotion of healthy and positive lifestyles (Easton et al., 2018; Ghaznavi & Taylor, 2015; Tiggemann & Zaccardo, 2018; see also Abena, 2019), despite its potentially problematic messaging surrounding themes of thinness and fat-stigmatization (Boepple & Thompson, 2016). Indeed, prior findings link fitspiration content with disordered eating, compulsive exercise, negative affect, and body dissatisfaction (Fardouly et al., 2018; Griffiths & Stefanovski, 2019; Holland & Tiggemann, 2017; Prichard et al., 2018, 2020; Robinson et al., 2017; Tiggemann & Zaccardo, 2015). Here we take a cognitive approach, examining memory for real food and fitness images posted to Instagram that we found using hashtags such as #fitspo, #fitspiration, #fitnessgoals, #fitnessmotivation, and #cleaneating.
The results of two experiments revealed that commenting on a social media post boosted subsequent memory for that post, with longer comments producing a larger memory boost. This finding replicates our prior work (Zimmerman & Brown-Schmidt, 2020) in displays that mimic the popular “explore” tab and user profiles on Instagram and related platforms. While Experiment 1 revealed a small but significant memory boost for context images when the participant had generated a comment about another item from the same category, this effect did not replicate in Experiment 2, calling into question the magnitude and/or generalizability of this initial finding. We speculate that unlike task-based conversation (Yoon et al., 2016, 2021), social media arrays may demand less consideration of the context when generating comments. If so, the degree to which items in the context are later remembered may be contingent on the degree to which they were considered relevant to the social or communicative behaviors unfolding at the time they were viewed.
While related context images were not particularly well-remembered, in both studies we observed that memory for images in the context was significantly above chance, indicating that even though participants did not generate comments about these images, they were nonetheless encoded into memory. This finding suggests that when engaging with social media, users will remember not only the specific material that they engaged with but also other material that was not elaboratively encoded. We can expect, then, that if a user comments on one “fitspo”-related image, they may also encode in memory other images in their feed as well. Given that #fitspo content may be problematic for some social media users (Prichard et al., 2018, 2020; Robinson et al., 2017; Tiggemann & Zaccardo, 2015), it is important to know how viewing and engaging with this content shapes memory and, potentially, future behavior. Note that, in order to create the conditions of interest, it was necessary to have experimental control over which images participants commented on. Given prior findings that making a choice about a response option improves memory for that option (Coverdale & Nairne, 2019), we speculate that in natural browsing experiences where one chooses what to comment on, the commenting-related memory boost would be even larger.
Finally, exploratory analyses of individual differences revealed – in both experiments – a link between the commenting-related memory boost and self-reported eating behaviors. Specifically, the higher the severity of self-reported disordered eating behaviors, the smaller the boost in memory for target over control images. One potential explanation for these findings is that participants with higher EDE-Q Global scores were less likely to focus attention on the target image, instead spreading attention to related food and fitness items in the immediate context. A non-mutually exclusive explanation is that users with problematic eating behaviors and body concern are already more likely to attend to any food- and body-related material, regardless of level of engagement, such that additional commenting confers little to no encoding benefit. Though further work is needed to explore the underlying mechanisms, the current findings suggest that individuals who report disordered eating behavior may represent this type of content differently in memory than persons who do not.
Limitations and Future Directions
This study has several limitations. We recruited for a broad age range in order to capture a wide sample of those who use the internet and social media sites. However, those most at risk for developing eating disorders tend to be young females (Hudson et al., 2007; Udo & Grilo, 2018; also see Cheng et al., 2019). Future studies may therefore benefit from recruiting younger participants.
This study was conducted via online platforms, which enabled us to capture a large, diverse sample. However, an in-lab design would allow for complementary measures of visual attention using on-line measures of gaze to different regions of interest in the Instagram displays. Given that gazing at an image is associated with better memory for it (Loftus, 1972), fixation patterns on social media may relate to subsequent memory.
Lastly, we used the EDE-Q to measure disordered eating symptomology in the general population rather than enrolling those with clinically significant eating disorders. Future studies may explore memory for food and fitness imagery on social media among clinical populations. It is also important to note that we relied on self-report of symptoms, rather than objective measures of real-world health behaviors. An interesting future direction for this work is to relate memory for social media content with objective measures of health-related behaviors and outcomes (e.g., weight loss, diet).
Conclusion
Social media is increasingly entrenched in modern society, with over one billion worldwide users of Instagram’s platform alone (Instagram, 2020). The everyday use of this platform, along with the increasing popularity of “fitspiration” content and its potential links to disordered eating, necessitates further understanding of the effects of interacting with this type of material in social media contexts. Our findings show that engaging with Instagram fitspiration content boosts recognition of the content that the user directly engages with, and that content the user merely views in passing is also encoded into memory. Exploratory analyses of individual differences in memory for “fitspiration” material reveal a relationship between memory for these images and self-reported unhealthy eating behaviors and body concern. Together, these findings have important potential implications for how millions of users are cognitively affected by content that they see and interact with on a daily basis, especially those with unhealthy behaviors.
Competing Interests
The authors have no conflicts of interest to declare.
Ethics Approval and Consent to Participate
The research procedures were approved by the Vanderbilt University Human Research Protections Program and all participants consented prior to participation in this research.
Funding
Preparation of this manuscript was supported in part by National Science Foundation Grant BCS 19-21492 to Sarah Brown-Schmidt.
Contributions
The authors jointly conceived the experimental approach. JZ, ADR, KL and SBS designed and ran the experiments. SBS analyzed the data. All authors read and approved the final manuscript.
Acknowledgements
Thank you to Lisa Fazio for helpful conversations about this research project.
Data Accessibility Statement
The raw de-identified data associated with this manuscript are available at https://osf.io/25wt4/. Materials are available upon request. Experiment 1 (https://osf.io/5s8kv) and Experiment 2 (https://osf.io/9zmnq) were both pre-registered on the Open Science Framework.
Appendices
Appendix A
Table A1. Example descriptions of selected “healthy” food and fitness items in the study.
Healthy food
Salad with grilled chicken, avocados and tomatoes
Grilled shrimp with brussels sprouts
Grilled vegetables
Orange smoothie
Acai bowl with berries
Tuna poke bowl with carrots
Zucchini noodles with tomatoes
Grilled vegetable sandwich
Roasted asparagus
Tofu grain bowl with broccolini
Women Fitness
Woman in sports bra and leggings taking a mirror picture
Woman with spandex lifting weights
Woman with shirt pulled up, stretching on a track
Woman in a bathing suit doing yoga
Woman in sports bra flexing bicep in a mirror
Woman in sports bra and spandex lying down on a bench press
Woman in leggings and sports bra doing yoga
Woman running outside
Woman lifting dumbbells in the mirror
Woman in matching sports bra and leggings standing next to weights taking a picture
Men Fitness
Man without shirt stretching in athletic gear
Man lifting dumbbells in gym
Man without shirt using a weight machine
Man without shirt lifting weight in mirror
Man flexing biceps posing for picture in gym
Man doing a pull-up in a gym
Man without shirt running outside
Man running with a rugby ball
Man without shirt lunging
Man posing in mirror with dumbbells in gym
Appendix B
Exploratory Analyses of Experiment 1
Our pre-registered analysis plan (https://osf.io/5s8kv) included exploratory analyses of timing information and also object category. We summarize those findings in what follows.
Timing
As an exploratory analysis, we extracted automatic timing information from Qualtrics regarding the time spent on each study trial. Unfortunately, the validity of this timing information was questionable, with the amount of time reportedly spent on individual study trials ranging from .034 to 4945.130 seconds. Adding this timing measure to the model presented in Table 2 required log-transforming and then mean-centering the measure to achieve convergence (Table B1). The results of the model were similar to the findings presented in Table 2, with a significant memory boost for Targets (z=14.53, p<.0001) and a significant interaction between Target memory and comment length (z=2.74, p<.01). The Category effect was also significant (z=2.74, p<.01). There was also a significant interaction between Target memory and study time (z=2.11, p=.035), due to a larger Target memory boost with longer study times. The fact that the effect of comment length on Target memory is still present even when study time (albeit measured imperfectly) is taken into account indicates that the commenting-related memory boost is not simply due to the amount of time spent viewing the target image. This finding is in line with evidence that the production-memory benefit is not simply a function of extra time on item (MacLeod et al., 2010).
Table B1. Experiment 1 timing analysis.

Fixed Effects | Estimate | SE | z-value | p-value |
(Intercept) | -1.334 | 0.114 | -11.674 | <.0001 | |
Study Time | 0.015 | 0.042 | 0.346 | 0.729 | |
Targets | 1.937 | 0.133 | 14.525 | <.0001 | |
Category match | 0.138 | 0.050 | 2.736 | 0.006 | |
Schema match | 0.008 | 0.044 | 0.177 | 0.860 | |
# of words | -0.013 | 0.022 | -0.623 | 0.533 | |
Time*Target | 0.132 | 0.063 | 2.113 | 0.035 | |
Time*Category | 0.023 | 0.053 | 0.428 | 0.669 | |
Time*Schema | -0.006 | 0.047 | -0.126 | 0.899 | |
Target*Words | 0.089 | 0.033 | 2.736 | 0.006 | |
Category*Words | 0.024 | 0.026 | 0.944 | 0.345 | |
Schema*Words | 0.013 | 0.023 | 0.561 | 0.575 | |
Random Effects | Variance | Std.Dev. | Corr | ||
Item | 0.10 | 0.31 | |||
Participant | 2.32 | 1.52 | |||
Targets | 3.12 | 1.77 | -0.69 |
Note. Experiment 1 Timing effects: Old/new recognition data for 210 participants, 240 items, 25,176 data points. Fixed effects include the number of words used to describe the target image in the first phase of the task, the match between the named target and the current item, and the log-transformed amount of time (in seconds) on the corresponding study trial. Unrelated items coded as baseline; fixed effects of Targets, Category match and Schema match contrast correct recognition rate to baseline. Bolded values indicate significant effects.
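The log-transform-and-center preprocessing of the timing measure described above can be sketched as follows. This is a minimal sketch: the function and argument names are ours, and the floor applied to non-positive recorded times is an assumption, not part of the reported analysis.

```python
import numpy as np

def log_center(times_sec, floor=0.001):
    """Log-transform, then mean-center, a per-trial timing predictor.

    Raw study times spanning several orders of magnitude (here, 0.034 to
    ~4945 s) can stall model convergence; logging compresses the long
    tail, and centering keeps the other coefficients interpretable at
    the average (log) study time. The `floor` guard against non-positive
    recorded values is an assumption, not the authors' procedure.
    """
    logged = np.log(np.maximum(np.asarray(times_sec, dtype=float), floor))
    return logged - logged.mean()
```

The centered predictor can then enter the model alongside its interactions with the Target, Category, and Schema terms.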
Image Type
The hit rates for correctly identifying food and fitness images at test in Experiment 1 are presented in Figure B1 as a function of target type and image type. As an exploratory analysis, we added image type (food or fitness) as a factor into a version of the model presented in Table 2. To simplify this analysis, because the interactions between comment length and Category and Schema effects were not significant in the Table 2 model, we removed them from this model (Table B2).
Table B2

| Fixed Effects | Estimate | SE | z-value | p-value |
| --- | --- | --- | --- | --- |
| (Intercept) | -1.338 | 0.114 | -11.746 | <.0001 |
| Type (food = -.5, fitness = +.5) | **0.447** | 0.080 | 5.561 | <.0001 |
| # of words | 0.001 | 0.013 | 0.078 | 0.938 |
| Targets | **1.941** | 0.135 | 14.367 | <.0001 |
| Category | **0.136** | 0.050 | 2.710 | 0.007 |
| Schema | 0.011 | 0.044 | 0.253 | 0.800 |
| Type*Words | **0.038** | 0.017 | 2.190 | 0.029 |
| Target*Words | **0.093** | 0.027 | 3.405 | 0.001 |
| Type*Target | **-0.504** | 0.098 | -5.137 | <.0001 |
| Type*Category | -0.098 | 0.100 | -0.976 | 0.329 |
| Type*Schema | -0.126 | 0.088 | -1.433 | 0.152 |
| Type*Target*Words | -0.049 | 0.038 | -1.266 | 0.205 |

| Random Effects | Variance | Std.Dev. | Corr |
| --- | --- | --- | --- |
| Item | 0.08 | 0.28 | |
| Participant | 2.32 | 1.52 | |
| Target | 3.22 | 1.80 | -0.70 |
Note. Experiment 1 Category effects: Old/new recognition test data for 210 participants, 240 items, 25,176 data points. Fixed effects include the number of words used to describe the target image in the first phase of the task, image type, and the match between the named target and the current item. Unrelated items coded as baseline; fixed effects of Targets, Category match and Schema match contrast correct recognition rate to baseline. Bolded values indicate significant effects.
The results of this model (Table B2) were similar to those of the primary model (Table 2), with a significant memory boost for Targets over the non-target baseline (z=14.37, p<.0001), a significant memory boost for items from the same Category as the target over baseline (z=2.71, p<.01), and a significant interaction between Target memory and comment length (z=3.41, p<.01). There was also a significant effect of image type (food vs. fitness) at baseline (z=5.56, p<.0001): when the participant was commenting on a control image (cat, dog, or nature), the hit rate for non-target fitness images was better than that for non-target food images. While this image type effect may reflect a general mnemonic benefit for images of bodies over food, another possibility is that idiosyncratic variability in the items is responsible. The image type effect interacted with comment length (z=2.19, p<.05), such that it was more pronounced with longer comments. Lastly, a significant interaction between Target memory and image type (z=-5.14, p<.0001) was due to a smaller target benefit for fitness than for food items; this may relate to the fact that memory for the fitness images was superior at baseline.
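The ±.5 coding of image type used in these models can be illustrated with a short sketch. The data frame and column names here are illustrative only, not the authors' code.

```python
import pandas as pd

# Hypothetical trial-level data; 'image_type' and 'is_target' are
# illustrative column names, not the authors' variable names.
trials = pd.DataFrame({
    "image_type": ["food", "fitness", "food", "fitness"],
    "is_target":  [0, 0, 1, 1],
})

# Centering the two-level factor at +/-.5 makes its coefficient the
# fitness-minus-food difference, and leaves lower-order terms
# interpretable as averages over type rather than simple effects
# at one reference level.
trials["type_c"] = trials["image_type"].map({"food": -0.5, "fitness": 0.5})

# The Type*Target interaction enters the model as a product column.
trials["type_x_target"] = trials["type_c"] * trials["is_target"]
```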
Exploratory Analyses of Experiment 2
Our pre-registered analysis plan (https://osf.io/9zmnq) included exploratory analyses of timing information and also object category. We summarize those findings in what follows.
Timing
As an exploratory analysis, we extracted the automatic timing information recorded by Qualtrics for each study trial. As in Experiment 1, the validity of this timing information was questionable, with the time reportedly spent on individual study trials ranging from 0.45 to 4413.05 seconds. This timing measure was log-transformed and mean-centered, then added to a version of the model presented in Table 4 (Table B3). The results of the model were similar to the findings presented in Table 4, with a significant memory boost for Targets (z=22.18, p<.0001) and a significant interaction between Target memory and comment length (z=5.19, p<.001). There was also a significant interaction between Target memory and study time (z=2.10, p=.036), reflecting a larger Target memory boost with longer study times. The fact that the effect of comment length on Target memory remains even when study time is taken into account indicates that the commenting-related memory boost is not simply due to the amount of time spent viewing the target image.
Table B3

| Fixed Effects | Estimate | SE | z-value | p-value |
| --- | --- | --- | --- | --- |
| (Intercept) | 0.178 | 0.049 | 3.611 | <.001 |
| Targets | **1.866** | 0.084 | 22.178 | <.0001 |
| # of words | 0.002 | 0.009 | 0.192 | 0.848 |
| Category match | 0.073 | 0.041 | 1.776 | 0.076 |
| Schema match | 0.051 | 0.034 | 1.487 | 0.137 |
| Study time | 0.021 | 0.033 | 0.631 | 0.528 |
| Target*Words | **0.102** | 0.020 | 5.186 | <.001 |
| Category*Words | -0.012 | 0.012 | -0.999 | 0.318 |
| Schema*Words | -0.001 | 0.010 | -0.068 | 0.946 |
| Target*Time | **0.122** | 0.058 | 2.100 | 0.036 |
| Category*Time | 0.006 | 0.045 | 0.144 | 0.885 |
| Schema*Time | -0.047 | 0.039 | -1.227 | 0.220 |

| Random Effects | Variance | Std.Dev. | Corr | | |
| --- | --- | --- | --- | --- | --- |
| Participant | 0.07 | 0.27 | | | |
| Targets | 0.82 | 0.91 | 0.43 | | |
| Item | 0.38 | 0.62 | | | |
| Targets | 0.37 | 0.61 | -0.42 | | |
| Category match | 0.11 | 0.33 | -0.17 | -0.12 | |
| Schema match | 0.06 | 0.24 | -0.48 | 0.08 | 0.60 |
Note. Experiment 2 supplemental timing analysis: Accuracy on 2AFC memory test for 300 participants, 240 items, 36,000 data points. The study time measure was log-transformed and mean centered. Unrelated items coded as baseline; fixed effects of Targets, Category match and Schema match contrast correct recognition rate to baseline. Bolded values indicate significant effects.
Image Type
Accuracy in identifying food and fitness images at test in Experiment 2 is presented in Figure B2 as a function of target type and image type. As an exploratory analysis, we added image type (food or fitness) as a factor into a version of the model presented in Table 4 that, for simplicity, did not contain the non-significant interactions between comment length and Category and Schema effects (Table B4).
Table B4

| Fixed Effects | Estimate | SE | z-value | p-value |
| --- | --- | --- | --- | --- |
| (Intercept) | 0.177 | 0.049 | 3.602 | <.001 |
| Type (food = -.5, fitness = +.5) | 0.126 | 0.093 | 1.348 | 0.178 |
| # of words | -0.002 | 0.005 | -0.390 | 0.697 |
| Targets | **1.875** | 0.085 | 22.053 | <.0001 |
| Category | 0.075 | 0.041 | 1.826 | 0.068 |
| Schema | 0.052 | 0.034 | 1.521 | 0.128 |
| Target*Words | **0.116** | 0.018 | 6.359 | <.001 |
| Words*Type | 0.011 | 0.008 | 1.401 | 0.161 |
| Target*Type | **-0.240** | 0.118 | -2.037 | 0.042 |
| Category*Type | 0.100 | 0.082 | 1.221 | 0.222 |
| Schema*Type | -0.015 | 0.068 | -0.224 | 0.823 |
| Target*Words*Type | -0.021 | 0.027 | -0.790 | 0.430 |

| Random Effects | Variance | Std.Dev. | Corr | | |
| --- | --- | --- | --- | --- | --- |
| Participant | 0.07 | 0.27 | | | |
| Targets | 0.88 | 0.94 | 0.44 | | |
| Item | 0.37 | 0.61 | | | |
| Targets | 0.36 | 0.60 | -0.41 | | |
| Category | 0.10 | 0.32 | -0.18 | -0.09 | |
| Schema | 0.06 | 0.24 | -0.48 | 0.08 | 0.61 |
Note. Experiment 2 Category effects: Accuracy on 2AFC memory test for 300 participants, 240 items, 36,000 data points. Fixed effects include the number of words used to describe the target image in the first phase of the task, image type, and the match between the named target and the current item. Unrelated items coded as baseline; fixed effects of Targets, Category match and Schema match contrast correct recognition rate to baseline. Bolded values indicate significant effects.
The results of this model (Table B4) were similar to those of the primary model (Table 4), with a significant memory boost for Targets over the non-target baseline (z=22.05, p<.0001) and a significant interaction between Target memory and comment length (z=6.36, p<.01). Lastly, a significant interaction between Target memory and image type (z=-2.04, p<.05) was due to a smaller target benefit for fitness than for food items; this may relate to the fact that memory for the fitness images was numerically better at baseline.
Footnotes
Note that sometimes images in the “nature” category contained human artifacts such as buildings (as in Figure 1).
The same pattern of results was obtained with non-weighted Helmert contrasts.
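The unweighted Helmert coding mentioned in the footnote above can be illustrated with a generic contrast matrix. This is a sketch under standard definitions; the actual contrast matrices used in the reported models may differ.

```python
import numpy as np

def helmert_contrasts(n_levels):
    """Unweighted Helmert contrast matrix for an n-level factor.

    Column j compares level j with the mean of all later levels, and
    every column sums to zero, so the contrasts are orthogonal to the
    intercept. A weighted variant would instead weight the later-level
    entries by observed group sizes rather than treating levels as
    equally frequent. Generic sketch, not the authors' contrast matrix.
    """
    k = n_levels
    C = np.zeros((k, k - 1))
    for j in range(k - 1):
        C[j, j] = 1.0
        C[j + 1:, j] = -1.0 / (k - 1 - j)
    return C
```

For a four-level condition factor this yields three columns, e.g. the first compares level 1 against the mean of levels 2 through 4.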
In coding the comment data we discovered 24 trials for which the comment was completely garbled, with no word-like units present, and an additional 845 trials for which a participant typed the identical comment for 4 or more of the 30 pictures that they were asked to comment on (e.g., repeatedly typing “nice”). While these considerations were not part of the pre-registered analysis plan, a post-hoc analysis that excluded the corresponding trials revealed the same pattern of findings as in the primary analysis presented in Table 2.
Additional exploratory analyses are presented in the Appendix.
A post-hoc analysis of the comment data revealed 646 target trials on which a participant typed the identical comment (e.g., “good”) on 4 or more trials. A version of the model presented in Table 4 that excluded the associated trials revealed the same pattern of results. Additional exploratory analyses are presented in the Appendix.
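The repeated-comment screening described in the footnotes above could be implemented along these lines. This is a sketch: the 'participant' and 'comment' column names are assumptions, not the authors' actual variable names.

```python
import pandas as pd

def flag_repeated_comments(df, min_repeats=4):
    """Return a boolean mask marking trials whose comment was typed by
    the same participant on `min_repeats` or more of their trials
    (e.g., typing "nice" over and over).

    Sketch of the post-hoc exclusion described in the footnotes; the
    'participant' and 'comment' column names are assumptions.
    """
    sizes = df.groupby(["participant", "comment"])["comment"].transform("size")
    return sizes >= min_repeats
```

Trials where the mask is True would then be dropped before refitting the model; exact-string matching is a simplification, since near-duplicate comments would pass unflagged.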