tidystats tutorial: An example

A semi-real life example of how to use tidystats.

In this post I will provide a semi-real life example of how to use tidystats. We will analyze the data of a replication study performed by Wissink et al. as part of the Reproducibility Project: Psychology (Open Science Collaboration 2015). The target of the replication study was Study 6 of Cox et al. (2008). The authors hypothesized that low levels of attachment anxiety and avoidance (attachment security) would predict an increased preference for close relationships with romantic partners. They also hypothesized that after a mortality salience induction, insecurely attached individuals should report an increased preference for a parent. They predicted that this effect would be specific to those with high levels of attachment anxiety and low levels of attachment avoidance.

To measure relationship preference, the authors created a cell phone task in which participants were asked to allocate minutes on a cell phone calling plan to each of four attachment sources (i.e., parent, sibling, romantic partner, close friend). Each participant receives 100 minutes and has to distribute them across the four attachment sources.

We will run the necessary analyses and save the results to a .csv file, using tidystats.

Before we begin running the analyses, we make some preparations.


# Load packages
library(tidyverse)
library(tidystats)

# Set a default ggplot theme
theme_set(theme_light())

# Create an empty list for tidystats
results <- list()

We load the tidyverse (because it’s amazing) and we load tidystats. After that, we create an empty list. This is the list to which we will add the output of the statistical analyses we run.

Next we will prepare the data for analysis. The data is part of the tidystats package, so we do not need to read it in. As for preparations, we set dental pain as the reference level of condition so that the variable is dummy coded correctly, we mean center the anxiety and avoidance scores, and we create two categorical variables using a median split to indicate which participants scored high or low on avoidance and anxiety. We will only use these two grouping variables for visualization and not for any of the statistical analyses.


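cox <- cox %>%
  mutate(
    # Make "dental pain" the reference level of condition
    condition = fct_relevel(condition, "dental pain"),
    # Mean center the scores; as.numeric() drops the matrix that scale() returns
    avoidance_c = as.numeric(scale(avoidance, center = TRUE, scale = FALSE)),
    anxiety_c = as.numeric(scale(anxiety, center = TRUE, scale = FALSE)),
    # Median splits, used for visualization only
    avoidance_group = if_else(avoidance >= median(avoidance), "high", "low"),
    anxiety_group = if_else(anxiety >= median(anxiety), "high", "low")
  )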
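To make the two transformations concrete, here is a minimal sketch on a handful of made-up scores. The `toy` data frame is purely hypothetical and not part of the Cox data. Subtracting the mean gives the same values as `scale(x, center = TRUE, scale = FALSE)`.

```r
library(dplyr)

# Hypothetical toy scores, not the Cox et al. data
toy <- tibble(anxiety = c(1.5, 2.0, 3.5, 4.0, 5.0))

toy <- toy %>%
  mutate(
    # Mean centering: subtract the sample mean, keep the original units
    anxiety_c = anxiety - mean(anxiety),
    # Median split: scores at or above the median count as "high"
    anxiety_group = if_else(anxiety >= median(anxiety), "high", "low")
  )

toy
```

After centering, the scores average to zero, so a value of 0 on `anxiety_c` corresponds to a participant with average anxiety. That is what makes the lower-order terms of an interaction model easier to interpret.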

Before we analyze the results, let’s first visualize the data and re-create Figure 7 and Figure 8 (but better).

Figure 7:

Figure 8:

Based on these figures, we can actually already say that the results do not replicate. However, not running an analysis would make this a terrible example for tidystats, so let’s nevertheless run the analyses.
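For readers who want to draw a figure of this kind themselves, the sketch below shows one plausible layout: minutes allocated to a parent plotted against anxiety, colored by condition and split by avoidance group. The layout is my assumption and not necessarily that of the original figures. The data are simulated stand-ins, and the level name "mortality salience" is also assumed.

```r
library(ggplot2)

set.seed(1)

# Simulated stand-in data; column names follow the post, values are made up
toy <- data.frame(
  condition = rep(c("dental pain", "mortality salience"), each = 50),
  anxiety = runif(100, min = 1, max = 7),
  avoidance_group = sample(c("low", "high"), 100, replace = TRUE),
  call_parent = runif(100, min = 0, max = 100)
)

# One regression line per condition, one panel per avoidance group
p <- ggplot(toy, aes(x = anxiety, y = call_parent, color = condition)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~ avoidance_group, labeller = label_both) +
  labs(
    x = "Attachment anxiety",
    y = "Minutes allocated to parent",
    color = "Condition"
  )

p
```

With the real data you would replace `toy` with `cox`, which already contains these columns after the preparation step.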

The analyses consist of four regression analyses, one for each attachment source. Each model predicts the minutes allocated to that source from condition, avoidance, and anxiety, including all of their interactions.


lm_call_friend <- lm(call_friend ~ condition * avoidance_c * anxiety_c, 
  data = cox)
lm_call_siblings <- lm(call_siblings ~ condition * avoidance_c * anxiety_c, 
  data = cox)
lm_call_partner <- lm(call_partner ~ condition * avoidance_c * anxiety_c, 
  data = cox)
lm_call_parent <- lm(call_parent ~ condition * avoidance_c * anxiety_c, 
  data = cox)
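If the `*` notation in these formulas is unfamiliar, the following small sketch uses base R's `terms()` to list what it expands to. No data is needed for this, only the formula itself.

```r
# The `*` operator expands to all main effects plus every interaction
f <- call_parent ~ condition * avoidance_c * anxiety_c

term_labels <- attr(terms(f), "term.labels")
print(term_labels)
```

The last of the seven terms, `condition:avoidance_c:anxiety_c`, is the three-way interaction that the hypotheses are about.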

As in the original, there was no significant three-way interaction for time allocated to a close friend, b = 0.087, SE = 5.57, t(192) = 0.016, p = .99, or to a sibling, b = -2.94, SE = 3.61, t(192) = -0.81, p = .42.

However, there was also no significant three-way interaction for time allocated to a romantic partner, b = -2.95, SE = 7.01, t(192) = -0.42, p = .67, or to one’s parents, b = 5.80, SE = 5.61, t(192) = 1.03, p = .30.
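As a quick sanity check on numbers like these, the t value of a regression coefficient is simply the estimate divided by its standard error, and the two-sided p value follows from the t distribution with the residual degrees of freedom. The sketch below recomputes both from the rounded estimates reported above, so small rounding differences are to be expected.

```r
# Estimates and standard errors as reported in the text above
b  <- c(friend = 0.087, sibling = -2.94, partner = -2.95, parent = 5.80)
se <- c(friend = 5.57,  sibling = 3.61,  partner = 7.01,  parent = 5.61)
df <- 192  # residual degrees of freedom, reported as t(192)

# t = b / SE; two-sided p value from the t distribution
t_value <- b / se
p_value <- 2 * pt(abs(t_value), df = df, lower.tail = FALSE)

round(data.frame(b, se, t = t_value, p = p_value), 3)
```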

The original authors summarized the results of the first two regression analyses (close friend and siblings) as ps >= .21. We will use tidystats to report all results, rather than summarizing the two regression analyses to a single number.

Below you can see how to add the four regression analyses to the results list.


results <- results %>%
  add_stats(lm_call_friend) %>%
  add_stats(lm_call_siblings) %>%
  add_stats(lm_call_partner) %>%
  add_stats(lm_call_parent)

And we are actually already done with the analyses. So let’s convert the results to a file that we can share with others, using write_stats().


write_stats(results, "results.csv")

The result is a simple .csv file that can be found here. You can add this file to the supplemental materials of your manuscript or simply make it part of your data package.

That’s it already! My plan is to have more examples of how to use tidystats in the future and I will aim to have examples that differ in how to analyze the results. For example, the next post will cover an example with different kinds of analyses.

I’m also interested in how you experience the use of tidystats, so if you have any feedback, please don’t hesitate to let me know! You can find my contact information on this website or you can go to the GitHub page of tidystats.

Cox, Cathy R, Jamie Arndt, Tom Pyszczynski, Jeff Greenberg, Abdolhossein Abdollahi, and Sheldon Solomon. 2008. “Terror Management and Adults’ Attachment to Their Parents: The Safe Haven Remains.” Journal of Personality and Social Psychology 94 (4): 696. https://doi.org/10.1037/0022-3514.94.4.696.

Open Science Collaboration. 2015. “Estimating the Reproducibility of Psychological Science.” Science 349 (6251): aac4716. https://doi.org/10.1126/science.aac4716.