Parseltongue
Parseltongue

Reputation: 11707

Convert from wide to long format by matching column suffixes in dplyr

Let's say I have the following data:

dat <- read.table(text="pairing feelings_pre feelings_post ingroup_pre ingroup_post
0 22.0 22.6 66.3 67.5
1 22.0 28.5 63.2 64.6", header=T)

I am trying to transform this data from wide format to long so that I can plot pre and post scores as a line chart in ggplot. So I need a column that is "pre", that is set to 1 if the column of interest has the "_pre" suffix, and set to 0 if the column has a "_post" suffix.

The partial example of the resulting dataframe would look like:

dat <- read.table(text="pairing variable value pre
0 feelings_pre 22.0 1
0 feelings_post 22.6 0
0 ingroup_pre 66.3 1
0 ingrop_post 67.5 0", header=T)

I have been trying to use spread and separate with a regex matcher, but have not been able to get it to work. Any ideas?

Upvotes: 1

Views: 399

Answers (2)

tmfmnk
tmfmnk

Reputation: 40171

A bit different possibility could be:

dat %>%
 gather(variable, value, -pairing) %>%
 mutate(pre = ifelse(sub(".*_", "", variable) == "pre", 1, 0)) 

  pairing      variable value pre
1       0  feelings_pre  22.0   1
2       1  feelings_pre  22.0   1
3       0 feelings_post  22.6   0
4       1 feelings_post  28.5   0
5       0   ingroup_pre  66.3   1
6       1   ingroup_pre  63.2   1
7       0  ingroup_post  67.5   0
8       1  ingroup_post  64.6   0

In this case it transforms the data from wide to long and then checks whether the part after _ in key (i.e. "variable") is "pre". If so, it assigns 1, otherwise 0.

Or using str_detect() from stringr:

dat %>%
 gather(variable, value, -pairing) %>%
 mutate(pre = str_detect(variable, "_pre") * 1)

Upvotes: 1

arg0naut91
arg0naut91

Reputation: 14774

Try:

library(dplyr)

dat %>% filter(pairing == 0) %>%
  gather(variable, value, -pairing) %>%
  mutate(pre = +(grepl("_pre", variable)))

Output:

  pairing      variable value pre
1       0  feelings_pre  22.0   1
2       0 feelings_post  22.6   0
3       0   ingroup_pre  66.3   1
4       0  ingroup_post  67.5   0

Note that this is if you'd like to filter out 0 pairing (as you don't have it in your example).

However, since you said this is partial, you'd just leave the filter part out and get also the results for pairing where it is equal to 1.

Upvotes: 3

Related Questions