Reputation: 125
I have a dataframe containing observations for two sets of data (A,B), with dataset and observation type given by the column names :
mydf <- data.frame(meta1=paste0("a",1:2), meta2=paste0("b",1:2),
A_var1 = c(11:12), A_var2 = c("p","r"),
B_var1 = c(21:22), B_var2 = c("x","z"))
I would like to reshape this dataframe so that each row contains observations on one set only. In this long format, set and column names should by given by splitting the original column names at the '_':
mydf2 <- data.frame(meta1=rep(paste0("a",1:2),2),
meta2=rep(paste0("b",1:2),2),
set=c("A","B","A","B"),
var1 = c(11:12),
var2 = c("a","b","c","d"))
I have tried using 'gather' in combination with 'str_split','sub', but unfortunately without success. Could this be done using tideverse functions?
Upvotes: 2
Views: 304
Reputation: 3722
Yes you can do this with tidyverse
!
You were close, you need to gather
, then separate
, then spread
.
new_df <- mydf %>%
gather(set, vars, 3:6) %>%
separate(set, into = c('set', 'var'), sep = "_") %>%
spread(var, vars)
hope this helps!
Upvotes: 1