Reputation: 411
Good evening,
I have a dataframe that looks like this at the moment:
ID Day1Var1 Day1Var2 Day2Var1 Day2Var2
1 1 2 3 4
2 5 6 7 8
and I would like to bring it to a somewhat long format that looks like this:
ID Day Var1 Var2
1 1 1 2
1 2 3 4
2 1 5 6
2 2 7 8
Is there an easy way to also use the prefix terms of variables ("Day1", "Day2"), to create a new variable called "Day" and fill the values with 1 and 2 automatically? This would be really useful, as I have three levels of prefixes (observation, day, week).
Thanks a lot!
Upvotes: 1
Views: 239
Reputation: 47340
We can split horizontally, change the names and bind with "Day" as and id:
library(tidyverse)
list(df1[1:3],df1[c(1,4:5)]) %>%
map(rename_all,gsub,pattern="Day.",replacement="") %>%
bind_rows(.id = "Day")
# Day ID Var1 Var2
# 1 1 1 1 2
# 2 1 2 5 6
# 3 2 1 3 4
# 4 2 2 7 8
Upvotes: 2
Reputation: 887691
Here is an option with melt
which takes multiple measure
columns
library(data.table)
melt(setDT(df1), measure = patterns("Var1", "Var2"),
variable.name = 'Day', value.name = c('Var1', 'Var2'))[order(ID)]
# ID Day Var1 Var2
#1: 1 1 1 2
#2: 1 2 3 4
#3: 2 1 5 6
#4: 2 2 7 8
df1 <- structure(list(ID = 1:2, Day1Var1 = c(1L, 5L), Day1Var2 = c(2L,
6L), Day2Var1 = c(3L, 7L), Day2Var2 = c(4L, 8L)), .Names = c("ID",
"Day1Var1", "Day1Var2", "Day2Var1", "Day2Var2"), class = "data.frame",
row.names = c(NA, -2L))
Upvotes: 4
Reputation: 23608
You could use the following code:
First we gather the data in a long format, split the column key so day and var get split. I remove day from day1 and spread the data into the correct form. Not sure if the separate and substring can not be done in 1 step.
library(tidyr)
library(dplyr)
df1 %>%
gather(key, value, -ID) %>%
separate(key, into = c("Day", "Var"), sep = 4) %>%
mutate(Day = substring(Day, 4, 4)) %>%
spread(Var, value)
ID Day Var1 Var2
1 1 1 1 2
2 1 2 3 4
3 2 1 5 6
4 2 2 7 8
data:
df1 <- structure(list(ID = 1:2, Day1Var1 = c(1L, 5L), Day1Var2 = c(2L,
6L), Day2Var1 = c(3L, 7L), Day2Var2 = c(4L, 8L)), class = "data.frame", row.names = c(NA,
-2L))
Upvotes: 6