LyxUser12345
LyxUser12345

Reputation: 411

Partial wide to long format for data.frame in R

Good evening,

I have a dataframe that looks like this at the moment:

ID Day1Var1 Day1Var2 Day2Var1 Day2Var2 
1     1        2        3        4
2     5        6        7        8

and I would like to bring it to a somewhat long format that looks like this:

ID    Day   Var1   Var2 
1      1      1      2  
1      2      3      4
2      1      5      6  
2      2      7      8

Is there an easy way to also use the prefix terms of variables ("Day1", "Day2"), to create a new variable called "Day" and fill the values with 1 and 2 automatically? This would be really useful, as I have three levels of prefixes (observation, day, week).

Thanks a lot!

Upvotes: 1

Views: 239

Answers (3)

moodymudskipper
moodymudskipper

Reputation: 47340

We can split horizontally, change the names and bind with "Day" as and id:

library(tidyverse)
list(df1[1:3],df1[c(1,4:5)]) %>%
  map(rename_all,gsub,pattern="Day.",replacement="") %>%
  bind_rows(.id = "Day")
#   Day ID Var1 Var2
# 1   1  1    1    2
# 2   1  2    5    6
# 3   2  1    3    4
# 4   2  2    7    8

Upvotes: 2

akrun
akrun

Reputation: 887691

Here is an option with melt which takes multiple measure columns

library(data.table)
melt(setDT(df1), measure = patterns("Var1", "Var2"), 
      variable.name = 'Day', value.name = c('Var1', 'Var2'))[order(ID)]
#   ID Day Var1 Var2
#1:  1   1    1    2
#2:  1   2    3    4
#3:  2   1    5    6
#4:  2   2    7    8

data

df1 <- structure(list(ID = 1:2, Day1Var1 = c(1L, 5L), Day1Var2 = c(2L, 
6L), Day2Var1 = c(3L, 7L), Day2Var2 = c(4L, 8L)), .Names = c("ID", 
"Day1Var1", "Day1Var2", "Day2Var1", "Day2Var2"), class = "data.frame", 
 row.names = c(NA, -2L))

Upvotes: 4

phiver
phiver

Reputation: 23608

You could use the following code:

First we gather the data in a long format, split the column key so day and var get split. I remove day from day1 and spread the data into the correct form. Not sure if the separate and substring can not be done in 1 step.

library(tidyr)
library(dplyr)

df1 %>% 
  gather(key, value, -ID) %>% 
  separate(key, into = c("Day", "Var"), sep = 4) %>% 
  mutate(Day = substring(Day, 4, 4)) %>% 
  spread(Var, value)
  ID Day Var1 Var2
1  1   1    1    2
2  1   2    3    4
3  2   1    5    6
4  2   2    7    8

data:

df1 <- structure(list(ID = 1:2, Day1Var1 = c(1L, 5L), Day1Var2 = c(2L, 
6L), Day2Var1 = c(3L, 7L), Day2Var2 = c(4L, 8L)), class = "data.frame", row.names = c(NA, 
-2L))

Upvotes: 6

Related Questions