Economist_Ayahuasca
Economist_Ayahuasca

Reputation: 1642

Reshaping complicating data-set in R

I have a strange dataset format where a simple reshape function won't work. Assume I have three time periods (1-3); 2 id Names (A-B); and three variables (X,Y and Z) in the following format. Where the id names and variables name are seperated by -:

Time A-X A-Y A-Z B-X B-Y B-Z
1    2   4   5   6   1   2
2    2   3   2   3   2   3
3    4   4   4   4   4   4

Ideally, I would like to produce the dataset in the following format:

ID Time X Y Z
A  1    2 4 5  
A  2    2 3 2
A  3    4 4 4
B  1    6 1 2
B  2    3 2 3
B  3    4 4 4

Which functions to use?

Upvotes: 1

Views: 46

Answers (2)

Prem
Prem

Reputation: 11955

library(dplyr)
library(tidyr)
library(splitstackshape)

df %>%
  gather(key, value, -Time) %>%
  cSplit("key", sep="_") %>%
  spread(key_2, value) %>%
  rename(ID = key_1) %>%
  arrange(ID, Time)

Output is:

  Time ID X Y Z
1    1  A 2 4 5
2    2  A 2 3 2
3    3  A 4 4 4
4    1  B 6 1 2
5    2  B 3 2 3
6    3  B 4 4 4

Sample data:

df <- structure(list(Time = 1:3, A_X = c(2L, 2L, 4L), A_Y = c(4L, 3L, 
4L), A_Z = c(5L, 2L, 4L), B_X = c(6L, 3L, 4L), B_Y = c(1L, 2L, 
4L), B_Z = 2:4), .Names = c("Time", "A_X", "A_Y", "A_Z", "B_X", 
"B_Y", "B_Z"), class = "data.frame", row.names = c(NA, -3L))

Upvotes: 1

hpesoj626
hpesoj626

Reputation: 3619

Here is another dplyr and tidyr solution.

df %>%
  gather(ID, value, -Time) %>%
  separate(ID, into = c("ID", "var")) %>%
  spread(var, value) %>%
  arrange(ID) %>%
  select(ID, Time, X, Y, Z)

#   ID Time X Y Z
# 1  A    1 2 4 5
# 2  A    2 2 3 2
# 3  A    3 4 4 4
# 4  B    1 6 1 2
# 5  B    2 3 2 3
# 6  B    3 4 4 4

Upvotes: 1

Related Questions