Juanchi
Juanchi

Reputation: 1166

gather two groups of columns with tidyr

I have this example dataset, containing 3 treatments, two response variables (D, var) for 3 classes (o, l, h):

#        fungic   D_o Var_o   D_l Var_l   D_h Var_h
#    1    AZ_BF 343.6 25756 223.8 20127 442.5 22843
#    2 AZ_CIPRO 186.6 25756  58.2 20127 311.7 22843
#    3  AZ_SOLA 382.6 25756 205.6 20127 520.5 22843


  d =  structure(list(fungic = c("AZ_BF", "AZ_CIPRO", "AZ_SOLA"), D_o = c(343.6, 186.6, 382.6), Var_o = c(25756L, 25756L, 25756L), D_l = c(223.8, 
58.2, 205.6), Var_l = c(20127L, 20127L, 20127L), D_h = c(442.5, 
311.7, 520.5), Var_h = c(22843L, 22843L, 22843L)), .Names = c("fungic", 
"D_o", "Var_o", "D_l", "Var_l", "D_h", "Var_h"), row.names = c(NA,3L), class = "data.frame")

I want to gather the columns that have the same second part of the col name, i.e, all D together, and all Var together, keeping the class level, Like this:

     fungic C     D   Var
1     AZ_BF o 343.6 25756
4     AZ_BF o 223.8 25756
7     AZ_BF o 442.5 25756
10    AZ_BF l 343.6 20127
13    AZ_BF l 223.8 20127
16    AZ_BF l 442.5 20127
19    AZ_BF h 343.6 22843
22    AZ_BF h 223.8 22843
25    AZ_BF h 442.5 22843
2  AZ_CIPRO o 186.6 25756
5  AZ_CIPRO o  58.2 25756
8  AZ_CIPRO o 311.7 25756
11 AZ_CIPRO l 186.6 20127
14 AZ_CIPRO l  58.2 20127
17 AZ_CIPRO l 311.7 20127
20 AZ_CIPRO h 186.6 22843
23 AZ_CIPRO h  58.2 22843
26 AZ_CIPRO h 311.7 22843
3   AZ_SOLA o 382.6 25756
6   AZ_SOLA o 205.6 25756
9   AZ_SOLA o 520.5 25756
12  AZ_SOLA l 382.6 20127
15  AZ_SOLA l 205.6 20127
18  AZ_SOLA l 520.5 20127
21  AZ_SOLA h 382.6 22843
24  AZ_SOLA h 205.6 22843
27  AZ_SOLA h 520.5 22843

I did it with this horrible code...

d1 = d %>% gather(var_Class, D, starts_with("D_")) %>% 
  gather(C, Var, starts_with("Var_")) 
d2 = d1[1:27,]  
d3 = d2 %>% mutate(C=sapply(strsplit(C, split='_', fixed=TRUE),function(x) (x[2])))
d3 = d3[order(d3$fungic),]

can someone help to improve it? I don't really want to fall in excel please..

You are totally right @akrun, but I was totally wrong...

Sorry, I actually need this:

fungic  class       D     var
AZ_BF       o   343.6   25756
AZ_CIPRO    o   186.6   25756
AZ_SOLA     o   382.6   25756
AZ_BF       l   223.8   20127
AZ_CIPRO    l   58.2    20127
AZ_SOLA     l   205.6   20127
AZ_BF       h   442.5   22843
AZ_CIPRO    h   311.7   22843
AZ_SOLA     h   520.5   22843

Upvotes: 3

Views: 1054

Answers (2)

pe-perry
pe-perry

Reputation: 2621

We can first use gather to create a data.frame of 3 columns, one for the id ("fungic"), one for the names of the rest columns and one for the values. Then, we create two variables for the prefix and suffix of the column names. Finally, we use spread to convert the data to wide format.

d %>% 
    gather(vrb, val, -fungic) %>% 
    mutate(D_Var = gsub("(.*)\\_(.*)", "\\1", vrb), 
           class = gsub("(.*)\\_(.*)", "\\2", vrb)) %>% 
    select(-vrb) %>% 
    spread(D_Var, val) %>% 
    arrange(desc(class))

Output:

#     fungic class     D   Var
# 1    AZ_BF     o 343.6 25756
# 2 AZ_CIPRO     o 186.6 25756
# 3  AZ_SOLA     o 382.6 25756
# 4    AZ_BF     l 223.8 20127
# 5 AZ_CIPRO     l  58.2 20127
# 6  AZ_SOLA     l 205.6 20127
# 7    AZ_BF     h 442.5 22843
# 8 AZ_CIPRO     h 311.7 22843
# 9  AZ_SOLA     h 520.5 22843

Upvotes: 2

akrun
akrun

Reputation: 887118

After the first gather, separate the 'var_Class' into two columns, remove the column that are not needed, apply the second gather, remove the 'Var' and arrange by 'fungic'

gather(d, var_Class, D, starts_with("D")) %>% 
       separate(var_Class, into = c("Dvar", "YR")) %>% 
       select(-Dvar) %>% 
       gather(Var, Val, starts_with("Var")) %>% 
       select(-Var) %>%
       arrange(fungic)
#    fungic YR     D   Val
#1     AZ_BF  o 343.6 25756
#2     AZ_BF  l 223.8 25756
#3     AZ_BF  h 442.5 25756
#4     AZ_BF  o 343.6 20127
#5     AZ_BF  l 223.8 20127
#6     AZ_BF  h 442.5 20127
#7     AZ_BF  o 343.6 22843
#8     AZ_BF  l 223.8 22843
#9     AZ_BF  h 442.5 22843
#10 AZ_CIPRO  o 186.6 25756
#11 AZ_CIPRO  l  58.2 25756
#12 AZ_CIPRO  h 311.7 25756
#13 AZ_CIPRO  o 186.6 20127
#14 AZ_CIPRO  l  58.2 20127
#15 AZ_CIPRO  h 311.7 20127
#16 AZ_CIPRO  o 186.6 22843
#17 AZ_CIPRO  l  58.2 22843
#18 AZ_CIPRO  h 311.7 22843
#19  AZ_SOLA  o 382.6 25756
#20  AZ_SOLA  l 205.6 25756
#21  AZ_SOLA  h 520.5 25756
#22  AZ_SOLA  o 382.6 20127
#23  AZ_SOLA  l 205.6 20127
#24  AZ_SOLA  h 520.5 20127
#25  AZ_SOLA  o 382.6 22843
#26  AZ_SOLA  l 205.6 22843
#27  AZ_SOLA  h 520.5 22843

Upvotes: 1

Related Questions