Johanna Ramirez
Johanna Ramirez

Reputation: 171

How join multiple dataframes with complementary information

I have the following dataframes: df1:

Animal Farm Year    y  
1      A      1    23
2      A      2    24    
3      B      1    13   
4      B      3    28 
5      C      1    25

df2

Farm Year Month  C   Mm  Pp
 A      1    1    23  15  4
 A      1    2    25  18  5
 A      1    3    13  35  7
 A      1    4    18  38  4

df3

Farm Year Month  C   Mm  Pp
 B      1    1    12  13  3
 B      1    2    22  11  2
 B      1    3    14  17  5
 B      1    4    18  38  6

df4

 Farm Year Month  C   Mm  Pp
     C      3    1    12  13  3
     C      3    2    22  11  2
     C      3    3    14  17  5
     C      3    4    18  38  6

I want the following for all animals and farms present in df1: ex:

df5

 Animal Farm  Year  month  C Mm  Pp y
    1   A      1    1    23  15  4 23
    1   A      1    2    25  18  5 23
    1   A      1    3    13  35  7 23
    1   A      1    4    18  38  4 23
    3   B      1    1    12  13  3 13
    3   B      1    2    22  11  2 13
    3   B      1    3    14  17  5 13
    3   B      1    4    18  38  6 13

I tried

df1 %>% full_join(df2, by= c("farm", "year")) %>%
full_join(df3, by = c("farm, "year", "month")) %>%
full_join(df4, by = c("farm, "year", "month")) --> df5

but it did't work correctly. So, I also tried:

library(gtools)

df5 <- smartbind(df1,df2, df3, df4)

Upvotes: 0

Views: 89

Answers (2)

Marco_CH
Marco_CH

Reputation: 3294

Here a way to go with data.table:

library(data.table)

df5 = merge(df1, rbindlist(list(df2,df3,df4)), by = c("Farm", "Year"))

Output:

> df5
   Farm Year Animal  y Month  C Mm Pp
1:    A    1      1 23     1 23 15  4
2:    A    1      1 23     2 25 18  5
3:    A    1      1 23     3 13 35  7
4:    A    1      1 23     4 18 38  4
5:    B    1      3 13     1 12 13  3
6:    B    1      3 13     2 22 11  2
7:    B    1      3 13     3 14 17  5
8:    B    1      3 13     4 18 38  6

Update, if you want to see all rows, whether it's matched or not:

> merge(df1, rbindlist(list(df2,df3,df4)), by = c("Farm", "Year"), all=TRUE)
    Farm Year Animal  y Month  C Mm Pp
 1:        NA     NA NA    NA NA NA NA
 2:    A    1      1 23     1 23 15  4
 3:    A    1      1 23     2 25 18  5
 4:    A    1      1 23     3 13 35  7
 5:    A    1      1 23     4 18 38  4
 6:    A    2      2 24    NA NA NA NA
 7:    B    1      3 13     1 12 13  3
 8:    B    1      3 13     2 22 11  2
 9:    B    1      3 13     3 14 17  5
10:    B    1      3 13     4 18 38  6
11:    B    3      4 28    NA NA NA NA
12:    C    1      5 25    NA NA NA NA
13:    C    3     NA NA     1 12 13  3
14:    C    3     NA NA     2 22 11  2
15:    C    3     NA NA     3 14 17  5
16:    C    3     NA NA     4 18 38  6

Upvotes: 1

Antex
Antex

Reputation: 1454

If you can share your dataframes so they can be reproduced, it would be helpful.

Here is an example:

mtcars[1:3,1:3] %>% dput

structure(list(mpg = c(21, 21, 22.8), cyl = c(6, 6, 4), disp = c(160, 
160, 108)), row.names = c("Mazda RX4", "Mazda RX4 Wag", "Datsun 710"
), class = "data.frame")

Upvotes: 1

Related Questions