Rafael Nakamura
Rafael Nakamura

Reputation: 59

Copy and repeat values ​from one dataframe to another dataframe with specific conditions in R

I have two dataframes with distinct columns between them, as follows:

DF1
    month      mbs   rec_n     vol_n      rec_indice  rec_mm
    2021-05-01 hsp   164.84724 102.44143  1.5         0.6
    2021-05-01 tvc   106.2428  81.89729   16.4        2.9
    2021-07-01 tvc   111.51844 90.37325   -3.1       -3.4
    2022-01-01 outro 74.33283  70.90493   0.8         1.2

DF2
    month      mbs  prod_n     prod_n_mm
    2021-05-01 inds 164.84724  102.44143
    2021-05-01 indf 106.2428   81.89729
    2021-07-01 inds 111.51844  90.37325
    2022-01-01 indf 74.33283   70.90493

I want to copy the columns and values ​​of the respective columns of DF1 and 2, comparing the column month and mbs if they are equal. The month column should only be compared to the month, not the day. If the month of the "Month" column is the same and the value of the mbs column is also the same, copy the value. If these conditions are not met, set the value to zero, as shown in DF3

DF3
    month      mbs   x     y     rec_n     vol_n     rec_indice  rec_mm prod_n    pro_n_mm
    2021-05-06 hsp   value value 164.84724 102.44143 1.5         0.6    0         0
    2021-05-30 hsp   value value 164.84724 102.44143 1.5         0.6    0         0
    2021-05-18 tvc   value value 106.2428  81.89729  16.4        2.9    0         0
    2022-01-21 outro value value 74.33283  70.90493  0.8         1.2    0         0
    2022-01-30 outro value value 74.33283  70.90493  0.8         1.2    0         0
    2021-05-06 inds  value value 0         0         0           0      164.84724 102.44143
    2021-07-30 idnf  value value 0         0         0           0      0         0
    2021-07-22 inds  value value 0         0         0           0      111.51844 90.37325
    2022-01-21 indf  value value 0         0         0           0      74.33283 70.90493

Remembering that the column x and y are columns that already exist in DF3. So I just want to complete DF3 with the columns of DF1 and DF2.

Upvotes: 0

Views: 473

Answers (2)

Rafael Nakamura
Rafael Nakamura

Reputation: 59

left_join did the trick. I know there is a more elegant solution, but I created a new column for the two dataframes with just month and year. Then I used left_join(df3, df2, by = c("month_year", "mbs"))

Upvotes: 0

Allan Cameron
Allan Cameron

Reputation: 173803

It sounds like you are looking for a full join:

library(tidyverse)

full_join(DF1, DF2, by = c('month', 'mbs'))
#>        month   mbs     rec_n     vol_n rec_indice rec_mm    prod_n prod_n_mm
#> 1 2021-05-01   hsp 164.84724 102.44143        1.5    0.6        NA        NA
#> 2 2021-05-01   tvc 106.24280  81.89729       16.4    2.9        NA        NA
#> 3 2021-07-01   tvc 111.51844  90.37325       -3.1   -3.4        NA        NA
#> 4 2022-01-01 outro  74.33283  70.90493        0.8    1.2        NA        NA
#> 5 2021-05-01  inds        NA        NA         NA     NA 164.84724 102.44143
#> 6 2021-05-01  indf        NA        NA         NA     NA 106.24280  81.89729
#> 7 2021-07-01  inds        NA        NA         NA     NA 111.51844  90.37325
#> 8 2022-01-01  indf        NA        NA         NA     NA  74.33283  70.90493

If you really want zeros instead of NA (which might lead to problems later) then you can do:

full_join(DF1, DF2, by = c('month', 'mbs')) %>%
  replace_na(list(rec_n = 0, vol_n = 0, rec_indice = 0, rec_mm = 0, 
                  prod_n = 0, prod_n_mm = 0))
#>        month   mbs     rec_n     vol_n rec_indice rec_mm    prod_n prod_n_mm
#> 1 2021-05-01   hsp 164.84724 102.44143        1.5    0.6   0.00000   0.00000
#> 2 2021-05-01   tvc 106.24280  81.89729       16.4    2.9   0.00000   0.00000
#> 3 2021-07-01   tvc 111.51844  90.37325       -3.1   -3.4   0.00000   0.00000
#> 4 2022-01-01 outro  74.33283  70.90493        0.8    1.2   0.00000   0.00000
#> 5 2021-05-01  inds   0.00000   0.00000        0.0    0.0 164.84724 102.44143
#> 6 2021-05-01  indf   0.00000   0.00000        0.0    0.0 106.24280  81.89729
#> 7 2021-07-01  inds   0.00000   0.00000        0.0    0.0 111.51844  90.37325
#> 8 2022-01-01  indf   0.00000   0.00000        0.0    0.0  74.33283  70.90493

Created on 2022-09-27 with reprex v2.0.2

Upvotes: 2

Related Questions