Tom
Tom

Reputation: 2341

Replacing NA's with values from another row, conditional on a specific country and years

I have a data.table with currencies (let's say in USD) as follows, which has NA's after the country has switched to EUR. In addition, I have the year in which the currency switches to EUR:

library(data.table)
DT <- fread(
    "Val iso   year   
     2   NLD   1998
     NA  NLD   1999   
     NA  NLD   2000
     NA  NLD   2001   
     NA  NLD   2002  
     1   GBR   1998
     1   GBR   1999   
     2   GBR   2000
     NA  GBR   2001   
     1   GBR   2002  
     NA  EUR   1998
     1   EUR   1999   
     1   EUR   2000
     2   EUR   2001   
     1   EUR   2002  
     3   GRC   1998
     NA  GRC   1999
     4   GRC   2000
     NA  GRC   2001
     NA  GRC   2002",
   header = TRUE
)

EUR <- fread(
    " e_iso e_year   
      NLD   1999   
      GRC   2001",
   header = TRUE
)

I would like to use the EUR currency to replace the NA's ONLY after the respective country has switched to EUR. So, for example Val for GRE in 1999 should remain NA.

I was thinking something along the lines of:

DT[EUR$e_iso %in% DT_iso & EUR$e_year %in% DT$year, Val := ?]

But I am not sure how to proceed. Desired Output:

library(data.table)
DT <- fread(
    "Val iso   year   
     2   NLD   1998
     1   NLD   1999   
     1   NLD   2000
     1   NLD   2001   
     1   NLD   2002  
     1   GBR   1998
     1   GBR   1999   
     2   GBR   2000
     NA  GBR   2001   
     1   GBR   2002  
     NA  EUR   1998
     1   EUR   1999   
     1   EUR   2000
     2   EUR   2001   
     1   EUR   2002  
     3   GRC   1998
     NA  GRC   1999
     4   GRC   2000
     2   GRC   2001
     1   GRC   2002",
   header = TRUE
)

EDIT:

I made a small mistake in the desired output (which is now corrected). The value for greece in 2001 is now 2.

With for example DT[is.na(Val), Val := EUR[.SD, as.integer(NA ^ is.na(x.e_iso)), on = .(e_iso = iso, e_year <= year)]] line 19 is Val = 1 but should be Val = 2 since the value of the EUR in 2001 (line 14) is 2.

Upvotes: 1

Views: 98

Answers (2)

s_baldur
s_baldur

Reputation: 33488

DT[EUR, 
   on = .(iso = e_iso, year >= e_year), 
   Val := DT[iso == "EUR" & year == .BY, Val], 
   by = year]

    Val iso year
 1:   2 NLD 1998
 2:   1 NLD 1999
 3:   1 NLD 2000
 4:   2 NLD 2001
 5:   1 NLD 2002
 6:   1 GBR 1998
 7:   1 GBR 1999
 8:   2 GBR 2000
 9:  NA GBR 2001
10:   1 GBR 2002
11:  NA EUR 1998
12:   1 EUR 1999
13:   1 EUR 2000
14:   2 EUR 2001
15:   1 EUR 2002
16:   3 GRC 1998
17:  NA GRC 1999
18:   4 GRC 2000
19:   2 GRC 2001
20:   1 GRC 2002

Upvotes: 2

Ronak Shah
Ronak Shah

Reputation: 389047

We can left_join DT and EUR by iso and replace the values where year >= e_year with equivalent "EUR" currency of the same year.

library(dplyr)

left_join(DT, EUR, by = c('iso' = 'e_iso')) %>%
   mutate(Val = replace(Val, year >= e_year, 
                Val[iso == "EUR" & year == first(e_year)]))


#     Val iso    year
#   <dbl> <chr> <int>
# 1     2 NLD    1998
# 2     1 NLD    1999
# 3     1 NLD    2000
# 4     1 NLD    2001
# 5     1 NLD    2002
# 6     1 GBR    1998
# 7     1 GBR    1999
# 8     2 GBR    2000
# 9    NA GBR    2001
#10     1 GBR    2002
#11    NA EUR    1998
#12     1 EUR    1999
#13     1 EUR    2000
#14     2 EUR    2001
#15     1 EUR    2002
#16     3 GRC    1998
#17    NA GRC    1999
#18     4 GRC    2000
#19     1 GRC    2001
#20     1 GRC    2002

Upvotes: 1

Related Questions