Bart
Bart

Reputation: 317

Set observation value to NA after any lag has an observation

If any of the lags of my two month return is smaller than -0.7 I want to set every next closeret observation to NA.

I tried:

x <- x %>%
  group_by(seriesid) %>%
  mutate(first = { twomonthreturn <= (-0.7) } %>% { . * !duplicated(.) } ) %>%
  mutate(first = ifelse(first==1,datem,NA)) #%>%
  mutate(closeret = ifelse(datem<=sum(first,na.rm = TRUE),closeret,NA))

where datem = year(date)*12+month(date)

date seriesid price closeret twomonthreturn
2018-07-25 50005 3.100 NA NA
2018-08-14 50005 2.500-0.19354839 NA
2018-09-28 50005 2.350 -0.06000000 -0.24193548
2018-10-27 50005 0.800 -0.65957447 -0.68000000 
2018-11-27 50005 0.500 -0.37500000 -0.7872340
2018-12-31 50005 0.300 -0.40000000 -0.62500000
2019-01-26 50005 0.360 0.20000000 -0.28000000
2019-02-23 50005 0.300 -0.16666667 0.00000000
2017-01-21 50006 7.000 NA NA
2017-03-28 50006 9.750 NA NA
2017-04-14 50006 8.875 -0.08974359 NA
2017-05-20 50006 9.000 0.01408451 -0.07692308
2017-06-22 50006 9.000 0.00000000 0.01408451

I am looking to find the solution for every group, my solution works for group one but of course if there is no first because it is NA, this solution does not work.

date seriesid price closeret twomonthreturn
2018-07-25 50005 3.100 NA NA
2018-08-14 50005 2.500-0.19354839 NA
2018-09-28 50005 2.350 -0.06000000 -0.24193548
2018-10-27 50005 0.800 -0.65957447 -0.68000000 
2018-11-27 50005 0.500 -0.37500000 -0.7872340
2018-12-31 50005 0.300 NA -0.62500000
2019-01-26 50005 0.360 NA -0.28000000
2019-02-23 50005 0.300 NA 0.00000000
2017-01-21 50006 7.000 NA NA
2017-03-28 50006 9.750 NA NA
2017-04-14 50006 8.875 -0.08974359 NA
2017-05-20 50006 9.000 0.01408451 -0.07692308
2017-06-22 50006 9.000 0.00000000 0.01408451

Upvotes: 1

Views: 68

Answers (1)

utubun
utubun

Reputation: 4520

Hope this one will work:

Solution

Feed replace with cumsum(lag(replace_na(twomonthreturn, Inf), , Inf) <= -.7) > 0 which serve as an indices for all values in a group after - 0.7 in twomonthreturn, so it replace all such values in closered with NA.

library(tidyverse)

dat %>%
  arrange(seriesid, date) %>%
  group_by(seriesid) %>%
  mutate(
    closeret = replace(
      closeret,
      cumsum(lag(replace_na(twomonthreturn, Inf), , Inf) <= -.7) > 0,
      NA_real_
      )
    ) %>%
  ungroup()

Output

# A tibble: 13 x 5
   date       seriesid price closeret twomonthreturn
   <date>        <dbl> <dbl>    <dbl>          <dbl>
 1 2018-07-25    50005  3.1   NA             NA     
 2 2018-08-14    50005  2.5   -0.194         NA     
 3 2018-09-28    50005  2.35  -0.06          -0.242 
 4 2018-10-27    50005  0.8   -0.660         -0.68  
 5 2018-11-27    50005  0.5   -0.375         -0.787 
 6 2018-12-31    50005  0.3   NA             -0.625 
 7 2019-01-26    50005  0.36  NA             -0.28  
 8 2019-02-23    50005  0.3   NA              0     
 9 2017-01-21    50006  7     NA             NA     
10 2017-03-28    50006  9.75  NA             NA     
11 2017-04-14    50006  8.88  -0.0897        NA     
12 2017-05-20    50006  9      0.0141        -0.0769
13 2017-06-22    50006  9      0              0.0141

Data

dat <- read.table(
  text = "date seriesid price closeret twomonthreturn
2018-07-25 50005 3.100 NA NA
2018-08-14 50005 2.500 -0.19354839 NA
2018-09-28 50005 2.350 -0.06000000 -0.24193548
2018-10-27 50005 0.800 -0.65957447 -0.68000000 
2018-11-27 50005 0.500 -0.37500000 -0.7872340
2018-12-31 50005 0.300 -0.40000000 -0.62500000
2019-01-26 50005 0.360 0.20000000 -0.28000000
2019-02-23 50005 0.300 -0.16666667 0.00000000
2017-01-21 50006 7.000 NA NA
2017-03-28 50006 9.750 NA NA
2017-04-14 50006 8.875 -0.08974359 NA
2017-05-20 50006 9.000 0.01408451 -0.07692308
2017-06-22 50006 9.000 0.00000000 0.01408451",
  colClasses = c("Date", "numeric", "numeric", "numeric", "numeric"),
  header     = TRUE
)

Upvotes: 1

Related Questions