JontroPothon
JontroPothon

Reputation: 534

ifelse with conditions involving dates and NAs

I have the following data,

library(lubridate)
day <- c(NA, NA, NA, "01", NA, "30", NA, "20", NA, NA)
my <- ifelse(is.na(day), NA, "042020")
value <- c(19, 23, 25, 26, 22, 12, 12, 34, 54, 23)
closing <- ifelse(is.na(day), NA, paste(day, my, sep = ""))
endofmonth <- ifelse(is.na(dmy), NA, rep("30042020"))
reopening <- c("01062020", "25042020", NA, "20042020", NA, "01082020", NA, "01082020", NA, NA)
adjvalue <- c(0, 4, 25, 9, 22, 12, 12, 23, 54, 23)
data <- data.frame(closing, endofmonth, reopening, value, adjvalue)
data$closing <- lubridate::dmy(closing)
data$endofmonth <- lubridate::dmy(endofmonth)
data$reopening <- lubridate::dmy(reopening)
data
> data
      closing endofmonth  reopening value adjvalue
1        <NA> 2020-04-30 2020-06-01    19        0
2        <NA> 2020-04-30 2020-04-25    23        4
3        <NA> 2020-04-30       <NA>    25       25
4  2020-04-01 2020-04-30 2020-04-20    26        9
5        <NA> 2020-04-30       <NA>    22       22
6  2020-04-30 2020-04-30 2020-08-01    12       12
7        <NA> 2020-04-30       <NA>    12       12
8  2020-04-20 2020-04-30 2020-08-01    34       23
9        <NA> 2020-04-30       <NA>    54       54
10       <NA> 2020-04-30       <NA>    23       23

Assume I want to adjust the forecasted sales of some companies. In April 2020 many companies closed. So my target is to adjust the forecasted sales values in the data frame.

I want to create the adjvalue column from four cases. I am using simple ifelse. Individually, the cases are as follows

Case 1

The first case (row 1) says, company is closed as there is a reopening date. So the forecasted values for April is zero. So I set the condition -- if the closing = NA, and the reopening date is after April, then multiply the forecasted values with zero.

# Row 1 
# Closing = NA, reopening > endofmonth -> value * 0
# Expected result: 19 * 0 = 0
data$adjValue <- with(data, ifelse(is.na(closing) & reopening > endofmonth,  
                                   value * 0, 
                                   value * 1))

I have got the correct output for this case, except it is creating NAs in column.

> data$adjValue
 [1]  0 23 NA 26 NA 12 NA 34 NA NA

Case 2

The second case (row 2) says, the company was closed before April and it was reopened on 25th April. So they have operated only 5 days. So I have adjusted the value for 5 days.

# Row 2
# Expected result: 23 * ((30 - 25)/30) = 4
data$adjValue <- with(data, ifelse(is.na(closing) & reopening < endofmonth,  
                                   value * (endofmonth - reopening)/30, 
                                   value * 1))

It gives me the expected value for row 2, but again some NA values in the output.

Case 3

The company was closed from 1st April to 20th April. So the company operated only for 10 days. I have adjusted the forecast accordingly.

# Row 4
# Expected result: 26 * ((30 - 20)/30) = 9
data$adjValue <- with(data, ifelse(endofmonth > reopening & !is.na(closing), 
                                   value * (30 - (reopening - closing))/30, 
                                   value * 1))

This is not working, the adjValue should be calculated using the following conditions,

It is not working. What did I do wrong?

Case 4

In this case, the company operated for only 20 days. I have adjusted that in the forecasted sales values.

# Row 8
# Expected result: 34 * ((30-20)/30) = 23
data$adjValue <- with(data, ifelse(reopening > endofmonth & closing < endofmonth,  
                                   round(value * (30 - (endofmonth - closing))/30), 
                                   value * 1))
> data$adjValue
 [1] NA 23 NA 26 NA 12 NA 23 NA NA

This works but again the NA problem.

I have the following questions,

  1. How to get rid of those NAs?
  2. What did I do wrong in case 3?
  3. How can I get the column like adjvalue using all the conditions together? Is there a tidy solution for this?

Upvotes: 0

Views: 131

Answers (1)

Ronak Shah
Ronak Shah

Reputation: 388862

I think the basic issue is handling NA values. You need to include those conditions separately or you could use dplyr::if_else where you can use missing argument to specify what value to return in case of missing value.

library(dplyr)

data$adjValue <- with(data, if_else(reopening > endofmonth & closing < endofmonth,
                            round(value * (30 - (endofmonth - closing))/30), 
                            value, missing = 0))
data$adjValue
#Time differences in days
# [1]  0 23  0 26  0 12  0 23  0  0

Upvotes: 1

Related Questions