Reputation: 534
I have the following data,
library(lubridate)
day <- c(NA, NA, NA, "01", NA, "30", NA, "20", NA, NA)
my <- ifelse(is.na(day), NA, "042020")
value <- c(19, 23, 25, 26, 22, 12, 12, 34, 54, 23)
closing <- ifelse(is.na(day), NA, paste(day, my, sep = ""))
endofmonth <- ifelse(is.na(dmy), NA, rep("30042020"))
reopening <- c("01062020", "25042020", NA, "20042020", NA, "01082020", NA, "01082020", NA, NA)
adjvalue <- c(0, 4, 25, 9, 22, 12, 12, 23, 54, 23)
data <- data.frame(closing, endofmonth, reopening, value, adjvalue)
data$closing <- lubridate::dmy(closing)
data$endofmonth <- lubridate::dmy(endofmonth)
data$reopening <- lubridate::dmy(reopening)
data
> data
closing endofmonth reopening value adjvalue
1 <NA> 2020-04-30 2020-06-01 19 0
2 <NA> 2020-04-30 2020-04-25 23 4
3 <NA> 2020-04-30 <NA> 25 25
4 2020-04-01 2020-04-30 2020-04-20 26 9
5 <NA> 2020-04-30 <NA> 22 22
6 2020-04-30 2020-04-30 2020-08-01 12 12
7 <NA> 2020-04-30 <NA> 12 12
8 2020-04-20 2020-04-30 2020-08-01 34 23
9 <NA> 2020-04-30 <NA> 54 54
10 <NA> 2020-04-30 <NA> 23 23
Assume I want to adjust the forecasted sales of some companies. In April 2020 many companies closed. So my target is to adjust the forecasted sales values
in the data frame.
I want to create the adjvalue
column from four cases. I am using simple ifelse.
Individually, the cases are as follows
Case 1
The first case (row 1) says, company is closed as there is a reopening date. So the forecasted values
for April is zero. So I set the condition -- if the closing = NA
, and the reopening date is after April, then multiply the forecasted values with zero.
# Row 1
# Closing = NA, reopening > endofmonth -> value * 0
# Expected result: 19 * 0 = 0
data$adjValue <- with(data, ifelse(is.na(closing) & reopening > endofmonth,
value * 0,
value * 1))
I have got the correct output for this case, except it is creating NA
s in column.
> data$adjValue
[1] 0 23 NA 26 NA 12 NA 34 NA NA
Case 2
The second case (row 2) says, the company was closed before April and it was reopened on 25th April. So they have operated only 5 days. So I have adjusted the value for 5 days.
# Row 2
# Expected result: 23 * ((30 - 25)/30) = 4
data$adjValue <- with(data, ifelse(is.na(closing) & reopening < endofmonth,
value * (endofmonth - reopening)/30,
value * 1))
It gives me the expected value for row 2, but again some NA
values in the output.
Case 3
The company was closed from 1st April to 20th April. So the company operated only for 10 days. I have adjusted the forecast accordingly.
# Row 4
# Expected result: 26 * ((30 - 20)/30) = 9
data$adjValue <- with(data, ifelse(endofmonth > reopening & !is.na(closing),
value * (30 - (reopening - closing))/30,
value * 1))
This is not working, the adjValue
should be calculated using the following conditions,
NA
endofmonth > reopening
It is not working. What did I do wrong?
Case 4
In this case, the company operated for only 20 days. I have adjusted that in the forecasted sales values.
# Row 8
# Expected result: 34 * ((30-20)/30) = 23
data$adjValue <- with(data, ifelse(reopening > endofmonth & closing < endofmonth,
round(value * (30 - (endofmonth - closing))/30),
value * 1))
> data$adjValue
[1] NA 23 NA 26 NA 12 NA 23 NA NA
This works but again the NA
problem.
I have the following questions,
NA
s?adjvalue
using all the conditions together? Is there a tidy solution for this?Upvotes: 0
Views: 131
Reputation: 388862
I think the basic issue is handling NA
values. You need to include those conditions separately or you could use dplyr::if_else
where you can use missing
argument to specify what value to return in case of missing value.
library(dplyr)
data$adjValue <- with(data, if_else(reopening > endofmonth & closing < endofmonth,
round(value * (30 - (endofmonth - closing))/30),
value, missing = 0))
data$adjValue
#Time differences in days
# [1] 0 23 0 26 0 12 0 23 0 0
Upvotes: 1