Reputation: 503
I need to create a new column by adding value from the stock column and subtracting the value from the sales column. When the result is negative, the next stock should be rewritten to the results as in the example. With the help of ThomasIsCoding the created code works well for sample data. But when I test the program for many products, there are erroneous calculations. Which shows below. Can you check why this is happening?
Code:
TD<-data.frame(product = rep("A",9), data = seq(as.Date("2020-01-01"), as.Date("2020-01-09"), by = "day"),
sale = c(0, 5, 0, 0, 15, 1, 2, 1, 0), stock = c(0, 0, 50, 0, 10, 0, 10, 0, 100))
TD <- within(TD, result <- ave(stock-sale, product,
ave(stock-sale,
cumsum(stock>sale),FUN = cumsum)>0,FUN = cumsum))
> TD#works well
product data sale stock result
1 A 2020-01-01 0 0 0
2 A 2020-01-02 5 0 -5
3 A 2020-01-03 0 50 50
4 A 2020-01-04 0 0 50
5 A 2020-01-05 15 10 45
6 A 2020-01-06 1 0 44
7 A 2020-01-07 2 10 52
8 A 2020-01-08 1 0 51
9 A 2020-01-09 0 100 151
Examples when it doesn't work: I don't understand why the program sometimes works well and sometimes it doesn't.
> TD[1330:1340,]#doesn't work
# A tibble: 4 x 5
# Groups: data [4]
product data stock sale result expected
<chr> <fct> <dbl> <dbl> <dbl>
1 B 2020-02-25 0 0 -21 -21
2 B 2020-02-26 0 0 -21 -21
3 B 2020-02-27 60 4 35 56
4 B 2020-02-28 0 2 33 54
> TD[7293:7297,]#works ok
# A tibble: 5 x 5
# Groups: data [5]
product data stock sale result
<chr> <fct> <dbl> <dbl> <dbl>
1 D 2020-01-13 0 2 -22
2 D 2020-01-14 0 2 -24
3 D 2020-01-15 0 0 -24
4 D 2020-01-16 100 2 98
5 D 2020-01-17 0 4 94
Upvotes: 0
Views: 97
Reputation: 16178
OK, maybe this is a solution, definitely not the most elegant or readable but at least it passes all your "test" dataframes:
library(dplyr)
df %>% group_by(product) %>% mutate(New = stock - sale) %>%
mutate(New2 = ifelse(lag((stock-sale) <0), stock, stock-sale)) %>%
mutate(New3 = ifelse(is.na(New2),New, New2)) %>%
mutate(New4 = ifelse(lag(New3 >= 0), lag(New3)+stock-sale,New3)) %>%
mutate(New4 = ifelse(is.na(New4),stock-sale,New4)) %>%
select(product, sale, stock, Result = New4)
So, now testing a dataframe with multiple groups:
TD2 = data.frame(product = c("AA","AA","AA","AA","AZ","AZ","AZ","AF","AF","AF","BA","BA","BA"),
data = seq(as.Date("2020-01-01"), as.Date("2020-01-13"), by = "day"),
sale = c(0, 5, 0, 0, 15, 1, 2, 1, 0,10,0,1,10),
stock = c(0, 0, 50, 0, 10, 0, 10, 0, 100,0,0,100,0))
TD2 %>% group_by(product) %>% mutate(New = stock - sale) %>%
mutate(New2 = ifelse(lag((stock-sale) <0), stock, stock-sale)) %>%
mutate(New3 = ifelse(is.na(New2),New, New2)) %>%
mutate(New4 = ifelse(lag(New3 >= 0), lag(New3)+stock-sale,New3)) %>%
mutate(New4 = ifelse(is.na(New4),stock-sale,New4)) %>%
select(product, sale, stock, Result = New4)
# A tibble: 13 x 4
# Groups: product [4]
product sale stock Result
<fct> <dbl> <dbl> <dbl>
1 AA 0 0 0
2 AA 5 0 -5
3 AA 0 50 50
4 AA 0 0 50
5 AZ 15 10 -5
6 AZ 1 0 0
7 AZ 2 10 8
8 AF 1 0 -1
9 AF 0 100 100
10 AF 10 0 90
11 BA 0 0 0
12 BA 1 100 99
13 BA 10 0 89
Now testing the second example of your question (I assumed that -21
in the expected results is an error and was in fact inherited from previous calculations and so I decided to ignore this results and focused only on the last part because getting -21 at this step does not make any sense based on your data):
Test2 = data.frame(product = c("AA","AA","AA","AA"),
sale = c(0, 0,4,2),
stock = c(0, 0, 60, 0))
Test2 %>% group_by(product) %>% mutate(New = stock - sale) %>%
mutate(New2 = ifelse(lag((stock-sale) <0), stock, stock-sale)) %>%
mutate(New3 = ifelse(is.na(New2),New, New2)) %>%
mutate(New4 = ifelse(lag(New3 >= 0), lag(New3)+stock-sale,New3)) %>%
mutate(New4 = ifelse(is.na(New4),stock-sale,New4)) %>%
select(product, sale, stock, Result = New4)
# A tibble: 4 x 4
# Groups: product [1]
product sale stock Result
<fct> <dbl> <dbl> <dbl>
1 AA 0 0 0
2 AA 0 0 0
3 AA 4 60 56
4 AA 2 0 54
And finally the last example of your question:
Test3 = data.frame(product = c("AA","AA","AA","AA","AA"),
sale = c(2,2,0,2,4),
stock = c(0, 0, 0,100, 0))
Test3 %>% group_by(product) %>% mutate(New = stock - sale) %>%
mutate(New2 = ifelse(lag((stock-sale) <0), stock, stock-sale)) %>%
mutate(New3 = ifelse(is.na(New2),New, New2)) %>%
mutate(New4 = ifelse(lag(New3 >= 0), lag(New3)+stock-sale,New3)) %>%
mutate(New4 = ifelse(is.na(New4),stock-sale,New4)) %>%
select(product, sale, stock, Result = New4)
# A tibble: 5 x 4
# Groups: product [1]
product sale stock Result
<fct> <dbl> <dbl> <dbl>
1 AA 2 0 -2
2 AA 2 0 0
3 AA 0 0 0
4 AA 2 100 98
5 AA 4 0 94
Hope that it will be satisfying for you. Otherwise, as mentioned in comments by @RonakShah, I invited you to re-consider your approach to clearly defined what you need, how to calculate it and what is the rationale of your output.
Upvotes: 1