Reputation: 79
I'm having trouble using the row number as index. For example I want a new column that will give me the sales taking into account the next 4 days. I want to create column name:sale_next 4
The issue with my code is that I don't know how to make the index of the row_number()
variable, since what I'm doing is fetching the actual value of the column.
#heres to create the data
df <- read.table(text = "day price price_change sales High_sales_ind
1 5 0 12 1
2 5 0 6 0
3 5 0 5 0
4 5 0 4 0
5 5 0 10 1
6 5 0 10 1
7 5 0 10 1
8 5 0 12 1
9 5 0 14 1
10 7 2 3 0
11 7 0 2 0", header = TRUE)
#my code
df<- df %>% mutate(sales_next4 = sales[row_number():sales_rownumber()+4)
What I need:
day | price | price_change | sales | High_sales_ind | sales_next4 |
---|---|---|---|---|---|
1 | 5 | 0 | 12 | 1 | 27 |
2 | 5 | 0 | 6 | 0 | 25 |
3 | 5 | 0 | 5 | 0 | 29 |
4 | 5 | 0 | 4 | 0 | 34 |
5 | 5 | 0 | 10 | 1 | 42 |
6 | 5 | 0 | 10 | 1 | 46 |
7 | 5 | 0 | 10 | 1 | 39 |
8 | 5 | 0 | 12 | 1 | 31 |
9 | 5 | 0 | 14 | 1 | 19 |
10 | 7 | 2 | 3 | 0 | 5 |
11 | 7 | 0 | 2 | 0 | 2 |
Any help would be appreciated.
Upvotes: 2
Views: 426
Reputation: 24722
You can use Reduce()
and data.table::shift()
library(data.table)
setDT(df)[, n4:=Reduce(`+`,shift(c(sales,0,0,0),-3:0))[1:.N]]
Output:
day price price_change sales High_sales_ind sales_next4
1 1 5 0 12 1 27
2 2 5 0 6 0 25
3 3 5 0 5 0 29
4 4 5 0 4 0 34
5 5 5 0 10 1 42
6 6 5 0 10 1 46
7 7 5 0 10 1 39
8 8 5 0 12 1 31
9 9 5 0 14 1 19
10 10 7 2 3 0 5
11 11 7 0 2 0 2
or, could this as part of dplyr/mutate pipeline
mutate(df, sales_next4 = Reduce(`+`, data.table::shift(c(sales,0,0,0),0:-3))[1:nrow(df)])
Upvotes: 0
Reputation: 887058
Using slider
library(dplyr)
library(slider)
df %>%
mutate(sales_next4 = slide_dbl(day, ~ sum(sales[.x]), .after = 3))
day price price_change sales High_sales_ind sales_next4
1 1 5 0 12 1 27
2 2 5 0 6 0 25
3 3 5 0 5 0 29
4 4 5 0 4 0 34
5 5 5 0 10 1 42
6 6 5 0 10 1 46
7 7 5 0 10 1 39
8 8 5 0 12 1 31
9 9 5 0 14 1 19
10 10 7 2 3 0 5
11 11 7 0 2 0 2
Upvotes: 1
Reputation: 35554
You can use map()
from purrr
to do rolling sum depending on the day
column.
library(dplyr)
library(purrr)
df %>%
mutate(sales_next4 = map_dbl(day, ~ sum(sales[between(day, .x, .x+3)])))
# day price price_change sales High_sales_ind sales_next4
# 1 1 5 0 12 1 27
# 2 2 5 0 6 0 25
# 3 3 5 0 5 0 29
# 4 4 5 0 4 0 34
# 5 5 5 0 10 1 42
# 6 6 5 0 10 1 46
# 7 7 5 0 10 1 39
# 8 8 5 0 12 1 31
# 9 9 5 0 14 1 19
# 10 10 7 2 3 0 5
# 11 11 7 0 2 0 2
Upvotes: 1
Reputation: 33782
You can use rollapply
from the zoo
package for cases like this, assuming that the days are consecutive as in the example data provided.
You'll need to use the partial =
and align =
arguments to fill the column correctly, see ?rollapply
for the details.
library(dplyr)
library(zoo)
df <- df %>%
mutate(sales_next4 = rollapply(sales, 4, sum, partial = TRUE, align = "left"))
Result:
day price price_change sales High_sales_ind sales_next4
1 1 5 0 12 1 27
2 2 5 0 6 0 25
3 3 5 0 5 0 29
4 4 5 0 4 0 34
5 5 5 0 10 1 42
6 6 5 0 10 1 46
7 7 5 0 10 1 39
8 8 5 0 12 1 31
9 9 5 0 14 1 19
10 10 7 2 3 0 5
11 11 7 0 2 0 2
Upvotes: 1