Reputation: 79

Using row index number to calculate values

I'm having trouble using the row number as index. For example I want a new column that will give me the sales taking into account the next 4 days. I want to create column name:sale_next 4

The issue with my code is that I don't know how to make the index of the row_number() variable, since what I'm doing is fetching the actual value of the column.

#heres to create the data
df <- read.table(text = "day    price   price_change    sales   High_sales_ind
1 5 0   12  1
2   5   0   6   0
3   5   0   5   0
4   5   0   4   0
5   5   0   10  1
6   5   0   10  1
7   5   0   10  1
8   5   0   12  1
9   5   0   14  1
10  7   2   3   0
11  7   0   2   0", header = TRUE)


#my code

df<- df %>% mutate(sales_next4 = sales[row_number():sales_rownumber()+4)

What I need:

day	price	price_change	sales	High_sales_ind	sales_next4
1	5	0	12	1	27
2	5	0	6	0	25
3	5	0	5	0	29
4	5	0	4	0	34
5	5	0	10	1	42
6	5	0	10	1	46
7	5	0	10	1	39
8	5	0	12	1	31
9	5	0	14	1	19
10	7	2	3	0	5
11	7	0	2	0	2

Any help would be appreciated.

Upvotes: 2

Answers (4)

langtang

Reputation: 24722

You can use Reduce() and data.table::shift()

library(data.table)
setDT(df)[, n4:=Reduce(`+`,shift(c(sales,0,0,0),-3:0))[1:.N]]

Output:

   day price price_change sales High_sales_ind sales_next4
1    1     5            0    12              1          27
2    2     5            0     6              0          25
3    3     5            0     5              0          29
4    4     5            0     4              0          34
5    5     5            0    10              1          42
6    6     5            0    10              1          46
7    7     5            0    10              1          39
8    8     5            0    12              1          31
9    9     5            0    14              1          19
10  10     7            2     3              0           5
11  11     7            0     2              0           2

or, could this as part of dplyr/mutate pipeline


mutate(df, sales_next4 = Reduce(`+`, data.table::shift(c(sales,0,0,0),0:-3))[1:nrow(df)])

Upvotes: 0

akrun

Reputation: 887058

Using slider

library(dplyr)
library(slider)
df %>% 
  mutate(sales_next4 = slide_dbl(day, ~ sum(sales[.x]), .after = 3))
   day price price_change sales High_sales_ind sales_next4
1    1     5            0    12              1          27
2    2     5            0     6              0          25
3    3     5            0     5              0          29
4    4     5            0     4              0          34
5    5     5            0    10              1          42
6    6     5            0    10              1          46
7    7     5            0    10              1          39
8    8     5            0    12              1          31
9    9     5            0    14              1          19
10  10     7            2     3              0           5
11  11     7            0     2              0           2

Upvotes: 1

Darren Tsai

Reputation: 35554

You can use map() from purrr to do rolling sum depending on the day column.

library(dplyr)
library(purrr)

df %>%
  mutate(sales_next4 = map_dbl(day, ~ sum(sales[between(day, .x, .x+3)])))

#    day price price_change sales High_sales_ind sales_next4
# 1    1     5            0    12              1          27
# 2    2     5            0     6              0          25
# 3    3     5            0     5              0          29
# 4    4     5            0     4              0          34
# 5    5     5            0    10              1          42
# 6    6     5            0    10              1          46
# 7    7     5            0    10              1          39
# 8    8     5            0    12              1          31
# 9    9     5            0    14              1          19
# 10  10     7            2     3              0           5
# 11  11     7            0     2              0           2

Upvotes: 1

neilfws

Reputation: 33782

You can use rollapply from the zoo package for cases like this, assuming that the days are consecutive as in the example data provided.

You'll need to use the partial = and align = arguments to fill the column correctly, see ?rollapply for the details.

library(dplyr)
library(zoo)

df <- df %>% 
  mutate(sales_next4 = rollapply(sales, 4, sum, partial = TRUE, align = "left"))

Result:

   day price price_change sales High_sales_ind sales_next4
1    1     5            0    12              1          27
2    2     5            0     6              0          25
3    3     5            0     5              0          29
4    4     5            0     4              0          34
5    5     5            0    10              1          42
6    6     5            0    10              1          46
7    7     5            0    10              1          39
8    8     5            0    12              1          31
9    9     5            0    14              1          19
10  10     7            2     3              0           5
11  11     7            0     2              0           2

Upvotes: 1

day	price	price_change	sales	High_sales_ind	sales_next4
1	5	0	12	1	27
2	5	0	6	0	25
3	5	0	5	0	29
4	5	0	4	0	34
5	5	0	10	1	42
6	5	0	10	1	46
7	5	0	10	1	39
8	5	0	12	1	31
9	5	0	14	1	19
10	7	2	3	0	5
11	7	0	2	0	2

day	price	price_change	sales	High_sales_ind	sales_next4
1	5	0	12	1	27
2	5	0	6	0	25
3	5	0	5	0	29
4	5	0	4	0	34
5	5	0	10	1	42
6	5	0	10	1	46
7	5	0	10	1	39
8	5	0	12	1	31
9	5	0	14	1	19
10	7	2	3	0	5
11	7	0	2	0	2

Using row index number to calculate values

Answers (4)

Related Questions

day	price	price_change	sales	High_sales_ind	sales_next4
1	5	0	12	1	27
2	5	0	6	0	25
3	5	0	5	0	29
4	5	0	4	0	34
5	5	0	10	1	42
6	5	0	10	1	46
7	5	0	10	1	39
8	5	0	12	1	31
9	5	0	14	1	19
10	7	2	3	0	5
11	7	0	2	0	2