Reputation: 463
What is the most efficient way to create a new variable that increases by x units after every x rows? For example, I have a data frame:
d <- data.frame(group_var = c('a', 'b', 'c'),
y = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21, 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21, 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21))
And I'd like to create a new variable that starts with x and increases by x every x rows so I would get a data frame like this:
d <- data.frame(group_var = c('a', 'b', 'c'),
y = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21),
z = c(5,5,5,5,5,10,10,10,10,10,15,15,15,15,15,20,20,20,20,20,20,5,5,5,5,5,10,10,10,10,10,15,15,15,15,15,20,20,20,20,20,20,5,5,5,5,5,10,10,10,10,10,15,15,15,15,15,20,20,20,20,20,20))
Also, when I have remainders, as the previous dataframe does, I would like those to be sequenced with the previous group (so y=11 would be z=10). Note that my goal dataframe preserves the same number of rows as the original dataframe.
Upvotes: 1
Views: 273
Reputation: 886948
We can create a grouping variable based on the diff
on 'y', then create the 'z' with gl
and multiply by 5
library(dplyr)
library(tidyr)
d1 <- d %>%
group_by(grp = cumsum(c(TRUE, diff(y) < 0))) %>%
mutate(z = as.integer(gl(n(), 5, n())) * 5,
z = replace(z, ave(z, z, FUN = length) < 5, NA)) %>%
ungroup %>%
fill(z) %>%
select(-grp)
-output
as.data.frame(d1)
group_var y z
1 a 1 5
2 b 2 5
3 c 3 5
4 a 4 5
5 b 5 5
6 c 6 10
7 a 7 10
8 b 8 10
9 c 9 10
10 a 10 10
11 b 11 15
12 c 12 15
13 a 13 15
14 b 14 15
15 c 15 15
16 a 16 20
17 b 17 20
18 c 18 20
19 a 19 20
20 b 20 20
21 c 21 20
22 a 1 5
23 b 2 5
24 c 3 5
25 a 4 5
26 b 5 5
27 c 6 10
28 a 7 10
29 b 8 10
30 c 9 10
31 a 10 10
32 b 11 15
33 c 12 15
34 a 13 15
35 b 14 15
36 c 15 15
37 a 16 20
38 b 17 20
39 c 18 20
40 a 19 20
41 b 20 20
42 c 21 20
43 a 1 5
44 b 2 5
45 c 3 5
46 a 4 5
47 b 5 5
48 c 6 10
49 a 7 10
50 b 8 10
51 c 9 10
52 a 10 10
53 b 11 15
54 c 12 15
55 a 13 15
56 b 14 15
57 c 15 15
58 a 16 20
59 b 17 20
60 c 18 20
61 a 19 20
62 b 20 20
63 c 21 20
Upvotes: 1