Reputation: 478
I have the following data:
> dput(DF)
structure(list(NAME = c("Gait", "Roc", "Bo", "Hernd",
"Bet", "Oln", "Gai", "Rock", "Mil", "Arli", "Re", "Fred", "Ro",
"Rock", "Wheat", "Germa", "Rock", "Nort", "Arli",
"Rockv"), AGE = c(33, 43, 37, 45, 44, 35, 22, 30,
38, 23, 45, 43, 67, 43, 28, 47, 16, 29, 22, 31)),
class = "data.frame", row.names = c(NA, -20L))
I want to group the data by specific intervals such that the first group covers AGE 0-19 and the remaining groups are 10-year intervals (20-29, 30-39, etc.) up to the max AGE.
Desired output is:
NAME AGE GROUP
1 Gait 33 3
2 Roc 43 4
3 Bo 37 3
4 Hernd 45 4
5 Bet 44 4
6 Oln 35 3
7 Gai 22 2
8 Rock 30 3
9 Mil 38 3
10 Arli 23 2
11 Re 45 4
12 Fred 43 4
13 Ro 67 6
14 Rock 43 4
15 Wheat 28 2
16 Germa 47 4
17 Rock 16 1
18 Nort 29 2
19 Arli 22 2
20 Rockv 31 3
Please keep in mind this is just a sample of the data and the actual data is larger. My goal is to have one odd interval for group 1, while the remaining groups are all by the same range of 10 years.
Upvotes: 0
Views: 511
Reputation: 388817
You may use cut() to create groups based on defined intervals.
transform(DF, GROUP = cut(AGE, c(0, seq(19, max(AGE) + 10, 10)), labels = FALSE))
# NAME AGE GROUP
#1 Gait 33 3
#2 Roc 43 4
#3 Bo 37 3
#4 Hernd 45 4
#5 Bet 44 4
#6 Oln 35 3
#7 Gai 22 2
#8 Rock 30 3
#9 Mil 38 3
#10 Arli 23 2
#11 Re 45 4
#12 Fred 43 4
#13 Ro 67 6
#14 Rock 43 4
#15 Wheat 28 2
#16 Germa 47 4
#17 Rock 16 1
#18 Nort 29 2
#19 Arli 22 2
#20 Rockv 31 3
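To see which interval each group number stands for, one option is to run the same cut() call without labels = FALSE, so that it returns interval labels instead of integer codes. This is only a small sketch on a few of the ages above; the names ages, breaks, intervals and codes are made up for illustration.

```r
ages <- c(16, 22, 29, 30, 67)                 # a few AGE values from DF
breaks <- c(0, seq(19, max(ages) + 10, 10))   # 0 19 29 39 49 59 69

# Default labels show the intervals, which are closed on the right by default
intervals <- cut(ages, breaks)

# labels = FALSE returns the position of the interval instead
codes <- cut(ages, breaks, labels = FALSE)

data.frame(AGE = ages, INTERVAL = intervals, GROUP = codes)
```

Because the intervals are closed on the right, 29 stays in group 2 and 30 is the first age in group 3, which matches the desired 20-29 and 30-39 ranges.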
The key part here is how we create the intervals with c() and seq(), which define the groups.
c(0, seq(19, max(DF$AGE) + 10, 10))
#[1] 0 19 29 39 49 59 69
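One thing to keep in mind with these breaks is that cut() uses right-closed intervals by default, so the first interval is (0, 19] and an AGE of exactly 0 would come back as NA. If that can occur in the larger data, a possible variation is to pass include.lowest = TRUE. The helper name age_group and the sample vector below are hypothetical, and the sketch assumes non-negative ages.

```r
# Hypothetical helper: group 1 is 0-19, later groups are 10-year bands
age_group <- function(age) {
  breaks <- c(0, seq(19, max(age, na.rm = TRUE) + 10, 10))
  # include.lowest = TRUE turns the first interval into [0, 19], so 0 is kept
  cut(age, breaks, labels = FALSE, include.lowest = TRUE)
}

age_group(c(0, 16, 19, 20, 29, 30, 67))
```

It could then be used the same way as before, for example transform(DF, GROUP = age_group(AGE)).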
Upvotes: 2