I'm trying to generate an alternating sequence of 1 and 2 that restarts within each group.
I'd like to use rep() together with dplyr, but I can't work out how to apply rep() within groups. So far I've only managed it by manually splitting the data into groups and specifying the correct range for each one. I would like to know how rep() can be applied per group using dplyr.
Here's some sample data.
df <- data.frame(
  date = c("2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02"),
  loc = c("AB", "AB", "AB", "AB", "AB", "AB", "AB", "AB", "CD", "CD", "CD", "CD", "CD", "CD", "CD", "CD", "CD", "CD"),
  cat = c("a", "a", "a", "b", "b", "b", "b", "b", "c", "c", "c", "c", "c", "d", "d", "d", "d", "d"))
This is essentially the code I run for each group, applied here to the entire dataset.
df$type <- rep(1:2,nrow(df)/2)
As you can see, the output disregards the cat column: the sequence for categories b and d should have restarted at 1.
date loc cat type
1 2017-01-01 AB a 1
2 2017-01-01 AB a 2
3 2017-01-01 AB a 1
4 2017-01-01 AB b 2
5 2017-01-01 AB b 1
6 2017-01-01 AB b 2
7 2017-01-01 AB b 1
8 2017-01-02 AB b 2
9 2017-01-02 CD c 1
10 2017-01-02 CD c 2
11 2017-01-02 CD c 1
12 2017-01-02 CD c 2
13 2017-01-02 CD c 1
14 2017-01-02 CD d 2
15 2017-01-02 CD d 1
16 2017-01-02 CD d 2
17 2017-01-02 CD d 1
UPDATE: Here's the desired output.
date loc cat type
1 2017-01-01 AB a 1
2 2017-01-01 AB a 2
3 2017-01-01 AB a 1
4 2017-01-01 AB b 1
5 2017-01-01 AB b 2
6 2017-01-01 AB b 1
7 2017-01-01 AB b 2
8 2017-01-02 AB b 1
9 2017-01-02 CD c 1
10 2017-01-02 CD c 2
11 2017-01-02 CD c 1
12 2017-01-02 CD c 2
13 2017-01-02 CD c 1
14 2017-01-02 CD d 1
15 2017-01-02 CD d 2
16 2017-01-02 CD d 1
17 2017-01-02 CD d 2
Assuming that cat is the only relevant grouping variable here (not date and loc), you can do:
library(dplyr)

df <- df %>%
  group_by(cat) %>%                            # restart the sequence for each value of cat
  mutate(type = rep(1:2, length.out = n()))    # alternate 1, 2, ... up to the group's row count
# Output:
date loc cat type
<fctr> <fctr> <fctr> <int>
1 2017-01-01 AB a 1
2 2017-01-01 AB a 2
3 2017-01-01 AB a 1
4 2017-01-01 AB b 1
5 2017-01-01 AB b 2
6 2017-01-01 AB b 1
7 2017-01-01 AB b 2
8 2017-01-02 AB b 1
9 2017-01-02 CD c 1
10 2017-01-02 CD c 2
11 2017-01-02 CD c 1
12 2017-01-02 CD c 2
13 2017-01-02 CD c 1
14 2017-01-02 CD d 1
15 2017-01-02 CD d 2
16 2017-01-02 CD d 1
17 2017-01-02 CD d 2
18 2017-01-02 CD d 1
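If date and loc should also define the groups, the same idea extends to several grouping columns. The following is a minimal base R sketch using ave(), not part of the original answer. The helper name alternate and the column name type_all are made up for illustration, and the sample data is rebuilt with rep() on the assumption that it matches the 18 rows in the question.

```r
# Sample data rebuilt compactly; assumed identical to the question's df (18 rows).
df <- data.frame(
  date = rep(c("2017-01-01", "2017-01-02"), times = c(7, 11)),
  loc  = rep(c("AB", "CD"), times = c(8, 10)),
  cat  = rep(c("a", "b", "c", "d"), times = c(3, 5, 5, 5))
)

# Hypothetical helper: alternate 1, 2, 1, ... for however many rows a group has.
# length.out truncates the pattern, so odd-sized groups work too.
alternate <- function(idx) rep(1:2, length.out = length(idx))

# Group by cat only (same grouping as the dplyr answer).
df$type <- ave(seq_len(nrow(df)), df$cat, FUN = alternate)

# Group by date, loc and cat together.
df$type_all <- ave(seq_len(nrow(df)), df$date, df$loc, df$cat, FUN = alternate)
```

On this particular sample the two columns happen to coincide, because the only row where the groupings differ (row 8) gets a 1 either way. With dplyr, the multi-column grouping would be written as group_by(date, loc, cat) in place of group_by(cat).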