JnrfL

Reputation: 189

Applying rep() within groups through dplyr

I've been trying to generate an alternating sequence of 1 and 2 within groups. I'd like to use rep and dplyr, but I can't work out how to apply rep within groups. So far I've only managed it by manually separating the groups and specifying the correct range for each one. I would like to know how rep could be applied using dplyr.

Here's some sample data.

df <- data.frame(date = c("2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02"),
                 loc =c("AB", "AB", "AB", "AB", "AB", "AB", "AB", "AB", "CD", "CD", "CD", "CD", "CD", "CD", "CD", "CD", "CD", "CD"),
                 cat = c("a", "a", "a", "b", "b", "b", "b", "b", "c", "c", "c", "c", "c", "d", "d", "d", "d", "d"))

This is essentially the code I run for each group, here applied to the entire dataset.

df$type <- rep(1:2,nrow(df)/2)

As you can see, the output ignores the cat column: groups b and d should each have started at 1.

         date loc cat type
1  2017-01-01  AB   a    1
2  2017-01-01  AB   a    2
3  2017-01-01  AB   a    1
4  2017-01-01  AB   b    2
5  2017-01-01  AB   b    1
6  2017-01-01  AB   b    2
7  2017-01-01  AB   b    1
8  2017-01-02  AB   b    2
9  2017-01-02  CD   c    1
10 2017-01-02  CD   c    2
11 2017-01-02  CD   c    1
12 2017-01-02  CD   c    2
13 2017-01-02  CD   c    1
14 2017-01-02  CD   d    2
15 2017-01-02  CD   d    1
16 2017-01-02  CD   d    2
17 2017-01-02  CD   d    1
18 2017-01-02  CD   d    2

UPDATE: Here's the desired output.

         date loc cat type
1  2017-01-01  AB   a    1
2  2017-01-01  AB   a    2
3  2017-01-01  AB   a    1
4  2017-01-01  AB   b    1
5  2017-01-01  AB   b    2
6  2017-01-01  AB   b    1
7  2017-01-01  AB   b    2
8  2017-01-02  AB   b    1
9  2017-01-02  CD   c    1
10 2017-01-02  CD   c    2
11 2017-01-02  CD   c    1
12 2017-01-02  CD   c    2
13 2017-01-02  CD   c    1
14 2017-01-02  CD   d    1
15 2017-01-02  CD   d    2
16 2017-01-02  CD   d    1
17 2017-01-02  CD   d    2
18 2017-01-02  CD   d    1

Upvotes: 1

Views: 2645

Answers (1)

Marius

Reputation: 60130

Assuming that cat is the only relevant grouping variable here (not date and loc), you can do:

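library(dplyr)

# Restart the 1, 2 pattern within each cat group;
# length.out cuts the repeated pattern to the size of the group
df = df %>%
    group_by(cat) %>%
    mutate(type = rep(1:2, length.out = length(cat)))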
# Output:
         date    loc    cat  type
       <fctr> <fctr> <fctr> <int>
1  2017-01-01     AB      a     1
2  2017-01-01     AB      a     2
3  2017-01-01     AB      a     1
4  2017-01-01     AB      b     1
5  2017-01-01     AB      b     2
6  2017-01-01     AB      b     1
7  2017-01-01     AB      b     2
8  2017-01-02     AB      b     1
9  2017-01-02     CD      c     1
10 2017-01-02     CD      c     2
11 2017-01-02     CD      c     1
12 2017-01-02     CD      c     2
13 2017-01-02     CD      c     1
14 2017-01-02     CD      d     1
15 2017-01-02     CD      d     2
16 2017-01-02     CD      d     1
17 2017-01-02     CD      d     2
18 2017-01-02     CD      d     1
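
If date and loc should also define the groups, the same pattern should extend by listing them in group_by(). Below is a minimal sketch under that assumption, not a definitive version: it rebuilds the question's sample data compactly, uses dplyr's n() (the row count of the current group) in place of length(cat), and the names by_cat and by_all are only illustrative.

```r
library(dplyr)

# Rebuild the question's 18-row sample data compactly
df <- data.frame(
  date = rep(c("2017-01-01", "2017-01-02"), times = c(7, 11)),
  loc  = rep(c("AB", "CD"), times = c(8, 10)),
  cat  = rep(c("a", "b", "c", "d"), times = c(3, 5, 5, 5))
)

# Group by cat only, as above; n() is the current group's row count
by_cat <- df %>%
  group_by(cat) %>%
  mutate(type = rep(1:2, length.out = n())) %>%
  ungroup()

# Assumed variant: treat every date/loc/cat combination as its own group
by_all <- df %>%
  group_by(date, loc, cat) %>%
  mutate(type = rep(1:2, length.out = n())) %>%
  ungroup()
```

With this particular sample data both versions give the same type column: the only group that gets split is cat b, and its single 2017-01-02 row is 1 either way (fifth element of the cat-only group, first element of its own group). On other data the two groupings can differ, so pick the one that matches what a "group" means for you. The ungroup() at the end just keeps later operations from being grouped by accident.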

Upvotes: 3
