Village.Idyot

Reputation: 2043

How to count groupings of elements in base R or dplyr using multiple conditions?

I am trying to number the instances of each element, where rows that share a grouping code ("Group") greater than 0 count as a single instance and rows with Group = 0 are each counted separately. Suppose we start with the output DF below, generated by the code immediately beneath it:

   Element Group reSeq
   <chr>   <dbl> <int>
 1 R           0     1
 2 R           0     1
 3 X           0     1
 4 X           1     2
 5 X           1     2
 6 X           0     1
 7 X           0     1
 8 X           0     1
 9 B           0     1
10 R           0     1
11 R           2     2
12 R           2     2
13 X           3     3
14 X           3     3
15 X           3     3

library(dplyr)

myDF <- data.frame(
  Element = c("R","R","X","X","X","X","X","X","B","R","R","R","X","X","X"),
  Group = c(0,0,0,1,1,0,0,0,0,0,2,2,3,3,3)
)

myDF %>% group_by(Element) %>% mutate(reSeq = match(Group, unique(Group)))

Instead, I would like the reSeq column to be calculated as shown below, with explanations to the right:

   Element Group reSeq reSeq explanation
   <chr>   <dbl> <int>
 1 R           0     1  1st instance of R (ungrouped)(Group = 0 means not grouped)
 2 R           0     2  2nd instance of R (ungrouped)(Group = 0 means not grouped)
 3 X           0     1  1st instance of X (ungrouped)(Group = 0 means not grouped)
 4 X           1     2  2nd instance of X (grouped by Group = 1)
 5 X           1     2  2nd instance of X (grouped by Group = 1)
 6 X           0     3  3rd instance of X (ungrouped)
 7 X           0     4  4th instance of X (ungrouped)
 8 X           0     5  5th instance of X (ungrouped)
 9 B           0     1  1st instance of B (ungrouped)
10 R           0     3  3rd instance of R (ungrouped)
11 R           2     4  4th instance of R (grouped by Group = 2)
12 R           2     4  4th instance of R (grouped by Group = 2)
13 X           3     6  6th instance of X (grouped by Group = 3)
14 X           3     6  6th instance of X (grouped by Group = 3)
15 X           3     6  6th instance of X (grouped by Group = 3)

Any recommendations for doing this? If possible, I would like to start from the dplyr code above, because I am fairly familiar with it.

Upvotes: 3

Views: 47

Answers (2)

akrun

Reputation: 887118

If we use rowid from data.table, we can skip a couple of steps:

library(dplyr)
library(data.table)
library(tidyr)
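myDF %>%
  # number rows within each Element; NA^!(cond) blanks rows that repeat a nonzero Group
  mutate(reSeq = rowid(Element) * NA^!(Group == 0 | !duplicated(Group))) %>%
  group_by(Element) %>%
  # carry the last kept number down over the blanked rows
  fill(reSeq) %>%
  # renumber consecutively within each Element
  mutate(reSeq = match(reSeq, unique(reSeq))) %>%
  ungroup()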

-output

# A tibble: 15 × 3
   Element Group reSeq
   <chr>   <dbl> <int>
 1 R           0     1
 2 R           0     2
 3 X           0     1
 4 X           1     2
 5 X           1     2
 6 X           0     3
 7 X           0     4
 8 X           0     5
 9 B           0     1
10 R           0     3
11 R           2     4
12 R           2     4
13 X           3     6
14 X           3     6
15 X           3     6
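
The `NA^!(...)` idiom in the pipeline above is easy to misread, so here is a minimal sketch of it in isolation. It relies on R evaluating `NA^0` as 1 while `NA^1` stays NA. The vectors `x` and `keep` are made up for illustration and are not part of the answer.

```r
# Hypothetical values, for illustration only
x    <- c(10, 20, 30)
keep <- c(TRUE, FALSE, TRUE)

# !keep is FALSE (0) where a value is kept and TRUE (1) where it is dropped.
# NA^0 is 1 and NA^1 is NA, so the multiplier is 1 for kept positions and NA otherwise.
masked <- x * NA^(!keep)
masked
```

In the answer, the condition keeps ungrouped rows (Group == 0) and the first row of each nonzero Group, and `fill()` then carries that number down. Since `duplicated(Group)` is evaluated over the whole column, this appears to assume that a nonzero Group code is not reused for a separate block of rows.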

Upvotes: 1

Village.Idyot

Reputation: 2043

Below is what I managed to cobble together. Maybe there's a cleaner solution? Here's the code:

library(dplyr)
library(tidyr)

myDF %>%
  group_by(Element) %>%
    mutate(eleCnt = row_number()) %>%
  ungroup() %>%
  mutate(reSeq = ifelse(Group == 0 | Group != lag(Group), eleCnt, 0)) %>%
  mutate(reSeq = na_if(reSeq, 0)) %>%
  group_by(Element) %>%
    fill(reSeq) %>%
    mutate(reSeq = match(reSeq, unique(reSeq))) %>%
  ungroup()

And here's the output:

# A tibble: 15 x 4
   Element Group eleCnt reSeq
   <chr>   <dbl>  <int> <int>
 1 R           0      1     1
 2 R           0      2     2
 3 X           0      1     1
 4 X           1      2     2
 5 X           1      3     2
 6 X           0      4     3
 7 X           0      5     4
 8 X           0      6     5
 9 B           0      1     1
10 R           0      3     3
11 R           2      4     4
12 R           2      5     4
13 X           3      7     6
14 X           3      8     6
15 X           3      9     6
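
As one possible shorter variant (a sketch, not a tested replacement for the code above), the same numbering can be expressed as a running count of "new instance" flags within each Element. The helper column `is_new` is a name introduced here for illustration. The sketch assumes that Group codes are never negative and that the rows of a nonzero Group sit next to each other within their Element. Unlike the code above, `lag()` here runs inside `group_by(Element)`, so it compares each row with the previous row of the same Element rather than the previous row of the whole data frame.

```r
library(dplyr)

myDF <- data.frame(
  Element = c("R","R","X","X","X","X","X","X","B","R","R","R","X","X","X"),
  Group = c(0,0,0,1,1,0,0,0,0,0,2,2,3,3,3)
)

result <- myDF %>%
  group_by(Element) %>%
  mutate(
    # a row starts a new instance if it is ungrouped (Group == 0)
    # or its Group differs from the previous row of the same Element;
    # default = 0 makes the first row of each Element always count as new
    is_new = Group == 0 | Group != lag(Group, default = 0),
    # running count of new instances within the Element
    reSeq = cumsum(is_new)
  ) %>%
  ungroup() %>%
  select(-is_new)

result
```

On the sample data this gives the reSeq values requested in the question: 1, 2, 1, 2, 2, 3, 4, 5, 1, 3, 4, 4, 6, 6, 6.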

Upvotes: 1
