Hanting Yong

Reputation: 43

Assign unique numbers to unique values within a group in dplyr?

I would like to generate the column well_rep from the values of "well" after grouping by "prop", so that each distinct well gets its own number within its prop group.

well prop well_rep
C03 0 1
C03 0 1
C03 0 1
C03 0 1
C03 0 1
C05 0 2
C05 0 2
C05 0 2
C05 0 2
C05 0 2
C05 0 2
C05 0 2
D02 50 1
D02 50 1
D02 50 1
D02 50 1
D02 50 1
D02 50 1
D02 50 1
D02 50 1
D02 50 1
E07 50 2
E07 50 2
E07 50 2
E07 50 2
E07 50 2
E07 50 2
E07 50 2
E07 50 2
E07 50 2
E07 50 2
E07 50 2
F02 50 3
F02 50 3
F02 50 3
F02 50 3
F02 50 3
F02 50 3
F02 50 3
F02 50 3

Is there something like cur_group_id(), but with the numbers restarting from 1 in each group?

Upvotes: 3

Views: 1386

Answers (4)

ThomasIsCoding

Reputation: 102890

Here is a base R option using ave + match:

transform(
  df,
  well_rep = as.numeric(ave(well, prop, FUN = function(x) match(x, unique(x))))
)
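
To make the ave + match combination easier to follow, here is a minimal sketch on a small made-up data frame. The names toy and first_seen_index are hypothetical and not part of the question's data; the sketch assumes well is a character column.

```r
# Toy data (hypothetical, not the data from the question)
toy <- data.frame(
  well = c("A1", "A1", "B2", "C3", "C3", "D4"),
  prop = c(0, 0, 0, 50, 50, 50),
  stringsAsFactors = FALSE
)

# Within one group, match(x, unique(x)) maps each value to the
# position of its first appearance among the distinct values
first_seen_index <- function(x) match(x, unique(x))

# ave() applies the function separately for each value of prop and
# returns the results in the original row order. Because well is
# character, the result comes back as character, hence as.numeric()
toy$well_rep <- as.numeric(ave(toy$well, toy$prop, FUN = first_seen_index))
toy$well_rep
# 1 1 2 1 1 2
```

The numbering restarts at 1 whenever prop changes, which is the behaviour asked for.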

Upvotes: 0

akrun

Reputation: 887951

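If we want to use cur_group_id, do a nested grouping and then extract the column:

library(dplyr)
df %>%
    group_by(prop) %>%
    # within each prop group, regroup the current rows by well and take the group id;
    # note that cur_data() is deprecated as of dplyr 1.1.0 in favor of pick()
    mutate(well_rep2 = cur_data() %>%
                  group_by(well) %>%
                  transmute(out = cur_group_id()) %>%
                  pull(out))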

Output:

# A tibble: 40 x 4
# Groups:   prop [2]
   well   prop well_rep well_rep2
   <chr> <int>    <int>     <int>
 1 C03       0        1         1
 2 C03       0        1         1
 3 C03       0        1         1
 4 C03       0        1         1
 5 C03       0        1         1
 6 C05       0        2         2
 7 C05       0        2         2
 8 C05       0        2         2
 9 C05       0        2         2
10 C05       0        2         2
# … with 30 more rows
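
As a rough illustration of why the nested grouping is needed, the sketch below groups a small made-up tibble by both columns at once and calls cur_group_id() directly. The names toy and flat are hypothetical; the point is only that the ids then run across all groups instead of restarting within each prop.

```r
library(dplyr)

# Toy data (hypothetical): two wells in each prop group
toy <- tibble(
  well = c("A1", "B2", "C3", "D4"),
  prop = c(0, 0, 50, 50)
)

# Grouping by prop and well together numbers every (prop, well)
# combination globally, so the ids do not restart at 1 for prop == 50
flat <- toy %>%
  group_by(prop, well) %>%
  mutate(id = cur_group_id()) %>%
  ungroup()

flat$id
# 1 2 3 4
```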

data

df <- structure(list(well = c("C03", "C03", "C03", "C03", "C03", "C05", 
"C05", "C05", "C05", "C05", "C05", "C05", "D02", "D02", "D02", 
"D02", "D02", "D02", "D02", "D02", "D02", "E07", "E07", "E07", 
"E07", "E07", "E07", "E07", "E07", "E07", "E07", "E07", "F02", 
"F02", "F02", "F02", "F02", "F02", "F02", "F02"), prop = c(0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 50L, 50L, 50L, 50L, 
50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 
50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L), well_rep = c(1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L)), class = "data.frame", row.names = c(NA, 
-40L))

Upvotes: 1

Onyambu

Reputation: 79348

You could do:

df %>%
  group_by(prop) %>%
  mutate(well_rep = as.numeric(as.factor(well)))
# A tibble: 40 x 3
# Groups:   prop [2]
   well   prop well_rep
   <chr> <int>    <dbl>
 1 C03       0        1
 2 C03       0        1
 3 C03       0        1
 4 C03       0        1
 5 C03       0        1
 6 C05       0        2
 7 C05       0        2
 8 C05       0        2
 9 C05       0        2
10 C05       0        2
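
One thing worth keeping in mind with the factor approach: factor levels are sorted, so the numbers follow the sorted order of the wells rather than their order of appearance. For the data in the question both orders coincide. A small sketch on a made-up vector (x, by_level and by_first are hypothetical names) shows where they differ:

```r
# Hypothetical wells that are not in sorted order
x <- c("E07", "C03", "E07", "D02")

# Factor levels are sorted (C03, D02, E07), so codes follow that order
by_level <- as.numeric(as.factor(x))
by_level
# 3 1 3 2

# For comparison, numbering by order of first appearance
by_first <- match(x, unique(x))
by_first
# 1 2 1 3
```

If the numbering should follow sorted well names, the factor version is the one to use; if it should follow the row order, match with unique is the closer fit.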

Upvotes: 2

Ronak Shah

Reputation: 389325

Here are a couple of ways using match and duplicated:

library(dplyr)

df %>%
  group_by(prop) %>%
  mutate(well_rep1 = match(well, unique(well)), 
         well_rep2 = cumsum(!duplicated(well)))

#   well prop well_rep well_rep1 well_rep2
#1   C03    0        1         1         1
#2   C03    0        1         1         1
#3   C03    0        1         1         1
#4   C03    0        1         1         1
#5   C03    0        1         1         1
#6   C05    0        2         2         2
#7   C05    0        2         2         2
#8   C05    0        2         2         2
#9   C05    0        2         2         2
#10  C05    0        2         2         2
#11  C05    0        2         2         2
#12  C05    0        2         2         2
#13  D02   50        1         1         1
#14  D02   50        1         1         1
#15  D02   50        1         1         1
#16  D02   50        1         1         1
#17  D02   50        1         1         1
#18  D02   50        1         1         1
#19  D02   50        1         1         1
#20  D02   50        1         1         1
#21  D02   50        1         1         1
#22  E07   50        2         2         2
#23  E07   50        2         2         2
#24  E07   50        2         2         2
#25  E07   50        2         2         2
#26  E07   50        2         2         2
#27  E07   50        2         2         2
#28  E07   50        2         2         2
#29  E07   50        2         2         2
#30  E07   50        2         2         2
#31  E07   50        2         2         2
#32  E07   50        2         2         2
#33  F02   50        3         3         3
#34  F02   50        3         3         3
#35  F02   50        3         3         3
#36  F02   50        3         3         3
#37  F02   50        3         3         3
#38  F02   50        3         3         3
#39  F02   50        3         3         3
#40  F02   50        3         3         3
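
The two versions agree here because the rows for each well are contiguous. As a hedged sketch on a made-up vector (x, via_match and via_cumsum are hypothetical names), they can diverge when a well reappears after another one:

```r
# Hypothetical wells where "C03" shows up again after "C05"
x <- c("C03", "C05", "C03", "D02")

# match() looks each value up among the distinct values, so a
# repeated well always gets the number it received the first time
via_match <- match(x, unique(x))
via_match
# 1 2 1 3

# cumsum(!duplicated()) only counts how many distinct values have
# been seen so far, so the repeated "C03" inherits the running count
via_cumsum <- cumsum(!duplicated(x))
via_cumsum
# 1 2 2 3
```

So if the data might not be sorted by well within each prop, the match version is probably the safer choice.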

Upvotes: 4
