Reputation: 3432
I have a df such as:
COL1 COL2
G1 SP1
G1 SP2
G1 SP1
G2 SP1
G2 SP6
G2 SP9
G3 SP9
G4 SP9
G4 SP2
and I would like to add a COL3 which is by group of COL1 the number of different COL2 values:
COL1 COL2 COL3
G1 SP1 2
G1 SP2 2
G1 SP1 2
G2 SP1 3
G2 SP6 3
G2 SP9 3
G3 SP9 1
G4 SP9 2
G4 SP2 2
Upvotes: 1
Views: 55
Reputation: 886938
Another option with data.table
library(data.table)
setDT(df)[, COL3 := uniqueN(COL2), COL1]
Upvotes: 0
Reputation: 5684
Another way could be
df %>%
group_by(COL1) %>%
mutate(COL3 = length(unique(c_across(COL2))))
# A tibble: 9 x 3
# Groups: COL1 [4]
COL1 COL2 COL3
<chr> <chr> <int>
1 G1 SP1 2
2 G1 SP2 2
3 G1 SP1 2
4 G2 SP1 3
5 G2 SP6 3
6 G2 SP9 3
7 G3 SP9 1
8 G4 SP9 2
9 G4 SP2 2
Upvotes: 1
Reputation: 173793
You can use n_distinct(COL2)
after grouping by COL1
:
library(dplyr)
df %>%
group_by(COL1) %>%
mutate(COL3 = n_distinct(COL2))
#> # A tibble: 9 x 3
#> # Groups: COL1 [4]
#> COL1 COL2 COL3
#> <fct> <fct> <int>
#> 1 G1 SP1 2
#> 2 G1 SP2 2
#> 3 G1 SP1 2
#> 4 G2 SP1 3
#> 5 G2 SP6 3
#> 6 G2 SP9 3
#> 7 G3 SP9 1
#> 8 G4 SP9 2
#> 9 G4 SP2 2
Upvotes: 2