Reputation: 467
I need to group by col1, and for each unique value in col2, give an integer starting at 1 and going up from there, but starting back at 1 for the next unique value in col1. It's easier to see in an example:
Here's a data.frame.
df <- data.frame(
col1=c(100, 100, 100, 100, 200, 200, 300, 300, 300),
col2=c("a", "b", "c", "c",
"b", "b",
"a", "c", "c")
)
df
# col1 col2
# 1 100 a
# 2 100 b
# 3 100 c
# 4 100 c
# 5 200 b
# 6 200 b
# 7 300 a
# 8 300 c
# 9 300 c
For each unique value (i.e. group) in col1, each unique value (e.g. group) in col2 should be numbered ordinally:
Desired output:
df %>%
cbind(col3 =c(1, 2, 3, 3,
1, 1,
1, 2, 2))
# col1 col2 col3
# 1 100 a 1
# 2 100 b 2
# 3 100 c 3
# 4 100 c 3
# 5 200 b 1
# 6 200 b 1
# 7 300 a 1
# 8 300 c 2
# 9 300 c 2
Note that for each group of col1, each new value in col2 is given its own integer.
Upvotes: 1
Views: 199
Reputation: 78947
We could use rleid
from data.table
library(data.table)
library(dplyr)
df %>%
group_by(col1) %>%
mutate(new_col = rleid(col2))
Output:
col1 col2 new_col
<dbl> <chr> <int>
1 100 a 1
2 100 b 2
3 100 c 3
4 100 c 3
5 200 b 1
6 200 b 1
7 300 a 1
8 300 c 2
9 300 c 2
Upvotes: 1
Reputation: 389055
Here are 3 ways to do it -
library(dplyr)
df %>%
group_by(col1) %>%
mutate(col3 = match(col2, unique(col2)),
col4 = as.integer(factor(col2)),
col5 = dense_rank(col2)) %>%
ungroup
# col1 col2 col3 col4 col5
# <dbl> <chr> <int> <int> <int>
#1 100 a 1 1 1
#2 100 b 2 2 2
#3 100 c 3 3 3
#4 100 c 3 3 3
#5 200 b 1 1 1
#6 200 b 1 1 1
#7 300 a 1 1 1
#8 300 c 2 2 2
#9 300 c 2 2 2
Upvotes: 3