starski
starski

Reputation: 359

How to create another column in a data frame based on repeated observations in another column?

So basically I have a data frame that looks like this:

BX BY
1 12
1 12
1 12
2 14
2 14
3 5

I want to create another colum ID, which will have the same number for the same values in BX and BY. So the table would look like this then:

BX BY ID
1 12 1
1 12 1
1 12 1
2 14 2
2 14 2
3 5 3

Upvotes: 1

Views: 120

Answers (2)

Rui Barradas
Rui Barradas

Reputation: 76402

Here is a base R way.
Subset the data.frame by the grouping columns, find the duplicated rows and use a standard cumsum trick.

df1<-'BX    BY
1   12
1   12
1   12
2   14
2   14
3   5'
df1 <- read.table(textConnection(df1), header = TRUE)

cumsum(!duplicated(df1[c("BX", "BY")]))
#> [1] 1 1 1 2 2 3

df1$ID <- cumsum(!duplicated(df1[c("BX", "BY")]))
df1
#>   BX BY ID
#> 1  1 12  1
#> 2  1 12  1
#> 3  1 12  1
#> 4  2 14  2
#> 5  2 14  2
#> 6  3  5  3

Created on 2022-10-12 with reprex v2.0.2

Upvotes: 3

lroha
lroha

Reputation: 34406

You can do:

transform(dat, ID = as.numeric(interaction(dat, drop = TRUE, lex.order = TRUE)))

  BX BY ID
1  1 12  1
2  1 12  1
3  1 12  1
4  2 14  2
5  2 14  2
6  3  5  3

Or if you prefer dplyr:

library(dplyr)

dat %>% 
  group_by(across()) %>%
  mutate(ID = cur_group_id()) %>%
  ungroup()

# A tibble: 6 × 3
     BX    BY    ID
  <dbl> <dbl> <int>
1     1    12     1
2     1    12     1
3     1    12     1
4     2    14     2
5     2    14     2
6     3     5     3

Upvotes: 1

Related Questions