Primož Dolžan
Primož Dolžan

Reputation: 57

Creating new id by group of two columns with possibility of the same values

I would appreciate some help I am stuck. I would like to create a new id based on two columns where the last column can have the same value. I have to maintain the current order which is done by a third column ORDER.

    df <- data.frame(id = c(1, 1, 1, 1, 1, 1,1,2,2,2),order=c(1,2,3,4,5,6,7,1,2,3), 
value = c(999, 999, 999, 250, 250, 999,999,250,250,999))

   id order value
1   1     1   999
2   1     2   999
3   1     3   999
4   1     4   250
5   1     5   250
6   1     6   999
7   1     7   999
8   2     1   250
9   2     2   250
10  2     3   999

I would like to have

      id value counter
 1     1   999       1
 2     1   999       1
 3     1   999       1
 4     1   250       2
 5     1   250       2
 6     1   999       3
 7     1   999       3
 8     2   250       4
 9     2   250       4
10     2   999       5

Every bit of code I tried is grouping regardless of order.

Thanks

Upvotes: 1

Views: 115

Answers (4)

s_baldur
s_baldur

Reputation: 33498

CAVEAT: The rle() solution assumes the data is ordered.


Using rle():

df$counter <- with(rle(with(df, paste(id, value))), rep(seq_along(lengths), lengths))

Adding pipes for readability:

library(magrittr)
df$counter <- df %>% 
  {paste(.$id, .$value)} %>% 
  rle() %>% 
  with(rep(seq_along(lengths), lengths))

With data.table

library(data.table)
df$counter <- with(df, rleid(id, value))

Upvotes: 2

Duck
Duck

Reputation: 39585

Maybe try this with a loop:

#Loop
df$Var <- NA
df$Var[1] <- 1
for(i in 2:nrow(df))
{
  df$Var[i] <- ifelse(df$month[i]==df$month[i-1],df$Var[i-1],df$Var[i-1]+1)
}

Output:

df
   id order month Var
1   1     1   999   1
2   1     2   999   1
3   1     3   999   1
4   1     4   250   2
5   1     5   250   2
6   1     6   999   3
7   1     7   999   3
8   2     1   250   4
9   2     2   250   4
10  2     3   999   5

Upvotes: 1

ekoam
ekoam

Reputation: 8844

What you want is a run-length type id column. Try this

df$counter <- data.table::rleidv(df[, c("id", "value")])

Upvotes: 2

Allan Cameron
Allan Cameron

Reputation: 173793

You could do

df$counter <- cumsum(!c(FALSE, 
                apply(head(df[c(1, 3)], -1) == tail(df[c(1, 3)], -1), 1, all)))

df[-2]
#>    id month counter
#> 1   1   999       1
#> 2   1   999       1
#> 3   1   999       1
#> 4   1   250       2
#> 5   1   250       2
#> 6   1   999       3
#> 7   1   999       3
#> 8   2   250       4
#> 9   2   250       4
#> 10  2   999       5

Upvotes: 1

Related Questions