R: group_id by changing row values

Question

1) Firstly, I have this data frame:

df <- data.frame(value=c("a","a","a", "b", "b", "b", "a", "a", "a"),
                 desired_id=c(1,1,1,2,2,2,3,3,3))

How do I generate the desired_id column? My groups are assigned by row order. That is, every time the value column changes from one row to the next, I want the next higher group index to be assigned.

I tried df$desired_id_replicate <- df %>% group_by(value) %>% group_indices, but that doesn't work because every row with value=="a" is assigned the same group index.

2) Secondly, I have this data frame:

df <- data.frame(value=c("a","a","a", "b", "b", "b", "a", "a", "a"), 
                 value2=c("a","a","c", "b", "b", "c", "a", "a", "d"),
                 desired_id=c(1,1,2,3,3,4,5,5,6))

How do I generate desired_id from the value and value2 columns? My groups are again assigned by row order. That is, every time the combination of value and value2 changes from one row to the next, the next higher desired_id should be assigned.

Similar to the above, I tried df$desired_id_replicate <- df %>% group_by(value, value2) %>% group_indices, but that doesn't work because every row with value=="a"&value2=="a" is assigned the same group index.

Thank you!

akrun · Accepted Answer

We can use rleid (run-length-encoding id) from data.table, which starts at 1 and increments by 1 each time an element differs from the previous one:

library(data.table)
library(dplyr)
df %>%
  mutate(newcol = rleid(value))
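
To make the "increment whenever the element differs from the previous one" idea concrete, here is a rough base R sketch of the same logic. It is only an illustration of the result rleid produces for a single column, not a description of how data.table implements it, and the helper name is_new_run is made up for this example. It assumes the data frame has at least one row.

```r
# First data frame from the question, without the desired_id column.
df <- data.frame(value = c("a","a","a","b","b","b","a","a","a"),
                 stringsAsFactors = FALSE)

# TRUE wherever a row starts a new run: the first row, plus every row
# whose value differs from the row directly above it.
# (is_new_run is a hypothetical name used only for this sketch.)
n <- nrow(df)
is_new_run <- c(TRUE, df$value[-1] != df$value[-n])

# The running count of run starts is the run id.
df$newcol <- cumsum(is_new_run)
df$newcol
```

For the question's first data frame this reproduces desired_id, i.e. 1 1 1 2 2 2 3 3 3.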

For the second dataset, pass both columns to rleid:

df %>%
  mutate(new = rleid(value, value2))
#  value value2 desired_id new
#1     a      a          1   1
#2     a      a          1   1
#3     a      c          2   2
#4     b      b          3   3
#5     b      b          3   3
#6     b      c          4   4
#7     a      a          5   5
#8     a      a          5   5
#9     a      d          6   6

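Or, for the single-column case, with rle from base R

# as.character() guards against value being a factor (the data.frame default before R 4.0),
# which rle() does not accept
df$newcol <- with(rle(as.character(df$value)), rep(seq_along(values), lengths))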
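
The rle approach above only looks at one column. A possible way to extend it to the second dataset, sketched here as an assumption rather than part of the original answer, is to paste the columns into a single key first. The separator "|" is an arbitrary choice for this example and must not occur in the data itself, otherwise different combinations could collapse into the same key.

```r
# Second data frame from the question, without the desired_id column.
df <- data.frame(value  = c("a","a","a","b","b","b","a","a","a"),
                 value2 = c("a","a","c","b","b","c","a","a","d"),
                 stringsAsFactors = FALSE)

# One key per row combining both columns.
# Assumption: "|" never appears in value or value2.
key <- paste(df$value, df$value2, sep = "|")

# Run-length encode the key, then expand run numbers back to row length.
r <- rle(key)
df$newcol <- rep(seq_along(r$values), r$lengths)
df$newcol
```

On the question's data this gives 1 1 2 3 3 4 5 5 6, matching desired_id.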
