user3672160
user3672160

Reputation: 85

Count how many times values has changed in column using R

Hi i want to count how many times value has changed in a column by the group and how many unique values was in a group, and i sort of getting what i want, but it has a NA observation which i do not want to be counted.

df <- data.frame(x=c("a",'a', "a", "b",'b', "b", "c",'c', "d")
                 ,y=c(1,2,NA,3,3,3,2,1,5))
library(data.table) #data.table_1.9.5
setDT(df)[, wanted := rleid(y), by=x][]
setDT(df)[, count := uniqueN(y),by=x][]



   x  y wanted count
1: a  1      1     3
2: a  2      2     3
3: a NA      3     3
4: b  3      1     1
5: b  3      1     1
6: b  3      1     1
7: c  2      1     2
8: c  1      2     2
9: d  5      1     1`

Desired results:

   x  y wanted count
1: a  1      1     2
2: a  2      2     2
3: a NA      2     2
4: b  3      1     1
5: b  3      1     1
6: b  3      1     1
7: c  2      1     2
8: c  1      2     2
9: d  5      1     1

I tried rleid(!is.na(y)) but seems not to work as i expected. Thank you.

Upvotes: 2

Views: 133

Answers (1)

akrun
akrun

Reputation: 886968

We can replace the NA elements with previous non-NA element (na.locf), take the rleid on that to get the 'wanted' and also get the length of unique elements that are not NA to get the 'count'

library(zoo)
setDT(df)[, c('wanted', 'count') := list(rleid(na.locf(y)),  uniqueN(y, na.rm = TRUE)), x]
df
#   x  y wanted count
#1: a  1      1     2
#2: a  2      2     2
#3: a NA      2     2
#4: b  3      1     1
#5: b  3      1     1
#6: b  3      1     1
#7: c  2      1     2
#8: c  1      2     2
#9: d  5      1     1

Upvotes: 1

Related Questions