Reputation: 85
Hi i want to count how many times value has changed in a column by the group and how many unique values was in a group, and i sort of getting what i want, but it has a NA observation which i do not want to be counted.
df <- data.frame(x=c("a",'a', "a", "b",'b', "b", "c",'c', "d")
,y=c(1,2,NA,3,3,3,2,1,5))
library(data.table) #data.table_1.9.5
setDT(df)[, wanted := rleid(y), by=x][]
setDT(df)[, count := uniqueN(y),by=x][]
x y wanted count
1: a 1 1 3
2: a 2 2 3
3: a NA 3 3
4: b 3 1 1
5: b 3 1 1
6: b 3 1 1
7: c 2 1 2
8: c 1 2 2
9: d 5 1 1`
Desired results:
x y wanted count
1: a 1 1 2
2: a 2 2 2
3: a NA 2 2
4: b 3 1 1
5: b 3 1 1
6: b 3 1 1
7: c 2 1 2
8: c 1 2 2
9: d 5 1 1
I tried rleid(!is.na(y))
but seems not to work as i expected. Thank you.
Upvotes: 2
Views: 133
Reputation: 886968
We can replace the NA elements with previous non-NA element (na.locf
), take the rleid
on that to get the 'wanted' and also get the length
of unique
elements that are not NA
to get the 'count'
library(zoo)
setDT(df)[, c('wanted', 'count') := list(rleid(na.locf(y)), uniqueN(y, na.rm = TRUE)), x]
df
# x y wanted count
#1: a 1 1 2
#2: a 2 2 2
#3: a NA 2 2
#4: b 3 1 1
#5: b 3 1 1
#6: b 3 1 1
#7: c 2 1 2
#8: c 1 2 2
#9: d 5 1 1
Upvotes: 1