Reputation: 3107
Hello I have a quite large dataframe (~20k) with something like
id | GroupID | action
1 | 1 | run
2 | 1 | walk
3 | 1 | rest
4 | 1 | run
5 | 1 | walk
6 | 2 | run
7 | 2 | walk
I want to count the number off consecutive occurrences of action values ( for example (run,walk) which would be 3 in the above example. Also, would be good to be grouped by the groupID ( so 2 for group 1 and 1 for group 2 )
Is there any more nice way to do it except iterating row by row ?
Upvotes: 1
Views: 349
Reputation: 92292
Here's a possible solution using stringi
package
Select your vars
vars <- toString(c("run", "walk"))
Then
library(stringi)
aggregate(action ~ GroupID, df, function(x) stri_count_fixed(toString(x), vars))
# GroupID action
# 1 1 2
# 2 2 1
Or similarly, combined with the data.table
package
library(data.table)
setDT(df)[, stri_count_fixed(toString(action), vars), by = GroupID]
# GroupID V1
# 1: 1 2
# 2: 2 1
Basically we are converting action
to a one long string separated by ,
and counting the occurrences of "run, walk"
within each group.
Upvotes: 5
Reputation: 1616
You could use the diff
function in base
. This is assuming you only care about the run-walk combination, which is why rest was made an NA value. If errors arise in your actual data.frame, leave a comment so I may either adjust my answer or remove it.
> df<-data.frame(id=seq(1,7,1),
group=c(1,1,1,1,1,2,2),
action=c('run','walk','rest','run','walk','run','walk'))
> df$value<-ifelse(df$action=='rest',NA,
ifelse(df$action=='run',1,2)
> df
id group action value
1 1 1 run 1
2 2 1 walk 2
3 3 1 rest NA
4 4 1 run 1
5 5 1 walk 2
6 6 2 run 1
7 7 2 walk 2
> setNames(aggregate(df$value,by=list(df$group),function(x) length(which(diff(x)==1))),c('group','run_walk'))
group run_walk
1 1 2
2 2 1
Upvotes: 1