gh0strider18

Reputation: 1140

If row meets criteria, then TRUE else FALSE in R

I have nested data that looks like this:

ID  Date Behavior
1   1    FALSE
1   2    FALSE
1   3    TRUE
2   3    FALSE
2   5    FALSE
2   6    TRUE
2   7    FALSE
3   1    FALSE
3   2    TRUE

I'd like to create a column called counter that, within each unique ID, increases by one on each row until Behavior is TRUE, and is left empty for any rows after that.

I am expecting this result:

ID  Date Behavior counter
1   1    FALSE    1
1   2    FALSE    2
1   3    TRUE     3
2   3    FALSE    1
2   5    FALSE    2
2   6    TRUE     3
2   7    FALSE    
3   1    FALSE    1
3   2    TRUE     2

Ultimately, I would like to pull, for each unique ID, the minimum counter value at which the behavior occurs. However, I'm having trouble coming up with a solution for the counter itself.

Any and all help is greatly appreciated!

Upvotes: 1

Views: 340

Answers (3)

Vlo

Reputation: 3188

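do.call(rbind, by(df, list(df$ID), function(x) {
  n <- nrow(x)
  # Position of the first TRUE in this ID (assumes each ID has at least one TRUE)
  m <- which(x$Behavior)[1]
  # Count 1..m, then pad the remaining rows with NA
  data.frame(x, Counter = c(seq_len(m), rep(NA, n - m)))
}))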

     ID  Date Behavior Counter
1.1  1    1    FALSE       1
1.2  1    2    FALSE       2
1.3  1    3     TRUE       3
2.4  2    3    FALSE       1
2.5  2    5    FALSE       2
2.6  2    6     TRUE       3
2.7  2    7    FALSE      NA
3.8  3    1    FALSE       1
3.9  3    2     TRUE       2

df = read.table(text = "ID  Date Behavior
                1   1    FALSE
                1   2    FALSE
                1   3    TRUE
                2   3    FALSE
                2   5    FALSE
                2   6    TRUE
                2   7    FALSE
                3   1    FALSE
                3   2    TRUE", header = TRUE)
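
For comparison, here is a rough base R sketch of the same counter using ave(), which skips the split-and-rbind step and keeps the original row names. It assumes the rows are already ordered by Date within each ID. The helper name count_to_first_true is made up for illustration, and numbering every row when an ID has no TRUE at all is just one possible choice.

```r
df <- read.table(text = "ID Date Behavior
1 1 FALSE
1 2 FALSE
1 3 TRUE
2 3 FALSE
2 5 FALSE
2 6 TRUE
2 7 FALSE
3 1 FALSE
3 2 TRUE", header = TRUE)

# Hypothetical helper: number rows 1..k up to and including the first TRUE,
# then NA for any rows after it.
count_to_first_true <- function(b) {
  idx <- seq_along(b)
  first <- which(b)[1]            # NA when the group has no TRUE
  if (is.na(first)) return(idx)   # assumption: no TRUE means every row is counted
  ifelse(idx <= first, idx, NA_integer_)
}

# ave() applies the helper within each ID and returns the results in row order
df$counter <- ave(df$Behavior, df$ID, FUN = count_to_first_true)
df
```

With the example data, the counter column matches the expected result in the question, with NA in place of the blank cell.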

Upvotes: 0

Nick DiQuattro

Reputation: 739

Here's a dplyr solution that finds the row number of the first TRUE within each ID:

library(dplyr)
newdf <- yourdataframe %>%
           group_by(ID) %>%
           summarise(
             # [1] keeps only the first TRUE; the result is NA if an ID has no TRUE
             ftrue = which(Behavior)[1])

Upvotes: 0

NPE

Reputation: 500327

I'd like to create a counter within each array of unique IDs and from there, ultimately pull the row level info - the question is how long on average does it take to reach a TRUE

I sense there might be an XY problem going on here. You can answer your latter question directly, like so:

> library(plyr)
> mean(daply(d, .(ID), function(grp)min(which(grp$Behavior))))
[1] 2.666667

(where d is your data frame.)
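
If you would rather not load plyr, the same calculation can be sketched in base R with tapply(). This is only an illustration on the example data from the question. Using which(b)[1] instead of min(which(b)) is a substitution on my part, so that an ID with no TRUE gives NA and can be dropped with na.rm = TRUE.

```r
d <- read.table(text = "ID Date Behavior
1 1 FALSE
1 2 FALSE
1 3 TRUE
2 3 FALSE
2 5 FALSE
2 6 TRUE
2 7 FALSE
3 1 FALSE
3 2 TRUE", header = TRUE)

# Row position of the first TRUE within each ID (NA if an ID has none)
first_true <- tapply(d$Behavior, d$ID, function(b) which(b)[1])

# Average number of rows needed to reach a TRUE, ignoring IDs without one
avg <- mean(first_true, na.rm = TRUE)
avg
```

On the example data, first_true is 3, 3 and 2 for IDs 1, 2 and 3, so avg agrees with the 2.666667 shown above.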

Upvotes: 1
