Reputation: 3983
Suppose I have a data frame in R that looks like this...
Time Event
1 0
1 1
1 0
2 0
2 0
3 0
3 1
3 0
3 1
3 0
On this data frame, I want to get another data frame with a couple summary values. I want the original time, the count of rows with a time equal to or greater than the time in question, and the number of events that occurred at that time.
Example output:
Time Eligible Event
1 10 1
2 7 0
3 5 2
I've tried using the match, by, and table functions to accomplish this, but I can't make anything stick. I could do a double for loop... but there's got to be a better way.
How can I do this? I'd like to do it in base R, not using plyr or some other library...
Upvotes: 0
Views: 146
Reputation: 887941
Using only base R, we can loop over the unique "Time" values with lapply and compute the summary statistics for each one based on the conditions described.
res <- do.call(rbind, lapply(unique(df$Time), function(x)
    data.frame(Time = x,
               Eligible = sum(x <= df$Time),
               Event = sum(df$Event[df$Time %in% x]))))
res
# Time Eligible Event
#1 1 10 1
#2 2 7 0
#3 3 5 2
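For larger inputs, building one data.frame per time value can get slow. The following is a minimal vectorized sketch of the same summary, assuming the same Time and Event columns; the names times and res2 are made up for illustration. Note that outer() builds a rows-by-times logical matrix, so this trades memory for speed.

```r
# Same data as in the question
df <- data.frame(Time  = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 3L),
                 Event = c(0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L))

# Distinct times in increasing order (rowsum() below also sorts its groups)
times <- sort(unique(df$Time))

res2 <- data.frame(
  Time     = times,
  # column j counts the rows whose Time is >= times[j]
  Eligible = colSums(outer(df$Time, times, ">=")),
  # total of Event within each distinct Time
  Event    = as.vector(rowsum(df$Event, df$Time))
)
res2
```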
# data used in this answer
df <- structure(list(Time = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 3L
), Event = c(0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L)), .Names = c("Time",
"Event"), class = "data.frame", row.names = c(NA, -10L))
Upvotes: 2
Reputation: 513
Perhaps this is a little bit more interpretable:
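# number of rows whose Time is at or after x
countEligible <- function(x, Time) {
  sum(x <= Time)
}
# evaluate once per distinct time so each count stays paired with its own time
times <- unique(dat$Time)
dat1 <- data.frame(Time = times, Eligible = sapply(times, countEligible, Time = dat$Time))
# tapply returns its groups in sorted order, so sort the times to match
dat2 <- data.frame(Time = sort(times), Event = tapply(dat$Event, dat$Time, sum))
merge(dat1, dat2)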
> merge(dat1, dat2)
Time Eligible Event
1 1 10 1
2 2 7 0
3 3 5 2
Upvotes: 0
Reputation: 10204
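You could use tapply to the same effect. A plain per-group length only counts the rows at each time, so take a reverse cumulative sum of those counts to get the rows at that time or later:
newData <- data.frame(
  Time = sort(unique(myData$Time)),
  Eligible = rev(cumsum(rev(tapply(myData$Event, myData$Time, length)))),
  Events = tapply(myData$Event, myData$Time, sum))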
If you have multiple summaries, you can lapply over the fields of your data.frame.
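To illustrate the lapply idea, here is a minimal sketch. It assumes a second column to summarize, Censored, which is hypothetical and not part of the question's data; fields, sums, and summaryData are likewise illustrative names.

```r
# Question's data plus a hypothetical Censored column (an assumption)
myData <- data.frame(Time     = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3),
                     Event    = c(0, 1, 0, 0, 0, 0, 1, 0, 1, 0),
                     Censored = c(1, 0, 1, 0, 0, 0, 0, 1, 0, 0))

# Columns to total within each Time
fields <- c("Event", "Censored")

# One per-Time sum for each field; lapply keeps the column names
sums <- lapply(myData[fields], function(col) {
  as.vector(tapply(col, myData$Time, sum))
})

# tapply returns groups in sorted order, so pair them with the sorted times
summaryData <- data.frame(Time = sort(unique(myData$Time)), sums)
summaryData
```

The list returned by lapply is expanded into one column per field, so adding another summary only means adding its name to fields.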
Upvotes: 0