Reputation: 9018
I have a data frame:
d <- data.frame(
  time = factor(c("00:00", "00:15", "00:30", "00:45", "01:00",
                  "01:15", "01:30", "01:45", "02:00", "02:40")),
  q = c(0, 0, 100, 0, 0, 100, 0, 0, 0, 0),
  p = c(.25, .25, .25, .25, .25, .25, .25, .25, .25, .25)
)
d
time q p
1 00:00 0 0.25
2 00:15 0 0.25
3 00:30 100 0.25
4 00:45 0 0.25
5 01:00 0 0.25
6 01:15 100 0.25
7 01:30 0 0.25
8 01:45 0 0.25
9 02:00 0 0.25
10 02:40 0 0.25
I would like to drop the rows that come before the first non-zero value of column "q" and the rows that come after the last non-zero value of column "q". In the example above, the result should look like this:
00:30 100 0.25
00:45 0 0.25
01:00 0 0.25
01:15 100 0.25
What's the best way to do this?
Upvotes: 3
Views: 4697
Reputation: 887881
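You can use which() to get the positions of the non-zero values, then keep the rows from the first position to the last: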
indx <- which(d$q!=0)
d[indx[1L]:indx[length(indx)],]
# time q p
#3 00:30 100 0.25
#4 00:45 0 0.25
#5 01:00 0 0.25
#6 01:15 100 0.25
As @Frank mentioned in the comments, if all the values are 0, which() returns an empty vector and the indexing above fails, so we need a condition. The function below returns the whole dataset in that case.
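f1 <- function(dat, col) {
  # positions of the non-zero entries (which() skips NA)
  indx <- which(dat[, col] != 0)
  # test the positions rather than the sum, so values that cancel out
  # (e.g. -100 and 100) are not mistaken for an all-zero column
  if (length(indx) > 0) {
    # keep the rows from the first to the last non-zero entry
    dat[indx[1L]:indx[length(indx)], ]
  } else {
    # no non-zero entries: return the data unchanged
    dat
  }
}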
f1(d, 'q')
# time q p
#3 00:30 100 0.25
#4 00:45 0 0.25
#5 01:00 0 0.25
#6 01:15 100 0.25
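If it helps, here is a variant of the same idea together with a quick check of the edge cases. This is only a sketch: the helper name trim_zeros is hypothetical (it does not come from the question or the comments), and the two extra test frames are made up for illustration. It uses range() on the which() result to get the first and last non-zero positions, and drop = FALSE so the result stays a data frame.

```r
# Hypothetical helper; the name trim_zeros is an assumption for illustration.
trim_zeros <- function(dat, col) {
  nz <- which(dat[[col]] != 0)        # positions of non-zero values (NA is skipped)
  if (length(nz) == 0L) return(dat)   # nothing non-zero: return the data unchanged
  rng <- range(nz)                    # first and last non-zero position
  dat[seq(rng[1L], rng[2L]), , drop = FALSE]
}

# Same data as in the question
d <- data.frame(
  time = factor(c("00:00", "00:15", "00:30", "00:45", "01:00",
                  "01:15", "01:30", "01:45", "02:00", "02:40")),
  q = c(0, 0, 100, 0, 0, 100, 0, 0, 0, 0),
  p = 0.25
)

res <- trim_zeros(d, "q")               # keeps rows 3 to 6

# Illustrative edge case: every value in q is 0
all_zero <- d
all_zero$q <- 0
res_zero <- trim_zeros(all_zero, "q")   # all 10 rows come back

# Illustrative edge case: non-zero values that sum to 0
mixed <- d
mixed$q <- c(0, 0, -100, 0, 0, 100, 0, 0, 0, 0)
res_mixed <- trim_zeros(mixed, "q")     # still rows 3 to 6
```

Checking the length of the which() result, rather than the column's sum, is a design choice here: it keeps working when the column holds negative values or NA.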
Upvotes: 6