Reputation: 1069
I would like to remove all rows that contain a zero, but only if (the row below it is zero AND the row above it is also zero) OR (it is a zero AND it is the first observation AND the number below it is also a zero).
For example:
RowNumb Column2
1 0
2 0
3 0
4 1
5 0
6 1
7 1
8 0
9 0
10 0
I would like to remove rows 1, 2, 9 and 10, as these are the only rows that are equal to zero, with a zero below them and either a zero or nothing (in the case of rowNumb
1) above them, such that I get the following:
RowNumb Column2
3 0
4 1
5 0
6 1
7 1
8 0
Does anyone know of a way of doing this without using a loop?
Upvotes: 2
Views: 153
Reputation: 269441
1) rollapply This uses rollapply
from the zoo package to check if any of three in a row (and because of partial=TRUE
two in a row at each end) are not zero:
library(zoo)
DF[ rollapply(DF$Column2 != 0, 3, any, partial = TRUE), ]
giving:
RowNumb Column2
3 3 0
4 4 1
5 5 0
6 6 1
7 7 1
8 8 0
1a) This variation also works:
DF[ rollapply(c(0, DF$Column2, 0) != 0, 3, any), ]
2) embed This solution does not use any packages. For this example embed
forms a 10 x 3 matrix whose rows contain successive triples (two in a row and a zero for rows at the ends) and computes a logical matrix from that and applies any
to each row:
DF[apply(embed(c(0, DF$Column2, 0) != 0, 3), 1, any), ]
Upvotes: 2
Reputation: 132606
You can use filter
to sum each absolute value with the preceding and following absolute value and compare that sum with 0:
DF <- read.table(text="RowNumb Column2
1 0
2 0
3 0
4 1
5 0
6 1
7 1
8 0
9 0
10 0", header=TRUE)
rem <- na.omit(filter(abs(c(0, DF$Column2, 0)), rep(1, 3)) != 0L)
DF[rem,]
# RowNumb Column2
#3 3 0
#4 4 1
#5 5 0
#6 6 1
#7 7 1
#8 8 0
This assumes there are no NA
values. If those can occur you'd need to modify this a bit:
x <- c(0, DF$Column2, 0)
rem <- na.omit(filter(x != 0L | is.na(x) , rep(1, 3)) != 0L)
Upvotes: 4
Reputation: 70256
Here's a dplyr
approach using lag
and lead
functions:
require(dplyr)
df %>% filter(!(Column2 == 0 & lag(Column2, default = 0) == 0 & lead(Column2,default = 0) == 0))
# RowNumb Column2
#1 3 0
#2 4 1
#3 5 0
#4 6 1
#5 7 1
#6 8 0
Upvotes: 3