Reputation: 61
I'm having a rough time isolating n rows before and after a flag, by group
I found an answer elsewhere that sort of worked, but it was thrown off by groups with fewer rows than the window. For example, if the window was 6 rows but a group had only five observations, the query would start including irrelevant observations from a prior group.
Here's some dummy data to reproduce.
x <- c("", "", "", "1", "", "","", "", "", "", "", "1","", "", "", "", "1", "")
y <- c("2", "6", "4", "4", "7", "9","1", "15", "7", "4", "5", "8","6", "1", "2", "4", "6", "16")
z <- c("a", "a", "a", "a", "a", "a","a", "b", "b", "b", "b", "b","b", "b", "c", "c", "c", "c")
a <- as.data.frame(cbind(x, y, z))
   x  y z
1     2 a
2     6 a
3     4 a
4  1  4 a
5     7 a
6     9 a
7     1 a
8    15 b
9     7 b
10    4 b
11    5 b
12 1  8 b
13    6 b
14    1 b
15    2 c
16    4 c
17 1  6 c
18   16 c
Ideally, I'd like the data frame a to look something like this:
   x  y z
1     6 a
2     4 a
3  1  4 a
4     7 a
5     9 a
6     1 a
7     4 b
8     5 b
9  1  8 b
10    6 b
11    1 b
12    2 c
13    4 c
14 1  6 c
15   16 c
Upvotes: 0
Views: 159
Reputation: 160447
a[zoo::rollapply(a$x, 5, function(z) "1" %in% z, partial = TRUE),]
#    x  y z
# 2     6 a
# 3     4 a
# 4  1  4 a
# 5     7 a
# 6     9 a
# 10    4 b
# 11    5 b
# 12 1  8 b
# 13    6 b
# 14    1 b
# 15    2 c
# 16    4 c
# 17 1  6 c
# 18   16 c
zoo::rollapply operates on "windows" of values. Here the width is five, which means it looks at five values and returns a single value, then shifts by one (four of the same values, plus one new one) and returns another single value, and so on.
Because I specified partial=TRUE (necessary when you need the output length to be the same as the input length), the number of values looked at might not be the same as the kernel width (5).
The point is that if I'm looking at five values at a time and one of them is a "1", then we're within 2 rows of a "1", and the row should be retained.
An important property of the window is alignment, where the default is center. It defines where in the window the results go.
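To make the centered, partial window concrete, here is a minimal base-R sketch of the same test. It is only an illustration of the idea, not how zoo implements it; the names k, half, and keep are made up for this example, and the cross-check at the end runs only if zoo is installed.

```r
x <- c("", "", "", "1", "", "", "", "", "", "", "", "1", "", "", "", "", "1", "")

k    <- 5              # window width, as in the call above
half <- (k - 1) / 2    # rows on each side of the centre (2 here)
n    <- length(x)

# For each position, take the centred window, clipped at both ends
# (the clipping is what partial = TRUE allows), and test for a "1".
keep <- vapply(seq_len(n), function(i) {
  window <- x[max(1, i - half):min(n, i + half)]
  "1" %in% window
}, logical(1))

which(keep)

# Optional cross-check against zoo, skipped when the package is missing.
if (requireNamespace("zoo", quietly = TRUE)) {
  zoo_keep <- zoo::rollapply(x, k, function(w) "1" %in% w, partial = TRUE)
  stopifnot(all(keep == as.logical(zoo_keep)))
}
```

The positions returned by which(keep) are the row numbers retained in the output above.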
In this case, the windows look like:
#  [1] ""  ""  ""  "1" ""  ""  ""  ""  ""  ""  ""  "1" ""  ""  ""  ""  "1" ""
1:     nn-------'                   (partial window)
2:     ----yy--------'              (partial)
3:     `-------yy-------'           there is a "1" in this set of five, so a true ("yy")
4:         `-------yy-------'
5:             `-------yy-------'
6:                 `-------yy-------'
7:                     `-------nn-------'   no "1", so a false
... etc
#  [1] ""  ""  ""  "1" ""  ""  ""  ""  ""  ""  ""  "1" ""  ""  ""  ""  "1" ""
You can see in the first seven windows that the first is discarded (there is no "1" close enough), the next five are true ("yy" in my nomenclature), and then we get a false ("nn") since that window does not see a "1".
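Note that the call above rolls over the whole column and ignores z, so a flag near a group boundary could pull in rows from the neighbouring group. One possible way to keep the windows inside each group is sketched below in base R. This is a swapped-in approach, not the rollapply call itself: near_flag is a hypothetical helper written for this example, and its before/after arguments are assumptions about how many rows to keep on each side of a flag.

```r
a <- data.frame(
  x = c("", "", "", "1", "", "", "", "", "", "", "", "1", "", "", "", "", "1", ""),
  y = c("2", "6", "4", "4", "7", "9", "1", "15", "7", "4", "5", "8", "6", "1", "2", "4", "6", "16"),
  z = c("a", "a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "b", "b", "c", "c", "c", "c")
)

# Hypothetical helper: given a logical flag vector for ONE group, mark every
# position within `before` rows ahead of or `after` rows behind a flag.
# Ranges are clipped to the group, so they never spill into another group.
near_flag <- function(flag, before = 2, after = 2) {
  n    <- length(flag)
  keep <- logical(n)
  for (h in which(flag)) {
    keep[max(1, h - before):min(n, h + after)] <- TRUE
  }
  keep
}

# ave() applies the helper separately within each level of z and returns
# the results in the original row order.
keep <- as.logical(ave(a$x == "1", a$z, FUN = near_flag))
res  <- a[keep, ]
res
```

With this dummy data the result happens to match the ungrouped version, because no flag sits within two rows of a group boundary. Extra arguments can be passed through a wrapper, for example FUN = function(f) near_flag(f, before = 2, after = 3).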
Upvotes: 1