Reputation: 69
I have a dataframe with a vector in the follwing format I want to keep all rows, within the "START" and "STOP" values in the column val (including them), and remove rows that is not inside start-stop
id = str_c("x",1:20)
val = c(rep("NO1", 3), "START", rep("yes1", 3), "STOP",
"NO2", "START", rep("yes2", 3), "STOP", rep("NO3",6))
data = data.frame(id,val)
My expected output is
data = data.frame(id = c(str_c("x", 4:8), str_c("x", 10:14)),
val = c("START", rep("yes1", 3), "STOP",
"START", rep("yes2", 3), "STOP"))
Upvotes: 1
Views: 214
Reputation: 388797
You can get the index of 'START'
and 'STOP'
values and create a sequence between them with mapply
and select those rows from data
.
ind1 <- which(data$val == 'START')
ind2 <- which(data$val == 'STOP')
data[sort(unique(unlist(mapply(`:`, ind1, ind2)))), ]
# id val
#4 x4 START
#5 x5 yes1
#6 x6 yes1
#7 x7 yes1
#8 x8 STOP
#10 x10 START
#11 x11 yes2
#12 x12 yes2
#13 x13 yes2
#14 x14 STOP
Upvotes: 1
Reputation: 11584
Does this work:
> library(dplyr)
> data %>% mutate(ID = substr(val, nchar(val), nchar(val))) %>%
+ mutate(ID = case_when(ID == 'T' ~ lead(ID), ID == 'P' ~ lag(ID), TRUE ~ ID)) %>%
+ group_by(ID) %>% filter(all(c('START','STOP') %in% val)) %>%
+ filter(row_number() %in% (which(val == 'START'): which(val == 'STOP'))) %>% ungroup() %>% select(-3)
# A tibble: 10 x 2
id val
<chr> <chr>
1 x4 START
2 x5 yes1
3 x6 yes1
4 x7 yes1
5 x8 STOP
6 x10 START
7 x11 yes2
8 x12 yes2
9 x13 yes2
10 x14 STOP
Upvotes: 1
Reputation: 780
Depending on the size of your data you could just do a brute force solution using a true false switch. You can return the row id's instead of the dataframe subset in the return output if you want.
parse_dataframe <- function(df, colnm='val'){
idx = c()
start <- FALSE
for (i in 1:nrow(data)){
if (df[i, colnm] == 'START') start <- TRUE
if (start) idx <- c(idx, i)
if (df[i, colnm] == 'STOP') start <- FALSE
}
return(df[idx, ])
}
parse_dataframe(data)
# id val
#4 x4 START
#5 x5 yes1
#6 x6 yes1
#7 x7 yes1
#8 x8 STOP
#10 x10 START
#11 x11 yes2
#12 x12 yes2
#13 x13 yes2
#14 x14 STOP
Upvotes: 1