Reputation: 1106
I have this sample dataset:
df=structure(list(V1 = c("", "", "", ""), V2 = c("Segunda", "VACUNA SinoVac",
"Primera", "PARTICULAR"), V3 = c("Dosis por aplicar", "UNIDAD DE SERVICIOS DE",
"Aplicada", ""), V4 = c(NA, NA, "16", "SALUD CALLE 153"), V5 = c(NA,
NA, "7", NA), V6 = c(NA, NA, "2021 202105061K No registra", NA
), V7 = c(NA, NA, "6", NA), V8 = c(NA, NA, "8", NA), V9 = c(NA,
NA, "2021", NA), V10 = c(NA, NA, "ADRIANA JAIME", NA), V11 = c(NA_character_,
NA_character_, NA_character_, NA_character_), V12 = c(NA_character_,
NA_character_, NA_character_, NA_character_)), row.names = 53:56, class = "data.frame")
I'm currently extracting the row (let's call it Row X) that contains the word "Aplicada":
df.out1 = df %>% filter_all(any_vars(. %in% c("Aplicada")))
Now I also need to extract the entire row before Row X, so the desired result is:
structure(list(V1 = c("", ""), V2 = c("VACUNA SinoVac", "Primera"
), V3 = c("UNIDAD DE SERVICIOS DE", "Aplicada"), V4 = c(NA, "16"
), V5 = c(NA, "7"), V6 = c(NA, "2021 202105061K No registra"),
V7 = c(NA, "6"), V8 = c(NA, "8"), V9 = c(NA, "2021"), V10 = c(NA,
"ADRIANA JAIME"), V11 = c(NA_character_, NA_character_),
V12 = c(NA_character_, NA_character_)), row.names = 54:55, class = "data.frame")
Could you please advise?
Upvotes: 0
Views: 62
Reputation: 1972
A tidyverse option.
library(dplyr)
library(stringr)
keep <- df %>%
  mutate(id = row_number()) %>%
  filter(if_any(everything(), ~ str_detect(., 'Aplicada'))) %>%
  pull(id)
df %>%
  slice((keep - 1):keep)
# V1 V2 V3 V4 V5 V6 V7 V8 V9
# 1 VACUNA SinoVac UNIDAD DE SERVICIOS DE <NA> <NA> <NA> <NA> <NA> <NA>
# 2 Primera Aplicada 16 7 2021 202105061K No registra 6 8 2021
# V10 V11 V12
# 1 <NA> <NA> <NA>
# 2 ADRIANA JAIME <NA> <NA>
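The `(keep - 1):keep` range above is built from a single matching row. If several rows could contain the word, one possible variation is sketched below in base R rather than the tidyverse calls used in the answer. The helper name `rows_with_previous` and the `toy` data frame are made up for illustration, and the sketch assumes all columns are character, as in the question's data.

```r
# Hypothetical helper (not from the answer): keep every row in which some cell
# contains `pattern`, plus the row directly before each such row.
rows_with_previous <- function(data, pattern) {
  # Column-wise substring search, combined with OR across columns.
  # grepl() returns FALSE for NA cells, so missing values never match.
  hit <- Reduce(`|`, lapply(data, function(col) grepl(pattern, col, fixed = TRUE)))
  idx <- which(hit)
  # Add the preceding index for each match, then drop 0 and duplicates.
  keep <- sort(unique(c(idx - 1, idx)))
  data[keep[keep >= 1], , drop = FALSE]
}

# Made-up data with two matching rows (rows 2 and 5)
toy <- data.frame(
  a = c("x", "Aplicada", "y", "z", "Aplicada"),
  b = c(NA, "1", NA, "2", "3"),
  stringsAsFactors = FALSE
)
res <- rows_with_previous(toy, "Aplicada")
res
```

Unlike `slice()`, base subsetting keeps the original row names, so with the question's data the result would be labelled 54 and 55 rather than 1 and 2.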
Upvotes: 1
Reputation: 106
I wrote some code that should do what you want.
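# y[i] is TRUE when row i should be kept
y <- logical(nrow(df))
for (i in seq_len(nrow(df))) {
  # Does any cell in row i equal "Aplicada"?
  y[i] <- any(unlist(df[i, ]) %in% "Aplicada")
  if (i > 1 && y[i]) {
    y[i - 1] <- TRUE  # also keep the row before the match
  }
}
df[y, ]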
I tried to use the apply function instead of a loop, but it didn't work correctly.
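For reference, a loop-free version along these lines might look like the sketch below. It is only one way to do it, not necessarily what was attempted above. The `toy` data frame is made up, and the sketch assumes every column is character (as in the question), because `apply()` first converts the data frame to a matrix and non-character columns may be reformatted in that step.

```r
# Toy data for illustration only (not the question's data)
toy <- data.frame(
  V1 = c("a", "b", "Aplicada", "c"),
  V2 = c(NA, "x", "16", NA),
  stringsAsFactors = FALSE
)

# apply() passes each row as a character vector, so %in% compares cell values
hit <- apply(toy, 1, function(r) any(r %in% "Aplicada"))

# Shift the flags up by one so the row before each match is also TRUE
keep <- hit | c(hit[-1], FALSE)

toy[keep, ]
```

Here `keep` flags rows 2 and 3: the matching row and the one before it.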
Upvotes: 1
Reputation: 4520
This will fail if the match is found in the first row:
dplyr::slice(
  df,
  # rows with exactly one cell equal to 'Aplicada' (TRUE is na.rm)
  sapply(which(rowSums(df == 'Aplicada', TRUE) == 1), \(x) { (x - 1):x })
)
# V1 V2 V3 V4 V5 <truncated>
# 1 VACUNA SinoVac UNIDAD DE SERVICIOS DE <NA> <NA> <truncated>
# 2 Primera Aplicada 16 7 202 <truncated>
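If a match in the first row is a concern, one way to sidestep the zero index is to clamp the "previous row" index at 1. The following is a minimal base R sketch of that idea, using a made-up `toy` data frame and `> 0` instead of `== 1` so that rows with more than one matching cell also count. Both choices are assumptions for illustration, not part of the answer above.

```r
# Toy data for illustration only; the first and last rows match
toy <- data.frame(
  V1 = c("Aplicada", "b", "c", "Aplicada"),
  V2 = c("1", NA, "x", "2"),
  stringsAsFactors = FALSE
)

# Rows with at least one cell equal to "Aplicada" (NA comparisons are dropped)
idx <- which(rowSums(toy == "Aplicada", na.rm = TRUE) > 0)

# pmax() clamps the previous-row index at 1, so a match in row 1 contributes
# just row 1 instead of an index of 0; unique() removes the resulting duplicate
rows <- sort(unique(c(pmax(idx - 1, 1), idx)))

toy[rows, ]
```

With this toy data, `rows` is 1, 3, 4: the first-row match has no predecessor, so only the match itself is kept.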
Upvotes: 1