Natasha R.
Natasha R.

Reputation: 531

How to filter certain rows based on the preceding row's value using dplyr?

I have a dataframe like this:

df1 <- data.frame(
  Group = c('A', 'B', 'C', 'D', 'Total: CD', 'E', 'F', 'G', 'Total: FG', 'H'),
  Value1 = c(12, 88, 10, 90, 100, 8, 4, 11, 15, 77)
)

I would like to filter out any rows that come after a row containing the word Total with the result looking like this:

df_desired <- data.frame(
  Group = c('A', 'B', 'C', 'D', 'Total: CD', 'F', 'G', 'Total: FG'),
  Value1 = c(12, 88, 10, 90, 100, 4, 11, 15)
)

How would I achieve this in R, ideally using the dplyr package?

Upvotes: 1

Views: 62

Answers (2)

Ronak Shah
Ronak Shah

Reputation: 388817

Using lag in dplyr -

library(dplyr)

df1 %>% filter(!lag(grepl('Total', Group), default = FALSE))

#      Group Value1
#1         A     12
#2         B     88
#3         C     10
#4         D     90
#5 Total: CD    100
#6         F      4
#7         G     11
#8 Total: FG     15

grepl returns TRUE where 'Total' is found and lag would lag the result by 1 step so that the next row after 'Total' is removed.

Upvotes: 2

jpdugo17
jpdugo17

Reputation: 7106

Maybe something like this:

library(tidyverse)

df1[-c(which(str_detect(df1$Group, 'Total')) + 1), ]
#>       Group Value1
#> 1         A     12
#> 2         B     88
#> 3         C     10
#> 4         D     90
#> 5 Total: CD    100
#> 7         F      4
#> 8         G     11
#> 9 Total: FG     15

Created on 2021-11-30 by the reprex package (v2.0.1)

Upvotes: 1

Related Questions