user213544

Reputation: 2126

Filter rows in dataframe in R after first instance of specific column value using dplyr

I have a data frame in R with many columns. One column (x) contains an increasing sequence of numbers, while another column (y) contains the values I want to filter on.

Example

library(tidyverse)

df <- approx(seq(1,10,1), c(1,5,7,11,4,12,30, 20, 10, 9)) %>%
      as.data.frame()

plot(df)


Problem

I want to remove all rows in the data frame using dplyr, starting at the first row where y > 10. I can do this with dplyr's filter function if the curve crosses the line y = 10 only once. If it crosses the line several times, however, that approach no longer works.

Attempt

I have tried to use dplyr's slice function:

df %>% slice(which(df$y<10)[1] )

but it sadly gives an error...

Expected output

df[-c(16:50),]
plot(df[-c(16:50),])


Question

How would I remove rows from a dataframe after the first instance of a value in a specific column using the Tidyverse collection of packages in R?

Upvotes: 0

Views: 680

Answers (2)

AnilGoyal

Reputation: 26218

Something like this may work:

# Keep only the rows before the first row where y > 10
df %>%
    filter(row_number() < first(which(y > 10)))
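
A variant that may be more robust, sketched here on the example data from the question and assuming dplyr is loaded: `cumany()` is a dplyr cumulative helper that becomes TRUE at the first row where the condition holds and stays TRUE afterwards, so negating it keeps only the rows before the first y > 10. Unlike the `which()` approach, it should keep every row when no value exceeds the threshold. The name `df_trimmed` is just an illustrative choice.

```r
library(dplyr)

# Example data from the question: approx() returns 50 interpolated points by default
df <- as.data.frame(approx(seq(1, 10, 1), c(1, 5, 7, 11, 4, 12, 30, 20, 10, 9)))

# cumany(y > 10) is FALSE until the first y > 10, then TRUE for every later row,
# so later dips below 10 are still dropped
df_trimmed <- df %>%
  filter(!cumany(y > 10))

nrow(df_trimmed)  # 15, matching df[-c(16:50), ] from the question
```

`filter(cumsum(y > 10) == 0)` expresses the same idea with base R's `cumsum()` if you prefer not to rely on `cumany()`.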

Upvotes: 1

Steffen

Reputation: 196

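# Keep rows 1..(first index with y > 10) - 1; seq_len() also copes with a match in row 1
df_new <- df[seq_len(min(which(df$y > 10)) - 1), ]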
plot(df_new)
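
Note that this indexing fails when no value exceeds the threshold, because `which()` then returns an empty vector. A minimal sketch of a guarded base R version follows; `keep_before_first` is a hypothetical helper name introduced only for illustration, and it assumes the column is called `y` as in the question.

```r
# Example data from the question
df <- as.data.frame(approx(seq(1, 10, 1), c(1, 5, 7, 11, 4, 12, 30, 20, 10, 9)))

# Hypothetical helper (not part of the answer above):
# keep the rows that come before the first y > threshold
keep_before_first <- function(data, threshold) {
  hit <- which(data$y > threshold)
  if (length(hit) == 0) return(data)         # nothing exceeds the threshold: keep all rows
  data[seq_len(hit[1] - 1), , drop = FALSE]  # seq_len(0) gives zero rows if row 1 matches
}

df_new <- keep_before_first(df, 10)
plot(df_new)
```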

Upvotes: 1
