Mr Frog

Reputation: 446

Filtering the first row in R

I would like to keep only the first observation that matches a condition, using the filter() function from dplyr. Filtering returns many rows that satisfy the same criterion, but I only want the first one, without resorting to group_by() and distinct(). Is this possible?

I need to extract from a data frame the row with the first date stamp and the row with the first date stamp at which "Bad" appears.

problem = data.frame(
  Status = c("Good",  "Good",  "Bad", "Bad", "Bad"),
  Date_entry = c(as.Date("2000-01-01"), as.Date("2000-01-02"), as.Date("2000-01-03"), as.Date("2000-01-04"),as.Date("2000-01-05")),
  Date_status = c(as.Date("1999-01-01"), as.Date("1999-01-01"), as.Date("1999-01-02"), as.Date("1999-01-02"), as.Date("1999-01-02")),
  Value = c(150,20,14,96,04))

I can use filter(Date_entry == min(Date_entry)), but then I don't know how to pick out exactly the first "Bad" outcome. I tried filter(Date_entry == min(Date_entry) | (Date_status - Date_entry) == min(Date_status - Date_entry)), but it still does not work.

solution = 
  data.frame(Status = c("Good", "Bad"),
             Date_entry = c(as.Date("2000-01-01"), as.Date("2000-01-03")),
             Date_status = c(as.Date("1999-01-01"), as.Date("1999-01-02")),
             Value = c(150,14))

Upvotes: 0

Views: 2912

Answers (3)

akrun

Reputation: 887991

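One option is slice(), taking the union of the index of the earliest Date_entry (which.min) and the index of the first "Bad" (match):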

library(dplyr)
problem %>%
   slice(union(which.min(Date_entry), match('Bad', Status)))

Output:

#  Status Date_entry Date_status Value
#1   Good 2000-01-01  1999-01-01   150
#2    Bad 2000-01-03  1999-01-02    14
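
To see why this works, it may help to compute the row indices separately. The sketch below uses base R only (so it does not depend on dplyr) and rebuilds the question's data; the names first_entry, first_bad, and idx are introduced here for illustration and are not part of the answer.

```r
# Data from the question
problem <- data.frame(
  Status = c("Good", "Good", "Bad", "Bad", "Bad"),
  Date_entry = as.Date(c("2000-01-01", "2000-01-02", "2000-01-03",
                         "2000-01-04", "2000-01-05")),
  Date_status = as.Date(c("1999-01-01", "1999-01-01", "1999-01-02",
                          "1999-01-02", "1999-01-02")),
  Value = c(150, 20, 14, 96, 4)
)

first_entry <- which.min(problem$Date_entry)  # index of the earliest Date_entry
first_bad   <- match("Bad", problem$Status)   # index of the first "Bad" (NA if none)
idx <- union(first_entry, first_bad)          # combine, dropping a repeated index

problem[idx, ]
```

union() matters when the earliest row is itself the first "Bad": both indices coincide and the row is returned only once.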

Upvotes: 0

MrFlick

Reputation: 206616

I think what you are asking for could be solved with

problem %>% 
  filter(Date_entry==min(Date_entry) | cumsum(Status=="Bad")==1)

Here we keep the row with the minimum date, or the row with the first "Bad", using a cumulative sum (cumsum) trick. The running count goes up by one each time a "Bad" is observed, so we select the row where it equals 1 (if present).
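
The running count can be inspected directly. The following is a minimal base R sketch; the vector interleaved is a hypothetical example that is not in the question, included to show that the condition also stays TRUE for non-"Bad" rows that sit between the first and second "Bad". If the data could look like that, a stricter condition that also requires the row itself to be "Bad" may be safer.

```r
# Status column from the question: the running count of "Bad" is 0 0 1 2 3,
# so only the third row has a count equal to 1
question_status <- c("Good", "Good", "Bad", "Bad", "Bad")
question_count <- cumsum(question_status == "Bad")

# Hypothetical vector with a "Good" between two "Bad" entries
interleaved <- c("Good", "Bad", "Good", "Bad")
running_bad <- cumsum(interleaved == "Bad")   # running count: 0 1 1 2

# Condition from the answer: also TRUE for the "Good" after the first "Bad"
loose <- running_bad == 1

# Stricter variant: the row must itself be "Bad"
strict <- interleaved == "Bad" & running_bad == 1

which(loose)   # positions 2 and 3
which(strict)  # position 2 only
```

On the question's data both conditions select the same single row, because all "Bad" entries are consecutive.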

Upvotes: 1

Marcos Pérez

Reputation: 1250

Something like this?

library(dplyr)
df <- data.frame(A=c(1,1,1,1,1,2,2,2,2,2),
                 B=c(1,2,3,4,5,1,2,3,4,5))
head(df %>% filter(A==1),1)
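
The same head(..., 1) idea could be applied to the question's data roughly as follows. This is a base R sketch, assuming that "first" means earliest by Date_entry; the names sorted, first_entry, first_bad, and result are illustrative only.

```r
# Data from the question
problem <- data.frame(
  Status = c("Good", "Good", "Bad", "Bad", "Bad"),
  Date_entry = as.Date(c("2000-01-01", "2000-01-02", "2000-01-03",
                         "2000-01-04", "2000-01-05")),
  Date_status = as.Date(c("1999-01-01", "1999-01-01", "1999-01-02",
                          "1999-01-02", "1999-01-02")),
  Value = c(150, 20, 14, 96, 4)
)

# Sort by entry date so that "first" means "earliest"
sorted <- problem[order(problem$Date_entry), ]

first_entry <- head(sorted, 1)                          # earliest row overall
first_bad   <- head(sorted[sorted$Status == "Bad", ], 1) # earliest "Bad" row

# unique() drops the repeat if the earliest row is also the first "Bad"
result <- unique(rbind(first_entry, first_bad))
result
```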

Upvotes: 0
