Cebs

Reputation: 180

Subsetting the row with the earliest date by a factor

I have the following dataset:

id<-c("1a","1a","1a","1a","1a",
      "2a","2a","2a","2a","2a",
      "3a","3a","3a","3a","3a")
fch<-c("22/05/2020","12/01/2020","01/01/2019","10/11/2020","01/01/2019",
       "10/10/2015","01/01/2015","20/10/2015","08/04/2020","12/12/2019",
       "01/05/2020","01/01/2013","10/08/2019","12/01/2020","20/10/2019")
dat<-c(25,35,48,97,112,
       65,85,77,89,555,
       58,98,25,45,336)
data<-as.data.frame(cbind(id,fch,dat))

I want to extract, for each level of the factor "id", the row with the earliest date.

So my resulting data frame would look like this:

id<-c("1a","2a","3a")
fch<-c("01/01/2019","01/01/2015","01/01/2013")
dat<-c(48,85,98)
data_result<-as.data.frame(cbind(id,fch,dat))

This was my unsuccessful attempt:

DF1 <- data %>%
  mutate(fch = as.Date(as.character(data$fch),format="%d/%m/%Y")) %>% 
  group_by(id) %>% 
  mutate(fch = min(fch)) %>%
  ungroup

Upvotes: 1

Views: 44

Answers (2)

B Williams

Reputation: 2050

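Slightly different method from @akrun: filter each group to the rows whose date equals the group minimum. Note that for id "1a" the earliest date (01/01/2019) appears twice. Without a time there is no way to know which entry occurred first (or maybe you want both?), so this approach returns both rows.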

library(tidyverse)
library(lubridate)

data.frame(id = c(rep("1a",5), rep("2a",5), rep("3a",5)),
           fch = c("22/05/2020","12/01/2020","01/01/2019","10/11/2020","01/01/2019",
                   "10/10/2015","01/01/2015","20/10/2015","08/04/2020","12/12/2019",
                   "01/05/2020","01/01/2013","10/08/2019","12/01/2020","20/10/2019"),
           dat = c(25,35,48,97,112,65,85,77,89,555,58,98,25,45,336)) %>% 
  mutate(fch = dmy(fch)) %>%      # parse day/month/year strings into Date
  group_by(id) %>% 
  filter(fch == min(fch)) %>%     # keep every row tied for the earliest date
  ungroup()

# A tibble: 4 x 3
  id    fch          dat
  <chr> <date>     <dbl>
1 1a    2019-01-01    48
2 1a    2019-01-01   112
3 2a    2015-01-01    85
4 3a    2013-01-01    98
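
If you only want one row per id even when the earliest date is tied, a possible variation is slice_min() with with_ties = FALSE. This is just a sketch, assuming dplyr 1.0.0 or later; the name "earliest" is mine, and which of the tied rows survives depends on row order (it should be the one that appears first in the data).

```r
library(dplyr)
library(lubridate)

data <- data.frame(
  id  = rep(c("1a", "2a", "3a"), each = 5),
  fch = c("22/05/2020","12/01/2020","01/01/2019","10/11/2020","01/01/2019",
          "10/10/2015","01/01/2015","20/10/2015","08/04/2020","12/12/2019",
          "01/05/2020","01/01/2013","10/08/2019","12/01/2020","20/10/2019"),
  dat = c(25,35,48,97,112,65,85,77,89,555,58,98,25,45,336)
)

earliest <- data %>%
  mutate(fch = dmy(fch)) %>%                      # parse to Date so min() is chronological
  group_by(id) %>%
  slice_min(fch, n = 1, with_ties = FALSE) %>%    # exactly one row per id, ties dropped
  ungroup()

earliest
```

Setting with_ties = TRUE (the default) would instead give the same four rows as the filter() version.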

Upvotes: 1

akrun

Reputation: 887158

We arrange the data by 'id' and by 'fch' converted to Date, group by 'id', and use slice_head to get the first row of each group, which is then the earliest date.

library(dplyr)
library(lubridate)
data %>%  
  arrange(id, dmy(fch)) %>% 
  group_by(id) %>%
  slice_head(n = 1) %>%
  ungroup

-output

# A tibble: 3 x 3
#  id    fch          dat
#  <chr> <chr>      <dbl>
#1 1a    01/01/2019    48
#2 2a    01/01/2015    85
#3 3a    01/01/2013    98

NOTE: cbind returns a matrix by default, and a matrix can hold only a single type, so the numeric 'dat' column is coerced to character. Instead, we can create the data.frame directly:

data <- data.frame(id, fch, dat)
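
To see the coercion, here is a small check on a cut-down version of the data (the names via_cbind and direct are only for illustration). Assuming R 4.0 or later, the cbind route gives a character column; older versions would give a factor by default, but in either case it is not numeric.

```r
id  <- c("1a", "2a", "3a")
dat <- c(25, 65, 58)

# cbind builds a character matrix first, so 'dat' loses its numeric type
via_cbind <- as.data.frame(cbind(id, dat))

# data.frame keeps each column's own type
direct <- data.frame(id, dat)

class(via_cbind$dat)
class(direct$dat)
```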

Upvotes: 1
