Cesar
Cesar

Reputation: 585

Getting the first and the Last value of each day in R

I have a dataframe that shows two columns with the online time of each day.

I first separated time from date, using this:

a1 <- dmy_hm(df$V2)
d1 <- data.frame(Date= format(a1, '%d/%m/%Y'), Time=format(a1, '%H:%M:%S'))

        Date     Time
31   04/06/2018 17:51:00
32   04/06/2018 17:50:00
33   04/06/2018 17:33:00
34   04/06/2018 17:33:00
35   04/06/2018 17:29:00
36   04/06/2018 17:29:00
37   04/06/2018 17:06:00
38   04/06/2018 17:06:00
39   04/06/2018 17:01:00
40   04/06/2018 17:01:00
41   04/06/2018 16:49:00
42   04/06/2018 16:49:00
43   04/06/2018 16:43:00
44   04/06/2018 16:43:00
45   04/06/2018 16:38:00
46   04/06/2018 16:38:00
47   04/06/2018 16:22:00
48   04/06/2018 16:22:00
49   04/06/2018 16:21:00
50   04/06/2018 16:21:00
51   04/06/2018 16:14:00
52   04/06/2018 16:14:00
53   04/06/2018 15:57:00
54   04/06/2018 15:57:00
89   04/06/2018 12:05:00
90   04/06/2018 12:05:00
91   04/06/2018 12:05:00
92   04/06/2018 12:05:00
93   04/06/2018 12:05:00
94   04/06/2018 12:05:00
100  04/06/2018 12:05:00
101  04/06/2018 12:05:00

How to get the first and the last time of each day ?

d1 %>% 
  group_by(Date) %>% 
  summarise(Min = min(Time), Max= max(Time))

But it appears this error message:

Error in summarise_impl(.data, dots) : 
  Evaluation error: <U+0091>min<U+0092> not meaningful for factors.

Upvotes: 1

Views: 759

Answers (2)

s_baldur
s_baldur

Reputation: 33498

Translating Mudskipper's solution to fast and concise data.table:

setDT(d1)
d1[order(Time), .(Min = Time[1], Max = Time[.N]), Date]
         Date      Min      Max
1: 04/06/2018 12:05:00 17:51:00

And why not compare to base-R at the same time:

aggregate(Time ~ Date, d1, function(x) c(Min = min(x), Max = max(x)))
        Date Time.Min Time.Max
1 04/06/2018 12:05:00 17:51:00

Upvotes: 1

moodymudskipper
moodymudskipper

Reputation: 47300

You can sort your data and use first and last instead of min and max :

library(dplyr)
d1 %>% 
  arrange(Time) %>%
  group_by(Date) %>% 
  summarise(Min = first(Time), Max= last(Time))

# # A tibble: 1 x 3
#           Date      Min      Max
#         <fctr>   <fctr>   <fctr>
#   1 04/06/2018 12:05:00 17:51:00

Alternatively, you can use stringsAsFactors = FALSE in your data.frame call, min and max do work with character, they just don't work with unordered factors:

d1 <- data.frame(Date= format(a1, '%d/%m/%Y'), Time=format(a1, '%H:%M:%S'),stringsAsFactors = FALSE)

library(dplyr)
d1 %>% 
  group_by(Date) %>% 
  summarise(Min = min(Time), Max= max(Time))

# # A tibble: 1 x 3
#           Date      Min      Max
#         <fctr>   <fctr>   <fctr>
#   1 04/06/2018 12:05:00 17:51:00

data

datetimes <- c(
'04/06/2018 17:51:00',
'04/06/2018 17:50:00',
'04/06/2018 17:33:00',
'04/06/2018 17:33:00',
'04/06/2018 17:29:00',
'04/06/2018 17:29:00',
'04/06/2018 17:06:00',
'04/06/2018 17:06:00',
'04/06/2018 17:01:00',
'04/06/2018 17:01:00',
'04/06/2018 16:49:00',
'04/06/2018 16:49:00',
'04/06/2018 16:43:00',
'04/06/2018 16:43:00',
'04/06/2018 16:38:00',
'04/06/2018 16:38:00',
'04/06/2018 16:22:00',
'04/06/2018 16:22:00',
'04/06/2018 16:21:00',
'04/06/2018 16:21:00',
'04/06/2018 16:14:00',
'04/06/2018 16:14:00',
'04/06/2018 15:57:00',
'04/06/2018 15:57:00',
'04/06/2018 12:05:00',
'04/06/2018 12:05:00',
'04/06/2018 12:05:00',
'04/06/2018 12:05:00',
'04/06/2018 12:05:00',
'04/06/2018 12:05:00',
'04/06/2018 12:05:00')

library(lubridate)
a1 <- dmy_hms(datetimes)
d1 <- data.frame(Date= format(a1, '%d/%m/%Y'), Time=format(a1, '%H:%M:%S'))

Upvotes: 3

Related Questions