xhr489

Reputation: 2319

Creating labels with hour/min and seq to create bins

I have some data with hour/minute values. hour_min is an hms object, and number is simply as.numeric(hour_min), i.e. seconds since midnight.

library(dplyr)
library(lubridate)

df <-  structure(list(hour_min = structure(c(NA, 69300, 46800, 35100, 
52200, 37800, 52200, NA, 45300, 42300, NA, 29700, 46800, 34200, 
32400, 43200, 36000, 41400, 29700, 36000), units = "secs", class = c("hms", 
"difftime")), number = c(NA, 69300, 46800, 35100, 52200, 37800, 
52200, NA, 45300, 42300, NA, 29700, 46800, 34200, 32400, 43200, 
36000, 41400, 29700, 36000)), class = "data.frame", row.names = c(NA, 
-20L), .Names = c("hour_min", "number"))

 hour_min number
1        NA     NA
2  19:15:00  69300
3  13:00:00  46800
4  09:45:00  35100
5  14:30:00  52200
6  10:30:00  37800
7  14:30:00  52200
8        NA     NA
9  12:35:00  45300
10 11:45:00  42300
11       NA     NA
12 08:15:00  29700
13 13:00:00  46800
14 09:30:00  34200
15 09:00:00  32400
16 12:00:00  43200
17 10:00:00  36000
18 11:30:00  41400
19 08:15:00  29700
20 10:00:00  36000

I want to create 30-minute intervals, so I am using the code below. It seems to work if I leave out the labels argument, but how can I get nice labels?

df$interval <- cut(df$number,
                          breaks = seq(as.numeric(hms::as.hms("07:00:00")), 
                                       as.numeric(hms::as.hms("23:00:00")), 1800),
                          labels = as.character(seq(hms::as.hms("07:00:00"), 
                                       hms::as.hms("23:00:00"), 1800)))

Without the labels I get the output below. What I want is a count per 30-minute interval.

df %>% 
  count(interval)

# A tibble: 11 x 2
   interval                n
   <fct>               <int>
 1 (2.88e+04,3.06e+04]     2
 2 (3.06e+04,3.24e+04]     1
 3 (3.24e+04,3.42e+04]     1
 4 (3.42e+04,3.6e+04]      3
 5 (3.6e+04,3.78e+04]      1
 6 (3.96e+04,4.14e+04]     1
 7 (4.14e+04,4.32e+04]     2
 8 (4.5e+04,4.68e+04]      3
 9 (5.04e+04,5.22e+04]     2
10 (6.84e+04,7.02e+04]     1
11 <NA>                    3

But I need the labels. Is there a solution?

Upvotes: 1

Views: 142

Answers (2)

G. Grothendieck

Reputation: 269694

Convert number to chron's times class, giving a times column. Because times objects are measured in fractions of a day, dividing the seconds by 24 * 60 * 60 gives the right value. We can then use trunc.times to truncate each time down to a multiple of 30 minutes, and count.

library(chron)
library(dplyr)
library(lubridate)
library(tidyr)

df %>% 
  mutate(times = (number / (24 * 60 * 60)) %>% times %>% trunc("00:30:00")) %>%
  drop_na %>%
  count(times)

giving:

# A tibble: 11 x 2
   times           n
   <S3: times> <int>
 1 08:00:00        2
 2 09:00:00        1
 3 09:30:00        2
 4 10:00:00        2
 5 10:30:00        1
 6 11:30:00        2
 7 12:00:00        1
 8 12:30:00        1
 9 13:00:00        2
10 14:30:00        2
11 19:00:00        1
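To make the truncation step concrete, here is a minimal base-R sketch of the same idea, working on seconds directly instead of fractions of a day. It is not how chron implements trunc.times; it only illustrates the arithmetic (30 minutes is 1800 seconds, or 1/48 of a day). The helper fmt_hms is a hypothetical name introduced for this example.

```r
# Three values from the question's `number` column: 19:15, 13:00 and 09:45,
# expressed as seconds since midnight.
number <- c(69300, 46800, 35100)

# The same values as fractions of a day, which is the unit chron's times uses.
day_frac <- number / (24 * 60 * 60)

# Truncate down to the nearest multiple of 1800 seconds (30 minutes).
floored <- (number %/% 1800) * 1800

# Hypothetical helper: format seconds since midnight as "HH:MM:SS".
fmt_hms <- function(s) {
  s <- as.integer(s)
  sprintf("%02d:%02d:%02d", s %/% 3600L, (s %% 3600L) %/% 60L, s %% 60L)
}

bin_start <- fmt_hms(floored)
print(data.frame(number, day_frac, bin_start))
```

Here 19:15 lands in the bin starting at 19:00 and 09:45 in the bin starting at 09:30, which matches the times shown in the output above.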

chron only

Alternatively, this can be written using only chron:

library(chron)

tt <- trunc(times(df$number / (24 * 60 * 60)), "00:30:00")
table(tt)

giving:

08:00:00 09:00:00 09:30:00 10:00:00 10:30:00 11:30:00 12:00:00 12:30:00 
       2        1        2        2        1        2        1        1 
13:00:00 14:30:00 19:00:00 
       2        2        1 

or using aggregate instead of table:

aggregate(list(n = tt), list(times = tt), length)

giving:

      times n
1  08:00:00 2
2  09:00:00 1
3  09:30:00 2
4  10:00:00 2
5  10:30:00 1
6  11:30:00 2
7  12:00:00 1
8  12:30:00 1
9  13:00:00 2
10 14:30:00 2
11 19:00:00 1

Upvotes: 1

xhr489

Reputation: 2319

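Well, this is my own solution: I needed to wrap the labels in hms::as.hms, and the labels sequence has to stop at 22:30:00 so that there is exactly one label per interval (one fewer than the number of breaks):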

df$interval <- cut(df$number,
                   breaks = seq(as.numeric(hms::as.hms("07:00:00")),
                                as.numeric(hms::as.hms("23:00:00")), 1800),
                   labels = hms::as.hms(seq(as.numeric(hms::as.hms("07:00:00")),
                                            as.numeric(hms::as.hms("22:30:00")), 1800)))
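One thing to keep in mind with this approach: cut uses right = TRUE by default, so each label names the start of an interval that is open on the left and closed on the right. A value of exactly 10:00:00 therefore gets the label 09:30:00. The sketch below is one possible variant in base R only, assuming you would rather have left-closed bins and labels that show both ends. The helper fmt_hm and the label format are illustrative choices, not part of the original solution.

```r
# Seconds since midnight, copied from the question's `number` column.
number <- c(NA, 69300, 46800, 35100, 52200, 37800, 52200, NA, 45300, 42300,
            NA, 29700, 46800, 34200, 32400, 43200, 36000, 41400, 29700, 36000)

# 33 break points from 07:00 to 23:00 in 30-minute (1800 s) steps.
breaks <- seq(7 * 3600, 23 * 3600, by = 1800)

# Hypothetical helper: format seconds since midnight as "HH:MM".
fmt_hm <- function(s) {
  s <- as.integer(s)
  sprintf("%02d:%02d", s %/% 3600L, (s %% 3600L) %/% 60L)
}

# n breaks define n - 1 intervals, so drop the last break before labelling.
starts <- head(breaks, -1)
labels <- paste0(fmt_hm(starts), "-", fmt_hm(starts + 1800))

# right = FALSE makes each bin [start, end), so 10:00 falls in "10:00-10:30".
interval <- cut(number, breaks = breaks, labels = labels, right = FALSE)

# Count per bin; table() ignores the NA rows by default.
counts <- table(interval)
print(counts[counts > 0])
```

With left-closed bins the non-empty counts line up with the truncation-based answer, since both assign each time to the half hour it starts in.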

Upvotes: 1
