Miao  Cai
Miao Cai

Reputation: 1014

calculate hours in day and night based on start time and end time in r

I want to calculate the minutes in the day and in the night based on start time and end time in R. To simplify the question, I assume that the sunrise time is always 6 a.m. and the sunset time is always 6 p.m., regardless of time zone and location.

Here are some sample data:

dat = structure(list(
  start_time = structure(c(1431096404, 1431107312, 1431124632, 1431163956, 1431170210, 1431180438, 1431225936, 1431431610, 1431434550, 1431450416, 1431457208), 
class = c("POSIXct", "POSIXt"), tzone = "America/Chicago"), 
  end_time = structure(c(1431104384, 1431119732, 1431126312, 1431168936, 1431179030, 1431193878, 1431240696, 1431432150, 1431447870, 1431455096, 1431465728),
class = c("POSIXct", "POSIXt"), tzone = "America/Chicago")), 
  row.names = c(NA, -11L), 
  class = "data.frame")

It looks like the dataframe below:

            start_time            end_time
1  2015-05-08 09:46:44 2015-05-08 11:59:44
2  2015-05-08 12:48:32 2015-05-08 16:15:32
3  2015-05-08 17:37:12 2015-05-08 18:05:12
4  2015-05-09 04:32:36 2015-05-09 05:55:36
5  2015-05-09 06:16:50 2015-05-09 08:43:50
6  2015-05-09 09:07:18 2015-05-09 12:51:18
7  2015-05-09 21:45:36 2015-05-10 01:51:36
8  2015-05-12 06:53:30 2015-05-12 07:02:30
9  2015-05-12 07:42:30 2015-05-12 11:24:30
10 2015-05-12 12:06:56 2015-05-12 13:24:56
11 2015-05-12 14:00:08 2015-05-12 16:22:08

I want to add two columns minutes_day and minutes_night. minutes_day is how many minutes of this specific period was during the day (6 a.m. to 6 p.m.), while minutes_night was how many minutes of this specific period was during the night (6 p.m. to 6 a.m. in the next day). So my desired data.frame is like below:

            start_time            end_time minutes_day minutes_night
1  2015-05-08 09:46:44 2015-05-08 11:59:44         133             0
2  2015-05-08 12:48:32 2015-05-08 16:15:32         207             0
3  2015-05-08 17:37:12 2015-05-08 18:05:12          23             5
4  2015-05-09 04:32:36 2015-05-09 05:55:36           0            83
5  2015-05-09 06:16:50 2015-05-09 08:43:50         147             0
6  2015-05-09 09:07:18 2015-05-09 12:51:18         224             0
7  2015-05-09 21:45:36 2015-05-10 01:51:36           0           246
8  2015-05-12 06:53:30 2015-05-12 07:02:30           9             0
9  2015-05-12 07:42:30 2015-05-12 11:24:30         222             0
10 2015-05-12 12:06:56 2015-05-12 13:24:56          78             0
11 2015-05-12 14:00:08 2015-05-12 16:22:08         142             0

This question becomes quite hard to me since some dates are different between start_time and end_time.

Does anyone have clues on this problem? Thank you!

Upvotes: 3

Views: 2356

Answers (2)

amatsuo_net
amatsuo_net

Reputation: 2448

It requires quite a bit of coding but I think this will do the job. It actually gets actual sunset and sunrise time from a suncalc package for each day.

I will annotate it soon.

Calculation for a day

This is a function to calculate day and night in seconds, when the start day and end day is the same. To get the exact sunrise and sunset time, you need to supply latitude and longitude of the location.

library(lubridate)
library(tidyverse)
library(suncalc)

calc_in_oneday <- function(st, ed, lon = 0, lat = 0) {
    sunlight_times <- getSunlightTimes(as.Date(st), lat = lat, lon = lon)
    sunset <- sunlight_times$sunset
    sunrise <- sunlight_times$sunrise
    sec_night <- sec_day <- 0

    if(st > sunset | ed<=sunrise) { # when the period includes the night only
        sec_night  <- difftime(ed, st, units = "secs")
    } else if(st > sunrise & ed<=sunset) { # when the period includes the daytime only
        sec_day  <- difftime(ed, st, units = "secs")
    } else { # when things are bit more complicated
        if (st<=sunrise) { # if "start" is before sunrise time until sunrise will be added to night
            sec_night <- sec_night + difftime(sunrise, st, units = "secs")
        } else {  
            # if otherwise time until sunset will be added to daytime 
            # in this condition "end" will come after sunset (otherwise the second condition above will be satisfied)
            sec_day <- sec_day + difftime(sunset, st, units = "secs")
        } 
        if (ed<=sunset) { # The same logic
            sec_day <- sec_day + difftime(ed, sunrise, units = "secs")
        } else {
            sec_night <- sec_night + difftime(ed, sunset, units = "secs")
        } 
        if(st <= sunrise & ed > sunset) { # above will not add the entire daytime when "start" before sunrise and "end" after sunset
            sec_day <- sec_day + difftime(sunset, sunrise, units = "secs")
        }
    }
    sec_night <- unclass(sec_night)
    sec_day <- unclass(sec_day)
    attr(sec_day, "units") <- NULL
    attr(sec_night, "units") <- NULL
    return(list(sec_day = sec_day, sec_night = sec_night))
}

The nested conditions are complicated. I believe it is right but please check by yourself.

For multiple days

Using the function above, deal with the check for multiple days period. What this function does is check whether the start and end date, and if these are not the same, calculate the day/night time until the end of the first date, and then slide the start time to the beginning of the next day. (Edit: the tzone of the start/end time).


calc_day_night <- function(st, ed, lon = 0, lat = 0) {
    attr(st, "tzone") <- "UTC"
    attr(ed, "tzone") <- "UTC"

    sec_night <- sec_day <- 0
    while(as.Date(st) != as.Date(ed)) {
        tmp_ed <- as.Date(st) + days(1)
        day_night_oneday <- calc_in_oneday(st, tmp_ed, lon, lat)
        sec_night <- sec_night + day_night_oneday$sec_night
        sec_day <- sec_day + day_night_oneday$sec_day
        st <- tmp_ed
    }
    day_night_oneday <- calc_in_oneday(st, ed, lon, lat)
    sec_night <- sec_night + day_night_oneday$sec_night
    sec_day <- sec_day + day_night_oneday$sec_day
    return(list(sec_day = sec_day, sec_night = sec_night))
}

Test

Using the test data, the results look like this:

dat %>%  
    rowwise() %>%
    mutate(temp = list(calc_day_night(start_time, end_time, lat = 41, lon = -85))) %>%
    mutate(sec_day = temp$sec_day) %>%
    mutate(sec_night = temp$sec_night) %>%
    mutate(min_day = round(sec_day / 60)) %>%
    mutate(min_night = round(sec_night / 60)) %>%
    select(-matches("sec")) %>%
    select(-temp) 

## Source: local data frame [11 x 4]
## Groups: <by row>
## 
## # A tibble: 11 x 4
##    start_time          end_time            min_day min_night
##    <dttm>              <dttm>                <dbl>     <dbl>
##  1 2015-05-08 09:46:44 2015-05-08 11:59:44     133         0
##  2 2015-05-08 12:48:32 2015-05-08 16:15:32     207         0
##  3 2015-05-08 17:37:12 2015-05-08 18:05:12      28         0
##  4 2015-05-09 04:32:36 2015-05-09 05:55:36      26        57
##  5 2015-05-09 06:16:50 2015-05-09 08:43:50     147         0
##  6 2015-05-09 09:07:18 2015-05-09 12:51:18     224         0
##  7 2015-05-09 21:45:36 2015-05-10 01:51:36       0       246
##  8 2015-05-12 06:53:30 2015-05-12 07:02:30       9         0
##  9 2015-05-12 07:42:30 2015-05-12 11:24:30     222         0
## 10 2015-05-12 12:06:56 2015-05-12 13:24:56      78         0
## 11 2015-05-12 14:00:08 2015-05-12 16:22:08     142         0

I googled the lat and lon of Chicago, and used the values. As you see, for some records, the results shifted a bit (e.g. record #4 is not entirely night as the dawn of chicago is early in summer).

Upvotes: 1

clemens
clemens

Reputation: 6813

library(lubridate)
library(dplyr)

This function creates a sequence in minutes from start_time to end_time, extracts the hours of the resulting times, separates them to day and night, and gets the time difference between the earliest and latest time at day and at night.

get_minutes <- function(start_time, end_time) {
  mins_in_range <- seq(start_time, end_time, by = "mins")
  h_between <- hour(mins_in_range)

  hours_day <- mins_in_range[h_between >= 6 &
                     h_between < 18]
  hours_night <- mins_in_range[h_between < 6 |
                                 h_between >= 18]
  minutes_day <- tryCatch(as.numeric(difftime(max(hours_day),
                                              min(hours_day),
                                              units = "mins")),
                          warning = function(w) {
                            0
                          })

  minutes_night <- tryCatch(as.numeric(difftime(max(hours_night),
                                                min(hours_night),
                                                units = "mins")),
                            warning = function(w) {
                              0
                            })
  return(list(minutes_day = minutes_day, 
              minutes_night = minutes_night))
}

You can then go through the data per row, apply the function (which returns a list column) and separate the list to columns (with the help of data.table and rbindlist:

dat %>% 
  rowwise() %>% 
  mutate(temp = list(get_minutes(start_time, end_time))) %>% 
  cbind(data.table::rbindlist(.$temp)) %>% 
  select(-temp)

The end result looks like this:

             start_time            end_time minutes_day minutes_night
 1: 2015-05-08 09:46:44 2015-05-08 11:59:44         133             0
 2: 2015-05-08 12:48:32 2015-05-08 16:15:32         207             0
 3: 2015-05-08 17:37:12 2015-05-08 18:05:12          22             5
 4: 2015-05-09 04:32:36 2015-05-09 05:55:36           0            83
 5: 2015-05-09 06:16:50 2015-05-09 08:43:50         147             0
 6: 2015-05-09 09:07:18 2015-05-09 12:51:18         224             0
 7: 2015-05-09 21:45:36 2015-05-10 01:51:36           0           246
 8: 2015-05-12 06:53:30 2015-05-12 07:02:30           9             0
 9: 2015-05-12 07:42:30 2015-05-12 11:24:30         222             0
10: 2015-05-12 12:06:56 2015-05-12 13:24:56          78             0
11: 2015-05-12 14:00:08 2015-05-12 16:22:08         142             0

Upvotes: 1

Related Questions