Reputation: 1447
I have the following dataset
df <- data.frame(
courts = c("Court 1", "Court 1", "Court 2"),
time = c("1:00 hrs", "3:00 hrs", "2:00 hrs")
)
I now remove the space like this
df <- df %>%
mutate(
time_without_hrs = sub(" hrs", "", time)
)
And format the hrs as date.time
df$time_without_hrs <- as.POSIXct(df$time_without_hrs, format="%H:%M")
However, now when I want to sum the courts by hours ....
summary_df <- df %>%
group_by(courts) %>%
summarize(
sum_value1 = sum(time_without_hrs),
)
I get the following error. Any thoughts on what goes wrong?
Error in `summarize()`:
ℹ In argument: `sum_value1 = sum(time_without_hrs)`.
ℹ In group 1: `courts = "Court 1"`.
Caused by error in `Summary.POSIXct()`:
! 'sum' not defined for "POSIXt" objects
Run `rlang::last_trace()` to see where the error occurred.
Upvotes: 1
Views: 68
Reputation: 269644
Convert the time to "difftime"
class and then sum. Finally convert that to numeric.
Optionally omit %>% as.numeric
if a difftime object is ok as the output for the time
column.
Have incorporated suggestion from comments by @ThomasIsCoding
library(dplyr)
df %>%
summarize(time = time %>%
as.difftime(format = "%H:%M hrs") %>%
sum %>%
as.numeric, .by = courts)
## courts time
## 1 Court 1 4
## 2 Court 2 2
Input used
df <- data.frame(
courts = c("Court 1", "Court 1", "Court 2"),
time = c("1:00 hrs", "3:00 hrs", "2:00 hrs")
)
Upvotes: 1
Reputation: 101383
In your case, what you want to sum up is the time interval (difference), not the time instance that POSIXct
represents.
You can use as.difftime
like below
df %>%
summarise(sum_hrs = sum(as.difftime(time,
format = "%H:%M hrs",
units = "hours"
)), .by = courts)
gives
courts sum_hrs
1 Court 1 4 hours
2 Court 2 2 hours
Upvotes: 1
Reputation: 18683
The error message is clear: "! 'sum' not defined for "POSIXt" objects
".
Try lubridate's hm
and hour
functions to convert the original character representation into a numeric value representing the number of whole hours.
library(lubridate)
df %>%
mutate(time_in_hrs=hour(
hm(time)
)) %>%
group_by(courts) %>%
summarize(
sum_value1 = sum(time_in_hrs),
)
# A tibble: 2 x 2
courts sum_value1
<chr> <dbl>
1 Court 1 4
2 Court 2 2
Upvotes: 1