Reputation: 312
I'm trying to create a function that will take a few parameters and return the total average hourly return. My data set looks like this:
Location Time units
1 Columbus 3:35 12
2 Columbus 3:58 199
3 Chicago 6:10 -45
4 Chicago 6:19 87
5 Detroit 12:05 -200
6 Detroit 0:32 11
What I would like returned would be
Location Time units unitsph
Columbus 7:33 211 27.9
Chicago 12:29 42 3.4
Detroit 12:37 -189 -15.1
while also retaining the other items
basically total units produced and units per hour.
I tried out
thing <- time %>% group_by(Location) %>% summarize(sum(units))
which returned locations and total units but not units per hour. Then I moved to
thing <- time %>% group_by(Location) %>% summarize(sum(units)) %>% summarize(sum(Time))
which returned
Error in eval(expr, envir, enclos) : object 'Time' not found
I also tried mutate but to no effect:
fin <- mutate(time, as.numeric(sum(Time))/as.numeric(sum(units)))
Error in Summary.factor(c(118L, 131L, 174L, 178L, 57L), na.rm = FALSE) :
‘sum’ not meaningful for factors
Any help here much appreciated. I also have a few other columns that I'd like to retain (they're geocodes for the locations etc), but didn't list those here. If that's important I can add back in.
Upvotes: 0
Views: 131
Reputation: 312
I ended up taking part of what @CAFEBABE recommended and modifying it.
I used
mutated_time <- time %>%
  group_by(Location) %>%
  summarize(play = sum(as.numeric(Time)/60),
            unitsph = sum(units))
and that plus
selektor <- as.data.frame(select(distinct(mutated_time), Location,unitsph))
got me where I wanted to go. Thank you all for the many helpful comments.
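One caveat worth checking, assuming Time is still stored as a factor (as the 'sum' not meaningful for factors error in the question suggests): as.numeric() on a factor returns the internal level codes, not the clock values. A minimal sketch with a few of the sample times (the variable name time_factor is only for illustration):

```r
# Illustrative factor built from some of the question's sample times.
time_factor <- factor(c("3:35", "3:58", "12:05", "0:32"))

# as.numeric() returns the position of each value among the sorted levels,
# which has nothing to do with minutes or hours.
codes <- as.numeric(time_factor)

# as.character() recovers the original strings, which can then be parsed.
labels <- as.character(time_factor)

print(codes)
print(labels)
```

If the codes are what ends up being summed, the totals will not be hours, so converting via as.character() first is the safer route.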
Upvotes: 1
Reputation: 4101
Your time column is a string, not a number, so it has to be converted before it can be summed. You can use
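library(dplyr)
data <- data.frame(loc=c("C","C","D","D"),time=c("1:22","1:23","1:24","1:25"),u=c(1,2,3,4))
basetime <- strptime("00:00","%H:%M")
# units="hours" is set explicitly; otherwise difftime picks the unit from the data (e.g. minutes for "0:32")
data$in.hours <- as.double(difftime(strptime(data$time,"%H:%M"),basetime,units="hours"))
thing <- data %>% group_by(loc) %>% summarize(sum(u),sum(in.hours))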
The conversion into hours is not exactly beautiful: it first parses the time into a POSIXlt object, takes the difference from midnight, and converts that difference to a double. But it does the job. The converted data:
loc time u in.hours
1 C 1:22 1 1.366667
2 C 1:23 2 1.383333
3 D 1:24 3 1.400000
4 D 1:25 4 1.416667
so 1.366667 means 1 h + 22/60 h, i.e. 1 hour and 22 minutes.
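If the POSIX detour feels heavy, the same decimal-hours value can be computed with plain string splitting. This is only a sketch: hhmm_to_hours is a hypothetical helper name, and it assumes every value has the form "H:MM".

```r
# Hypothetical helper: convert "H:MM" strings (or factors) to decimal hours.
hhmm_to_hours <- function(x) {
  # as.character() guards against factor input
  parts <- strsplit(as.character(x), ":", fixed = TRUE)
  # hours plus minutes / 60 for each element
  vapply(parts,
         function(p) as.numeric(p[1]) + as.numeric(p[2]) / 60,
         numeric(1))
}

hours <- hhmm_to_hours(c("1:22", "0:32", "12:05"))
print(hours)
```

Unlike "%H", which only accepts 00 to 23, this also copes with durations of 24 hours or more.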
The final result is then
loc sum(u) sum(in.hours)
(fctr) (dbl) (dbl)
1 C 3 2.750000
2 D 7 2.816667
Hence for C you have 2 hours and 0.75*60 = 45 minutes.
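Putting this together for the data in the question, units per hour is just the summed units divided by the summed hours. A sketch under a few assumptions: the data frame is rebuilt here from the sample rows, the names total_hours and total_units are mine, and tz = "UTC" is added so a daylight-saving change cannot affect the parsing.

```r
library(dplyr)

# Sample rows from the question; Time is kept as character, not factor.
time <- data.frame(
  Location = c("Columbus", "Columbus", "Chicago", "Chicago", "Detroit", "Detroit"),
  Time     = c("3:35", "3:58", "6:10", "6:19", "12:05", "0:32"),
  units    = c(12, 199, -45, 87, -200, 11),
  stringsAsFactors = FALSE
)

basetime <- strptime("00:00", "%H:%M", tz = "UTC")

result <- time %>%
  # decimal hours since midnight, with the unit fixed to hours
  mutate(hours = as.double(difftime(strptime(Time, "%H:%M", tz = "UTC"),
                                    basetime, units = "hours"))) %>%
  group_by(Location) %>%
  summarize(total_hours = sum(hours),
            total_units = sum(units),
            unitsph     = total_units / total_hours)

print(result)
```

Other columns that are constant per location (the geocodes, for instance) can be kept by adding them to group_by(). Note that Detroit comes out at roughly -15.0 units per hour here, slightly different from the -15.1 quoted in the question.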
Upvotes: 2