Reputation: 680
I have this data frame (I'm not able to post a picture of it here). I am trying to get the time difference between each pair of sequential locations, grouped by component name. Basically, I want to see how long each component spends at each location, then get the average time any component spends at a location, and then the average by location and component type. I originally tried to spread the data so that the location would be the key and the times the value, and then take the difference between the columns, but each component type has different locations, so that did not work.
comps <- structure(list(component_name = c("COMPONENT000000001",
"COMPONENT000000001",
"COMPONENT000000001", "COMPONENT000000001", "COMPONENT000000001",
"COMPONENT000000001", "COMPONENT000000002", "COMPONENT000000002",
"COMPONENT000000002", "COMPONENT000000002", "COMPONENT000000002",
"COMPONENT000000002", "COMPONENT000000002"), component_type =
structure(c(4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("component_0",
"component_1", "component_2", "component_3"), class = "factor"),
location = structure(c(6L, 2L, 14L, 1L, 1L, 4L, 6L, 2L, 14L,
14L, 1L, 1L, 4L), .Label = c("29MSJ_03_01", "5YU1V_01_02",
"7EFLP_03_02", "assembly room", "B57X3_03_00", "GH9CV_00_03",
"HUX1L_02_02", "JX3UO_01_01", "MRX5B_01_00", "TG6IA_00_02",
"VUFVH_00_00", "YBSFJ_00_01", "ZAENM_02_01", "ZZU3X_02_00"
), class = "factor"), times = structure(c(1514764800, 1514771683,
1514784872, 1514794911, 1514806504, 1514820010, 1514764800,
1514776184, 1514789862, 1514794911, 1514806046, 1514831050,
1514843151), class = c("POSIXct", "POSIXt"), tzone = "America/New_York")), .Names = c("component_name",
"component_type", "location", "times"), row.names = c(NA, 13L
), class = "data.frame")
loc_diff <- comps %>%
group_by(., component_type, location) %>%
mutate(., diff = as.numeric(difftime(max(times), min(times))))
ld <- loc_diff %>%
group_by(., location) %>%
summarise(., avg = mean(diff))
This is what I originally tried, but it gave me a data frame in which all the averages are about the same, and I don't think that's right based on other exploration I have done. Any help is greatly appreciated. Thanks!
P.S. I'm not sure if I should be doing a tsa on this but I am new to that and I am still trying to get that to work.
Upvotes: 1
Views: 121
Reputation: 20085
One can first calculate the time spent at each location by a component by grouping on component_name and then taking the difference between the times of the next row and the current row. One can use difftime to find the difference between two times (in, say, seconds).
library(dplyr)
library(tidyr)
library(lubridate)
# First get the time spent by each component at a location
comps_timesSpendAtLocatoin <- comps %>%
group_by(component_name) %>%
mutate(timeSpendAtLocation = difftime(lead(times),times, units = "secs"))
# Group by 'component_name' to find the average time each component spends at a location
comps_timesSpendAtLocatoin %>% group_by(component_name) %>%
summarise(avgTimeComponentAtLocation = mean(timeSpendAtLocation, na.rm = TRUE))
# component_name avgTimeComponentAtLocation
# <chr> <time>
# 1 COMPONENT000000001 11042
# 2 COMPONENT000000002 13058.5
# Average time spent at a location, by component_type and location
comps_timesSpendAtLocatoin %>% group_by(component_type, location) %>%
summarise(avgTimeComponentTypeAtLocation = mean(timeSpendAtLocation, na.rm = TRUE))
# # A tibble: 5 x 3
# # Groups: component_type [?]
# component_type location avgTimeComponentTypeAtLocation
# <fctr> <fctr> <time>
# 1 component_3 29MSJ_03_01 15551
# 2 component_3 5YU1V_01_02 13433.5
# 3 component_3 assembly room NaN
# 4 component_3 GH9CV_00_03 9133.5
# 5 component_3 ZZU3X_02_00 8741
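The question also asks for the average time any component spends at a location, regardless of type. A minimal sketch of that step follows, assuming the rows may not already be sorted by time (hence the arrange) and that the last row of each component, which has no following timestamp, should simply be dropped. The names time_at_location, secs_at_location, avg_by_location, avg_secs and n_intervals are made up for illustration, and the data frame is a compact rebuild of the sample from the question so that the block runs on its own.

```r
library(dplyr)

# Compact rebuild of the question's sample data (same 13 rows)
comps <- data.frame(
  component_name = rep(c("COMPONENT000000001", "COMPONENT000000002"),
                       times = c(6, 7)),
  component_type = rep("component_3", 13),
  location = c("GH9CV_00_03", "5YU1V_01_02", "ZZU3X_02_00",
               "29MSJ_03_01", "29MSJ_03_01", "assembly room",
               "GH9CV_00_03", "5YU1V_01_02", "ZZU3X_02_00", "ZZU3X_02_00",
               "29MSJ_03_01", "29MSJ_03_01", "assembly room"),
  times = as.POSIXct(c(1514764800, 1514771683, 1514784872, 1514794911,
                       1514806504, 1514820010, 1514764800, 1514776184,
                       1514789862, 1514794911, 1514806046, 1514831050,
                       1514843151),
                     origin = "1970-01-01", tz = "America/New_York"),
  stringsAsFactors = FALSE
)

# Seconds from each row to the next row of the same component.
# lead() depends on row order, so sort by time within component first.
time_at_location <- comps %>%
  arrange(component_name, times) %>%
  group_by(component_name) %>%
  mutate(secs_at_location = as.numeric(
    difftime(lead(times), times, units = "secs"))) %>%
  ungroup()

# Average over all components, by location only.
# Rows with NA (the last row of each component) are dropped up front.
avg_by_location <- time_at_location %>%
  filter(!is.na(secs_at_location)) %>%
  group_by(location) %>%
  summarise(avg_secs = mean(secs_at_location), n_intervals = n())

print(avg_by_location)
```

Because the sample contains only one component type, the per-location averages here match the per-type-and-location averages shown above; with several types in the data, the two summaries would differ. Filtering out the NA rows also removes "assembly room" from the result instead of reporting NaN for it.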
Upvotes: 1