Reputation: 223
I have two different time series from different data frames with different irregular formats, but the problem is the same. I want to extract only hours, minutes, seconds and milliseconds.
The time series look like this:
ts1
08:27:23,445
08:27:24,280
08:27:25,115
...
I tried
strptime("08:27:23,445", "%H:%M:%OS")
[1] "2013-05-23 08:27:23"
This loses the millisecond information and adds date information that is useless for me.
ts2
Fri Apr 19 2013 08:39:41 GMT+0200
Fri Apr 19 2013 08:39:43 GMT+0200
Fri Apr 19 2013 08:39:45 GMT+0200
...
I tried
strptime("Fri Apr 19 2013 08:39:41 GMT+0200", "%a %b %d %Y %H:%M:%S %Z")
[1] NA
In the end, I want to transform ts1 and ts2 each into a new time series that has the same format (with milliseconds), for example:
ts1
08:27:23,445
ts2
08:39:41,000
Having the same format is important to me because I want to work with the two time series later on, e.g. matching them, calculating differences, etc.
Thank you for your help!
UPDATE: add dput
Both datasets are very long, which is why I cut them down.
ts1
structure(list(t = structure(1:9, .Label = c("08:27:23,445",
"08:27:24,280", "08:27:25,115", "08:27:25,960", "08:27:26,780",
"08:27:27,540", "08:27:28,295", "08:27:29,075", "08:27:29,910"), class = "factor")), .Names = "t", row.names = c(NA, -9L
), class = "data.frame")
ts2
structure(list(t = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 6L, 7L,
8L), .Label = c("Fri Apr 19 2013 08:39:41 GMT+0200", "Fri Apr 19 2013 08:39:43 GMT+0200",
"Fri Apr 19 2013 08:39:45 GMT+0200", "Fri Apr 19 2013 08:39:49 GMT+0200",
"Fri Apr 19 2013 08:39:51 GMT+0200", "Fri Apr 19 2013 08:39:53 GMT+0200",
"Fri Apr 19 2013 08:39:59 GMT+0200", "Fri Apr 19 2013 08:40:05 GMT+0200",
"Fri Apr 19 2013 08:40:06 GMT+0200"
), class = "factor")), .Names = "t", row.names = c(NA, -9L), class = "data.frame")
Upvotes: 5
Views: 148
Reputation: 6554
I think the key to getting the %OS specifier to work might be converting the decimal separator from "," to ".". The following works for me:
> ts1 = data.frame(t = c("08:27:23,445", "08:27:24,280", "08:27:25,115",
+ "08:27:25,960", "08:27:26,780", "08:27:27,540",
+ "08:27:28,295", "08:27:29,075", "08:27:29,910"),
+ stringsAsFactors = FALSE)
>
> ts1$t = sub(",", ".", ts1$t)
>
> T = apply(ts1, 1, function(n) {(strptime(n, format = "%H:%M:%OS"))})
>
> T[[1]] # ****
t
"2013-12-22 08:27:23"
> sprintf("%.3f", T[[1]])
[1] "1387693643.445"
Here you can see that if you convert the time structure to a numeric, the fractional seconds are definitely there!
However, it is also possible that your locale actually expects the comma as the decimal separator, in which case you may well find that the fractional seconds are already there but do not show up when you simply print the time structure (see the line above marked ****, where the fractional part is not displayed).
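To make the hidden fraction visible, one option is to request it explicitly when formatting. The sketch below assumes a single ts1 value and uses tz = "UTC" only to keep the example independent of the local time zone; the variable names x and shown are illustrative. According to ?strptime, %OS3 truncates rather than rounds on output, so the last digit may print as 4 instead of 5 because of floating-point representation.

```r
# Parse one ts1 value after swapping the decimal comma for a point.
x <- strptime(sub(",", ".", "08:27:23,445", fixed = TRUE),
              format = "%H:%M:%OS", tz = "UTC")

# Default printing hides the fraction; %OS3 asks for three decimals
# and dropping %Y-%m-%d from the format leaves out the unwanted date.
shown <- format(x, "%H:%M:%OS3")
print(shown)

# The fraction is stored in the 'sec' component of the POSIXlt object.
print(x$sec)
```

Setting options(digits.secs = 3) has a similar effect on default printing for the rest of the session.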
Upvotes: 0
Reputation: 295
Below is a quick sapply approach that might help, IF you have a fixed zero point. For example, if you only want to compare activities from 0:00 (midnight) until 23:59:59,999 on the same day. If so, you can convert the time into another form (minutes in my example) and you can see how long, say, a single activity takes.
Using your example for ts1:
Make a vector of times (as characters)
time <- c("08:27:23,445",
"08:27:24,280",
"08:27:25,115")
Change the comma to a colon, for ease of splitting
time.new <- gsub(",", ":", time)
Compute decimal minutes
time.mins <- sapply(strsplit(as.character(time.new), ":"),
                    function(x) {
                      # x holds hours, minutes, seconds, milliseconds
                      x <- as.numeric(x)
                      x[1] * 60 + x[2] + x[3] / 60 + x[4] / 60000
                    })
The result looks like this if you bind the two vectors together (note that cbind gives a character matrix here, not a data frame):
> df <- cbind(time, time.mins)
> df
time time.mins
[1,] "08:27:23,445" "507.39075"
[2,] "08:27:24,280" "507.404666666667"
[3,] "08:27:25,115" "507.418583333333"
I imagine this might be a bit more helpful for something like click-through rates, or when you don't ever care about a total gap of more than 24 hours.
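The same splitting idea can be stretched to cover ts2 as well, which gives both series a common format and a common numeric scale. This is only a sketch under some assumptions: to_seconds is a hypothetical helper, the date and the GMT offset in ts2 are simply ignored, and every ts2 string is assumed to contain exactly one HH:MM:SS pattern (regmatches silently drops elements without a match). It avoids strptime for ts2 because, per ?strptime, %Z is output-only.

```r
# Hypothetical helper: "HH:MM:SS" or "HH:MM:SS,mmm" -> seconds since midnight.
to_seconds <- function(x) {
  parts <- strsplit(gsub(",", ":", x, fixed = TRUE), ":", fixed = TRUE)
  vapply(parts, function(p) {
    p <- as.numeric(p)
    ms <- if (length(p) >= 4) p[4] else 0   # milliseconds are optional
    p[1] * 3600 + p[2] * 60 + p[3] + ms / 1000
  }, numeric(1))
}

ts1 <- c("08:27:23,445", "08:27:24,280", "08:27:25,115")
ts2 <- c("Fri Apr 19 2013 08:39:41 GMT+0200",
         "Fri Apr 19 2013 08:39:43 GMT+0200")

# Pull the clock time out of each ts2 string and pad it with ",000"
# so that it has the same layout as ts1.
clock2  <- regmatches(ts2, regexpr("[0-9]{2}:[0-9]{2}:[0-9]{2}", ts2))
ts2_fmt <- paste0(clock2, ",000")

s1 <- to_seconds(ts1)
s2 <- to_seconds(ts2_fmt)

print(ts2_fmt)        # ts2 in the ts1 layout
print(diff(s1))       # gaps between consecutive ts1 readings, in seconds
print(s2[1] - s1[1])  # offset between the two series, in seconds
```

Working in seconds since midnight keeps the comparison simple, with the same limitation as above: it only makes sense when both series fall on the same day.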
Upvotes: 1