Reputation: 1352
I am stuck converting strings to times. I know there are many topics on Stack about converting strings to times, but I couldn't fix this problem with the solutions I found.
Situation: I have a file with times like this:
> dput(df$Time[1:50])
c("1744.3", "2327.54", "1718.51", "2312.3200000000002", "1414.16",
"2046.15", "1442.5", "1912.22", "2303.2199999999998", "2146.3200000000002",
"1459.02", "1930.15", "1856.23", "2319.15", "1451.05", "25.460000000000036",
"1453.25", "2309.02", "2342.48", "2322.5300000000002", "2101.5",
"2026.07", "1245.04", "1945.15", "5.4099999999998545", "1039.5",
"1731.37", "2058.41", "2030.36", "1814.31", "1338.18", "1858.33",
"1731.36", "2343.38", "1733.27", "2304.59", "1309.47", "1916.11",
"1958.3", "1929.54", "1756.4", "1744.23", "1731.26", "1844.47",
"1353.25", "1958.3", "1746.44", "1857.53", "2047.15", "2327.2199999999998", "1915"
)
In this example, the times should be like this:
"1744.3" = 17:44:30
"2327.54" = 23:27:54
"1718.51" = 17:18:51
"2312.3200000000002" = 23:12:32
...
"25.460000000000036" = 00:25:46 # as you can see, the first two 00 are missing.
"1915" = 19:15:00
I have tried multiple things (and now I am even stuck with str_replace()). Hopefully someone knows how I can transform this.
What have I tried?
format(df$Time, "%H%M.%S") # Yes I know...
# So I thought, let's rewrite the strings to get them into a proper format
# like HH:MM:SS. The first step was to replace the "." with a ":"
str_replace("." , ":", df$Time) # this was leading to "." (don't know why)
At that point I was so frustrated that I posted it on Stack. I hope you can help me.
Many thanks in advance!
Upvotes: 1
Views: 191
Reputation: 12410
The main problem is the time "25.460000000000036". But I think I found a clear, though somewhat verbose, solution:
library(tidyverse)
df %>%
  mutate(hours = formatC(as.numeric(Time), width = 4, format = "d", flag = "0"),
         seconds = as.numeric(str_extract(Time, "[.].+")) * 100) %>%
  mutate(Time_new = stringi::stri_datetime_parse(paste0(hours, seconds), format = "HHmm.ss"))
#> # A tibble: 51 x 4
#> Time hours seconds Time_new
#> <chr> <chr> <dbl> <dttm>
#> 1 25.460000000000036 0025 46. 2020-02-19 00:25:46 # I changed the order of the times so the weird format is on top
#> 2 1744.3 1744 30 2020-02-19 17:44:30
#> 3 2327.54 2327 54 2020-02-19 23:27:54
#> 4 1718.51 1718 51 2020-02-19 17:18:51
#> 5 2312.3200000000002 2312 32. 2020-02-19 23:12:32
#> 6 1414.16 1414 16 2020-02-19 14:14:16
#> 7 2046.15 2046 15 2020-02-19 20:46:15
#> 8 1442.5 1442 50 2020-02-19 14:42:50
#> 9 1912.22 1912 22 2020-02-19 19:12:22
#> 10 2303.2199999999998 2303 22.0 2020-02-19 23:03:21
#> # ... with 41 more rows
If you also have times without fractions (i.e., without the dot) you could use this approach:
normalize_time <- function(t) {
  formatC(as.numeric(t) * 100, width = 6, format = "d", flag = "0")
}
df %>%
  mutate(Time_new = as.POSIXct(normalize_time(Time), format = "%H%M%S"))
Upvotes: 1
Reputation: 27732
A data.table way
First, convert the strings in your vector to numeric, multiply by 100 (so that the whole HHMMSS part sits before the decimal separator) and convert to integer. Then use sprintf() to add leading zeros to get a 6-digit string. Finally, convert to time.
data.table::as.ITime( sprintf( "%06d",
as.integer( as.numeric(time) * 100 ) ),
format = "%H%M%S" )
# [1] "17:44:30" "23:27:54" "17:18:51" "23:12:32" "14:14:16" "20:46:15" "14:42:50" "19:12:22" "23:03:21" "21:46:32" "14:59:02" "19:30:15"
# [13] "18:56:23" "23:19:15" "14:51:05" "00:25:46" "14:53:25" "23:09:02" "23:42:48" "23:22:53" "21:01:50" "20:26:07" "12:45:04" "19:45:15"
# [25] "00:05:40" "10:39:50" "17:31:37" "20:58:41" "20:30:36" "18:14:31" "13:38:18" "18:58:33" "17:31:36" "23:43:38" "17:33:27" "23:04:59"
# [37] "13:09:47" "19:16:11" "19:58:30" "19:29:54" "17:56:40" "17:44:23" "17:31:26" "18:44:47" "13:53:25" "19:58:30" "17:46:44" "18:57:53"
# [49] "20:47:15" "23:27:21"
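One caveat with the output above: as.integer() truncates, so a value such as "2303.2199999999998" comes out as 23:03:21, although the question's rounding (2312.3200000000002 = 23:12:32) suggests 23:03:22 is intended. The same applies to 00:05:40 and 23:27:21. A minimal base R sketch that rounds instead of truncating; the helper name to_hms is made up for illustration and is not part of data.table:

```r
# Hypothetical helper: turn strings like "1744.3" (HHMM.SS) into "HH:MM:SS".
to_hms <- function(x) {
  # Scale so that HHMMSS sits before the decimal point, then round
  # (rather than truncate) to absorb floating-point noise.
  n <- round(as.numeric(x) * 100)
  # Left-pad with zeros to six digits, e.g. 2546 -> "002546".
  s <- sprintf("%06.0f", n)
  # Insert the colons after the hours and the minutes.
  sub("(..)(..)(..)", "\\1:\\2:\\3", s)
}

to_hms(c("2303.2199999999998", "25.460000000000036",
         "5.4099999999998545", "1915"))
```

The resulting character vector can be passed to data.table::as.ITime() in the same way as the sprintf() result above.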
Upvotes: 1
Reputation: 10375
A roundabout way of doing it
tmp=as.numeric(lapply(strsplit(as.character(df$Time),"\\."),function(x){nchar(x[1])}))
ifelse(tmp>2,
substr(as.POSIXct(df$Time,format="%H%M.%S"),12,19),
substr(as.POSIXct(df$Time,format="%M.%S"),12,19))
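Something to watch out for with this approach: %S reads the digits after the dot literally, so "1744.3" would be parsed as 3 seconds rather than 30, and a value without a dot such as "1915" does not match the format at all. A possible workaround, sketched here under the assumption that every value is meant as HHMM.SS, is to normalise the strings to a fixed width first. The helper name parse_hhmmss and the choice of tz = "UTC" are illustrative assumptions:

```r
# Hypothetical helper: normalise "HHMM.SS" strings before parsing them.
parse_hhmmss <- function(x) {
  # Fixed width of 7 with two decimals:
  # "1744.3" -> "1744.30", "25.460000000000036" -> "0025.46", "1915" -> "1915.00".
  s <- sprintf("%07.2f", as.numeric(x))
  # tz = "UTC" is an assumption that keeps the example free of DST effects.
  as.POSIXct(s, format = "%H%M.%S", tz = "UTC")
}

# Keep only the time of day, as the substr(..., 12, 19) calls above do.
format(parse_hhmmss(c("1744.3", "25.460000000000036", "1915")), "%H:%M:%S")
```

With the normalisation in place, a single format string covers all cases, so the ifelse() on the number of leading digits should no longer be needed.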
Upvotes: 1
Reputation: 388907
Here is a way to do this, storing the output from dput in x.
library(magrittr)
#Remove all the dots
gsub('\\.', '', x) %>%
#Select only first 6 characters
substr(1, 6) %>%
#Pad 0's at the end
stringr::str_pad(6,pad = '0', side = 'right') %>%
#Add colon (:) separator
sub('(.{2})(.{2})', '\\1:\\2:', .)
# [1] "17:44:30" "23:27:54" "17:18:51" "23:12:32" "14:14:16" "20:46:15"
# [7] "14:42:50" "19:12:22" "23:03:21" "21:46:32" "14:59:02" "19:30:15"
#[13] "18:56:23" "23:19:15" "14:51:05" "25:46:00" "14:53:25" "23:09:02"
#...
Note that this can be done without pipes as well; I use them here for clarity. From here you can convert the time to POSIXct format if needed.
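Element 16 in the output above is "25:46:00", whereas the question expects "25.460000000000036" to become 00:25:46, because right-padding alone cannot restore the missing leading zeros. A minimal base R sketch of one way around that, assuming the part before the dot is always HHMM with leading zeros dropped; the variable names are made up for illustration:

```r
x <- c("1744.3", "25.460000000000036", "1915")

# Split on the literal dot; values without a dot yield a single piece.
parts <- strsplit(x, ".", fixed = TRUE)

# Left-pad the HHMM part to four digits: "25" -> "0025".
hhmm <- sprintf("%04d", as.integer(vapply(parts, `[`, character(1), 1)))

# Take the first two digits after the dot, right-padded with zeros:
# "3" -> "30", "460000000000036" -> "46", missing -> "00".
ss <- vapply(parts, function(p) if (length(p) > 1) p[2] else "", character(1))
ss <- substr(paste0(ss, "00"), 1, 2)

# Add the colon separators, as in the pipeline above.
hms <- sub("(..)(..)", "\\1:\\2:", paste0(hhmm, ss))

# Optional conversion to POSIXct; tz = "UTC" is an assumption, and
# the date part defaults to the current day.
times <- as.POSIXct(hms, format = "%H:%M:%S", tz = "UTC")
```

Like the original pipeline, this cuts the seconds off after two digits instead of rounding, so "2303.2199999999998" would still give 23:03:21.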
Upvotes: 2