Reputation: 41
I have the following dataframe:
I have several questions on how to format and work with the dataframe (all my columns are character type of variables).
First, is there a way to reformat the Start column to a time type variable from a character variable.
Just like as.Date to format date, is there an as.time function in R ?
Second, I have two values in my Start and Stop columns that are not properly displayed as a 24 hour time format, is there any way to change them to the following :
24:46:00 to 0:46:00 and 25:14:00 to 01:14:00
Third, I would like to merge the date and Start column to one column as the following:
2019-05-12 18:11
I have tried the following: but it gives me N/As as value for the new variable.
df$DT <- as.POSIXct(paste(df$Date, df$Start), format="%d-%m-%Y %H:%M:%S")
Last I would like to calculate the difference between the start and stop time, I have done the following and it returns me with an error message :
df$Time_diff = difftime(df$Stop, df$Start, units = "mins")
the error message are the following:
Error in as.POSIXlt.character(x, tz, ...) : character string is not in a standard unambiguous format
Could anyone help me with the above four points (I would like to write the code with base R without importing any library)?
Thank you !
Upvotes: 1
Views: 426
Reputation: 19191
Alternative solution, using base R:
a
# [1,] "25:45:00" "17:34" "2020-09-17"
# [2,] "13:43" "13:34" "2019-04-12"
# [3,] "17:56:00" "12:45" "2019-11-03"
b <- data.frame( t(apply(
a[,1:2], 1, function(x) sapply(
x, function(y) if(as.numeric(substr(y,1,2)) > 23){
paste0(as.numeric(substr(y,1,2))-24,":",as.numeric(substr(y,4,5)) ) }
else{ y } ) )), a[,3] )
# V1 a a...3.
# 1 1:45 17:34 2020-09-17
# 2 13:43 13:34 2019-04-12
# 3 17:56:00 12:45 2019-11-03
cc <- data.frame( Start=as.POSIXlt(strptime( paste( b[,3], b[,1] ), "%Y-%m-%d %H:%M" )),
Stop=as.POSIXlt(strptime( paste( b[,3], b[,2] ), "%Y-%m-%d %H:%M" ) ) )
# Start Stop
# 1 2020-09-17 01:34:00 2020-09-17 17:34:00
# 2 2019-04-12 13:43:00 2019-04-12 13:34:00
# 3 2019-11-03 17:56:00 2019-11-03 12:45:00
diff.Date(cc)
# Stop
# 1 960 mins
# 2 -9 mins
# 3 -311 mins
Converting only a time ("17:45") to as.Date ultimately always results in "recent Date" + Time.
Upvotes: 1
Reputation: 887881
There are different formats in 'Start' column. An option is to use parse_date_time
for correctly parsing different formats
library(lubridate)
parse_date_time(c("2019-05-12 18:11", "2019-10-20 24:46:00"),
c("ymd HMS", "ymd HM"))
#[1] "2019-05-12 18:11:00 UTC" "2019-10-21 00:46:00 UTC"
Using the OP's code
strt <- with(df, parse_date_time(paste(Date, Start), c("ymd HMS", "ymd HM")))
stop <- with(df, parse_date_time(paste(Date, Stop), c("ymd HMS", "ymd HM")))
and now we do the difftime
difftime(stop, strt, units = "mins")
Or another option is anytime
library(anytime)
anytime(c("2019-05-12 18:11", "2019-10-20 24:46:00"))
If there are only two formats for the 'Start', 'Stop' column, then we can do this in two steps
fmt1 <- "%d-%m-%Y %H:%M:%S"
fmt2 <- "%d-%m-%Y %H:%M"
strt <- with(df, as.POSIXct(paste(Date, Start), format = fmt1))
i1 <- is.na(strt)
str1[i1] <- with(df, as.POSIXct(paste(Date, Start)[i1], format = fmt2))
stop <- with(df, as.POSIXct(paste(Date, Stop), format = fmt1))
i2 <- is.na(stop)
stop[i2] <- with(df, as.POSIXct(paste(Date, Stop[i2], format = fmt2))
For the second part of the question, ie. to convert to 24 hour
period
str1 <- "25:14:00"
sapply(strsplit(str1, ":"), function(x) {
x1 <- as.numeric(x)
i1 <- x[1] >= 24
x1[1][i1] <- x1[1][i1] - 24
do.call(sprintf, c(as.list(x1), fmt = "%02d:%02d:%02d"))
})
#[1] "01:14:00"
Upvotes: 1