Reputation: 711
The dates in my dataset look like this: 20130501000000, and I'm trying to convert them to a better datetime format in R:
data1$date <- as.Date(data1$date, format = "%Y-%m-%s-%h-%m-%s")
However, I get an error for needing an origin. After I put the very first cell under date in as origin, it converts every cell under date to N/A. Is this right or should I try as.POSIXct()?
Upvotes: 5
Views: 20885
Reputation: 3829
Before getting to the answer: the format argument must describe the format your string is actually in. Your string "20130501000000" has no - between its components, so the format string should not have any either:
as.Date("20130501000000", format = "%Y%m%d%H%M%S")
# [1] "2013-05-01"
This works without any error and returns an object of class Date:
as.Date("20130501000000", format = "%Y%m%d%H%M%S") |> class()
# [1] "Date"
So I think your problem is the format string rather than the origin of the date.
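To illustrate both halves of that point, here is a minimal sketch. It assumes (the question does not say) that data1$date may be stored as a number, in which case as.Date() treats it as a count of days and asks for an origin instead of using format. The variable names bad, raw, txt and d are made up for the example, and sprintf("%.0f", ...) is used because as.character() may write such a large number in scientific notation.

```r
# A format string with separators the input lacks cannot match, so the result is NA
bad <- as.Date("20130501000000", format = "%Y-%m-%d")
is.na(bad)
# [1] TRUE

# Hypothetical numeric version of the same value
raw <- 20130501000000

# Write out all 14 digits as text, without scientific notation
txt <- sprintf("%.0f", raw)

# Now the format string is applied and no origin is needed
d <- as.Date(txt, format = "%Y%m%d%H%M%S")
d
# [1] "2013-05-01"
```

If the column is already character, the sprintf() step is unnecessary.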
Now to my detailed answer:
As far as I know, as.Date() keeps only the date part, so if you want the time part of the string as well, you have to use as.POSIXct():
as.POSIXct("20130501000000", format = "%Y%m%d%H%M%S")
# [1] "2013-05-01 EEST"
as.POSIXct("20130501000000", format = "%Y%m%d%H%M%S") |> class()
# [1] "POSIXct" "POSIXt"
Note that the timezone is EEST, which is my local timezone. To get a different timezone, pass it explicitly through the tz argument. For example, to set the timezone to UTC:
as.POSIXct("20130501000000", format = "%Y%m%d%H%M%S", tz = "UTC")
# [1] "2013-05-01 UTC"
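The printed value shows no clock time only because the time is exactly midnight; the time is still stored. A small sketch to confirm this, using format() with an explicit output pattern (the variable name x is just for illustration):

```r
x <- as.POSIXct("20130501000000", format = "%Y%m%d%H%M%S", tz = "UTC")

# Ask for the time fields explicitly instead of relying on the default print
shown <- format(x, "%Y-%m-%d %H:%M:%S")
shown
# [1] "2013-05-01 00:00:00"
```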
With the object returned by as.POSIXct() you can do arithmetic:
times <- c("20130501000000",
"20130501035001") # added 03:50:01 to the first element
class(times)
# [1] "character"
times <- as.POSIXct(times, format = "%Y%m%d%H%M%S", tz = "UTC")
class(times)
# [1] "POSIXct" "POSIXt"
times[2] - times[1]
# Time difference of 3.833611 hours
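The subtraction above lets R pick the unit. If a fixed unit is needed, difftime() accepts one, and adding a number to a POSIXct value shifts it by that many seconds. A short sketch along those lines (elapsed and one_hour_later are illustrative names):

```r
times <- as.POSIXct(c("20130501000000", "20130501035001"),
                    format = "%Y%m%d%H%M%S", tz = "UTC")

# Same difference as above, but forced to seconds: 3 h 50 min 1 s
elapsed <- difftime(times[2], times[1], units = "secs")
as.numeric(elapsed)
# [1] 13801

# Adding a plain number adds that many seconds
one_hour_later <- times[1] + 3600
format(one_hour_later, "%Y-%m-%d %H:%M:%S")
# [1] "2013-05-01 01:00:00"
```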
Upvotes: 0
Reputation: 368439
That is a somewhat degenerate format, but the anytime() and anydate() functions of the anytime package can help you, without requiring any explicit format strings:
R> anytime("20130501000000") ## returns POSIXct
[1] "2013-05-01 CDT"
R> anydate("20130501000000") ## returns Date
[1] "2013-05-01"
R>
Note that we parse from a character representation here -- parsing from numeric would be wrong, as we use a conflicting heuristic to make sense of dates stored as numeric values.
So here your code would just become
data1$date <- anytime::anydate(data1$date)
provided data1$date is character; otherwise wrap as.character() around it.
Lastly, if you actually want a datetime rather than a Date (as per your title), use anytime() instead of anydate().
Upvotes: 14