Reputation: 1271
I have a series of dates that appear to be defined in seconds since Jan 1, 1960.
'data.frame': 5 obs. of 1 variable:
$ original: int 1624086000 1624086000 1508137200 1508137200 1508137200
(for reproduction:)
# Base R only; the column is named directly, so setnames() from data.table is not needed
data <- data.frame(original = c(1624086000, 1624086000, 1508137200, 1508137200, 1508137200))
I would like to convert these to dates in the format %Y-%m-%d.
I wrote the following code for this:
uniqueDates <- as.data.frame(unique(data))
uniqueDates$converted <- sapply(uniqueDates$original, function(x) as.Date(as.POSIXct(x, origin="1960-01-01", tz = "GMT"), "GMT", "%Y-%m-%d"))
The results are dates in a five-digit numeric format:
> str(uniqueDates$converted)
num [1:2] 15144 13802
If I just run
as.Date(as.POSIXct(1624086000, origin="1960-01-01", tz = "GMT"), "GMT", "%Y-%m-%d")
I get the desired result:
[1] "2011-06-19"
What am I doing wrong that results in five-digit numeric values instead of Date objects?
Upvotes: 0
Views: 1330
Reputation: 28461
as.Date(as.POSIXct(data[,1], origin="1960-01-01", tz = "GMT"), "GMT", "%Y-%m-%d")
[1] "2011-06-19" "2011-06-19" "2007-10-16" "2007-10-16" "2007-10-16"
The function is already vectorized, so there is no need for sapply here. Use the apply family if you have multiple columns of dates. If you want to avoid the long anonymous function, you can define the function first and reuse it wherever you need it:
as.ymd <- function(x) {
  as.Date(as.POSIXct(x, origin="1960-01-01", tz = "GMT"), "GMT", "%Y-%m-%d")
}
So now, with either a single vector or a data frame with multiple columns, you can convert the dates in both cases:
# Two identical columns of seconds since 1960-01-01, named directly in base R
secs <- c(1624086000, 1624086000, 1508137200, 1508137200, 1508137200)
data2 <- data.frame(original = secs, second = secs)
as.ymd(data2[,1])
[1] "2011-06-19" "2011-06-19" "2007-10-16" "2007-10-16" "2007-10-16"
data2[] <- lapply(data2, as.ymd)
data2
    original     second
1 2011-06-19 2011-06-19
2 2011-06-19 2011-06-19
3 2007-10-16 2007-10-16
4 2007-10-16 2007-10-16
5 2007-10-16 2007-10-16
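The five-digit numeric output from sapply is due to its simplification process. When sapply collapses the list of results into a vector, the Date class attribute is dropped, so you are left with the underlying numeric values (days since 1970-01-01). Try adding the argument simplify=FALSE to the first function that you tried for comparison.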
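To make the comparison concrete, here is a minimal sketch using two of the values from the question. The helper name to_date and the do.call(c, ...) step for rebuilding a Date vector are my own additions for illustration, not part of the original code:

```r
# Seconds since 1960-01-01, taken from the question
secs <- c(1624086000, 1508137200)

# Hypothetical helper (name chosen for this example); same conversion as above
to_date <- function(x) {
  as.Date(as.POSIXct(x, origin = "1960-01-01", tz = "GMT"), tz = "GMT")
}

# Default sapply(): results are simplified to a plain vector, Date class is lost
simplified <- sapply(secs, to_date)
class(simplified)   # "numeric"

# simplify = FALSE: a list is returned and each element is still a Date
kept <- sapply(secs, to_date, simplify = FALSE)
class(kept[[1]])    # "Date"

# One way to combine the list back into a single Date vector
dates <- do.call(c, kept)
class(dates)        # "Date"
```

In practice the vectorized call shown at the top of this answer is simpler; this only illustrates where the class goes missing.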
You can work around it with strftime, since it outputs vectors with the class character. With sapply there will not be any problem simplifying it, but then you're left with character strings instead of the chosen date classes (POSIXct, POSIXlt, Date, zoo, xts, ...).
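A short sketch of that workaround, again with two values from the question. The variable names are illustrative, and the final as.Date() call is an extra step I am assuming you would want if you need Date objects back:

```r
# Seconds since 1960-01-01, taken from the question
secs <- c(1624086000, 1508137200)

# strftime() returns character strings, which sapply() can simplify safely
as_text <- sapply(secs, function(x) {
  strftime(as.POSIXct(x, origin = "1960-01-01", tz = "GMT"),
           format = "%Y-%m-%d", tz = "GMT")
})
class(as_text)    # "character"

# Convert the strings back to Date if a date class is needed afterwards
as_dates <- as.Date(as_text, format = "%Y-%m-%d")
class(as_dates)   # "Date"
```

The round trip through character works, but it does more than necessary compared with calling the vectorized conversion on the whole column.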
Upvotes: 1