MFR

Reputation: 2077

Issue in converting date format to numeric format in R

I had a dataset that looked like this:

#df 

id       date
1       2016-08-30 10:46:46.810

I tried to remove the hour part and only keep the date. This function worked:

df$date <- format(as.POSIXct(strptime(df$date, "%Y-%m-%d %H:%M:%S")), format = "%Y-%m-%d")

and the date now looks like this:

id      date
1       2016-08-30

This is what I was looking for. The problem is that I want to do some calculations on this data, so I have to convert it to a number:

temp <- as.numeric(df$date)

It gives me the following warning:

Warning message:
NAs introduced by coercion 

and results in

NA

Does anyone know where the issue is?

Upvotes: 2

Views: 315

Answers (2)

Jason

Reputation: 2617

Don't use format(). format() gives you a character vector (string), and as.numeric() cannot parse it because of the non-numeric characters (the hyphens) in there. As far as the parser is concerned, you might as well have asked as.numeric("ripe red tomatoes").

Use as.Date() instead, e.g.:

as.Date(as.POSIXct(df$date, format="%Y-%m-%d %H:%M:%S"))
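
To make the difference concrete, here is a minimal base-R sketch. The one-row data frame is rebuilt from the question; the names date_chr and date_d are made up for illustration, and tz = "UTC" is an assumption added so the result does not depend on the local time zone.

```r
# Rebuild the one-row data frame from the question (value copied from the post)
df <- data.frame(id = 1, date = "2016-08-30 10:46:46.810",
                 stringsAsFactors = FALSE)

# Parse once; tz = "UTC" is an assumption to keep the example reproducible
pt <- as.POSIXct(df$date, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# format() returns a character vector, so as.numeric() cannot parse it
date_chr <- format(pt, format = "%Y-%m-%d")
class(date_chr)                          # "character"
suppressWarnings(as.numeric(date_chr))   # NA

# as.Date() keeps a real date class, stored as days since 1970-01-01
date_d <- as.Date(pt, tz = "UTC")
class(date_d)                            # "Date"
as.numeric(date_d)                       # 17043
```

If you still need the "2016-08-30" text for display, call format() only at the point of printing and keep the Date column for the calculations.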

Upvotes: 1

Dirk is no longer here

Reputation: 368599

It's pretty easy as you have a standard format (see ISO 8601) which inter alia the anytime package supports (and it supports other, somewhat regular formats):

R> library(anytime)
R> at <- anytime("2016-08-30 10:46:46.810")
R> at
[1] "2016-08-30 10:46:46.80 CDT"
R> ad <- anydate("2016-08-30 10:46:46.810")
R> ad
[1] "2016-08-30"
R> 

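The key, though, is to understand the relationship between the underlying date types: POSIXct stores a date-time as seconds since 1970-01-01 UTC, while Date stores whole days since 1970-01-01. You will have to read and experiment a bit more on that. Here, in essence, we just have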

R> as.Date(anytime("2016-08-30 10:46:46.810"))
[1] "2016-08-30"
R> 
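
As a rough illustration of that relationship, the following base-R sketch (no anytime needed) looks at the numbers underneath both classes. The variable names pt and d are made up for illustration, and tz = "UTC" is an assumption so the day boundary is unambiguous.

```r
# A POSIXct value is a count of seconds since 1970-01-01 00:00:00 UTC
pt <- as.POSIXct("2016-08-30 10:46:46", tz = "UTC")
unclass(pt)                     # seconds since the epoch, plus a tzone attribute

# A Date value is a count of whole days since 1970-01-01
d <- as.Date(pt, tz = "UTC")
unclass(d)                      # 17043

# Dividing the seconds by 86400 and dropping the fraction gives the same day count
floor(as.numeric(pt) / 86400)   # 17043
```
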

The anytime package has a few other tricks such as automagic conversion from integer, character, factor, ordered, ...

As for the second part of your question, you were so close, and then you spoiled it again with format(), which creates a character representation.

You almost always want a Date representation instead:

R> ad <- as.Date(anytime("2016-08-30 10:46:46.810"))
R> as.integer(ad)
[1] 17043
R> as.numeric(ad)
[1] 17043
R> ad + 1:3
[1] "2016-08-31" "2016-09-01" "2016-09-02"
R> 
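
Since the question mentions doing calculations on the dates, here is a small base-R sketch of typical arithmetic once the column is a Date. The second date and the names d1 and d2 are hypothetical, chosen only for illustration.

```r
d1 <- as.Date("2016-08-30")   # date from the question
d2 <- as.Date("2016-09-02")   # hypothetical second date

d2 - d1                                         # a difftime of 3 days
as.numeric(d2 - d1)                             # 3
as.numeric(difftime(d2, d1, units = "days"))    # 3, with the unit stated explicitly
seq(d1, by = "day", length.out = 3)             # three consecutive dates starting at d1
```

Subtracting two Date values gives a difftime, so wrapping the result in as.numeric() is a convenient way to get a plain number of days.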

Upvotes: 4
