Reputation: 1170
I have a string variable that I want to parse to class Date. In addition to the day, year and month, the format has other characters like separators (,), letters and apostrophes (u''), like this:
"u'9', u'2005', u'06'"
I have tried
as.Date(my_data$date, format = '%d %Y %m')
...but it only produces missing values. I was hoping that R would interpret the u'' as a unicode designator, which it doesn't.
How do I strip all those unused characters so that this "u'9', u'2005', u'06'" becomes simply this "9 2005 06"?
Upvotes: 1
Reputation: 67778
You don't need to strip the characters that are not part of the conversion specification. In ?as.Date, the description of the format argument points to ?strptime ("Otherwise, the processing is via strptime"). In the Details section of ?strptime we find that:
"[a]ny character in the format string not part of a conversion specification is interpreted literally"
That is, in the format argument of as.Date, you may include not only the conversion specifications (introduced by %) but also the "other characters", which must then match the input exactly.
Furthermore, from ?as.Date:
Character strings are processed as far as necessary for the format specified: any trailing characters are ignored
Thus, this works, even though the format stops after %m and leaves out the closing ') of the input:
as.Date("(u'9', u'2005', u'06')", format = "(u'%d', u'%Y', u'%m")
# [1] "2005-06-09"
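The same idea can be applied to a whole column. This is only a sketch: the data frame below is a made-up stand-in for my_data from the question (the second row is invented), and it assumes the strings have no surrounding parentheses, as in the question's example.

```r
# Hypothetical stand-in for my_data; the real column may look different.
my_data <- data.frame(
  date = c("u'9', u'2005', u'06'", "u'23', u'2011', u'12'"),
  stringsAsFactors = FALSE
)

# Literal characters in the format (u, ', the comma) must match the input.
# The closing quote after %m is left out because trailing input is ignored.
my_data$parsed <- as.Date(my_data$date, format = "u'%d', u'%Y', u'%m")

print(my_data$parsed)
```

Any row that does not follow this exact pattern should come back as NA, so it is worth checking anyNA(my_data$parsed) afterwards.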
Upvotes: 4
Reputation: 10401
Try this:
as.Date(gsub("[u',()]", "", my_data$date), format = "%d %Y %m")
Example with a single string:
d <- "(u'9', u'2005', u'06')"
d <- gsub("[u',()]", "", d)  # leaves "9 2005 06"
d.date <- as.Date(d, "%d %Y %m")
Result:
d.date
[1] "2005-06-09"
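A variation on the same idea, in case the surrounding characters are not always the same: instead of listing the characters to delete, keep only the runs of digits. This is a sketch using base R only. The helper name clean_date_string is made up for illustration, and it assumes every date part is numeric.

```r
# Hypothetical helper: keep only the digit runs, joined by single spaces.
clean_date_string <- function(x) {
  digit_runs <- regmatches(x, gregexpr("[0-9]+", x))
  vapply(digit_runs, paste, character(1), collapse = " ")
}

d <- c("(u'9', u'2005', u'06')", "u'9', u'2005', u'06'")
cleaned <- clean_date_string(d)  # both elements become "9 2005 06"
d.date <- as.Date(cleaned, format = "%d %Y %m")

print(d.date)
```

Because it does not depend on which characters surround the numbers, the same call handles the input with or without the parentheses.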
Upvotes: 1
Reputation: 1751
If the data is of class character, you can try:
library(lubridate)
test <- c("u'9'", "u'2005'", "u'06'")
dym(paste(gsub("u|'", "", test), collapse = "/"))
[1] "2005-06-09 UTC"
Here I remove the "u" and ' characters with gsub, paste the three parts into one string, and let lubridate's dym (day, year, month) parse it as a date.
The collapse character I used in paste is arbitrary; lubridate can handle pretty much anything as a separator between date parts.
Upvotes: 0