Reputation: 1467
I have a data frame in R. One column contains dates stored as strings in the format Dec-06, Jan-90, Feb-76, etc. How can I extract the year as a four-digit value: 2006, 1990, 1976, etc.? I want to discard the month and keep only the year, so that I can treat this column as a continuous variable in my logistic regression.
I tried several of the date tools available in R, such as the POSIX classes and lubridate, but was not able to extract the year.
Any idea?
Upvotes: 1
Views: 366
Reputation: 886948
Here is another option with zoo (the year() call below comes from data.table):
library(zoo)
data.table::year(as.yearmon("Dec-06", "%b-%y"))
#[1] 2006
Or, as @G.Grothendieck mentioned, as.integer returns the year:
as.integer(as.yearmon("Dec-06", "%b-%y"))
#[1] 2006
Upvotes: 0
Reputation: 11128
This is easy with lubridate, which provides the year function:
library(lubridate)
dat <- data.frame(x=c("Mar-06","Jan-90","May-76"))
dat$date <- as.POSIXlt(paste0("01-",tolower(dat$x)),format="%d-%b-%y",origin="1970-01-01")
dat$year <- year(dat$date)
Answer:
> dat
       x       date year
1 Mar-06 2006-03-01 2006
2 Jan-90 1990-01-01 1990
3 May-76 1976-05-01 1976
Upvotes: 1
Reputation: 32548
format(as.Date(gsub(".*-","","Dec-06"), format = "%y"), "%Y")
#[1] "2006"
OR
library(lubridate)
format(myd(paste("Dec-06","-01",sep="")), "%Y")
#[1] "2006"
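One thing to keep in mind with %y: R's date parsing expands two-digit years with a fixed pivot, so 00 to 68 become 20xx and 69 to 99 become 19xx. A small sketch of that behaviour follows; the vector name yy and the result name pivot are only illustrative, and %m is used instead of %b so the example does not depend on the locale.

```r
# Two-digit years on either side of the pivot (illustrative values)
yy <- c("68", "69")

# Parse with a fixed day and month, then print the four-digit year
pivot <- format(as.Date(paste0("01-01-", yy), format = "%d-%m-%y"), "%Y")
pivot
# [1] "2068" "1969"
```

If the data can contain years before 1969, the century would need to be corrected after parsing.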
Upvotes: 4
Reputation: 388817
We convert the string into a Date class and then extract only the year from it.
format(as.Date(paste0("01-", x), "%d-%b-%y"), "%Y")
#[1] "2006" "1990" "1976"
data
x <- c("Dec-06", "Jan-90", "Feb-76")
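Since format() returns character strings, the result still needs converting before it can be used as a continuous predictor. A minimal sketch of applying the same idea to a data frame column, assuming the column is called x (a made-up name) and that month abbreviations are parsed in an English locale:

```r
# Hypothetical data frame; the column name `x` is an assumption
dat <- data.frame(x = c("Dec-06", "Jan-90", "Feb-76"), stringsAsFactors = FALSE)

# Prepend a dummy day so the string parses as a full Date
d <- as.Date(paste0("01-", dat$x), format = "%d-%b-%y")

# Extract the four-digit year and store it as an integer column
dat$year <- as.integer(format(d, "%Y"))
dat$year
# [1] 2006 1990 1976
```

The integer column dat$year can then be passed to a model formula like any other numeric variable.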
Upvotes: 3