Reputation: 923
Hi, I have a data frame in the form shown below:
structure(list(ID = c(1, 2, 3, 4, 5, 6, 7), Date = c("20200230",
"20200422", "20100823", "20190801", "20130230", "20160230", "20150627"
)), class = "data.frame", row.names = c(NA, -7L))
  ID     Date
1  1 20200230
2  2 20200422
3  3 20100823
4  4 20190801
5  5 20130230
6  6 20160230
7  7 20150627
The dates in the Date column are not in a standard format; they are stored as strings in yyyymmdd form. How can I extract the year, month and day from the Date column and save them as separate new columns in the data frame, so that the result looks like this?
  ID     Date Year Month Day
1  1 20200230 2020    02  30
2  2 20200422 2020    04  22
3  3 20100823 ....................
4  4 20190801 ....................
5  5 20130230 ....................
6  6 20160230 ....................
7  7 20150627 ....................
I tried using format(as.Date(x, format="%YYYY%mm/%dd"),"%YYYY")
but it didn't work for me. I also tried the following code:
Data$Year <- year(ymd(Data$Date))
The result is in this form:
  ID     Date Year
1  1 20200230   NA
2  2 20200422 2020
3  3 20100823 2010
4  4 20190801 2019
5  5 20130230   NA
6  6 20160230   NA
7  7 20150627 2015
As mentioned by @neilfws, the reason I get NA is that the date is not valid (for example, 30 February does not exist); however, I really don't care about validity and I want to extract the year in any case.
Upvotes: 0
Views: 103
Reputation: 5788
Base R in one expression:
# If you want to keep the Date vector:
cbind(df,
      strcapture(pattern = "^(\\d{4})(\\d{2})(\\d{2})$",
                 x = df$Date,
                 proto = list(year = integer(), month = integer(), day = integer())))
# If you want to drop the Date vector:
cbind(within(df, rm(Date)),
      strcapture(pattern = "^(\\d{4})(\\d{2})(\\d{2})$",
                 x = df$Date,
                 proto = list(year = integer(), month = integer(), day = integer())))
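Note that integer() prototypes turn "02" into 2, whereas the desired output in the question shows the month as 02. A minimal sketch of one way to keep the leading zeros, assuming character columns are acceptable: use character() prototypes instead. The column names Year, Month and Day and the object name parts are chosen here for illustration; the sample data is copied from the question.

```r
# Sample data from the question (named `df` to match the code above).
df <- structure(list(ID = c(1, 2, 3, 4, 5, 6, 7),
                     Date = c("20200230", "20200422", "20100823", "20190801",
                              "20130230", "20160230", "20150627")),
                class = "data.frame", row.names = c(NA, -7L))

# character() prototypes keep each captured group as text, so "02" stays "02".
parts <- strcapture(pattern = "^(\\d{4})(\\d{2})(\\d{2})$",
                    x = df$Date,
                    proto = list(Year = character(), Month = character(),
                                 Day = character()))

# Add the three pieces as new columns next to ID and Date.
df[names(parts)] <- parts
df
```

Values that do not match the eight-digit pattern come back as NA in all three columns, so invalid calendar dates such as 20200230 are still split, but malformed strings are flagged.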
Upvotes: 1
Reputation: 33782
If you only want the year and are not concerned with date validation, the easiest solution is probably to extract the first 4 characters from Date and convert them to numeric.
Data$Year <- as.numeric(substring(Data$Date, 1, 4))
It might be good to add some kind of check on Date, e.g. that every value contains exactly 8 digits.
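As a rough sketch of that check, assuming Date is stored as character as in the question: grepl() can test that each value is exactly eight digits before anything is extracted. The same substring() approach then extends to the month and day. The helper name ok and the choice to keep Month and Day as text (so that "02" keeps its leading zero) are illustrative assumptions, not part of the original answer.

```r
# Sample data from the question.
Data <- structure(list(ID = c(1, 2, 3, 4, 5, 6, 7),
                       Date = c("20200230", "20200422", "20100823", "20190801",
                                "20130230", "20160230", "20150627")),
                  class = "data.frame", row.names = c(NA, -7L))

# Check that every value consists of exactly 8 digits; stop early otherwise.
ok <- grepl("^[0-9]{8}$", Data$Date)
stopifnot(all(ok))

# Slice the string by position; no calendar validation is performed.
Data$Year  <- as.numeric(substring(Data$Date, 1, 4))
Data$Month <- substring(Data$Date, 5, 6)  # kept as text so "02" is preserved
Data$Day   <- substring(Data$Date, 7, 8)
Data
```

Wrap Month and Day in as.numeric() as well if numbers are preferred over zero-padded strings.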
Upvotes: 3