Reputation: 87
Sorry if this is a duplicate, but the solutions I have seen do not solve my issue.
I have a data frame (df). One of its variables (df$Year) includes a list of years, such as:
> df$Year
Year
2001–
2013–
2016–
2003–
2012–2013
2013–
1993–2007, 2010–
When an entry contains multiple years, I want to keep only the last one (i.e. '2010' rather than '1993–2007, 2010–') and get rid of the '-'. I have tried:
unlist(str_extract_all(df$Year, "[[:digit:]]4$"))
but this does not seem to work.
Any hint?
Upvotes: 1
Views: 34
Reputation: 520968
We can use sub() for a one-liner:
df$Year <- sub(".*(\\d{4})\\–?", "\\1", df$Year)
df$Year
[1] "2001" "2013" "2016" "2003" "2013" "2013" "2010"
Note that the dashes in your year ranges appear to be en dashes (or maybe em dashes), not the regular ASCII hyphen, so the pattern has to match that exact character.
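As for why the original attempt fails: "[[:digit:]]4$" matches a single digit followed by a literal 4 at the very end of the string (the quantifier would need to be written {4}), and the trailing dash keeps the digits from sitting at the end anyway. If you would rather not depend on which dash character is in the data, a minimal base R sketch is to pull out every four-digit run and keep the last one. The vector `years` and the helper name `last_year` are made up for illustration, and the sample values are retyped from the question on the assumption that the separator is U+2013:

```r
# Sample values retyped from the question; \u2013 is an en dash (assumed).
years <- c("2001\u2013", "2013\u2013", "2016\u2013", "2003\u2013",
           "2012\u20132013", "2013\u2013", "1993\u20132007, 2010\u2013")

# Find every run of exactly four digits in each element.
matches <- regmatches(years, gregexpr("[0-9]{4}", years))

# Keep the last run per element; return NA when an element has no year at all.
last_year <- vapply(
  matches,
  function(m) if (length(m) > 0) m[length(m)] else NA_character_,
  character(1)
)

last_year
# [1] "2001" "2013" "2016" "2003" "2013" "2013" "2010"
```

This never mentions the dash in the pattern, so it behaves the same whether the data contains hyphens, en dashes or em dashes.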
Upvotes: 2