Reputation: 29
I have a large column containing strings such as:
20-1843PA-HY-4563-214DF
The "20" is the century and the "18" is the year. What is the simplest way to extract these two parts with a function in R and get 2018 as the output?
Upvotes: 0
Views: 116
I would do something like this:
chr_column <- "20-1843PA-HY-4563-214DF"
# Split on "-" and keep the first two pieces: "20" and "1843PA"
parts <- unlist(strsplit(chr_column, "-"))[1:2]
# Join the century with the first two characters of the second piece
chr_year <- as.numeric(paste0(parts[1], strtrim(parts[2], width = 2)))
chr_year
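Since the question mentions a whole column, the same split-and-paste idea can probably be wrapped in a function that takes a character vector. This is only a minimal sketch: `ids`, its second value, and the helper name `extract_year` are hypothetical and not part of the original post.

```r
# Hypothetical input vector; only the first value comes from the question.
ids <- c("20-1843PA-HY-4563-214DF", "19-9912AB-XY-0001-000ZZ")

# Split each string on "-", then join piece 1 with the first two
# characters of piece 2 and convert the result to a number.
extract_year <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  vapply(parts,
         function(p) as.numeric(paste0(p[1], substr(p[2], 1, 2))),
         numeric(1))
}

extract_year(ids)
```

`vapply` is used here instead of `sapply` so that the result is always a numeric vector with one element per input string.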
Upvotes: 0
Reputation: 887098
We can use sub to capture the digits as a group from the start (^) of the string followed by the -, then capture the two digits ((\\d{2})) and replace with the backreferences (\\1\\2) of the captured groups.
f1 <- function(nm) as.numeric(sub("^(\\d+)-(\\d{2}).*", "\\1\\2", nm))
str1 <- "20-1843PA-HY-4563-214DF"
f1(str1)
#[1] 2018
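Because sub is vectorized, the same function should work on an entire column without a loop. A minimal sketch, assuming a hypothetical data frame `df` with an `id` column (both names, and the second row, are made up for illustration):

```r
# Same definition as in the answer, repeated so the snippet runs on its own.
f1 <- function(nm) as.numeric(sub("^(\\d+)-(\\d{2}).*", "\\1\\2", nm))

# Hypothetical data frame; `id` and `year` are illustrative column names.
df <- data.frame(
  id = c("20-1843PA-HY-4563-214DF", "19-9912AB-XY-0001-000ZZ"),
  stringsAsFactors = FALSE
)

# Apply the function to the whole column at once.
df$year <- f1(df$id)
df$year
```

Note that sub returns a string unchanged when the pattern does not match, so malformed entries would come back as NA (with a coercion warning) rather than raising an error.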
Upvotes: 1