Reputation: 1642
I have the following data:
id code
1 I560
2 K980
3 R30
4 F500
5 650
I would like to do the following two things with the column code:
i) keep only the letter and the two digits that follow it, and
ii) remove the observations that do not start with a letter. In the end, the data frame should look like this:
id code
1 I56
2 K98
3 R30
4 F50
Upvotes: 1
Views: 26
Reputation: 887611
An option with substr from base R:
df1$code <- substr(df1$code, 1, 3)
df1[grepl('^[A-Z]', df1$code),]
# id code
#1 1 I56
#2 2 K98
#3 3 R30
#4 4 F50
# Input data from the question (define this before running the code above)
df1 <- structure(list(id = 1:5, code = c("I560", "K980", "R30", "F500",
"650")), row.names = c(NA, -5L), class = "data.frame")
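Note that substr simply takes the first three characters and does not check that the second and third are digits. If that matters, a slightly stricter variant could look like the minimal sketch below. The name `dat` and the pattern `^[A-Z][0-9]{2}` are my own assumptions for illustration, not part of the answer above.

```r
# Sample data as shown in the question (hypothetical object name `dat`)
dat <- data.frame(id = 1:5,
                  code = c("I560", "K980", "R30", "F500", "650"),
                  stringsAsFactors = FALSE)

# Keep only rows whose code starts with an uppercase letter
# followed by two digits (assumed rule, stricter than '^[A-Z]')
res <- dat[grepl("^[A-Z][0-9]{2}", dat$code), ]

# Truncate to the letter plus the two digits that follow it
res$code <- substr(res$code, 1, 3)

res
```

On the sample data this keeps rows 1 to 4, the same as the substr and grepl combination above. It would differ only for codes such as a letter followed by non-digits.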
Upvotes: 1
Reputation: 389175
In base R, you could do:
subset(transform(df, code = sub('([A-Z]\\d{2}).*', '\\1', code)),
       grepl('^[A-Z]', code))
Or using tidyverse functions:
library(dplyr)
library(stringr)
df %>%
  mutate(code = str_extract(code, '[A-Z]\\d{2}')) %>%
  filter(str_detect(code, '^[A-Z]'))
# id code
#1 1 I56
#2 2 K98
#3 3 R30
#4 4 F50
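To see why the filtering step is still needed in the base R version, it may help to look at the intermediate values. This is a small sketch on a plain character vector (the names `codes`, `trimmed` and `keep` are made up for illustration). sub leaves a string unchanged when the pattern does not match, so "650" survives the first step and is only dropped by grepl. In the stringr version, str_extract returns NA for such a value, and filter drops rows where the condition is NA.

```r
# Codes from the question, as a plain character vector
codes <- c("I560", "K980", "R30", "F500", "650")

# Step 1: replace "letter + two digits + anything" with just the captured group.
# Strings that do not match the pattern are returned unchanged.
trimmed <- sub("([A-Z]\\d{2}).*", "\\1", codes)
trimmed

# Step 2: flag the values that start with an uppercase letter
keep <- grepl("^[A-Z]", trimmed)

# Only the flagged values remain
trimmed[keep]
```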
Upvotes: 1