Reputation: 79
I am trying to extract an ID that is part of a string within a column in R. I would like to write an expression that extracts the part starting with IAB and ending in a number. How would I do this?
Sample strings:
[31] "{\"\"element\"\":\"\"IAB1_4\"\"}"
[32] "{\"\"element\"\":\"\"IAB19_3\"\"}"
[33] "{\"\"element\"\":\"\"IAB19_16\"\"}"
[34] "{\"\"element\"\":\"\"IAB9_11\"\"}"
[35] "{\"\"element\"\":\"\"IAB19_5\"\"}"
[36] "{\"\"element\"\":\"\"IAB18_1\"\"}"
I need to extract just the part that starts with IAB and ends in a number (for example, IAB19_16). How could I do this?
Upvotes: 1
Views: 7682
Reputation: 887971
We can use str_extract to match the string 'IAB' followed by one or more digits (\\d+), an underscore (_), and one or more digits (\\d+)
library(stringr)
str_extract(v1, 'IAB\\d+_\\d+')
#[1] "IAB1_4" "IAB19_3" "IAB19_16" "IAB9_11" "IAB19_5" "IAB18_1"
Or with regexpr and regmatches from base R
regmatches(v1, regexpr('IAB\\d+_\\d+', v1))
#[1] "IAB1_4" "IAB19_3" "IAB19_16" "IAB9_11" "IAB19_5" "IAB18_1"
v1 <- c("{\"\"element\"\":\"\"IAB1_4\"\"}", "{\"\"element\"\":\"\"IAB19_3\"\"}",
"{\"\"element\"\":\"\"IAB19_16\"\"}", "{\"\"element\"\":\"\"IAB9_11\"\"}",
"{\"\"element\"\":\"\"IAB19_5\"\"}", "{\"\"element\"\":\"\"IAB18_1\"\"}"
)
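One caveat when the strings live in a data frame column: regmatches() combined with regexpr() drops elements that have no match, so its result can be shorter than the column, whereas str_extract() returns NA for those elements. Below is a minimal base R sketch that keeps the result aligned with the input. The helper name extract_iab, the data frame df, and its column raw are made up for illustration and are not from the question.

```r
# Hypothetical helper: returns one value per input element,
# with NA where the pattern is not found.
extract_iab <- function(x, pattern = "IAB\\d+_\\d+") {
  m <- regexpr(pattern, x)            # -1 where there is no match
  hit <- !is.na(m) & m != -1          # elements that actually matched
  out <- rep(NA_character_, length(x))
  out[hit] <- regmatches(x, m)        # regmatches() returns only the matches
  out
}

# Hypothetical data frame with one row that has no ID
df <- data.frame(
  raw = c("{\"\"element\"\":\"\"IAB1_4\"\"}",
          "no id here",
          "{\"\"element\"\":\"\"IAB19_16\"\"}"),
  stringsAsFactors = FALSE
)

df$id <- extract_iab(df$raw)
df$id
#[1] "IAB1_4"   NA         "IAB19_16"
```

If every row is guaranteed to contain an ID, the shorter regmatches() call above is enough; the extra bookkeeping only matters when some rows may not match.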
Upvotes: 3