Reputation: 1
I have a variable in a data frame that contains raw JSON text. Some observations contain a 14-digit number that I want to extract, and some don't. When an observation has this information, it appears in this format:
{"blur": "10010010010010"
I want to extract the 14 digits after {"blur": " when the string contains that left-hand part. I tried str_extract, but my regex syntax is not the best. Any suggestions?
Upvotes: 0
Views: 971
Reputation: 269481
If it's fully formed JSON you could use a JSON parser, but assuming it is not (as with the fragment shown in the question), then try this.
The second argument to strapply is the regular expression. It returns the portion matched to the capture group, i.e. the part of the regular expression within parentheses. The empty=NA argument tells it what to return if no occurrences are found.
library(gsubfn)
s <- c('{"blur": "10010010010010"', 'abc') # test input
strapply(s, '{"blur": "(\\d+)"', empty = NA, simplify = TRUE)
## [1] "10010010010010" NA
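If adding a package is not an option, the same capture-group idea can be sketched in base R with regexec and regmatches. This is only an illustration, not part of the gsubfn approach above: the variable names m and blur are made up for the example, the opening brace is escaped so the pattern does not depend on how the regex engine treats a leading {, and \\d+ is tightened to exactly 14 digits on the assumption that shorter or longer runs should not match.

```r
s <- c('{"blur": "10010010010010"', 'abc')  # same test input as above

# regexec() locates the full match and each capture group;
# regmatches() pulls the matched text out of each string.
# Elements with no match come back as character(0).
m <- regmatches(s, regexec('\\{"blur": "([0-9]{14})"', s))

# Keep the capture group (second element) when there is a match, otherwise NA.
blur <- vapply(m, function(x) if (length(x) >= 2) x[2] else NA_character_,
               character(1))
blur
## [1] "10010010010010" NA
```

For a data frame, s would be replaced by the column holding the raw text, and the result assigned to a new column.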
Upvotes: 1