thequietus

Reputation: 129

Match a substring with characters, digits and spaces with gsub

I have a string like:

a <- '{:name=>"krill", :priority=>2, :count=>1}, {:name=>"vit a", :priority=>2]}, {:name=>"vit-b", :priority=>2, :count=>1}, {:name=>"vit q10", :priority=>2]}'

I would like to use str_match to extract the values between ':name=>"' and the closing '"':

krill
vit a
vit-b
vit q10

So far I tried:

str_match(a, ':name=>\\"([A-Za-z]{3})')

But it doesn't work.

Any help is appreciated

Upvotes: 0

Views: 72

Answers (2)

s_baldur

Reputation: 33498

Using stringr and a positive lookbehind:

library(stringr)
str_match_all(a, '(?<=:name=>")[^"]+')

[[1]]
     [,1]     
[1,] "krill"  
[2,] "vit a"  
[3,] "vit-b"  
[4,] "vit q10"

Upvotes: 0

Wiktor Stribiżew

Reputation: 626952

You may extract those values with

> regmatches(a, gregexpr(':name=>"\\K[^"]+', a, perl=TRUE))
[[1]]
[1] "krill"   "vit a"   "vit-b"   "vit q10"

The :name=>"\\K[^"]+ pattern matches

  • :name=>" - a literal substring
  • \K - match reset operator: discards the text matched so far from the returned match
  • [^"]+ - one or more chars other than ".
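To see what \K changes, the same search can be run with and without it. This is only a minimal sketch on a shortened copy of the string from the question; the variable names with_prefix and names_only are illustrative and not part of the original answer.

```r
# Shortened copy of the input string from the question
a <- '{:name=>"krill", :priority=>2, :count=>1}, {:name=>"vit a", :priority=>2]}'

# Without \K, the literal prefix stays in each match
with_prefix <- regmatches(a, gregexpr(':name=>"[^"]+', a, perl = TRUE))[[1]]

# With \K, everything matched before it is dropped from the returned match
names_only <- regmatches(a, gregexpr(':name=>"\\K[^"]+', a, perl = TRUE))[[1]]

print(with_prefix)
print(names_only)
```

\K is a PCRE feature, so perl = TRUE is needed for the second call.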

If you need to use the stringr package, use str_extract_all:

> library(stringr)
> str_extract_all(a, '(?<=:name=>")[^"]+')
[[1]]
[1] "krill"   "vit a"   "vit-b"   "vit q10"

In (?<=:name=>")[^"]+, the (?<=:name=>") part is a positive lookbehind: it matches any location that is immediately preceded by :name=>" without adding that text to the match.
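If the str_match style from the question is preferred, a capturing group is one possible alternative to the lookbehind. The sketch below assumes stringr is installed; name_values is an illustrative variable name. It uses str_match_all because str_match returns only the first match, and [^"]+ instead of [A-Za-z]{3} so that spaces, digits and hyphens are accepted.

```r
library(stringr)

# Input string from the question
a <- '{:name=>"krill", :priority=>2, :count=>1}, {:name=>"vit a", :priority=>2]}, {:name=>"vit-b", :priority=>2, :count=>1}, {:name=>"vit q10", :priority=>2]}'

# str_match_all returns a list with one matrix per input string:
# column 1 holds the full match, column 2 the first capturing group
m <- str_match_all(a, ':name=>"([^"]+)"')[[1]]

# Keep only the captured values
name_values <- m[, 2]
print(name_values)
```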

Upvotes: 2
