hiroru

Reputation: 129

Go: get matching substrings from a string

I'm trying to extract all words from a string which are between quotes.

Here's my current code:

func StrExtract(word string) []string {
  r, _ := regexp.Compile(`".*"`)
  result := r.FindAllString(word, -1)
  RemoveDuplicates(&result)
  return (result)
}


With an input like:

`Hi guys, this is a "test" and a "demo" ok?`

I get the output:

["test" and a "demo"]

But I'd like to get:

[test demo]

Please help me fix this, or suggest better alternatives.

Upvotes: 2

Views: 3657

Answers (2)

Andris Leduskrasts

Reputation: 1230

To keep it simple, make the quantifier lazy: use .*? so the regex becomes ".*?". You are getting "test" and a "demo" because a plain .* is greedy and matches as much text as possible, so it matches from the " before test to the " after demo and ignores the quotes in between.
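To see the difference on the input from the question, here is a minimal sketch. The helper names findGreedy and findLazy are only illustrative; they are not from the original code.

```go
package main

import (
	"fmt"
	"regexp"
)

// Greedy: .* runs to the last quote it can reach on the line.
var greedyRe = regexp.MustCompile(`".*"`)

// Lazy: .*? stops at the first closing quote it can reach.
var lazyRe = regexp.MustCompile(`".*?"`)

// findGreedy and findLazy are hypothetical helpers for this comparison.
func findGreedy(s string) []string { return greedyRe.FindAllString(s, -1) }
func findLazy(s string) []string   { return lazyRe.FindAllString(s, -1) }

func main() {
	input := `Hi guys, this is a "test" and a "demo" ok?`
	fmt.Printf("%q\n", findGreedy(input)) // one match spanning both quoted words
	fmt.Printf("%q\n", findLazy(input))   // two matches, quotes still included
}
```

Note that the lazy version still returns the surrounding quotes, because FindAllString returns whole matches.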

A usually better, if slightly more involved, approach is a negated character class: "[^"]*", which cannot match a quote between the delimiters. Unlike ., a negated class also matches newlines, so a quoted string may span several lines. If you don't want that, exclude newlines as well with "[^"\n]*"; otherwise keep the plain version.
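A short sketch of the newline behavior, assuming an input where the second quoted string spans two lines. The helper names matchAcrossLines and matchSameLine are made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Anything except a quote, newlines included.
	anyButQuote = regexp.MustCompile(`"[^"]*"`)
	// Anything except a quote or a newline.
	sameLineOnly = regexp.MustCompile(`"[^"\n]*"`)
)

// matchAcrossLines and matchSameLine are hypothetical helpers.
func matchAcrossLines(s string) []string { return anyButQuote.FindAllString(s, -1) }
func matchSameLine(s string) []string    { return sameLineOnly.FindAllString(s, -1) }

func main() {
	input := "say \"one\" and \"two\nlines\""
	fmt.Printf("%q\n", matchAcrossLines(input)) // both quoted strings
	fmt.Printf("%q\n", matchSameLine(input))    // only the single-line one
}
```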

Since you also want to drop the quotes, a little more work is needed. In regex flavors that support lookarounds you could write (?<=")[^"]*(?="), but Go's regexp package uses RE2 syntax, which has no lookarounds, so that pattern will not compile in Go. Use a capture group instead, either "(.*?)" or "([^"]*)", and read the contents of the group rather than the whole match.
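A minimal sketch of the capture group route in Go, using FindAllStringSubmatch from the standard regexp package. The names extractQuoted and lookaroundCompiles are hypothetical; the second one only demonstrates that the lookaround pattern is rejected at compile time.

```go
package main

import (
	"fmt"
	"regexp"
)

var quotedRe = regexp.MustCompile(`"([^"]*)"`)

// extractQuoted (hypothetical name) returns the text inside each pair
// of double quotes, without the quotes themselves.
func extractQuoted(s string) []string {
	matches := quotedRe.FindAllStringSubmatch(s, -1)
	out := make([]string, 0, len(matches))
	for _, m := range matches {
		// m[0] is the whole match with quotes; m[1] is capture group 1.
		out = append(out, m[1])
	}
	return out
}

// lookaroundCompiles reports whether Go accepts the lookaround pattern.
func lookaroundCompiles() bool {
	_, err := regexp.Compile(`(?<=")[^"]*(?=")`)
	return err == nil
}

func main() {
	fmt.Println(lookaroundCompiles()) // false
	fmt.Println(extractQuoted(`Hi guys, this is a "test" and a "demo" ok?`)) // [test demo]
}
```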

Upvotes: 2

Doro

Reputation: 785

Regex:

"(.*?)"

Here is an online demo: https://regex101.com/r/sI4tA9/1

All that's left is to collect the captured groups from the matches. Unfortunately I don't know Go well enough to help with that part.
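For the Go side, one possible way to finish the job is sketched below. It keeps the StrExtract name from the question, but since the RemoveDuplicates helper is not shown there, the de-duplication here is an assumption: it uses a map and keeps the first occurrence of each word in order.

```go
package main

import (
	"fmt"
	"regexp"
)

// Compiled once at package level instead of on every call.
var quoted = regexp.MustCompile(`"(.*?)"`)

// StrExtract returns the distinct quoted words in word, without quotes.
// The in-line de-duplication stands in for the question's RemoveDuplicates,
// whose actual behavior is not shown.
func StrExtract(word string) []string {
	seen := make(map[string]bool)
	result := []string{}
	for _, m := range quoted.FindAllStringSubmatch(word, -1) {
		if !seen[m[1]] {
			seen[m[1]] = true
			result = append(result, m[1]) // m[1] is the capture group
		}
	}
	return result
}

func main() {
	fmt.Println(StrExtract(`a "test", a "demo" and another "test"`)) // [test demo]
}
```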

Upvotes: 2
