Reputation: 21564
I am fairly new to Go, and this is the first time I have had to deal with regexp.
I am a bit surprised that someRegex.FindAllStringSubmatch("somestring", -1) returns a slice of slices ([][]string) instead of a simple slice of strings ([]string).
Example:
someRegex, _ := regexp.Compile("^.*(mes).*$")
matches := someRegex.FindAllStringSubmatch("somestring", -1)
fmt.Println(matches) // logs [[somestring mes]]
What is the reason for this behavior? I can't figure it out.
Upvotes: 5
Views: 12907
Reputation: 626747
The func (*Regexp) FindAllStringSubmatch method extracts matches and captured submatches.
A submatch is the part of the text matched by a part of the regex that is enclosed in a pair of unescaped parentheses (a so-called capturing group).
In your case, ^.*(mes).*$ matches:
^ - start of string
.* - any 0+ chars, as many as possible
(mes) - Capturing group 1: a mes substring
.*$ - the rest of the string
So, the match value is the whole string. It will be the first value in the output. Then, since there is a capturing group, there must be a place for it in the results, hence mes is placed as the second item in the list.
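Since there may be more than one match, we need a list of lists: the outer list holds one entry per match, and each inner list holds that match's whole value followed by its group values.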
A better example may be one that extracts several matches and submatches (and maybe an optional group, too):
package main

import (
	"fmt"
	"regexp"
)

func main() {
	someRegex, _ := regexp.Compile(`[^aouiye]([aouiye])([^aouiye])?`)
	matches := someRegex.FindAllStringSubmatch("somestri", -1)
	fmt.Printf("%q\n", matches)
}
The [^aouiye]([aouiye])([^aouiye])? pattern matches a non-vowel, a vowel, and an optional non-vowel, capturing the last two into separate groups #1 and #2.
The results are [["som" "o" "m"] ["ri" "i" ""]]. There are 2 matches, and each contains a match value, a Group 1 value and a Group 2 value. Since the ri match has no text captured into Group 2 (([^aouiye])?), that value is an empty string, but it is still there since the group is defined in the regex pattern.
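As a rough sketch of how the nested result is typically consumed, the snippet below loops over the outer slice (one entry per match) and indexes into each inner slice (whole match, then groups). An empty string alone cannot tell you whether an optional group matched nothing or did not take part at all, so the sketch also uses FindAllStringSubmatchIndex, which reports negative indices for a group that did not participate. The names vowelRe and groupParticipated are made up for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// vowelRe is the pattern from the answer above (the variable name is illustrative).
var vowelRe = regexp.MustCompile(`[^aouiye]([aouiye])([^aouiye])?`)

// groupParticipated is a hypothetical helper: for each match in s it reports
// whether the given capturing group took part in that match.
func groupParticipated(s string, group int) []bool {
	var out []bool
	for _, loc := range vowelRe.FindAllStringSubmatchIndex(s, -1) {
		// loc holds index pairs: loc[2*n] and loc[2*n+1] are the start and
		// end of group n; both are negative if group n did not participate.
		out = append(out, loc[2*group] >= 0)
	}
	return out
}

func main() {
	// Outer slice: one element per match. Inner slice: m[0] is the whole
	// match, m[1] and m[2] are the values of groups 1 and 2.
	for i, m := range vowelRe.FindAllStringSubmatch("somestri", -1) {
		fmt.Printf("match %d: whole=%q group1=%q group2=%q\n", i, m[0], m[1], m[2])
	}

	// Did group 2 participate in each match?
	fmt.Println(groupParticipated("somestri", 2))
}
```

Every inner slice has the same length (number of groups plus one), so indexing m[2] is safe here even for the match where Group 2 is empty.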
Upvotes: 10
Reputation: 76434
FindAllStringSubmatch is the 'All' version of FindStringSubmatch; it returns a slice of all successive matches of the expression, as defined by the 'All' description in the package comment. A return value of nil indicates no match.
Docs.
To sum up: you need a slice of slices of strings ([][]string) because this is the 'All' version of FindStringSubmatch. FindStringSubmatch returns a single slice of strings ([]string).
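To make the difference concrete, here is a small sketch that runs the regex from the question through three related methods of *regexp.Regexp. If a flat []string is what you want, FindAllString (whole matches only, no groups) or FindStringSubmatch (first match only, with groups) may be closer to what you expected. The wrapper function names are invented for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// mesRe is the pattern from the question (the variable name is illustrative).
var mesRe = regexp.MustCompile(`^.*(mes).*$`)

// wholeMatches returns every whole match and ignores capturing groups.
func wholeMatches(s string) []string {
	return mesRe.FindAllString(s, -1)
}

// firstMatchWithGroups returns the first match followed by its group values,
// or nil if there is no match.
func firstMatchWithGroups(s string) []string {
	return mesRe.FindStringSubmatch(s)
}

// allMatchesWithGroups returns one inner slice per match; each inner slice
// holds the whole match followed by its group values.
func allMatchesWithGroups(s string) [][]string {
	return mesRe.FindAllStringSubmatch(s, -1)
}

func main() {
	fmt.Printf("%q\n", wholeMatches("somestring"))         // ["somestring"]
	fmt.Printf("%q\n", firstMatchWithGroups("somestring")) // ["somestring" "mes"]
	fmt.Printf("%q\n", allMatchesWithGroups("somestring")) // [["somestring" "mes"]]
}
```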
Upvotes: 3