Reputation: 539
I am using the regexp package to find all matching substrings in Go, but I get an unexpected result. Here is my code:
package main

import (
	"fmt"
	"regexp"
)

func main() {
	str := "build: xxxxxx Prefix:middle#6\nPrefix:middle#16026Prefix:middle#1111\n Prefix:middle#110 Prefix:middle.#2 Prefix:middl.e#111 Prefix:middle#112"
	regexpStr := "\\bPrefix:([a-zA-Z0-9]+[\\w-.]+[^.])#[0-9]+"
	re := regexp.MustCompile(regexpStr)
	matchs := re.FindAllString(str, -1)
	fmt.Println(matchs)
}
You can run it at https://go.dev/play/p/XFSMW09MKxV.
Expected:
[Prefix:middle#6 Prefix:middle#110 Prefix:middl.e#111 Prefix:middle#112]
But I got:
[Prefix:middle#6 Prefix:middle#16026 Prefix:middle#110 Prefix:middl.e#111 Prefix:middle#112]
Why is Prefix:middle#16026 matched? Could someone tell me the reason and how to fix it? Thanks.
Here are the rules for what should match:
I want to extract Prefix:${middle}#${number} from a string.
${middle}
rules:
${number}
rules:
Prefix:${middle}#${number} can appear at the beginning or end of a string, or in the middle of a string, but:
/n
;/n
;Upvotes: 3
Views: 129
Reputation: 626747
You can use the following regex with regexp.FindAllStringSubmatch:
(?:\s|^)(Prefix:[a-zA-Z0-9][\w.-]*[^.]#\d+)(?:\s|$)
See the regex demo.
Note that this pattern only works after doubling every whitespace character in the string, because both whitespace boundaries, (?:\s|^) and (?:\s|$), are consuming patterns: the whitespace that ends one match is no longer available to start the next one, so consecutive matches are missed. Hence, regexp.MustCompile(`\s`).ReplaceAllString(str, "$0$0") or similar should be used before running the above regex.
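To make the note above concrete, here is a minimal sketch that runs the same pattern on a short input with and without doubled whitespace. The input string and the names boundedRe, spaceRe and extract are made up for illustration and are not part of the question's data.

```go
package main

import (
	"fmt"
	"regexp"
)

// boundedRe is the pattern from this answer; both boundaries consume a character.
var boundedRe = regexp.MustCompile(`(?:\s|^)(Prefix:[a-zA-Z0-9][\w.-]*[^.]#\d+)(?:\s|$)`)

// spaceRe matches a single whitespace character.
var spaceRe = regexp.MustCompile(`\s`)

// extract (hypothetical helper) returns Group 1 of every match in s.
func extract(s string) []string {
	var out []string
	for _, m := range boundedRe.FindAllStringSubmatch(s, -1) {
		out = append(out, m[1])
	}
	return out
}

func main() {
	// Illustrative input: three valid items separated by single spaces.
	s := "Prefix:middle#1 Prefix:middle#2 Prefix:middle#3"

	// The space after #1 is consumed by the first match, so #2 has no
	// leading boundary left and is skipped.
	fmt.Println(extract(s)) // [Prefix:middle#1 Prefix:middle#3]

	// After doubling, each item has its own leading and trailing whitespace.
	fmt.Println(extract(spaceRe.ReplaceAllString(s, "$0$0"))) // [Prefix:middle#1 Prefix:middle#2 Prefix:middle#3]
}
```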
Details:
(?:\s|^) - either a whitespace or start of string
(Prefix:[a-zA-Z0-9][\w.-]*[^.]#\d+) - Group 1:
  Prefix: - a fixed string
  [a-zA-Z0-9] - an alphanumeric
  [\w.-]* - zero or more letters, digits, underscores, dots or hyphens
  [^.] - a char other than .
  # - a # char
  \d+ - one or more digits
(?:\s|$) - either a whitespace or end of string
See the Go demo:
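package main

import (
	"fmt"
	"regexp"
)

func main() {
	str := "Prefix:middle#113 build: xxxxxx Prefix:middle#6\nPrefix:middle#16026Prefix:middle#1111\n Prefix:middle#110 Prefix:middle.#2 Prefix:middl.e#111 Prefix:middle#112"
	// Group 1 captures the item; the outer groups require whitespace or a string edge.
	re := regexp.MustCompile(`(?:\s|^)(Prefix:[a-zA-Z0-9][\w.-]*[^.]#\d+)(?:\s|$)`)
	// Double every whitespace character so adjacent items do not share a boundary.
	doubled := regexp.MustCompile(`\s`).ReplaceAllString(str, "$0$0")
	matches := re.FindAllStringSubmatch(doubled, -1)
	for _, m := range matches {
		fmt.Println(m[1]) // print Group 1 only, without the surrounding whitespace
	}
}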
Output:
Prefix:middle#113
Prefix:middle#6
Prefix:middle#110
Prefix:middl.e#111
Prefix:middle#112
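As a possible alternative to doubling whitespace, the sketch below splits the input on whitespace first and then tests each token against the Group 1 pattern anchored with ^ and $. This is a different technique from the one in this answer, not part of it. The names wholeTokenRe and extractTokens are hypothetical, and it assumes that strings.Fields (which splits on Unicode whitespace) is an acceptable stand-in for \s (which is ASCII-only in Go's regexp).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wholeTokenRe is the Group 1 pattern from this answer, anchored so that it
// must cover an entire whitespace-delimited token.
var wholeTokenRe = regexp.MustCompile(`^Prefix:[a-zA-Z0-9][\w.-]*[^.]#\d+$`)

// extractTokens (hypothetical helper) returns the tokens of s that match in full.
func extractTokens(s string) []string {
	var out []string
	for _, tok := range strings.Fields(s) {
		if wholeTokenRe.MatchString(tok) {
			out = append(out, tok)
		}
	}
	return out
}

func main() {
	str := "Prefix:middle#113 build: xxxxxx Prefix:middle#6\nPrefix:middle#16026Prefix:middle#1111\n Prefix:middle#110 Prefix:middle.#2 Prefix:middl.e#111 Prefix:middle#112"
	for _, m := range extractTokens(str) {
		fmt.Println(m)
	}
}
```

On the demo string this should print the same five items as the output above, since the glued token Prefix:middle#16026Prefix:middle#1111 and the token Prefix:middle.#2 both fail the anchored pattern.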
Upvotes: 2