Reputation: 549
I want to split a string on a regular expression, but preserve the matches.
I have tried splitting the string on a regex, but it throws away the matches. I have also tried using this, but I am not very good at translating code from one language to another, let alone from C#.
re := regexp.MustCompile(`\d`)
array := re.Split("ab1cd2ef3", -1)
I need the value of array to be ["ab", "1", "cd", "2", "ef", "3"], but the value of array is ["ab", "cd", "ef"]. No errors.
Upvotes: 4
Views: 2290
Reputation: 1
You can use a bufio.Scanner:
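package main

import (
	"bufio"
	"fmt"
	"strings"
)

// digit is a bufio.SplitFunc that emits each ASCII digit as its own token
// and each run of non-digit bytes as a single token.
func digit(data []byte, eof bool) (int, []byte, error) {
	for i, b := range data {
		if '0' <= b && b <= '9' {
			if i > 0 {
				// Emit the non-digit run that precedes the digit.
				return i, data[:i], nil
			}
			// The digit comes first: emit it on its own.
			return 1, data[:1], nil
		}
	}
	if eof && len(data) > 0 {
		// No digit left: emit the trailing text instead of dropping it.
		return len(data), data, nil
	}
	// Ask the scanner for more data.
	return 0, nil, nil
}

func main() {
	s := bufio.NewScanner(strings.NewReader("ab1cd2ef3"))
	s.Split(digit)
	for s.Scan() {
		fmt.Println(s.Text())
	}
}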
https://golang.org/pkg/bufio#Scanner.Split
Upvotes: 1
Reputation: 842
The kind of regex support described in the link you pointed to is not available in Go's regexp package. You can read the related discussion.
What you want to achieve (as per the sample given) can be done using regex to match digits or non-digits.
package main

import (
	"fmt"
	"regexp"
)

func main() {
	str := "ab1cd2ef3"
	r := regexp.MustCompile(`(\d|[^\d]+)`)
	fmt.Println(r.FindAllStringSubmatch(str, -1))
}
Playground: https://play.golang.org/p/L-ElvkDky53
Output:
[[ab ab] [1 1] [cd cd] [2 2] [ef ef] [3 3]]
Upvotes: 2
Reputation: 1
A crude solution: wrap each match in a separator, then split on that separator. This assumes the separator never occurs in the input.
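package main

import (
	"fmt"
	"regexp"
	"strings"
)

func main() {
	re := regexp.MustCompile(`\d+`)
	input := "ab1cd2ef3"
	sep := "|" // must not occur in the input
	indexes := re.FindAllStringIndex(input, -1)
	fmt.Println(indexes)
	move := 0 // bytes inserted so far, which shift the later match offsets
	for _, v := range indexes {
		p1 := v[0] + move
		p2 := v[1] + move
		input = input[:p1] + sep + input[p1:p2] + sep + input[p2:]
		move += 2 * len(sep)
	}
	// Drop the empty fields produced when a match touches either end of the input.
	var result []string
	for _, part := range strings.Split(input, sep) {
		if part != "" {
			result = append(result, part)
		}
	}
	fmt.Println(result)
}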
Upvotes: 0
Reputation: 156
I don't think this is possible with the current regexp package, but Split could easily be extended to behave this way.
This should work for your case:
// Split works like (*regexp.Regexp).Split but also keeps each match in the result.
func Split(re *regexp.Regexp, s string, n int) []string {
	if n == 0 {
		return nil
	}
	matches := re.FindAllStringIndex(s, n)
	parts := make([]string, 0, 2*len(matches)+1)
	beg := 0
	for _, match := range matches {
		if n > 0 && len(parts) >= n-1 {
			break
		}
		// Append the text before the match, skipping empty pieces.
		if match[0] > beg {
			parts = append(parts, s[beg:match[0]])
		}
		// This also appends the current match
		parts = append(parts, s[match[0]:match[1]])
		beg = match[1]
	}
	// Append whatever follows the last match, if anything.
	if beg < len(s) {
		parts = append(parts, s[beg:])
	}
	return parts
}
Upvotes: 0