Reputation: 313
Consider a text file like this:
Some text
here.
---
More text
another line.
---
Third part of text.
I want to split it into three parts, divided by the --- separator. The parts should be stored in a map.
Now, the exact same program with two different map value types.
When I use string, everything works fine:
KEY: 0
Some text
here.
KEY: 1
More text
another line.
KEY: 2
Third part of text.
https://play.golang.org/p/IcGdoUNcTEe
When I use []byte, things get messed up:
KEY: 0
Third part of teKEY: 1
Third part of text.
ne.
KEY: 2
Third part of text.
https://play.golang.org/p/jqLhCrqsvOs
Why?
Program 1 (string):
func main() {
	parts := parseParts([]byte(input))
	for k, v := range parts {
		fmt.Printf("KEY: %d\n%s", k, v)
	}
}
func parseParts(input []byte) map[int]string {
	parts := map[int]string{}
	s := bufio.NewScanner(bytes.NewReader(input))
	buf := bytes.Buffer{}
	i := 0
	for s.Scan() {
		if s.Text() == "---" {
			parts[i] = buf.String()
			buf.Reset()
			i++
			continue
		}
		buf.Write(s.Bytes())
		buf.WriteString("\n")
	}
	parts[i] = buf.String()
	return parts
}
Program 2 ([]byte):
func main() {
	parts := parseParts([]byte(input))
	for k, v := range parts {
		fmt.Printf("KEY: %d\n%s", k, v)
	}
}
func parseParts(input []byte) map[int][]byte {
	parts := map[int][]byte{}
	s := bufio.NewScanner(bytes.NewReader(input))
	buf := bytes.Buffer{}
	i := 0
	for s.Scan() {
		if s.Text() == "---" {
			parts[i] = buf.Bytes()
			buf.Reset()
			i++
			continue
		}
		buf.Write(s.Bytes())
		buf.WriteString("\n")
	}
	parts[i] = buf.Bytes()
	return parts
}
Upvotes: 0
Views: 464
Reputation: 34031
In the string version, parts[i] = buf.String() sets parts[i] to a new string every time. In the []byte version, parts[i] = buf.Bytes() sets parts[i] to a byte slice backed by the same array every time, because buf.Reset() empties the buffer but keeps its underlying storage, so later writes overwrite the earlier data. All three slices therefore share one backing array, but each keeps the length it had when it was created, which is why all three show the same content, cut off at different places.
You could replace the byte slice line
parts[i] = buf.Bytes()
with something like this:
bb := buf.Bytes()
b := make([]byte, len(bb))
copy(b, bb)
parts[i] = b
in order to get the behavior to match the string version. But the string version is easier and better matches what you seem to be trying to do.
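For reference, here is a minimal sketch of the whole []byte program with that copy applied. The function name parsePartsBytes, the save helper, and the inline input are assumptions made for illustration; the splitting logic follows the question's parseParts.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// parsePartsBytes splits input on lines equal to "---" and stores each part
// as its own []byte. The name is hypothetical; the logic mirrors the
// question's parseParts.
func parsePartsBytes(input []byte) map[int][]byte {
	parts := map[int][]byte{}
	s := bufio.NewScanner(bytes.NewReader(input))
	buf := bytes.Buffer{}
	i := 0
	// save copies the buffer's current contents into a fresh slice, so later
	// writes to buf cannot change what is already stored in the map.
	save := func() {
		bb := buf.Bytes()
		b := make([]byte, len(bb))
		copy(b, bb)
		parts[i] = b
	}
	for s.Scan() {
		if s.Text() == "---" {
			save()
			buf.Reset()
			i++
			continue
		}
		buf.Write(s.Bytes())
		buf.WriteString("\n")
	}
	save()
	return parts
}

func main() {
	// Assumed input, taken from the sample text in the question.
	input := "Some text\nhere.\n---\nMore text\nanother line.\n---\nThird part of text.\n"
	parts := parsePartsBytes([]byte(input))
	// Iterate by index: ranging over a map visits keys in unspecified order.
	for k := 0; k < len(parts); k++ {
		fmt.Printf("KEY: %d\n%s", k, parts[k])
	}
}
```

Each map entry now owns its memory, so the output should match the string version.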
Upvotes: 5
Reputation: 2329
The difference is that bytes.Buffer.String copies the memory, while bytes.Buffer.Bytes does not. Quoting the documentation,
The slice is valid for use only until the next buffer modification (that is, only until the next call to a method like Read, Write, Reset, or Truncate).
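A small sketch of that difference, outside the question's program. The helper name snapshot and the sample strings are made up for illustration. Note that reading a slice from Bytes after a later write is exactly what the documentation says not to rely on; it is done here only to show the aliasing.

```go
package main

import (
	"bytes"
	"fmt"
)

// snapshot writes first into a buffer, captures the contents with both
// String and Bytes, then resets the buffer and writes second. It returns
// what each captured value holds afterwards. (Hypothetical helper.)
func snapshot(first, second string) (fromString, fromBytes string) {
	var buf bytes.Buffer
	buf.WriteString(first)
	s := buf.String() // independent copy of the contents
	b := buf.Bytes()  // aliases the buffer's internal storage
	buf.Reset()       // empties the buffer but keeps the storage
	buf.WriteString(second)
	return s, string(b)
}

func main() {
	s, b := snapshot("Some text", "MORE")
	fmt.Printf("String: %q\n", s)
	fmt.Printf("Bytes:  %q\n", b)
}
```

The value taken with String stays "Some text". With the current standard library implementation, the slice taken with Bytes should read "MORE text": the second write lands in the same storage and overwrites its first four bytes.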
Upvotes: 3