Vishal Kumar

Reputation: 802

When does Go allocate for a string to []byte conversion?

var testString = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

//var testString = "ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ"

// Reuses one bytes.Buffer across all iterations.
func BenchmarkHashing900000000(b *testing.B) {
    var bufByte bytes.Buffer
    for i := 0; i < b.N; i++ {
        bufByte.WriteString(testString)
        Sum32(bufByte.Bytes())
        bufByte.Reset()
    }
}

// Converts the string to a new []byte on every iteration.
func BenchmarkHashingWithNew900000000(b *testing.B) {
    for i := 0; i < b.N; i++ {
        bytStr := []byte(testString)
        Sum32(bytStr)
    }
}

Test results:

With testString = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
BenchmarkHashing900000000-4         50000000            35.2 ns/op         0 B/op          0 allocs/op
BenchmarkHashingWithNew900000000-4  50000000            30.9 ns/op         0 B/op          0 allocs/op

With testString = "ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ"
BenchmarkHashing900000000-4         30000000            46.6 ns/op         0 B/op          0 allocs/op
BenchmarkHashingWithNew900000000-4  20000000            73.0 ns/op        64 B/op          1 allocs/op

Why is there an allocation in BenchmarkHashingWithNew900000000 when the string is long, but no allocation when the string is short?
Sum32: https://gowalker.org/github.com/spaolacci/murmur3
I am using Go 1.6.

Upvotes: 7

Views: 3813

Answers (2)

Francis Stephens

Reputation: 647

Your benchmarks are observing a curious optimisation by the Go compiler (version 1.8).

You can see the change from Dmitry Vyukov here

https://go-review.googlesource.com/c/go/+/3120

Unfortunately that is from a long time ago, when the compiler was written in C, and I am not sure where to find the optimisation in the current compiler. But I can confirm that it still exists, and that Dmitry's description of the change is accurate.

If you want a clearer, self-contained set of benchmarks to demonstrate this, I have a gist here.

https://gist.github.com/fmstephe/f0eb393c4ec41940741376ab08cbdf7e

If we look only at the second benchmark, BenchmarkHashingWithNew900000000, we can see a clear spot where it 'should' allocate.

bytStr := []byte(testString)

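This line must copy the contents of testString into a new []byte. In this case, however, the compiler can see that bytStr does not escape: nothing holds on to it after Sum32 returns. The copy can therefore be placed on the stack instead of the heap. Because strings can be arbitrarily large, a limit of 32 bytes is set for a stack-allocated string or []byte. Your 26-byte string fits within that limit; the 52-byte string does not, so it is heap allocated, and the 64 B/op in your results is those 52 bytes rounded up to the allocator's next size class.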
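To see where the cut-off falls, here is a minimal sketch that counts heap allocations for the conversion at a few string lengths. The names consume, sink and allocsForLen are made up for this illustration, and it assumes the 32-byte limit described above; the exact behaviour may differ between Go versions. The strings are built at run time because a constant string may be handled differently by the compiler.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// consume is a hypothetical stand-in for Sum32: it reads the slice but does
// not retain it, so the caller's slice does not escape.
//
//go:noinline
func consume(b []byte) int {
	n := 0
	for _, c := range b {
		n += int(c)
	}
	return n
}

// sink keeps the result of consume alive so the work is not optimised away.
var sink int

// allocsForLen reports the average number of heap allocations per
// string-to-[]byte conversion for a string of n bytes.
func allocsForLen(n int) float64 {
	s := strings.Repeat("A", n) // built at run time, not a compile-time constant
	return testing.AllocsPerRun(1000, func() {
		b := []byte(s)
		sink += consume(b)
	})
}

func main() {
	// 26 and 52 match the two strings in the question; 32 and 33 sit on
	// either side of the limit described in the answer.
	for _, n := range []int{26, 32, 33, 52} {
		fmt.Printf("len=%d allocs/op=%v\n", n, allocsForLen(n))
	}
}
```

If the limit applies as described, the lengths up to 32 should report zero allocations and the longer ones should report one each.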

It's worth being aware of this little trick, because it can be easy to trick yourself into believing some code does not allocate, if your benchmark strings are all short.

Upvotes: 5

Roland Illig

Reputation: 41617

When you write something into a bytes.Buffer, it allocates memory as needed. When you call bytes.Buffer.Reset, that memory is not freed but instead kept for later reuse. Your code does exactly that: it marks the buffer as empty and then fills it again.

There is in fact some allocation going on, but averaged over 50000000 iterations it is negligible and is reported as 0 allocs/op. If you move the declaration of bufByte into the for loop, you will get some allocations.

Upvotes: -1
