Pete Garafano

Reputation: 4913

Buffer.BlockCopy vs Array.Copy curiosity

I've been toying around with some .NET features (namely Pipelines, Memory, and Array Pools) for high speed file reading/parsing. I came across something interesting while playing around with Array.Copy, Buffer.BlockCopy and ReadOnlySequence.CopyTo. The IO Pipeline reads data as byte and I'm attempting to efficiently turn it into char.

While playing around with Array.Copy I found that I am able to copy from byte[] to char[] and the compiler (and runtime) are more than happy to do it.

char[] outputBuffer = ArrayPool<char>.Shared.Rent(buffer.Length);
Array.Copy(buffer, 0, outputBuffer, 0, buffer.Length);

This code runs as expected, though I'm sure there are some UTF edge cases not properly handled here.

My curiosity is about Buffer.BlockCopy:

char[] outputBuffer = ArrayPool<char>.Shared.Rent(buffer.Length);
Buffer.BlockCopy(buffer, 0, outputBuffer, 0, buffer.Length);

The resulting contents of outputBuffer are garbage. For example, when buffer begins with

{ 50, 48, 49, 56, 45 }

the first elements of outputBuffer after the copy are

{ 12338, 14385, 12333, 11575, 14385 }

I'm just curious what is happening "under the hood" inside the CLR that causes these two calls to produce such different results.

Upvotes: 6

Views: 1924

Answers (1)

Hans Passant

Reputation: 941942

Array.Copy() is smarter about the element type. It will try to use the memmove() CRT function when it can, but falls back to a loop that copies each element when it can't, converting elements as necessary; it considers boxing and primitive type conversions. So one element in the source array becomes one element in the destination array.
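A minimal sketch of that element-wise behaviour, using the bytes from the question. The class and method names (ArrayCopyDemo, Widen) are made up for illustration; only Array.Copy itself comes from the framework.

```csharp
using System;

public static class ArrayCopyDemo
{
    // Hypothetical helper: copies byte[] into a new char[] with Array.Copy.
    // Each byte is widened to one char, so the lengths match element for element.
    public static char[] Widen(byte[] source)
    {
        var destination = new char[source.Length];
        Array.Copy(source, 0, destination, 0, source.Length);
        return destination;
    }

    public static void Main()
    {
        byte[] buffer = { 50, 48, 49, 56, 45 };
        char[] chars = Widen(buffer);
        Console.WriteLine(new string(chars)); // prints "2018-"
    }
}
```

Note that this is a plain numeric widening of each byte, not a text decoding step, so it only gives sensible characters for single-byte (ASCII range) input.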

Buffer.BlockCopy() skips all that and blasts with memmove(). No conversions are considered, which is why it can be slightly faster, and also why it is easier to be misled about the array content. Do note that the UTF-8 encoded character data is still visible in that array: each char holds two source bytes, low byte first, so 12338 == 0x3032 == "20", 14385 == 0x3831 == "18", etc. Easier to see with Debug > Windows > Memory > Memory 1.
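The byte-pair packing can be reproduced with a short sketch. The names BlockCopyDemo and Reinterpret are hypothetical, and the printed values assume a little-endian machine (which is what the numbers in the question imply).

```csharp
using System;

public static class BlockCopyDemo
{
    // Hypothetical helper: raw byte copy into a char[].
    // A char is two bytes, so the destination needs only half as many elements
    // (rounded up so an odd trailing byte still fits).
    public static char[] Reinterpret(byte[] source)
    {
        var destination = new char[(source.Length + 1) / 2];
        // The count argument of Buffer.BlockCopy is in bytes, not elements.
        Buffer.BlockCopy(source, 0, destination, 0, source.Length);
        return destination;
    }

    public static void Main()
    {
        byte[] buffer = { 50, 48, 49, 56 }; // ASCII "2018"
        foreach (char c in Reinterpret(buffer))
        {
            // On a little-endian machine: 12338 (0x3032), then 14385 (0x3831)
            Console.WriteLine((int)c);
        }
    }
}
```

Four source bytes fill only two chars here, which matches the pattern in the question's output: every element of outputBuffer is two adjacent input bytes glued together.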

Noteworthy perhaps is that this type coercion is a feature. Say you receive an int[] through a socket or pipe but have the data in a byte[] buffer: Buffer.BlockCopy() is by far the fastest way to turn it back into an int[].
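A rough sketch of that byte[] to int[] case. IntBufferDemo and ToInt32Array are illustrative names, and the sketch assumes the sender wrote the integers in the same byte order as the receiving machine.

```csharp
using System;

public static class IntBufferDemo
{
    // Hypothetical helper: reinterprets received bytes as 32-bit integers.
    // Assumes the bytes are already in the machine's native byte order.
    public static int[] ToInt32Array(byte[] raw)
    {
        if (raw.Length % sizeof(int) != 0)
            throw new ArgumentException("Length must be a multiple of 4.", nameof(raw));

        var values = new int[raw.Length / sizeof(int)];
        // One raw memory copy, no per-element conversion.
        Buffer.BlockCopy(raw, 0, values, 0, raw.Length);
        return values;
    }

    public static void Main()
    {
        byte[] raw = { 1, 0, 0, 0, 0, 1, 0, 0 };
        int[] values = ToInt32Array(raw);
        Console.WriteLine(string.Join(", ", values)); // little-endian: 1, 256
    }
}
```

If the two ends of the socket or pipe may disagree on byte order, a per-element conversion is needed instead, and the raw copy no longer applies.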

Upvotes: 15
