thecoop

Reputation: 46128

Array.Copy vs Buffer.BlockCopy

Array.Copy and Buffer.BlockCopy both do the same thing, but BlockCopy is aimed at fast byte-level primitive array copying, whereas Copy is the general-purpose implementation. My question is - under what circumstances should you use BlockCopy? Should you use it at any time when you are copying primitive type arrays, or should you only use it if you're coding for performance? Is there anything inherently dangerous about using Buffer.BlockCopy over Array.Copy?

Upvotes: 165

Views: 113615

Answers (8)

user1121956

Reputation: 1961

On .NET 5.0.6 (x64), for copying a byte array to a byte array, Array.Copy seems to be the winner even for short arrays. Interestingly enough, Enumerable.Concat is also relatively fast on longer arrays because it optimizes for ICollection<T> if the enumerable implements it (this is not the case on .NET Framework, though).

Benchmark results and source code:

Method            ArrayLength  NumberOfArrays          Mean       Error        StdDev
EnumerableConcat           50               1      63.54 ns    1.863 ns      5.435 ns
ForLoop                    50               1      95.01 ns    2.008 ns      4.694 ns
ForeachLoop                50               1      91.80 ns    1.953 ns      4.527 ns
ArrayCopy                  50               1      26.66 ns    1.043 ns      3.075 ns
BufferBlockCopy            50               1      27.65 ns    0.716 ns      2.076 ns
EnumerableConcat           50               2     265.30 ns    9.362 ns     26.558 ns
ForLoop                    50               2     188.80 ns    5.084 ns     13.659 ns
ForeachLoop                50               2     180.16 ns    4.953 ns     14.448 ns
ArrayCopy                  50               2      42.47 ns    0.970 ns      2.623 ns
BufferBlockCopy            50               2      47.28 ns    1.038 ns      2.024 ns
EnumerableConcat           50               3     327.81 ns    9.332 ns     27.368 ns
ForLoop                    50               3     285.21 ns    6.028 ns     17.680 ns
ForeachLoop                50               3     260.04 ns    5.308 ns     14.795 ns
ArrayCopy                  50               3      62.97 ns    1.505 ns      4.366 ns
BufferBlockCopy            50               3      73.45 ns    3.265 ns      9.626 ns
EnumerableConcat          100               1      69.27 ns    1.762 ns      5.167 ns
ForLoop                   100               1     189.44 ns    3.907 ns     11.398 ns
ForeachLoop               100               1     163.03 ns    3.311 ns      5.057 ns
ArrayCopy                 100               1      33.23 ns    1.225 ns      3.574 ns
BufferBlockCopy           100               1      35.55 ns    1.004 ns      2.865 ns
EnumerableConcat          100               2     291.20 ns   10.245 ns     30.207 ns
ForLoop                   100               2     363.01 ns    7.160 ns      9.310 ns
ForeachLoop               100               2     357.98 ns    7.228 ns      7.734 ns
ArrayCopy                 100               2      56.59 ns    1.702 ns      5.019 ns
BufferBlockCopy           100               2      61.82 ns    1.747 ns      5.095 ns
EnumerableConcat          100               3     354.19 ns    9.679 ns     27.925 ns
ForLoop                   100               3     544.59 ns   16.346 ns     48.198 ns
ForeachLoop               100               3     522.59 ns   12.927 ns     37.914 ns
ArrayCopy                 100               3      80.66 ns    3.154 ns      9.300 ns
BufferBlockCopy           100               3      87.21 ns    2.414 ns      7.081 ns
EnumerableConcat         1000               1     181.98 ns    4.073 ns     11.882 ns
ForLoop                  1000               1   1,643.59 ns   32.135 ns     50.030 ns
ForeachLoop              1000               1   1,444.37 ns   28.705 ns     70.951 ns
ArrayCopy                1000               1     143.55 ns    3.874 ns     11.301 ns
BufferBlockCopy          1000               1     146.69 ns    3.349 ns      9.662 ns
EnumerableConcat         1000               2     525.41 ns   10.621 ns     29.254 ns
ForLoop                  1000               2   3,264.64 ns   47.449 ns     39.622 ns
ForeachLoop              1000               2   2,818.58 ns   56.489 ns    126.345 ns
ArrayCopy                1000               2     283.73 ns    5.613 ns     15.175 ns
BufferBlockCopy          1000               2     292.29 ns    5.827 ns     15.654 ns
EnumerableConcat         1000               3     712.58 ns   15.274 ns     44.068 ns
ForLoop                  1000               3   5,005.50 ns   99.791 ns    214.810 ns
ForeachLoop              1000               3   4,272.26 ns   89.589 ns    261.335 ns
ArrayCopy                1000               3     422.30 ns    8.542 ns     22.502 ns
BufferBlockCopy          1000               3     433.49 ns    8.808 ns     20.587 ns
EnumerableConcat        10000               1   1,221.27 ns   28.138 ns     82.964 ns
ForLoop                 10000               1  16,464.04 ns  441.552 ns  1,294.995 ns
ForeachLoop             10000               1  13,916.99 ns  273.792 ns    676.746 ns
ArrayCopy               10000               1   1,150.18 ns   26.901 ns     79.318 ns
BufferBlockCopy         10000               1   1,154.10 ns   23.094 ns     60.025 ns
EnumerableConcat        10000               2   2,798.41 ns   54.615 ns    141.952 ns
ForLoop                 10000               2  32,570.61 ns  646.828 ns  1,473.154 ns
ForeachLoop             10000               2  27,707.12 ns  545.888 ns  1,051.741 ns
ArrayCopy               10000               2   2,379.49 ns   72.264 ns    213.073 ns
BufferBlockCopy         10000               2   2,374.17 ns   59.035 ns    173.140 ns
EnumerableConcat        10000               3   3,885.27 ns   77.809 ns    196.633 ns
ForLoop                 10000               3  49,833.15 ns  984.022 ns  2,097.031 ns
ForeachLoop             10000               3  41,174.21 ns  819.971 ns  1,392.373 ns
ArrayCopy               10000               3   3,738.32 ns   74.331 ns     91.285 ns
BufferBlockCopy         10000               3   3,839.79 ns   78.865 ns    231.298 ns

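using System;
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class ArrayConcatBenchmark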
{
    [Params(50, 100, 1000, 10000)]
    public int ArrayLength;

    [Params(1, 2, 3)]
    public int NumberOfArrays;

    private byte[][] data;

    [GlobalSetup]
    public void GlobalSetup()
    {
        data = new byte[NumberOfArrays][];
        var random = new Random(42);
        for (int i = 0; i < NumberOfArrays; i++)
        {
            data[i] = new byte[ArrayLength];
            random.NextBytes(data[i]);
        }
    }

    [Benchmark]
    public byte[] EnumerableConcat()
    {
        IEnumerable<byte> enumerable = data[0];

        for (int n = 1; n < NumberOfArrays; n++)
        {
            enumerable = enumerable.Concat(data[n]);
        }

        return enumerable.ToArray();
    }

    [Benchmark]
    public byte[] ForLoop()
    {
        var result = new byte[ArrayLength * NumberOfArrays];

        for (int n = 0; n < NumberOfArrays; n++)
        {
            for (int i = 0; i < ArrayLength; i++)
            {
                result[i + n * ArrayLength] = data[n][i];
            }
        }

        return result;
    }

    [Benchmark]
    public byte[] ForeachLoop()
    {
        var result = new byte[ArrayLength * NumberOfArrays];

        for (int n = 0; n < NumberOfArrays; n++)
        {
            int i = 0;

            foreach (var item in data[n])
            {
                result[i + n * ArrayLength] = item;
                i++;
            }
        }

        return result;
    }

    [Benchmark]
    public byte[] ArrayCopy()
    {
        var result = new byte[ArrayLength * NumberOfArrays];

        for (int n = 0; n < NumberOfArrays; n++)
        {
            Array.Copy(data[n], 0, result, n * ArrayLength, ArrayLength);
        }

        return result;
    }

    [Benchmark]
    public byte[] BufferBlockCopy()
    {
        var result = new byte[ArrayLength * NumberOfArrays];

        for (int n = 0; n < NumberOfArrays; n++)
        {
            Buffer.BlockCopy(data[n], 0, result, n * ArrayLength, ArrayLength);
        }

        return result;
    }

    public static void Main(string[] args)
    {
        //Console.WriteLine("Are all results the same: " + AreAllResultsTheSame());
        BenchmarkRunner.Run<ArrayConcatBenchmark>();
    }

    private static bool AreAllResultsTheSame()
    {
        var ac = new ArrayConcatBenchmark()
        {
            NumberOfArrays = 2,
            ArrayLength = 100,
        };

        ac.GlobalSetup();

        var firstResult = ac.EnumerableConcat();
        var otherResults = new[]
        {
            ac.ForLoop(),
            ac.ForeachLoop(),
            ac.ArrayCopy(),
            ac.BufferBlockCopy(),
        };

        return otherResults.All(x => firstResult.SequenceEqual(x));
    }
}

Upvotes: 6

MusiGenesis

Reputation: 75336

Since the parameters to Buffer.BlockCopy are byte-based rather than index-based, you're more likely to screw up your code than if you use Array.Copy, so I would only use Buffer.BlockCopy in a performance-critical section of my code.
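
To make the byte-based versus index-based difference concrete, here is a minimal sketch that performs the same int[] copy both ways. The class and method names (OffsetUnitsDemo, CopyWithArrayCopy, CopyWithBlockCopy) are made up for illustration; only Array.Copy and Buffer.BlockCopy are real framework calls.

```csharp
using System;

public static class OffsetUnitsDemo
{
    // Array.Copy takes element indexes and an element count.
    public static int[] CopyWithArrayCopy(int[] src, int srcIndex, int count)
    {
        var dst = new int[count];
        Array.Copy(src, srcIndex, dst, 0, count);
        return dst;
    }

    // Buffer.BlockCopy takes byte offsets and a byte count, so every
    // element-based value has to be multiplied by the element size.
    public static int[] CopyWithBlockCopy(int[] src, int srcIndex, int count)
    {
        var dst = new int[count];
        Buffer.BlockCopy(src, srcIndex * sizeof(int), dst, 0, count * sizeof(int));
        return dst;
    }
}
```

Forgetting one of the sizeof(int) factors still compiles and usually does not throw; it just copies the wrong bytes, which is the kind of mistake this answer warns about.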

Upvotes: 73

Special Sauce

Reputation: 5604

Prelude

I'm joining the party late, but with 32k views, it's worth getting this right. Most of the microbenchmarking code in the answers posted thus far suffers from one or more severe technical flaws: memory allocations left inside the test loops (which introduces severe GC artifacts), no testing of variable vs. deterministic execution flows, no JIT warmup, and no tracking of intra-test variability. In addition, most answers did not test the effects of varying buffer sizes and varying primitive types (with respect to either 32-bit or 64-bit systems). To address this question more comprehensively, I hooked it up to a custom microbenchmarking framework I developed that reduces most of the common "gotchas" to the extent possible. Tests were run in .NET 4.0 Release mode on both a 32-bit machine and a 64-bit machine. Results were averaged over 20 testing runs, in which each run had 1 million trials per method. Primitive types tested were byte (1 byte), int (4 bytes), and double (8 bytes). Three methods were tested: Array.Copy(), Buffer.BlockCopy(), and simple per-index assignment in a loop. The data is too voluminous to post here, so I will summarize the important points.

The Takeaways

  • If your buffer length is about 75-100 or less, an explicit loop copy routine is usually faster (by about 5%) than either Array.Copy() or Buffer.BlockCopy() for all 3 primitive types tested on both 32-bit and 64-bit machines. Additionally, the explicit loop copy routine has noticeably lower variability in performance compared to the two alternatives. The good performance is almost surely due to locality of reference exploited by CPU L1/L2/L3 memory caching in conjunction with no method call overhead.
    • For double buffers on 32-bit machines only: The explicit loop copy routine is better than both alternatives for all buffer sizes tested up to 100k. The improvement is 3-5% better than the other methods. This is because the performance of Array.Copy() and Buffer.BlockCopy() becomes totally degraded upon passing the native 32-bit width. Thus I assume the same effect would apply to long buffers as well.
  • For buffer sizes exceeding ~100, explicit loop copying quickly becomes much slower than the other 2 methods (with the one particular exception just noted). The difference is most noticeable with byte[], where explicit loop copying can become 7x or more slower at large buffer sizes.
  • In general, for all 3 primitive types tested and across all buffer sizes, Array.Copy() and Buffer.BlockCopy() performed almost identically. On average, Array.Copy() seems to have a very slight edge of about 2% or less time taken (but 0.2% - 0.5% better is typical), although Buffer.BlockCopy() did occasionally beat it. For unknown reasons, Buffer.BlockCopy() has noticeably higher intra-test variability than Array.Copy(). I could not eliminate this effect despite trying multiple mitigations, and I have no working theory as to why it happens.
  • Because Array.Copy() is a "smarter", more general, and much safer method, in addition to being very slightly faster and having less variability on average, it should be preferred to Buffer.BlockCopy() in almost all common cases. The only use case where Buffer.BlockCopy() will be significantly better is when the source and destination array value types are different (as pointed out in Ken Smith's answer). While this scenario is not common, Array.Copy() can perform very poorly here due to the continual "safe" value type casting, compared to the direct casting of Buffer.BlockCopy().
  • Additional evidence from outside StackOverflow that Array.Copy() is faster than Buffer.BlockCopy() for same-type array copying can be found here.
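
The first and third takeaways can be sketched as a hybrid routine that uses an explicit loop for short buffers and falls back to Array.Copy() otherwise. This is only an illustration: the HybridCopy class and the SmallCopyThreshold value of 64 are assumptions, the crossover reported above was measured on .NET 4.0, and it may sit elsewhere (or not exist) on other runtimes, so measure before adopting anything like this.

```csharp
using System;

public static class HybridCopy
{
    // Hypothetical cutoff, chosen below the 75-100 element crossover
    // reported above for .NET 4.0. Tune it from your own measurements.
    public const int SmallCopyThreshold = 64;

    // Copies the first `count` elements of src into dst.
    public static void Copy(int[] src, int[] dst, int count)
    {
        if (count < 0 || count > src.Length || count > dst.Length)
            throw new ArgumentOutOfRangeException(nameof(count));

        if (count <= SmallCopyThreshold)
        {
            // Short buffers: plain per-index assignment, no call overhead.
            for (int i = 0; i < count; i++)
                dst[i] = src[i];
        }
        else
        {
            // Longer buffers: let the runtime's bulk copy do the work.
            Array.Copy(src, 0, dst, 0, count);
        }
    }
}
```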

Upvotes: 189

Thulani Chivandikwa

Reputation: 3539

To weigh in on this argument: if you are not careful about how you write this benchmark, you can easily be misled. I wrote a very simple test to illustrate this. In my test below, if I swap the order of the tests so that either Buffer.BlockCopy or Array.Copy runs first, the one that goes first is almost always the slowest (although it's a close one). This means, for a bunch of reasons which I won't go into, that simply running the tests multiple times, especially one after the other, will not give accurate results.

I resorted to keeping the test as is, with 1000000 tries each on an array of 1000000 sequential doubles. However, I then disregard the first 900000 cycles and average the remainder. In that case Buffer.BlockCopy is superior.

private static void BenchmarkArrayCopies()
        {
            long[] bufferRes = new long[1000000];
            long[] arrayCopyRes = new long[1000000];
            long[] manualCopyRes = new long[1000000];

            double[] src = Enumerable.Range(0, 1000000).Select(x => (double)x).ToArray();

            for (int i = 0; i < 1000000; i++)
            {
                bufferRes[i] = ArrayCopyTests.ArrayBufferBlockCopy(src).Ticks;
            }

            for (int i = 0; i < 1000000; i++)
            {
                arrayCopyRes[i] = ArrayCopyTests.ArrayCopy(src).Ticks;
            }

            for (int i = 0; i < 1000000; i++)
            {
                manualCopyRes[i] = ArrayCopyTests.ArrayManualCopy(src).Ticks;
            }

            Console.WriteLine("Loop Copy: {0}", manualCopyRes.Average());
            Console.WriteLine("Array.Copy Copy: {0}", arrayCopyRes.Average());
            Console.WriteLine("Buffer.BlockCopy Copy: {0}", bufferRes.Average());

            //more accurate results - skip the first 900000 runs and average the rest

            Console.WriteLine();
            Console.WriteLine("----More accurate comparisons----");

            Console.WriteLine("Loop Copy: {0}", manualCopyRes.Where((l, i) => i > 900000).ToList().Average());
            Console.WriteLine("Array.Copy Copy: {0}", arrayCopyRes.Where((l, i) => i > 900000).ToList().Average());
            Console.WriteLine("Buffer.BlockCopy Copy: {0}", bufferRes.Where((l, i) => i > 900000).ToList().Average());
            Console.ReadLine();
        }

public class ArrayCopyTests
    {
        private const int byteSize = sizeof(double);

        public static TimeSpan ArrayBufferBlockCopy(double[] original)
        {
            Stopwatch watch = new Stopwatch();
            double[] copy = new double[original.Length];
            watch.Start();
            Buffer.BlockCopy(original, 0 * byteSize, copy, 0 * byteSize, original.Length * byteSize);
            watch.Stop();
            return watch.Elapsed;
        }

        public static TimeSpan ArrayCopy(double[] original)
        {
            Stopwatch watch = new Stopwatch();
            double[] copy = new double[original.Length];
            watch.Start();
            Array.Copy(original, 0, copy, 0, original.Length);
            watch.Stop();
            return watch.Elapsed;
        }

        public static TimeSpan ArrayManualCopy(double[] original)
        {
            Stopwatch watch = new Stopwatch();
            double[] copy = new double[original.Length];
            watch.Start();
            for (int i = 0; i < original.Length; i++)
            {
                copy[i] = original[i];
            }
            watch.Stop();
            return watch.Elapsed;
        }
    }

https://github.com/chivandikwa/Random-Benchmarks
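
The "discard the early runs, average the rest" idea can be pulled into a pair of small helpers. This is a sketch rather than the code from the repository above: WarmupTiming, Measure and AverageAfterWarmup are hypothetical names, and Measure records Stopwatch.ElapsedTicks (raw timer ticks) instead of the TimeSpan ticks used in the answer.

```csharp
using System;
using System.Diagnostics;

public static class WarmupTiming
{
    // Runs `action` `runs` times and records the Stopwatch ticks of each run.
    public static long[] Measure(Action action, int runs)
    {
        var ticks = new long[runs];
        var watch = new Stopwatch();
        for (int i = 0; i < runs; i++)
        {
            watch.Restart();
            action();
            watch.Stop();
            ticks[i] = watch.ElapsedTicks;
        }
        return ticks;
    }

    // Averages the samples after skipping the first `warmup` entries,
    // so JIT compilation and cold caches do not skew the result.
    public static double AverageAfterWarmup(long[] samples, int warmup)
    {
        if (warmup < 0 || warmup >= samples.Length)
            throw new ArgumentOutOfRangeException(nameof(warmup));

        long sum = 0;
        for (int i = warmup; i < samples.Length; i++)
            sum += samples[i];
        return (double)sum / (samples.Length - warmup);
    }
}
```

A call might look like WarmupTiming.AverageAfterWarmup(WarmupTiming.Measure(() => Array.Copy(src, dst, src.Length), 1000), 900), where src and dst are arrays allocated once outside the timed action.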

Upvotes: 2

Ken Smith

Reputation: 20445

Another example of when it makes sense to use Buffer.BlockCopy() is when you're provided with an array of primitives (say, shorts), and need to convert it to an array of bytes (say, for transmission over a network). I use this method frequently when dealing with audio from the Silverlight AudioSink. It provides the sample as a short[] array, but you need to convert it to a byte[] array when you're building the packet that you submit to Socket.SendAsync(). You could use BitConverter, and iterate through the array one-by-one, but it's a lot faster (about 20x in my testing) just to do this:

Buffer.BlockCopy(shortSamples, 0, packetBytes, 0, shortSamples.Length * sizeof(short));

And the same trick works in reverse as well:

Buffer.BlockCopy(packetBytes, readPosition, shortSamples, 0, payloadLength);

This is about as close as you get in safe C# to the (void *) sort of memory management that's so common in C and C++.
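
Wrapped into helpers, the two calls above might look like the following sketch. SampleConverter, ToBytes and ToShorts are hypothetical names, and the bytes are produced in the machine's native byte order, so both ends of a network connection are assumed to agree on endianness.

```csharp
using System;

public static class SampleConverter
{
    // Reinterprets a short[] as raw bytes (2 bytes per sample, native byte order).
    public static byte[] ToBytes(short[] samples)
    {
        var bytes = new byte[samples.Length * sizeof(short)];
        Buffer.BlockCopy(samples, 0, bytes, 0, bytes.Length);
        return bytes;
    }

    // Reads `byteCount` bytes starting at byte `offset` back into a short[].
    public static short[] ToShorts(byte[] bytes, int offset, int byteCount)
    {
        if (byteCount % sizeof(short) != 0)
            throw new ArgumentException("byteCount must be a multiple of 2.", nameof(byteCount));

        var samples = new short[byteCount / sizeof(short)];
        Buffer.BlockCopy(bytes, offset, samples, 0, byteCount);
        return samples;
    }
}
```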

Upvotes: 71

dragonfly02

Reputation: 3679

Just want to add my testing case which shows again BlockCopy has no 'PERFORMANCE' benefit over Array.Copy. They seem to have the same performance under release mode on my machine (both take about 66ms to copy 50 million integers). Under debug mode, BlockCopy is just marginally faster.

    private static T[] CopyArray<T>(T[] a) where T:struct 
    {
        T[] res = new T[a.Length];
        int size = Marshal.SizeOf(typeof(T));
        DateTime time1 = DateTime.Now;
        Buffer.BlockCopy(a,0,res,0, size*a.Length);
        Console.WriteLine("Using Buffer blockcopy: {0}", (DateTime.Now - time1).TotalMilliseconds);
        return res;
    }

    static void Main(string[] args)
    {
        int simulation_number = 50000000;
        int[] testarray1 = new int[simulation_number];

        int begin = 0;
        Random r = new Random();
        while (begin != simulation_number)
        {
            testarray1[begin++] = r.Next(0, 10000);
        }

        var copiedarray = CopyArray(testarray1);

        var testarray2 = new int[testarray1.Length];
        DateTime time2 = DateTime.Now;
        Array.Copy(testarray1, testarray2, testarray1.Length);
        Console.WriteLine("Using Array.Copy(): {0}", (DateTime.Now - time2).TotalMilliseconds);
    }
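
One caveat about the generic helper above: Marshal.SizeOf reports the marshaled (unmanaged) size of a type, which matches the managed size for int but, as far as I know, not for char or bool. A sketch that avoids the question by asking the runtime for the array's byte length via Buffer.ByteLength is shown below; PrimitiveArrayCopy and Clone are hypothetical names.

```csharp
using System;

public static class PrimitiveArrayCopy
{
    // T must be a primitive type (int, char, double, ...). Buffer.ByteLength
    // and Buffer.BlockCopy throw ArgumentException for arrays of other structs.
    public static T[] Clone<T>(T[] source) where T : struct
    {
        var result = new T[source.Length];
        // Buffer.ByteLength returns the managed size of the whole array in bytes.
        Buffer.BlockCopy(source, 0, result, 0, Buffer.ByteLength(source));
        return result;
    }
}
```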

Upvotes: 1

user3523091

Reputation: 945

Array.Copy is smarter than BlockCopy. It figures out how to copy elements if the source and destination are the same array.

If we populate an int array with 0,1,2,3,4 and apply:

Array.Copy(array, 0, array, 1, array.Length - 1);

we end up with 0,0,1,2,3 as expected.

Try this with BlockCopy and we get: 0,0,2,3,4. If I assign array[0]=-1 after that, it becomes -1,0,2,3,4 as expected, but if the array length is even, like 6, we get -1,256,2,3,4,5. (BlockCopy interprets the offsets and the count as bytes, not elements, so this call moves only a few bytes.) Dangerous stuff. Don't use BlockCopy other than for copying one byte array into another.

There is another case where you can only use Array.Copy: if the indexes or the length exceed the range of int (2^31 - 1). Array.Copy has overloads that take long index and length parameters. BlockCopy does not have that.
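
The same-array shift can be compared side by side in a short sketch. OverlapDemo and its methods are hypothetical names. The assumption being demonstrated is that the surprising output comes from passing element-based values to a byte-based API: once the offsets and count are scaled by sizeof(int), Buffer.BlockCopy handles the overlapping range and matches Array.Copy.

```csharp
using System;

public static class OverlapDemo
{
    // Shifts all elements one slot to the right using element indexes.
    public static int[] ShiftRightWithArrayCopy(int[] values)
    {
        var array = (int[])values.Clone();
        Array.Copy(array, 0, array, 1, array.Length - 1);
        return array;
    }

    // Same shift with Buffer.BlockCopy, offsets and count scaled to bytes.
    public static int[] ShiftRightWithBlockCopy(int[] values)
    {
        var array = (int[])values.Clone();
        Buffer.BlockCopy(array, 0, array, 1 * sizeof(int), (array.Length - 1) * sizeof(int));
        return array;
    }

    // The unscaled call from the answer: copies (Length - 1) BYTES to byte
    // offset 1, which corrupts element 1 instead of shifting the array.
    public static int[] ShiftRightUnscaled(int[] values)
    {
        var array = (int[])values.Clone();
        Buffer.BlockCopy(array, 0, array, 1, array.Length - 1);
        return array;
    }
}
```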

Upvotes: 7

Kevin

Reputation: 8902

Based on my testing, performance is not a reason to prefer Buffer.BlockCopy over Array.Copy. From my testing Array.Copy is actually faster than Buffer.BlockCopy.

var buffer = File.ReadAllBytes(...);

var length = buffer.Length;
var copy = new byte[length];

var stopwatch = new Stopwatch();

TimeSpan blockCopyTotal = TimeSpan.Zero, arrayCopyTotal = TimeSpan.Zero;

const int times = 20;

for (int i = 0; i < times; ++i)
{
    stopwatch.Start();
    Buffer.BlockCopy(buffer, 0, copy, 0, length);
    stopwatch.Stop();

    blockCopyTotal += stopwatch.Elapsed;

    stopwatch.Reset();

    stopwatch.Start();
    Array.Copy(buffer, 0, copy, 0, length);
    stopwatch.Stop();

    arrayCopyTotal += stopwatch.Elapsed;

    stopwatch.Reset();
}

Console.WriteLine("bufferLength: {0}", length);
Console.WriteLine("BlockCopy: {0}", blockCopyTotal);
Console.WriteLine("ArrayCopy: {0}", arrayCopyTotal);
Console.WriteLine("BlockCopy (average): {0}", TimeSpan.FromMilliseconds(blockCopyTotal.TotalMilliseconds / times));
Console.WriteLine("ArrayCopy (average): {0}", TimeSpan.FromMilliseconds(arrayCopyTotal.TotalMilliseconds / times));

Example Output:

bufferLength: 396011520
BlockCopy: 00:00:02.0441855
ArrayCopy: 00:00:01.8876299
BlockCopy (average): 00:00:00.1020000
ArrayCopy (average): 00:00:00.0940000

Upvotes: 17
