JDR

Reputation: 1146

C# chunking two-dimensional array into batches

I have a two-dimensional object[,] array which contains a matrix of rows and columns (object[nRows, nColumns]).

I would like to chunk this into batches of rows, e.g. batches of 1,000 rows each, which I can enumerate over.

In summary, I am looking for C# code that does the following, but for two-dimensional arrays (source):

private IEnumerable<T[]> SplitArray<T>(T[] sourceArray, int rangeLength)
{
    int startIndex = 0;

    do
    {
        T[] range = new T[Math.Min(rangeLength, sourceArray.Length - startIndex)];
        Array.Copy(sourceArray, startIndex, range, 0, range.Length);
        startIndex += rangeLength;
        yield return range;
    }
    while (startIndex < sourceArray.Length);            
}

This attempt at adapting the code for [,] arrays fails - rows/columns begin to get jumbled-up after the first iteration:

        private IEnumerable<T[,]> SplitArray<T>(T[,] sourceArray, int rangeLength)
        {
            int startIndex = 0;

            do
            {
                T[,] range = new T[Math.Min(rangeLength, sourceArray.GetLength(0) - startIndex), sourceArray.GetLength(1)];
                Array.Copy(sourceArray, startIndex, range, 0, range.Length);
                startIndex += rangeLength;
                yield return range;
            }
            while (startIndex < sourceArray.GetLength(0));
        }

Upvotes: 3

Views: 790

Answers (3)

Bart van der Drift

Reputation: 1336

I think you're looking for something like this:

private static List<T[]> SplitArray<T>(T[,] sourceArray)
{
    List<T[]> result = new List<T[]>();
    int rowCount = sourceArray.GetLength(0);
    for (int i = 0; i < rowCount; i++)
    {
        result.Add(GetRow(sourceArray, i));
    }

    return result;
}

private static T[] GetRow<T>(T[,] sourceArray, int rownumber)
{
    int columnCount = sourceArray.GetLength(1);
    var row = new T[columnCount];
    for (int i = 0; i < columnCount; i++)
    {
        row[i] = sourceArray[rownumber, i];
    }
    return row;
}

Upvotes: 1

Magnetron

Reputation: 8573

This will fix the issues in your code. Because Array.Copy treats a multidimensional array as one long one-dimensional array (rows laid end to end), startIndex counts elements rather than rows, so you have to multiply or divide by the number of columns in a few places:

private IEnumerable<T[,]> SplitArray<T>(T[,] sourceArray, int rangeLength)
{
    int startIndex = 0;
    do
    {
        T[,] range = new T[Math.Min(rangeLength, sourceArray.GetLength(0) - startIndex/sourceArray.GetLength(1)), sourceArray.GetLength(1)];
        Array.Copy(sourceArray, startIndex, range, 0, range.Length);
        startIndex += rangeLength*sourceArray.GetLength(1);
        yield return range;
    }
    while (startIndex < sourceArray.Length);
}
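
The same idea can also be written with the loop counting rows instead of elements, which keeps the column arithmetic in one place. The following is only a sketch: the class name ArrayBatching and the method name SplitRows are made up for illustration, and the argument checks are an addition that is not part of the code above.

```csharp
using System;
using System.Collections.Generic;

public static class ArrayBatching
{
    // Hypothetical helper: yields consecutive blocks of up to batchRows rows.
    // Because this is an iterator, the argument checks run when enumeration starts.
    public static IEnumerable<T[,]> SplitRows<T>(T[,] source, int batchRows)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (batchRows <= 0) throw new ArgumentOutOfRangeException(nameof(batchRows));

        int rows = source.GetLength(0);
        int columns = source.GetLength(1);

        for (int startRow = 0; startRow < rows; startRow += batchRows)
        {
            int count = Math.Min(batchRows, rows - startRow);
            var batch = new T[count, columns];

            // Array.Copy sees the flattened, row-major element sequence,
            // so the row offset and the row count are both scaled by the column count.
            Array.Copy(source, startRow * columns, batch, 0, count * columns);
            yield return batch;
        }
    }
}
```

For a 5 x 2 array and batchRows = 2, this yields three batches containing 2, 2, and 1 rows. Unlike the do/while version, an array with zero rows yields no batches at all.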

Upvotes: 2

David

Reputation: 10708

By using GetLength(int dimension), you can see how long a particular dimension of an array is, and then iterate along it. You'll also need to hold the other dimensions constant, and make sure the total number of indices matches the Array.Rank value. From there, just look up each value via Array.GetValue(int[]). This is a touch awkward because Array isn't generic, so every value has to be cast. The following is an extension method, so it must be declared in a static class, and it needs System.Linq:

public static IEnumerable<T> GetRow<T>(this Array source, int dimension, params int[] fixedDimensions)
{
    if(source == null) throw new ArgumentNullException(nameof(source));
    if(!typeof(T).IsAssignableFrom(source.GetType().GetElementType())) throw new InvalidOperationException($"Cannot return row of type {typeof(T)} from array of type {source.GetType().GetElementType()}");

    if(fixedDimensions == null) fixedDimensions = new int[0];
    if(source.Rank != fixedDimensions.Length + 1) throw new ArgumentException("Fixed dimensions must have exactly one fewer elements than dimensions in source", nameof(fixedDimensions));
    if(dimension >= source.Rank) throw new ArgumentException($"Cannot take dimension {dimension} of an array with {source.Rank} dimensions!", nameof(dimension));
    if(dimension < 0) throw new ArgumentException("Cannot take a negative dimension", nameof(dimension));

    var coords = dimension == source.Rank - 1
         ? fixedDimensions
            .Concat(new [] { 0 })
            .ToArray()
        : fixedDimensions
            .Take(dimension)
            .Concat(new [] { 0 })
            .Concat(fixedDimensions.Skip(dimension))
            .ToArray();

    var length = source.GetLength(dimension);
    for(; coords[dimension] < length; coords[dimension]++)
    {
        yield return (T)source.GetValue(coords);
    }
}
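
For the two-dimensional case in the question, the same lookup can be reduced to a few lines. This is a minimal sketch that assumes a rank-2 array; GetValueDemo and ReadRow are hypothetical names used only for illustration.

```csharp
using System;

public static class GetValueDemo
{
    // Hypothetical helper: copies one row of a rank-2 array using the non-generic Array API.
    public static T[] ReadRow<T>(Array source, int row)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (source.Rank != 2) throw new ArgumentException("Expected a two-dimensional array.", nameof(source));

        int columns = source.GetLength(1);
        var result = new T[columns];
        var coords = new[] { row, 0 }; // row stays fixed, the column index varies

        for (int c = 0; c < columns; c++)
        {
            coords[1] = c;
            // GetValue returns object, so each element is cast (and unboxed for value types).
            result[c] = (T)source.GetValue(coords);
        }
        return result;
    }
}
```

With var grid = new[,] { { 1, 2, 3 }, { 4, 5, 6 } }, calling GetValueDemo.ReadRow<int>(grid, 1) returns the array { 4, 5, 6 }. Going through GetValue boxes value types, so for a known T[,] direct indexing will usually be cheaper.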

Upvotes: 1
