Chris McAtackney

Reputation: 5232

LINQ alternative to creating collection by looping over array of fields

I have an array that contains the fields of a data structure in the following format:

[0] = Record 1 (Name Field)
[1] = Record 1 (ID Field)
[2] = Record 1 (Other Field)
[3] = Record 2 (Name Field)
[4] = Record 2 (ID Field)
[5] = Record 2 (Other Field)

etc.

I'm processing this into a collection as follows:

for (int i = 0; i < components.Length; i = i + 3)
{
    results.Add(new MyObj
        {
            Name = components[i],
            Id = components[i + 1],
            Other = components[i + 2],
        });
}

This works fine, but I was wondering if there is a nice way to achieve the same output with LINQ. There's no functional requirement here; I'm just curious whether it can be done.

I did some experimenting with grouping by index (after calling ToList() on the array):

var groupings = components
    .GroupBy(x => components.IndexOf(x) / 3)
    .Select(g => g.ToArray())
    .Select(a => new
        {
            Name = a[0],
            Id = a[1],
            Other = a[2]
        });

This works, but I think it's a bit overkill for what I'm trying to do. Is there a simpler way to achieve the same output as the for loop?

Upvotes: 1

Views: 410

Answers (4)

Corey

Reputation: 16574

Looks like a perfect candidate for Josh Einstein's IEnumerable.Batch extension. It slices an enumerable into chunks of a certain size and feeds them out as an enumeration of arrays:

public static IEnumerable<T[]> Batch<T>(this IEnumerable<T> self, int batchSize)

In the case of this question, you'd do something like this:

var results = 
    from batch in components.Batch(3)
    select new MyObj { Name = batch[0], Id = batch[1], Other = batch[2] };

Update: 2 years on and the Batch extension I linked to seems to have disappeared. Since it was considered the answer to the question, and just in case someone else finds it useful, here's my current implementation of Batch:

public static partial class EnumExts
{
    /// <summary>Split sequence into blocks of specified size.</summary>
    /// <typeparam name="T">Type of items in sequence</typeparam>
    /// <param name="sequence"><see cref="IEnumerable{T}"/> sequence to split</param>
    /// <param name="batchLength">Number of items per returned array</param>
    /// <returns>Arrays of <paramref name="batchLength"/> items, with last array smaller if sequence count is not a multiple of <paramref name="batchLength"/></returns>
    public static IEnumerable<T[]> Batch<T>(this IEnumerable<T> sequence, int batchLength)
    {
        if (sequence == null)
            throw new ArgumentNullException("sequence");
        if (batchLength < 2)
            throw new ArgumentException("Batch length must be at least 2", "batchLength");

        using (var iter = sequence.GetEnumerator())
        {
            var bfr = new T[batchLength];
            while (true)
            {
                for (int i = 0; i < batchLength; i++)
                {
                    if (!iter.MoveNext())
                    {
                        if (i == 0)
                            yield break;
                        Array.Resize(ref bfr, i);
                        break;
                    }

                    bfr[i] = iter.Current;
                }
                yield return bfr;
                bfr = new T[batchLength];
            }
        }
    }
}

This operation is deferred, enumerates the source only once, and runs in linear time. It is relatively quick compared to a few other Batch implementations I've seen, even though it allocates a new array for each result.

Which just goes to show: you never can tell until you profile, and you should always quote the code in case it disappears.
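
For readers on .NET 6 or later, the framework's own Enumerable.Chunk behaves much like the Batch extension above: it yields arrays of up to the requested size, with the last one possibly shorter. A minimal sketch, assuming components is a string[] and using a hypothetical MyObj with three string properties (the question does not show the real type); ChunkDemo and ToRecords are illustrative names only:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's MyObj type.
public class MyObj
{
    public string Name { get; set; } = "";
    public string Id { get; set; } = "";
    public string Other { get; set; } = "";
}

public static class ChunkDemo
{
    // Groups a flat array of fields into records, three fields at a time.
    public static List<MyObj> ToRecords(string[] components)
    {
        return components
            .Chunk(3)                    // built in since .NET 6
            .Where(c => c.Length == 3)   // skip a trailing incomplete record
            .Select(c => new MyObj { Name = c[0], Id = c[1], Other = c[2] })
            .ToList();
    }

    public static void Main()
    {
        var components = new[] { "Alice", "1", "x", "Bob", "2", "y" };
        foreach (var r in ToRecords(components))
            Console.WriteLine($"{r.Name} {r.Id} {r.Other}");
    }
}
```

The Where filter is a design choice, not a requirement: without it, a short final chunk would throw when indexed at [2], which may be the behaviour you prefer for malformed input.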

Upvotes: 2

Tim Schmelter

Reputation: 460168

I would say stick with your for-loop. However, this should work with LINQ:

List<MyObj> results = components
    .Select((c, i) => new { Component = c, Index = i })
    .GroupBy(x => x.Index / 3)
    .Select(g => new MyObj {
        Name = g.First().Component,
        Id = g.ElementAt(1).Component,
        Other = g.Last().Component
    })
    .ToList();
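
The indexed overload of Select matters here. Grouping by IndexOf, as in the question, looks up the first position of each value, so a field value that repeats across records ends up in the wrong group. A small sketch of the difference, with made-up sample data and illustrative helper names (GroupingDemo, GroupByPosition, GroupByIndexOf):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingDemo
{
    // Groups by the position supplied by Select's indexed overload.
    public static List<string[]> GroupByPosition(List<string> items)
    {
        return items
            .Select((value, index) => new { value, index })
            .GroupBy(x => x.index / 3, x => x.value)
            .Select(g => g.ToArray())
            .ToList();
    }

    // Groups by IndexOf, which finds the FIRST occurrence of each value.
    public static List<string[]> GroupByIndexOf(List<string> items)
    {
        return items
            .GroupBy(x => items.IndexOf(x) / 3)
            .Select(g => g.ToArray())
            .ToList();
    }

    public static void Main()
    {
        // "n/a" appears in both records.
        var items = new List<string> { "Alice", "1", "n/a", "Bob", "2", "n/a" };

        // Prints: Alice,1,n/a | Bob,2,n/a
        Console.WriteLine(string.Join(" | ",
            GroupByPosition(items).Select(g => string.Join(",", g))));

        // Prints: Alice,1,n/a,n/a | Bob,2
        Console.WriteLine(string.Join(" | ",
            GroupByIndexOf(items).Select(g => string.Join(",", g))));
    }
}
```

IndexOf also rescans the list for every element, so that variant is quadratic in the number of fields, while the indexed version stays linear.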

Upvotes: 2

sehe

Reputation: 393239

This method may give you an idea of how to make the code more expressive.

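// Note: assigning T to MyObj's properties compiles only if they can hold a T
// (for example, if they are typed as object).
public static IEnumerable<MyObj> AsComponents<T>(this IEnumerable<T> serialized)
    where T : class
{
    using (var it = serialized.GetEnumerator())
    {
        // Returns the next item, or null once the input is exhausted.
        Func<T> next = () => it.MoveNext() ? it.Current : null;

        // Keep reading records until the input runs out.
        while (true)
        {
            var obj = new MyObj
                {
                    Name  = next(),
                    Id    = next(),
                    Other = next()
                };

            // A null Name means there was no further record to read.
            if (obj.Name == null)
                yield break;

            yield return obj;
        }
    }
}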

As it stands, I dislike the way I detect the end of the input, but you might have domain-specific information on how to do this better.

Upvotes: 1

Ryszard Dżegan

Reputation: 25434

Maybe an iterator could be appropriate.

Declare a custom iterator:

static IEnumerable<Tuple<int, int, int>> ToPartitions(int count)
{
    for (var i = 0; i < count; i += 3)
        yield return new Tuple<int, int, int>(i, i + 1, i + 2);
}

Then use it in a LINQ query:

var results = from partition in ToPartitions(components.Length)
              select new
              {
                  Name = components[partition.Item1],
                  Id = components[partition.Item2],
                  Other = components[partition.Item3]
              };
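
If a custom iterator feels like more than the job needs, the same index arithmetic can be sketched with Enumerable.Range instead. This assumes components is a string[] and uses a hypothetical MyObj with three string properties; RangeDemo and ToRecords are illustrative names only:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's MyObj type.
public class MyObj
{
    public string Name { get; set; } = "";
    public string Id { get; set; } = "";
    public string Other { get; set; } = "";
}

public static class RangeDemo
{
    public static List<MyObj> ToRecords(string[] components)
    {
        // One index per complete record; integer division drops a
        // trailing partial record instead of reading past the end.
        return Enumerable.Range(0, components.Length / 3)
            .Select(i => i * 3)
            .Select(start => new MyObj
            {
                Name = components[start],
                Id = components[start + 1],
                Other = components[start + 2]
            })
            .ToList();
    }

    public static void Main()
    {
        var components = new[] { "Alice", "1", "x", "Bob", "2", "y" };
        foreach (var r in ToRecords(components))
            Console.WriteLine($"{r.Name} {r.Id} {r.Other}");
    }
}
```

Unlike the for loop in the question, this version silently ignores leftover fields when the array length is not a multiple of 3; whether that is desirable depends on how strict the input format is.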

Upvotes: 1
