Reputation: 20464
I'm trying to split a collection into an specific number of parts, I've taken some help seeying solutions over StackOverflow: Split a collection into `n` parts with LINQ?
This is my VB.Net translation from @Hasan Khan solution:
''' <summary>
''' Splits an <see cref="IEnumerable(Of T)"/> into the specified amount of secuences.
''' </summary>
Public Shared Function SplitIntoParts(Of T)(ByVal col As IEnumerable(Of T),
ByVal amount As Integer) As IEnumerable(Of IEnumerable(Of T))
Dim i As Integer = 0
Dim splits As IEnumerable(Of IEnumerable(Of T)) =
From item As T In col
Group item By item = Threading.Interlocked.Increment(i) Mod amount
Into Group
Select Group.AsEnumerable()
Return splits
End Function
And this my VB.Net translation of @manu08 solution:
''' <summary>
''' Splits an <see cref="IEnumerable(Of T)"/> into the specified amount of secuences.
''' </summary>
Public Shared Function SplitIntoParts(Of T)(ByVal col As IEnumerable(Of T),
ByVal amount As Integer) As IEnumerable(Of IEnumerable(Of T))
Return col.Select(Function(item, index) New With {index, item}).
GroupBy(Function(x) x.index Mod amount).
Select(Function(x) x.Select(Function(y) y.item))
End Function
The problem is that both functions returns a wrong result.
Because if I split a collection like this:
Dim mainCol As IEnumerable(Of Integer) = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
Dim splittedCols As IEnumerable(Of IEnumerable(Of Integer)) =
SplitIntoParts(col:=mainCol, amount:=2)
Both functions gives this result:
1: { 1, 3, 5, 7, 9 }
2: { 2, 4, 6, 8, 10 }
Instead of these secuences:
1: { 1, 2, 3, 4, 5 }
2: { 6, 7, 8, 9, 10 }
What I'm doing wrong?.
Upvotes: 1
Views: 184
Reputation: 1819
MyExtensions class has two public Split methods:
More of that: this algorithm is trying to bring back exactly specified number of collections.
public static class MyExtensions
{
// Works with ICollection - iterates through collection only once.
public static IEnumerable<IEnumerable<T>> Split<T>(this ICollection<T> items, int count)
{
return Split(items, items.Count, count);
}
// Works with IEnumerable and iterates items TWICE: first for count items, second to split them.
public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> items, int count)
{
// ReSharper disable PossibleMultipleEnumeration
var itemsCount = items.Count();
return Split(items, itemsCount, count);
// ReSharper restore PossibleMultipleEnumeration
}
private static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> items, int itemsCount, int partsCount)
{
if (items == null)
throw new ArgumentNullException("items");
if (partsCount <= 0)
throw new ArgumentOutOfRangeException("partsCount");
var rem = itemsCount % partsCount;
var min = itemsCount / partsCount;
var max = rem != 0 ? min + 1 : min;
var index = 0;
var enumerator = items.GetEnumerator();
while (index < itemsCount)
{
var size = 0 < rem-- ? max : min;
yield return SplitPart(enumerator, size);
index += size;
}
}
private static IEnumerable<T> SplitPart<T>(IEnumerator<T> enumerator, int count)
{
for (var i = 0; i < count; i++)
{
if (!enumerator.MoveNext())
break;
yield return enumerator.Current;
}
}
}
Example program:
var items = new [] {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'};
for(var i = 1; i <= items.Length + 3; i++)
{
Console.WriteLine("{0} part(s)", i);
foreach (var part in items.Split(i))
Console.WriteLine(string.Join(", ", part));
Console.WriteLine();
}
... and output of this program:
1 part(s)
a, b, c, d, e, f, g, h, i, j
2 part(s)
a, b, c, d, e
f, g, h, i, j
3 part(s)
a, b, c, d
e, f, g
h, i, j
4 part(s)
a, b, c
d, e, f
g, h
i, j
5 part(s)
a, b
c, d
e, f
g, h
i, j
6 part(s)
a, b
c, d
e, f
g, h
i
j
7 part(s)
a, b
c, d
e, f
g
h
i
j
8 part(s)
a, b
c, d
e
f
g
h
i
j
9 part(s)
a, b
c
d
e
f
g
h
i
j
10 part(s)
a
b
c
d
e
f
g
h
i
j
11 part(s) // Only 10 items in collection.
a
b
c
d
e
f
g
h
i
j
12 part(s) // Only 10 items in collection.
a
b
c
d
e
f
g
h
i
j
13 part(s) // Only 10 items in collection.
a
b
c
d
e
f
g
h
i
j
Upvotes: 3
Reputation: 3476
Inefficient solution (too many iterations over data):
class Program
{
static void Main(string[] args)
{
var data = Enumerable.Range(1, 10);
var result = data.Split(2);
}
}
static class Extensions
{
public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> col, int amount)
{
var chunkSize = (int)Math.Ceiling((double)col.Count() / (double)amount);
for (var i = 0; i < amount; ++i)
yield return col.Skip(chunkSize * i).Take(chunkSize);
}
}
EDIT:
In VB.Net
Public Shared Iterator Function SplitIntoParts(Of T)(ByVal col As IEnumerable(Of T),
ByVal amount As Integer) As IEnumerable(Of IEnumerable(Of T))
Dim chunkSize As Integer = CInt(Math.Ceiling(CDbl(col.Count()) / CDbl(amount)))
For i As Integer = 0 To amount - 1
Yield col.Skip(chunkSize * i).Take(chunkSize)
Next
End Function
Upvotes: 1
Reputation: 101072
You're not doing something wrong; it's just that the methods you use don't keep the ordering the way you want. Think about how mod
and GroupBy
work and you see why.
I suggest you use Jon Skeet's answer, since it keeps the ordering of your collection (I took the liberty to translate it for you into VB.Net).
You just have to calculate the size of each partition beforehand, since it does not split a collection into n
chunks, but into chunks of a length of n
:
<Extension> _
Public Shared Iterator Function Partition(Of T)(source As IEnumerable(Of T), size As Integer) As IEnumerable(Of IEnumerable(Of T))
Dim array__1 As T() = Nothing
Dim count As Integer = 0
For Each item As T In source
If array__1 Is Nothing Then
array__1 = New T(size - 1) {}
End If
array__1(count) = item
count += 1
If count = size Then
yield New ReadOnlyCollection(Of T)(array__1)
array__1 = Nothing
count = 0
End If
Next
If array__1 IsNot Nothing Then
Array.Resize(array__1, count)
yield New ReadOnlyCollection(Of T)(array__1)
End If
End Function
And to use it:
mainCol.Partition(CInt(Math.Ceiling(mainCol.Count() / 2)))
Feel free to hide the Partition(CInt(Math.Ceiling(...))
part in a new method.
Upvotes: 1