Reputation: 8982
I really like the algorithm shown below for splitting a list into sublists of a fixed size. It might not be the most an efficient algorithm (edit: at all).
I'd like something that has a good balance of readability, elegance, and performance. The problem is, most algorithms I find in C# require the yield
keyword, which is not available if you're using .NET 3.5 in Visual Studio 2010 ;)
public IEnumerable<IEnumerable<T>> Partition<T>(IEnumerable<T> source, int size)
{
if (source == null)
throw new ArgumentNullException("list");
if (size < 1)
throw new ArgumentOutOfRangeException("size");
int index = 1;
IEnumerable<T> partition = source.Take(size).AsEnumerable();
while (partition.Any())
{
yield return partition;
partition = source.Skip(index++ * size).Take(size).AsEnumerable();
}
}
I tried rewriting this in VB, but had to use a second list to collect results into, which ended up taking significantly more time than the implementation above.
I'm looking for another algorithm I could use in VB.NET, but most of the results run into the issue of having to basically load everything into the memory instead of the ability to dynamically produce the results a la generators in python. Not an huge issue, but not as ideal as it would be with yield return
.
Is there a good, recommended algorithm for doing this in VB.NET? Would I have to create something implementing IEnumerator
to generate the results on demand?
Upvotes: 1
Views: 1850
Reputation: 8982
Since I'm using the partitions locally, I ended up chosing to invert the situation: instead of passing the partitions to the action using it, I can pass the action to the function doing the partitioning.
Public Sub DoForPartition(Of T)(source As IEnumerable(Of T),
size As Integer,
doThis As Action(Of IEnumerable(Of T)))
Dim partition(size - 1) As T
Dim count = 0
For Each t in source
partition(count) = t
count += 1
If count = size Then
doThis(partition)
count = 0
End If
Next
If count > 0 Then
Array.Resize(partition, count)
doThis(partition)
End If
End Sub
This function avoids looping through the source multiple times and the only memory overhead is the size of the partition (instead of the entire source like some of the other options). I didn't write this function myself, but adapted a similarly looking C# function from this answer.
This looks like a much better algorithm than the one in my question.
Upvotes: 2
Reputation: 6948
This might work as a work around. Make the sub routine a Sub and pass the target list in. now you can add the sub lists to it directly without creating a whole intermediary object first.
Dim testlist As New List(Of List(Of Integer))
Partition(Of Integer)({1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0}, 4, testlist)
Public Sub Partition(Of T)(source As IEnumerable(Of T), size As Integer, ByRef input As List(Of List(Of T)))
If source Is Nothing Then
Throw New ArgumentNullException("list")
End If
If size < 1 Then
Throw New ArgumentOutOfRangeException("size")
End If
For i = 0 To source.Count - 1 Step size
input.Add(source.Skip(i).Take(size).ToList())
Next
End Sub
Upvotes: 2
Reputation: 134035
Lacking the Yield
, you can do the following. I'm assuming you can convert to VB
public IEnumerable<IEnumerable<T>> Partition<T>(IEnumerable<T> source, int size)
{
if (source == null)
throw new ArgumentNullException("list");
if (size < 1)
throw new ArgumentOutOfRangeException("size");
List<List<T>> partitions = new List<List<T>>();
int index = 1;
IEnumerable<T> partition = source.Take(size).AsEnumerable();
while (partition.Any())
{
partitions.Add(partition);
partition = source.Skip(index++ * size).Take(size).AsEnumerable();
}
return partitions;
}
Note that this isn't an exact translation. The original code returns one partition at a time. This code creates all of the partitions and then returns them all in a list. Not a problem if the source
isn't huge.
That said, it will look like the same thing to your code. That is, if you have:
foreach (IEnumerable<Foo> part in Partition(FooList, 100))
{
// whatever
}
both versions will do essentially the same thing. The real difference is that my translation above will work as though you had written:
foreach (IEnumerable<Foo> part in Partition(FooList, 100).ToList())
As somebody pointed out in comments, this isn't the most efficient way to do things because the Skip
could end up having to iterate over items multiple times. Again, if the source
list isn't huge, then that probably isn't a problem. And it's likely that Skip
is optimized to do direct indexing for things that implement IList
.
Upvotes: 0