Reputation: 1509
My specific requirement is that I have an IEnumerable<IEnumerable<string>>
, and I want to "take" all items in the outer enumeration except any "empty" trailing items, where "empty" means all strings are null/empty or the inner enumeration is empty. Note that I want to keep any empty items that appear before the last non-empty one. For example:
Item 1: a, b, c
Item 2: (nothing)
Item 3: a, f, g, h
Item 4: (nothing)
Item 5: (nothing)
I would like to keep items 1–3 but trim items 4 and 5.
In a more general sense, I have an enumeration of items, where I want to trim any trailing items that satisfy a condition, that appear behind the last item that does not satisfy the condition.
For the sake of choosing an appropriate solution, I might add that the outer enumeration will usually contain a few hundred up to a few hundred thousand items, while the inner enumerations contain only a few items each. There will probably be only a couple of empty items that I need to trim.
My current solution puts all outer items in a list (after transforming them with .Select(...)
), and then in a loop keep removing the last item if it's empty, until a non-empty item is found.
Upvotes: 4
Views: 1913
Reputation: 156748
How about this?
var trimmedItems = items.Reverse().SkipWhile(e => !e.Any()).Reverse();
If you have very large sets of data, this will require more memory than some other solutions you could come up with, but it's pretty easy to read and follow.
juharr's suggestion is only slightly more complicated, and performs much better if you have large numbers of items:
var trimmedItems = items.Take(items.Reverse().TakeWhile(e => !e.Any()).Count());
Here's the benchmarking code I'm using. It's meant to run in LINQPad, but you can change the result.Dump();
call to output the results to the console or something if you prefer. Also, I'm using an IEnumerable<string>
instead of IEnumerable<IEnumerable<string>>
just for simplicity, but that shouldn't impact the performance of the algorithm:
/* This is a benchmarking template I use in LINQPad when I want to do a
* quick performance test. Just give it a couple of actions to test and
* it will give you a pretty good idea of how long they take compared
* to one another. It's not perfect: You can expect a 3% error margin
* under ideal circumstances. But if you're not going to improve
* performance by more than 3%, you probably don't care anyway.*/
void Main()
{
// Enter setup code here
var items = new[] { "a, b, c",
"",
"a, f, g, h",
"",
""}.AsEnumerable();
var manyitems = Enumerable.Range(1, 10000).SelectMany(i => items);
var actions = new[]
{
new TimedAction("Control", () =>
{
// ToList() is the one thing that all of these have to do.
manyitems.ToList();
}),
new TimedAction("Reverse().SkipWhile().Reverse()", () =>
{
manyitems.Reverse().SkipWhile(e => !e.Any()).Reverse().ToList();
}),
new TimedAction("Take(Reverse().TakeWhile().Count())", () =>
{
manyitems.Take(manyitems.Reverse().TakeWhile(e => !e.Any()).Count()).ToList();
}),
new TimedAction("SkipLastWhile", () =>
{
manyitems.SkipLastWhile(e => !e.Any()).ToList();
}),
// Add tests as desired
};
const int TimesToRun = 100; // Tweak this as necessary
TimeActions(TimesToRun, actions);
}
public static class EnumerableExtensions
{
public static IEnumerable<T> SkipLastWhile<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
var skipBuffer = new List<T>();
foreach (var item in source)
{
if (predicate(item))
skipBuffer.Add(item);
else
{
foreach (var skipped in skipBuffer)
yield return skipped;
skipBuffer.Clear();
yield return item;
}
}
}
}
#region timer helper methods
// Define other methods and classes here
public void TimeActions(int iterations, params TimedAction[] actions)
{
Stopwatch s = new Stopwatch();
int length = actions.Length;
var results = new ActionResult[actions.Length];
// Perform the actions in their initial order.
for (int i = 0; i < length; i++)
{
var action = actions[i];
var result = results[i] = new ActionResult { Message = action.Message };
// Do a dry run to get things ramped up/cached
result.DryRun1 = s.Time(action.Action, 10);
result.FullRun1 = s.Time(action.Action, iterations);
}
// Perform the actions in reverse order.
for (int i = length - 1; i >= 0; i--)
{
var action = actions[i];
var result = results[i];
// Do a dry run to get things ramped up/cached
result.DryRun2 = s.Time(action.Action, 10);
result.FullRun2 = s.Time(action.Action, iterations);
}
results.Dump();
}
public class ActionResult
{
public string Message { get; set; }
public double DryRun1 { get; set; }
public double DryRun2 { get; set; }
public double FullRun1 { get; set; }
public double FullRun2 { get; set; }
}
public class TimedAction
{
public TimedAction(string message, Action action)
{
Message = message;
Action = action;
}
public string Message { get; private set; }
public Action Action { get; private set; }
}
public static class StopwatchExtensions
{
public static double Time(this Stopwatch sw, Action action, int iterations)
{
sw.Restart();
for (int i = 0; i < iterations; i++)
{
action();
}
sw.Stop();
return sw.Elapsed.TotalMilliseconds;
}
}
#endregion
Results:
The benchmark results are more profound if your IEnumerable
is backed by a List, because then LINQ can make some additional optimizations on Reverse():
var manyitems = Enumerable.Range(1, 10000).SelectMany(i => items).ToList().AsEnumerable();
Upvotes: 3
Reputation: 205929
There is no standard efficient LINQ solution. I would go with a custom extension "LINQ like" method like this:
public static class EnumerableExtensions
{
public static IEnumerable<T> SkipLastWhile<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
var skipBuffer = new List<T>();
foreach (var item in source)
{
if (predicate(item))
skipBuffer.Add(item);
else
{
if (skipBuffer.Count > 0)
{
foreach (var skipped in skipBuffer)
yield return skipped;
skipBuffer.Clear();
}
yield return item;
}
}
}
}
It requires additional space for buffering the longest item sequence satisfying the skip predicate while the LINQ Reverse
method has to buffer the whole input sequence.
The usage will be:
var result = input.SkipLastWhile(e => !e.Any());
Upvotes: 4