Reputation: 26169

Using Linq to get the last N elements of a collection?

Given a collection, is there a way to get the last N elements of that collection? If there isn't a method in the framework, what would be the best way to write an extension method to do this?

Upvotes: 348

Answers (21)

James Curran

Reputation: 103605

coll.Reverse().Take(N).Reverse().ToList();


public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> coll, int N)
{
    return coll.Reverse().Take(N).Reverse();
}

UPDATE: To address clintp's problem: a) Using the TakeLast() method I defined above solves the problem, but if you really want to do it without the extra method, you just have to recognize that while Enumerable.Reverse() can be used as an extension method, you aren't required to use it that way:

List<string> mystring = new List<string>() { "one", "two", "three" }; 
mystring = Enumerable.Reverse(mystring).Take(2).Reverse().ToList();

Upvotes: 75

YSH

Reputation: 1

In case you use simplified enumrable like byte[] , you may want to use the SkipWhile()

on this example collecting the last 10 items:

myCollection.SkipWhile((item,index)=>i<=myCollection.Length-10)

Upvotes: 0

Alex

Reputation: 5257

My solution is based on ranges, introduced in C# version 8.

        public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> source, int N)
        {
            return source.ToArray()[(source.Count()-N)..];
        }

After running a benchmark with most rated solutions (and my humbly proposed solution):

    public static class TakeLastExtension
    {
        public static IEnumerable<T> TakeLastMarkByers<T>(this IEnumerable<T> source, int takeCount)
        {
            if (source == null) { throw new ArgumentNullException("source"); }
            if (takeCount < 0) { throw new ArgumentOutOfRangeException("takeCount", "must not be negative"); }
            if (takeCount == 0) { yield break; }

            T[] result = new T[takeCount];
            int i = 0;

            int sourceCount = 0;
            foreach (T element in source)
            {
                result[i] = element;
                i = (i + 1) % takeCount;
                sourceCount++;
            }

            if (sourceCount < takeCount)
            {
                takeCount = sourceCount;
                i = 0;
            }

            for (int j = 0; j < takeCount; ++j)
            {
                yield return result[(i + j) % takeCount];
            }
        }

        public static IEnumerable<T> TakeLastKbrimington<T>(this IEnumerable<T> source, int N)
        {
            return source.Skip(Math.Max(0, source.Count() - N));
        }

        public static IEnumerable<T> TakeLastJamesCurran<T>(this IEnumerable<T> source, int N)
        {
            return source.Reverse().Take(N).Reverse();
        }

        public static IEnumerable<T> TakeLastAlex<T>(this IEnumerable<T> source, int N)
        {
            return source.ToArray()[(source.Count()-N)..];
        }
    }

Test

    [MemoryDiagnoser]
    public class TakeLastBenchmark
    {
        [Params(10000)]
        public int N;

        private readonly List<string> l = new();

        [GlobalSetup]
        public void Setup()
        {
            for (var i = 0; i < this.N; i++)
            {
                this.l.Add($"i");
            }
        }

        [Benchmark]
        public void Benchmark1_MarkByers()
        {
            var lastElements = l.TakeLastMarkByers(3).ToList();
        }

        [Benchmark]
        public void Benchmark2_Kbrimington()
        {
            var lastElements = l.TakeLastKbrimington(3).ToList();
        }

        [Benchmark]
        public void Benchmark3_JamesCurran()
        {
            var lastElements = l.TakeLastJamesCurran(3).ToList();
        }

        [Benchmark]
        public void Benchmark4_Alex()
        {
            var lastElements = l.TakeLastAlex(3).ToList();
        }
    }

Program.cs:

var summary = BenchmarkRunner.Run(typeof(TakeLastBenchmark).Assembly);

Command dotnet run --project .\TestsConsole2.csproj -c Release --logBuildOutput

The results were following:

// * Summary *

BenchmarkDotNet=v0.13.2, OS=Windows 10 (10.0.19044.1889/21H2/November2021Update) AMD Ryzen 5 5600X, 1 CPU, 12 logical and 6 physical cores .NET SDK=6.0.401 [Host] : .NET 6.0.9 (6.0.922.41905), X64 RyuJIT AVX2 DefaultJob : .NET 6.0.9 (6.0.922.41905), X64 RyuJIT AVX2

Method	N	Mean	Error	StdDev	Gen0	Gen1	Allocated
Benchmark1_MarkByers	10000	89,390.53 ns	1,735.464 ns	1,704.457 ns	-	-	248 B
Benchmark2_Kbrimington	10000	46.15 ns	0.410 ns	0.363 ns	0.0076	-	128 B
Benchmark3_JamesCurran	10000	2,703.15 ns	46.298 ns	67.862 ns	4.7836	0.0038	80264 B
Benchmark4_Alex	10000	2,513.48 ns	48.661 ns	45.517 ns	4.7607	-	80152 B

Turns out the solution proposed by @Kbrimington to be the most efficient in terms of memory alloc as well as raw performance.

Upvotes: 3

Philipp P

Reputation: 619

Honestly I'm not super proud of the answer, but for small collections you could use the following:

var lastN = collection.Reverse().Take(n).Reverse();

A bit hacky but it does the job ;)

Upvotes: 2

Krishan Kaliraman

Reputation: 19

//detailed code for the problem
//suppose we have a enumerable collection 'collection'
var lastIndexOfCollection=collection.Count-1 ;
var nthIndexFromLast= lastIndexOfCollection- N;

var desiredCollection=collection.GetRange(nthIndexFromLast, N);
---------------------------------------------------------------------

// use this one liner
var desiredCollection=collection.GetRange((collection.Count-(1+N)), N);

Upvotes: -1

Lasse V. Karlsen

Reputation: 391724

Note: I missed your question title which said Using Linq, so my answer does not in fact use Linq.

If you want to avoid caching a non-lazy copy of the entire collection, you could write a simple method that does it using a linked list.

The following method will add each value it finds in the original collection into a linked list, and trim the linked list down to the number of items required. Since it keeps the linked list trimmed to this number of items the entire time through iterating through the collection, it will only keep a copy of at most N items from the original collection.

It does not require you to know the number of items in the original collection, nor iterate over it more than once.

Usage:

IEnumerable<int> sequence = Enumerable.Range(1, 10000);
IEnumerable<int> last10 = sequence.TakeLast(10);
...

Extension method:

public static class Extensions
{
    public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> collection,
        int n)
    {
        if (collection == null)
            throw new ArgumentNullException(nameof(collection));
        if (n < 0)
            throw new ArgumentOutOfRangeException(nameof(n), $"{nameof(n)} must be 0 or greater");

        LinkedList<T> temp = new LinkedList<T>();

        foreach (var value in collection)
        {
            temp.AddLast(value);
            if (temp.Count > n)
                temp.RemoveFirst();
        }

        return temp;
    }
}

Upvotes: 56

Ray

Reputation: 192486

.NET Core 2.0+ provides the LINQ method TakeLast():

https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.takelast

example:

Enumerable
    .Range(1, 10)
    .TakeLast(3) // <--- takes last 3 items
    .ToList()
    .ForEach(i => System.Console.WriteLine(i))

// outputs:
// 8
// 9
// 10

Upvotes: 76

Romasz

Reputation: 29790

Little different implementation with usage of circular buffer. The benchmarks show that the method is circa two times faster than ones using Queue (implementation of TakeLast in System.Linq), however not without a cost - it needs a buffer which grows along with the requested number of elements, even if you have a small collection you can get huge memory allocation.

public IEnumerable<T> TakeLast<T>(IEnumerable<T> source, int count)
{
    int i = 0;

    if (count < 1)
        yield break;

    if (source is IList<T> listSource)
    {
        if (listSource.Count < 1)
            yield break;

        for (i = listSource.Count < count ? 0 : listSource.Count - count; i < listSource.Count; i++)
            yield return listSource[i];

    }
    else
    {
        bool move = true;
        bool filled = false;
        T[] result = new T[count];

        using (var enumerator = source.GetEnumerator())
            while (move)
            {
                for (i = 0; (move = enumerator.MoveNext()) && i < count; i++)
                    result[i] = enumerator.Current;

                filled |= move;
            }

        if (filled)
            for (int j = i; j < count; j++)
                yield return result[j];

        for (int j = 0; j < i; j++)
            yield return result[j];

    }
}

Upvotes: 0

kbrimington

Reputation: 25692

collection.Skip(Math.Max(0, collection.Count() - N));

This approach preserves item order without a dependency on any sorting, and has broad compatibility across several LINQ providers.

It is important to take care not to call Skip with a negative number. Some providers, such as the Entity Framework, will produce an ArgumentException when presented with a negative argument. The call to Math.Max avoids this neatly.

The class below has all of the essentials for extension methods, which are: a static class, a static method, and use of the this keyword.

public static class MiscExtensions
{
    // Ex: collection.TakeLast(5);
    public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> source, int N)
    {
        return source.Skip(Math.Max(0, source.Count() - N));
    }
}

A brief note on performance:

Because the call to Count() can cause enumeration of certain data structures, this approach has the risk of causing two passes over the data. This isn't really a problem with most enumerables; in fact, optimizations exist already for Lists, Arrays, and even EF queries to evaluate the Count() operation in O(1) time.

If, however, you must use a forward-only enumerable and would like to avoid making two passes, consider a one-pass algorithm like Lasse V. Karlsen or Mark Byers describe. Both of these approaches use a temporary buffer to hold items while enumerating, which are yielded once the end of the collection is found.

Upvotes: 516

Ali asghar Fendereski

Reputation: 157

Using This Method To Get All Range Without Error

 public List<T> GetTsRate( List<T> AllT,int Index,int Count)
        {
            List<T> Ts = null;
            try
            {
                Ts = AllT.ToList().GetRange(Index, Count);
            }
            catch (Exception ex)
            {
                Ts = AllT.Skip(Index).ToList();
            }
            return Ts ;
        }

Upvotes: -1

tigrou

Reputation: 4526

I tried to combine efficiency and simplicity and end up with this :

public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> source, int count)
{
    if (source == null) { throw new ArgumentNullException("source"); }

    Queue<T> lastElements = new Queue<T>();
    foreach (T element in source)
    {
        lastElements.Enqueue(element);
        if (lastElements.Count > count)
        {
            lastElements.Dequeue();
        }
    }

    return lastElements;
}

About performance : In C#, Queue<T> is implemented using a circular buffer so there is no object instantiation done each loop (only when the queue is growing up). I did not set queue capacity (using dedicated constructor) because someone might call this extension with count = int.MaxValue . For extra performance you might check if source implement IList<T> and if yes, directly extract the last values using array indexes.

Upvotes: 4

Jonathan Gilbert

Reputation: 3870

Here's my solution:

public static class EnumerationExtensions
{
    public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> input, int count)
    {
        if (count <= 0)
            yield break;

        var inputList = input as IList<T>;

        if (inputList != null)
        {
            int last = inputList.Count;
            int first = last - count;

            if (first < 0)
                first = 0;

            for (int i = first; i < last; i++)
                yield return inputList[i];
        }
        else
        {
            // Use a ring buffer. We have to enumerate the input, and we don't know in advance how many elements it will contain.
            T[] buffer = new T[count];

            int index = 0;

            count = 0;

            foreach (T item in input)
            {
                buffer[index] = item;

                index = (index + 1) % buffer.Length;
                count++;
            }

            // The index variable now points at the next buffer entry that would be filled. If the buffer isn't completely
            // full, then there are 'count' elements preceding index. If the buffer *is* full, then index is pointing at
            // the oldest entry, which is the first one to return.
            //
            // If the buffer isn't full, which means that the enumeration has fewer than 'count' elements, we'll fix up
            // 'index' to point at the first entry to return. That's easy to do; if the buffer isn't full, then the oldest
            // entry is the first one. :-)
            //
            // We'll also set 'count' to the number of elements to be returned. It only needs adjustment if we've wrapped
            // past the end of the buffer and have enumerated more than the original count value.

            if (count < buffer.Length)
                index = 0;
            else
                count = buffer.Length;

            // Return the values in the correct order.
            while (count > 0)
            {
                yield return buffer[index];

                index = (index + 1) % buffer.Length;
                count--;
            }
        }
    }

    public static IEnumerable<T> SkipLast<T>(this IEnumerable<T> input, int count)
    {
        if (count <= 0)
            return input;
        else
            return input.SkipLastIter(count);
    }

    private static IEnumerable<T> SkipLastIter<T>(this IEnumerable<T> input, int count)
    {
        var inputList = input as IList<T>;

        if (inputList != null)
        {
            int first = 0;
            int last = inputList.Count - count;

            if (last < 0)
                last = 0;

            for (int i = first; i < last; i++)
                yield return inputList[i];
        }
        else
        {
            // Aim to leave 'count' items in the queue. If the input has fewer than 'count'
            // items, then the queue won't ever fill and we return nothing.

            Queue<T> elements = new Queue<T>();

            foreach (T item in input)
            {
                elements.Enqueue(item);

                if (elements.Count > count)
                    yield return elements.Dequeue();
            }
        }
    }
}

The code is a bit chunky, but as a drop-in reusable component, it should perform as well as it can in most scenarios, and it'll keep the code that's using it nice and concise. :-)

My TakeLast for non-IList`1 is based on the same ring buffer algorithm as that in the answers by @Mark Byers and @MackieChan further up. It's interesting how similar they are -- I wrote mine completely independently. Guess there's really just one way to do a ring buffer properly. :-)

Looking at @kbrimington's answer, an additional check could be added to this for IQuerable<T> to fall back to the approach that works well with Entity Framework -- assuming that what I have at this point does not.

Upvotes: 1

Sasha

Reputation: 853

I know it's to late to answer this question. But if you are working with collection of type IList<> and you don't care about an order of the returned collection, then this method is working faster. I've used Mark Byers answer and made a little changes. So now method TakeLast is:

public static IEnumerable<T> TakeLast<T>(IList<T> source, int takeCount)
{
    if (source == null) { throw new ArgumentNullException("source"); }
    if (takeCount < 0) { throw new ArgumentOutOfRangeException("takeCount", "must not be negative"); }
    if (takeCount == 0) { yield break; }

    if (source.Count > takeCount)
    {
        for (int z = source.Count - 1; takeCount > 0; z--)
        {
            takeCount--;
            yield return source[z];
        }
    }
    else
    {
        for(int i = 0; i < source.Count; i++)
        {
            yield return source[i];
        }
    }
}

For test I have used Mark Byers method and kbrimington's andswer. This is test:

IList<int> test = new List<int>();
for(int i = 0; i<1000000; i++)
{
    test.Add(i);
}

Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();

IList<int> result = TakeLast(test, 10).ToList();

stopwatch.Stop();

Stopwatch stopwatch1 = new Stopwatch();
stopwatch1.Start();

IList<int> result1 = TakeLast2(test, 10).ToList();

stopwatch1.Stop();

Stopwatch stopwatch2 = new Stopwatch();
stopwatch2.Start();

IList<int> result2 = test.Skip(Math.Max(0, test.Count - 10)).Take(10).ToList();

stopwatch2.Stop();

And here are results for taking 10 elements:

enter image description here

and for taking 1000001 elements results are: enter image description here

Upvotes: 1

bradgonesurfing

Reputation: 32202

It is a little inefficient to take the last N of a collection using LINQ as all the above solutions require iterating across the collection. TakeLast(int n) in System.Interactive also has this problem.

If you have a list a more efficient thing to do is slice it using the following method

/// Select from start to end exclusive of end using the same semantics
/// as python slice.
/// <param name="list"> the list to slice</param>
/// <param name="start">The starting index</param>
/// <param name="end">The ending index. The result does not include this index</param>
public static List<T> Slice<T>
(this IReadOnlyList<T> list, int start, int? end = null)
{
    if (end == null)
    {
        end = list.Count();
    }
     if (start < 0)
    {
        start = list.Count + start;
    }
     if (start >= 0 && end.Value > 0 && end.Value > start)
    {
        return list.GetRange(start, end.Value - start);
    }
     if (end < 0)
    {
        return list.GetRange(start, (list.Count() + end.Value) - start);
    }
     if (end == start)
    {
        return new List<T>();
    }
     throw new IndexOutOfRangeException(
        "count = " + list.Count() + 
        " start = " + start +
        " end = " + end);
}

with

public static List<T> GetRange<T>( this IReadOnlyList<T> list, int index, int count )
{
    List<T> r = new List<T>(count);
    for ( int i = 0; i < count; i++ )
    {
        int j=i + index;
        if ( j >= list.Count )
        {
            break;
        }
        r.Add(list[j]);
    }
    return r;
}

and some test cases

[Fact]
public void GetRange()
{
    IReadOnlyList<int> l = new List<int>() { 0, 10, 20, 30, 40, 50, 60 };
     l
        .GetRange(2, 3)
        .ShouldAllBeEquivalentTo(new[] { 20, 30, 40 });
     l
        .GetRange(5, 10)
        .ShouldAllBeEquivalentTo(new[] { 50, 60 });

}
 [Fact]
void SliceMethodShouldWork()
{
    var list = new List<int>() { 1, 3, 5, 7, 9, 11 };
    list.Slice(1, 4).ShouldBeEquivalentTo(new[] { 3, 5, 7 });
    list.Slice(1, -2).ShouldBeEquivalentTo(new[] { 3, 5, 7 });
    list.Slice(1, null).ShouldBeEquivalentTo(new[] { 3, 5, 7, 9, 11 });
    list.Slice(-2)
        .Should()
        .BeEquivalentTo(new[] {9, 11});
     list.Slice(-2,-1 )
        .Should()
        .BeEquivalentTo(new[] {9});
}

Upvotes: 1

s.m.

Reputation: 8043

If using a third-party library is an option, MoreLinq defines TakeLast() which does exactly this.

Upvotes: 3

ALT

Reputation: 1242

Below the real example how to take last 3 elements from a collection (array):

// split address by spaces into array
string[] adrParts = adr.Split(new string[] { " " },StringSplitOptions.RemoveEmptyEntries);
// take only 3 last items in array
adrParts = adrParts.SkipWhile((value, index) => { return adrParts.Length - index > 3; }).ToArray();

Upvotes: 0

dav_i

Reputation: 28157

If you are dealing with a collection with a key (e.g. entries from a database) a quick (i.e. faster than the selected answer) solution would be

collection.OrderByDescending(c => c.Key).Take(3).OrderBy(c => c.Key);

Upvotes: 7

Mark Byers

Reputation: 839224

Here's a method that works on any enumerable but uses only O(N) temporary storage:

public static class TakeLastExtension
{
    public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> source, int takeCount)
    {
        if (source == null) { throw new ArgumentNullException("source"); }
        if (takeCount < 0) { throw new ArgumentOutOfRangeException("takeCount", "must not be negative"); }
        if (takeCount == 0) { yield break; }

        T[] result = new T[takeCount];
        int i = 0;

        int sourceCount = 0;
        foreach (T element in source)
        {
            result[i] = element;
            i = (i + 1) % takeCount;
            sourceCount++;
        }

        if (sourceCount < takeCount)
        {
            takeCount = sourceCount;
            i = 0;
        }

        for (int j = 0; j < takeCount; ++j)
        {
            yield return result[(i + j) % takeCount];
        }
    }
}

Usage:

List<int> l = new List<int> {4, 6, 3, 6, 2, 5, 7};
List<int> lastElements = l.TakeLast(3).ToList();

It works by using a ring buffer of size N to store the elements as it sees them, overwriting old elements with new ones. When the end of the enumerable is reached the ring buffer contains the last N elements.

Upvotes: 33

Nick Babcock

Reputation: 6141

I am surprised that no one has mentioned it, but SkipWhile does have a method that uses the element's index.

public static IEnumerable<T> TakeLastN<T>(this IEnumerable<T> source, int n)
{
    if (source == null)
        throw new ArgumentNullException("Source cannot be null");

    int goldenIndex = source.Count() - n;
    return source.SkipWhile((val, index) => index < goldenIndex);
}

//Or if you like them one-liners (in the spirit of the current accepted answer);
//However, this is most likely impractical due to the repeated calculations
collection.SkipWhile((val, index) => index < collection.Count() - N)

The only perceivable benefit that this solution presents over others is that you can have the option to add in a predicate to make a more powerful and efficient LINQ query, instead of having two separate operations that traverse the IEnumerable twice.

public static IEnumerable<T> FilterLastN<T>(this IEnumerable<T> source, int n, Predicate<T> pred)
{
    int goldenIndex = source.Count() - n;
    return source.SkipWhile((val, index) => index < goldenIndex && pred(val));
}

Upvotes: 12

piers7

Reputation: 4424

Use EnumerableEx.TakeLast in RX's System.Interactive assembly. It's an O(N) implementation like @Mark's, but it uses a queue rather than a ring-buffer construct (and dequeues items when it reaches buffer capacity).

(NB: This is the IEnumerable version - not the IObservable version, though the implementation of the two is pretty much identical)

Upvotes: 9

Richard Szalay

Reputation: 84824

If you don't mind dipping into Rx as part of the monad, you can use TakeLast:

IEnumerable<int> source = Enumerable.Range(1, 10000);

IEnumerable<int> lastThree = source.AsObservable().TakeLast(3).AsEnumerable();

Upvotes: 5

Using Linq to get the last N elements of a collection?

Answers (21)

Related Questions