Reputation: 23

C# linq remove duplicates from the top and bottom of the list and keep the duplicates in the middle

For example,

  var myArray = new[] { 
    1, 1, 2, 2, 3, 4, 5, 5, 9, 9 
  };
  
  List<int> myList = myArray.ToList();

The expected output, after removing the duplicates at the top and bottom, is the following list:

{ 2, 2, 3, 4, 5, 5 };

Please advise how to implement this logic. I tried myList.Distinct(), but it doesn't help because it also removes the duplicates in the middle.

**EDIT: The list will never be empty. Regardless of whether there are duplicates at the top and bottom, the first and last records must be removed for a business-logic calculation. If duplicates of them are found at the top or bottom, those must be removed as well. The list will be sorted in ascending order before the removal is performed.**

Upvotes: 0

Answers (4)

Astrid E.

Reputation: 2872

To remove the duplicates at the beginning of the list, you can use .First() and .SkipWhile() from the System.Linq namespace:

var firstDuplicate = myList.First();

var listWithoutFirstDuplicate = myList
    .SkipWhile(l => l == firstDuplicate)
    .ToList();

To remove the duplicates at the end of the list, you could use .Last() and .SkipLastWhile(), if .SkipLastWhile() existed in System.Linq:

var lastDuplicate = myList.Last();

var listWithoutLastDuplicate = myList
    .SkipLastWhile(l => l == lastDuplicate)
    .ToList();

Paulo Morgado has suggested an implementation of .SkipLastWhile() in this blog post. It looks like this:

public static IEnumerable<TSource> SkipLastWhile<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
    var buffer = new List<TSource>();

    foreach (var item in source)
    {
        if (predicate(item))
        {
            buffer.Add(item);
        }
        else
        {
            if (buffer.Count > 0)
            {
                foreach (var bufferedItem in buffer)
                {
                    yield return bufferedItem;
                }

                buffer.Clear();
            }

            yield return item;
        }
    }
}

Paulo's implementation of .SkipLastWhile() checks each element against the predicate. If the predicate is not fulfilled for an element, the element is returned. If the predicate is fulfilled (i.e. in your scenario: if the element is equal to lastDuplicate), the element is not immediately returned, but rather added to a buffer. The contents of the buffer are only returned if a succeeding element does not fulfill the predicate.

A few examples:

new[] { 1, 9, 9, 9, 9 }.SkipLastWhile(i => i == 9)

will return { 1 }.
The buffer will build up from { 9 } (the element at index 1) to { 9, 9, 9, 9 } (the elements from index 1 to index 4), and will not be returned.

new[] { 1, 9, 9, 9, 9, 1 }.SkipLastWhile(i => i == 1)

will return { 1, 9, 9, 9, 9 }.
The buffer will first be { 1 } (the element at index 0), then be returned and emptied when 9 is found (at index 1). When reaching the last element, the buffer will again be { 1 }, and it is not returned.

new[] { 9, 1, 9, 9, 9, 9 }.SkipLastWhile(i => i == 9)

will return { 9, 1 }.
The buffer will first be { 9 } (the element at index 0), then be returned and emptied when 1 is found (at index 1). When reaching 9 (at index 2), the buffer starts building up again, starting with { 9 }. At the last element, the buffer will be { 9, 9, 9, 9 }, and its contents are not returned.
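
The three examples above can be checked with a small console program. This is only a sketch: the SkipLastWhile() body repeats the buffering approach quoted earlier, while the class names EnumerableExtensions and Program are placeholders chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    // Same buffering idea as the SkipLastWhile() quoted above.
    public static IEnumerable<TSource> SkipLastWhile<TSource>(
        this IEnumerable<TSource> source, Func<TSource, bool> predicate)
    {
        var buffer = new List<TSource>();

        foreach (var item in source)
        {
            if (predicate(item))
            {
                // Hold the item back: it may belong to the trailing run.
                buffer.Add(item);
            }
            else
            {
                // The buffered run was not trailing after all, so release it.
                foreach (var bufferedItem in buffer)
                {
                    yield return bufferedItem;
                }

                buffer.Clear();
                yield return item;
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var a = new[] { 1, 9, 9, 9, 9 }.SkipLastWhile(i => i == 9);
        var b = new[] { 1, 9, 9, 9, 9, 1 }.SkipLastWhile(i => i == 1);
        var c = new[] { 9, 1, 9, 9, 9, 9 }.SkipLastWhile(i => i == 9);

        Console.WriteLine(string.Join(", ", a)); // 1
        Console.WriteLine(string.Join(", ", b)); // 1, 9, 9, 9, 9
        Console.WriteLine(string.Join(", ", c)); // 9, 1
    }
}
```

Note that the buffer grows with the longest run of matching elements, so the method stays lazy but is not constant-memory.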

Using this implementation of SkipLastWhile(), you can get your filtered list:

var myList = new List<int> { 1, 1, 9, 2, 2, 3, 9, 4, 5, 5, 1, 9, 9 };

var firstDuplicate = myList.First();
var lastDuplicate = myList.Last();

var myFilteredList = myList
    .SkipWhile(l => l == firstDuplicate)
    .SkipLastWhile(l => l == lastDuplicate)
    .ToList();

The output for the given myList is the following:

9, 2, 2, 3, 9, 4, 5, 5, 1

As pointed out by Dmitry Bychenko, there are two problems with this implementation:

If myList is null or empty ({ }), an exception is thrown
- none of the extension methods can tolerate being called on null
- .First() (and .Last()) throw an exception when being called on an empty list
If the first element is not a duplicate (i.e. not equal to the second element), the first element will nonetheless be excluded; that is not expected behavior. Similarly, the last element will also be excluded, even if it is not a duplicate.

One way to address these problems is to make checks along the way before filtering out actual duplicates. If handled in a method, it could be implemented like this:

public static List<int> GetFilteredList(List<int> list)
{
    // if list is null; return null
    if (list == null)
    {
        return null;
    }
    
    // If list is empty or contains only one element; return list as new list
    if (!list.Skip(1).Any())
    {
        return list.ToList();
    }
    
    var filtered = list.AsEnumerable();
    
    // Remove duplicates at beginning of list (if any)
    if (list.First() == list.Skip(1).First())
    {
        filtered = filtered.SkipWhile(l => l == list.First());
    }
    
    // Remove duplicates at end of list (if any)
    if (list.Last() == list.SkipLast(1).Last())
    {
        filtered = filtered.SkipLastWhile(l => l == list.Last());
    }
    
    return filtered.ToList();
}

It may be called as follows:

var filteredList = GetFilteredList(myList);
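
To see how the guarded version behaves on edge cases, here is a self-contained sketch. It is a condensed variant rather than the exact method above: it reads the elements through the list indexer instead of First(), Last() and SkipLast(), and the class names ListFilter and Program are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListFilter
{
    // Buffering SkipLastWhile(), following the implementation quoted earlier.
    public static IEnumerable<T> SkipLastWhile<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        var buffer = new List<T>();
        foreach (var item in source)
        {
            if (predicate(item)) { buffer.Add(item); continue; }
            foreach (var buffered in buffer) yield return buffered;
            buffer.Clear();
            yield return item;
        }
    }

    // Condensed variant of GetFilteredList(): same checks, indexers instead of LINQ lookups.
    public static List<int> GetFilteredList(List<int> list)
    {
        if (list == null) return null;
        if (list.Count < 2) return list.ToList(); // nothing can be a duplicate

        int first = list[0];
        int last = list[list.Count - 1];
        IEnumerable<int> filtered = list;

        // Trim the leading run only if the first element really is duplicated.
        if (first == list[1])
            filtered = filtered.SkipWhile(x => x == first);

        // Trim the trailing run only if the last element really is duplicated.
        if (last == list[list.Count - 2])
            filtered = filtered.SkipLastWhile(x => x == last);

        return filtered.ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        Print(new List<int> { 1, 1, 2, 2, 3, 4, 5, 5, 9, 9 }); // 2, 2, 3, 4, 5, 5
        Print(new List<int> { 1, 2, 2, 3 });                   // 1, 2, 2, 3 (no edge duplicates)
        Print(new List<int> { 7, 7, 7 });                      // prints an empty line
        Print(new List<int>());                                // prints an empty line
    }

    private static void Print(List<int> input) =>
        Console.WriteLine(string.Join(", ", ListFilter.GetFilteredList(input)));
}
```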

Example fiddle here.

Upvotes: 2

Dmitrii Bychenko

Reputation: 186803

If you want to remove duplicates just from the very top and bottom of the collection, but want to keep the same values in the middle:

1, 1, 2, 3, 1, 1, 8, 8, 9, 9, 4, 5, 9, 9 => 2, 3, 1, 1, 8, 8, 9, 9, 4, 5
                                                  ^  ^     ^  ^
                                                    preserved

you can compute the left and right borders and then trim the collection with the help of Skip and Take:

      int[] myArray = new int[] { 
        ... 
      };

      ...

      int left = 0;

      for (int i = 1; i < myArray.Length; ++i)
        if (myArray[i - 1] == myArray[i])
          left = i + 1;
        else
          break;

      int right = myArray.Length - 1;

      for (int i = myArray.Length - 2; i >= 0; --i)
        if (myArray[i + 1] == myArray[i])
          right = i - 1;
        else
          break;

      List<int> myList = myArray
        .Skip(left)
        .Take(right - left + 1)
        .ToList();
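
For a quick check, the border computation can be wrapped in a method and run against the example data. This is a sketch under the assumption that the input is an int[]; the names EdgeTrimmer and TrimEdgeDuplicates are hypothetical and not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EdgeTrimmer
{
    // Hypothetical helper: the same left/right border computation, wrapped in a method.
    public static List<int> TrimEdgeDuplicates(int[] items)
    {
        // Advance the left border past the leading run of equal values (if any).
        int left = 0;
        for (int i = 1; i < items.Length; ++i)
        {
            if (items[i - 1] != items[i]) break;
            left = i + 1;
        }

        // Pull the right border in front of the trailing run of equal values (if any).
        int right = items.Length - 1;
        for (int i = items.Length - 2; i >= 0; --i)
        {
            if (items[i + 1] != items[i]) break;
            right = i - 1;
        }

        // Take() yields nothing for a zero or negative count, so crossed borders are safe.
        return items.Skip(left).Take(right - left + 1).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var sorted = new[] { 1, 1, 2, 2, 3, 4, 5, 5, 9, 9 };
        Console.WriteLine(string.Join(", ", EdgeTrimmer.TrimEdgeDuplicates(sorted)));
        // 2, 2, 3, 4, 5, 5

        var unsorted = new[] { 1, 1, 2, 3, 1, 1, 8, 8, 9, 9, 4, 5, 9, 9 };
        Console.WriteLine(string.Join(", ", EdgeTrimmer.TrimEdgeDuplicates(unsorted)));
        // 2, 3, 1, 1, 8, 8, 9, 9, 4, 5
    }
}
```
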

If you want to remove every occurrence of the values that appear as duplicates at the top or bottom, including occurrences in the middle:

1, 1, 2, 3, 1, 1, 8, 8, 9, 9, 4, 5, 9, 9 => 2, 3, 8, 8, 4, 5

you can collect these values and then filter out:

      int[] myArray = new int[] { 
        ... 
      };

      ...

      HashSet<int> remove = new HashSet<int>();

      if (myArray.Length > 1) {
        if (myArray[0] == myArray[1])
          remove.Add(myArray[0]);
        if (myArray[myArray.Length - 1] == myArray[myArray.Length - 2])
          remove.Add(myArray[myArray.Length - 1]);
      }

      var myList = myArray
        .Where(item => !remove.Contains(item))
        .ToList();
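
The same filter as a runnable sketch, again assuming an int[] input. EdgeValueFilter and RemoveEdgeDuplicateValues are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EdgeValueFilter
{
    // Hypothetical helper: drops every occurrence of a value that is duplicated at either edge.
    public static List<int> RemoveEdgeDuplicateValues(int[] items)
    {
        var remove = new HashSet<int>();

        if (items.Length > 1)
        {
            // The first value counts as a duplicate only if the second element repeats it.
            if (items[0] == items[1])
                remove.Add(items[0]);

            // Likewise for the last value and the element before it.
            if (items[items.Length - 1] == items[items.Length - 2])
                remove.Add(items[items.Length - 1]);
        }

        return items.Where(item => !remove.Contains(item)).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var input = new[] { 1, 1, 2, 3, 1, 1, 8, 8, 9, 9, 4, 5, 9, 9 };
        Console.WriteLine(string.Join(", ", EdgeValueFilter.RemoveEdgeDuplicateValues(input)));
        // 2, 3, 8, 8, 4, 5

        var noEdgeDuplicates = new[] { 1, 2, 2, 3 };
        Console.WriteLine(string.Join(", ", EdgeValueFilter.RemoveEdgeDuplicateValues(noEdgeDuplicates)));
        // 1, 2, 2, 3
    }
}
```
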

Please, fiddle

Upvotes: 2

Abolfazl Moslemian

Reputation: 173

var myList = new List<int> { 1, 1, 2, 2, 3, 4, 5, 5, 9, 9 };
var topDuplicate = myList.First();
var lastDuplicate = myList.Last();

// RemoveAll() deletes every occurrence of the value, not only those at the edges.
myList.RemoveAll(l => l == topDuplicate);
myList.RemoveAll(l => l == lastDuplicate);
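
One point worth spelling out: RemoveAll() removes every occurrence of the first and last values, wherever they sit. Since the question states that the list is sorted ascending, equal values are contiguous and the result matches the expected output. On unsorted input, matching values in the middle would be removed too. The sketch below illustrates both cases; RemoveEdgeValues is a hypothetical wrapper (it merges the two RemoveAll calls into one predicate) and adds an empty-list guard that the snippet above does not have.

```csharp
using System;
using System.Collections.Generic;

public static class Program
{
    // Hypothetical wrapper: removes every occurrence of the first and last values.
    public static List<int> RemoveEdgeValues(IEnumerable<int> source)
    {
        var list = new List<int>(source);
        if (list.Count == 0) return list; // guard added for this sketch

        // Capture both edge values before anything is removed.
        int top = list[0];
        int bottom = list[list.Count - 1];

        list.RemoveAll(x => x == top || x == bottom);
        return list;
    }

    public static void Main()
    {
        // Sorted input: only the edge runs contain 1 and 9.
        Console.WriteLine(string.Join(", ",
            RemoveEdgeValues(new[] { 1, 1, 2, 2, 3, 4, 5, 5, 9, 9 })));
        // 2, 2, 3, 4, 5, 5

        // Unsorted input: the 9s and the 1 in the middle are removed as well.
        Console.WriteLine(string.Join(", ",
            RemoveEdgeValues(new[] { 1, 1, 9, 2, 2, 3, 9, 4, 5, 5, 1, 9, 9 })));
        // 2, 2, 3, 4, 5, 5
    }
}
```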

Upvotes: 1

All_Languages

Reputation: 23

    int topDuplicate = myList[0];
    myList.RemoveAll(x => x == topDuplicate);
    // Guard: the list is empty here if every element was equal to topDuplicate.
    if (myList.Count > 0)
    {
        int bottomDuplicate = myList[myList.Count - 1];
        myList.RemoveAll(x => x == bottomDuplicate);
    }

Upvotes: 0
