dotnetdevcsharp

Reputation: 3980

Removing duplicates that appear more than once

I'm practising LINQ and trying the usual exercise of removing every item that appears more than once, so that only the items occurring exactly once remain. In the example below I have:

Word1 Word2 Word1

It should print only word2. Here is a fiddle: https://dotnetfiddle.net/P47DGT. Is there an easier or better way of doing this with a lambda?

using System;
using System.Collections.Generic;
using System.Linq;

class Words
{
    public string MyWords { get; set; }
}

public class Program
{
    public static void Main()
    {
        List<Words> _words = new List<Words>();
        _words.Add(new Words() { MyWords = "word1" });
        _words.Add(new Words() { MyWords = "word2" });
        _words.Add(new Words() { MyWords = "word1" });

        RemoveDuplicatesLinq(_words);
    }

    static void RemoveDuplicatesLinq(List<Words> _words)
    {
        List<Words> duplicatesRemoved = _words.GroupBy(x => x)
            .Where(group => group.Count() > 1)
            .Select(group => group.Key)
            .ToList();

        foreach (var item in duplicatesRemoved)
        {
            Console.Write("Words Left " + item.MyWords + "\r\n");
        }
    }
}

It should print just the single occurrence, word2, but it doesn't.

Upvotes: 1

Views: 108

Answers (2)

Pavel Anikhouski

Reputation: 23218

One option is to wrap the MyWords property in an anonymous type (anonymous types are compared by property values, not by reference), then select the groups whose count equals 1. Note that the result is then a list of the anonymous keys rather than of Words objects.

var duplicatesRemoved = _words.GroupBy(x => new { x.MyWords })
    .Where(group => group.Count() == 1)
    .Select(group => group.Key).ToList();

foreach (var item in duplicatesRemoved)
{
    Console.Write("Words Left " + item.MyWords + "\r\n");
}

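In your code, the condition group.Count() > 1 selects the groups that contain duplicates, but you need the opposite. Also, GroupBy(x => x) groups by reference because Words does not override Equals and GetHashCode, so every object ends up in its own group.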
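
To make the equality difference concrete, here is a minimal sketch that counts the groups produced by each kind of key. The names GroupingDemo, GroupsByReference and GroupsByValue are made up for this illustration; the Words class mirrors the one from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Words
{
    public string MyWords { get; set; }
}

// Hypothetical demo class, not part of the original code.
static class GroupingDemo
{
    // Key is the object itself. Words does not override Equals/GetHashCode,
    // so keys are compared by reference and every instance is its own group.
    public static int GroupsByReference(List<Words> words) =>
        words.GroupBy(x => x).Count();

    // Key is an anonymous type, which is compared by its property values,
    // so instances with the same MyWords land in the same group.
    public static int GroupsByValue(List<Words> words) =>
        words.GroupBy(x => new { x.MyWords }).Count();

    public static void Main()
    {
        var words = new List<Words>
        {
            new Words { MyWords = "word1" },
            new Words { MyWords = "word2" },
            new Words { MyWords = "word1" }
        };

        Console.WriteLine(GroupsByReference(words)); // 3
        Console.WriteLine(GroupsByValue(words));     // 2
    }
}
```

With three separate groups of size 1, the original Count() > 1 filter matches nothing, which is why the question's code prints no words at all.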

Another option is to group the source list by the MyWords property, keep the groups whose count equals 1, and then select the first item from each of those groups (after the Where clause every remaining group contains exactly one item).

var duplicatesRemoved = _words.GroupBy(x => x.MyWords)
    .Where(group => group.Count() == 1)
    .Select(group => group.First())
    .ToList();

Upvotes: 1

RoadRunner

Reputation: 26315

You should just group by MyWords and keep only the groups of length 1. You can then flatten the groups into an IEnumerable<Words> with System.Linq.Enumerable.SelectMany. Since you seem to want a list, add ToList() to convert the result to a List<Words>.

List<Words> duplicatesRemoved = _words
            .GroupBy(x => x.MyWords)
            .Where(group => group.Count() == 1)
            .SelectMany(group => group)
            .ToList();

foreach (var item in duplicatesRemoved)
{
    Console.Write("Words Left " + item.MyWords + "\r\n");
}

Which outputs:

Words Left word2
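
If the same filter is needed in several places, the group-and-filter pattern could be wrapped in an extension method. This is only a sketch: OccurringOnce and UniqueExtensions are hypothetical names, not part of LINQ, and the example uses plain strings instead of the Words class to stay self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, introduced only for illustration.
static class UniqueExtensions
{
    // Returns the elements whose key occurs exactly once in the source.
    public static IEnumerable<T> OccurringOnce<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        return source
            .GroupBy(keySelector)               // bucket elements by key
            .Where(group => group.Count() == 1) // keep keys seen exactly once
            .Select(group => group.Single());   // unwrap the only element
    }
}

static class OccurringOnceDemo
{
    public static void Main()
    {
        var words = new List<string> { "word1", "word2", "word1" };

        foreach (var word in words.OccurringOnce(w => w))
        {
            Console.WriteLine("Words Left " + word); // Words Left word2
        }
    }
}
```

For the list from the question, the call would be _words.OccurringOnce(x => x.MyWords).ToList().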

Upvotes: 5
