user3738277

Reputation: 111

Removing duplicates from a Dictionary

        // Remove duplicates from the Dictionary: for each distinct value,
        // keep the first key and remove all the others.
        var removables = data.ToLookup(x => x.Value, x => x.Key)
            .SelectMany(x => x.Skip(1)).ToList();
        foreach (var key in removables)
            data.Remove(key);

This code works well with the following input (data):

102030;"http://xxx.yyy.com/102030.ashx"
102030;"http://xxx.yyy.com/102030_x.ashx"

Here, 102030;"http://xxx.yyy.com/102030_x.ashx" is removed, as intended.

But when I give this input:

102030;"http://xxx.yyy.com/102030_x.ashx"
102030;"http://xxx.yyy.com/102030.ashx"

Now 102030;"http://xxx.yyy.com/102030.ashx" is removed, but I only want to remove the items containing '_'.

How can I solve this problem? Is it possible to sort the input by length, or to adjust the LINQ query?

Upvotes: 2

Views: 111

Answers (3)

user3738277

Reputation: 111

Thank you very much for your solutions.

I ended up with the following:

        var removables = dict.OrderBy(x => x.Key)
            .ToLookup(x => x.Value, x => x.Key)
            .SelectMany(x => x.Skip(1))
            .ToList();
        foreach (var key in removables)
            dict.Remove(key);

I only added ordering by Key, and now the set is ordered correctly, so the right entry is kept :-)
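One caveat with ordering by Key: for strings, `OrderBy` uses a culture-sensitive comparer by default, so where '_' sorts relative to '.' can depend on the runtime and the current culture. A minimal sketch of the same idea with an explicit ordinal comparer follows. It assumes the dictionary is a `Dictionary<string, int>` keyed by URL (the question does not state the types), and the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrdinalDedup
{
    // Hypothetical helper: for each distinct value, keeps the key that sorts
    // first in ordinal order and removes every other key with that value.
    public static void RemoveDuplicateValues(Dictionary<string, int> dict)
    {
        var removables = dict
            .OrderBy(x => x.Key, StringComparer.Ordinal) // ordinal: '.' (U+002E) < '_' (U+005F)
            .GroupBy(x => x.Value, x => x.Key)           // each group keeps the sorted order
            .SelectMany(g => g.Skip(1))                  // everything except the first key
            .ToList();                                   // materialize before mutating dict

        foreach (var key in removables)
            dict.Remove(key);
    }
}
```

With ordinal comparison, "…/102030.ashx" sorts before "…/102030_x.ashx" because the two strings first differ at '.' versus '_', so the entry without the underscore is the one that survives regardless of insertion order.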

Thank you for your comments on this solution.

Upvotes: 0

Marco

Reputation: 23937

If Mark Shevchenko's answer doesn't float your boat for whatever reason, you can very well sort by length if you want to.

I've created a dummy data source of type List<KeyValuePair<int, string>>, since a Dictionary doesn't allow duplicate keys.

Removing the duplicates is then straightforward:

  1. Group by Key
  2. Order each group by Value length
  3. Take the first result of every group

    var source = new List<KeyValuePair<int, string>>() {
        new KeyValuePair<int,string>(102030, "http://xxx.yyy.com/102030.ashx"),
        new KeyValuePair<int,string>(102030, "http://xxx.yyy.com/102030_x.ashx"),
        new KeyValuePair<int,string>(102040, "http://xxx.yyy.com/102040_x.ashx"),
        new KeyValuePair<int,string>(102040, "http://xxx.yyy.com/102040.ashx"),
        new KeyValuePair<int,string>(102050, "http://xxx.yyy.com/102050.ashx"),
        new KeyValuePair<int,string>(102050, "http://xxx.yyy.com/102050_x.ashx"),
        new KeyValuePair<int,string>(102060, "http://xxx.yyy.com/102060_y.ashx"),
        new KeyValuePair<int,string>(102060, "http://xxx.yyy.com/102060.ashx")
    };

    source.GroupBy (s => s.Key)                           // 1. group by Key
          .Select(x => x.OrderBy (y => y.Value.Length))   // 2. shortest Value first
          .Select (x => x.First())                        // 3. keep one entry per group
          .Dump();                                        // LINQPad's output helper
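The same three steps can also be applied to a Dictionary in place. The following is a minimal sketch, assuming (as the question's code suggests) a `Dictionary<string, int>` whose keys are the URLs and whose values are the IDs. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LengthDedup
{
    // Hypothetical helper: for each distinct value, keeps the shortest key
    // and removes every other key that maps to the same value.
    public static void KeepShortestKeyPerValue(Dictionary<string, int> data)
    {
        var removables = data
            .GroupBy(x => x.Value, x => x.Key)                      // keys grouped by value
            .SelectMany(g => g.OrderBy(key => key.Length).Skip(1))  // all but the shortest key
            .ToList();                                              // materialize before mutating data

        foreach (var key in removables)
            data.Remove(key);
    }
}
```

Sorting by length only works as long as the unwanted variants are always longer than the plain URL, which holds for the `_x` and `_y` suffixes shown here.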
    

Upvotes: 1

Mark Shevchenko

Reputation: 8197

If you want to remove only the elements with underscores, you shouldn't skip the first element of each group; instead, keep all elements without underscores:

    // Smarter removal: drop every key that contains an underscore and
    // keep all keys without one. The lookup's elements are the keys themselves.
    var removables = data.ToLookup(x => x.Value, x => x.Key)
                         .SelectMany(x => x.Where(k => k.Contains('_')))
                         .ToList();
    foreach (var key in removables)
        data.Remove(key);
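Filtering purely on the underscore also removes an entry whose value has no underscore-free alternative. If one entry per value should always survive, a possible variant is to prefer keys without an underscore and then skip the first key of each group. This is a sketch under the assumption that the dictionary is a `Dictionary<string, int>` keyed by URL; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnderscoreDedup
{
    // Hypothetical helper: keeps exactly one key per distinct value,
    // preferring a key that does not contain an underscore.
    public static void RemoveDuplicateValues(Dictionary<string, int> data)
    {
        var removables = data
            .GroupBy(x => x.Value, x => x.Key)
            // false sorts before true, so keys without '_' come first;
            // OrderBy is stable, so ties keep their original order.
            .SelectMany(g => g.OrderBy(key => key.Contains("_")).Skip(1))
            .ToList(); // materialize before mutating data

        foreach (var key in removables)
            data.Remove(key);
    }
}
```

For the input from the question this keeps "…/102030.ashx" in either insertion order, and a value that only appears with an underscore key keeps that key instead of disappearing.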

Upvotes: 1
