user630190

Reputation: 1162

How to remove duplicates in the middle

Given a sequence like the one below:

var list = new[] {"1a", "1b", "1c", "1d", "2a", "3a", "4a", "4b", "5a", "6a", "7a", "7b", "8a"}.Select(x => new { P1 = x.Substring(0,1), P2 = x.Substring(1,1)});

I'd like to remove the duplicates in the "middle" to end up with:

var expected = new[] {"1a", "1d", "2a", "3a", "4a", "4b", "5a", "6a", "7a", "7b", "8a"}.Select(x => new { P1 = x.Substring(0, 1), P2 = x.Substring(1, 1) });

So for any run of more than two items that share the same first character, the items in the middle are stripped out. It's important that I keep the first and last item of each run, though.

Upvotes: 0

Views: 69

Answers (2)

Hogan

Reputation: 70523

For those who don't want to use Aggregate and would like a very short answer, here is one that uses a closure:

var data = new[] { "1a", "1b", "1c", "1d", "2a", "3a", "4a", "4b", "1e", "5a", "6a", "7a", "7b", "8a" };
char priorKey = ' ';
int currentIndex = 0;

var result2 = data.GroupBy((x) => x[0] == priorKey ? new { k = x[0], g = currentIndex } : new { k = priorKey = x[0], g = ++currentIndex })
    .Select(i => new[] { i.First(), i.Last() }.Distinct())
    .SelectMany(i => i).ToArray();

Hat tip to @Slai for the code this is based on (I added a fix for the non-contiguous group issue).


Here is how to do it with Aggregate. I didn't test all the edge cases, just your test cases.

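// Requires: using System.Collections.Generic; using System.Linq;
var list = new[] { "1a", "1b", "1c", "1d", "2a", "3a", "4a", "4b", "5a", "6a", "7a", "7b", "8a" }
    .Aggregate(
        // Accumulator: the results so far plus the first and last item of the current run.
        new { result = new List<string>(), first = "", last = "" },
        (store, given) =>
        {
            var result = store.result;
            var first = store.first;
            var last = store.last;

            if (first == "")
            {
                // This is the very first item.
                first = given;
            }
            else if (first[0] == given[0])
            {
                // Same run: remember the most recent item as the last one.
                last = given;
            }
            else
            {
                // A new run starts: flush the first and last of the previous run.
                result.Add(first);
                if (last != "")
                    result.Add(last);
                first = given;
                last = "";
            }

            return new { result = result, first = first, last = last };
        },
        // Flush the final run (nothing is added when the input was empty).
        store =>
        {
            if (store.first != "") store.result.Add(store.first);
            if (store.last != "") store.result.Add(store.last);
            return store.result;
        })
    .Select(x => new { P1 = x.Substring(0, 1), P2 = x.Substring(1, 1) });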

I create an object that holds the list built so far, plus the first and last items seen so far in the current run.

Then I just apply logic to remove the middle stuff.
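
For anyone who finds the Aggregate version hard to read, the same first-and-last logic could also be written as a plain iterator. This is only a minimal sketch, not tested beyond the cases shown: the class name RunEnds, the method name FirstAndLastOfRuns and the runHasMany flag are made up for illustration, and it assumes that "duplicate" means adjacent items with the same key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RunEnds
{
    // Hypothetical helper: yields the first and last item of each run of
    // adjacent items that share the same key. A run of one item is yielded once.
    public static IEnumerable<T> FirstAndLastOfRuns<T, TKey>(
        IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var comparer = EqualityComparer<TKey>.Default;
        using (var e = source.GetEnumerator())
        {
            if (!e.MoveNext())
                yield break; // empty input yields nothing

            T first = e.Current;
            T last = first;
            TKey key = keySelector(first);
            bool runHasMany = false; // true once the current run has 2+ items

            while (e.MoveNext())
            {
                T item = e.Current;
                TKey itemKey = keySelector(item);
                if (comparer.Equals(itemKey, key))
                {
                    // Same run: keep track of the most recent item.
                    last = item;
                    runHasMany = true;
                }
                else
                {
                    // Key changed: flush the previous run and start a new one.
                    yield return first;
                    if (runHasMany) yield return last;
                    first = item;
                    last = item;
                    key = itemKey;
                    runHasMany = false;
                }
            }

            // Flush the final run.
            yield return first;
            if (runHasMany) yield return last;
        }
    }
}
```

Calling RunEnds.FirstAndLastOfRuns(data, x => x[0]) on the question's input should give the same items as the Aggregate version, and a later, separate run of the same key (like the "1e" in the shorter example above) is kept as its own run.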

Upvotes: 1

Slai

Reputation: 22876

Group by the first character and take the first and last item of each group:

var list = "1a 1b 1c 1d 2a 3a 4a 4b 5a 6a 7a 7b 8a".Split();

var result = list.GroupBy(i => i[0])
    .Select(i => new[] { i.First(), i.Last() }.Distinct())
    .SelectMany(i => i).ToArray();

Debug.Print(string.Join("\", \"", result));
// result: { "1a", "1d", "2a", "3a", "4a", "4b", "5a", "6a", "7a", "7b", "8a" }
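
One caveat worth spelling out: GroupBy collects every item with the same key into a single group, even when the items are not next to each other. The sketch below wraps the same query in a helper (the names GroupByCaveat and FirstAndLastPerKey are hypothetical, chosen only for this illustration) and runs it on an input where a "1e" appears after the "4" items.

```csharp
using System;
using System.Linq;

public static class GroupByCaveat
{
    // Hypothetical wrapper around the query above: first and last item per key.
    public static string[] FirstAndLastPerKey(string[] items)
    {
        return items.GroupBy(i => i[0])
            .SelectMany(g => new[] { g.First(), g.Last() }.Distinct())
            .ToArray();
    }

    public static void Main()
    {
        var list = "1a 1b 1c 1d 2a 3a 4a 4b 1e 5a".Split();
        // The late "1e" joins the first group, so "1d" is dropped and
        // "1e" is emitted right after "1a".
        Console.WriteLine(string.Join(", ", FirstAndLastPerKey(list)));
        // 1a, 1e, 2a, 3a, 4a, 4b, 5a
    }
}
```

If the input can contain the same first character in separate, non-adjacent runs and each run should be handled on its own, the grouping key needs to change whenever the run changes rather than depend on the character alone.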

Upvotes: 1
