Christian

Reputation: 7852

How to do a GroupJoin based on other predicate than equality?

I want to do a GroupJoin between two collections, but based on some predicate other than equality. For example, if I have one collection of items that each contain a range property, I want to correlate each of them with the items from another collection whose property value falls within that range. Can this be accomplished with GroupJoin or any other LINQ method?

[Image: two collections with resulting groups]

Upvotes: 5

Views: 406

Answers (3)

Matthew Haugen

Reputation: 13296

Unfortunately, IEqualityComparer<T>, the best way to modify the grouping logic in GroupJoin, only allows the comparison of two T's.

That means you have two options:

  1. Make them the same type--your range can be a Tuple<int, int> with Item1 being the min and Item2 being the max, and your values can be the same type, with Item1 being the value and Item2 being the...value. Then you can write an equality comparer that handles them. It's pretty darn awful, and I really wouldn't suggest that. But it would work.
  2. Implement your own method of GroupJoin.

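The second approach is pretty straightforward, as such things go.

This is a quick-and-dirty implementation with not-so-pretty complexity: it evaluates the predicate for every value/key pair, so the cost grows with the product of the two collection sizes. But it should serve as a valid proof of concept.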

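// Extension methods must be declared inside a static class.
public static IEnumerable<IGrouping<TKey, TValue>> GroupJoin<TKey, TValue>(this IEnumerable<TValue> values, IEnumerable<TKey> keys, Func<TKey, TValue, bool> predicate)
{
    // Pair every value with every key, keep the pairs that satisfy the predicate,
    // then group the surviving values by their key.
    return values.SelectMany(v => keys, (v, k) => new { v, k })
                 .Where(c => predicate(c.k, c.v))
                 .GroupBy(c => c.k, c => c.v);
}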

I'm sure there's some black magic to be had here, but in my sample implementation, I'm basically just cross-joining the collections, then grabbing whichever ones match.

There are certainly some optimizations to be had in this, particularly if you're certain that one value will only ever map to one key. And you could probably do something even better if you knew for certain that you were dealing with ints that must fit into ranges, but I get the impression that you're asking generically, so there's a generic answer.
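For the int-in-range case mentioned above, one possible optimization is to sort the values once and then locate each range's slice with a binary search instead of testing every pair. This is only a minimal sketch under stated assumptions: the ranges are Tuple<int, int> with Item1 inclusive and Item2 exclusive, and GroupByRange and LowerBound are hypothetical names, not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RangeGrouping
{
    // Hypothetical helper: returns, for each range [Item1, Item2), the sorted values inside it.
    public static IEnumerable<KeyValuePair<Tuple<int, int>, int[]>> GroupByRange(
        IEnumerable<int> values, IEnumerable<Tuple<int, int>> ranges)
    {
        // Sort once so every range can be answered with two binary searches.
        int[] sorted = values.OrderBy(v => v).ToArray();

        foreach (var range in ranges)
        {
            int start = LowerBound(sorted, range.Item1); // first value >= lower bound
            int end = LowerBound(sorted, range.Item2);   // first value >= upper bound (excluded)
            int count = Math.Max(0, end - start);        // guards against inverted ranges

            yield return new KeyValuePair<Tuple<int, int>, int[]>(
                range, sorted.Skip(start).Take(count).ToArray());
        }
    }

    // Index of the first element that is >= target, or sorted.Length if there is none.
    private static int LowerBound(int[] sorted, int target)
    {
        int lo = 0, hi = sorted.Length;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (sorted[mid] < target) lo = mid + 1;
            else hi = mid;
        }
        return lo;
    }
}
```

Unlike the GroupBy-based version, this sketch also returns ranges that match nothing (with an empty array), and a value that falls into several overlapping ranges appears in each of them.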

To address your example, then, you'd be looking at something like this:

var keys = new Tuple<int, int>[] { Tuple.Create(1, 5), Tuple.Create(5, 10) };

var array = new[] { 3, 4, 7, 9 };

var groups = array.GroupJoin(keys, (a, b) => a.Item1 <= b && a.Item2 > b);
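Because the method above is built on GroupBy, a key that matches no value produces no group at all, whereas the built-in GroupJoin yields an empty sequence for such an element. If that difference matters, a variant along the following lines might do. It is a sketch only: GroupJoinWhere is a hypothetical name, the keys come first here (as the outer sequence, mirroring the built-in method), and the extra (20, 30) range is sample data added to show the empty case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PredicateGroupJoinExtensions
{
    // Hypothetical extension method, not part of LINQ: a group join driven by an
    // arbitrary predicate that keeps outer elements without any match.
    public static IEnumerable<TResult> GroupJoinWhere<TOuter, TInner, TResult>(
        this IEnumerable<TOuter> outer,
        IEnumerable<TInner> inner,
        Func<TOuter, TInner, bool> predicate,
        Func<TOuter, IEnumerable<TInner>, TResult> resultSelector)
    {
        // Buffer the inner sequence so its source is enumerated only once.
        List<TInner> buffered = inner.ToList();

        foreach (TOuter o in outer)
        {
            // Every outer element yields a result, even if nothing matches it.
            yield return resultSelector(o, buffered.Where(i => predicate(o, i)).ToList());
        }
    }
}

public static class PredicateGroupJoinDemo
{
    public static List<KeyValuePair<Tuple<int, int>, List<int>>> Run()
    {
        var keys = new[] { Tuple.Create(1, 5), Tuple.Create(5, 10), Tuple.Create(20, 30) };
        var array = new[] { 3, 4, 7, 9 };

        // (1, 5) gets 3 and 4; (5, 10) gets 7 and 9; (20, 30) gets an empty list.
        return keys.GroupJoinWhere(
                array,
                (k, v) => k.Item1 <= v && k.Item2 > v,
                (k, vs) => new KeyValuePair<Tuple<int, int>, List<int>>(k, vs.ToList()))
            .ToList();
    }
}
```

The cost is still one predicate call per key/value pair; the only change is that empty groups survive.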

Upvotes: 1

Jossef Harush Kadouri

Reputation: 34237

Assuming these are your datatypes:

public class Range
{
    public int Start { get; set; }
    public int End { get; set; }
}

public class Item
{
    public int Number { get; set; }
}

This LINQ expression will give you what you want (including overlapping ranges):

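var ranges = new Range[] { /* ... */ };
var items = new Item[] { /* ... */ };

// ...

// For each range, collect the items whose Number lies within it (both ends inclusive).
var rangeGroups = ranges
    .Select(r => new { Range = r, Items = items.Where(i => (r.Start <= i.Number) && (i.Number <= r.End)) });

rangeGroups will have a Range and its matching Items for each range.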

Check out this online demo - https://ideone.com/HQomfc
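For readers who cannot open the demo, here is a self-contained sketch of the same idea with made-up sample data. Assumptions: the range class is renamed IntRange to avoid any clash with System.Range on newer frameworks, RangeGroupDemo, Group and Sample are hypothetical names, and the results are materialized with ToList so the filter is not re-run on every enumeration.

```csharp
using System.Collections.Generic;
using System.Linq;

public class IntRange
{
    public int Start { get; set; }
    public int End { get; set; }
}

public class Item
{
    public int Number { get; set; }
}

public static class RangeGroupDemo
{
    // For each range, collect the numbers of the items inside it (both ends inclusive).
    public static List<KeyValuePair<IntRange, List<int>>> Group(IntRange[] ranges, Item[] items)
    {
        return ranges
            .Select(r => new KeyValuePair<IntRange, List<int>>(
                r,
                items.Where(i => r.Start <= i.Number && i.Number <= r.End)
                     .Select(i => i.Number)
                     .ToList()))
            .ToList();
    }

    public static List<KeyValuePair<IntRange, List<int>>> Sample()
    {
        var ranges = new[]
        {
            new IntRange { Start = 1, End = 5 },
            new IntRange { Start = 4, End = 10 },  // overlaps the first range
            new IntRange { Start = 20, End = 30 }  // matches nothing
        };
        var items = new[] { 3, 4, 7, 9 }.Select(n => new Item { Number = n }).ToArray();

        // 1..5 gets 3 and 4; 4..10 gets 4, 7 and 9; 20..30 gets an empty list.
        return Group(ranges, items);
    }
}
```

Note that 4 lands in both of the overlapping ranges, and the range with no matches still appears in the result with an empty list.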

Upvotes: 1

Martijn

Reputation: 12102

There is an overload with an IEqualityComparer<T> which you can use for this: https://msdn.microsoft.com/en-us/library/bb535047(v=vs.95).aspx

Of course, that means T has to be a concrete type, which leads to boilerplate-heavy code. You'll get something like

public class Foo {
  public int min;
  public int max;
}

public class Bar {
  public int value;
}

private class FooBarJoinKey {
  public Foo foo;
  public Bar bar;
}

private class FooBarJoinCondition : IEqualityComparer<FooBarJoinKey> {
  public bool Equals(FooBarJoinKey left, FooBarJoinKey right){
    Foo foo = left.foo ?? right.foo;
    Bar bar = right.bar ?? left.bar;
    return (foo.min <= bar.value &&
           foo.max >= bar.value);
  }

  //If Equals returns true, the hash codes of both objects *have* to be equal.
  //There is no way to guarantee this other than it being constant
  public int GetHashCode(FooBarJoinKey dummy){ return 0;}
}

Example use:

IEnumerable<Foo> foos = ???
IEnumerable<Bar> bars = ???
Func<Foo, IEnumerable<Bar>, TResult> resultselector = ???
var comparer = new FooBarJoinCondition();

var grouped = foos.GroupJoin(bars, foo => new FooBarJoinKey { foo = foo }, bar => new FooBarJoinKey { bar = bar }, resultselector, comparer);

This is a really terrible solution. It also is, as far as I know, the only way to do this.

Upvotes: 0
