Reputation: 123
How do I group "adjacent" Sites:
Given data:
List<Site> sites = new List<Site> {
new Site { RouteId="A", StartMilepost=0.00m, EndMilepost=1.00m },
new Site { RouteId="A", StartMilepost=1.00m, EndMilepost=2.00m },
new Site { RouteId="A", StartMilepost=5.00m, EndMilepost=7.00m },
new Site { RouteId="B", StartMilepost=3.00m, EndMilepost=5.00m },
new Site { RouteId="B", StartMilepost=11.00m, EndMilepost=13.00m },
new Site { RouteId="B", StartMilepost=13.00m, EndMilepost=14.00m },
};
I want result:
[
[
Site { RouteId="A", StartMilepost=0.00m, EndMilepost=1.00m },
Site { RouteId="A", StartMilepost=1.00m, EndMilepost=2.00m }
],
[
Site { RouteId="A", StartMilepost=5.00m, EndMilepost=7.00m }
],
[
Site { RouteId="B", StartMilepost=3.00m, EndMilepost=5.00m }
],
[
Site { RouteId="B", StartMilepost=11.00m, EndMilepost=13.00m },
Site { RouteId="B", StartMilepost=13.00m, EndMilepost=14.00m }
]
]
I tried using GroupBy with a custom comparer function checking routeIds match and first site's end milepost is equal to the next sites start milepost. My HashKey function just checks out routeId so all sites within a route will get binned together but I think the comparer makes an assumption like if A = B, and B = C, then A = C, so C won't get grouped with A,B,C since in my adjacency case, A will not equal C.
Upvotes: 6
Views: 438
Reputation: 21999
I was surprised that GroupBy
doesn't have overload with Func<..., bool>
for in-place grouping without hassle of implementing custom class.
So I've created one:
public static IEnumerable<IEnumerable<T>> GroupBy<T>(this IEnumerable<T> source, Func<T, T, bool> func)
{
var items = new List<T>();
foreach (var item in source)
{
if (items.Count != 0)
if (!func(items[0], item))
{
yield return items;
items = new List<T>();
}
items.Add(item);
}
if (items.Count != 0)
yield return items;
}
The usage:
var result = sites.GroupBy((x, y) => x.RouteId == y.RouteId &&
x.StartMilepost <= y.EndMilepost && x.EndMilepost >= y.StartMilepost).ToList();
This should produce wanted result.
Few words about implementation. In above extension method you have to supply delegate which should return true if x
and y
should be grouped. The method is dumb and will simply compare adjacent items in same order as they come. Your input is ordered, but you may have to use OrderBy
/ThenBy
before using it with something else.
Upvotes: 1
Reputation: 43515
Here is an extension method for grouping lists of the specific class (Site
). It is implemented with an inner iterator function GetGroup
that produces one group with adjacent sites. This function is called in a while
loop to produce all groups.
public static IEnumerable<IEnumerable<Site>> GroupAdjacent(
this IEnumerable<Site> source)
{
var ordered = source
.OrderBy(item => item.RouteId)
.ThenBy(item => item.StartMilepost);
IEnumerator<Site> enumerator;
bool finished = false;
Site current = null;
using (enumerator = ordered.GetEnumerator())
{
while (!finished)
{
yield return GetGroup();
}
}
IEnumerable<Site> GetGroup()
{
if (current != null) yield return current;
while (enumerator.MoveNext())
{
var previous = current;
current = enumerator.Current;
if (previous != null)
{
if (current.RouteId != previous.RouteId) yield break;
if (current.StartMilepost != previous.EndMilepost) yield break;
}
yield return current;
}
finished = true;
}
}
Usage example:
var allGroups = sites.GroupAdjacent();
foreach (var group in allGroups)
{
foreach (var item in group)
{
Console.WriteLine(item);
}
Console.WriteLine();
}
Output:
A 0,00..1,00
A 1,00..2,00A 5,00..7,00
B 3,00..5,00
B 11,00..13,00
B 13,00..14,00
Upvotes: 1
Reputation: 4815
Here are a couple of implementations where order of Site
's does not matter. You can use the LINQ Aggregate
function:
return sites.GroupBy(x => x.RouteId)
.SelectMany(x =>
{
var groupedSites = new List<List<Site>>();
var aggs = x.Aggregate(new List<Site>(), (contiguous, next) =>
{
if (contiguous.Count == 0 || contiguous.Any(y => y.EndMilepost == next.StartMilepost))
{
contiguous.Add(next);
}
else if (groupedSites.Any(y => y.Any(z => z.EndMilepost == next.StartMilepost)))
{
var groupMatchIndex = groupedSites.FindIndex(y => y.Any(z => z.EndMilepost == next.StartMilepost));
var el = groupedSites.ElementAt(groupMatchIndex);
el.Add(next);
groupedSites[groupMatchIndex] = el;
}
else
{
groupedSites.Add(contiguous);
contiguous = new List<Site>();
contiguous.Add(next);
}
return contiguous;
}, final => { groupedSites.Add(final); return final; });
return groupedSites;
});
Alternatively, just with foreach
:
return sites.GroupBy(x => x.RouteId)
.SelectMany(x =>
{
var groupedSites = new List<List<Site>>();
var aggList = new List<Site>();
foreach (var item in x)
{
if (aggList.Count == 0 || aggList.Any(y => y.EndMilepost == item.StartMilepost))
{
aggList.Add(item);
continue;
}
var groupMatchIndex = groupedSites.FindIndex(y => y.Any(z => z.EndMilepost == item.StartMilepost));
if (groupMatchIndex > -1)
{
var el = groupedSites.ElementAt(groupMatchIndex);
el.Add(item);
groupedSites[groupMatchIndex] = el;
continue;
}
groupedSites.Add(aggList);
aggList = new List<Site>();
aggList.Add(item);
}
groupedSites.Add(aggList);
return groupedSites;
});
Upvotes: 2
Reputation: 186688
First, let Site
class be (for debugging / demonstration)
public class Site {
public Site() { }
public string RouteId;
public Decimal StartMilepost;
public Decimal EndMilepost;
public override string ToString() => $"{RouteId} {StartMilepost}..{EndMilepost}";
}
Well, as you can see we have to break the rules: equality must be transitive, i.e. whenever
A equals B
B equals C
then
A equals C
It's not the case in your example. However, if we sort the sites by StartMilepost
we, technically, can implement IEqualityComparer<Site>
like this:
public class MySiteEqualityComparer : IEqualityComparer<Site> {
public bool Equals(Site x, Site y) {
if (ReferenceEquals(x, y))
return true;
else if (null == x || null == y)
return false;
else if (x.RouteId != y.RouteId)
return false;
else if (x.StartMilepost <= y.StartMilepost && x.EndMilepost >= y.StartMilepost)
return true;
else if (y.StartMilepost <= x.StartMilepost && y.EndMilepost >= x.StartMilepost)
return true;
return false;
}
public int GetHashCode(Site obj) {
return obj == null
? 0
: obj.RouteId == null
? 0
: obj.RouteId.GetHashCode();
}
}
then GroupBy
as usual; please, note that OrderBy
is required, since order of comparision matters here. Suppose we have
A = {RouteId="X", StartMilepost=0.00m, EndMilepost=1.00m}
B = {RouteId="X", StartMilepost=1.00m, EndMilepost=2.00m}
C = {RouteId="X", StartMilepost=2.00m, EndMilepost=3.00m}
Here A == B
, B == C
(so in case of A, B, C
all items will be in the same group) but A != C
(and thus in A, C, B
will end up with 3
groups)
Code:
List<Site> sites = new List<Site> {
new Site { RouteId="A", StartMilepost=0.00m, EndMilepost=1.00m },
new Site { RouteId="A", StartMilepost=1.00m, EndMilepost=2.00m },
new Site { RouteId="A", StartMilepost=5.00m, EndMilepost=7.00m },
new Site { RouteId="B", StartMilepost=3.00m, EndMilepost=5.00m },
new Site { RouteId="B", StartMilepost=11.00m, EndMilepost=13.00m },
new Site { RouteId="B", StartMilepost=13.00m, EndMilepost=14.00m },
};
var result = sites
.GroupBy(item => item.RouteId)
.Select(group => group
// Required Here, since MySiteEqualityComparer breaks the rules
.OrderBy(item => item.StartMilepost)
.GroupBy(item => item, new MySiteEqualityComparer())
.ToArray())
.ToArray();
// Let's have a look
var report = string.Join(Environment.NewLine, result
.Select(group => string.Join(Environment.NewLine,
group.Select(g => string.Join("; ", g)))));
Console.Write(report);
Outcome:
A 0.00..1.00; A 1.00..2.00
A 5.00..7.00
B 3.00..5.00
B 11.00..13.00; B 13.00..14.00
Upvotes: 3