Save
Save

Reputation: 11948

Alternatives to nested Select in Linq

Working on a clustering project, I stumbled upon this, and I'm trying to figure out if there's a better solution than the one I've come up with.

PROBLEM : Given a List<Point> Points of points in R^n ( you can think at every Point as a double array fo dimension n), a double minDistance and a distance Func<Point,Point,double> dist , write a LINQ expression that returns, for each point, the set of other points in the list that are closer to him than minDistance according to dist.

My solution is the following:

        var lst = Points.Select( 
                x => Points.Where(z => dist(x, z) < minDistance)
                    .ToList() )
                .ToList();

So, after noticing that

I have the following questions:

  1. Is it possible to translate my code in query expression? and if so, how?
  2. Is there a better way to solve this in dot notation?

Upvotes: 4

Views: 2241

Answers (3)

King King
King King

Reputation: 63377

I think you could add some bool Property to your Point class to mark it's has been browsed to prevent twice calling to dist, something like this:

public class Point {
   //....
   public bool IsBrowsed {get;set;}
}
var lst = Points.Select( 
                 x => {
                       var list = Points.Where(z =>!z.IsBrowsed&&dist(x, z) < minDistance).ToList();
                       x.IsBrowsed = true;
                       return list;
                      })
                 .ToList();

Upvotes: 0

p.s.w.g
p.s.w.g

Reputation: 149058

Because Points is a List you can take advantage of the fact that you can access each item by its index. So you can avoid comparing each item twice with something like this:

var lst = 
    from i in Enumerable.Range(0, Points.Length)
    from j in Enumerable.Range(i + 1, Points.Length - i - 1)
    where dist(Points[i], Points[j]) < minDistance
    select new 
    {
        x = Points[i], y = Points[j]
    };

This will return a set composed of all points within minDistance of each other, but not exactly what the result you wanted. If you want to turn it into some kind of Lookup so you can see which points are close to a given point you can do this:

var lst = 
    (from i in Enumerable.Range(0, Points.Length)
     from j in Enumerable.Range(i + 1, Points.Length - i - 1)
     where dist(Points[i], Points[j]) < minDistance
     select new { x = Points[i], y = Points[j] })
    .SelectMany(pair => new[] { pair, { x = pair.y, y = pair.x })
    .ToLookup(pair => pair.x, pair => pair.y);

Upvotes: 0

BartoszKP
BartoszKP

Reputation: 35911

The problem definition, that you want "for each point, the set of other points" makes it impossible to solve without the inner query - you could just disguise it in clever manner. If you could change your data storage policy, and don't stick to LINQ then, in general, there are many approaches to Nearest Neighbour Search problem. You could for example hold the points sorted according to their values on one axis, which can speed-up the queries for neighbours by eliminating early some candidates without full distance calculation. Here is the paper with this approach: Flexible Metric Nearest Neighbor Classification.

Upvotes: 3

Related Questions