Reputation: 409
I am using C# and I have two list<AACoordinate>
where each element in these lists represents a 3D point in space by x,y and z.
class AACoordinate
{
public int ResiNumber { get; set; }
public double x { get; set; }
public double y { get; set; }
public double z { get; set; }
}
Each list can contain 2000 or more points and my aim is to compare each point of list1 to all the points of list2 and if the distance is smaller than a specific number I keep a record of it. at the moment I am using foreach to compare each element of list1 to all of list2. This is quite slow because of the number of points. Do you have any suggestion to make it fast?
my loop is:
foreach (var resiSet in Program.atomList1)
{
foreach (var res in Program.atomList2)
{
var dis = EuclideanDistance(resiSet, res);
if (dis < 5)
temp1.Add(resiSet.ResiNumber);
}
}
Thanks in advance for your help.
Upvotes: 7
Views: 1824
Reputation: 9080
Maybe is a little complicated to implement, but I don't have any other ideas than this:
To lower down the computational complexity probably you have to use some data structure like KD-Tree or QuadTree.
You can use a KD-Tree to do nearest neighbor search, and this is what you need.
1) You build your kd-tree for the first list in O(n log n). This must be done in a single thread.
2) For each item in your second list, you do a lookup in the kd-tree for the nearest neighbor (the nearest point to the point you are looking for), in O(m log n). If the distance from current point to the nearest found point is less than your delta, you have it. If you want you can do this step using multiple threads.
So at the end the complexity of the algorithm will be O(max(n, m) * log n) where n is the number of items in the first list, m is the number of items in the second list.
For KD-Trees, see:
See http://home.wlu.edu/~levys/software/kd/ this seems a good implementation, in java and C#.
See http://www.codeproject.com/KB/architecture/KDTree.aspx
For quad trees, see:
See http://csharpquadtree.codeplex.com/
See http://www.codeproject.com/KB/recipes/QuadTree.aspx
And of course, look on Wikipedia what is a quadtree and a kd-tree
Consider that (2000 * log base 2(2000)) is about 21931.5
Instead 2000*2000 is 4000000, a big difference!
Using a parallel algorithm, if you have 4 processors, the normal O(n*n) algorithm will require 1000000 per processor, and I guess, it will be still too much if you need something fast or almost real time.
Upvotes: 1
Reputation: 28322
Your current method checks each ordered pair in L x R, a simple O(n^2) algorithm. A couple of ideas come to mind.
First, you can try splitting each of the two arrays into, say, cubes of side equal to your maximum distance; then you'd only have to compute distances between elements in L and R if they are no more than 1 cube away. This is still O(n^2) in the worst case, but if your points are much farther apart on average than your maximum distance, you can save on a lot of spurious comparisons here.
Second, you can micro-optimize how you're doing the distance function. You never need to use sqrt(), for instance; comparing the squared distance to the maximum distance squared is sufficient. Also, you can avoid doing integer multiplications to get the squared distance if you first check whether |dx|, |dy| or |dz| satisfy certain properties (i.e., are already bigger than the maximum distance).
Parallelization, as mentioned by the other posters, is always a good bet. In particular, a sophisticated parallelization + boxing strategy (outlined in the first suggestion) should make for a particularly scalable, efficient solution.
Upvotes: 0
Reputation: 1150
If you really want to compare each element of list1 with each of list2, you won't get rid of the nested for. But you could speed it up using Parallel.ForEach.
Upvotes: 0
Reputation: 10967
You can use Parallel Libraries where you can find Parallel.ForEach. Paralel Example
Upvotes: 0