John Latham
John Latham

Reputation: 255

Efficient data structure to find intersection of two lists

I have two very large List<List<int>> A and B. I need to find intersection in between each element of those lists.

A[0] = { 1, 2, 3};
B[0] = {2, 3, 4};

Intersection = { 2, 3 };

My implementation:

List<int> intersection = A[0].Intersection(B[0]).ToList();

This solution takes very long time to execute. I am wondering if there are any better way to do this and any more efficient data structure I can use to perform it in better time.

Thanks!

Upvotes: 4

Views: 1400

Answers (2)

aspiring
aspiring

Reputation: 1647

Code sample using HashSet(T).IntersectWith:

HashSet<string> lst1 = new HashSet<string> 

     { "id1", "id2", "id3" };

HashSet<string> lst2 = new HashSet<string> 

     { "id2", "id3", "id4" };

// what happens is that, lst1 will be modified by only leaving the intersect items
lst1.IntersectWith(lst2);

PS: I used the sample for String, but you can utilize your own integer values.

Upvotes: 1

BrokenGlass
BrokenGlass

Reputation: 160862

You should use a Hashset for this, in C# HashSet<T>. Lookups in hashsets are O(1) (if decent hashing function and using an array underneath) as opposed to O(n) for lists.

Using Linq in C# you basically get this "built-in": Intersect() will use a hashset internally to compute the intersection in O(n) instead of O(n^2) if using two lists.

var intersection = a.Intersect(b).ToList();

Upvotes: 7

Related Questions