volni
volni

Reputation: 5366

Algorithm for checking if set A is a subset of set B in faster than linear time

Is there an algorithm (preferably constant time) to check if set A is a subset of set B?

Creating the data structures to facilitate this problem does not count against the runtime.

Upvotes: 6

Views: 6571

Answers (3)

argentage
argentage

Reputation: 2778

If you have a list of the least common letters and pairs of letters in your string set, you can store your sets sorted with their least common letters and letter pairs and maximize your chances of tossing out negative matches as quickly as possible. It's not clear to me how well this would combine with a bloom filter, Probably a hash table will do since there aren't very many digrams and letters.

If you had some information about the maximum size of subsets or even a common size you could similarly preproccess data by putting all of the subsets of a given size into a bloom filter as mentioned.

You could also do a combination of both of these.

Upvotes: 0

ElKamina
ElKamina

Reputation: 7807

You might go for bloom filter (http://en.wikipedia.org/wiki/Bloom_filter ). However there might be false positives, which can be addressed by the method mentioned by Keith above (but note that the worst case complexity of hashing is NOT O(n), but you can do O(nlogn).

  1. See if A is a subset of B according to Bloom filter
  2. If yes, then do a thorough check

Upvotes: 0

Keith Randall
Keith Randall

Reputation: 23265

Well, you're going to have to look at each element of A, so it must be at least linear time in the size of A.

An O(A+B) algorithm is easy using hashtables (store elements of B in a hashtable, then look up each element of A). I don't think you can do any better unless you know some advance structure for B. For instance, if B is stored in sorted order, you can do O(A log B) using binary search.

Upvotes: 1

Related Questions