George Panic
George Panic

Reputation: 441

optimal way of finding differencies betwen two sets

I have two sets of elements and I want an optimal algorithm to find their differences or in math form: A U B - A ∩ B

One way I thought is

Bfound=0;
for (1->Na,i)
{
 flag=0;
 for (1->Nb,j)
 {
  if(A[i]==B[j])
  {
   flag=1;
   Bfound[j]=1;
  }

 }
 if (flag==0)
 print A[i]
}

for(1->Nb,i)
{
  if(Bfound[i]==0)
  print B[i]
}

Is this optimal?

Upvotes: 0

Views: 163

Answers (2)

user1071136
user1071136

Reputation: 15725

To answer your question - no, this is not optimal. The complexity of your solution is O(nm) time, where n and m are the sizes of A and B, respectively. You can improve this time to O(nlogn + mlogm), if n ~ m.

  1. Sort both arrays, in n log n + m log m time.
  2. Find the intersection in n+m time:

    i = 0; 
    j = 0;
    while(i < n && j < m) {
      if (A[i] == B[j]) {
        print(A[i]);
        i++;
        j++;
      } else if (A[i] < B[j]) {
        i++;
      } else {
        j++;
      } 
    }
    

Upvotes: 3

jfs
jfs

Reputation: 414179

Symmetric difference: A∪B - A∩B i.e., return a new set with elements in either A or B but not both. The straightforward way from the definition:

# result = A ^ B
result = set() # start with empty set

# add all items that are in A but not in B
for item in A: # for each item in A; O(|A|), where |A| - number of elements in A
    if item not in B: # amortized O(1) (for hash-based implementation)
       result.add(item) # amortized O(1) (for hash-based implementation)

# add all items that are in B but not in A
for item in B:
    if item not in A:
       result.add(item)

Complexity O(|A|+|B|).

In C++ the type is unordered_set, in Java, C# -- HashSet, in Python -- set.

Another approach (used in Python) is to copy A into result and then try to remove B items from the result:

# result = A ^ B
result = set(A) # copy A into result
for item in B:
    try: result.remove(item) # O(1)
    except KeyError: # item was not in the result
        result.add(item) # add previously not-found item

Complexity O(|A|+|B|).

Upvotes: 1

Related Questions