Omerta
Omerta

Reputation: 83

(Algorithm) Find if two unsorted arrays have any common elements in O(n) time without sorting?

We have two unsorted arrays and each array has a length of n. These arrays contain random integers in the range of 0-n100. How to find if these two arrays have any common elements in O(n)/linear time? Sorting is not allowed.

Upvotes: 7

Views: 11041

Answers (8)

Giridhar
Giridhar

Reputation: 1

Based on the ideas posted till date.We can store the one array integer elements into a hash map . Maximum number of different integers can be stored in RAM . Hash map will have only unique integer values. Duplicates are ignored.

Here is the implementation in Perl language.

#!/usr/bin/perl
use strict;
use warnings;
sub find_common_elements{ # function that prints common elements in two unsorted array
my (@arr1,@array2)=@_; # array elements assumed to be filled and passed as function arguments 

my $hash; # hash map to store value of one array

# runtime to prepare hash map is O(n). 
foreach my $ele ($arr1){
$hash->{$ele}=true; # true here element exists key is integer number and value is true,          duplicate elements will be overwritten

# size of array will fit in memory as duplicate integers are ignored   ( mx size will     be 2 ^( 32) -1 or 2^(64) -1 based on operating system) which can be stored in RAM.
} 
# O(n ) to traverse second array and finding common elements in two array
foreach my $ele2($arr2){
  # search in hash map is O(1), if all integers of array are same then hash map will have         only one entry and still search tim is O(1) 
   if( defined $hash->{$ele}){  
   print "\n $ele is common in both array \n";
   }
  }
 } 

I hope it helps.

Upvotes: 0

meriton
meriton

Reputation: 70564

You have not defined the model of computation. Assuming you can only read O(1) bits in O(1) time (anything else would be a rather exotic model of computation), there can be no algorithm solving the problem in O(n) worst case time complexity.

Proof Sketch: Each number in the input takes O(log(n ^ 100)) = O(100 log n) = O(log n) bits. The entire input therefore O(n log n) bits, which can not be read in O(n) time. Any O(n) algorithm can therefore not read the entire input, and hence not react if these bits matter.

Upvotes: 6

Dr. belisarius
Dr. belisarius

Reputation: 61016

Linearity Test

Using Mathematica hash function and arbitrary length integers.

alt text

Tested until n=2^20, generating random numbers till (2^20)^100 = (approx 10^602)

Just in case ... the program is:

k = {};
For[t = 1, t < 21, t++,
  i = 2^t;
  Clear[a, b];
  Table[a[RandomInteger[i^100]] = 1, {i}];
  b = Table[RandomInteger[i^100], {i}];
  Contains = False;
  AppendTo[k,
   {i, First@Timing@For[j = 2, j <= i, j++,
       Contains = Contains || (NumericQ[a[b[[j]]]]);
       ]}]];
ListLinePlot[k, PlotRange -> All, AxesLabel -> {"n", "Time(secs)"}]

Upvotes: 2

Neil
Neil

Reputation: 5782

If storage is not important, then scratch hash table in favor for an array of n in length. Flag to true when you come across that number in first array. In pass through second array, if you find any of them to be true, you have your answer. O(n).

Define largeArray(n)
// First pass
for(element i in firstArray)
   largeArray[i] = true;

// Second pass
Define hasFound = false;
for(element i in secondArray)
   if(largeArray[i] == true) 
      hasFound = true;
      break;

Upvotes: 1

antirez
antirez

Reputation: 18504

Put the elements of the first array in an hash table, and check for existence scanning the second array. This gives you a solution in O(N) average case.

If you want a truly O(N) worst case solution then instead of using an hash table use a linear array in the range 0-n^100. Note that you need to use just a single bit per entry.

Upvotes: 1

David Weiser
David Weiser

Reputation: 5195

Have you tried a counting sort? It is simple to implement, uses O(n) space and also has a \theta(n) time complexity.

Upvotes: 0

Guy Nir
Guy Nir

Reputation: 1244

Answering Neil: Since you know at start what is your N (two arrays of size N), you can create a hash with array size of 2*N*some_ratio (for example: some_ratio= 1.5). With this size, almost all simple hash functions will provide you good spread of the entities.

You can also implement find_or_insert to search for existing or insert a new one at the same action, this will reduce the hash function and comparison calls. (c++ stl find_or_insert is not good enough since it doesnt tell you whether the item was there before or not).

Upvotes: 2

Nikita Rybak
Nikita Rybak

Reputation: 68006

Hashtable will save you. Really, it's like a swiss knife for algorithms.
Just put in it all values from the first array and then check if any value from the second array is present.

Upvotes: 11

Related Questions