Reputation: 75

Intersection of two dimensional array

Is there a simple way to find the intersection of a two dimensional array? For example:

arr1 = [1,2,3,4,5]
arr2 = [5,6,7,8]
arr3 = [5]
bigarr = [arr1,arr1,arr3]

I know that it's possible to do:

intersection = arr1 & arr2 & arr3 # => 5
intersection = big_arr[0] & big_arr[1] & big_arr[2] # => 5

but the number of elements in big_arr will vary. I was wondering if there was a simple way to intersect all the elements in big_arr regardless of the number of elements.

Upvotes: 3

Answers (2)

Cary Swoveland

Reputation: 110675

What do you want: a method with a pretty face or one that is first to finish line? My friend @Arup has supplied one; I'll offer the another.

Code

def heavy_lifter(a)
  wee_one = a.min_by(&:size)
  return [] if wee_one.empty?
  wee_loc = a.index(wee_one)
  counts = wee_one.each_with_object({}) { |e,h| h.update(e=>1) }
  nbr_reqd = 1
  a.each_with_index do |b,i|
    next if i == wee_loc
    b.each do |e|
      cnt = counts[e]
      case
      when cnt.nil?
        next
      when cnt == nbr_reqd
        counts[e] = cnt + 1
      when cnt < nbr_reqd
        counts.delete(e)
        return [] if counts.empty?
      end
    end
    nbr_reqd += 1
  end
  counts.keys.each { |k| counts.delete(k) if counts[k] < nbr_reqd }
  counts.keys
end

Example

a1 = [1,2,3,4,5]
a2 = [5,6,7,8]
a3 = [5]
a = [a1,a2,a3]

heavy_lifter(a)
  #=> [5]

Explanation

Here's how the method works:

select the smallest array (wee_one). To simplify the explanation, assume it is the first element of a.
convert wee_one to a counting hash, counts, where counts[e] = 1 for each element of wee_one.
iterate through the remaining arrays.
keys of counts will be removed as arrays are processed.
after all calculations are complete, counts.keys equals the intersection of all arrays.
after nbr_reqd arrays have been processed (including wee_one), counts[k] equals the number of those arrays that have been found to contain k. Obviously, if counts[k] < nbr_reqd, key k can be removed from counts (but we will not remove such keys until our attention is drawn to them, or at the end).
suppose we are now to process the array b at offset nbr_reqd, meaning nbr_reqd arrays have been processed (including wee_one at offset zero). For each element e of b, we obtain cnt = counts[e]. There are four possibilities:
- cnt == nil, in which case there is nothing to be done;
- cnt < nbr_reqd, in which case key e is removed from counts;
- cnt == nbr_reqd, meaning e has been present in all previous arrays processed, in which case we execute counts[k] = cnt + 1; and
- cnt == nbr_read+1, meaning e has been present in all previous arrays processed and is a duplicate of another e in b that has already been processed, in which case nothing is to be done.
nbr_reqd is incremented by one and the process is repeated for the next array.
after all arrays have been processed, all that remains is to remove each key k in counts for which counts[k] < nbr_reqd.

Cutie method

def cutie(a)
  a.reduce(:&)
end

Test data

def test(mx, *sizes)
  sizes.map { |sz| Array.new(sz) { rand(mx) } }
end

For example:

test(10,5,6,7)
  #=> [[9, 1, 5, 1, 1], [0, 8, 7, 8, 5, 0], [5, 1, 7, 6, 7, 9, 5]]

Benchmark code

require 'benchmark'

def bench(tst)
  Benchmark.bm(12) do |bm|
    bm.report 'cutie' do
      cutie(tst)
    end
    bm.report 'heavy_lifter' do
      heavy_lifter(tst)
    end
  end
end

Benchmark results

tst = test(1_000_000, 400_000, 600_000, 800_000) 

cutie(tst).size
  #=> 81929
cutie(tst).sort == heavy_lifter(tst).size
  #=> true

bench(tst)
                   user     system      total        real
cutie          1.610000   0.030000   1.640000 (  1.639736)
heavy_lifter   1.800000   0.020000   1.820000 (  1.824281)

sizes = (700_000..890_000).step(10_000).to_a
  #=> [700000, 710000, 720000, 730000, 740000,
  #    750000, 760000, 770000, 780000, 790000,
  #    800000, 810000, 820000, 830000, 840000,
  #    850000, 860000, 870000, 880000, 890000]

tst = test(1_000_000, *sizes) 

bench(tst)
                   user     system      total        real
cutie         14.090000   0.440000  14.530000 ( 14.679101)
heavy_lifter   5.830000   0.030000   5.860000 (  5.935438)

Upvotes: 0

Arup Rakshit

Reputation: 118271

Use #reduce like

arr1 = [1,2,3,4,5]
arr2 = [5,6,7,8]
arr3 = [5]
bigarr = [arr1,arr2,arr3]
bigarr.reduce(:&) # => [5]

Upvotes: 7

Intersection of two dimensional array

Answers (2)

Related Questions