Reputation: 14736
I have a method that gets an array of arrays and detects if any sub array is occurs more than one time, regardless of its order:
def has_similar_content?(array)
array.each.with_index do |prop1, index1|
array.each.with_index do |prop2, index2|
next if index1 == index2
return true if prop1.sort == prop2.sort
end
end
false
end
> has_similar_content?([%w[white xl], %w[red xl]])
=> false
> has_similar_content?([%w[blue xl], %w[xl blue cotton]])
=> false
> has_similar_content?([%w[blue xl], %w[xl blue]])
=> true
> has_similar_content?([%w[xl green], %w[red xl], %w[green xl]])
=> true
My problem is the runtime of this method, it has a quadratic complexity and needs an additional sort of the arrays to detect if the elements are the same.
Is there a more efficient way to do this?
Upvotes: 0
Views: 81
Reputation: 26768
this way is simpler:
array.
group_by(&:sort).
transform_values(&:length).
values.any? { |count| count > 1 }
Upvotes: 1
Reputation: 110675
I have assumed the question is as stated in my comment on the question.
Code
def disregarding_order_any_dups?(arr)
arr.map do |a|
a.each_with_object(Hash.new(0)) do |k,h|
h[k] += 1
end
end.uniq.size < arr.size
end
Examples
disregarding_order_any_dups? [%w[white xl], %w[red xl]]
#=> false
disregarding_order_any_dups? [%w[blue xl],
%w[xl blue cotton]]
#=> false
disregarding_order_any_dups? [%w[blue xl], %w[xl blue]]
#=> true
disregarding_order_any_dups? [%w[xl green], %w[red xl],
%w[green xl]]
#=> true
disregarding_order_any_dups? [[1,2,3,2], [3,1,3,2],
[2,3,1,2]]
#=> true
Complexity
If n = arr.size
and m = arr.map(&:size).max
, the computational complexity is O(n*m
). The single statement within map
's block could be replaced with a.sort
, but that would increase the computational complexity to O(n*m*log(m)
).
Explanation
For the last example the steps are as follows.
arr = [[1,2,3,2], [3,1,3,2], [2,3,1,2]]
b = arr.map do |a|
a.each_with_object(Hash.new(0)) do |k,h|
h[k] += 1
end
end
#=> [{1=>1, 2=>2, 3=>1}, {3=>2, 1=>1, 2=>1},
# {2=>2, 3=>1, 1=>1}]
c = b.uniq
#=> [{1=>1, 2=>2, 3=>1}, {3=>2, 1=>1, 2=>1}]
d = c.size
#=> 2
e = arr.size
#=> 3
d < e
#=> true
The expression
h = Hash.new(0)
creates a counting hash. Ruby expands h[k] += 1
to
h[k] = h[k] + 1
The hash instance methods are :[]=
on the left, :[]
on the right. If h
does not have a key k
, h[k]
on the right is replaced with h
's default value, which has been defined to equal zero, resulting in:
h[k] = 0 + 1
If h
has a key k
, h[k]
on the right, the value of k
, is not replaced with h
's default value. See the version of Hash::new which takes an argument equal to the hash's default value.
Upvotes: 2
Reputation: 1633
This is still quadratic but it is faster :
def has_similar_content?(array)
# sort subarray only once. O( n * m * log(m) )
sorted_array= array.map(&:sort)
# if you can change the input array, this prevent object allocation :
# array.map!(&:sort!)
# compare each pair only once O( n * n/2 )
nb_elements= sorted_array.size
0.upto(nb_elements - 1).each do |i|
(i + 1).upto(nb_elements - 1).each do |j|
return true if sorted_array[i] == sorted_array[j]
end
end
return false
end
Upvotes: 1