Reputation: 2711
How can I remove every string that is a substring of another string in the same array? I want this array of strings:
arr = ["Bochum", "Stu", "Stut", "Stuttt", "Stutt", "Stuttgart", "Heesestr.", "Berl", "Berlin"]
to shrink to:
["Bochum", "Stuttt", "Stuttgart", "Heesestr.", "Berlin"]
Edit: arr is unique, with no dups.
Upvotes: 1
Views: 772
Reputation: 16189
Find the substrings and remove them. It might not be the best approach, but it is clear. Note that it only looks for prefixes:
ar = ["Bochum", "Stu", "Stut", "Stuttt", "Stutt", "Stuttgart", "Heesestr.", "Berl", "Berlin"]
sub_strings = []
ar.each do |string|
  # Collect every proper prefix of string that is itself an element of ar.
  (0...string.length).each do |index|
    prefix = string[0...index]
    sub_strings << prefix if ar.include?(prefix)
  end
end
results = ar - sub_strings
Upvotes: 0
Reputation: 118781
Here's an implementation using a Trie-like data structure. It achieves the goal by simply losing information :-)
(I've assumed you only care about strings being prefixes of each other, rather than substrings...)
class LossyTrie
  def initialize; @dict = {}; end

  def add(str)
    # Break the new string apart into characters, traversing down the trie at each step.
    # As a side effect, if a prefix of str was already present, it will be forgotten.
    # Similarly, if str itself is a prefix of an existing string, nothing will change.
    dict = @dict
    str.each_char do |c|
      dict = (dict[c] ||= {})
    end
  end

  def all_strings
    strs = []
    def traverse(dict, so_far, &block)
      for k, v in dict
        if v.empty?
          block.call(so_far + k)
        else
          traverse(v, so_far + k, &block)
        end
      end
    end
    traverse(@dict, "") { |leaf| strs << leaf }
    strs
  end
end
strs = ["Bochum", "Stu", "Stut", "Stuttt", "Stutt", "Stuttgart", "Heesestr.", "Berl", "Berlin"]
trie = LossyTrie.new
strs.each { |s| trie.add(s) }
trie.all_strings # => ["Bochum", "Berlin", "Stuttt", "Stuttgart", "Heesestr."]
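The prefix assumption matters: a prefix-only approach such as the trie keeps a string that occurs in the middle or at the end of another one. Here is a minimal sketch of the difference, using plain start_with? and include? rather than the trie. The helper names and the extra "gart" element are made up for illustration.

```ruby
# Hypothetical helpers for illustration; they are not part of LossyTrie.
def without_prefixes(words)
  # Drop a word only if some other word starts with it.
  words.reject { |w| words.any? { |o| o != w && o.start_with?(w) } }
end

def without_substrings(words)
  # Drop a word if it occurs anywhere inside some other word.
  words.reject { |w| words.any? { |o| o != w && o.include?(w) } }
end

words = ["Stu", "Stuttgart", "gart"]
without_prefixes(words)   # => ["Stuttgart", "gart"]
without_substrings(words) # => ["Stuttgart"]
```

If suffixes or inner substrings have to go as well, the trie above would not be enough on its own.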
Upvotes: 0
Reputation: 492
No need for Rails, plain Ruby will do:
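my_array = ["Bochum", "Stu", "Stut", "Stuttt", "Stutt", "Stuttgart", "Heesestr.", "Berl", "Berlin"]
# Join once up front so the block does not read my_array while keep_if is modifying it.
joined = my_array.join(",")
my_array.keep_if { |x| joined.scan(x).length == 1 }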
Upvotes: 0
Reputation: 84453
Assuming the strings you want to remove are always prefixes of a longer element (as in your example), one way to remove them is to sort, which places each prefix ahead of the longer elements that start with it. You can then reject any element that also matches the start of a longer element deeper in the array. For example:
array = %w[Bochum Stu Stut Stuttt Stutt Stuttgart Heesestr. Berl Berlin]
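# Regexp.escape keeps characters such as the "." in "Heesestr." from acting as regex metacharacters.
array.sort.reject { |elem| array.grep(/\A#{Regexp.escape(elem)}/).count > 1 }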
#=> ["Berlin", "Bochum", "Heesestr.", "Stuttgart", "Stuttt"]
If your array shouldn't be sorted, then this is not the right solution for you. However, it definitely contains the right array elements, and is both short and easy to read. Your mileage may vary.
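The sorting idea can be pushed a little further. After a lexicographic sort, every element that starts with a given string sits directly behind it, so it is enough to compare each element with its immediate successor. This is a minimal sketch, assuming unique elements and prefix-only removal; drop_prefixes is a hypothetical name.

```ruby
# Hypothetical helper: removes every element that is a prefix of another element.
# Assumes the elements are unique; the result comes back in sorted order.
def drop_prefixes(words)
  sorted = words.sort
  sorted.each_with_index.reject { |word, i|
    successor = sorted[i + 1]
    # If any element starts with word, the very next one in sorted order does too.
    successor && successor.start_with?(word)
  }.map(&:first)
end

array = %w[Bochum Stu Stut Stuttt Stutt Stuttgart Heesestr. Berl Berlin]
drop_prefixes(array)
#=> ["Berlin", "Bochum", "Heesestr.", "Stuttgart", "Stuttt"]
```

This avoids building a regular expression for each element and scans the sorted array only once.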
Upvotes: 0
Reputation: 8898
A solution that does not preserve the order:
["Bochum", "Stu", "Stut", "Stuttt", "Stutt", "Stuttgart",
 "Heesestr.", "Berlin", "Berl"].sort_by(&:size).reduce([]) do |ary, word|
  ary.reject { |s| word.include?(s) }.push(word)
end
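If the original order matters, one option is to run the same reduction and then filter the input array by its result. This is a sketch under the assumption that the elements are unique; remove_substrings_keep_order is a hypothetical name.

```ruby
# Hypothetical helper; assumes the elements are unique.
def remove_substrings_keep_order(words)
  # Same reduction as above: shortest first, drop anything contained in a later word.
  kept = words.sort_by(&:size).reduce([]) do |ary, word|
    ary.reject { |s| word.include?(s) }.push(word)
  end
  # Filter the original array so the survivors keep their input order.
  words.select { |w| kept.include?(w) }
end

cities = ["Bochum", "Stu", "Stut", "Stuttt", "Stutt", "Stuttgart",
          "Heesestr.", "Berl", "Berlin"]
remove_substrings_keep_order(cities)
# => ["Bochum", "Stuttt", "Stuttgart", "Heesestr.", "Berlin"]
```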
Upvotes: 0
Reputation: 110755
If you're not opposed to the use of brute force:
arr = ["Bochum", "Stu", "Stut", "Stuttt", "Stutt", "Stuttgart",
       "Heesestr.", "Berl", "Berlin"]
arr.each_with_object([]) { |str, a|
  a << str unless arr.any? { |s| s.include?(str) && s.size > str.size } }
  #=> ["Bochum", "Stuttt", "Stuttgart", "Heesestr.", "Berlin"]
Upvotes: 2