LearningBasics

Reputation: 680

Remove words from string which are present in some set

I want to remove words from a string that are present in some set. One way is to iterate over the set and remove each word using str.gsub("subString", ""). Does a function like this already exist?

Example string:

"Hotel Silver Stone Resorts"

Strings in set:

["Hotel" , "Resorts"]

Output should be:

" Silver Stone "

Upvotes: 1

Views: 1455

Answers (4)

Nikkolasg

Reputation: 454

You could try something different, but I don't know whether it will be faster (that depends on the length of your strings and the size of your set).

require 'set'
str = "Hotel Silver Stone Resorts"
setStr = Set.new(str.split)
setToRemove = Set.new( ["Hotel", "Resorts"])
modifiedStr = (setStr.subtract setToRemove).to_a.join " "

Output

"Silver Stone"

It uses the Set class, which is faster for looking up a single element (it is built on Hash). But again, the conversion back with to_a may cancel out any speed gain if your strings or set are very big.

It also implicitly removes duplicates from your string and your set (when you create the sets).
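If dropping duplicates is not wanted, one option is to keep the Set only for lookups and filter the split words with reject. This is a minimal sketch, not a benchmarked solution; the helper name `remove_keeping_duplicates` is hypothetical.

```ruby
require 'set'

# Hypothetical helper: removes every word found in `to_remove`,
# but keeps repeated words that are not in the set.
def remove_keeping_duplicates(str, to_remove)
  lookup = to_remove.to_set                  # Hash-backed, fast include?
  str.split.reject { |word| lookup.include?(word) }.join(" ")
end

remove_keeping_duplicates("Silver Hotel Silver Stone", ["Hotel"])
```

Here the repeated "Silver" survives, whereas building a Set from the split string would collapse it to a single occurrence.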

Upvotes: 0

Gagan Gami

Reputation: 10251

I am not sure exactly what you want, but as I understand it:

sentence = 'Hotel Silver Stone Resorts'
remove_words = ["Hotel", "Resorts"] # add any words you want removed to this array
sentence.split.delete_if{|x| remove_words.include?(x)}.join(' ')
=> "Silver Stone"

OR

If you have an array of strings, it's easier:

sentence = 'Hotel Silver Stone Resorts'
remove_words  = ["Hotel" , "Resorts"]
(sentence.split - remove_words).join(' ')
=> "Silver Stone"

Upvotes: 0

Stefan

Reputation: 114178

You can build a union of several patterns with Regexp::union:

words = ["Hotel" , "Resorts"]
re = Regexp.union(words)
#=> /Hotel|Resorts/

"Hotel Silver Stone Resorts".gsub(re, "")
#=> " Silver Stone "

Note that Regexp.union escapes string arguments automatically, so words containing regex metacharacters are matched literally.
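The pattern above matches anywhere in the string, so it would also strip "Hotel" out of "Hotels". One way to restrict it to whole words is to wrap the union in \b anchors. This is a sketch that assumes every word starts and ends with a word character; `remove_whole_words` is a hypothetical helper name, and collapsing the leftover whitespace is an extra step that differs from the " Silver Stone " output in the question.

```ruby
# Hypothetical helper: removes only whole-word matches.
def remove_whole_words(str, words)
  # Interpolating the union keeps its escaping; \b limits matches to word edges.
  re = /\b#{Regexp.union(words)}\b/
  # gsub leaves the surrounding spaces behind, so split/join tidies them up.
  str.gsub(re, "").split.join(" ")
end

remove_whole_words("Hotel Silver Stone Resorts", ["Hotel", "Resorts"])
```

Drop the final split.join(" ") if the original spacing has to be preserved.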

Upvotes: 6

Sergio Tulentsev

Reputation: 230336

You can subtract one array from another in Ruby. The result is the first array with every element that also appears in the second array removed.

Split the string on whitespace, remove all extra words in one swift move, rejoin the sentence.

s = "Hotel Silver Stone Resorts"

junk_words = ['Hotel', 'Resorts']

def strip_junk(original, junk)
  (original.split - junk).join(' ')
end

strip_junk(s, junk_words) # => "Silver Stone"

It certainly looks better (to my eye). Not sure about performance characteristics (too lazy to benchmark it)
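For anyone who does want numbers, a rough comparison can be put together with the standard Benchmark module. This is only a sketch under assumed inputs (a made-up 4,000-word sentence and 100 iterations); results will vary with string length, set size, and Ruby version, and the gsub variant leaves extra spaces behind, so the outputs are not identical.

```ruby
require 'benchmark'
require 'set'

# Assumed test data: the example sentence repeated 1,000 times.
sentence = (["Hotel", "Silver", "Stone", "Resorts"] * 1_000).join(" ")
junk     = ["Hotel", "Resorts"]
junk_set = junk.to_set
junk_re  = Regexp.union(junk)
n        = 100

Benchmark.bm(12) do |x|
  x.report("array minus") { n.times { (sentence.split - junk).join(" ") } }
  x.report("set reject")  { n.times { sentence.split.reject { |w| junk_set.include?(w) }.join(" ") } }
  x.report("regexp gsub") { n.times { sentence.gsub(junk_re, "") } }
end
```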

Upvotes: 1
