tob88

Reputation: 2211

Finding exact words in a string

I have a list of links to clothing websites that I am categorising by gender using keywords. The URL structure varies from website to website, for example:

www.website1.com/shop/womens/tops/tshirt

www.website2.com/products/womens-tshirt

I cannot use the .include? method because .include?("mens") returns true for both "mens" and "womens" URLs, since "womens" contains "mens". How can I write a method that returns true only for "womens", and another that returns true only for "mens"? I suspect it will need some sort of regex, but I am relatively inexperienced with these, and the different URL structures make it all the more tricky. Any help is much appreciated, thanks!

Upvotes: 2

Views: 10756

Answers (4)

Dave Newton

Reputation: 160191

The canonical regex way of doing this is to search on word boundaries:

pry(main)> "foo/womens/bar".match(/\bwomens\b/)
=> #<MatchData "womens">
pry(main)> "foo/womens/bar".match(/\bmens\b/)
=> nil
pry(main)> "foo/mens/bar".match(/\bmens\b/)
=> #<MatchData "mens">
pry(main)> "foo/mens/bar".match(/\bwomens\b/)
=> nil

That said, either splitting, or searching with the leading "/", may be adequate.
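The word-boundary search above can be wrapped in a small predicate. This is only a sketch: the method name contains_word? is hypothetical, and Regexp.escape is added on the assumption that the keyword might one day contain regex metacharacters.

```ruby
# Hypothetical helper: true only when `word` appears as a whole word in `url`.
def contains_word?(url, word)
  # \b matches the position between a word character (letter, digit or
  # underscore) and a non-word character such as "/" or "-".
  url.match?(/\b#{Regexp.escape(word)}\b/)
end

contains_word?("www.website1.com/shop/womens/tops/tshirt", "womens") # => true
contains_word?("www.website1.com/shop/womens/tops/tshirt", "mens")   # => false
contains_word?("www.website2.com/products/womens-tshirt", "womens")  # => true
contains_word?("www.website2.com/products/womens-tshirt", "mens")    # => false
```

One caveat: an underscore counts as a word character, so a URL such as ".../womens_tshirt" would not match "womens" with this approach.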

Upvotes: 16

Yule

Reputation: 9764

keyword = "women"
url = "www.website1.com/shop/womens/tops/tshirt"
/\/#{keyword}/ =~ url 
=> 21
keyword = "men"
url = "www.website1.com/shop/womens/tops/tshirt"
/\/#{keyword}/ =~ url 
=> nil
keyword = "women"
url = "www.website2.com/products/womens-tshirt"
/\/#{keyword}/ =~ url 
=> 25
keyword = "men"
url = "www.website2.com/products/womens-tshirt"
/\/#{keyword}/ =~ url 
=> nil

Then apply !! to the result to turn it into a boolean:

!!nil # => false
!!25  # => true
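As a sketch of the same idea in method form (the name keyword_after_slash? is hypothetical): Regexp#match? (Ruby 2.4+) already returns true or false, so the !! step is not needed when using it.

```ruby
# Hypothetical helper: true when `keyword` appears immediately after a "/" in `url`.
def keyword_after_slash?(url, keyword)
  # %r{...} avoids having to escape the "/" inside the pattern.
  %r{/#{Regexp.escape(keyword)}}.match?(url)
end

keyword_after_slash?("www.website1.com/shop/womens/tops/tshirt", "women") # => true
keyword_after_slash?("www.website1.com/shop/womens/tops/tshirt", "men")   # => false
keyword_after_slash?("www.website2.com/products/womens-tshirt", "women")  # => true
```

Note that this is a prefix match: "men" would also match any path component that merely starts with "men", and a keyword that does not directly follow a "/" is not found.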

Upvotes: 0

Behrang Saeedzadeh

Reputation: 47913

If you check for "women" first, it should work, because "women" contains "men" but not the other way around:

# assumes str is not nil
def gender(str)
  if str.include?("women")
    "F"
  elsif str.include?("men") 
    "M"
  else
    nil
  end
end

If this is not what you are looking for, please explain your problem in more detail.

Upvotes: 11

fge

Reputation: 121710

You could split on "/" and check for string equality on the component(s) you want; there is no need for a regex.
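A minimal sketch of the splitting approach, with hypothetical method names. The first helper compares whole path components, which handles "/shop/womens/tops" but not "/products/womens-tshirt". The second is a variation that is not part of the suggestion above: it splits on every non-alphanumeric character so that hyphenated components are also broken up, and it does use a small regex as the split pattern.

```ruby
# Hypothetical helper: true when one "/"-separated component equals `word` exactly.
def path_component?(url, word)
  url.split("/").include?(word)
end

# Hypothetical variant: split on any run of non-alphanumeric characters
# ("/", "-", ".", "_", ...) and compare the resulting tokens exactly.
def url_token?(url, word)
  url.split(/[^A-Za-z0-9]+/).include?(word)
end

path_component?("www.website1.com/shop/womens/tops/tshirt", "womens") # => true
path_component?("www.website1.com/shop/womens/tops/tshirt", "mens")   # => false
path_component?("www.website2.com/products/womens-tshirt", "womens")  # => false
url_token?("www.website2.com/products/womens-tshirt", "womens")       # => true
url_token?("www.website2.com/products/womens-tshirt", "mens")         # => false
```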

Upvotes: 1
