jkeuhlen

Reputation: 4517

Regex to match exact word in string

I've looked around but haven't been able to find a working solution to my problem.

I have an array of two strings, input, and want to test which element of the array contains the exact word Test.

One thing I have tried (among numerous other attempts):

input = ["Test's string", "Test string"]
# Alternative input array that it needs to work on:
#  ["Testing string", "some Test string"]
substring = "Test"
if (input[0].match(/\b#{substring}\b/))
  puts "Test 0 "
  # Do something...
elsif (input[1].match(/\b#{substring}\b/))
  puts "Test 1"
  # Do something different...
end

The desired result is a print of "Test 1". The input can be more complex but overall I am looking for a way to find an exact match of a substring in a longer string. I feel like this should be a rather trivial regex but I haven't been able to come up with the correct pattern. Any help would be greatly appreciated!

Upvotes: 2

Views: 9071

Answers (3)

Fumu 7

Reputation: 1091

The following code may be what you are looking for.

input = ["Testing string", "Test string"]
substring = "Test"

if (input[0].match(/(?:^|\s)#{substring}(?:\s|$)/))
  puts "Test 0 "
elsif (input[1].match(/(?:^|\s)#{substring}(?:\s|$)/))
  puts "Test 1"
end

The meaning of the pattern /(?:^|\s)#{substring}(?:\s|$)/ is

  1. (?:^|\s) : the left side of the substring is the beginning of a line (^) or white space,

  2. #{substring} : the substring is matched exactly,

  3. (?:\s|$) : the right side of the substring is white space or the end of a line ($).

The alternatives need a group rather than square brackets: inside [...], a leading ^ negates the character class, and | and $ are literal characters.
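The "whitespace or end of string on both sides" idea can also be sketched with lookarounds, which check the neighbouring characters without consuming them. This is only an illustration: contains_exact_word? is a hypothetical helper name, and Regexp.escape is added on the assumption that the substring may contain regex metacharacters.

```ruby
# Hypothetical helper (not part of the answer above): true when `word`
# appears in `str` delimited by whitespace or by the ends of the string.
def contains_exact_word?(str, word)
  # (?<!\S) : the match is not preceded by a non-whitespace character
  # (?!\S)  : the match is not followed by a non-whitespace character
  # Regexp.escape keeps characters such as $ or . literal.
  str.match?(/(?<!\S)#{Regexp.escape(word)}(?!\S)/)
end

input = ["Test's string", "Test string"]
index = input.index { |s| contains_exact_word?(s, "Test") }
puts "Test #{index}"   # prints "Test 1"
```

With the alternative array from the question, ["Testing string", "some Test string"], the same call should also return index 1.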

Upvotes: 3

Z. Huey Hu

Reputation: 61

The problem is with your bounding. In your original question, the word Test matches the first string because the position between the t and the ' is a \b word boundary. It's a valid match, so the code correctly responds with "Test 0". You need to decide how your match should terminate. If your input contains special characters, I don't think the regex will work properly. For example, /\bTest my $money.*/ will never match because of the $ in your substring.

What happens if you have multiple matches in your input array? Do you want to do something to all of them or just the first one?
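A short sketch of the points above, for illustration only: it shows the apostrophe acting as a word boundary, an unescaped $ preventing a match, and one possible way to collect every matching index. The three-element array and the use of Regexp.escape are assumptions added here, not part of the question.

```ruby
substring = "Test"

# \b sits between a word character and a non-word character. The
# apostrophe is a non-word character, so "Test's" still matches.
p "Test's string".match?(/\b#{substring}\b/)    #=> true
p "Testing string".match?(/\b#{substring}\b/)   #=> false

# An unescaped $ is an end-of-line anchor, so this pattern cannot match.
phrase = "Test my $money"
p phrase.match?(/\bTest my $money/)             #=> false
# Regexp.escape turns the $ into a literal character.
p phrase.match?(/\b#{Regexp.escape(phrase)}/)   #=> true

# If several elements may match, collect all of their indexes.
input = ["Test string", "Testing string", "some Test string"]
pattern = /\b#{Regexp.escape(substring)}\b/
matches = input.each_index.select { |i| input[i].match?(pattern) }
p matches                                       #=> [0, 2]
```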

Upvotes: 0

Cary Swoveland

Reputation: 110675

One way to do that is as follows:

input = ["Testing string", "Test"]

"Test #{ input.index { |s| s[/\bTest\b/] } }"
  #=> "Test 1"

input = ["Test", "Testing string"]
"Test #{ input.index { |s| s[/\bTest\b/] } }"
  #=> "Test 0"

\b in the regex denotes a word boundary.

Maybe you want a method to return the index of the first element of input that contains the word? That could be:

def matching_index(input, word)
  input.index { |s| s[/\b#{word}\b/i] }
end

input = ["Testing string", "Test"]   
matching_index(input, "Test")    #=> 1
matching_index(input, "test")    #=> 1
matching_index(input, "Testing") #=> 0
matching_index(input, "Testy")   #=> nil

Then you could use it like this, for example:

word = 'Test'
puts "The matching element for '#{word}' is at index #{ matching_index(input, word) }"
  #=> The matching element for 'Test' is at index 1

word = "Testing"
puts "The matching element for '#{word}' is '#{ input[matching_index(input, word)] }'"
  #=> The matching element for 'Testing' is 'Testing string'
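Because matching_index returns nil when no element contains the word, passing its result straight to input[...] would raise a TypeError. A minimal sketch of one way to guard against that (the message wording is just an example):

```ruby
def matching_index(input, word)
  input.index { |s| s[/\b#{word}\b/i] }
end

input = ["Testing string", "Test"]
word  = "Testy"

idx = matching_index(input, word)
# input[nil] would raise a TypeError, so check idx before indexing.
message = if idx
  "The matching element for '#{word}' is '#{input[idx]}'"
else
  "No element contains '#{word}'"
end
puts message   # prints "No element contains 'Testy'"
```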

Upvotes: 2
