Confusion with the output of /\S\W/ and /\W\S/ in Ruby 1.9.3

Question

I have just become familiar with \W and \S. I was experimenting to see how they behave, and tried the following:

> s="abd12 de 5t6"
=> "abd12 de 5t6"  # understood
> /\W/ =~ s
=> 5               # understood
> /\W\S/ =~ s
=> 5               # Confusion (A)
> /\S\W/ =~ s
=> 4               # Confusion (B)
> /\S/ =~ s
=> 0               # understood

What logic was applied in Part A and Part B to give 5 and 4 as the output? I just want to clear up my understanding. In Part A, 5 is the index of a non-word character, but that character is not also a non-whitespace character.

I just want to know how IRB treats the expressions in Confusion A and B.

Thanks

Marc Baumbach · Accepted Answer

When you have \W\S in your regular expression, you are essentially saying: "Find a match in the string where a character is a non-word character, followed by a non-space character."

In Confusion A, the first non-word character is the first space (at index 5). The character right after it is the d, which is a non-space character. That's a match, so it returns 5, since that's the index where the match began.
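If it helps to see this for yourself, here is a small sketch using Regexp#match instead of =~. It should show both the text the pattern consumed and the index where the match began. The variable names matched_text and start_index are just mine for illustration.

```ruby
s = "abd12 de 5t6"

# Regexp#match returns a MatchData object (or nil when nothing matches).
m = /\W\S/.match(s)

matched_text = m[0]        # the two characters consumed by the pattern
start_index  = m.begin(0)  # the same index that =~ returns

puts matched_text.inspect  # " d"
puts start_index           # 5
```

So =~ reports where the two-character match starts (the space), not where the \S part is.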

Similarly, for \S\W the first non-space character is a, but it's followed by b, which is a word character, so the match fails at that position. The same happens starting at b, d and 1. Once it gets to the 2 (index 4), that matches a non-space character, and it is followed by the space, which is a non-word character. That's a match, so it returns 4.
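To make the left-to-right search concrete, here is a rough hand-rolled sketch that checks each adjacent pair of characters against the two classes separately. This is only an illustration of the idea, not how the regex engine is actually implemented, and first_hit is a name I made up.

```ruby
s = "abd12 de 5t6"

first_hit = nil
# Walk over every adjacent pair of characters, left to right.
s.chars.each_cons(2).with_index do |(left, right), i|
  # left must be non-whitespace (\S) and right must be non-word (\W)
  pair_matches = !(left =~ /\S/).nil? && !(right =~ /\W/).nil?
  puts "#{i}: #{(left + right).inspect} #{pair_matches ? 'match' : 'no match'}"
  if pair_matches
    first_hit = i
    break
  end
end

puts first_hit  # 4
```

The pairs "ab", "bd", "d1" and "12" all fail because the second character is a word character. The pair "2 " at index 4 is the first one that satisfies both classes.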
