JayTarka

Reputation: 571

Stop scan splitting my match

When I scan for zan I get the entire group in one entry, behavior I expect and want:

"ax zan".scan /(zan)/ #=> [["zan"]] 

How can I have the whole match returned, whitespace and all? For example:

"ax       b".scan /.../ #=> [["ax       b"]]

When I scan for (ax)\s*(b), the match is split into two entries:

"ax b".scan /(ax)\s*(b)/ #=> [["ax", "b"]]

Update

How do I use the | operator without groups? These are the results I want:

"sab   x".scan /sab|p\s*x/ #=> [["sab   x"]]
"sap   x".scan /sab|p\s*x/ #=> [["sap   x"]]

Upvotes: 0

Views: 45

Answers (2)

Jordan Running

Reputation: 106077

As you've discovered, when you use a Regexp with capturing groups, String#scan returns an array of arrays, with each inner array holding one element per capture group. If that's not what you want, make your groups non-capturing with the (?:...) syntax, e.g. (?:foo|bar).

expr = /sa(?:b|p)\s*x/
"sab   x".scan(expr) #=> ["sab   x"]
"sap   x".scan(expr) #=> ["sap   x"]

P.S. The above works, but since only one character differs, in this case you should use a character class instead:

/sa[bp]\s*x/
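To see why the grouping matters, here is a small sketch comparing the variants side by side. The helper name `alternation_variants` is made up for illustration. Without any group, | splits the whole pattern into `sab` or `p\s*x`, which is why the pattern from the question's update stops short.

```ruby
# Hypothetical helper (not part of Ruby): runs each pattern variant
# against the same input so the results can be compared.
def alternation_variants(str)
  {
    # No group: | has the lowest precedence, so this means `sab` OR `p\s*x`.
    ungrouped: str.scan(/sab|p\s*x/),
    # Capturing group: scan returns the captures, not the whole match.
    capturing: str.scan(/sa(b|p)\s*x/),
    # Non-capturing group: scan returns the whole match.
    non_capturing: str.scan(/sa(?:b|p)\s*x/),
    # Character class: no group at all, same result as above.
    char_class: str.scan(/sa[bp]\s*x/)
  }
end

results = alternation_variants("sab   x")
p results[:ungrouped]      # prints ["sab"]
p results[:capturing]      # prints [["b"]]
p results[:non_capturing]  # prints ["sab   x"]
p results[:char_class]     # prints ["sab   x"]
```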

P.P.S. You should only use scan if you're looking for multiple matches. If you just want one match, use String#slice, which has the handy String#[] alias. This returns the first match as a string instead of an array (or nil if nothing matches):

expr = /sa(?:b|p)\s*x/
"sab   x"[expr] #=> "sab   x"
"sap   x"[expr] #=> "sap   x"

In case it's not clear, this works on variables, too, like any other method:

str = "sab   x"
str[/sa[bp]\s*x/] #=> "sab   x"
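Two details of String#[] are worth knowing when using it this way: it returns nil when nothing matches, and it accepts an optional second argument that selects a capture group. A minimal sketch, where `first_match` and `middle_letter` are hypothetical helper names:

```ruby
# Hypothetical helper: returns the first whole match, or nil if none.
def first_match(str)
  str[/sa[bp]\s*x/]
end

# Hypothetical helper: the second argument to String#[] picks a capture
# group, so this returns only the letter matched by ([bp]).
def middle_letter(str)
  str[/sa([bp])\s*x/, 1]
end

p first_match("sab   x")    # prints "sab   x"
p first_match("no match")   # prints nil
p middle_letter("sap   x")  # prints "p"
```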

Upvotes: 2

Avinash Raj

Reputation: 174756

Just remove the capturing groups.

"ax       b".scan(/ax\s*b/)

To get the match wrapped in an inner array, put the whole regex inside a single capturing group.

"ax       b".scan(/(ax\s*b)/)

As for why this returns two elements:

"ax b".scan /(ax)\s*(b)/ #=> [["ax", "b"]]

When the pattern contains capturing groups, scan returns the captures rather than the whole match. Only when there are no groups does it return the matched strings themselves. Your regex has two capturing groups, which capture ax and b, so you get those two elements inside one inner array. Note that as soon as even a single capturing group is present, the result is a two-dimensional array.

Example:

irb(main):001:0> "ax       b".scan(/ax\s*b/)
=> ["ax       b"]
irb(main):002:0> "ax       b".scan(/(ax\s*b)/)
=> [["ax       b"]]
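If the groups have to stay in the pattern for some other reason, one possible workaround is to pass a block to scan and read the whole match from Regexp.last_match, which scan sets for each match. This is only a sketch, and `whole_matches` is a hypothetical helper name:

```ruby
# Hypothetical helper: collects the full text of every match, even when
# the pattern contains capturing groups.
def whole_matches(str, pattern)
  matches = []
  # With a block, scan yields the captures, but Regexp.last_match[0]
  # still holds the entire matched substring for the current match.
  str.scan(pattern) { matches << Regexp.last_match[0] }
  matches
end

p whole_matches("ax b, ax     b", /(ax)\s*(b)/)  # prints ["ax b", "ax     b"]
```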

Upvotes: 2
