lulalala

Reputation: 17991

What's the difference between scan and match on Ruby string

I am new to Ruby and have always used String.scan to search for the first occurrence of a number. It is kind of strange that the returned value is a nested array, but I just use [0][0] to get the values I want. (I am sure it has its purpose; I just haven't needed it yet.)

I just found out that there is a String.match method. And it seems to be more convenient because the returned array is not nested.

Here is an example of the two, first is scan:

>> 'a 1-night stay'.scan(/(a )?(\d*)[- ]night/i).to_a
=> [["a ", "1"]]

and then match:

>> 'a 1-night stay'.match(/(a )?(\d*)[- ]night/i).to_a
=> ["a 1-night", "a ", "1"]

I have checked the API, but I can't really tell the difference, as both are described as 'matching the pattern'.

This question is, simply out of curiosity, about what scan can do that match can't, and vice versa. Is there any specific scenario that only one of them can handle? Is match inferior to scan?

Upvotes: 53

Views: 66923

Answers (3)

Erik Waters

Reputation: 327

Previous answers state that scan will return every match from the string the method is called on, but this is not quite accurate.

The String class's scan method iterates over a string and returns non-overlapping matches.

string = 'xoxoxo'

p string.scan('xo') # => ["xo", "xo", "xo"]
# so far so good but...

p string.scan('xox') # => ["xox"]
# if this returned EVERY instance of 'xox', it would include substrings
# starting at indices 0 and 2, but only one match is returned
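
If you do need the overlapping occurrences, one common workaround (not part of the answer above, just a sketch) is to wrap the pattern in a zero-width lookahead with a capture group. The lookahead consumes no characters, so scan moves forward one position at a time instead of jumping past the whole match:

```ruby
string = 'xoxoxo'

# scan resumes after the end of each match, so the second 'xox'
# (starting at index 2) is never seen.
non_overlapping = string.scan('xox')

# The lookahead matches an empty string at each position where 'xox'
# begins; the capture group records the text. flatten unwraps the
# one-element arrays that scan returns when the pattern has a group.
overlapping = string.scan(/(?=(xox))/).flatten

p non_overlapping # => ["xox"]
p overlapping     # => ["xox", "xox"]
```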

Upvotes: 10

Jonathan Julian

Reputation: 12272

Short answer: scan will return all matches. This doesn't make it superior, because if you only want the first match, str.match[2] reads much nicer than str.scan[0][1].

ruby-1.9.2-p290 :002 > 'a 1-night stay, a 2-night stay'.scan(/(a )?(\d*)[- ]night/i).to_a
 => [["a ", "1"], ["a ", "2"]] 
ruby-1.9.2-p290 :004 > 'a 1-night stay, a 2-night stay'.match(/(a )?(\d*)[- ]night/i).to_a
 => ["a 1-night", "a ", "1"] 
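
A few more differences worth seeing side by side. This is only an illustrative sketch; the variable names are made up, and the pattern uses \d+ instead of the question's \d* so that a number is required:

```ruby
str = 'a 1-night stay, a 2-night stay'

# Without capture groups, scan returns a flat array of matched substrings.
all_nights = str.scan(/\d+(?=[- ]night)/i)

# With capture groups, each element is an array of that match's captures
# (the whole match itself is not included).
pairs = str.scan(/(a )?(\d+)[- ]night/i)

# match stops at the first hit; index 2 is the second capture group.
first_night = str.match(/(a )?(\d+)[- ]night/i)[2]

# When nothing matches, match returns nil but scan returns an empty array.
no_match = 'no digits here'.match(/\d+/)
no_scan  = 'no digits here'.scan(/\d+/)

p all_nights  # => ["1", "2"]
p pairs       # => [["a ", "1"], ["a ", "2"]]
p first_night # => "1"
p no_match    # => nil
p no_scan     # => []
```

The nil return is the main thing to watch for with match: calling [2] on a failed match raises NoMethodError, whereas indexing into scan's empty array just gives nil.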

Upvotes: 64

unrelativity

Reputation: 3730

#scan returns everything that the Regex matches.

#match returns the first match as a MatchData object, which contains the data held by special variables like $& (the text matched by the whole Regex; that's what maps to index 0), $1 (the first capture group), $2, et al.
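
To make the MatchData point concrete, here is a small sketch using the question's string. The named-group pattern at the end is my own variation for illustration, not something from the question:

```ruby
m = 'a 1-night stay'.match(/(a )?(\d*)[- ]night/i)

# $& is set by the most recent successful match in this scope,
# so read it right away before running another regex.
whole_via_global = $&

whole    = m[0]         # the full matched text, same as $&
article  = m[1]         # first capture group, same as $1
nights   = m[2]         # second capture group, same as $2
captures = m.captures   # only the groups, like one element of scan's result
rest     = m.post_match # text after the match

# Named groups (hypothetical names) can be read by symbol instead of index.
named = 'a 1-night stay'.match(/(?<article>a )?(?<nights>\d+)[- ]night/i)
named_nights = named[:nights]

p whole        # => "a 1-night"
p captures     # => ["a ", "1"]
p rest         # => " stay"
p named_nights # => "1"
```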

Upvotes: 27
