Richard Stokes

Reputation: 3552

String.scan returning empty array in Ruby

I've written a very basic regex in Ruby for scraping email addresses off the web. It looks like this:

/\b\w+(\.\w+)*@\w+\.\w+(\.\w+)*\b/

To test it in irb or Rubular, I use the following string:

"[email protected]"

When I run the Regexp.match(string) command in irb, I get this:

regexp.match(string) => #<MatchData "[email protected]" 1:nil 2:nil>

So the match seems to be recorded in the MatchData object. However, when I run the String.scan(regex) command (which is what I'm primarily interested in), I get the following:

string.scan(regex) => [[nil, nil]]

Why isn't scan returning the matched email address? Is it a problem with the regular expression? Or is it a nuance of String.scan/Regexp/MatchData that somebody could make me aware of?

Upvotes: 0

Views: 1315

Answers (1)

Steve Wang

Reputation: 1824

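The main issue is that your capturing groups (the parts of the pattern in parentheses) aren't capturing what you want. When a pattern contains groups, String#scan returns the group captures for each match rather than the whole matched string. Both of your groups are optional repeats that matched zero times, hence [[nil, nil]].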

Let's say you want just the username and domain. You could use something along the lines of /\b(\w+(?:\.\w+)*)@(\w+(?:\.\w+)*)\.\w+\b/, where (?:...) is a non-capturing group. As it stands, your pattern matches the input text, but the groups don't actually capture any text.
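
To make the difference concrete, here is a minimal sketch comparing the three variants with String#scan. The sample addresses and the constant names are made up for illustration (the address in the question is obscured), so treat this as a demonstration of the behaviour rather than a vetted email matcher.

```ruby
# Hypothetical sample text; these addresses are placeholders, not the asker's data.
SAMPLE = "contact user@example.com or first.last@mail.example.org"

# Pattern from the question: both groups are optional repeats, so they
# capture nil whenever an address has no extra dotted parts.
ORIGINAL = /\b\w+(\.\w+)*@\w+\.\w+(\.\w+)*\b/

# Same pattern with non-capturing groups, so scan returns whole matches.
WHOLE_MATCH = /\b\w+(?:\.\w+)*@\w+\.\w+(?:\.\w+)*\b/

# Pattern suggested in this answer: captures username and domain (without the last label).
PARTS = /\b(\w+(?:\.\w+)*)@(\w+(?:\.\w+)*)\.\w+\b/

p "user@example.com".scan(ORIGINAL)  # => [[nil, nil]]
p SAMPLE.scan(WHOLE_MATCH)           # => ["user@example.com", "first.last@mail.example.org"]
p SAMPLE.scan(PARTS)                 # => [["user", "example"], ["first.last", "mail.example"]]
```

If you only want the full addresses, the non-capturing version is probably the smallest change to your original pattern.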

Also, why not just use /([\w\.]+)@([\w\.]+)\.\w+/? (I'm not too familiar with Ruby's regex engine, but that should be about right. You don't even need to check for word boundaries if you're using greedy quantifiers.)
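
A quick sketch of how the simpler pattern behaves, again with invented inputs and constant names. One trade-off worth knowing about, assuming you care about malformed input: the character class also accepts consecutive dots, which the stricter pattern does not treat as part of the username.

```ruby
# Simpler pattern from this answer: a character class instead of nested groups.
SIMPLE = /([\w\.]+)@([\w\.]+)\.\w+/

# Stricter pattern from earlier in this answer, for comparison.
STRICT = /\b(\w+(?:\.\w+)*)@(\w+(?:\.\w+)*)\.\w+\b/

# Hypothetical inputs, chosen only to show the difference.
p "user@example.com".scan(SIMPLE)   # => [["user", "example"]]
p "a..b@example.com".scan(SIMPLE)   # => [["a..b", "example"]]
p "a..b@example.com".scan(STRICT)   # => [["b", "example"]]
```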

Upvotes: 3
