Reputation: 1088
I have a bunch of phone numbers with one per line:
[Home] (202) 121-7777
C (202) 456-1111
[mobile] 55 55 5 55555
[Work] (404) 555-1234
[Cell] (505) 555-1234
W 303-555-5555
M 777-555-5555
c 12346567s
I want to grab the first one that contains the letter "c" upper or lower case.
So far, I have this: /^.*[C].*$/i, and that matches C (202) 456-1111, [Cell] (505) 555-1234, and c 12346567s. How do I return only the first? In other words, the match should only be C (202) 456-1111.
I have been blindly putting question marks everywhere without success.
I am using Ruby, if it makes a difference: http://www.rubular.com/r/h6ReB9IN8t
Edit: Here is another question that Hrishi pointed to but I cannot figure out how to adapt it to match the whole line.
Upvotes: 2
Views: 3086
Reputation: 80
Try the match method. Here is an example:
list = <<EOF
[Home] (202) 121-7777
C (202) 456-1111
[mobile] 55 55 5 55555
[Work] (404) 555-1234
[Cell] (505) 555-1234
W 303-555-5555
M 777-555-5555
c 12346567s
EOF
Update
# Match the first line that contains the letter "c", even when it is part of a word
puts list.match(/^.*C.*$/i)
# Match the first line where "c" stands alone as the first word on the line
puts list.match(/^\W*C\W.*$/i)
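As a rough check of why this returns only one line: String#match stops at the first place the pattern matches, and in Ruby ^ and $ anchor at line boundaries, so the result is the first matching line. The sketch below uses a shortened copy of the list; the variable names are only for illustration. It also shows that match returns nil when nothing matches.

```ruby
# Shortened copy of the list from the question (illustration only).
list = "[Home] (202) 121-7777\nC (202) 456-1111\n[Cell] (505) 555-1234\nc 12346567s\n"

# match returns a MatchData for the first match only; [0] is the matched text.
first = list.match(/^.*C.*$/i)
first_line = first[0]

# When no line matches, match returns nil instead of raising.
missing = list.match(/^.*z.*$/i)

puts first_line        # C (202) 456-1111
puts missing.inspect   # nil
```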
Upvotes: 2
Reputation: 160551
I'd go about this a bit differently. I prefer to reduce regular expressions to very simple patterns:
str = <<EOT
[Home] (202) 121-7777
C (202) 456-1111
[mobile] 55 55 5 55555
[Work] (404) 555-1234
[Cell] (505) 555-1234
W 303-555-5555
M 777-555-5555
c 12346567s
EOT
Finding the right line to work with is easily done using either select or find:
str.split("\n").select{ |s| s[/c/i] }.first # => "C (202) 456-1111"
str.split("\n").find{ |s| s[/c/i] } # => "C (202) 456-1111"
I'd recommend find because it stops at the first match and returns only that element, whereas select scans every line and returns an array.
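To make the difference concrete, here is a small sketch that counts how often each block runs. The counters and the shortened array are my own additions for illustration, not part of the original data.

```ruby
# Shortened copy of the list, already split into lines (illustration only).
lines = [
  "[Home] (202) 121-7777",
  "C (202) 456-1111",
  "[Cell] (505) 555-1234",
  "c 12346567s"
]

# select visits every element and collects all matches.
select_calls = 0
all_matches = lines.select { |s| select_calls += 1; s[/c/i] }

# find stops as soon as the block returns a truthy value.
find_calls = 0
first_match = lines.find { |s| find_calls += 1; s[/c/i] }

puts all_matches.first   # C (202) 456-1111
puts first_match         # C (202) 456-1111
puts select_calls        # 4
puts find_calls          # 2
```

Both give the same first line, but find only looked at two of the four elements.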
Once the desired string is found, use scan to grab the numbers:
str.split("\n").find{ |s| s[/c/i] }.scan(/\d+/) # => ["202", "456", "1111"]
Then join them. Phone numbers stored in a database shouldn't be formatted; you just want the digits. Formatting happens later, when you output them again.
phone_number = str.split("\n").find{ |s| s[/c/i] }.scan(/\d+/).join # => "2024561111"
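If you do this in more than one place, the digit-stripping step could be wrapped in a small helper. This is only a sketch: normalize_phone is a hypothetical name, and the use of gsub is my substitution for scan plus join, which gives the same result here.

```ruby
# Hypothetical helper; the name is not from the original answer.
# Drops every non-digit character so only the digits are stored.
def normalize_phone(line)
  line.gsub(/\D/, "")
end

from_gsub = normalize_phone("C (202) 456-1111")
from_scan = "C (202) 456-1111".scan(/\d+/).join

puts from_gsub                # 2024561111
puts from_gsub == from_scan   # true

# A digit count that is not what you expect flags an incomplete entry.
puts normalize_phone("c 12346567s").length   # 8
```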
When you need to output the number, break it into the right grouping based on the regional phone-number representation. You should have some idea where the person is located, because you've usually also got their country code. Based on that you know how many digits you should have, plus the groups:
area_code, prefix, number = phone_number[0 .. 2], phone_number[3 .. 5], phone_number[6 .. 9] # => ["202", "456", "1111"]
Then output them so they're displayed correctly:
"(%s) %s-%s" % [area_code, prefix, number] # => "(202) 456-1111"
As far as your original pattern /^.*[C].*$/i goes, there are some things wrong with your understanding of regex:
^.* says "start at the beginning of a line and find zero or more characters", which, when you only need to know whether the line contains the letter, is no more effective than saying /[C]/.
[C] creates an unnecessary character set, which means "find one of the letters in the set 'C'". It does nothing useful, so just use C, as in /C/.
.*$ artificially finds the end of the line also, but since you're not capturing it there's nothing accomplished, so don't bother with it. The regex is now /C/.
To match both upper and lower case, use /C/i or /c/i. (Or you could use /[cC]/, but why?)
Instead:
/c/i. That's all that's needed. http://rubular.com/r/uPyxACOWls
/c(?:ell)?/. http://rubular.com/r/TkSRPWG2y6
/\bc(?:ell)?\b/. http://rubular.com/r/Smo0bFs9w8
You can get a whole lot more complicated, but if you're not accomplishing anything with the additional pattern information, you're just wasting the regex-engine's CPU-time, and slowing your code. A confused regex-engine can waste a LOT of CPU-time, so be efficient and aware of what you're asking it to do.
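Here is a quick sketch of how the simple pattern and the word-boundary pattern differ. The "[Office]" line is hypothetical, added only because none of the original lines has a "c" buried inside a longer word, and I've added the i flag to the word-boundary pattern so it matches both cases.

```ruby
lines = [
  "[Home] (202) 121-7777",
  "C (202) 456-1111",
  "[Cell] (505) 555-1234",
  "[Office] (606) 555-0000",   # hypothetical line, not in the original list
  "c 12346567s"
]

# /c/i matches any line containing the letter, including "Office".
any_c = lines.select { |s| s =~ /c/i }

# \b...\b only accepts "c" or "cell" as a whole word, so "Office" is skipped.
whole_word = lines.select { |s| s =~ /\bc(?:ell)?\b/i }

puts any_c.size         # 4
puts whole_word.size    # 3
puts lines.find { |s| s =~ /\bc(?:ell)?\b/i }   # C (202) 456-1111
```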
Upvotes: 1
Reputation: 4847
Split the string on the newline characters, select the substrings that match your requirements, and grab the first one:
str = '[Home] (202) 121-7777
C (202) 456-1111
[mobile] 55 55 5 55555
[Work] (404) 555-1234
[Cell] (505) 555-1234
W 303-555-5555
M 777-555-5555
c 12346567s'
p str.split(/\n/).select{|el| el =~ /^.*[C].*$/i}[0]
or use match:
p str.match(/^.*[C].*$/i)[0]
EDITED:
Or, in case you want to find the first line that starts exactly with an uppercase C, try this:
p str.match(/^C.*$/)[0]
Upvotes: 1
Reputation: 9414
EDIT: Added two more ways of handling this. The last one is preferable.
This will do what you want. It searches for a match of your regex and returns the first one. Please note that the first form raises NoMethodError if string has no match, because match returns nil; the last form simply returns nil.
string = "[Home] (202) 121-7777
C (202) 456-1111
[mobile] 55 55 5 55555
[Work] (404) 555-1234
[Cell] (505) 555-1234
W 303-555-5555
M 777-555-5555
c 12346567s"
puts string.match(/^(.*[C].*)$/i).captures.first
puts string.match(/^(.*[C].*)$/i)
puts string[/^(.*[C].*)$/i]
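For the no-match case, here is a minimal sketch of how these forms behave. The two-line string is made up so that nothing matches, and the safe navigation operator (&., Ruby 2.3 or later) is my addition rather than part of the answer above.

```ruby
# Made-up input with no "c" anywhere, so none of the patterns match.
string = "[Home] (202) 121-7777\nW 303-555-5555"

# String#[] with a regex returns nil when nothing matches.
by_index = string[/^(.*[C].*)$/i]

# match also returns nil; &. skips the chained calls instead of raising.
by_match = string.match(/^(.*[C].*)$/i)&.captures&.first

# Without &. the chained call on nil raises NoMethodError.
raised = begin
  string.match(/^(.*[C].*)$/i).captures
  false
rescue NoMethodError
  true
end

puts by_index.inspect   # nil
puts by_match.inspect   # nil
puts raised             # true
```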
Upvotes: 1