Dex

Reputation: 12749

Regex to Grab City, State, Zip

I'm trying to write a regex that handles input in either of these forms:

  1. Beverly Hills, CA
  2. Beverly Hills, CA 90210

I have this:

^(.+)[,\\s]+(.+)\s+(\d{5})?$

It matches case #2 but not case #1. If I change the \s+ to \s*, it matches #1 but not #2.

You can play around with it here: http://rubular.com/r/oqKBJ4r8cq

Upvotes: 2

Views: 7222

Answers (4)

Andrew Hare

Reputation: 351526

Try this instead:

^([^,]+),\s([A-Z]{2})(?:\s(\d{5}))?$

This expression matches both examples and captures the city, state, and zip code in separate groups. It expects exactly one whitespace character after the comma and, when a zip code is present, one before it.

Here is how it breaks down:

^           # anchor to the start of the string
([^,]+)     # match one or more characters that are not a comma (the city)
,           # match the comma itself
\s          # match a single whitespace character
([A-Z]{2})  # match a two-letter uppercase state code
(?:         # open a non-capturing group
    \s        # match a single whitespace character
    (\d{5})   # match a 5-digit zip code
)?          # this whole group is optional
$           # anchor to the end of the string
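
As a rough sketch of how the three groups could be read from Ruby (the language Rubular tests against): the pattern is the one above, unchanged, while the parse_address helper and its hash keys are made up for illustration.

```ruby
# Hypothetical helper, not part of the original answer.
# Returns a hash of the captured pieces, or nil when the line does not match.
def parse_address(line)
  m = /^([^,]+),\s([A-Z]{2})(?:\s(\d{5}))?$/.match(line)
  return nil unless m

  # m[3] is nil when the optional zip group did not take part in the match.
  { city: m[1], state: m[2], zip: m[3] }
end

p parse_address("Beverly Hills, CA")
p parse_address("Beverly Hills, CA 90210")
```

Note that in Ruby ^ and $ anchor at line boundaries, so for multi-line input \A and \z may be the safer choice.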

Upvotes: 6

Bnicholas

Reputation: 14121

((?:\w|\s)+),\s(AL|AK|AS|AZ|AR|CA|CO|CT|DE|DC|FM|FL|GA|GU|HI|ID|IL|IN|IA|KS|KY|LA|ME|MH|MD|MA|MI|MN|MS|MO|MT|NE|NV|NH|NJ|NM|NY|NC|ND|MP|OH|OK|OR|PW|PA|PR|RI|SC|SD|TN|TX|UT|VT|VI|VA|WA|WV|WI|WY)

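Here is a longer one that only accepts valid state codes. Note that it has no group for the zip code and is not anchored, so it matches both examples but captures only the city and state.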
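
A possible way to keep the state whitelist and still pick up the zip, sketched in Ruby. The code list is copied from the pattern above; the address_pattern helper, the \A and \z anchors, and the optional zip group are additions assumed here, not part of this answer.

```ruby
# Hypothetical helper: builds the pattern from a list of codes so the
# alternation does not have to be maintained by hand.
def address_pattern
  codes = %w[
    AL AK AS AZ AR CA CO CT DE DC FM FL GA GU HI ID IL IN IA KS KY LA ME MH
    MD MA MI MN MS MO MT NE NV NH NJ NM NY NC ND MP OH OK OR PW PA PR RI SC
    SD TN TX UT VT VI VA WA WV WI WY
  ]
  # Regexp.union escapes each string and joins them with "|".
  # The city group is the one from the answer; the zip group is optional.
  /\A((?:\w|\s)+),\s(#{Regexp.union(codes)})(?:\s(\d{5}))?\z/
end

m = address_pattern.match("Beverly Hills, CA 90210")
p m[1], m[2], m[3] if m
```

An unknown code such as "ZZ" produces no match at all, which may or may not be what you want for user-entered data.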

Upvotes: 0

James Kyburz

Reputation: 14453

["Beverly Hills, CA 90210", "Beverly Hills, CA"].each do |s|
  m = s.match(/^([^,]*),\s*(\w*)\s*(\d*)?$/)
  $1 # => "Beverly Hills", "Beverly Hills"
  $2 # => "CA", "CA"
  $3 # => "90210", ""
end

The # => comments show the capture values for the first and second string, in that order.

Upvotes: 0

Dogbert

Reputation: 222198

Try this:

^(.+)[,\\s]+(.+?)\s*(\d{5})?$

http://rubular.com/r/qS0e5vAQnT

Upvotes: 6
