ebcrypto

Reputation: 610

Convert street address from string to columns - Regex?

I have a list of 350 addresses in a single-column Excel file that I need to import into a SQL table, breaking the data into columns.

Each Excel cell looks like this one:

Courtesy Motors 2520 Cohasset Rd - Chico, CA 95973-1307 530-893-1300  

What strategy should I apply to import this in a clean fashion?

I was thinking

NAME <- anything before the 1st digit

STREET ADDRESS <- from the 1st digit to the '-'

STATE <- Anything from the last ',' to the '-' immediately before (the address field can contain some - )

TELEPHONE <- Last 12 char

ZIP <- first 10 chars of the last 23 chars (the 10-char ZIP, a space, then the 12-char phone number)

I work in C# if this matters.

Is a regex the appropriate approach? I'm not too familiar with them, so I'm not sure. Can somebody suggest a regular expression that would do the job (or part of it)?

Thanks!

Upvotes: 0

Views: 484

Answers (3)

MoXplod

Reputation: 3852

You can use the Google Geocoding API. You might have to strip the phone number from the string first, but for anyone who needs address parsing with more functionality than a regex offers, it also returns the latitude and longitude of the address.

For your address example

http://maps.googleapis.com/maps/api/geocode/xml?address=2520%20Cohasset%20Rd%20-%20Chico%2C%20CA%2095973-1307%20530-893-1300%20%20&sensor=false

Documentation

https://developers.google.com/maps/documentation/geocoding/
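As a rough sketch of the "remove the phone number, then build the request" step, the snippet below strips a trailing `NNN-NNN-NNNN` phone number and URL-encodes the rest. It is in Python for brevity, and the helper name `geocode_request_url` is made up for illustration. The endpoint and the `sensor` parameter are copied from the example URL above; the current Google documentation should be checked for what a request needs today (an API key, for instance) before relying on this.

```python
import re
from urllib.parse import urlencode

# Endpoint taken from the example URL in this answer (may be outdated).
GEOCODE_URL = "http://maps.googleapis.com/maps/api/geocode/xml"

def geocode_request_url(address_with_phone):
    """Build a geocoding request URL after dropping a trailing phone number."""
    # Remove a trailing NNN-NNN-NNNN phone number and any surrounding whitespace.
    address = re.sub(r"\s*\d{3}-\d{3}-\d{4}\s*$", "", address_with_phone)
    # urlencode escapes spaces, commas, etc. for use in a query string.
    return GEOCODE_URL + "?" + urlencode({"address": address, "sensor": "false"})

url = geocode_request_url("2520 Cohasset Rd - Chico, CA 95973-1307 530-893-1300  ")
print(url)
```

The sketch only builds the URL. Sending the request and parsing the XML response are left out.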

Upvotes: 0

Jason McCreary

Reputation: 72971

A regular expression is the right tool for this job. I am not a C# developer, so I can't give you the exact code. Nonetheless, the following regex should work. Most IDEs have regex search built in, or, if you have access to a UNIX shell, sed would work.

([^\d]+)\s(.+?)\s-\s[^,]+,\s([A-Z]{2})\s([^\s]+)\s([^\s]+)

Captures:

  1. Name
  2. Address
  3. State
  4. ZIP
  5. Phone
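To show the five captures in action, here is a minimal sketch that applies the pattern above to the sample row from the question. It is written in Python rather than C#. The pattern uses only constructs that .NET's `Regex` class also supports, so it should carry over, but that is untested here. The names `parse_row` and `FIELDS` are illustrative, not part of the answer. Note that the pattern skips over the city without capturing it.

```python
import re

# Pattern copied verbatim from the answer above.
PATTERN = re.compile(r"([^\d]+)\s(.+?)\s-\s[^,]+,\s([A-Z]{2})\s([^\s]+)\s([^\s]+)")
FIELDS = ("name", "street", "state", "zip", "phone")  # illustrative column names

def parse_row(cell):
    """Return a dict of the five captured fields, or None if the cell does not match."""
    # strip() removes the trailing spaces that the Excel cells may carry.
    match = PATTERN.fullmatch(cell.strip())
    if match is None:
        return None
    return dict(zip(FIELDS, match.groups()))

row = parse_row("Courtesy Motors 2520 Cohasset Rd - Chico, CA 95973-1307 530-893-1300  ")
print(row)

# A street containing a hyphen still parses, because the separator must be " - ".
hyphenated = parse_row("Acme Auto 12-B Main St - Chico, CA 95973-1307 530-893-1300")
print(hyphenated["street"])
```

Rows that return `None` can be collected and fixed by hand, which is practical for a list of 350 addresses.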

Upvotes: 1

Amber

Reputation: 526583

The following regex should pull out each part in a capture group:

(\D+) ([^-]+) - ([^,]+, \w+) ([\d-]+) ([\d-]+)

Capture groups, in order:

  1. Name
  2. Street address
  3. City, State
  4. Zip
  5. Phone
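A small sketch of how these groups might be used, again in Python rather than C#: the pattern is the one above with names attached to the groups (the group names and the helper `split_address` are assumptions for illustration), and group 3 is split into separate city and state values. One limitation to be aware of is that `[^-]+` cannot cross a hyphen, so a street address that itself contains a `-` will not match.

```python
import re

# Same pattern as in the answer above, with (hypothetical) names added to each group.
PATTERN = re.compile(
    r"(?P<name>\D+) (?P<street>[^-]+) - (?P<city_state>[^,]+, \w+) "
    r"(?P<zip>[\d-]+) (?P<phone>[\d-]+)"
)

def split_address(cell):
    """Return the parsed fields as a dict, or None if the cell does not match."""
    match = PATTERN.fullmatch(cell.strip())  # strip() drops trailing spaces
    if match is None:
        return None
    fields = match.groupdict()
    # Group 3 holds "City, State"; split it into two columns.
    city, state = fields.pop("city_state").split(", ")
    fields["city"] = city
    fields["state"] = state
    return fields

parsed = split_address("Courtesy Motors 2520 Cohasset Rd - Chico, CA 95973-1307 530-893-1300  ")
print(parsed)

# A hyphen inside the street address defeats [^-]+, so this row is rejected.
rejected = split_address("Acme Auto 12-B Main St - Chico, CA 95973-1307 530-893-1300")
print(rejected)
```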

Upvotes: 1
