Reputation: 1897
Hey, folks. I'm looking for some regular expressions to help grab street addresses and phone numbers from free-form text (a la Gmail).
Given some text: "John, I went to the store today, and it was awesome! Did you hear that they moved to 500 Green St.? ... Give me a call at +14252425424 when you get a chance."
I'd like to be able to pull out:
500 Green St. (recognized as a street address)
+14252425424 (recognized as a phone number)
What makes this problem easier is that I don't care about parsing text that gets pulled out. That is, I don't care that Green is the name of the road or that 425 is the area code. I just want to grab strings that "look like" addresses or telephone numbers.
Unfortunately, this needs to work internationally, as well as possible.
Anyone have any leads? Thanks!
Upvotes: 2
Views: 1132
Reputation: 145
You can give RecogniContact (address-parser.com) a try; it recognizes both postal addresses and phone numbers.
Upvotes: 1
Reputation: 154543
Phone numbers are easy as long as you have a list of all country codes and number formats. For street addresses I have no idea; the only advice I can give you is to validate each candidate set of words at addressdoctor.com.
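To make the phone-number half concrete, here is a rough sketch of the "looks like a phone number" approach. It does not use a country-code table; the pattern, the 8 to 15 digit range, and the name find_phone_candidates are my own assumptions for illustration, so expect false positives (order numbers, long IDs) and misses (numbers written with parentheses).

```python
import re

# Assumed heuristic, not a validated format list: an optional leading "+",
# then 8 to 15 digits, with at most one space, dot, or hyphen between digits.
PHONE_RE = re.compile(r"(?<![\w+])\+?\d(?:[ .-]?\d){7,14}(?!\w)")


def find_phone_candidates(text):
    """Return substrings that look like phone numbers (no validation)."""
    return [m.group(0) for m in PHONE_RE.finditer(text)]


sample = (
    "Did you hear that they moved to 500 Green St.? ... "
    "Give me a call at +14252425424 when you get a chance."
)
print(find_phone_candidates(sample))  # ['+14252425424']
```

Candidates found this way could then be checked against per-country formats, which is where the list of country codes would come in.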
Upvotes: 1
Reputation: 20621
Take a look at Chapter 7 of Dive Into Python. It touches on both phone numbers and street addresses, and I believe you can use it as a starting point. The international part seems tough. I suggest you build a first draft, try it on several locales, then iterate and improve.
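As a possible first draft for the street-address side, something like the sketch below could be a starting point. It is not taken from the book; the suffix list, the pattern, and the name find_address_candidates are assumptions made up for this example, and it only covers US-style "number, name, suffix" addresses. Other locales would need their own patterns.

```python
import re

# Assumed, deliberately short list of US-style street suffixes.
STREET_SUFFIXES = r"(?:Street|St|Avenue|Ave|Road|Rd|Boulevard|Blvd|Lane|Ln|Drive|Dr)"

# House number, one to four capitalized words, a suffix, and an optional period.
ADDRESS_RE = re.compile(
    r"\b\d{1,6}\s+(?:[A-Z][\w'-]*\s+){1,4}" + STREET_SUFFIXES + r"\b\.?"
)


def find_address_candidates(text):
    """Return substrings that look like US-style street addresses."""
    return [m.group(0) for m in ADDRESS_RE.finditer(text)]


sample = (
    "Did you hear that they moved to 500 Green St.? ... "
    "Give me a call at +14252425424 when you get a chance."
)
print(find_address_candidates(sample))  # ['500 Green St.']
```

Running it over text from several locales should show quickly which patterns are missing, which fits the draft, test, iterate loop.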
Upvotes: 0