Reputation: 457
I'm new to Python and I was wondering how string comparison is done.
Let's say I have a list of strings containing state names like
states = ["New York", "California", "Nebraska", "Idaho"]
I also have another string that contains an address like
postal_addr = "1234 1st E St San Jose California 95112"
How do I parse this address string and find a match with the items in the states list? In the above example, California will be a match. How do I then, after matching, extract "California" and store it as a separate string?
Upvotes: 0
Views: 1005
Reputation: 5382
I would do
matches = [ s for s in states if s in postal_addr ]
Then, if you want to get the string from the postal address:
import re
if matches:
    # re.escape keeps any regex metacharacters in a name from being interpreted
    extracted = re.findall(re.escape(matches[0]), postal_addr)[0]
EDIT: ...but this won't work for city/state combos where the city name contains a different state, for example if postal_addr = '1 Arrowhead Dr, Kansas City, Missouri 64129'
and states = ["New York", "California", "Nebraska", "Idaho", "Missouri", "Kansas"]
etc. In this case, take the match that starts furthest to the right in the address:
import re
if matches:
    # Pair each match with its start position, then keep the one that starts last
    extracted = [(re.search(re.escape(m), postal_addr).start(), m) for m in matches]
    extracted = sorted(extracted)[-1][1]
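The same "keep the match that starts furthest right" idea can be sketched without a regex by using str.rfind. This is only a sketch: the helper name last_state is made up for illustration, and it assumes the state is the last state name to appear in the address.

```python
def last_state(address, states):
    """Return the state name whose last occurrence starts furthest right, or None."""
    # Only keep names that actually occur; rfind gives the start of the last occurrence.
    found = [(address.rfind(s), s) for s in states if s in address]
    # Tuples compare by position first, so max() picks the rightmost match.
    return max(found)[1] if found else None


states = ["New York", "California", "Nebraska", "Idaho", "Missouri", "Kansas"]
print(last_state("1 Arrowhead Dr, Kansas City, Missouri 64129", states))  # Missouri
```

Returning None when nothing matches avoids the IndexError you would get from indexing an empty list.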
Upvotes: 1
Reputation: 3474
Here's another alternative answer using a regexp:
import re
states = ["New York", "California", "Nebraska", "Idaho"]
pattern = re.compile(r'.*(' + r'|'.join(states) + ').*')
postal_addr = "1234 1st E St San Jose California 95112"
match = pattern.match(postal_addr)
if match:
    state = match.group(1)
Upvotes: 0
Reputation: 21506
You can try it like this:
In [2]: states = ["New York", "California", "Nebraska", "Idaho"]
In [3]: postal_addr = "1234 1st E St San Jose California 95112"
In [4]: ''.join(state for state in states if state in postal_addr)
Out[4]: 'California'
Upvotes: 0
Reputation: 18851
states = ["New York", "California", "Nebraska", "Idaho"]
postal_addr = "1234 1st E St San Jose California 95112"
result = None
for state in states:
    if state in postal_addr:
        result = state
print(result)
Unfortunately, this will also match words that contain a state name, such as Idahoba.
Upvotes: 0
Reputation: 845
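To find all matches in the string you could do the following (note that split() breaks multi-word names such as "New York" into separate tokens, so only single-word state names can match):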
matches = [m for m in postal_addr.split() if m in states]
Upvotes: -1
Reputation: 363486
>>> states = ["New York", "California", "Nebraska", "Idaho"]
>>> postal_addr = "1234 1st E St San Jose California 95112"
>>> first_match = next(state for state in states if state in postal_addr)
>>> first_match
'California'
However, if you need to match at word boundaries, you might be better off using a regex.
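A minimal sketch of that word-boundary approach, assuming the same states and postal_addr as above; the names pattern and match are just illustrative:

```python
import re

states = ["New York", "California", "Nebraska", "Idaho"]
postal_addr = "1234 1st E St San Jose California 95112"

# re.escape protects against regex metacharacters in a name;
# \b on both sides requires the name to start and end at a word boundary.
pattern = re.compile(r"\b(?:" + "|".join(re.escape(s) for s in states) + r")\b")

match = pattern.search(postal_addr)
first_match = match.group(0) if match else None
print(first_match)  # California
```

Unlike the plain "in" test, this pattern does not match a word such as Idahoba, because there is no word boundary right after "Idaho" in it.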
Upvotes: 1