Krithika Raghavendran

Reputation: 457

Python - match a word in a string with a list of strings

I'm new to Python and I was wondering how string comparison is done.

Let's say I have a list of strings containing state names like

states = ["New York", "California", "Nebraska", "Idaho"]

I also have another string that contains an address like

postal_addr = "1234 1st E St San Jose California 95112"

How do I parse this address string and find a match with the items in the states list? In the above example, California will be a match. How do I then, after matching, extract "California" and store it as a separate string?

Upvotes: 0

Views: 1005

Answers (6)

dermen

Reputation: 5382

I would do

matches = [ s for s in states if s in postal_addr ]

Then, if you want to get the string from the postal address:

import re
if matches:
    # re.escape guards against regex metacharacters in a state name
    extracted = re.findall(re.escape(matches[0]), postal_addr)[0]

EDIT: This won't work for city/state combos where the city name contains a different state, for example if postal_addr = '1 Arrowhead Dr, Kansas City, Missouri 64129' and states = ["New York", "California", "Nebraska", "Idaho", "Missouri", "Kansas"]. In this case, take the match that starts furthest to the right:

import re
if matches:
    # Pair each match with its start index, then keep the right-most one
    extracted = [(re.search(re.escape(m), postal_addr).start(), m) for m in matches]
    extracted = max(extracted)[1]
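As a quick check of the approach above, here is a minimal self-contained sketch run on the Kansas City address. The helper name last_state_in is hypothetical, and it assumes the state is the name that occurs furthest to the right in the address. It uses str.rfind rather than a regex, so no escaping is needed.

```python
def last_state_in(address, states):
    """Return the state name that starts latest in address, or None."""
    # Pair each state that occurs with the index of its last occurrence.
    found = [(address.rfind(s), s) for s in states if s in address]
    # max() on (position, name) tuples picks the right-most occurrence.
    return max(found)[1] if found else None

states = ["New York", "California", "Nebraska", "Idaho", "Missouri", "Kansas"]
postal_addr = "1 Arrowhead Dr, Kansas City, Missouri 64129"
print(last_state_in(postal_addr, states))  # Missouri
```

This is still plain substring matching, so it does not check word boundaries.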

Upvotes: 1

Fuu

Reputation: 3474

Here's another alternative answer using a regexp:

import re

states = ["New York", "California", "Nebraska", "Idaho"]
pattern = re.compile(r'.*(' + '|'.join(re.escape(s) for s in states) + r').*')

postal_addr = "1234 1st E St San Jose California 95112"
match = pattern.match(postal_addr)

if match:
    state = match.group(1)

Upvotes: 0

Adem Öztaş

Reputation: 21506

You can try it like this:

In [2]: states = ["New York", "California", "Nebraska", "Idaho"]

In [3]: postal_addr = "1234 1st E St San Jose California 95112"

In [4]: ''.join(state for state in states if state in postal_addr)
Out[4]: 'California'
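
One thing to watch with the join: if the address contains more than one name from the list, the names are concatenated with no separator. A small sketch of that, using the Kansas City address from another answer as an assumed input; keeping the matches in a list avoids the problem.

```python
states = ["New York", "California", "Nebraska", "Idaho", "Missouri", "Kansas"]
postal_addr = "1 Arrowhead Dr, Kansas City, Missouri 64129"

# Joining with '' runs the names together when several match.
joined = ''.join(state for state in states if state in postal_addr)
print(joined)  # MissouriKansas

# A list keeps each match separate (in the order of the states list).
matches = [state for state in states if state in postal_addr]
print(matches)  # ['Missouri', 'Kansas']
```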

Upvotes: 0

Maxime Chéramy

Reputation: 18851

states = ["New York", "California", "Nebraska", "Idaho"]
postal_addr = "1234 1st E St San Jose California 95112"

result = None
for state in states:
    if state in postal_addr:
        result = state

print(result)

Unfortunately, this will also match words that contain a state name, such as Idahoba.
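
If that matters, one possible way around it is a regex with word boundaries. This is only a sketch: the function name find_state and the second test address are made up for illustration. Names are passed through re.escape, and longer names are tried first in case one name contains another.

```python
import re

def find_state(address, states):
    """Return the left-most state that appears as a whole word, or None."""
    # Try longer names first so a longer name wins over a shorter one it contains.
    names = sorted(states, key=len, reverse=True)
    # \b requires a word boundary on both sides of the name.
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b")
    match = pattern.search(address)
    return match.group(0) if match else None

states = ["New York", "California", "Nebraska", "Idaho"]
print(find_state("1234 1st E St San Jose California 95112", states))  # California
print(find_state("12 Main St Idahoba 83201", states))  # None
```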

Upvotes: 0

Erve1879

Reputation: 845

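To find all matches in the string you could do the following (note that because the address is split on whitespace, multi-word names such as "New York" will never match):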

matches = [m for m in postal_addr.split() if m in states]

Upvotes: -1

wim

Reputation: 363486

>>> states = ["New York", "California", "Nebraska", "Idaho"]
>>> postal_addr = "1234 1st E St San Jose California 95112"
>>> first_match = next((state for state in states if state in postal_addr), None)  # None if nothing matches
>>> first_match
'California'

However, if you need to match at word boundaries, you might be better off using a regex.

Upvotes: 1
