Jeff Sword

Reputation: 37

Split 'City, State Zip' into three columns in pandas dataframe

I am trying to split a column containing City, State, and Zip into three columns. The data in the column is in this format: 'City, State Zip' - comma separating the city from state, and a space separating state from zip code. I can split out the city using:

df['Owner City State Zip'].str.split(',').apply(lambda x: x[0])

But for some reason when I try the following to split out the state and zip:

df['Owner City State Zip'].str.split(',').apply(lambda x: x[1])

I get the error - Index is out of range

Any help would be appreciated! This seems trivial but has been more difficult than I was expecting.

Upvotes: 1

Views: 2651

Answers (1)

piRSquared

Reputation: 294258

Consider the following DataFrame:

import pandas as pd

df = pd.DataFrame({'Owner City State Zip': ["Los Angeles, CA 90015"]})

print(df)

    Owner City State Zip
0  Los Angeles, CA 90015

I'd use this handy bit of regex with the pandas str accessor. Each named group becomes a column in the result:

regex = r'(?P<City>[^,]+)\s*,\s*(?P<State>[^\s]+)\s+(?P<Zip>\S+)'
df['Owner City State Zip'].str.extract(regex)

          City State    Zip
0  Los Angeles    CA  90015
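
The "Index is out of range" error in the question is most likely caused by at least one row that contains no comma: splitting such a value yields a one-element list, so x[1] does not exist. A minimal sketch of how the extract approach behaves in that case, assuming a hypothetical malformed row ("Unknown") that is not in the original data. Rows that do not match the pattern come back as NaN instead of raising, and the new columns can be joined onto the original frame:

```python
import pandas as pd

df = pd.DataFrame({
    'Owner City State Zip': [
        "Los Angeles, CA 90015",
        "Unknown",  # hypothetical malformed row with no comma (assumption)
    ]
})

regex = r'(?P<City>[^,]+)\s*,\s*(?P<State>[^\s]+)\s+(?P<Zip>\S+)'

# Named groups become the column names; non-matching rows are filled with NaN.
parts = df['Owner City State Zip'].str.extract(regex, expand=True)

# Attach the three new columns to the original frame (joined on the index).
df = df.join(parts)

# Rows the pattern could not parse, for manual inspection.
bad_rows = df[df['City'].isna()]
```

Inspecting bad_rows should show which values in the real data do not follow the 'City, State Zip' format.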

Upvotes: 6
