user185864

Reputation: 21

Extract part from an address in pandas dataframe column

I am working through a pandas tutorial that analyzes sales data (https://www.youtube.com/watch?v=eMOA1pPVUc4&list=PLFCB5Dp81iNVmuoGIqcT5oF4K-7kTI5vp&index=6). The data is already in a dataframe, and one column, "Purchase Address", contains the street, city, and state/zip code. The format looks like this:

Purchase Address
917 1st St, Dallas, TX 75001
682 Chestnut St, Boston, MA 02215
...

My idea was to split each address string into a list and then drop the irrelevant list elements. I used the command:

all_data['Splitted Address'] = all_data['Purchase Address'].str.split(',')

That worked and turned each address into a list of the form

['917 1st St', ' Dallas', ' TX 75001']

The whole 'Splitted Address' column now looks like this, and I am stuck. I simply want to drop list elements 0 and 2 and keep element 1, i.e. the city, in another column.

In the tutorial, the solution was laid out using the .apply() method:

all_data['Column'] = all_data['Purchase Address'].apply(lambda x: x.split(',')[1])

This solution definitely looks more elegant than mine so far, but I wondered whether I could get there with my own approach and a comparable amount of effort.

Thanks in advance.

Upvotes: 2

Views: 1674

Answers (1)

jezrael

Reputation: 863166

Use Series.str.split and then select the list element by indexing with .str:

all_data['Column'] = all_data['Purchase Address'].str.split(',').str[1]
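As a minimal sketch of how this behaves, here is a small runnable example. The two-row dataframe is built from the sample rows in the question, and the column names 'City' and 'City from list' are only illustrative. Because splitting on ',' leaves a leading space in front of the city, the sketch also chains .str.strip(), which is an addition on top of the one-liner above. The second assignment shows that the same indexing works when starting from the already split 'Splitted Address' column.

```python
import pandas as pd

# Sample rows taken from the question; the real all_data is assumed to be larger.
all_data = pd.DataFrame({
    "Purchase Address": [
        "917 1st St, Dallas, TX 75001",
        "682 Chestnut St, Boston, MA 02215",
    ]
})

# Split on commas, pick element 1 of each resulting list, and remove
# the leading space left over from ", ".
all_data["City"] = (
    all_data["Purchase Address"].str.split(",").str[1].str.strip()
)

# Starting from the list column created in the question works the same way:
# .str[1] indexes into each list in the column.
all_data["Splitted Address"] = all_data["Purchase Address"].str.split(",")
all_data["City from list"] = all_data["Splitted Address"].str[1].str.strip()

print(all_data["City"].tolist())  # ['Dallas', 'Boston']
```

If the lists are no longer needed afterwards, the 'Splitted Address' column can be removed with all_data.drop(columns='Splitted Address').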

Upvotes: 1
