How to extract a certain string from a text?

Question

I have a feature "Location" from which I want to extract the country.

The feature looks like:

data['Location'].head()

0    stockton, california, usa
1    edmonton, alberta, canada
2     timmins, ontario, canada
3      ottawa, ontario, canada
4                n/a, n/a, n/a
Name: Location, dtype: object

I want:

data['Country'].head(3)

0   usa
1   canada
2   canada

I've tried:

data['Country'] = data.Location.str.extract('(+[a-zA-Z])', expand=False)
data[['Location', 'Country']].sample(10)

which returns:

error: nothing to repeat at position 1

When I use '[a-zA-Z]+' instead, it gives me the city.

Help would be appreciated. Thanks.

Imtinan Azhar · Accepted Answer

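data['Country'] = data['Location'].apply(lambda row: str(row).split(',')[-1].strip())

You can do it this way. Series.apply calls a function once for each value in the column. Here the lambda splits each location on commas, takes the last piece (the country), and strips the space left over after the comma. apply is called on the 'Location' column only, and the result is saved into the new 'Country' column.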
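As for the regex attempts: '(+[a-zA-Z])' fails with "nothing to repeat" because the + quantifier has nothing in front of it to repeat, and '[a-zA-Z]+' returns the city because str.extract keeps only the first match in each string. If you would rather avoid apply, here is a minimal sketch of two vectorized alternatives. It assumes every location ends with the country after the last comma. The sample rows are copied from the question, and the name country_regex is just an illustrative variable.

```python
import pandas as pd

# Sample rows copied from the question; the real DataFrame is assumed to look like this.
data = pd.DataFrame({
    "Location": [
        "stockton, california, usa",
        "edmonton, alberta, canada",
        "timmins, ontario, canada",
        "ottawa, ontario, canada",
        "n/a, n/a, n/a",
    ]
})

# Option 1: split on commas, keep the last piece, strip surrounding whitespace.
data["Country"] = data["Location"].str.split(",").str[-1].str.strip()

# Option 2: a regex anchored at the end of the string.
# ([^,]+)$ captures the run of non-comma characters that ends the string.
country_regex = data["Location"].str.extract(r"([^,]+)$", expand=False).str.strip()

print(data[["Location", "Country"]])
```

Unlike the str(row) version, the .str methods leave missing values as NaN instead of turning them into the text 'nan'. Rows such as "n/a, n/a, n/a" still produce 'n/a', so you may want to replace those afterwards.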
