Reputation: 2333
I have a dataframe
AID  Name    Country
1    XX-USA  FL, USA
2    YY      USA-USA
3    YY-USA  UK
I want to replace only the values in Country
that contain the string 'USA' with 'USA' alone.
So the result would be
AID  Name    Country
1    XX-USA  USA
2    YY      USA
3    YY-USA  UK
I tried df.replace
but couldn't figure out how to do it.
Upvotes: 1
Views: 58
Reputation: 210982
UPDATE: I don't know what I'm doing "wrong", but it works just fine for me:
In [83]: fn = r'D:\download\countries.csv'
In [84]: Alldf = pd.read_csv(fn, sep=',')
In [85]: pd.options.display.max_rows = 10
In [86]: Alldf
Out[86]:
aid name country
0 79533B41 john sarracino USA
1 7E0706FD ben wiedermann USA
2 7B33445B rishit sheth USA
3 7F4CE233 yijun zhao USA
4 262087DD roni khardon USA
... ... ... ...
13387 7C62148F marcel kutsch USA
13388 7F42F95A john z zhang Canada
13389 7DAF3AED chris sanden Canada
13390 00375817 michal laclavik Slovakia
13391 13D9B371 marek ciglan Slovakia
[13392 rows x 3 columns]
In [87]: Alldf.loc[Alldf.country.str.contains('USA'), 'country'] = 'USA'
In [88]: Alldf.country.value_counts()
Out[88]:
USA 6150
China 1426
Singapore 517
UK 448
Germany 398
...
New York Unversity 1
Berkeley and QUT 1
CA 94022 1
WA 98109 1
and Google research 1
Name: country, Length: 377, dtype: int64
In [89]: Alldf.loc[Alldf.country.str.contains('USA'), 'country'].unique()
Out[89]: array(['USA'], dtype=object)
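If the same line fails on your data, one possible cause (a guess, since I can't see your file) is missing values in the column: str.contains then returns NaN for those rows and the mask is no longer purely boolean. A minimal sketch of that case, using a made-up frame rather than your CSV, with na=False so missing values count as "no match":

```python
import pandas as pd

# Hypothetical sample data (not the real CSV); None stands in for a missing country.
df = pd.DataFrame({
    "aid": ["A1", "A2", "A3", "A4"],
    "country": ["FL, USA", None, "USA-USA", "UK"],
})

# na=False makes missing values evaluate to False, so the mask is strictly boolean.
mask = df["country"].str.contains("USA", na=False)

# Overwrite only the matching rows; everything else is left as it was.
df.loc[mask, "country"] = "USA"

print(df)
```

The missing value stays missing and 'UK' is untouched; only the rows whose country contains 'USA' are rewritten.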
Old answer:
In [74]: d
Out[74]:
AID Name Country
0 1 XX-USA FL, USA
1 2 YY USA-USA
2 3 YY-USA UK
In [75]: d.loc[d.Country.str.contains('USA'), 'Country'] = 'USA'
In [76]: d
Out[76]:
AID Name Country
0 1 XX-USA USA
1 2 YY USA
2 3 YY-USA UK
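Since you mentioned trying df.replace: that route should also work if you restrict it to the Country column and use a regex that matches the whole value. This is only a sketch on the sample frame from the question, rebuilt by hand here:

```python
import pandas as pd

# Sample frame from the question, rebuilt by hand.
d = pd.DataFrame({
    "AID": [1, 2, 3],
    "Name": ["XX-USA", "YY", "YY-USA"],
    "Country": ["FL, USA", "USA-USA", "UK"],
})

# The pattern matches the entire string when it contains 'USA',
# so the whole value is replaced, not just the matched substring.
# Only the Country column is touched; Name keeps its 'XX-USA' values.
d["Country"] = d["Country"].replace(r"^.*USA.*$", "USA", regex=True)

print(d)
```

Calling replace on the whole frame instead of on d["Country"] would also rewrite the Name column, which is probably not what you want here.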
Upvotes: 2