Reputation: 5184
I have the following dataframe called df:
  Symbol Country Type etc...
0   AG.L      UK   OS
1    UZ.      UK   OS
2     DT      UK   OS
3   XX.L      US   OS
4   MSFT      US   OS
5   AAPL      US   OS
6   DB.S      SG   OS
I want to normalize the Symbol column. Where Country == 'UK', there are three cases:
Case 1: the symbol ends with '.L': do nothing.
Case 2: the symbol ends with '.': append 'L'.
Case 3: the symbol ends with neither '.' nor '.L': append '.L'.
In other words, whenever Country == 'UK', the symbol should end with '.L'.
The result should look like this:
  Symbol Country Type etc...
0   AG.L      UK   OS
1   UZ.L      UK   OS
2   DT.L      UK   OS
3   XX.L      US   OS
4   MSFT      US   OS
5   AAPL      US   OS
6   DB.S      SG   OS
I am using the following code:
df.loc[df['Country'].eq('UK'),'Symbol'] = df.loc[df['Country'].eq('UK'),'Symbol'].str.replace(r'\.', '.L').str.replace(r'[a-z]$', '.L')
But I get this for the UK rows:
AG.LL
UZ.L
DT
What's the right way to do it?
Upvotes: 3
Views: 69
Reputation: 2663
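You almost got it right. Two things went wrong. First, the dot pattern is missing the dollar sign, so r'\.' matches the dot in the middle of AG.L and turns it into AG.LL. Second, [a-z]$ only matches a lowercase letter, so DT is never touched (and even if it matched, it would replace the last letter instead of appending to it). Anchor the dot to the end of the string, and match symbols made up only of uppercase letters with a capture group so the suffix is appended. On pandas 2.0 and later, also pass regex=True, because str.replace treats the pattern as a literal string by default there. So try:
mask = df['Country'].eq('UK')
df.loc[mask, 'Symbol'] = df.loc[mask, 'Symbol'].str.replace(r'^([A-Z]+)$', r'\1.L', regex=True).str.replace(r'\.$', '.L', regex=True)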
In my Python shell it outputs:
0 AG.L
1 UZ.L
2 DT.L
Name: Symbol, dtype: object
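As a cross-check on the three cases, here is a minimal sketch that folds both replacements into one regular expression using only Python's re module. The helper name to_london_symbol and the rows list are made up for illustration and are not part of the original question; the sample values are the ones from the question's frame.

```python
import re

# Hypothetical helper, not from the thread: make a ticker end in '.L'.
#   ^(.*?)       lazily captures the stem of the symbol
#   (?:\.L?)?$   optionally consumes a trailing '.' or '.L'
LONDON_SUFFIX = re.compile(r'^(.*?)(?:\.L?)?$')


def to_london_symbol(symbol: str) -> str:
    """Return symbol with exactly one '.L' suffix appended or kept."""
    return LONDON_SUFFIX.sub(r'\1.L', symbol)


# Sample (Symbol, Country) pairs taken from the question.
rows = [
    ("AG.L", "UK"), ("UZ.", "UK"), ("DT", "UK"),
    ("XX.L", "US"), ("MSFT", "US"), ("AAPL", "US"), ("DB.S", "SG"),
]

# Only UK rows are rewritten; every other row is left as it is.
fixed = [(to_london_symbol(s) if c == "UK" else s, c) for s, c in rows]

for symbol, country in fixed:
    print(symbol, country)
```

On the dataframe itself, the same helper could be applied to the UK rows with df.loc[mask, 'Symbol'].map(to_london_symbol), assuming mask is df['Country'].eq('UK'). Unlike the [A-Z]+ pattern, this version does not assume the symbol consists only of uppercase letters.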
Upvotes: 3