Reputation: 179
My pandas dataframe looks as follows:
| col1 | col2 |
|---|---|
| 1 | ABC8392akl |
| 2 | 001523 |
| 3 | 000ABC58 |
Now I want to remove the leading zeros, but only if the string is entirely numeric. Any suggestions? The outcome should be:
| col1 | col2 |
|---|---|
| 1 | ABC8392akl |
| 2 | 1523 |
| 3 | 000ABC58 |
Upvotes: 2
Views: 1356
Reputation: 78650
Use:
where = (df['col2'].str.isdigit(), 'col2')
df.loc[where] = df.loc[where].str.lstrip('0')
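For anyone who wants to try this out, here is a minimal self-contained sketch of the same boolean-mask idea. The sample frame is rebuilt from the question, and the names is_numeric and result are only illustrative. It assumes col2 holds plain strings with no missing values; if NaN can occur, the mask would probably need something like .fillna(False) first.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "col1": [1, 2, 3],
    "col2": ["ABC8392akl", "001523", "000ABC58"],
})

# Boolean mask: True where the whole string consists of digits.
is_numeric = df["col2"].str.isdigit()

# Strip leading zeros on the masked rows only; other rows stay untouched.
df.loc[is_numeric, "col2"] = df.loc[is_numeric, "col2"].str.lstrip("0")

result = df["col2"].tolist()
print(result)  # ['ABC8392akl', '1523', '000ABC58']
```

Note that a value made up only of zeros (for example '000') would end up as an empty string with this approach.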
Upvotes: 2
Reputation: 332
Is there a problem with this, or do you want to keep the numbers as strings?
df['col2'] = pd.to_numeric(df['col2'], errors='ignore')
Upvotes: 1
Reputation: 260410
You can use a regex with str.replace for this:
df['col2'] = df['col2'].str.replace(r'^0+(?!.*\D)', '', regex=True)
output:
   col1        col2
0     1  ABC8392akl
1     2        1523
2     3    000ABC58
regex:
^0+ # match leading zeros
(?!.*\D) # only if no non-digit character follows anywhere in the string
An alternative suggested by @timgeb:
df['col2'] = df['col2'].str.replace(r'^0+(\d*)$', r'\1', regex=True)
regex:
^0+ # match leading zeros
(\d*) # capture other digits (if any)
$ # match end of string
replace with the captured digits (\1)
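To check how the two patterns behave outside of pandas, here is a small sketch using Python's re module directly. It assumes the default object-dtype strings, where str.replace with regex=True hands the pattern to re. The sample list and the variable names are only illustrative, and '000' is added as an extra edge case that is not in the question.

```python
import re

# Values from the question plus an all-zero edge case.
samples = ["ABC8392akl", "001523", "000ABC58", "000"]

# Variant 1: leading zeros, provided no non-digit follows anywhere.
lookahead = re.compile(r"^0+(?!.*\D)")
# Variant 2: leading zeros, then only digits up to the end of the string.
capture = re.compile(r"^0+(\d*)$")

via_lookahead = [lookahead.sub("", s) for s in samples]
via_capture = [capture.sub(r"\1", s) for s in samples]

print(via_lookahead)  # ['ABC8392akl', '1523', '000ABC58', '']
print(via_capture)    # ['ABC8392akl', '1523', '000ABC58', '']
```

Both variants give the same result on these inputs. An all-zero string collapses to an empty string, so if a single '0' should survive, the pattern would need adjusting.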
Upvotes: 5