NKG

Reputation: 69

Remove leading zeroes pandas

For example, I have a data frame like this:

import pandas as pd
nums = {'amount': ['0324','S123','0010', None, '0030', 'SA40', 'SA24']}
df = pd.DataFrame(nums)

I need to remove all leading zeroes and replace the None values with zeros.

I did it with loops, but for large frames that is not fast enough. I'd like to rewrite it using vectorized operations.

Upvotes: 1

Views: 8978

Answers (5)

Karn Kumar

Reputation: 8816

There is already a nice answer from @Epsi95, but you can also try a regex with a character set:

>>> df['amount'].str.replace(r'^[0]*', '', regex=True).fillna('0')
0     324
1    S123
2      10
3       0
4      30
5    SA40
6    SA24

Explanation:

^[0]*

^ asserts the position at the start of the string
[0] matches a single character from the set in brackets, here only 0
* matches the previous token zero or more times, as many times as possible, giving back as needed (greedy)
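
To see the pattern in isolation, here is a minimal sketch using Python's built-in re module on plain strings instead of a pandas column. The names LEADING_ZEROS, samples and stripped are only for illustration:

```python
import re

# Same pattern as above: zero or more '0' characters at the start of the string.
LEADING_ZEROS = re.compile(r'^[0]*')

samples = ['0324', 'S123', '0010', '0030', 'SA40']

# Strings that do not start with '0' produce an empty match and are left unchanged.
stripped = [LEADING_ZEROS.sub('', s) for s in samples]
print(stripped)  # ['324', 'S123', '10', '30', 'SA40']
```

Because * also allows zero repetitions, the pattern matches every string; for values without a leading 0 the replacement is simply a no-op.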

Upvotes: 3

SeaBean

Reputation: 23217

If you need to make sure that a single 0, or the last 0 of an all-zero string, is not removed, you can use:

df['amount'] = df['amount'].str.replace(r'^(0+)(?!$)', '', regex=True).fillna('0')

The negative lookahead (?!$) ensures that the matched substring (the leading zeroes) never extends to the end of the string, so the last 0 of an all-zero value is kept.

Demo

Input Data

nums = {'amount': ['0324','S123','0010', None, '0030', 'SA40', 'SA24', '0', '000']}
df = pd.DataFrame(nums)

  amount
0   0324
1   S123
2   0010
3   None
4   0030
5   SA40
6   SA24
7      0           <==   Added a single 0 here
8    000           <==   Added a sequence of all 0's here

Output

print(df)

  amount
0    324
1   S123
2     10
3      0
4     30
5   SA40
6   SA24
7      0           <==  Single 0 is not removed  
8      0           <==  Last 0 is kept
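
To illustrate why the lookahead matters, here is a minimal sketch on plain Python strings that compares stripping every leading zero with the guarded regex. The helper names strip_all and strip_keep_last are hypothetical and only used for this comparison:

```python
import re

def strip_all(s):
    """Strip every leading '0', as str.lstrip('0') does."""
    return s.lstrip('0')

def strip_keep_last(s):
    """Strip leading zeroes, but never the final character of the string."""
    return re.sub(r'^(0+)(?!$)', '', s)

samples = ['0324', '0', '000', 'S123']

plain = [strip_all(s) for s in samples]
guarded = [strip_keep_last(s) for s in samples]

print(plain)    # ['324', '', '', 'S123']
print(guarded)  # ['324', '0', '0', 'S123']
```

Without the lookahead, all-zero values collapse to an empty string rather than to '0'.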

Upvotes: 0

Thomas Vivier

Reputation: 60

Step by step:

Remove all leading zeros:

Use str.lstrip which returns a copy of the string with leading characters removed (based on the string argument passed).

Here,

df['amount'] = df['amount'].str.lstrip('0')

For more, see https://www.programiz.com/python-programming/methods/string/lstrip

Replace None with zeros:

Use fillna, which also fills other missing values such as NaN, not only None. It returns a new Series, so assign the result back.

Here,

df['amount'] = df['amount'].fillna(value='0')

For more, see https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.fillna.html

Result in one line:

df['amount'] = df['amount'].str.lstrip('0').fillna(value='0')
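
Put together with the data from the question, the two steps could look like the sketch below. The variable result is only there to make the outcome easy to inspect:

```python
import pandas as pd

# Sample data from the question, including one missing value.
nums = {'amount': ['0324', 'S123', '0010', None, '0030', 'SA40', 'SA24']}
df = pd.DataFrame(nums)

# Step 1: strip leading zeroes (the missing value stays missing).
# Step 2: replace the missing value with the string '0'.
df['amount'] = df['amount'].str.lstrip('0').fillna(value='0')

result = df['amount'].tolist()
print(result)  # ['324', 'S123', '10', '0', '30', 'SA40', 'SA24']
```

Note that lstrip('0') turns a value consisting only of zeroes, such as '000', into an empty string, which fillna does not touch.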

Upvotes: 0

Haha

Reputation: 1009

df['amount'] = df['amount'].str.lstrip('0').fillna(value='0')

Upvotes: 2

Epsi95

Reputation: 9047

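You can try str.replace. Pass regex=True explicitly, because pandas 2.0 and later treat the pattern as a literal string by default:

df['amount'].str.replace(r'^(0+)', '', regex=True).fillna('0')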
0     324
1    S123
2      10
3       0
4      30
5    SA40
6    SA24
Name: amount, dtype: object

Upvotes: 4
