user10655585

Remove superfluous punctuation from strings in pandas

Hi, I have a DataFrame like the one below:

df1:-

rade   volume    packitt 
wear   28        cult,,daok
kwat   45        vaner ,boera
itre   17        eaker, ewlvwe, The wrerin
reww   87     
hakw   57        ,rabe,,boera
kryh   45        vaner ,boera,vanya,

Now I want to remove the extra commas (leading, trailing, and repeated ones) along with the spaces around them.

Expected output:

rade   volume    packitt 
wear   28        cult,daok
kwat   45        vaner,boera
itre   17        eaker,ewlvwe,The wrerin
reww   87 
hakw   57        rabe,boera
kryh   45        vaner,boera,vanya

Upvotes: 1

Views: 67

Answers (1)

cs95

Reputation: 402922

This was likely caused by improper column-wise aggregation of strings (did you mean to do something like df.agg(lambda x: ','.join(x.dropna()), axis=1)?).

However, for reference, you can remove the extra commas with a non-regex solution based on str.split and str.join. Note that this leaves any spaces around the commas untouched:

df['packitt'] = [
    ','.join(filter(None, x.split(','))) if pd.notna(x) else x 
    for x in df['packitt']
]

df
   rade  volume                    packitt
0  wear      28                  cult,daok
1  kwat      45               vaner ,boera
2  itre      17  eaker, ewlvwe, The wrerin
3  reww      87                       None
4  hakw      57                 rabe,boera
5  kryh      45         vaner ,boera,vanya
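
If the spaces around the commas should go as well, as in the expected output from the question, the same split/join idea can be extended by stripping each piece. The following is only a minimal sketch: the helper name clean_commas and the samples list are hypothetical, introduced here for illustration.

```python
def clean_commas(value):
    """Collapse repeated commas and trim spaces around each item."""
    # Leave missing values (None, NaN) untouched.
    if not isinstance(value, str):
        return value
    # Split on commas, strip whitespace from each piece, drop empty pieces.
    parts = [part.strip() for part in value.split(',')]
    return ','.join(part for part in parts if part)


# Hypothetical sample values mirroring the 'packitt' column in the question.
samples = [
    'cult,,daok',
    'vaner ,boera',
    'eaker, ewlvwe, The wrerin',
    None,
    ',rabe,,boera',
    'vaner ,boera,vanya,',
]
cleaned = [clean_commas(s) for s in samples]
```

On the DataFrame itself, this could presumably be applied with df['packitt'] = df['packitt'].map(clean_commas).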

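Or, using the pandas string methods str.replace (with a regex) and str.strip. Pass regex=True explicitly, because since pandas 2.0 str.replace treats the pattern as a literal string by default:

df['packitt'] = df.packitt.str.replace(r'(\s*,\s*)+', ',', regex=True).str.strip(',')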

df
   rade  volume                  packitt
0  wear      28                cult,daok
1  kwat      45              vaner,boera
2  itre      17  eaker,ewlvwe,The wrerin
3  reww      87                     None
4  hakw      57               rabe,boera
5  kryh      45        vaner,boera,vanya

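Here, r'(\s*,\s*)+' matches a run of one or more commas, each optionally surrounded by whitespace, and replaces the whole run with a single comma. str.strip(',') then removes any leading or trailing comma.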
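
For a self-contained check of the regex approach, here is a sketch that rebuilds the question's data. It assumes the empty cell in row 3 is a missing value (None), as the output above suggests; the variable name result is introduced only for illustration.

```python
import pandas as pd

# Rebuild the DataFrame from the question; the empty cell is assumed to be None.
df = pd.DataFrame({
    'rade': ['wear', 'kwat', 'itre', 'reww', 'hakw', 'kryh'],
    'volume': [28, 45, 17, 87, 57, 45],
    'packitt': [
        'cult,,daok',
        'vaner ,boera',
        'eaker, ewlvwe, The wrerin',
        None,
        ',rabe,,boera',
        'vaner ,boera,vanya,',
    ],
})

# Collapse each run of commas (plus surrounding whitespace) into one comma,
# then drop any comma left at the start or end of the string.
df['packitt'] = (
    df['packitt']
    .str.replace(r'(\s*,\s*)+', ',', regex=True)
    .str.strip(',')
)

result = df['packitt'].tolist()
```

The missing value passes through as a missing value (None or NaN, depending on the pandas version and column dtype), so no separate null check is needed with the .str accessor.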

Upvotes: 2
