brygid

Reputation: 85

pandas: strip all strings in a column that start with the symbols + or -

    value      result
1    fish      +d de hzd +po
2    duck      +g lo kju +okz -kji
3    mouse     +z gh +k
4    rabbit    +bhh uj +okk
5    donky     -de thb +thy

The expected data should look like:

    value      result
1    fish      de hzd
2    duck      lo kju
3    mouse     gh
4    rabbit    uj
5    donky     thb

I tried

df["result"] = [x.lstrip("+- + aAbBcC") for x in df["result"]]

but it only strips the first such string in each row:

    value      result
1    fish      de hzd +po
2    duck      lo kju +okz -kji
3    mouse     gh +k
4    rabbit    uj +okk
5    donky     thb +thy

Is there a way to fix this and get the expected data? Thank you.

Upvotes: 1

Views: 710

Answers (3)

hpal007

Reputation: 313

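Split each string on whitespace, keep only the purely alphabetic tokens, and join them back. Note that .split() has to be called on each string, not on the column itself. This should help:

df["result"] = df["result"].apply(lambda s: " ".join(el for el in s.split() if el.isalpha()))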
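The token-filtering idea can be sketched on plain strings, one row at a time. This is only an illustration: the helper name `drop_signed_tokens` is hypothetical, and it swaps `isalpha()` for a `startswith` check, on the assumption that tokens containing digits or punctuation should be kept unless they begin with `+` or `-`.

```python
def drop_signed_tokens(text: str) -> str:
    """Keep only whitespace-separated tokens that do not start with '+' or '-'."""
    kept = [tok for tok in text.split() if not tok.startswith(("+", "-"))]
    return " ".join(kept)


# Sample rows copied from the question's "result" column.
rows = [
    "+d de hzd +po",
    "+g lo kju +okz -kji",
    "+z gh +k",
    "+bhh uj +okk",
    "-de thb +thy",
]

cleaned = [drop_signed_tokens(r) for r in rows]
print(cleaned)  # ['de hzd', 'lo kju', 'gh', 'uj', 'thb']
```

With a DataFrame, the same helper could be applied per row with df["result"] = df["result"].map(drop_signed_tokens).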

Upvotes: 1

Tim Biegeleisen

Reputation: 521997

We can try using str.replace along with str.strip:

df["result"] = df["result"].str.replace(r'\s*[+-]\S+\s*', '', regex=True).str.strip()
print(df)

This prints:

    value  result
0    fish  de hzd
1    duck  lo kju
2   mouse      gh
3  rabbit      uj
4   donky     thb
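One caveat, shown here with the standard `re` module rather than pandas: because the pattern consumes whitespace on both sides of the match, a signed token sitting between two kept words would glue them together. That case does not occur in the question's data. The sketch below compares the pattern with a variant that removes only the token and then normalizes whitespace; the helper names are hypothetical.

```python
import re

GREEDY = re.compile(r"\s*[+-]\S+\s*")  # pattern from the answer above
TOKEN = re.compile(r"[+-]\S+")         # a sign plus the non-space run after it


def remove_greedy(text: str) -> str:
    """Remove signed tokens together with the whitespace around them."""
    return GREEDY.sub("", text).strip()


def remove_then_normalize(text: str) -> str:
    """Remove signed tokens, then collapse the leftover whitespace."""
    return " ".join(TOKEN.sub("", text).split())


sample = "de +po hzd"  # signed token between two words (not in the question's data)
print(remove_greedy(sample))          # dehzd
print(remove_then_normalize(sample))  # de hzd
```

On the rows from the question both helpers give the same result, since every signed token there sits at the start or the end of the string.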

Upvotes: 2

Алексей Р

Reputation: 7627

df['result'] = df['result'].replace(to_replace=r'[+-]\w+', value='', regex=True)
df['result'] = df['result'].str.strip()
print(df)

Prints:

    value  result
0    fish  de hzd
1    duck  lo kju
2   mouse      gh
3  rabbit      uj
4   donky     thb
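A small comparison of `\w+` and `\S+` after the sign, again using plain `re` on a made-up string. The sample text is an assumption for illustration and does not come from the question: `\w+` stops at the first non-word character, so a token such as `+a.b` is removed only in part.

```python
import re

text = "+a.b gh +k"  # hypothetical input with punctuation inside a signed token

# \w+ matches letters, digits and underscore only, so ".b" is left behind.
word_based = re.sub(r"[+-]\w+", "", text).strip()

# \S+ matches everything up to the next whitespace, so the whole token goes.
nonspace_based = re.sub(r"[+-]\S+", "", text).strip()

print(word_based)      # .b gh
print(nonspace_based)  # gh
```

For purely alphabetic tokens like those in the question, the two patterns behave the same.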

Upvotes: 3
