Cleanup with 'replace' not working as intended - Pandas

Question

I tried to apply the 'replace' method to this string in a DataFrame column:

df=pd.DataFrame({'DC': ['DOUBLE CHANCE\n1 OR X\n2.50\nX OR 2\n1.12\n1 OR 2\n1.20']})

Desired result: 2.50 1.12 1.20


I'm looking for suggestions to make the cleanup work, using either the 'replace' method or a regex.

import pandas as pd

df=pd.DataFrame({'DC': ['DOUBLE CHANCE\n1 OR X\n2.50\nX OR 2\n1.12\n1 OR 2\n1.20']})

df = df['DC']['Double_Chance'].str.replace(r'([^\d\.\n])','').str.replace(r'1\n','').str.replace(r'2\n','').str.replace(r'12\n','').str.strip()
df
0       2.50
1.111.20
Name: Score, dtype: object

Wiktor Stribiżew · Accepted Answer

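You can use Series.str.replace (pass regex=True; since pandas 2.0 the pattern is treated as a literal string by default):

df['Double_Chance'] = df['DC'].str.replace(r'(?m)^(?!\d+\.\d+$).*\n*', '', regex=True)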

Or, you may use Series.str.findall:

df['Double_Chance'] = df['DC'].str.findall(r'(?m)^\d+\.\d+$').str.join("\n")

Both produce 2.50, 1.12 and 1.20, each on its own line.

Details of the regex used with str.replace:

(?m) - the re.M option that makes ^ match the start of each line and $ match the end of each line
^ - start of a line
(?!\d+\.\d+$) - fail the match if the line is a float number
.* - zero or more chars other than line break chars, as many as possible
\n* - zero or more line feed chars.
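
To see what each pattern does to the individual lines, here is a minimal sketch that applies both of them with Python's standard re module rather than through pandas. It assumes the cell holds newline-separated text as in the question; the names text, drop_non_floats, keep_floats, cleaned, odds and joined are illustrative only and do not come from the answer.

```python
import re

# Same newline-separated text as the 'DC' cell in the question.
text = "DOUBLE CHANCE\n1 OR X\n2.50\nX OR 2\n1.12\n1 OR 2\n1.20"

# Approach 1: delete every line that is NOT a bare float,
# together with the line feed(s) that follow it.
drop_non_floats = re.compile(r"(?m)^(?!\d+\.\d+$).*\n*")
cleaned = drop_non_floats.sub("", text)

# Approach 2: collect only the lines that ARE bare floats,
# then join them back with line feeds.
keep_floats = re.compile(r"(?m)^\d+\.\d+$")
odds = keep_floats.findall(text)
joined = "\n".join(odds)

print(repr(cleaned))  # '2.50\n1.12\n1.20'
print(odds)           # ['2.50', '1.12', '1.20']
```

Lines such as "1 OR X" start with a digit, but the lookahead still lets them be removed because no dot and fractional part follow before the end of the line.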
