PV8

Reputation: 6270

Pandas: extract a number with a decimal separator after $ from a string

Although there are several similar questions, I am still not able to solve my issue.

I have a pandas column from a poker game and want to analyze the pot size, so I need to extract the number (with a . as the decimal separator) that follows a $. The column looks like this:

Action
Player (8, 5) won the $5.40 main pot with a Straight
...
Player (A, 2) won the $21.00 main pot with a flush
...

When I run df['number'] = df['action'].str.extract('([0-9][,.]*[0-9]*)'), it doesn't give me the expected outcome. The outcome should be:

number
5.40
...
21.00

Upvotes: 1

Views: 39

Answers (1)

Wiktor Stribiżew

Reputation: 627020

You can use

>>> import pandas as pd
>>> df = pd.DataFrame({'action':['Player (8, 5) won the $5.40 main pot with a Straight','Player (A, 2) won the $21.00 main pot with a flush']})
>>> df['action'].str.extract(r'\$(\d+(?:[,.]\d+)*)', expand=False)
0     5.40
1    21.00
Name: action, dtype: object

The \$(\d+(?:[,.]\d+)*) pattern matches a literal $ symbol, and then captures into Group 1 any one or more digits and then zero or more sequences of a , or . and then one or more digits.
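If the goal is to analyze pot sizes numerically, the extracted strings still need converting to floats. Here is a minimal sketch of one way to do that. The third sample row (a hand with no pot) is invented for illustration, and treating a comma as a thousands separator is an assumption that would need adjusting if the data uses decimal commas.

```python
import pandas as pd

df = pd.DataFrame({
    "action": [
        "Player (8, 5) won the $5.40 main pot with a Straight",
        "Player (A, 2) won the $21.00 main pot with a flush",
        "Player (K, 9) folded",  # hypothetical row without a $ amount
    ]
})

# Literal $, then capture digits optionally followed by groups of , or . plus digits
pattern = r"\$(\d+(?:[,.]\d+)*)"

# expand=False returns a Series; rows without a match become NaN
extracted = df["action"].str.extract(pattern, expand=False)

# Assumption: commas are thousands separators, so drop them before converting
df["number"] = pd.to_numeric(extracted.str.replace(",", "", regex=False))

print(df["number"])
```

Rows without a $ amount stay as NaN, so aggregations such as df["number"].mean() skip them by default.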

See the regex demo.

Upvotes: 2
