Mariam Imran

Reputation: 13

Python: How to extract numbers and certain uppercase letters after a keyword

I'm trying to extract the number after the word 'Amount', and the currency code that follows it, into two separate columns using Python. Any help would be appreciated. Example input:

Successful refund. IBE payment ID 79104467 | Transaction-ref: 73462794 | Amount: 50.00 EUR

Successful refund by Hyperwallet. Transaction-ref: 48886217 | Amount: 214.64 USD | Hyperwallet payout id: 581082-2

Upvotes: 1

Views: 38

Answers (3)

Andrej Kesely

Reputation: 195543

To construct a DataFrame from the given string, try:

import re
import pandas as pd

s = """\
Successful refund. IBE payment ID 79104467 | Transaction-ref: 73462794 | Amount: 50.00 EUR
Successful refund by Hyperwallet. Transaction-ref: 48886217 | Amount: 214.64 USD | Hyperwallet payout id: 581082-2"""

df = pd.DataFrame(
    re.findall(r"Amount:\s*([\d.]+)\s*([^\s]+)", s),
    columns=["Amount", "Currency"],
)
print(df)

Prints:

   Amount Currency
0   50.00      EUR
1  214.64      USD
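
If the messages already sit in a DataFrame column rather than in one string, a similar pattern can probably be applied with Series.str.extract. This is only a minimal sketch: the column name "description" is hypothetical, and it assumes the currency is always a three-letter uppercase code.

```python
import pandas as pd

# Hypothetical column name "description"; the rows are the two samples from the question.
df = pd.DataFrame({
    "description": [
        "Successful refund. IBE payment ID 79104467 | Transaction-ref: 73462794 | Amount: 50.00 EUR",
        "Successful refund by Hyperwallet. Transaction-ref: 48886217 | Amount: 214.64 USD | Hyperwallet payout id: 581082-2",
    ]
})

# Named groups become the column names of the extracted frame.
pattern = r"Amount:\s*(?P<Amount>\d+(?:\.\d+)?)\s*(?P<Currency>[A-Z]{3})"
extracted = df["description"].str.extract(pattern)

# str.extract returns strings, so convert the amount to a numeric dtype.
extracted["Amount"] = pd.to_numeric(extracted["Amount"])

# Attach the two new columns to the original frame.
df = df.join(extracted)
print(df[["Amount", "Currency"]])
```

Rows where the pattern does not match get NaN in both new columns, which may be preferable to silently dropping them.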

Upvotes: 0

chrslg

Reputation: 13381

I would use a regex for that:

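import re

def listAmounts(s):
    # Raw string avoids invalid-escape warnings; the non-capturing group
    # makes findall return each whole match as a plain string.
    return re.findall(r'\d+(?:\.\d+)?\s[A-Z]+', s)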

(This returns every substring made of some digits, an optional dot followed by more digits, a space, and some uppercase letters. You can of course use a variant that allows more spaces, or no space, before the currency, fixes the number of digits after the dot, allows a sign, and so on.)
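
One possible variant along those lines, as a rough sketch rather than a definitive pattern: it anchors on the 'Amount:' keyword from the question so that other number-plus-uppercase sequences in the text are ignored, allows an optional sign and flexible spacing, and assumes a three-letter currency code. The names AMOUNT_RE and extract_amounts are made up for illustration.

```python
import re

# Assumed variant: anchor on "Amount:", allow an optional sign and
# flexible whitespace, and require a three-letter uppercase code.
AMOUNT_RE = re.compile(r"Amount:\s*([+-]?\d+(?:\.\d+)?)\s*([A-Z]{3})\b")

def extract_amounts(text):
    """Return a list of (amount, currency) tuples found after 'Amount:'."""
    return [(float(num), cur) for num, cur in AMOUNT_RE.findall(text)]

line = ("Successful refund by Hyperwallet. Transaction-ref: 48886217 | "
        "Amount: 214.64 USD | Hyperwallet payout id: 581082-2")
print(extract_amounts(line))
```

Returning separate tuple elements, instead of one combined string, keeps the number and the currency ready to go into two columns.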

Upvotes: 1

Robert Kadak

Reputation: 69

Not the best solution, but it should work:

to_filter = 'Successful refund. IBE payment ID 79104467 | Transaction-ref: 73462794 | Amount: 50.00 EUR'
tokens = to_filter.split(' ')
# Locate the keyword once; the amount and currency are the next two tokens.
i = tokens.index('Amount:')
amount = [float(tokens[i + 1]), tokens[i + 2]]
print(amount)

Upvotes: 0
