regex count occurrences

Question

I am looking for a way to count the occurrences found in a string based on my regex. I used findall() and it returns a list, but the len() of the list is only 1. Shouldn't the len() of the list be 2?

import re

string1 = r'Total $200.00 Total $900.00'
regex = r'(.*Total.*|.*Invoice.*|.*Amount.*)?(\s+?\$\s?[1-9]{1,10}.*(?:[.,]\d{3})*(?:[.,]\d{2})?)'
patt = re.findall(regex,string1)
print(patt)
print(len(patt))

Result:

>     [('Total $200.00 Total', ' $900.00')]
>     1

I am not sure if my regex is causing the miscount. I am trying to get the Total from a file, but it can appear in many forms. Examples:

Total $900.00
Invoice Amt $500.00
Total 800.00

etc.

I am looking to count this because there could be multiple invoice details in one file.

Tomalak · Accepted Answer

First off, because that's a common misconception:

There is no need to match "all text up to the match" or "all the text after a match". You can drop those .* in your regex. Start with what you actually want to match.
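To see why the leading .* gives a count of 1, here is a minimal sketch. The amount sub-expression \d+\.\d{2} is a simplified stand-in chosen for illustration, not the pattern from the question. The greedy .* consumes the first "Total $200.00" as ordinary text, so findall() has only one match left to report.

```python
import re

string1 = 'Total $200.00 Total $900.00'

# Greedy .* before the label swallows the first amount as plain text,
# so the whole string becomes a single match.
greedy = re.findall(r'.*Total.*\$(\d+\.\d{2})', string1)
print(greedy)        # ['900.00']

# Starting at the label itself lets each amount match separately.
direct = re.findall(r'Total\s*\$(\d+\.\d{2})', string1)
print(direct)        # ['200.00', '900.00']
print(len(direct))   # 2
```

Once the pattern matches only one label-plus-amount at a time, len() of the findall() result is the count you are after.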

import re

string1 = 'Total $200.00 Total $900.00'

amount_pattern = r'(?:Total|Amt|Invoice Amt|Others)[:\s]*\$([\d\.,]*\d)'
amount_expr = re.compile(amount_pattern, re.IGNORECASE)

amount_expr.findall(string1)
# -> ['200.00', '900.00']

\$([\d\.,]*\d) is a half-way reasonable approximation of prices ("things that start with a $ and then contain a bunch of digits and possibly dots and commas"). The final \d makes sure we are not accidentally matching sentence punctuation. It might be good enough, but you know what data you are working with. Feel free to come up with a more specific sub-expression. Include an optional leading - if you expect to see negative amounts.
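As a rough sketch of such a variation, the version below makes the $ optional (so "Total 800.00" from the question is also found) and allows an optional leading -. The sample text, the label list, and the placement of the - before the $ are assumptions for illustration; adjust them to whatever your files actually contain.

```python
import re

# Hypothetical sample input built from the examples in the question.
text = """Total $900.00
Invoice Amt $500.00
Total 800.00
Total -$25.50"""

# Group 1: optional minus sign. Group 2: the digits, which must end in a digit.
amount_expr = re.compile(
    r'(?:Total|Invoice Amt|Amt)[:\s]*(-?)\$?([\d.,]*\d)',
    re.IGNORECASE,
)

# findall() returns one (sign, value) tuple per match when there are two groups.
amounts = [sign + value for sign, value in amount_expr.findall(text)]
print(amounts)       # ['900.00', '500.00', '800.00', '-25.50']
print(len(amounts))  # 4
```

Be aware that with the $ optional, a label followed by any number will match (for example "Total 5 items"), so this is only suitable if the labels in your data are reliably followed by an amount.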
