Reputation: 493
I am trying to extract a particular "float" from a string that contains multiple formatted "integers", "floats" and dates. The float in question is preceded by some standardized text.
my_string = """03/14/2019 07:07 AM
💵Soles in mDm : 2864.35⬇
🔶BTC purchase in mdm: 11,202,782.0⬇
"""
I have been able to extract the desired float, 2864.35, from my_string, but if this float changes in pattern, or another float with the same format shows up, my script won't return the desired result.
import re

regex = r"(\d+\.\d+)"
matches = re.findall(regex, my_string)
for match in matches:
    print(match)
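To make the problem concrete, here is a small sketch of what the pattern above returns for my_string. Nothing here is new apart from collecting the matches in a list: the pattern matches any digits.digits run, so it also picks up the tail of the BTC figure.

```python
import re

my_string = """03/14/2019 07:07 AM
💵Soles in mDm : 2864.35⬇
🔶BTC purchase in mdm: 11,202,782.0⬇
"""

# The pattern is not tied to "Soles", so every digits.digits run matches.
# In 11,202,782.0 the commas break the digit run, leaving only "782.0".
matches = re.findall(r"(\d+\.\d+)", my_string)
print(matches)  # ['2864.35', '782.0']
```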
The float I need always follows the word Soles, which could be upper or lower case. The second line of my_string can appear in several variations of the same line. The regex should return only the float from that line, despite variations such as soles or Soles.
Any assistance in editing or rewriting the current regular expression, regex, is greatly appreciated.
Upvotes: 2
Views: 355
Reputation: 482
EDIT: If the float has to follow soles, then hopefully this helps.
Try these. Granted, my console can't take the extra characters, but based on your input:
>>> import re
>>> my_string = """03/14/2019 07:07 AM
Soles in mDm : 2864.35
BTC purchase in mdm: 11,202,782.0
Soles in mDm : 2864.35
soles MDM: 2,864.35
Soles in mdm :2,864.355
"""
>>> re.findall(r'(?i)soles[\S\s]*?([\d]+[\d,]*\.[\d]+)', my_string)
# Output
['2864.35', '2864.35', '2,864.35', '2,864.355']
>>> re.findall(r'[Ss]oles[\S\s]*?([\d]+[\d,]*\.[\d]+)', my_string)
# Output
['2864.35', '2864.35', '2,864.35', '2,864.355']
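One caveat with the patterns above: [\S\s]*? can run across line breaks, so a Soles line that has no float would pick up a number from a later line. Below is a minimal sketch of a line-bounded variant, assuming the float always sits on the same line as Soles. The names SOLES_FLOAT and extract_soles are hypothetical, and stripping the commas to convert to float is my own addition, not something the question asked for.

```python
import re

my_string = """03/14/2019 07:07 AM
💵Soles in mDm : 2864.35⬇
🔶BTC purchase in mdm: 11,202,782.0⬇
"""

# (?i)             case-insensitive, so soles / Soles / SOLES all match
# [^\n]*?          skip as little as possible, never past the end of the line
# (\d[\d,]*\.\d+)  a float with optional thousands separators
SOLES_FLOAT = re.compile(r"(?i)soles[^\n]*?(\d[\d,]*\.\d+)")


def extract_soles(text):
    """Return, as floats, the numbers that follow "soles" on the same line."""
    return [float(raw.replace(",", "")) for raw in SOLES_FLOAT.findall(text)]


print(extract_soles(my_string))  # [2864.35]
```

If you need the original text (for example '2,864.35') rather than a number, return the raw matches from findall and skip the float conversion.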
Upvotes: 2
Reputation: 38502
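If you want to match multiple instances, add the g flag in regex flavors that support it; otherwise only the first instance is matched. Python has no g flag: use re.findall or re.finditer instead of re.search to get every match. Regex: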
(?<=:)\s?([\d,]*\.\d+)
With Python,
# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility
import re
regex = r"(?<=:)\s?([\d,]*\.\d+)"
test_str = ("\n"
" 💵Soles in mDm : 2864.35⬇\n"
" soles MDM: 2,864.35\n"
" Soles in mdm :2,864.355\n")
matches = re.search(regex, test_str, re.IGNORECASE)
if matches:
    print("Match was found at {start}-{end}: {match}".format(start=matches.start(), end=matches.end(), match=matches.group()))
    for groupNum in range(0, len(matches.groups())):
        groupNum = groupNum + 1
        print("Group {groupNum} found at {start}-{end}: {group}".format(groupNum=groupNum, start=matches.start(groupNum), end=matches.end(groupNum), group=matches.group(groupNum)))
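Since re.search stops at the first match, here is a short sketch of collecting every match with re.finditer, using the same regex and test string as above. The variable name found is just for illustration.

```python
import re

regex = r"(?<=:)\s?([\d,]*\.\d+)"
test_str = ("\n"
            " 💵Soles in mDm : 2864.35⬇\n"
            " soles MDM: 2,864.35\n"
            " Soles in mdm :2,864.355\n")

# re.finditer yields every non-overlapping match, not just the first one.
found = [m.group(1) for m in re.finditer(regex, test_str)]
print(found)  # ['2864.35', '2,864.35', '2,864.355']
```

Note that this pattern keys on the colon rather than on Soles, so against the full my_string from the question it would also return the BTC figure 11,202,782.0. If only the Soles value is wanted, the pattern needs to include the word itself.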
Upvotes: 0