Reputation: 324
I have a problem matching some text in a string with re.findall(): it only lets me match greedily or non-greedily, and I want every match that falls between the greedy and non-greedy ones as well. For example:
import re
text = "f(s(5)+5)+f(12)"
regex = re.findall(r"f\(.*\)", text)
>>>['f(s(5)+5)+f(12)']
This is greedy and matches the whole string. Another example:
import re
text = "f(s(5)+5)+f(12)"
regex = re.findall(r"f\(.*?\)", text)
>>>['f(s(5)', 'f(12)']
This is non-greedy and matches some parts, but not enough. I want all of the greedy and non-greedy matches plus the matches in between them, like:
>>> ['f(s(5)', 'f(s(5)+5)', 'f(12)', 'f(s(5)+5)+f(12)']
Notice that one match, 'f(s(5)+5)', is missing from both the greedy and the non-greedy results, and more would be missing if the string were longer.
Upvotes: 0
Views: 124
Reputation: 145
As everyone has already said, there is no single regex that gives you the desired output directly.
With a loop around a regex, though, I was able to produce it. See if this helps.
import re
text = "f(s(5)+5)+f(12)"
print("occurrences of ')' : {}".format(text.count(")")))
test_str = text
# loop repeatedly until all substrings starting with 'f(' are parsed
while test_str:
    # for loop: to parse all ')'
    for i in range(1, test_str.count(")") + 1):
        # regex explanation can be found @ https://regex101.com/r/jJOXr0/1/
        # matches from the leading 'f(' up to and including the i-th ')'
        regex = r'^f\((?:.*?\)){' + re.escape(str(i)) + r'}'
        output_list = re.findall(regex, test_str)
        print(output_list[0])
    # find the next substring starting with 'f('
    substr_id = test_str.find('f(', 1)
    if substr_id == -1:
        break
    else:
        test_str = test_str[substr_id:]
Output :
occurrences of ')' : 3
f(s(5)
f(s(5)+5)
f(s(5)+5)+f(12)
f(12)
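If you would rather not build a new regex for every count, here is a minimal sketch of the same idea: collect the position of every 'f(' and every ')', then slice each valid pair. The function name `all_matches` and its `opener` / `closer` parameters are my own illustrative names, not part of the `re` module, and unlike `.` in the regex above, this version also spans newlines.

```python
import re

def all_matches(text, opener="f(", closer=")"):
    """Return every substring that starts at an `opener` and ends at a later `closer`.

    Hypothetical helper for illustration; it does not check that the
    parentheses inside each substring are balanced.
    """
    # Start index of every opener, e.g. every 'f('
    starts = [m.start() for m in re.finditer(re.escape(opener), text)]
    # End index (exclusive) of every closer, e.g. every ')'
    ends = [m.end() for m in re.finditer(re.escape(closer), text)]
    # Keep only pairs where the closer lies entirely after the opener
    min_len = len(opener) + len(closer)
    return [text[s:e] for s in starts for e in ends if e - s >= min_len]

print(all_matches("f(s(5)+5)+f(12)"))
# ['f(s(5)', 'f(s(5)+5)', 'f(s(5)+5)+f(12)', 'f(12)']
```

The results come out grouped by starting 'f(', in the same order as the loop above prints them.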
Upvotes: 1