Reputation: 432
I want to find the index of a letter in a string that satisfies a certain condition. Specifically, I want the index of the letter g when all the brackets before it are closed (balanced).
This is the string I have:
sen = 'abcd(fgji(l)jkpg((jgsdti))khgy)ghyig(a)gh'
This is what I have done:
import re
lst = [i.end() for i in re.finditer('g', sen)]
# lst
# [7, 16, 20, 29, 32, 36, 40]
count_open = 0
count_close = 0
for i in lst:
    sent = sen[0:i]
    for w in sent:
        if w == '(':
            count_open += 1
        if w == ')':
            count_close += 1
        if count_open == count_close and count_open != 0:
            c = i - 1
            break
It gives me c as 39, which is the last index. However, the right answer should be 35, as the brackets before the second-to-last g are complete.
Upvotes: 1
Views: 984
Reputation: 571
This is a simpler adaptation of the code in the OP (it also takes into account the condition count_open != 0):
def get_idx(f, sen):
    idx = []
    count_open = 0
    count_close = 0
    for i, w in enumerate(sen):
        if w == '(':
            count_open += 1
        if w == ')':
            count_close += 1
        # use `and`, not `&`: the bitwise operator binds tighter than the comparisons
        if count_open == count_close and count_open != 0:
            if w == f:
                idx.append(i)
    return idx
get_idx('g', sen)
Out:
[31, 35, 39]
Upvotes: 1
Reputation: 95948
You can dispense with regex and simply use a stack to keep track of whether or not your parens are balanced while you iterate over the characters:
In [4]: def find_balanced_gs(sen):
   ...:     stack = []
   ...:     for i, c in enumerate(sen):
   ...:         if c == "(":
   ...:             stack.append(c)
   ...:         elif c == ")":
   ...:             stack.pop()
   ...:         elif c == 'g':
   ...:             if len(stack) == 0:
   ...:                 yield i
   ...:
In [5]: list(find_balanced_gs(sen))
Out[5]: [31, 35, 39]
Using a stack here is the "classic" way of checking for balanced parens. It's been a while since I've implemented it from scratch, so there might be some edge cases that I haven't considered. But this should be a good start. I've made a generator, but you can make it a normal function that returns a list of indices, the first such index, or the last such index.
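As a rough sketch of the list-returning variant mentioned above: the function name `find_balanced_list` and the `letter` parameter are my own, hypothetical choices. It also guards one edge case, a `)` with no matching `(`, by raising a `ValueError` instead of letting `stack.pop()` fail on an empty list. Like the generator, it does not require that a bracket has been seen before the letter.

```python
def find_balanced_list(sen, letter="g"):
    """Return the indices of `letter` that lie outside all parentheses."""
    stack = []    # one entry per "(" that has not been closed yet
    indices = []
    for i, c in enumerate(sen):
        if c == "(":
            stack.append(c)
        elif c == ")":
            if not stack:
                # assumption: treat an unmatched ")" as an input error
                raise ValueError(f"unmatched ')' at index {i}")
            stack.pop()
        elif c == letter and not stack:
            indices.append(i)
    return indices


sen = 'abcd(fgji(l)jkpg((jgsdti))khgy)ghyig(a)gh'
indices = find_balanced_list(sen)
first = indices[0] if indices else None   # first such index, if any
last = indices[-1] if indices else None   # last such index, if any
print(indices, first, last)
# [31, 35, 39] 31 39
```

If you only ever need the first index, `next(find_balanced_gs(sen), None)` on the generator stops at the first hit instead of scanning the whole string.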
Upvotes: 3
Reputation: 53029
@Thierry Lathuille's answer is perfectly good. Here I'm just suggesting some minor variations without claiming they are better:
import re
out = []    # collect the indices of all valid 'g'
ocount = 0  # only store the difference between open and closed
for m in re.finditer(r'[()g]', sen):  # use re to preselect
    L = m.group()
    ocount += {'(': 1, ')': -1, 'g': 0}[L]  # save a bit of typing
    assert ocount >= 0  # enforce some grammar if you like
    if L == 'g' and ocount == 0:
        out.append(m.start())
out
# [31, 35, 39]
Upvotes: 1
Reputation: 24232
Keeping your idea, just a few things were off, see comments:
import re
sen = 'abcd(fgji(l)jkpg((jgsdti))khgy)ghyig(a)gh'
lst = [i.end() for i in re.finditer('g', sen)]
# lst
# [7, 16, 20, 29, 32, 36, 40]
for i in lst:
    # You have to reset the count for every i
    count_open = 0
    count_close = 0
    sent = sen[0:i]
    for w in sent:
        if w == '(':
            count_open += 1
        if w == ')':
            count_close += 1
    # And iterate over all of sent before comparing the counts
    if count_open == count_close and count_open != 0:
        c = i - 1
        break
print(c)
# 31 - actually the right answer, not 35
But this is not very efficient, as you iterate many times over the same parts of the string. You can make it more efficient, iterating only once over the string:
sen = 'abcd(fgji(l)jkpg((jgsdti))khgy)ghyig(a)gh'
def find(letter, string):
    count_open = 0
    count_close = 0
    # iterate over the argument `string`, not the global `sen`
    for (index, char) in enumerate(string):
        if char == '(':
            count_open += 1
        elif char == ')':
            count_close += 1
        elif char == letter and count_close == count_open and count_open > 0:
            return index
    else:
        # reached only when the loop finishes without returning
        raise ValueError('letter not found')
find('g', sen)
# 31
find('a', sen)
# ...
# ValueError: letter not found
Upvotes: 1
Reputation: 302
You can use .index() to find the index of a substring within a string, or of an element within a list.
Call stringvar.index(substring); this gives you the offset (index) of the first occurrence of substring.
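A short illustration of `.index()` on the string from the question, as a sketch only. Note that `.index()` knows nothing about brackets: it returns the first match from the starting offset, so on its own it finds the g at index 6, which is inside an open bracket. The optional second argument (a start offset) and the `ValueError` on a miss are standard `str.index` behaviour.

```python
sen = 'abcd(fgji(l)jkpg((jgsdti))khgy)ghyig(a)gh'

first_g = sen.index('g')         # first 'g' anywhere in the string
later_g = sen.index('g', 31)     # first 'g' at or after offset 31
print(first_g, later_g)
# 6 31

try:
    sen.index('z')               # 'z' does not occur in sen
except ValueError as exc:
    print('not found:', exc)
```

So `.index()` is useful for locating a character, but you still need one of the bracket-counting approaches above to decide whether that position is valid.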
Upvotes: -1