Shweta Kamble

Reputation: 432

How to find position of letter in a string based on condition python

I want to find the index of a letter in a string that satisfies a certain condition. Specifically, I want the index of the letter g for which all the brackets before it are complete (every opening bracket has been closed).

This is the string I have:

sen = 'abcd(fgji(l)jkpg((jgsdti))khgy)ghyig(a)gh'

This is what I have tried:

lst = [(i.end()) for i in re.finditer('g', sen)]
# lst
# [7, 16, 20, 29, 32, 36, 40]
count_open = 0
count_close = 0
for i in lst:
    sent=sen[0:i]
    for w in sent:
        if w == '(':
            count_open += 1
        if w == ')':
            count_close += 1    
        if count_open == count_close && count_open != 0:
            c = i-1
     break

It gives me c as 39, which is the last index. However, the right answer should be 35, as the brackets before the second-to-last g are complete.

Upvotes: 1

Views: 984

Answers (5)

Benjamin

Reputation: 571

This is a simpler adaptation of the code in the OP (and takes into account the condition count_open != 0):

def get_idx(f, sen):
    idx = []
    count_open= 0
    count_close=0

    for i, w in enumerate(sen):
        if w == '(':
            count_open += 1
        if w == ')':
            count_close += 1    
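        # Use `and`, not bitwise `&`: `&` binds tighter than `==`,
        # so the original only worked by accident on balanced input.
        if count_open == count_close and count_open != 0: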
            if w == f:
                idx.append(i)

    return idx

get_idx('g', sen)

Out:

[31, 35, 39]

Upvotes: 1

juanpa.arrivillaga

Reputation: 95948

You can dispense with regex and simply use a stack to keep track of whether or not your parens are balanced while you iterate over the characters:

In [4]: def find_balanced_gs(sen):
   ...:     stack = []
   ...:     for i, c in enumerate(sen):
   ...:         if c == "(":
   ...:             stack.append(c)
   ...:         elif c == ")":
   ...:             stack.pop()
   ...:         elif c == 'g':
   ...:             if len(stack) == 0:
   ...:                 yield i
   ...:

In [5]: list(find_balanced_gs(sen))
Out[5]: [31, 35, 39]

Using a stack here is the "classic" way of checking for balanced parens. It's been a while since I've implemented it from scratch, so there might be some edge cases that I haven't considered (for example, an unmatched ")" makes stack.pop() raise an IndexError). But this should be a good start. I've made a generator, but you can make it a normal function that returns a list of indices, the first such index, or the last such index.
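A minimal sketch of those variations, written as a plain function that returns a list. The name balanced_indices, the letter parameter, and the choice to raise ValueError on a stray ")" are assumptions for illustration, not part of the code above. Since only one kind of bracket is involved, an integer depth counter stands in for the stack.

```python
def balanced_indices(sen, letter="g"):
    """Return the indices of `letter` that occur while no parenthesis is open."""
    depth = 0   # number of currently open parentheses
    found = []
    for i, c in enumerate(sen):
        if c == "(":
            depth += 1
        elif c == ")":
            if depth == 0:
                # Assumption: treat a stray ")" as an error instead of ignoring it.
                raise ValueError(f"unmatched ')' at index {i}")
            depth -= 1
        elif c == letter and depth == 0:
            found.append(i)
    return found


sen = 'abcd(fgji(l)jkpg((jgsdti))khgy)ghyig(a)gh'
indices = balanced_indices(sen)
print(indices)       # [31, 35, 39]
print(indices[0])    # first such index: 31
print(indices[-1])   # last such index: 39
```

Like the generator, this also reports a letter that appears before any bracket has been opened; add a check on whether a "(" has been seen if that should not count.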

Upvotes: 3

Paul Panzer

Reputation: 53029

@Thierry Lathuille's answer is perfectly good. Here I'm just suggesting some minor variations without claiming they are better:

import re

out = []    # collect all valid 'g'
ocount = 0  # only store the difference between open and closed
for m in re.finditer(r'[()g]', sen):    # use re to preselect
    L = m.group()
    ocount += {'(':1, ')':-1, 'g':0}[L] # save a bit of typing
    assert ocount >= 0                  # enforce some grammar if you like
    if L == 'g' and ocount == 0:
        out.append(m.start())

out
# [31, 35, 39]

Upvotes: 1

Thierry Lathuille

Reputation: 24232

Keeping your idea, just a few things were off, see comments:

import re

sen='abcd(fgji(l)jkpg((jgsdti))khgy)ghyig(a)gh'


lst=[ (i.end()) for i in re.finditer('g', sen)]
#lst
#[7, 16, 20, 29, 32, 36, 40]

for i in lst:
    # You have to reset the count for every i
    count_open= 0
    count_close=0
    sent=sen[0:i]
    for w in sent:
        if w=='(':
            count_open+=1
        if w==')':
            count_close+=1    
    # And iterate over all of sent before comparing the counts
    if count_open == count_close and count_open != 0:
        c=i-1
        break
print(c)
# 31 - actually the right answer, not 35

But this is not very efficient, as you iterate many times over the same parts of the string. You can make it more efficient, iterating only once over the string:

sen='abcd(fgji(l)jkpg((jgsdti))khgy)ghyig(a)gh'

def find(letter, string):
    count_open = 0
    count_close = 0
    for (index, char) in enumerate(string):
        if char == '(':
            count_open += 1
        elif char == ')':
            count_close += 1
        elif char == letter and count_close == count_open and count_open > 0:
            return index
    else:
        raise ValueError('letter not found')

find('g', sen)
# 31
find('a', sen)
# ...
# ValueError: letter not found
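To make the efficiency point concrete, here is a rough side-by-side sketch of the two strategies. The function names first_rescan and first_single_pass are made up for this illustration, and both return None when nothing matches (an assumption, to keep the comparison short) rather than raising.

```python
def first_rescan(letter, string):
    """Quadratic: recount the brackets of the whole prefix before each match."""
    for i, char in enumerate(string):
        if char == letter:
            prefix = string[:i]
            opened = prefix.count('(')
            if opened > 0 and opened == prefix.count(')'):
                return i
    return None


def first_single_pass(letter, string):
    """Linear: keep running bracket counts while scanning the string once."""
    opened = closed = 0
    for i, char in enumerate(string):
        if char == '(':
            opened += 1
        elif char == ')':
            closed += 1
        elif char == letter and opened == closed and opened > 0:
            return i
    return None


sen = 'abcd(fgji(l)jkpg((jgsdti))khgy)ghyig(a)gh'
print(first_rescan('g', sen), first_single_pass('g', sen))   # 31 31
```

Both give the same answer; the first one rereads the start of the string for every candidate letter, while the second touches each character once.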

Upvotes: 1

Vipin Joshi

Reputation: 302

You can use .index() to find the index of a substring within a string, or of an element within a list.

Calling stringvar.index(substring) gives you the offset (index) of the first occurrence of substring.
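For illustration, a short sketch of str.index() on the string from the question. Note that index() by itself knows nothing about brackets: it returns the first match, so it would have to be combined with a bracket check to answer the question as asked. The variable names first, later, and message are only for this example.

```python
sen = 'abcd(fgji(l)jkpg((jgsdti))khgy)ghyig(a)gh'

first = sen.index('g')        # first 'g' anywhere in the string
later = sen.index('g', 7)     # first 'g' at or after index 7
print(first)   # 6 (this 'g' sits inside an open bracket)
print(later)   # 15

try:
    sen.index('z')            # index() raises when nothing is found
except ValueError as exc:
    message = str(exc)
    print(message)            # substring not found
```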

Upvotes: -1
