sss

Reputation: 1347

Get index of words

If I have this line:

"He is a very good man. He has a [good wife]."

And I want to extract the line, followed by the bracketed phrase [good wife] and the index of the first word of [good wife].

So the output will be:

"He is a very good man. He has a [good wife], good wife: 12

I tried this:

fi = codecs.open('file', 'r', 'utf-8')
regex = re.compile(r"\[(.*?)\]")
for line in fi.readlines():
    line2= line.split()
    mw = re.findall(regex, line2)
    print (line, mw, line2.index(mw[0]))

But it does not give what is wanted.

Can someone help?

Upvotes: 0

Views: 55

Answers (1)

Kasravnd

Reputation: 107287

You can just use re.search:

>>> import re
>>> def find(s):
...   try:
...     sub=re.search(r"\[(.*?)\]",s).group(1)
...     return sub,s.split().index('['+sub.split()[0])
...   except AttributeError:
...     return '[]'
... 
>>> print find('He is a very good man. He has a [good wife].')
('good wife', 9)
>>> print find('He is a very good man. He has a good wife.')
[]

Note that the result of re.search is 'good wife', so to get the index of the first word you need to prepend [ to it: after s.split(), good is not a separate word in your string, the word is [good.
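For Python 3, the same idea can be sketched without rebuilding the '[good' token, by counting the words that come before the match. This is only a minimal sketch: the names find_bracketed and BRACKETED are made up for illustration, and it assumes the opening bracket starts a new whitespace-separated word, as in the example line.

```python
import re

# Hypothetical helper names; the pattern is the one from the question.
BRACKETED = re.compile(r"\[(.*?)\]")

def find_bracketed(line):
    """Return (phrase, word_index) for the first [bracketed] phrase, or None."""
    match = BRACKETED.search(line)
    if match is None:
        return None
    # Words before the opening bracket; assumes '[' begins a new word.
    word_index = len(line[:match.start()].split())
    return match.group(1), word_index

print(find_bracketed("He is a very good man. He has a [good wife]."))
# ('good wife', 9)
print(find_bracketed("He is a very good man. He has a good wife."))
# None
```

The attempt in the question fails earlier than the index lookup: re.findall expects a string, but line2 is the list returned by line.split(), so the search has to run on the line itself.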

Upvotes: 1
