Finding words around a substring

Question

I have to extract two words before and after my substring match in a large string. For example:

sub = 'name'

str = '''My name is Avi. Name identifies who you are. It is important to have a name starting with the letter A.'''

Now I have to find all occurrences of sub in str and then return the following:

(My name is Avi), (Name identifies who), (have a name starting with)

Note that if there is a full stop after the string, then only the words before the string are returned, as shown in the example above.

What I have tried:

>>> import re
>>> text = '''My name is Avi. Name identifies who you are. It is important to have a name starting with the letter A.'''
>>> for m in re.finditer( 'name', text ):
...     print( 'name found', m.start(), m.end() )

Which gives me the starting and ending position of the matched substring. I am not able to proceed further as to how to find words around it.

perreal · Accepted Answer

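import re

# Capture up to two words on each side of "name"; \W* skips spaces and punctuation.
sub = r'(\w*)\W*(\w*)\W*(name)\W*(\w*)\W*(\w*)'
str1 = '''My name is Avi. Name identifies who you are. It is important to have a name starting with the letter A.'''
for i in re.findall(sub, str1, re.I):
    # Each result is a tuple of groups; drop the empty ones before joining.
    print(" ".join([x for x in i if x != ""]))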

Output

My name is Avi
Name identifies who
have a name starting with

or,

# Same idea without groups: take the whole match and trim spaces and full stops.
sub = r'\w*\W*\w*\W*name\W*\w*\W*\w*'
for i in re.findall(sub, str1, re.I):
    i = i.strip(" .")
    print(i)
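
As a rough sketch (not part of the answer above), the same pattern could be wrapped in a helper that takes the number of surrounding words as a parameter. The function name words_around and the parameter n are hypothetical names chosen for illustration. It assumes Python 3 and uses re.escape so that special characters in the substring are treated literally.

```python
import re

def words_around(text, sub, n=2):
    """Return each match of `sub` with up to `n` words on either side.

    Hypothetical helper: builds the same kind of pattern as the answer,
    repeating the word/non-word pair `n` times before and after `sub`.
    """
    before = r"(\w*)\W*" * n
    after = r"\W*(\w*)" * n
    pattern = before + "(" + re.escape(sub) + ")" + after
    return [
        # Drop empty groups (no word available on that side) and join the rest.
        " ".join(word for word in groups if word)
        for groups in re.findall(pattern, text, re.I)
    ]

text = ("My name is Avi. Name identifies who you are. "
        "It is important to have a name starting with the letter A.")
print(words_around(text, "name"))
```

Two caveats apply to this sketch and to the patterns above: \W* also skips full stops, so a match can pick up words from a neighbouring sentence, and the substring is also found inside longer words (for example "rename").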
