Reputation: 727
I am trying to find all words containing "hell" in a sentence. There are 3 occurrences in the string below, but re.search is returning only the first 2 occurrences. I tried both findall and search. Can someone please tell me what is wrong here?
>>> import re
>>> s = 'heller pond hell hellyi'
>>> m = re.findall(r'(hell)\S*', s)
>>> m.group(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'list' object has no attribute 'group'
>>> m = re.search(r'(hell)\S*', s)
>>> m.group(0)
'heller'
>>> m.group(1)
'hell'
>>> m.group(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: no such group
>>>
Upvotes: 2
Views: 3224
Reputation: 139
Maybe it's just me, but I use regex very little. Python 3 has extensive text functions, so what is wrong with using a built-in method?
'heller pond hell hellyi'.count('hell')
The only drawback I see is that this way I never really learn to use regex. :-)
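One caveat worth checking: str.count counts non-overlapping substring occurrences, not words, so the two numbers can differ when a single word contains hell more than once. A small sketch, where the sample string text is made up for illustration and is not from the question:

```python
# Hypothetical input: the first word contains "hell" twice.
text = 'hellhell pond hell'

# Counts every non-overlapping occurrence of the substring.
substring_hits = text.count('hell')

# Counts whitespace-separated words that contain the substring at least once.
word_hits = sum(1 for w in text.split() if 'hell' in w)

print(substring_hits, word_hits)
```

For the string in the question both approaches give 3, so count is fine there if only the number is needed.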
Upvotes: 0
Reputation: 54223
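Your regex does match every hell; the trouble is in how the result is read. re.search stops at the first match, so group(0) and group(1) are the whole match and the captured group of that one match, not two separate occurrences. re.findall returns a plain list (of the captured group, since the pattern contains one), not a match object, so it has no .group. Instead just look for a literal hell, nothing fancy.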
In [3]: re.findall('hell', 'heller pond hell hellyi')
Out[3]: ['hell', 'hell', 'hell']
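To see how the different calls behave on the pattern from the question, here is a short sketch. re.finditer is not used anywhere in the question; it is brought in here only as one way to get a match object for every hit:

```python
import re

s = 'heller pond hell hellyi'
pattern = r'(hell)\S*'  # the question's pattern, written as a raw string

# search: one match object, for the first hit only.
first = re.search(pattern, s)

# findall: a list of strings; with one capture group it holds group 1 of each match.
groups = re.findall(pattern, s)

# finditer: one match object per hit, so group(0) gives each whole match.
words = [m.group(0) for m in re.finditer(pattern, s)]

print(first.group(0))
print(groups)
print(words)
```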
EDIT
Per your comment, you want to return the whole word even when hell is found in the middle of it. In that case you should use the * zero-or-more quantifier on both sides.
In [4]: re.findall(r"\S*hell\S*", 'heller pond hell hellyi')
Out[4]: ['heller', 'hell', 'hellyi']
In other words:
re.compile(r"""
\S* # zero or more non-space characters
hell # followed by a literal hell
\S* # followed by zero or more non-space characters""", re.X)
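As a usage sketch, the verbose pattern can be bound to a name and reused. The name word_with_hell is only an illustrative choice:

```python
import re

# re.X (verbose mode) ignores whitespace in the pattern and allows # comments.
word_with_hell = re.compile(r"""
    \S*   # zero or more non-space characters
    hell  # followed by a literal hell
    \S*   # followed by zero or more non-space characters
    """, re.X)

matches = word_with_hell.findall('heller pond hell hellyi')
print(matches)
```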
Note that Padraic's answer is definitely the BEST way to go about this:
[word for word in "heller pond hell hellyi".split() if 'hell' in word]
Upvotes: 2
Reputation:
You can use re.findall and search for hell with zero or more word characters on either side:
>>> import re
>>> s = 'heller pond hell hellyi'
>>> re.findall(r'\w*hell\w*', s)
['heller', 'hell', 'hellyi']
>>>
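Note that \w and \S are not interchangeable once the text contains punctuation: \w stops at anything that is not a letter, digit or underscore, while \S runs up to the next whitespace. A quick comparison, using a made-up sample string that is not from the question:

```python
import re

# Hypothetical input with a hyphen, a comma and an exclamation mark.
s = 'shell-like, hello! hell'

# \w* stops at punctuation, so only the word-character run around "hell" is kept.
word_chars = re.findall(r'\w*hell\w*', s)

# \S* keeps everything up to the surrounding whitespace, punctuation included.
non_space = re.findall(r'\S*hell\S*', s)

print(word_chars)
print(non_space)
```

Which one is the better fit depends on whether trailing punctuation should count as part of the word.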
Upvotes: 5
Reputation: 180481
You can use str.split and see if the substring is in each word:
s = 'heller pond hell hellyi'
print([w for w in s.split() if "hell" in w])
Upvotes: 2