Reputation: 141
I want to pick out the words that start with 's' in a sentence using Python.
Here is my code:
import re
text = "I was searching my source to make a big desk yesterday."
m = re.findall(r'[s]\w+', text)
print(m)
But the result of the code is:
['searching', 'source', 'sk', 'sterday']
How do I write a regular expression for this? Or is there another way to pick out these words?
Upvotes: 10
Views: 67761
Reputation: 1
I would like to add one small thing here.
Suppose you have a line in which you want to find the words that start with 's':
line = "someone should show something to [email protected]"
If you write the regular expression like this,
swords = re.findall(r"\b[sS]\w+", line)
the output will be:
['someone','should','show','something','some']
But if you change the regular expression to
# use \S instead of \w
swords = re.findall(r"\b[sS]\S+", line)
the output will be:
['someone','should','show','something','[email protected]']
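The difference between \w and \S can be checked with a small sketch. The sample line and the address sam@example.com below are made up for illustration and are not the line from this answer:

```python
import re

# Hypothetical sample line; the address is an illustrative placeholder.
line = "send it to sam@example.com"

# \w+ stops at the first non-word character, so the address is cut at '@'.
with_w = re.findall(r"\b[sS]\w+", line)

# \S+ keeps going until whitespace, so the whole address is kept.
with_S = re.findall(r"\b[sS]\S+", line)

print(with_w)  # ['send', 'sam']
print(with_S)  # ['send', 'sam@example.com']
```

Note that \S+ also keeps any trailing punctuation attached to a word, so it is probably only worth using when tokens such as addresses need to stay whole.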
Upvotes: -1
Reputation: 51
Lambda style:
text = 'I was searching my source to make a big desk yesterday.'
list(filter(lambda word: word[0]=='s', text.split()))
Output:
['searching', 'source']
Upvotes: 2
Reputation: 21446
I know it is not a regex solution, but you can use startswith:
>>> text="I was searching my source to make a big desk yesterday."
>>> [ t for t in text.split() if t.startswith('s') ]
['searching', 'source']
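One caveat with splitting on whitespace is that punctuation stays attached to the words. A possible variation, assuming you want the punctuation stripped and both cases matched, might look like this (the helper name words_starting_with and the extended sample sentence are made up for this sketch):

```python
import string

def words_starting_with(text, letter):
    """Return whitespace-separated words that start with `letter`, ignoring case."""
    # Strip leading/trailing punctuation so "yesterday." becomes "yesterday".
    words = (w.strip(string.punctuation) for w in text.split())
    return [w for w in words if w.lower().startswith(letter.lower())]

# Sample sentence from the question, extended with a hypothetical second sentence.
text = "I was searching my source to make a big desk yesterday. So what?"
result = words_starting_with(text, "s")
print(result)  # ['searching', 'source', 'So']
```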
Upvotes: 13
Reputation: 11
I tried this sample code, and I think it does exactly what you want:
import re
text = "I was searching my source to make a big desk yesterday."
m = re.findall(r'\b[s]\w+', text)
print(m)
Upvotes: 1
Reputation: 92976
If you want to match a single character, you don't need to put it in a character class, so s is the same as [s].
What you want to find is a word boundary. A word boundary \b is an anchor that matches on a change from a non-word character (\W) to a word character (\w) or vice versa.
The solution is:
\bs\w+
This regex matches an s that is not preceded by a word character (this also works at the start of the string) and requires at least one word character after it. \w+ matches all the word characters it can find, so there is no need for a \b at the end.
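To see the effect of the word boundary, here is a small sketch that runs the same pattern with and without \b on the sentence from the question (the variable names are just for illustration):

```python
import re

text = "I was searching my source to make a big desk yesterday."

# Without \b, the pattern also matches an 's' in the middle of a word.
without_boundary = re.findall(r's\w+', text)

# With \b, the 's' must sit at the start of a word.
with_boundary = re.findall(r'\bs\w+', text)

print(without_boundary)  # ['searching', 'source', 'sk', 'sterday']
print(with_boundary)     # ['searching', 'source']
```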
Upvotes: 2
Reputation: 133514
>>> import re
>>> text = "I was searching my source to make a big desk yesterday."
>>> re.findall(r'\bs\w+', text)
['searching', 'source']
For both lowercase and uppercase s, use: r'\b[sS]\w+'
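As an alternative to the [sS] character class, the re.IGNORECASE flag should give the same result. A minimal sketch, using a made-up sample sentence:

```python
import re

# Hypothetical sample sentence with both upper- and lowercase 's' words.
text = "Sam was searching my source. So she said yes."

# re.IGNORECASE makes the literal 's' match 'S' as well.
pattern = re.compile(r'\bs\w+', re.IGNORECASE)
matches = pattern.findall(text)

print(matches)  # ['Sam', 'searching', 'source', 'So', 'she', 'said']
```

Because \w+ requires at least one more word character, a lone "s" or "S" would not be matched by either form.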
Upvotes: 25