PrimingRyan

Reputation: 141

How to find a word that starts with a specific character

I want to pick out the words that start with 's' in a sentence using Python.
Here is my code:

import re
text = "I was searching my source to make a big desk yesterday."
m = re.findall(r'[s]\w+', text)
print(m)

But the result of the code is:

['searching', 'source', 'sk', 'sterday']

How do I write a regular expression for this? Or is there another way to pick out these words?

Upvotes: 10

Views: 67761

Answers (6)

S A G A R

Reputation: 1

I would like to add one small thing here.

Let's say you have a line in which you want to find the words that start with 's':

line = "someone should show something to [email protected]"

If you write the regular expression like this:

swords = re.findall(r"\b[sS]\w+", line)

the output will be:

['someone','should','show','something','some']

But if you change the regular expression to:

# use \S instead of \w
swords = re.findall(r"\b[sS]\S+", line)

the output will be:

['someone','should','show','something','[email protected]']
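A caveat worth checking before switching to \S: it matches any non-whitespace character, so trailing punctuation is captured too. The sketch below uses a made-up sentence and a hypothetical address (sam@example.com), not the line from this answer:

```python
import re

# Hypothetical sample line; the address is invented for illustration.
line = "send this to sam@example.com soon."

# \w+ stops at the first non-word character, so the address is cut at '@'.
with_w = re.findall(r"\b[sS]\w+", line)

# \S+ runs until the next whitespace, so it keeps the whole address
# but also the full stop attached to the last word.
with_S = re.findall(r"\b[sS]\S+", line)

print(with_w)  # ['send', 'sam', 'soon']
print(with_S)  # ['send', 'sam@example.com', 'soon.']
```

If the trailing punctuation is unwanted, one option is to strip it from each match afterwards.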

Upvotes: -1

user3533685

Reputation: 51

Lambda style:

text = 'I was searching my source to make a big desk yesterday.'

list(filter(lambda word: word[0]=='s', text.split()))

Output:

['searching', 'source']

Upvotes: 2

Adem Öztaş

Reputation: 21446

I know it is not a regex solution, but you can use startswith:

>>> text="I was searching my source to make a big desk yesterday."
>>> [ t for t in text.split() if t.startswith('s') ]
['searching', 'source']
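Splitting on whitespace leaves punctuation attached to the tokens, so a word such as "(slowly)." would be missed. A minimal sketch of one way around that, assuming it is acceptable to strip leading and trailing punctuation and to treat 's' and 'S' alike (the sample sentence is made up):

```python
import string

# Hypothetical sample sentence with quotes and parentheses around words.
text = 'She said "sorry" and sat down (slowly).'

words = []
for token in text.split():
    # Remove punctuation from both ends of the token, e.g. '(slowly).' -> 'slowly'.
    word = token.strip(string.punctuation)
    # startswith accepts a tuple of prefixes, so both cases are checked at once.
    if word.startswith(("s", "S")):
        words.append(word)

print(words)  # ['She', 'said', 'sorry', 'sat', 'slowly']
```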

Upvotes: 13

Narekzzz

Reputation: 11

I tried this sample of code and I think it does exactly what you want:

import re
text = "I was searching my source to make a big desk yesterday."
m = re.findall(r'\b[s]\w+', text)
print(m)

Upvotes: 1

stema

Reputation: 92976

  1. If you want to match a single character, you don't need to put it in a character class, so s is the same as [s].

  2. What you want to find is a word boundary. A word boundary \b is an anchor that matches at a change from a non-word character (\W) to a word character (\w) or vice versa.

The solution is:

\bs\w+

This regex matches an s that is not preceded by a word character (so it also works at the start of the string) and requires at least one word character after it. \w+ matches as many word characters as it can find, so there is no need for a \b at the end.

See it here on Regexr
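As a quick check of the points above, this sketch compares the pattern with and without \b on the sentence from the question. The second sample string is made up to show that \w+ skips a lone "s", whereas \w* would keep it:

```python
import re

text = "I was searching my source to make a big desk yesterday."

unanchored = re.findall(r"s\w+", text)   # 's' may match in the middle of a word
anchored = re.findall(r"\bs\w+", text)   # 's' must be at the start of a word

print(unanchored)  # ['searching', 'source', 'sk', 'sterday']
print(anchored)    # ['searching', 'source']

# Hypothetical sample: \w+ needs at least one word character after the 's'.
sample = "s is a letter; so is S"
one_or_more = re.findall(r"\bs\w+", sample)
zero_or_more = re.findall(r"\bs\w*", sample)

print(one_or_more)   # ['so']
print(zero_or_more)  # ['s', 'so']
```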

Upvotes: 2

jamylak

Reputation: 133514

>>> import re
>>> text = "I was searching my source to make a big desk yesterday."
>>> re.findall(r'\bs\w+', text)
['searching', 'source']

For lowercase and uppercase s use: r'\b[sS]\w+'
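An alternative to the [sS] character class is the re.IGNORECASE flag, which makes the whole pattern case-insensitive. A minimal sketch on a made-up sentence:

```python
import re

# Hypothetical sample sentence containing both 'S' and 's' words.
text = "Some people said the sky was silver."

# flags=re.IGNORECASE lets the literal 's' match 'S' as well.
matches = re.findall(r"\bs\w+", text, flags=re.IGNORECASE)

print(matches)  # ['Some', 'said', 'sky', 'silver']
```

The flag affects every letter in the pattern, so the character class is the narrower choice when only the first letter should ignore case.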

Upvotes: 25
