Priya
Priya

Reputation: 51

Match words that don't start with a certain letter using regex

I am learning regex but have not been able to find the right regex in python for selecting characters that start with a particular alphabet.

Example below

text='this is a test'
match=re.findall('(?!t)\w*',text)

# match returns
['his', '', 'is', '', 'a', '', 'est', '']

match=re.findall('[^t]\w+',text)

# match
['his', ' is', ' a', ' test']

Expected : ['is','a']

Upvotes: 5

Views: 3613

Answers (2)

Olivier Melançon
Olivier Melançon

Reputation: 22314

With regex

Use the negative set [^\Wt] to match any alphanumeric character that is not t. To avoid matching subsets of words, add the word boundary metacharacter, \b, at the beginning of your pattern.

Also, do not forget that you should use raw strings for regex patterns.

import re

text = 'this is a test'
match = re.findall(r'\b[^\Wt]\w*', text)

print(match) # prints: ['is', 'a']

See the demo here.

Without regex

Note that this is also achievable without regex.

text = 'this is a test'
match = [word for word in text.split() if not word.startswith('t')]

print(match) # prints: ['is', 'a']

Upvotes: 8

revo
revo

Reputation: 48711

You are almost on the right track. You just forgot \b (word boundary) token:

\b(?!t)\w+

Live demo

Upvotes: 2

Related Questions