Time Lord

Reputation: 341

Using Python regex find words starting and ending with specific letters

I don't do much text searching, and I haven't been able to find the regex that matches all words starting with T and ending with T in a text file where each word is on its own line. I've tried a number of suggestions from searches; the code below finds words that start with T and contain another T later on. However, I want the LAST letter to be T as well, regardless of how many T's occur in between. Apologies if this is trivial, but none of the combinations I've tried gives a result. I'm also unsure why r'^T.*T$' doesn't work.

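import re

# Read the whole file (one word per line) into a single string
with open('/Users/../words.txt') as f:
    passage = f.read()
# Current attempt: matches a T followed later by another T on the same line
words = re.findall(r'T.+T', passage)
print(words)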

Upvotes: 2

Views: 16505

Answers (2)

RomanPerekhrest

Reputation: 92854

Use word boundary anchor \b and non-whitespace character \S:

words = re.findall(r'\bT\S+T\b', passage)

This will also match words such as Trust-TesT, Tough&FasT, etc., because \S accepts any non-whitespace character, including punctuation.
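As a quick check, here is a minimal sketch of how this pattern behaves. The sample string is made up for illustration and only stands in for the contents of words.txt; it is not taken from the question.

```python
import re

# Hypothetical sample standing in for words.txt (one word per line).
passage = "TEST\nTENT\nTABLE\nTrust-TesT\nCAT\nTT\n"

# \b anchors at word boundaries; \S+ needs at least one non-whitespace
# character between the first and the last T.
words = re.findall(r'\bT\S+T\b', passage)
print(words)  # ['TEST', 'TENT', 'Trust-TesT']
```

Note that \S+ requires at least one character between the two T's, so a bare "TT" is not matched here; \S* would allow it if that is wanted.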

Upvotes: 2

Jean-François Fabre

Reputation: 140168

I'd use this expression:

re.findall(r"\bT\w*?T\b", s)
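  • use word boundaries (\b) so the match starts and ends at the edges of a word
  • use any number of \w characters so the match cannot span spaces or newlines
  • use "non-greedy" mode (probably not needed here, since the closing word boundary already pins the end of the match)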
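To compare this with the r'^T.*T$' attempt from the question, here is a small sketch on a made-up sample string (the words are illustrative, not from the original file). It assumes the data is one word per line, as described in the question. Without the re.MULTILINE flag, ^ and $ only anchor at the start and end of the whole string, which is likely why that pattern returned nothing.

```python
import re

# Hypothetical sample standing in for the file contents (one word per line).
s = "TEST\nTENT\nTABLE\nTrust-TesT\nTT\nTOAST\n"

# Word-boundary version: \w cannot cross '-' so only the "TesT" part matches.
by_boundary = re.findall(r"\bT\w*?T\b", s)

# Line-anchored version: re.MULTILINE makes ^ and $ match at every line.
by_line = re.findall(r"^T.*T$", s, flags=re.MULTILINE)

# Same pattern without the flag: ^ and $ refer to the whole string only.
no_flag = re.findall(r"^T.*T$", s)

print(by_boundary)  # ['TEST', 'TENT', 'TesT', 'TT', 'TOAST']
print(by_line)      # ['TEST', 'TENT', 'Trust-TesT', 'TT', 'TOAST']
print(no_flag)      # []
```

Which version fits depends on the data: the \w form treats hyphenated entries as separate words, while the line-anchored form treats each line as one word.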

Upvotes: 6
