Reputation: 341
I don't do much text searching, and I haven't been able to find the regex that matches all words starting with T and ending with T in a text file where each word is on its own line. I've tried a number of suggestions from searches; the code below finds words that start with T and have another T somewhere after it. However, I want matches only where the LAST letter is also T, irrespective of how many T's occur in between. Apologies if this is trivial, but after every combination I could find I still have no result. I am also unsure why r'^T.*T$' doesn't work.
import re
with open('/Users/../words.txt') as f:
    passage = f.read()
words = re.findall(r'T.+T', passage)
print(words)
Upvotes: 2
Views: 16505
Reputation: 92854
Use the word boundary anchor \b and the non-whitespace character class \S:
words = re.findall(r'\bT\S+T\b', passage)
This also matches words such as Trust-TesT, Tough&FasT, etc.
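To make that behaviour concrete, here is a minimal sketch run on a made-up sample; the `passage` string is a hypothetical stand-in for the contents of words.txt, not the asker's real data. Note that `\S+` needs at least one character between the two T's, so a bare "TT" is skipped.

```python
import re

# Hypothetical sample standing in for words.txt (one word per line).
passage = "TEST\nTrust-TesT\nTough&FasT\nTREE\nTT\n"

# \b anchors at word boundaries; \S+ consumes one or more non-whitespace
# characters, so punctuation such as '-' or '&' is allowed inside a match.
words = re.findall(r'\bT\S+T\b', passage)

print(words)  # ['TEST', 'Trust-TesT', 'Tough&FasT']
```

If two-letter matches like "TT" should count as well, `\S*` in place of `\S+` would allow zero characters in between.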
Upvotes: 2
Reputation: 140168
I'd use this expression:
re.findall(r"\bT\w*?T\b", s)
\w is used to avoid matching spaces in between.
Upvotes: 6
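As for why r'^T.*T$' finds nothing: by default ^ and $ anchor only at the start and end of the whole string returned by f.read(), and . does not match a newline, so the pattern cannot fit inside a multi-line passage. Passing re.MULTILINE makes the anchors apply to each line. A small sketch, again with a hypothetical `passage` in place of the real file:

```python
import re

# Hypothetical sample standing in for words.txt (one word per line).
passage = "TEST\nTOAST\nTREE\nART\nTT\nTIGHT\n"

# Default flags: ^ and $ anchor at the ends of the whole string only,
# and . stops at newlines, so no line can satisfy the pattern.
no_flag = re.findall(r'^T.*T$', passage)

# re.MULTILINE: ^ and $ also match at the start and end of every line.
per_line = re.findall(r'^T.*T$', passage, flags=re.MULTILINE)

print(no_flag)   # []
print(per_line)  # ['TEST', 'TOAST', 'TT', 'TIGHT']
```

Since each word sits on its own line, the anchored pattern with re.MULTILINE matches whole lines, which is stricter than the \b versions when a line contains punctuation or several words.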