Reputation: 13
import re

word_list = []
# open text file
with open('words') as f:
    for line in f:
        # pull out all 3-letter words using a regular expression and add them to word_list
        word_list += re.findall(r'\b(\w{3})\b', line)
I use this to find all 3-letter words in a dictionary file. From there, I want to add a question mark to the beginning of each word. I assume I need the re.sub function, but I can't seem to get the syntax right.
Upvotes: 1
Views: 473
Reputation: 6912
You can use re.sub, where \1 refers to the first capture group:
re.sub(r'\b(\w{3})\b', r'?\1', line)
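As a rough illustration of that call, here is a minimal sketch run on a single line. The sample string is made up for the example and is not from the question. Note that re.sub returns a new string and leaves `line` itself unchanged.

```python
import re

# Hypothetical sample line, used only for illustration
line = "the quick brown fox called bob"

# \1 in the replacement stands for whatever the first capture group matched,
# so each standalone 3-letter word gets a literal "?" put in front of it
prefixed = re.sub(r'\b(\w{3})\b', r'?\1', line)

print(prefixed)  # ?the quick brown ?fox called ?bob
```

This gives back the whole line with the short words marked, so it fits best when you want the modified text and not a list of words.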
Upvotes: 1
Reputation: 142156
You can do this a few ways. One is to collect all your 3-letter words first and then update them afterwards; another is to carry on along the lines of what you're already doing and extend a list as you go. There's not really a need for re.sub here if you want to end up with a list of 3-letter words prefixed with ?.
Sample words file:
the quick brown fox called bob jumped over the lazy dog
and went straight to bed
cos bob needed to sleep right now
Sample code:
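import re

word_list = []
with open('words') as fin:
    for line in fin:
        # find every standalone 3-letter word on this line
        matches = re.findall(r'\b(\w{3})\b', line)
        # prefix each match with ? and append it to the running list
        word_list.extend(f'?{word}' for word in matches)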
Sample word_list after run:
['?the',
'?fox',
'?bob',
'?the',
'?dog',
'?and',
'?bed',
'?cos',
'?bob',
'?now']
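The other route mentioned in this answer, gathering the words first and updating them afterwards, could be sketched roughly as below. This assumes the text is already in memory as a string (here the first line of the sample file), and the name `words` is only illustrative.

```python
import re

# First line of the sample file, held in memory instead of read from disk
text = "the quick brown fox called bob jumped over the lazy dog"

# Step 1: collect every standalone 3-letter word
words = re.findall(r'\b\w{3}\b', text)

# Step 2: update them afterwards by prefixing each one with "?"
word_list = ['?' + word for word in words]

print(word_list)  # ['?the', '?fox', '?bob', '?the', '?dog']
```

Both routes end up with the same list; the two-step version just keeps the unprefixed words around in case you need them as well.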
Upvotes: 1
Reputation: 37367
First compile the pattern and keep a reference to it:
pattern = re.compile(r'\b(\w{3})\b')
and then use it inside the loop like this:
word_list += ['?' + word for word in pattern.findall(line)]
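For a compiled pattern to be usable, it has to be stored in a variable, and search only returns a single match object, not a string, so findall is the closer fit here. A minimal sketch under those assumptions follows. The name `pattern` and the in-memory `lines` list standing in for the file are hypothetical.

```python
import re

# Compile once and keep the compiled object so it can be reused per line
pattern = re.compile(r'\b(\w{3})\b')

# Stand-in for the lines of the words file
lines = ["the quick brown fox", "went straight to bed"]

word_list = []
for line in lines:
    # pattern.findall returns the captured 3-letter words as plain strings
    word_list += ['?' + word for word in pattern.findall(line)]

print(word_list)  # ['?the', '?fox', '?bed']
```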
Upvotes: 0