Reputation: 1185
I have a problem I'd be very grateful for help with.
Specifically, I have a gigantic text file; I need to replace specific strings in it with entries from a dictionary. Usefully, the words I need to replace are named in sequential fashion: 'Word1', 'Word2', ... , 'Wordn'.
Now, I'd like to write a 'for' loop that loops across the file, and for all instances of 'Wordx' replaces it with dictionary[x]. The problem, of course, is that 'Wordx' requires the 'x' part to function as a variable, which (so far as I know) can't be done inside a string.
Does anyone have a workaround? I tried looking at regular expressions, but found nothing obvious (possibly because I also found them somewhat confusing).
(Note that when I generate the text file, I have complete control over the form the words I want to replace can take: i.e., it need not be 'Word11'; it can be 'Wordeleven' or 'wordXI' or anything ASCII at all.)
Edit: To add more detail, as requested: my text file is an export of the JavaScript behind a survey file. The original survey software only allows me to enter text prompts one at a time (as opposed to piping them in from a CSV), but I have several thousand text prompts to enter (the words). My plan is to manually enter about 100 words ('Word1', ..., 'Word100'), export the survey JavaScript as a text file, write a script to replace the words with dictionary entries, import the resulting files, and join them into a new survey.
However, the issue remains whether I can use the number portion of a string as a variable to loop across.
Upvotes: 0
Views: 147
Reputation: 1531
I suppose the text file you are talking about looks like this:
Hi! This is word1
I like to swim, word2 and word3 ....
If so, you can read the file line by line, split each line into words, and replace each matching word with a value from the dictionary, whose keys are the numbers at the end of the words.
Here is the code:
from __future__ import print_function

replacements = {1: 'Aravind', 2: 'eat', 3: 'play'}  # renamed to avoid shadowing the built-in dict

def word_gen(file):
    for line in file:
        for word in line.split():
            # match 'word' followed only by digits, e.g. word1 or word12
            if word.startswith('word') and word[4:].isdigit():
                # drop int() if the keys are strings like {'1': 'Mark', ...}
                print(replacements[int(word[4:])], end=" ")
            else:
                print(word, end=" ")
        print()  # end the current output line

with open('re.txt', 'r') as f:
    word_gen(f)
Now redirect the terminal output to another file with:
python replace.py > replaced.txt
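If you would rather write the result straight to a file than redirect the terminal output, something along these lines might work. It is only a sketch of the same split-and-look-up idea: the function names, the `prefix` parameter, and the handling of trailing punctuation (so that 'word2,' still matches) are my own assumptions, not part of the code above.

```python
import string

def replace_line(line, mapping, prefix="word"):
    """Replace tokens such as 'word12' (optionally followed by punctuation)."""
    out = []
    for token in line.split():
        core = token.rstrip(string.punctuation)  # 'word2,' -> 'word2'
        tail = token[len(core):]                 # punctuation that was stripped
        number = core[len(prefix):]
        if core.startswith(prefix) and number.isdecimal() and int(number) in mapping:
            out.append(mapping[int(number)] + tail)
        else:
            out.append(token)                    # leave everything else untouched
    return " ".join(out)

def replace_file(src_path, dst_path, mapping):
    """Copy src_path to dst_path, replacing placeholders line by line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(replace_line(line, mapping) + "\n")
```

Note that, like the original, this collapses runs of whitespace to single spaces, which may or may not matter for your export.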
Hope that helps :)
Upvotes: 1
Reputation: 8585
With re.sub(), you can pass it a function instead of a replacement string. This function can look up the replacement from a dictionary. For example:
import re

d = {'0': 'foo', '1': 'bar', '2': 'baz'}
print(re.sub(r'word(\d+)',
             lambda match: d[match.group(1)],  # look up the captured digits
             "Hello word0, this is word2. How is word1?"))
# Output: Hello foo, this is baz. How is bar?
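For the 'Word1' ... 'Wordn' placeholders in the question, the same idea could be sketched roughly as follows. The names `PLACEHOLDER` and `fill_placeholders` are made up for illustration, and I am assuming integer keys and that a placeholder with no dictionary entry should be left as it is.

```python
import re

# \b keeps 'Word1' from matching inside a longer identifier;
# \d+ takes all the digits, so 'Word10' is never mistaken for 'Word1'.
PLACEHOLDER = re.compile(r"\bWord(\d+)\b")

def fill_placeholders(text, mapping):
    """Replace each WordN with mapping[N]; unknown numbers are left unchanged."""
    def lookup(match):
        key = int(match.group(1))
        return mapping.get(key, match.group(0))
    return PLACEHOLDER.sub(lookup, text)

sample = "var a = 'Word1'; var b = 'Word10'; var c = 'Word99';"
print(fill_placeholders(sample, {1: "apple", 10: "banana"}))
# Output: var a = 'apple'; var b = 'banana'; var c = 'Word99';
```

Since the whole file is processed in one pass, you can read it with `f.read()`, call the function once, and write the result back out.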
Upvotes: 5
Reputation: 22544
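text = "Word1 likes Word2."          # sample text; in practice, the file contents
dictionary = {1: 'Alice', 2: 'tea'}  # sample replacements keyed by number
n = len(dictionary)                  # count down so 'Word1' never matches inside 'Word10'
while n >= 1:
    replace_str = 'Word' + str(n)
    # find and replace all instances of replace_str in the file text
    text = text.replace(replace_str, dictionary[n])
    n -= 1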
Does that framework solve your needs? A string is not a variable: a string is a value that can be computed at run time, while a variable is a name, which (usually) is not computed. With more difficulty you can also build strings like 'WordEleven' and so on.
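To illustrate the spelled-out variant, here is a rough sketch. `NUMBER_NAMES` is a hypothetical lookup table that only goes up to twelve, and both function names are invented for the example; a real file with thousands of entries would need a proper number-to-words routine, which is the "more difficulty" part.

```python
# Hypothetical table of number names; extend it as far as you need.
NUMBER_NAMES = ["Zero", "One", "Two", "Three", "Four", "Five", "Six",
                "Seven", "Eight", "Nine", "Ten", "Eleven", "Twelve"]

def placeholder(n, spelled=False):
    """Build 'Word11' or 'WordEleven' from the number 11."""
    return "Word" + (NUMBER_NAMES[n] if spelled else str(n))

def replace_spelled(text, mapping):
    """Replace spelled-out placeholders with the values in mapping."""
    names = {placeholder(n, spelled=True): value for n, value in mapping.items()}
    # Longest names first, so a short name can never clobber part of a longer one.
    for name in sorted(names, key=len, reverse=True):
        text = text.replace(name, names[name])
    return text

print(replace_spelled("WordOne meets WordEleven", {1: "cat", 11: "dog"}))
# Output: cat meets dog
```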
Upvotes: 2