Lodore66

Reputation: 1185

Can I loop through a text file when values are strings?

I have a problem I'd be very grateful for help with.

Specifically, I have a gigantic text file; I need to replace specific strings in it with entries from a dictionary. Usefully, the words I need to replace are named in sequential fashion: 'Word1', 'Word2', ... , 'Wordn'.

Now, I'd like to write a 'for' loop that loops across the file, and for all instances of 'Wordx' replaces it with dictionary[x]. The problem, of course, is that 'Wordx' requires the 'x' part to function as a variable, which (so far as I know) can't be done inside a string.

Does anyone have a workaround? I tried looking at regular expressions, but found nothing obvious (possibly because I also found them somewhat confusing).

(Note that when I generate the text file, I have complete control over the form the words I want to replace can take: i.e., it need not be 'Word11'; it can be 'Wordeleven' or 'wordXI' or anything ASCII at all.)

Edit: To add more detail, as requested: my text file is an export of the JavaScript behind a survey file. The original survey software only allows me to enter text prompts one at a time (as opposed to piping them in from a CSV), but I have several thousand text prompts to enter (the words). My plan is to manually enter about 100 words ('Word1', ..., 'Word100'), export the survey JavaScript as a text file, write a script to replace the words with dictionary entries, import the resulting files, and join them into a new survey.

However, the issue remains whether I can use the number portion of a string as a variable to loop across.

Upvotes: 0

Views: 147

Answers (3)

Aravind Voggu

Reputation: 1531

I suppose the text file you are talking about looks like this:

Hi! This is word1

I like to swim, word2 and word3 ....

If so, you can read the file line by line, split each line into words, and replace every placeholder with a value from the dictionary, whose keys are the integer suffix of the placeholder.

Here is the code:

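from __future__ import print_function

# Maps the numeric suffix of each placeholder to its replacement text.
# (Named 'replacements' so it does not shadow the built-in dict.)
replacements = {1: 'Aravind', 2: 'eat', 3: 'play'}

def word_gen(file):
    for line in file:
        for word in line.split():
            suffix = word[4:]
            # Match 'word' followed only by digits, so 'word10' works too.
            if word.startswith('word') and suffix.isdigit():
                # Remove int() if the keys are strings like {'1': 'Mark', ...}
                print(replacements[int(suffix)], end=" ")
            else:
                print(word, end=" ")
        print()  # end the current output line


with open('re.txt', 'r') as f:
    word_gen(f)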

Now redirect the terminal output to another file with:

python replace.py > replaced.txt
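If you would rather not rely on shell redirection, the script can write the output file itself. The following is only a sketch of the same split-and-look-up idea; the names `replace_words` and `replace_file` are made up for illustration, and it assumes placeholders are separated from other text by whitespace.

```python
def replace_words(line, replacements):
    """Return the line with every 'word<N>' token swapped for replacements[N]."""
    out = []
    for word in line.split():
        suffix = word[4:]
        if word.startswith("word") and suffix.isdigit():
            out.append(replacements[int(suffix)])
        else:
            out.append(word)
    # Note: split()/join() collapses runs of whitespace into single spaces.
    return " ".join(out)


def replace_file(src_path, dst_path, replacements):
    """Copy src_path to dst_path, replacing placeholders line by line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(replace_words(line, replacements) + "\n")
```

Calling `replace_file('re.txt', 'replaced.txt', {1: 'Aravind', 2: 'eat', 3: 'play'})` would then produce the same file as the redirect above. A placeholder with punctuation attached (for example `word2,`) is not recognised by this approach.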

Hope that helps :)

Upvotes: 1

z0r

Reputation: 8585

With re.sub(), you can pass it a function instead of a replacement string. This function can look up the replacement from a dictionary. For example:

import re

d = {'0': 'foo', '1': 'bar', '2': 'baz'}
# The lambda receives each match and returns its replacement text.
print(re.sub(r'word(\d+)',
             lambda match: d[match.group(1)],
             "Hello word0, this is word2. How is word1?"))

Hello foo, this is baz. How is bar?
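Adapted to the question's setup, the same idea might look like the sketch below. It assumes the placeholders are spelled 'Word' followed by digits and that the dictionary uses integer keys; the sample data and the names `lookup` and `replace_placeholders` are hypothetical.

```python
import re

# Hypothetical data; the real dictionary would hold the survey prompts.
dictionary = {1: "apple", 2: "river", 11: "mountain"}

# \b stops 'Word1' from matching inside a longer identifier such as 'MyWord1'.
PLACEHOLDER = re.compile(r"\bWord(\d+)\b")


def lookup(match):
    key = int(match.group(1))
    # Leave the placeholder untouched if the dictionary has no entry for it.
    return dictionary.get(key, match.group(0))


def replace_placeholders(text):
    return PLACEHOLDER.sub(lookup, text)


sample = 'prompt("Word1"); prompt("Word11"); prompt("Word99");'
print(replace_placeholders(sample))
```

Because `\d+` consumes the whole number, 'Word11' is looked up as 11 and is never mistaken for 'Word1' followed by a '1'. For a real file, read its contents into a string, pass the string through `replace_placeholders`, and write the result out.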

Upvotes: 5

Rory Daulton

Reputation: 22544

n = 1
done = False
while not done:
    replace_str = 'Word' + str(n)
    # find and replace all instances of replace_str in the file text
    # set done = True once there is nothing left to replace
    n += 1

Does that framework meet your needs? A string is not a variable: a string is a value, which can be computed, while a variable is a name, which (usually) is not computed. With more effort you can also build strings like 'WordEleven' and so on.
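One way to fill in that framework is sketched below, assuming the dictionary has integer keys and string values; the function name `replace_numbered` and the sample data are invented for illustration. Looping over the dictionary's keys replaces the `done` flag.

```python
def replace_numbered(text, dictionary, prefix="Word"):
    # Work from the largest number down, so that 'Word1' is not replaced
    # inside 'Word10' before 'Word10' itself has been handled.
    for n in sorted(dictionary, reverse=True):
        replace_str = prefix + str(n)  # the string is computed from n
        text = text.replace(replace_str, dictionary[n])
    return text


sample = "Word1 and Word10 and Word2"
print(replace_numbered(sample, {1: "cat", 2: "dog", 10: "bird"}))
```

The descending order matters with plain `str.replace`: going upward from 1 would turn 'Word10' into 'cat0'. If a replacement value could itself contain something like 'Word3', a single-pass approach would be safer than repeated replacement.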

Upvotes: 2
