Reputation: 167
I'm using the 're' library to replace occurrences of different strings in multiple files. The replacement pattern works fine, but I'm not able to maintain the changes to the files. I'm trying to get the same functionality that comes with the following lines:
with open(KEY_FILE, mode='r', encoding='utf-8-sig') as f:
    replacements = csv.DictReader(f)
    user_data = open(temp_file, 'r').read()
    for col in replacements:
        user_data = user_data.replace(col[ORIGINAL_COLUMN], col[TARGET_COLUMN])
data_output = open(f"{temp_file}", 'w')
data_output.write(user_data)
data_output.close()
The key line here is:
user_data = user_data.replace(col[ORIGINAL_COLUMN], col[TARGET_COLUMN])
It carries each change forward by assigning the result of the replace method back to user_data, so every replacement builds on the previous one.
I need to do the same but with the 're' library:
with open(KEY_FILE, mode='r', encoding='utf-8-sig') as f:
    replacements = csv.DictReader(f)
    user_data = open(temp_file, 'r').read()
    a = open(f"{test_file}", 'w')
    for col in replacements:
        original_str = col[ORIGINAL_COLUMN]
        target_str = col[TARGET_COLUMN]
        compiled = re.compile(re.escape(original_str), re.IGNORECASE)
        result = compiled.sub(target_str, user_data)
    a.write(result)
I only end up with the last item in the .csv dict changed in the output file. Can't seem to get the changes made in previous iterations of the for loop to persist.
I know that it is pulling from the same file each time... which is why it is getting reset each loop, but I can't sort out a workaround.
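For reference, the symptom can be reproduced without any files. This is only a minimal sketch: the pairs list and the sample text are made-up placeholders standing in for the CSV rows and the file contents.

```python
import re

# Hypothetical stand-ins for the CSV rows and the file contents.
pairs = [("aaa", "bbb"), ("xxx", "yyy")]
user_data = "here is aaa some text xxx"

# Each pass starts again from the untouched user_data, so earlier
# substitutions are thrown away and only the last pair survives.
for original_str, target_str in pairs:
    compiled = re.compile(re.escape(original_str), re.IGNORECASE)
    result = compiled.sub(target_str, user_data)

print(result)  # here is aaa some text yyy
```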
Thanks
Upvotes: 0
Views: 544
Reputation: 11114
Try something like this?
#!/usr/bin/env python3
import csv
import re
import sys
from io import StringIO
KEY_FILE = '''aaa,bbb
xxx,yyy
'''
TEMP_FILE = '''here is aaa some text xxx
bla bla aaaxxx
'''
ORIGINAL_COLUMN = 'FROM'
TARGET_COLUMN = 'TO'
user_data = StringIO(TEMP_FILE).read()
with StringIO(KEY_FILE) as f:
    reader = csv.DictReader(f, ['FROM','TO'])
    for row in reader:
        original_str = row[ORIGINAL_COLUMN]
        target_str = row[TARGET_COLUMN]
        compiled = re.compile(re.escape(original_str), re.IGNORECASE)
        # Feed the updated string back in so the changes accumulate.
        user_data = compiled.sub(target_str, user_data)
sys.stdout.write("modified user_data:\n" + user_data)
Some things to note:
Your loop did result = sub(..., user_data) rather than result = sub(..., result). You want to keep updating the same string, rather than always applying to the original.
I used StringIO versions inline and print to stdout; hopefully that's easy enough to translate back to your real code (:
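Translated back to real files, it might look roughly like the sketch below. The function name apply_replacements is made up for illustration, and it assumes the key CSV has a header row naming the two columns and that the text file is UTF-8. It also passes a function as the replacement instead of the plain string, because re.sub interprets backslashes in a string replacement; that is an extra precaution, not something the original code did.

```python
import csv
import re


def apply_replacements(key_path, text_path, original_column, target_column):
    """Apply every CSV row's replacement to the text file, ignoring case.

    Hypothetical helper: assumes a header row in the CSV and UTF-8 text.
    """
    with open(text_path, 'r', encoding='utf-8') as f:
        user_data = f.read()

    with open(key_path, mode='r', encoding='utf-8-sig', newline='') as f:
        for row in csv.DictReader(f):
            compiled = re.compile(re.escape(row[original_column]), re.IGNORECASE)
            target_str = row[target_column]
            # A function replacement inserts target_str literally, so any
            # backslashes in the CSV are not treated as escapes or group refs.
            # Reassigning user_data keeps the earlier replacements.
            user_data = compiled.sub(lambda m: target_str, user_data)

    # Write once, after all replacements have been applied.
    with open(text_path, 'w', encoding='utf-8') as f:
        f.write(user_data)
    return user_data
```

With the names from the question, the call would be apply_replacements(KEY_FILE, temp_file, ORIGINAL_COLUMN, TARGET_COLUMN). Reading the whole file first and writing it once at the end avoids reopening the file inside the loop.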
Upvotes: 1