Reputation: 87
I have a CSV file from which I create a list:
import csv

with open('old_id_new_id.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=',')
    result = [[row['oldid'], row['newid']] for row in reader]
print(result)
The resulting list contains several elements like this:
result = [['e000001_kuttenberger_religionsfrieden_tschech', 'pa000001-0020'],
          ['e000001_kuttenberger_religionsfrieden_dt', 'pa000001-0021']]
I have an XML file of the following structure:
<struct label="Kuttenberger Religionsfrieden (1485)" order="2">
    <view file="e000001_kuttenberger_religionsfrieden_einleitung" label="Einleitung"/>
    <view file="e000001_kuttenberger_religionsfrieden_tschech" label="Quellentext"/>
    <view file="e000001_kuttenberger_religionsfrieden_dt" label="Deutsche Übersetzung"/>
</struct>
How do I open this file and replace the string result[0][0] with result[0][1]?
Simply put, the following doesn't work:
with open('struct.xml', 'rb') as file:
    for line in file:
        if str(result[0][0]) in line:
            line.replace(str(result[0][0]), str(result[0][1]))
Any hints?
Upvotes: 4
Views: 102
Reputation: 521249
You could build a dictionary that maps each search term to its replacement, and a regex alternation of all the search terms. Then apply re.sub to each line with that alternation, and in a callback look up each match in the dictionary to find its replacement. Note that the file has to be opened in text mode for this to work, since a str pattern cannot be applied to bytes.
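import re

result = (['e000001_kuttenberger_religionsfrieden_tschech', 'pa000001-0020'], ['e000001_kuttenberger_religionsfrieden_dt', 'pa000001-0021'])
terms = dict(result)  # maps each old id to its new id
# Escape each term so it is matched literally, then join them into one alternation
regex = r'\b(?:' + '|'.join(re.escape(x[0]) for x in result) + r')\b'
# Text mode instead of 'rb'; the utf-8 encoding is an assumption about the file
with open('struct.xml', encoding='utf-8') as file:
    for line in file:
        line = re.sub(regex, lambda m: terms[m.group()], line)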
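The snippet above does not write the modified lines anywhere, so here is a rough sketch of how the whole round trip might look. It also shows why the loop in the question fails: in Python 3, testing a str against a line read in 'rb' mode raises a TypeError, and str.replace returns a new string rather than changing line in place. The helper names replace_ids and rewrite_file, the output path and the utf-8 encoding are my own assumptions, not something from the question.

```python
import re

def replace_ids(text, pairs):
    """Replace every old id in text with its new id in a single pass."""
    terms = dict(pairs)
    if not terms:
        return text  # nothing to replace; avoids building an empty alternation
    # Longest ids first, so a longer id is preferred over a shorter one it contains
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(r'\b(?:' + '|'.join(re.escape(t) for t in ordered) + r')\b')
    # The callback looks up whichever id matched and returns its replacement
    return pattern.sub(lambda m: terms[m.group()], text)

def rewrite_file(src, dst, pairs, encoding='utf-8'):
    """Read src as text, replace the ids, and write the result to dst.

    Hypothetical helper; the utf-8 default is an assumption about the file.
    """
    with open(src, encoding=encoding) as f:
        text = f.read()
    with open(dst, 'w', encoding=encoding) as f:
        f.write(replace_ids(text, pairs))

result = [['e000001_kuttenberger_religionsfrieden_tschech', 'pa000001-0020'],
          ['e000001_kuttenberger_religionsfrieden_dt', 'pa000001-0021']]
sample = '<view file="e000001_kuttenberger_religionsfrieden_dt" label="Deutsche Übersetzung"/>'
print(replace_ids(sample, result))
# prints: <view file="pa000001-0021" label="Deutsche Übersetzung"/>
```

With the list from the question you would then call something like rewrite_file('struct.xml', 'struct_new.xml', result). Writing to a second file rather than overwriting struct.xml keeps the original around in case a replacement goes wrong. If the ids could also show up outside the file attributes, parsing the XML and rewriting only that attribute would probably be the safer route.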
Upvotes: 2