kev1807

Reputation: 87

How do I find strings in a file and replace with another in Python?

I have a CSV file from which I create a list:

import csv

with open('old_id_new_id.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=',')
    result = [[row['oldid'], row['newid']] for row in reader]
    print(result)

This result list contains several elements like this:

result = [['e000001_kuttenberger_religionsfrieden_tschech', 'pa000001-0020'], 
          ['e000001_kuttenberger_religionsfrieden_dt', 'pa000001-0021']]

I have an XML file of the following structure:

<struct label="Kuttenberger Religionsfrieden (1485)" order="2">
    <view file="e000001_kuttenberger_religionsfrieden_einleitung" label="Einleitung"/>
    <view file="e000001_kuttenberger_religionsfrieden_tschech" label="Quellentext"/>
    <view file="e000001_kuttenberger_religionsfrieden_dt" label="Deutsche Übersetzung"/>
</struct>

How do I open this file and replace the string result[0][0] with result[0][1]?

Simply put, the following doesn't work:

    with open('struct.xml', 'rb') as file:
        for line in file:
            if str(result[0][0]) in line:
                line.replace(str(result[0][0]), str(result[0][1]))

Any hints?

Upvotes: 4

Views: 102

Answers (1)

Tim Biegeleisen

Reputation: 521249

You could build a dictionary that maps each search term to its replacement, and a regex alternation of all the search terms. Then apply re.sub to each line with that alternation, and in a callback look up each match in the dictionary to find its replacement.

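import re

result = (['e000001_kuttenberger_religionsfrieden_tschech', 'pa000001-0020'],
          ['e000001_kuttenberger_religionsfrieden_dt', 'pa000001-0021'])
terms = dict(result)  # old id -> new id
# re.escape makes each search term match literally inside the alternation
regex = r'\b(?:' + '|'.join(re.escape(x[0]) for x in result) + r')\b'

# Open in text mode: a str pattern cannot be applied to the bytes that 'rb' yields
with open('struct.xml', encoding='utf-8') as file:
    for line in file:
        line = re.sub(regex, lambda m: terms[m.group()], line)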
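
Note that the loop only rebinds the name line, so nothing is saved unless each rewritten line is also written somewhere. Python strings are immutable, so str.replace and re.sub both return a new string rather than changing the original. A minimal end-to-end sketch follows. It assumes the XML is UTF-8 and is read in text mode, and that the output goes to a second file. The helper names build_replacer and rewrite_file and the output name struct_new.xml are made up for illustration.

```python
import re


def build_replacer(pairs):
    """Return a function that swaps every old id in a string for its new id.

    `pairs` is an iterable of [old_id, new_id] items, like `result` above.
    (Hypothetical helper, not part of any library.)
    """
    terms = dict(pairs)
    # Escape each old id so it is matched literally, then join into one alternation.
    alternation = '|'.join(re.escape(old) for old in terms)
    pattern = re.compile(r'\b(?:' + alternation + r')\b')
    # The callback looks up the matched old id and returns its replacement.
    return lambda text: pattern.sub(lambda m: terms[m.group()], text)


def rewrite_file(src, dst, pairs, encoding='utf-8'):
    """Copy src to dst line by line, applying the replacements (hypothetical helper)."""
    replace = build_replacer(pairs)
    with open(src, encoding=encoding) as infile, \
         open(dst, 'w', encoding=encoding) as outfile:
        for line in infile:
            # re.sub returns a new string, so the result has to be written out.
            outfile.write(replace(line))
```

With the list from the question, this could be called as rewrite_file('struct.xml', 'struct_new.xml', result). Writing to a separate file keeps the original intact until the output has been checked.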

Upvotes: 2
