DanEng

Reputation: 428

Python: using regex to replace a list of words

I would like to replace a list of words using Python's re module, but I have not succeeded despite many attempts.

# full_list.txt
# Tab is the delimiter.
# The left column holds the words to search for.
# The right column holds their replacements.

%!あ [×啞]    @#あ [啞]
%!あい きょう [\(愛嬌\)・\(愛▷敬\)]   @#あいきょう [愛嬌]
    .
    .
    .

My code is as follows:

import re

with open('full_list.txt', 'r', encoding='utf-8') as f:
    search_list = [line.strip().split('\t')[0] for line in f]

with open('full_list.txt', 'r', encoding='utf-8') as f:
    replace_list = [line.strip().split('\t')[1] for line in f]

with open('document.txt', 'r', encoding='utf-8') as f:
    content = f.read()

def replace_func(x, content):
    content = re.sub(search_list[x], replace_list[x], content)
    return content

x = 0
while x < 30:
    content = replace_func(x, content)
    x+=1

with open('new_document.txt', 'w', encoding='utf-8') as f:
    f.write(content)

After running the code, some words are replaced while others are not. What could be wrong with the code?

Upvotes: 0

Views: 766

Answers (1)

Daniel

Reputation: 42778

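Your search strings contain characters such as [ and ] that re.sub treats as pattern syntax (for example, [×啞] is a character class that matches a single × or 啞), so some entries never match the literal text. If you only want to replace literal text, don't use regular expressions; use the str.replace method instead: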

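# Load (search, replacement) pairs; skip lines that lack a tab-separated pair.
with open('full_list.txt', 'r', encoding='utf-8') as f:
    rows = (line.strip().split('\t') for line in f)
    search_and_replace = [row for row in rows if len(row) == 2]

with open('document.txt', 'r', encoding='utf-8') as f:
    content = f.read()

# str.replace matches the search text literally, so [ ] ( ) need no escaping.
for search, repl in search_and_replace:
    content = content.replace(search, repl)

with open('new_document.txt', 'w', encoding='utf-8') as f:
    f.write(content)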
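
If you do want to stay with re.sub (for example, to add word boundaries later), here is a minimal sketch of the same idea, assuming every entry in the list should match literally. The helper name replace_all and the sample text are made up for illustration and are not from the question. It escapes each search string with re.escape and passes the replacement through a function so that backslashes in it are not interpreted:

```python
import re

def replace_all(content, pairs):
    """Replace each search string literally, in list order, using re.sub."""
    for search, repl in pairs:
        # re.escape makes metacharacters such as [ ] ( ) match literally.
        # Using a function as the replacement keeps backslashes in repl
        # from being read as group references or escape sequences.
        content = re.sub(re.escape(search), lambda m, r=repl: r, content)
    return content

# Hypothetical sample: one pair taken from the question's list, plus made-up text.
pairs = [("%!あ [×啞]", "@#あ [啞]")]
text = "見出し %!あ [×啞] 本文"

# Unescaped, [×啞] is a character class, so the literal text is not matched
# and the document comes back unchanged.
print(re.sub(pairs[0][0], pairs[0][1], text) == text)

# Escaped, the entry is found and replaced.
print(replace_all(text, pairs))
```

For plain literal replacement, str.replace above is still the simpler choice; the escaped version is mainly useful if you later need to combine the literal text with real pattern syntax.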

Upvotes: 2
