Seb

Reputation: 49

Python: Can't replace special/foreign characters in text file

I have a text file named g.txt that holds a large collection of German words. I want to replace the characters äÄöÖüÜß. I started with ü and want to replace it with its HTML numeric character reference, &#252;. The code below throws no error, but the ü is not replaced. Replacing ordinary letters with the same code works; only the German umlaut fails.

reading_file = open("g.txt", "r")

new_file_content = ""
for line in reading_file:
  stripped_line = line.strip()
  new_line = stripped_line.replace("ü", "&#252")
  new_file_content += new_line +"\n"
reading_file.close()

writing_file = open("g.txt", "w")
writing_file.write(new_file_content)
writing_file.close()

Any help appreciated!

Upvotes: 0

Views: 71

Answers (2)

mtdot

Reputation: 312

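You should open the file with an explicit UTF-8 encoding. Without the encoding argument, open() falls back to the platform's default encoding, which may not be UTF-8 (for example on Windows), so the ü in the file may not decode to the character you are searching for. Try this code: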

reading_file = open("abc.data", "r", encoding="utf8")

new_file_content = ""
for line in reading_file.readlines():
    stripped_line = line.strip()
    new_line = stripped_line.replace("ü", "&#252;")
    new_file_content += new_line +"\n"
reading_file.close()

writing_file = open("abc.data2", "w", encoding="utf8")
writing_file.write(new_file_content)
writing_file.close()
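
The question lists seven characters (äÄöÖüÜß), so the same idea can be extended to all of them in one pass. The following is only a minimal sketch: the helper name escape_german_chars and the sample string are made up for illustration, and it assumes the text has already been read as UTF-8 as shown above.

```python
# Hypothetical helper, not part of the original answer.
GERMAN_CHARS = "äÄöÖüÜß"

# str.maketrans accepts a dict that maps single characters to replacement
# strings; ord() gives the decimal code point used in the reference.
TABLE = str.maketrans({ch: f"&#{ord(ch)};" for ch in GERMAN_CHARS})


def escape_german_chars(text: str) -> str:
    """Replace each listed character with its HTML numeric character reference."""
    return text.translate(TABLE)


print(escape_german_chars("Grüße aus Köln"))  # Gr&#252;&#223;e aus K&#246;ln
```

Building the table from ord() avoids typing the seven numbers by hand, and all other characters pass through unchanged.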

Upvotes: 1

saquintes

Reputation: 1089

You may need to use the Unicode escape of the character you want to replace. For ü that is \u00fc (code point 252 in decimal, FC in hex):

new_line = stripped_line.replace("\u00fc", "&#252;")
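
To check which escape belongs to ü, note that its code point is 252 in decimal, which is FC in hexadecimal, so the escape is \u00fc. One further possibility, which is an assumption about the file and not something the question confirms, is that the ü is stored in decomposed form (a plain u followed by a combining diaeresis). In that case a replace on the single-character form finds nothing. A rough sketch that normalizes first (replace_u_umlaut is a hypothetical name):

```python
import unicodedata

composed = "\u00fc"     # ü as a single code point (decimal 252)
decomposed = "u\u0308"  # u followed by U+0308 COMBINING DIAERESIS

print(ord(composed), hex(ord(composed)))  # 252 0xfc
print(composed == decomposed)             # False: they look alike but differ


def replace_u_umlaut(text: str) -> str:
    # NFC composes "u" + U+0308 into the single code point U+00FC,
    # so one replace call then covers both spellings.
    text = unicodedata.normalize("NFC", text)
    return text.replace("\u00fc", "&#252;")


print(replace_u_umlaut("f" + decomposed + "r"))  # f&#252;r
```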

Upvotes: 0
