Nikwin

Reputation: 6756

lxml Changing Unicode Characters

I am using lxml to read through an XML file and change a few details. However, even if I just use lxml to read the file and then write it out again, as below:

from lxml import etree

fil = 'iTunes Music Library.XML'
tre = etree.parse(fil)
tre.write('temp.xml')

I find Queensrÿche converted to Queensr&#255;che. Does anyone know how to fix this?

Upvotes: 3

Views: 1570

Answers (1)

Denis Otkidach

Reputation: 33200

Change your last line to:

tre.write('temp.xml', encoding='utf-8')

Otherwise lxml writes the XML in ASCII encoding by default, so it has to escape every non-ASCII character as a numeric character reference.
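To see the difference without touching the real library file, here is a minimal sketch. It uses `etree.tostring`, which takes the same `encoding` argument as `write`. The one-element document is a made-up stand-in for the iTunes file, and the fallback import to the standard library's ElementTree (which escapes the same way) is only there so the snippet also runs where lxml is not installed.

```python
# Prefer lxml; fall back to the standard library's ElementTree, whose
# tostring() takes the same encoding argument, if lxml is not installed.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Hypothetical one-element document standing in for the iTunes library file.
source = "<artist>Queensrÿche</artist>".encode("utf-8")
root = etree.fromstring(source)

# Default serialization is ASCII, so ÿ (U+00FF) is written as a
# numeric character reference.
ascii_bytes = etree.tostring(root)

# With an explicit UTF-8 encoding the character is written as raw bytes.
utf8_bytes = etree.tostring(root, encoding="utf-8")

print(ascii_bytes)
print(utf8_bytes)

# Both forms are the same XML: parsing either gives back the original text.
assert etree.fromstring(ascii_bytes).text == "Queensrÿche"
assert etree.fromstring(utf8_bytes).text == "Queensrÿche"
```

Note that the escaped output is still valid XML and any parser reads it back as ÿ, so the ASCII file is not corrupted. Passing `encoding='utf-8'` only matters if you want the literal characters in the file itself.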

Upvotes: 7
