Reputation: 266
I have a problem writing to a file in Python 3.x: the write call inside my for loop writes Czech characters incorrectly, even though everything is set to UTF-8. I am new to Python, but I have set my IDE and the .py and .xml files to 'utf-8' encoding, and I have no idea why the output file looks like this. My code:
# -*- coding: utf-8 -*-
from lxml import etree
from io import BytesIO
import sys
import codecs

f = open('uzivatelska_prirucka.xml', 'rb')
fo = open('try.xml', 'wb', 1)
header = '<?xml version="1.0" encoding="utf-8"?>\n<root>\n'
fo.write(bytes(header, 'UTF-8'))
some_file_like_object = f
tree = etree.parse(some_file_like_object)
root = tree.getroot()
node = tree.xpath('/prirucka/body/p')
for a in node:
    for b in a.getiterator():
        if not (b.find('r') is None):
            text = etree.tostring(b.find('r'))
            fo.write(bytes(str(text), 'UTF-8'))
Thanks for your help and advice.
Upvotes: 1
Views: 2295
Reputation: 72
Is it necessary to read and write in binary mode?
An XML file is plain text, so you can handle it like any other text file.
Also, in Python 3 every str is a Unicode string; there is no separate ASCII string type, so you can write a string to the output file whether or not it contains non-ASCII characters.
I also see no need to open the file in binary mode to use it with the lxml.etree package.
Try opening the files in text mode (drop the b from the mode string) and see if it works, but remember to tell open to use utf-8 encoding:
f = open('uzivatelska_prirucka.xml', 'r', encoding='utf-8')
fo = open('try.xml', 'w', 1, encoding='utf-8')
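To see where the odd output can come from, here is a minimal sketch. It assumes the standard library's xml.etree.ElementTree as a stand-in for lxml.etree (both accept an encoding argument to tostring), and the sample element and temporary file path are made up for illustration. It shows that calling str() on the bytes returned by tostring keeps the b'...' wrapper and the ASCII character references, while asking for 'unicode' gives a str that can be written in text mode.

```python
import os
import tempfile
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree (assumption)

# Hypothetical sample element containing Czech text.
elem = ET.fromstring("<r>Příručka uživatele</r>")

# Default serialization returns ASCII bytes, so non-ASCII characters
# are replaced by numeric character references such as &#345;.
ascii_bytes = ET.tostring(elem)

# str() on a bytes object returns its repr, including the b'...' wrapper.
wrapped = str(ascii_bytes)

# encoding="unicode" returns a str with the characters intact.
text = ET.tostring(elem, encoding="unicode")

# Write the str in text mode and let open() do the UTF-8 encoding.
path = os.path.join(tempfile.mkdtemp(), "try.xml")
with open(path, "w", encoding="utf-8") as fo:
    fo.write(text)

# Read it back to confirm the characters survived.
with open(path, encoding="utf-8") as fi:
    round_trip = fi.read()
```

In the question's loop, this would correspond to replacing the bytes(str(text), 'UTF-8') call with a plain write of the 'unicode' serialization to a file opened in text mode.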
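As a side note, the check
if not (b.find('r') is None):
is more idiomatically written as:
if b.find('r') is not None:
Do not shorten it to if b.find('r'): because an element with no child elements is treated as false even when find() did return it. None itself is always false in an if clause, so Python skips the block when find() returns None: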
$ python3.3
Python 3.3.1 (default, Apr 17 2013, 22:30:32)
[GCC 4.7.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> print(1) if None else print(0)
0
>>> print(1) if not None else print(0)
1
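One caveat when testing the result of find(): an element's truth value follows its number of child elements, so the explicit comparison with None is the safer check. A small sketch, again assuming the standard library's xml.etree.ElementTree in place of lxml.etree and a made-up document shaped like the one in the question:

```python
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree (assumption)

# Hypothetical document: one <p> with an <r> child, one without.
root = ET.fromstring(
    "<prirucka><body><p><r>text</r></p><p/></body></prirucka>"
)

found = []
for p in root.findall("./body/p"):
    r = p.find("r")
    # <r> holds text but no child elements, so len(r) == 0 and a bare
    # truth test on it would skip it. Compare against None instead.
    if r is not None:
        found.append(r.text)

# Number of child elements of the <r> that was found.
r_children = len(root.find("./body/p/r"))
```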
Have fun coding ;)
Upvotes: 1