Chris Hall
Chris Hall

Reputation: 931

Fix newlines when writing UTF-8 to Text file in python

I'm at my wits end on this one. I need to write some Chinese characters to a text file. The following method works however the newlines get stripped so the resulting file is just one super long string.

I tried inserting every known unicode line break that I know of and nothing. Any help is greatly appreciated. Here is snippet:

import codecs   
file_object = codecs.open( 'textfile.txt', "w", "utf-8" )
xmlRaw = (data to be written to text file )    
newxml = xmlRaw.split('\n')
for n in newxml:
    file_object.write(n+(u'2424'))# where \u2424 is unicode line break    

Upvotes: 7

Views: 31144

Answers (3)

Ferd
Ferd

Reputation: 1341

The easiest way to do that is using the combination of "\r\n" as marc_a said.

So, your code should look like this:

import codecs   
file_object = codecs.open( 'textfile.txt', "w", "utf-8" )
xmlRaw = (data to be written to text file )    
newxml = xmlRaw.split('\n')
for n in newxml:
    file_object.write(n+u"\r\n")

Upvotes: 0

marc_a
marc_a

Reputation: 21

I had the same problem to the same effect (wit's end and all). In my case, it was not an encoding issue, but the need to replace every '\n' with '\r\n', which led to better understanding the difference between line-feeds and carriage returns, and the fact that Windows editors often require \r\n for line breaks: 12747722

Upvotes: 0

rtxndr
rtxndr

Reputation: 932

If you use python 2, then use u"\n" to append newline, and encode internal unicode format to utf when you write it to file: file_object.write((n+u"\n").encode("utf")) Ensure n is of type unicode inside your loop.

Upvotes: 4

Related Questions