Reputation: 227
I have written a program to generate a Unicode text file to upload to a website. I prototyped this file in Microsoft Access and found that the website rejects it as not a Unicode text file when it is encoded as UTF-8. If I instead save the file from a text editor (Windows Notepad) with the plain "Unicode" option in the encoding drop-down box, the website accepts it just fine.
So, with this context in mind, I've written a program to autogenerate the file using pandas and DataFrames. The last line of my program exports the DataFrame to a text file:
coa1.to_csv('0000-2951-test.txt', index=False, sep='\t', encoding='utf-8')
This generates the right file, but when I open it in Notepad, the encoding listed is ANSI, not Unicode. How do I write my DataFrame to a Unicode file? And which kind of Unicode does Notepad mean by "Unicode" when it has no additional qualifier?
Upvotes: 3
Views: 2786
Reputation: 16037
"Unicode" is not one particular encoding. It is a character set that several encodings can represent, including UTF-8, UTF-16, and UTF-32.
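In Windows, "Unicode" often means UTF-16 (Notepad's plain "Unicode" option saves UTF-16 little-endian with a byte order mark), so it is possible that this website expects UTF-16. Try encoding your CSV as UTF-16 and see whether Notepad reports it as "Unicode" and whether the website accepts it.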
coa1.to_csv('0000-2951-test.txt', index=False, sep='\t', encoding='utf-16')
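To see what actually changes between the two encodings, here is a minimal sketch that encodes one made-up tab-separated row with Python's built-in codecs. The row contents and the helper name `encode_row` are illustrative assumptions, not taken from the original program. For ASCII-only text, plain `utf-8` output is byte-for-byte the same as an ANSI file, which is probably why Notepad reported ANSI; the `utf-16` codec prepends a byte order mark (BOM) and uses two bytes per character.

```python
import codecs


def encode_row(row, encodings=("utf-8", "utf-8-sig", "utf-16")):
    """Return {encoding name: encoded bytes} for one line of text."""
    return {name: row.encode(name) for name in encodings}


if __name__ == "__main__":
    sample = "1000\tCash\r\n"  # made-up row, ASCII only
    boms = (codecs.BOM_UTF8, codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
    for name, data in encode_row(sample).items():
        # bytes.startswith accepts a tuple of candidate prefixes
        has_bom = data.startswith(boms)
        print(f"{name:10} {len(data):3} bytes  BOM={has_bom}  {data[:8].hex()}")
```

If the website turns out to accept UTF-8 only when a BOM is present, `utf-8-sig` would be another encoding name worth trying, but that is a guess about the site, not something confirmed here.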
EDIT: Instead of Windows Notepad, I suggest using a more capable text editor for your prototyping, one that lets you set the encoding precisely. I would use Sublime Text or Notepad++.
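If you would rather not rely on any editor's label, you can also check the file from Python by looking at its first bytes. The sketch below is only an assumption-laden illustration: `sniff_bom` and `write_sample` are hypothetical helpers, and `write_sample` stands in for the `to_csv` call by writing a small made-up file through Python's own `open()`. A file without a BOM returns `None`, which says nothing about whether it is UTF-8 or ANSI.

```python
import codecs
import os
import tempfile

# UTF-32 BOMs come first because the UTF-32 LE BOM begins with the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(path):
    """Return the encoding implied by the file's BOM, or None if it has none."""
    with open(path, "rb") as f:
        head = f.read(4)
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    return None


def write_sample(path, encoding):
    # Hypothetical stand-in for DataFrame.to_csv(path, sep="\t", encoding=encoding)
    with open(path, "w", encoding=encoding, newline="") as f:
        f.write("account\tname\r\n1000\tCash\r\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for enc in ("utf-8", "utf-8-sig", "utf-16"):
            path = os.path.join(tmp, enc + ".txt")
            write_sample(path, enc)
            print(enc, "->", sniff_bom(path))
```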
Upvotes: 5