Poker Prof

Reputation: 169

Python: Strange Characters Appear When Writing Out to a Text File

I have some simple code here that should output two columns of characters to a text file.

infile = open('anything.txt', 'r')
outfile = open('some.txt', 'w')
f = infile.readlines()
data=[]
a=['1','2','3']
b=['5','6','7']
for i in a:
  for j in b:
    outfile.write(i + "\t" + j + "\n")

When I open the resulting text file with the standard Notepad, I get these strange characters! ऱਵऱਸ਼ऱ਷लਵलਸ਼ल਷ळਵळਸ਼ळ਷

However, when I open the text file with Notepad++ or WordPad, the result is two columns of numbers with a tab between them, as expected.

I am really lost here. What is going on? Can't I open the text file with the standard Notepad?

Thanks for your help.

Upvotes: 4

Views: 8196

Answers (3)

Chris Kent

Reputation: 871

I believe it is a bug with Notepad.

Notepad is interpreting the data in the file as Unicode (UTF-16) when it is actually ASCII. The first two characters are 1 and tab; their ASCII hex values are 31 and 09. If Notepad mistakes the file for little-endian UTF-16, it reads those two bytes as a single 16-bit value, 0x0931, and displays the one character that matches: http://www.unicodemap.org/details/0x0931/index.html (You can see that this matches the first character in your string of "strange characters".)
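The misreading described above can be reproduced in a few lines. This is only a sketch in Python 3 (not the Python 2.4 from the question), and it assumes the file's lines end with a bare "\n", which is what the characters shown in the question imply:

```python
# Build the same bytes the question's loops would write (assuming "\n" line endings).
a = ['1', '2', '3']
b = ['5', '6', '7']
raw = "".join(i + "\t" + j + "\n" for i in a for j in b).encode("ascii")

# Reinterpret every pair of bytes as one little-endian 16-bit code unit,
# which is what happens if the file is mistaken for UTF-16.
misread = raw.decode("utf-16-le")

print(raw[:2].hex())         # the first two bytes: "1" and tab
print(hex(ord(misread[0])))  # the single code point they are read as
print(len(raw), len(misread))
```

Each four-byte line ("digit, tab, digit, newline") collapses into two characters, which is why 36 bytes show up as 18 strange characters.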

This is a well-known bug in Notepad and even has its own humorously titled page on Wikipedia: http://en.wikipedia.org/wiki/Bush_hid_the_facts

You can force Notepad to open the file correctly by picking the right character encoding in the encoding drop-down (ANSI in this case). But it may be better to use another text editor if you want to see the correct values of data in text files.
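Another possible workaround, not part of the original answer, is to write a byte order mark at the start of the file so the editor does not have to guess the encoding. The sketch below is Python 3 and will not run as-is on Python 2.4; the temporary directory is only a hypothetical output location for illustration:

```python
import os
import tempfile

# Hypothetical output location, used here only so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "some.txt")

# The "utf-8-sig" codec writes the UTF-8 byte order mark (EF BB BF) first.
with open(path, "w", encoding="utf-8-sig", newline="\n") as outfile:
    for i in ['1', '2', '3']:
        for j in ['5', '6', '7']:
            outfile.write(i + "\t" + j + "\n")

# Read the raw bytes back to confirm the mark precedes the data.
with open(path, "rb") as check:
    head = check.read(5)
print(head)
```

The trade-off is that other tools reading the file may need to be told to skip the mark.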

Upvotes: 3

Levon

Reputation: 143017

I don't have this problem. What version of Python are you using, and on what OS?

You should explicitly close your files when you are done.

infile.close()
outfile.close()

Better yet, consider using with, as it will close your files for you "automagically" when you are done or when an exception is encountered:

with open('data.txt') as infile, open('some.txt', 'w') as outfile:

With versions of Python before 2.7, which do not accept several files in a single with statement, you have to break this down into two:

with open('data.txt') as infile:  # default mode is "read" if not specified
  with open('some.txt', 'w') as outfile:

(Given that you mentioned you use Python v2.4, with won't work for you, since it was introduced in v2.5. Still, it's good to know about it.)
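For completeness, here is a sketch of the question's loops rewritten around with. It assumes Python 2.6 or later (or Python 3), and it leaves out the unused input file so that it runs without anything.txt being present:

```python
a = ['1', '2', '3']
b = ['5', '6', '7']

# The file is closed automatically when the block exits, even on an exception.
with open('some.txt', 'w') as outfile:
    for i in a:
        for j in b:
            outfile.write(i + "\t" + j + "\n")

# Read the file back to check what was written.
with open('some.txt') as result:
    lines = result.read().splitlines()
print(lines[0])
```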

I get this output:

1   5
1   6
1   7
2   5
2   6
2   7
3   5
3   6
3   7

Also, note you are not using these three lines in your program at all:

infile = open('anything.txt', 'r')
f = infile.readlines()
data=[]

Upvotes: 2

Karl Bartel

Reputation: 3434

Different editors might assume different character encodings. This would explain why some editors show the result correctly.

Upvotes: 0
