How to read and write Arabic text files in Python

I am a beginner in Python and I am using Python 2.7.3. I am trying to read an Arabic text file so that my program can process it, but it prints unreadable output.

This is the code I ran:

>>> fname = open (r"C:\Python27\نجود.txt ", "rb")
>>> text = fname.read()
>>> print text
ï»؟ط§ظ„ط³ظ„ط§ظ… ط¹ظ„ظٹظƒظ… ط£ظ†ط§ ط¨طµط¯ط¯ طھط¬ط±ط¨ط© ظ‡ط°ط§ 
ط§ظ„ط¨ط±ظ†ط§ظ…ط¬ ظپظٹ ط¨ط§ظٹط«ظˆظ†. ط¨ط§ظٹط«ظˆظ† ط±ط§ط¦ط¹ ظˆط¬ظ…ظٹظ„, ``ظˆظ„ظƒظ† طھط¬ط±ط¨ط© ط¨ط§ظٹط«ظˆظ† ظ…ط¹ ط§ظ„ط¹ط±ط¨ظٹ ط³طھظƒظˆظ† ظ…ط®طھظ„ظپط©!. ط¨ط§ظٹط«ظˆظ† ط±ط§ط¦ط¹ ظˆظٹط³طھط­ظ‚ ط§ظ„طھط¬ط±ط¨ط©.

I tried several solutions. First I called fname.encoding() (and also encode), but it did not work and gave me this error:

text= fname.encoding()
TypeError: 'NoneType' object is not callable

I also tried putting # encoding: utf-8 at the top of the code file, but it made no difference.

I also tried this:

fname = open (r"C:\Python27\نجود.txt ", "r", encoding='utf-8')

but it gave me this error:

fname = open (r"C:\Python27\نجود.txt ", "r", encoding='utf-8')
 TypeError: 'encoding' is an invalid keyword argument for this function

Any suggestions? Thanks in advance.

Upvotes: 1

Views: 3276

Answers (2)

Burhan Khalid

Reputation: 174642

First, you need to open the file with the right encoding. Arabic text on Windows is usually windows-1256, but it can also be UTF-8.

In Python 2.7.3 the built-in open() does not accept an encoding argument, so use io.open instead:

import io

with io.open(r"C:\Python27\نجود.txt ", "r", encoding="utf-8") as f:
    for line in f:
        print(line)
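
If you are not sure which of the two encodings the file uses, one option is to try them in order. The following is only a minimal sketch: read_text, the sample string, and the temporary file names are made up for illustration. It tries "utf-8-sig" first, which also strips the byte order mark that some Windows editors put at the start of UTF-8 files, and falls back to "cp1256" (Python's name for windows-1256).

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def read_text(path, encodings=("utf-8-sig", "cp1256")):
    """Return the file's text, trying each encoding in order (hypothetical helper)."""
    for enc in encodings:
        try:
            with io.open(path, "r", encoding=enc) as f:
                return f.read()
        except UnicodeDecodeError:
            continue  # wrong guess, try the next encoding
    raise ValueError("none of the encodings could decode %r" % path)


# Sample text and temporary files, used only to exercise the helper.
sample = u"السلام عليكم"
tmp_dir = tempfile.mkdtemp()

# Writing works the same way: io.open in "w" mode encodes unicode text for you.
utf8_path = os.path.join(tmp_dir, "utf8_bom.txt")
with io.open(utf8_path, "w", encoding="utf-8-sig") as f:
    f.write(sample)

cp1256_path = os.path.join(tmp_dir, "cp1256.txt")
with io.open(cp1256_path, "w", encoding="cp1256") as f:
    f.write(sample)

from_utf8 = read_text(utf8_path)      # decoded as UTF-8, BOM removed
from_cp1256 = read_text(cp1256_path)  # UTF-8 fails, cp1256 succeeds
```

cp1256 should stay last in the list: it accepts any byte sequence, so it never raises an error and would hide a wrong guess if it were tried first.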

Upvotes: 0

Alfe

Reputation: 59486

Reading from a file returns a str which, in Python 2, is an arbitrary byte string (it might be a UTF-8 encoded string of Unicode characters, but it could also be binary data such as the contents of a JPG file).

If you know that it is a UTF-8 encoded string of characters, you should decode it:

decoded = text.decode('utf8')

This produces a unicode object, which is a string of Unicode characters. Once you work with that, Python will try to do everything properly. For example, depending on your terminal, printing it should work as expected:

print decoded
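
To illustrate why the output in the question looks the way it does, here is a small sketch. The sample word is my own and is not read from the actual file. The leading ï»؟ in that output is what the UTF-8 byte order mark (bytes EF BB BF) looks like when displayed as windows-1256, which suggests the file is UTF-8 and the raw bytes are being shown in the Arabic Windows code page.

```python
# -*- coding: utf-8 -*-
original = u"السلام"

# The bytes as they would be stored in a UTF-8 file (two bytes per Arabic letter).
raw = original.encode("utf-8")

# Decoding with the codec that matches the file gives the text back.
decoded = raw.decode("utf-8")

# Decoding the same bytes as windows-1256 turns every byte into its own
# character, which gives the kind of garbage shown in the question.
garbled = raw.decode("cp1256")
```

Here garbled has twice as many characters as original and starts with ط§ظ„, the same pattern as the unreadable output above.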

Upvotes: 1
