ankit

Reputation: 61

How to read non-ASCII characters in Python 2.7

I know this is probably a very common problem with many solutions already posted, but I have not been able to find one that fits my case. Could someone please point me to a duplicate post, or explain how to fix it?

I need to read source data that contains both ASCII and non-ASCII characters (I need help with Python 2.7). After reading it, I need to do some comparisons on the source data and then write it to a target file.

import csv

with open('read.txt', "r") as file:
    reader = csv.reader(file, delimiter='\t')
    for lines in reader:
        LST_NM = (lines[0])
    print(LST_NM)

My source file is read.txt:

"Abràmoff"

With this non-ASCII character, my code fails with the error below: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 266: ordinal not in range(128)

Thanks!!!

Upvotes: 0

Views: 41

Answers (1)

user5386938

You'll need to determine what encoding was used to create your file. For example, if your file was written using utf-8 then you can use something like this:

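import codecs

your_encoding = 'utf-8'  # replace with the encoding your file actually uses
# codecs.open decodes each line to a unicode object as it is read
with codecs.open('read.txt', 'r', encoding=your_encoding) as f:
    for line in f:
        print repr(line)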
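
Since the question also involves splitting tab-separated fields and writing to a target file, here is a minimal sketch of that flow. It assumes the file is UTF-8 and that fields contain no quoted tabs, so a plain split is enough. It uses io.open, which accepts an encoding argument in both Python 2.7 and Python 3. The function name copy_first_column and the output path are hypothetical, not from the question.

```python
import io


def copy_first_column(src_path, dst_path, encoding='utf-8'):
    """Read tab-separated text and write the first field of each line.

    Hypothetical helper: assumes simple tab-separated lines with no
    quoted tabs, and the same encoding for input and output.
    """
    names = []
    with io.open(src_path, 'r', encoding=encoding) as src:
        for line in src:
            fields = line.rstrip(u'\r\n').split(u'\t')
            # Drop the surrounding double quotes seen in the sample file.
            names.append(fields[0].strip(u'"'))

    # Comparisons on the decoded unicode values would go here.

    with io.open(dst_path, 'w', encoding=encoding) as dst:
        for name in names:
            dst.write(name + u'\n')
    return names
```

Because every value is decoded to unicode on the way in and encoded again on the way out, comparisons in between do not trigger implicit ASCII conversions, as long as the values they are compared against are unicode too.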

Some other encodings you can try include 'cp1252', which is common on Windows, and possibly 'latin_1'.
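
If the encoding is unknown, one rough way to try the candidates above in order is sketched below. The helper name decode_with_fallback is hypothetical. Note that 'latin_1' accepts every possible byte, so a successful decode with it does not prove the guess is right; it only guarantees that no exception is raised.

```python
def decode_with_fallback(raw, encodings=('utf-8', 'cp1252', 'latin_1')):
    """Return (text, encoding_name) for the first encoding that decodes raw.

    Hypothetical helper: raw must be a byte string. The order matters,
    because strict encodings such as utf-8 should be tried before
    permissive ones such as latin_1.
    """
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError('none of the candidate encodings could decode the data')
```

For example, the bytes of "Abràmoff" saved as UTF-8 decode on the first attempt, while the same name saved as cp1252 fails the UTF-8 attempt and is picked up by the second candidate.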


Upvotes: 1
