charvi

Reputation: 211

How to check the UTF-8 equivalent value of a character in Python?

I want to know how to find the UTF-8 equivalent of a Tamil character. Is there a function for it? Can you give the syntax?

for line in f:
    words = line.strip().split()
    for word1, word2 in zip(words, words[1:]):
        if word1 == '1' and word2 == "கோடி":
            ff.write("onru\n")
            ff.write(word2+'\n')
        else:
            ff.write(word1+'\n')
            ff.write(word2+'\n')

But it gives: SyntaxError: Non-ASCII character '\xe0' in file replacement.py on line 5, but no encoding declared. So how do I read non-ASCII characters such as Tamil words, and mainly, how do I compare and check them? Thanks in advance.

Upvotes: 1

Views: 184

Answers (2)

charvi

Reputation: 211

I don't know whether it technically makes any difference, but I just removed the double quotes and replaced them with single quotes, and now my program works. It is doing the comparison correctly. I am now giving it as 'கோடி' instead of "கோடி". I tried u'கோடி, u'/கோடி, u"கோடி; all of them gave errors.
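
For what it's worth, single and double quotes are equivalent in Python, so the more dependable route is probably to decode the file on read and compare text with text. A minimal sketch, assuming the input file is UTF-8 encoded; the helper name rewrite_pairs, the constant KODI, and the temporary input.txt file are made up for illustration and are not part of the original program:

```python
# -*- coding: utf-8 -*-
import io
import os
import shutil
import tempfile

# The u prefix makes this a text (unicode) literal on Python 2 as well as 3.
KODI = u"கோடி"


def rewrite_pairs(line):
    """Apply the pairwise rule from the question to one decoded line."""
    words = line.strip().split()
    out = []
    for word1, word2 in zip(words, words[1:]):
        if word1 == u"1" and word2 == KODI:
            out.append(u"onru")
        else:
            out.append(word1)
        out.append(word2)
    return out


# Hypothetical sample input, written to a temporary directory for the demo.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "input.txt")
with io.open(path, "w", encoding="utf-8") as f:
    f.write(u"1 கோடி\n")

# io.open decodes each line to text, so the comparison is text against text.
with io.open(path, encoding="utf-8") as f:
    result = [rewrite_pairs(line) for line in f]

shutil.rmtree(tmp_dir)
```

Here result holds one list per input line, with "1" replaced by "onru" wherever it is followed by the Tamil word.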

Upvotes: 0

Valentin Lorentz

Reputation: 9753

The error happens before Python starts executing the file, because it detects a non-ASCII character in the source while no encoding is declared. (By the way, this is a Python 2-only issue, so you should probably remove the python-3.x tag from your post.)

To tell Python the file is encoded in UTF-8, you should add this on the first or second line of the file (as defined in PEP 263):

# -*- coding: utf8 -*-
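
As for the UTF-8 value of a character, calling the encode method on the text is probably the simplest route. A minimal sketch, assuming the source file itself is saved as UTF-8; the helper name utf8_hex and the variable names are illustrative only:

```python
# -*- coding: utf-8 -*-
word = u"கோடி"

# Encode the whole word to its UTF-8 byte sequence.
utf8_bytes = word.encode("utf-8")


def utf8_hex(text):
    """Return the UTF-8 bytes of text as space-separated hex pairs."""
    return " ".join("{:02x}".format(b) for b in bytearray(text.encode("utf-8")))


# Print each code point next to its UTF-8 bytes.
for ch in word:
    print("U+{:04X} -> {}".format(ord(ch), utf8_hex(ch)))
```

Every character in the Tamil block encodes to three bytes starting with 0xE0, which is the '\xe0' byte named in the SyntaxError.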

Upvotes: 1
