Reputation: 211
I want to know how to find the UTF-8 equivalent of a Tamil character. Is there a function for it? Could you give the syntax?
for line in f:
    words = line.strip().split()
    for word1, word2 in zip(words, words[1:]):
        if word1 == '1' and word2 == "கோடி":
            ff.write("onru\n")
            ff.write(word2+'\n')
        else:
            ff.write(word1+'\n')
            ff.write(word2+'\n')
But it gives: SyntaxError: Non-ASCII character '\xe0' in file replacement.py on line 5, but no encoding declared. So how do I read non-ASCII characters such as Tamil words, and, mainly, how do I compare and check them? Thanks in advance.
Upvotes: 1
Views: 184
Reputation: 211
I don't know whether it makes any technical difference, but I replaced the double quotes with single quotes and now my program works; the comparison is done correctly. I now write 'கோடி' instead of "கோடி". I also tried u'கோடி, u'/கோடி, and u"கோடி, and all of them gave errors.
Upvotes: 0
Reputation: 9753
The error happens before Python starts executing the file, because it detects non-ASCII characters in the source. (By the way, this is a Python 2-only issue, so you should probably remove the python-3.x tag from your post.)
To tell Python the file is encoded in UTF-8, you should add this at the beginning of the file (as defined in PEP 263):
# -*- coding: utf8 -*-
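With the declaration in place, the comparison from the question can be sketched roughly as below. This is only an illustration, not the asker's actual script: the helper name rewrite_pairs and the in-memory sample line are made up here, and the unicode_literals import is an assumption that keeps the literals as text on Python 2 (it has no effect on Python 3). It also shows .encode('utf-8'), which gives the UTF-8 bytes of a Tamil string.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals  # 'கோடி' becomes a text literal on Python 2 too

import io


def rewrite_pairs(lines):
    """Yield output words; '1' becomes 'onru' when the next word is 'கோடி'.

    Hypothetical helper that mirrors the loop in the question.
    """
    for line in lines:
        words = line.strip().split()
        for word1, word2 in zip(words, words[1:]):
            if word1 == '1' and word2 == 'கோடி':
                yield 'onru'
            else:
                yield word1
            yield word2


# UTF-8 bytes of a Tamil string (the "UTF-8 equivalent" asked about).
utf8_bytes = 'கோடி'.encode('utf-8')

# An in-memory stand-in for the input file, so the sketch is self-contained.
sample = io.StringIO('1 கோடி\n')
result = list(rewrite_pairs(sample))
```

For a real file, opening it with io.open(path, encoding='utf-8') on either Python version yields text lines that compare cleanly against the literal. The first byte of utf8_bytes is '\xe0', the same byte the SyntaxError complained about.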
Upvotes: 1