Python unicode decoding encoding

Question

Here is my full code. It works fine with ASCII, but as soon as "unicode" characters come into the picture... I hate my life...

I know the names in the code are not English, so let me explain:

I have 2 input files (realmek, nevek) and 1 result file (osszes).

I have a working page (HTML).

As I said, this works with ANSI characters.

BUT when I try to use special characters such as "űáéđĐ", I need to save the 2 input files and the 1 output file as Unicode. Then my program fails with an encoding/decoding error, and I know that is expected.

So my question is: how can I solve this? Where do I need to handle decoding and encoding?

I have been thinking about this for 3 days... I tried many things, like "u = unicode( s, "utf-8" )"; "$ export LANG=en_US.UTF-8"; etc., but none of them worked.
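For reference, the usual pattern for this kind of problem is to decode once when reading, work with unicode strings in between, and encode once when writing. The following is only a minimal sketch of that idea, not a drop-in fix. It assumes the files are saved as UTF-8; if they were saved with Notepad's "Unicode" option, the encoding name would be "utf-16" instead. The helper names (read_words, append_line) and the temporary demo files are hypothetical and only stand in for Nevek.txt and Osszes.txt.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Assumption: the text files are UTF-8. Use "utf-16" for files saved with
# Notepad's "Unicode" option, or "utf-8-sig" for UTF-8 files that start with a BOM.
ENCODING = "utf-8"


def read_words(path, encoding=ENCODING):
    """Decode the first line of a file and return its whitespace-separated words."""
    with io.open(path, "r", encoding=encoding) as f:
        # split() without arguments also drops the trailing newline
        return f.readline().split()


def append_line(path, text, encoding=ENCODING):
    """Encode a unicode string and append it to the file as one line."""
    with io.open(path, "a", encoding=encoding) as f:
        f.write(text + u"\n")


# Demo with temporary files standing in for the real input and output files.
tmp_dir = tempfile.mkdtemp()
names_path = os.path.join(tmp_dir, "Nevek.txt")
out_path = os.path.join(tmp_dir, "Osszes.txt")

# Write two sample names that contain the characters from the question.
with io.open(names_path, "w", encoding=ENCODING) as f:
    f.write(u"\u0171\u00e1\u00e9 \u0111\u0110\n")

nevek = read_words(names_path)   # list of unicode strings
for nev in nevek:
    append_line(out_path, nev)   # encoded back to UTF-8 on disk
```

io.open behaves the same way on Python 2 and Python 3, so the rest of the program only ever sees decoded text and never has to call unicode() or .encode() by hand.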

from urllib import urlopen
import re

faj = "hiba"
cast = "hiba"
pont = 0
szint = 0

fj = open("C:\Users\Rendszergazda\Desktop\Achievements\Realmek.txt", "r")
tombr = fj.readline()
realmek = tombr.split(" ")
fj.close()

fh = open("C:\Users\Rendszergazda\Desktop\Achievements\Nevek.txt", "r")
tomb = fh.readline()
nevek = tomb.split(" ")
fh.close()

osszes = open("C:\Users\Rendszergazda\Desktop\Achievements\Osszes.txt", "a")

for x in realmek:
    realm = x
    for y in nevek:
        nev = y
        lap = urlopen("http://eu.battle.net/wow/en/character/"+str(realm)+"/"+str(nev)+"/achievement").read()
        letezik = re.compile('')
        letez = re.findall(letezik,lap)
        if (letez != []):   
            a = 0    
        else:

            lapn = lap.split("\n")
            mapo = lapn[1087]
            pontos = re.compile('					(.*)\n')
            pont = re.findall(pontos,mapo)

            mapom = lapn[1322]
            feastn = re.compile('												(.*)\n')
            feast = re.findall(feastn,mapom)

            fajkeres = re.compile(' ')
            castkeres = re.compile(' ')
            szintkeres = re.compile('(.*)
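
The URL built inside the loop above is a second place where encoding matters: str(realm) and str(nev) fail on Python 2 as soon as the value is a unicode string with non-ASCII characters. A possible sketch is to encode each path segment to UTF-8 and percent-escape it before concatenating. This assumes the site expects UTF-8 percent-encoded names, which the question does not confirm. The function name character_url and the sample realm "Ragnaros" are hypothetical.

```python
try:
    from urllib import quote        # Python 2
except ImportError:
    from urllib.parse import quote  # Python 3

BASE_URL = "http://eu.battle.net/wow/en/character/"


def character_url(realm, nev):
    """Build the achievement URL from two unicode strings."""
    # Encode to UTF-8 bytes first, then percent-escape every non-ASCII byte.
    parts = [quote(part.encode("utf-8")) for part in (realm, nev)]
    return BASE_URL + parts[0] + "/" + parts[1] + "/achievement"


# Hypothetical sample values; the name uses characters from the question.
url = character_url(u"Ragnaros", u"\u0171\u00e1\u00e9")
```

The resulting URL is plain ASCII, so it can be passed to urlopen without any further conversion.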
