tree em

Reputation: 21781

Python encoding problem?

I read lines like the following from a SQL script file, source.sql:

INSERT INTO `Tbl_abc` VALUES (1111, 2222, 'CLEMENT', 'taya', 'MME', 'Gérant', NULL, NULL, NULL, NULL, NULL, NULL, NULL, 4688, 0, NULL, NULL, 'MAILLOT 01/02/09', 'MAILLOT 01/04/09', NULL, NULL);

Then I write them to dest.sql with my list formatted.

I ran into an encoding problem, for example:

Gérant = G\xc3\xa9rant

WHAT I AM TRYING

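import codecs

def DataMigration(dest, source, tbl_name, return_data=True):
    '''Parse the INSERT rows for tbl_name in source into lists of field strings.

    Note: dest is accepted but not used by this function yet.
    '''
    data = []
    # Iterate over the codecs reader itself (not xreadlines()) so that
    # each line is decoded from UTF-8 into a unicode string.
    for ln in codecs.open(source, 'r', "utf-8"):
        replace1 = ln.replace("INSERT INTO `" + tbl_name + "` VALUES (", "")
        replace2 = replace1.replace(");", "")
        list_replace = replace2.split(',')
        data.append(list_replace)

    if return_data:
        # Keep only the rows whose 2nd and 7th fields are ' 0'.
        outputdata = [d for d in data if d[1] == ' 0' and d[6] == ' 0']
        return outputdata
    return data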

I print the result with print DataMigration('dest.sql', 'source.sql', 'Tbl_abc', False)

OUTPUT

  [['1111', ' 2222', " 'CLEMENT'", " 'taya'", " 'MME'", " 'G\xc3\xa9rant'", ' NULL', ' NULL', ' NULL', ' NULL', ' NULL', ' NULL', ' NULL', ' 4688', ' 0', ' NULL', ' NULL', " 'MAILLOT 01/04/09'", " 'MAILLOT 01/04/09'", ' NULL', ' NULL']]


But my output file still has the problem. Could anyone help me?

Upvotes: 0

Views: 220

Answers (3)

Steve De Caux

Reputation: 1779

Store your working data internally in Python as Unicode (use decode on read), and always write out using encode.

In your case, you need to know your database's encoding to choose the correct output encoding.
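The decode-on-read, encode-on-write idea above could be sketched roughly as follows. This is only an illustration, not the asker's code: the helper name copy_decoded and its parameters are made up for the example, and it assumes the source file really is UTF-8 and that the database's encoding is passed in as out_encoding.

```python
import io

def copy_decoded(source, dest, in_encoding="utf-8", out_encoding="utf-8"):
    """Hypothetical helper: read source as text, write it to dest re-encoded."""
    # io.open decodes the file's bytes into Unicode text while reading...
    with io.open(source, "r", encoding=in_encoding) as src:
        text = src.read()
    # ...and encodes the Unicode text back into bytes while writing.
    with io.open(dest, "w", encoding=out_encoding) as dst:
        dst.write(text)
    return text
```

Between the read and the write, all processing (replace, split, filtering) works on Unicode text, so no byte sequences such as \xc3\xa9 should leak into the logic.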

Upvotes: 0

przemo_li

Reputation: 350

Hi, check the encoding of your .sql file; maybe it is not UTF-8!
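One rough way to run that check is to read the raw bytes and try to decode them. The function name is_valid_utf8 is invented for this sketch, and a successful decode only shows the bytes are valid UTF-8, not that UTF-8 is what the file's author intended.

```python
def is_valid_utf8(path):
    """Hypothetical check: True if the file's bytes decode cleanly as UTF-8."""
    # Read raw bytes so that no implicit decoding happens first.
    with open(path, "rb") as f:
        raw = f.read()
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

If this returns False for source.sql, the file is probably in another encoding such as Latin-1, and that encoding should be passed to codecs.open instead of "utf-8".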

Upvotes: 0

YOU

Reputation: 123917

Please use .encode("utf-8") when you write to the .sql file, too.

Open the file:

fileObj = codecs.open("someFile", "r", "utf-8")

Let's say you read it:

data = fileObj.read()

... do something with data ...

Then encode it when you write it out:

open("newfile", "w").write(data.encode("utf-8"))
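It may also help to see where G\xc3\xa9rant comes from. The small sketch below (written for Python 3, with variable names chosen just for illustration) shows that \xc3\xa9 is simply the two-byte UTF-8 encoding of é, so seeing it in a printed list does not by itself mean the data is corrupted.

```python
text = u"G\u00e9rant"            # the Unicode text "Gérant"
encoded = text.encode("utf-8")   # its UTF-8 byte representation

# The single character é (U+00E9) becomes the two bytes 0xC3 0xA9,
# so the text has 6 characters but the encoded form has 7 bytes.
same_bytes = (encoded == b"G\xc3\xa9rant")
round_trip = (encoded.decode("utf-8") == text)
lengths = (len(text), len(encoded))
```

Decoding those bytes as UTF-8 gives back the original text, which is why reading with the right encoding and writing with .encode("utf-8") keeps the round trip intact.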

Upvotes: 1
