Rami Dridi

Reputation: 351

Read Chinese characters from an Excel file in Python 3

I have an Excel file with two columns: the first contains Chinese text and the second is just a link. I tried two methods I found here, but neither worked and I can't print the values in the console. I changed the encoding setting in PyCharm to UTF-8, and it still doesn't work. I tried both pandas and xlrd; neither worked for me, although they worked for others who posted. This is my current code:

from xlrd import open_workbook
class Arm(object):
    def __init__(self, id, dsp_name):
        self.id = id
        self.dsp_name = dsp_name

    def __str__(self):
        return("Arm object:\n"
               "  Arm_id = {0}\n"
               "  DSPName = {1}\n"
               .format(self.id, self.dsp_name))

if __name__ == '__main__':

    wb = open_workbook('test.xls')
    for sheet in wb.sheets():
        print(sheet)
        number_of_rows = sheet.nrows
        number_of_columns = sheet.ncols

        items = []

        rows = []
        for row in range(1, number_of_rows):
            values = []
            for col in range(number_of_columns):
                value = str(sheet.cell(row, col).value)
                for a in value:
                    print('\n'.join([a]))
                values.append(value)

                print(value)
    for item in items:
        print(item)
        print("Accessing one single value (eg. DSPName): {0}".format(item.dsp_name))
        print()

Obviously it's not working; I was just experimenting with it after giving up. File: http://www59.zippyshare.com/v/UxITFjis/file.html

Upvotes: 0

Views: 1647

Answers (2)

Rami Dridi

Reputation: 351

Well, the problem I had wasn't actually in reading the Chinese characters! My problem was in printing to the console. I thought the console's print encoding worked fine and that I simply wasn't reading the characters, but this code works fine:

from xlrd import open_workbook

wb = open_workbook('test.xls')
messages = []
links = []

for sheet in wb.sheets():
    number_of_rows = sheet.nrows
    number_of_columns = sheet.ncols
    for row in range(1, number_of_rows):
        for col in range(number_of_columns):
            # Store each cell as GBK-encoded bytes.
            value = sheet.cell(row, col).value.encode('gbk')
            if col == 0:
                messages.append(value)
            else:
                links.append(value)



print(links)

To check it, I passed the first result to the Selenium driver (since I was going to use it anyway):

element = driver.find_element_by_class_name('email').send_keys(str(messages[0],'gbk'))

and it works like a charm!
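
The encode/decode round trip used above can be tried without the spreadsheet. This is only a minimal sketch, assuming the cell value is already a Python 3 `str` (which is what xlrd returns for text cells); the sample string and the helper name `safe_for_console` are made up for illustration and are not part of the original code:

```python
import sys

# Hypothetical sample standing in for a text cell value read by xlrd.
cell_value = "你好"

# Encoding to GBK yields bytes; decoding with the same codec restores the str.
encoded = cell_value.encode("gbk")
decoded = str(encoded, "gbk")
round_trip_ok = decoded == cell_value


def safe_for_console(text, encoding=None):
    """Return text with characters the target codec cannot encode escaped."""
    encoding = encoding or sys.stdout.encoding or "utf-8"
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


print(round_trip_ok)
print(safe_for_console(cell_value))
```

Because the two conversions cancel out, keeping the values as `str` and passing them to `send_keys` directly would probably behave the same. The helper only avoids a `UnicodeEncodeError` when the console's codec cannot represent the characters; it does not make them display correctly.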

Upvotes: 0

Flynn

Reputation: 41

It's not about encoding; you are not accessing the right rows.

On line 24:
for row in range(1, number_of_rows):

Why start at 1 instead of 0? That skips the first row. Try:
for row in range(number_of_rows):
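
The effect of the starting index can be seen with a plain list in place of the sheet. A small sketch, assuming the sheet has no header row; the `rows` data and the example.com links are invented for illustration:

```python
# Hypothetical rows standing in for a sheet: no header row, two data rows.
rows = [
    ["消息一", "http://example.com/1"],
    ["消息二", "http://example.com/2"],
]
number_of_rows = len(rows)

# Starting at 1 silently drops the first row.
skipping_first = [rows[r][0] for r in range(1, number_of_rows)]

# Starting at 0 (the default) visits every row.
all_rows = [rows[r][0] for r in range(number_of_rows)]

print(len(skipping_first), len(all_rows))
```

If the first row of the real file is a header, starting at 1 is the right choice; otherwise the first message is lost.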

Upvotes: 1
