sheikhsalman08
sheikhsalman08

Reputation: 424

using tablib and django-import-export to import data, getting error with excel, but with csv it's working

At first trying to read excel/csv data to check it before inserting into database, While csv working fine can read data, but xlsx and xlx showing following error

UnicodeDecodeError at /academy/add_advisor 'utf-8' codec can't decode byte 0xa1 in position 10: invalid start byte

Snippet of my code:

from tablib import Dataset
this_file = request.FILES['bulk_file']

dataset = Dataset()
imported_data = dataset.load(this_file.read().decode("utf-8"),format='xlsx')

for data in dataset:
    print(data[0], data[1], data[2], data[3], data[4])

According to error message error coming from this line

imported_data = dataset.load(this_file.read().decode("utf-8"),format='xlsx')

The excel file i am trying to import, i downloaded it from google drive excel as xlsx file(microsoft excel). Also downloaded one from onedrive(microsoft) xlsx file still getting same error.

Few more ways i have tried with are

imported_data = dataset.load(this_file.read().decode("ISO-8859-1"),format='xlsx')
imported_data = dataset.load(this_file.read().strip().decode("ISO-8859-1"),format='xlsx')
imported_data = dataset.load(this_file.read().strip().decode("CP1252"),format='xlsx')
imported_data = dataset.load(this_file.read().strip().decode("windows-1252"),format='xlsx')
imported_data = dataset.load(this_file.read().strip().decode("Latin-1"),format='xlsx')

But no luck :( Please share if there has any better way i can try :)(: Thank you for read :)

Upvotes: 0

Views: 4432

Answers (1)

willeM_ Van Onsem
willeM_ Van Onsem

Reputation: 477641

Well if your the_file is an .xlsx file, then it is probably no good idea to decode the stream manually: an .xlsx is a zipped stream of XML files. So that means it does not really follow a standard string encoding.

But the dataset.load(..) also suggests it does not per see needs a string, it needs data that it aims to load. In case of an .xlsx file, that is binary data, so we can use .read() to read the content of the file into a binary string, and thus let the dataset do the correct processing, like:

imported_data = dataset.load(this_file.read(),format='xlsx')

Upvotes: 1

Related Questions