jmrcas

Reputation: 83

Python: error reading the row and column counts of a large (90 MB) Excel file

import xlrd


def excel_count_row_col(excel_file):
    print(excel_file)
    wb = xlrd.open_workbook(excel_file)
    worksheet = wb.sheet_by_index(0)

    row_count = worksheet.nrows
    column_count = worksheet.ncols

    total_col_row = row_count * column_count
    print(f"row * column = {total_col_row}")

I'm using Python 3.7. The error raised is: BadZipFile: Bad CRC-32 for file 'xl/worksheets/sheet1.xml'

Upvotes: 1

Views: 1714

Answers (1)

Grismar

Reputation: 31319

Modern Office documents (with an extension ending in x, like .docx and .xlsx) are really just zip files containing the individual parts that make up the document.

If you rename the .xlsx to .zip, you can open the file and have a look at its structure. In the zip, you'll find a folder called xl/worksheets and in that a sheet1.xml. Apparently, that file either got corrupted, or for a file this size, xlrd runs into a problem checking it.
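
If you'd rather not rename the file, the same structure can be inspected from Python with the standard-library zipfile module. This is only a minimal sketch: the function name list_xlsx_parts is made up for illustration, and the path in the usage comment is a placeholder for your own file.

```python
import zipfile


def list_xlsx_parts(xlsx_path):
    """Return (name, uncompressed size) for every part inside the .xlsx.

    Hypothetical helper: this only reads the zip directory, so it does
    not decompress or checksum the worksheet data itself.
    """
    with zipfile.ZipFile(xlsx_path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


# Usage (placeholder path):
# for name, size in list_xlsx_parts("large_file.xlsx"):
#     print(f"{size:>12}  {name}")
```

For the file in the question, xl/worksheets/sheet1.xml should show up in that listing, most likely as the largest entry.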

The Bad CRC-32 indicates that the checksum for the file (a number computed over its contents that should always be the same if the file remains unchanged) no longer matches the file, suggesting file corruption.
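
To tell whether the file itself is damaged (rather than xlrd being at fault), you can let zipfile verify every checksum independently of xlrd. A rough sketch, assuming the file is at least recognisable as a zip archive; find_first_bad_member is a hypothetical name, not part of any library.

```python
import zipfile


def find_first_bad_member(xlsx_path):
    """Return the name of the first part whose CRC-32 check fails, else None.

    ZipFile.testzip() reads every member in full and compares the computed
    checksum with the one stored in the archive.
    """
    with zipfile.ZipFile(xlsx_path) as archive:
        return archive.testzip()


# Usage (placeholder path):
# bad = find_first_bad_member("large_file.xlsx")
# print("no CRC errors found" if bad is None else f"corrupt part: {bad}")
```

If this also reports xl/worksheets/sheet1.xml, the file on disk really is corrupted, and getting a fresh copy or re-saving it from Excel is probably the way forward. If it returns None, the problem more likely lies in how the file is being read.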

Upvotes: 1
