LING173
LING173

Reputation: 43

Why xlrd reading a text format number as a float?

I`m using xlrd to handle some data in Excel.

I have three same number data '123' in a column.

The first cell formatted as Text, the second cell formatted as General, and the last one is same as the first.

I edited the last cell, just double clicked and didn`t change any thing. So I got the Error Check tips(the small green triangle) in the last cell: 'The number in this cell is formatted as text or preceded by an apostrophe.

I save my excel file and run my code:

excel = xlrd.open_workbook(MyExcelName)
for sheet in excel.sheets():
    for i in range(sheet.nrows):
       for jin range(len(sheet.row_values(i))):
          value=str(sheet.row_values(i)[j])
          print(value)
...

And I get this:

123.0 

123.0 

123

It means xlrd read number in the text format as a float.

But what puzzles me is the third cell wasn`t read as float.

How this happen? I means, its all formatted as text, xlrd would check the Error Tips? Why don`t it just check the cell format, read text as string?

Upvotes: 3

Views: 2872

Answers (2)

hozzi.fozzi
hozzi.fozzi

Reputation: 1

Thank you for the clarifications John Y.

I realized that the 'cell format update mechanism' is only triggered in Excel when you manually update (i.e delete, then re-enter the same value/number manually). Because I tried first to copy the (many) cell values to some other temp work sheet, set the format of the (now empty) concerned cells to Text, then copy back the cell values (only, w/o the formatting, from the temp sheet) into the cleaned cell but that doesn't trigger the Excel internal format update mechanism.

Upvotes: 0

John Y
John Y

Reputation: 14529

Formatting is always purely cosmetic in Excel, and is an attribute that is stored separately and independently from the value of a cell. The value and the format do not have to be "synchronized".

The author of xlrd made the choice to always just give you the value that was stored by Excel. No matter what. That's the whole story from the xlrd side of things.

The Excel program (that is, the GUI application from Microsoft) is more complicated. If you type something that looks like it could be a number, then Excel will check the existing format of the cell. If the cell format is Text, it will store a string. Otherwise, it will store a float.

If you start with a blank sheet in Excel, all the cells will have the default format, which is General. Crucially, General is not Text, so if you type 123 in cells A1, A2, and A3 on that blank sheet, all of them will be stored as float.

If you then go back and format A1 and A3 as Text, Excel will change the format of those cells to Text. At that point, the values of all three cells are still float.

But if you then edit A3, Excel will check the format of the cell, find that it is Text, and thus store a string.

Notice that once the value of A3 is a string, if you then go back and format A3 as General, the value stays a string, even though General is not Text.

So any funny business between values, types, and formatting is happening due to Excel, and how you're entering the data. xlrd is simply faithfully reading what is there.

Upvotes: 4

Related Questions