Muthu Kumar
Muthu Kumar

Reputation: 945

Pandas read_excel - returning nan for cells having formula

I have an excel file that contains accounting data and also the file use formula for some of the cells. When I use pandas read_excel to read the values in the file, it returns nan value for cells having formula's. I have also used openpyxl, but still having the same issue.

Is there any way to read values instead of formula for cells having formula.

I have also attached the xlsx file used.

https://drive.google.com/file/d/1aOTYwtKTrqyjF16vDizzzvMkUsvdgRe1/view?usp=sharing

Thanks...

Upvotes: 5

Views: 18593

Answers (3)

Evan Toh
Evan Toh

Reputation: 71

i was facing the exact same issue as you when reading an excel file that has been appended to using excel writer. It happens because the links on the excel file is not updated hence the formulas do not calculate the value.

2 solutions that worked for me:

  1. Open the file and update links when prompted OR
  2. Add in code to refresh the file before reading it according to solution posted here. Refresh Excel External Data with Python

Hope this helps.

Edit: New user here, no idea how reject/approve works, any constructive feedback will be greatly appreciated.

Upvotes: 7

gina
gina

Reputation: 101

I don't have enough reputation points to comment so I have to put this in as an answer which it is not.

With your sample file I don't have any issues. I'm using python 3.8 and Pandas 0.25.3. Pandas reads it fine and the numbers are available. That said, my experience with reading values computed by formulas is that it depends on the complexity of the formula. Volatile functions like Index, Match will be read as nan, but more simple functions can be read in as numbers.

This is a workaround only and not a real solution, but if you can make a new column with a formula that just references the column with the complex formulas, then that column's data can be read.

Upvotes: 3

cerebral_assassin
cerebral_assassin

Reputation: 212

Before working with your excelsheet make sure you have set the permissions of your excel file to read and write,if it is a read only file,then change to read-write.

import pandas as pd
your_data = pd.read_excel('yourfile.xlsx',sheet_name='your_sheet_name')
print(your_data) #checking
your_data.dropna(inplace=True)
your_data.rename(columns = {'Unnamed: 1':'Total'},inplace =True) 
print(your_data['Total'].tolist()) #The column name where your formula is being calculated.

Upvotes: 3

Related Questions