Reputation: 128
I am a novice programmer and I'm learning python on my own.
I read the excel
df = pd.read_excel(file, sheet_name=sheet_name, header=3)
Now I would like to know which cells of the excel are calculated with a formula to replace them with a condition.
The calculated cell will have a value according to the column that is
For example,
if df[column1] (have formula_excel in cell):
this.cell = new value
I hope I have explained, thank you !!
Upvotes: 0
Views: 135
Reputation: 2670
Maybe not the perfect answer, but openpyxl may be able to shine some light on what formulae is in an Excel file (before putting it into a dataframe.
As one example (shown below) you could pre-process the Excel file to show the formulae and then get it into a dataframe.
Given:
A spreadsheet that looks like (normal view and showing formulae)...
Code:
import pandas as pd
import numpy as np
from openpyxl import load_workbook
# from openpyxl.utils import FORMULAE
df = pd.read_excel('calcs.xlsx')
print(df)
print('\n\n')
'''Shows (Col3 is all formulae but it's not showing):
Col1 Col2 Col3
0 858 3 1368.333333
1 651 1 4105.000000
2 448 5 821.000000
3 210 4 1026.250000
4 301 6 684.166667
5 671 2 2052.500000
6 420 5 821.000000
7 397 1 4105.000000
8 149 5 821.000000
'''
# Note: 'wb' and 'ws' are shorthand for workbook and worksheet
wb = load_workbook(filename = 'calcs.xlsx')
ws = wb['Sheet1']
# print Col3 (aka column C)
for cell in ws['C']:
print(cell.value)
print('\n\n')
'''Shows:
Col3
=SUM(A$2:A$10)/B2
=SUM(A$2:A$10)/B3
=SUM(A$2:A$10)/B4
=SUM(A$2:A$10)/B5
=SUM(A$2:A$10)/B6
=SUM(A$2:A$10)/B7
=SUM(A$2:A$10)/B8
=SUM(A$2:A$10)/B9
=SUM(A$2:A$10)/B10
'''
# add a single quote to the start of all cells in Col3 (aka column C)
for cell in ws['C']:
cell.value = "'" + str(cell.value)
# create a dataframe with the updated values (showing the forumulae in Col3)
df = pd.DataFrame(ws.values)
df.columns = df.iloc[0]
df = df[1:]
print(df)
'''Shows:
0 Col1 Col2 'Col3
1 858 3 '=SUM(A$2:A$10)/B2
2 651 1 '=SUM(A$2:A$10)/B3
3 448 5 '=SUM(A$2:A$10)/B4
4 210 4 '=SUM(A$2:A$10)/B5
5 301 6 '=SUM(A$2:A$10)/B6
6 671 2 '=SUM(A$2:A$10)/B7
7 420 5 '=SUM(A$2:A$10)/B8
8 397 1 '=SUM(A$2:A$10)/B9
9 149 5 '=SUM(A$2:A$10)/B10
'''
print('\n\n')
So you now have a dataframe with the formulae showing in Col3 and could further process the frame in Pandas to do if this, then etc. and eventually delete the original Col3 once you get what you need.
Upvotes: 1