royalewithcheese
royalewithcheese

Reputation: 502

How can I insert an excel formula using pandas when I save it as an excel file?

I have the following dataframe with multiple cols and rows,

A  |  B  |  C  |  D     |  E   |....
2  |  b  |  c  |  NaN   |  1   |
3  |  c  |  b  |  NaN   |  0   |
4  |  b  |  b  |  NaN   |  1   |
.
.
.

Is there a way to add excel formulas (for some columns) in the manner stated below through an example using python in an output excel file?

For instance, I want to be able to have the output something like this,

  =SUM(A0:A2)   |     |     |                             | =SUM(E0:E2)
       A        |  B  |  C  |              D              |      E
0      2        |  b  |  c  |  =IF(B0=C0, "Yes", "No")    |      1
1      3        |  c  |  b  |  =IF(B1=C1, "Yes", "No")    |      0
2      4        |  b  |  b  |  =IF(B2=C2, "Yes", "No")    |      1
.      
.
.

Final output,

       9        |     |     |         |      2
       A        |  B  |  C  |  D      |      E
0      2        |  b  |  c  |  No     |      1
1      3        |  c  |  b  |  No     |      0
2      4        |  b  |  b  |  Yes    |      1
.      
.
.

I want to add formulas in the final output excel file so that if there are any changes in the values of columns (in the final output excel file) other columns can also be updated in the excel file in real time, for instance,

       15       |     |     |         |      3
       A        |  B  |  C  |  D      |      E
0      2        |  b  |  b  |  Yes    |      1
1      9        |  c  |  b  |  No     |      1
2      4        |  b  |  b  |  Yes    |      1
.      
.
.

If I change the values of, for instance, A1 from 3 to 9, then the sum of the column changes to 15; when I change the value of C0 from "c" to "b", the value of its corresponding row value, that is, D0 changes from "No" to "Yes"; Same for col E.

I know you can use xlsxwriter library to write the formulas but I am not able to figure out as to how I can add the formulas in the manner I have stated in the example above.

Any help would be really appreciated, thanks in advance!

Upvotes: 3

Views: 12672

Answers (1)

PacketLoss
PacketLoss

Reputation: 5746

You're best doing all of your formulas you wish to keep via xlsxwriter and not pandas.

You would use pandas if you only wanted to export the result, since you want to preserve the formula, do it when you write your spreadsheet.

The code below will write out the dataframe and formula to an xlsx file called test.

import xlsxwriter
import pandas as pd

from numpy import nan

data = [[2, 'b', 'c', nan, 1], [3, 'c', 'b', nan, 0], [4, 'b', 'b', nan, 1]]
df = pd.DataFrame(data=data, columns=['A', 'B', 'C', 'D', 'E'])


## Send values to a list so we can iterate over to allow for row:column matching in formula ##

values = df.values.tolist()

## Create Workbook ##

workbook = xlsxwriter.Workbook('test.xlsx')
worksheet = workbook.add_worksheet()

row = 0
col = 0

## Iterate over the data we extracted from the DF, generating our cell formula for 'D' each iteration ##

for idx, line in enumerate(values):
    d = f'=IF(B{row + 1}=C{row + 1}, "Yes", "No")'
    a, b, c, _, e = line
   
    ## Write cells into spreadsheet ##

    worksheet.write(row, col, a)
    worksheet.write(row, col + 1, b)
    worksheet.write(row, col + 2, c)
    worksheet.write(row, col + 3, d)
    worksheet.write(row, col + 4, e)
    
    row += 1

## Write the total sums to the bottom row of the sheet utilising the row counter to specify our stop point ##        
worksheet.write(row, 0, f'=SUM(A1:A{row})')
worksheet.write(row, 4, f'=SUM(E1:E{row})')

workbook.close()

Upvotes: 6

Related Questions