Reputation: 502
I have the following dataframe with multiple cols and rows,
A | B | C | D | E |....
2 | b | c | NaN | 1 |
3 | c | b | NaN | 0 |
4 | b | b | NaN | 1 |
.
.
.
Is there a way to add excel formulas (for some columns) in the manner stated below through an example using python in an output excel file?
For instance, I want to be able to have the output something like this,
=SUM(A0:A2) | | | | =SUM(E0:E2)
A | B | C | D | E
0 2 | b | c | =IF(B0=C0, "Yes", "No") | 1
1 3 | c | b | =IF(B1=C1, "Yes", "No") | 0
2 4 | b | b | =IF(B2=C2, "Yes", "No") | 1
.
.
.
Final output,
9 | | | | 2
A | B | C | D | E
0 2 | b | c | No | 1
1 3 | c | b | No | 0
2 4 | b | b | Yes | 1
.
.
.
I want to add formulas in the final output excel file so that if there are any changes in the values of columns (in the final output excel file) other columns can also be updated in the excel file in real time, for instance,
15 | | | | 3
A | B | C | D | E
0 2 | b | b | Yes | 1
1 9 | c | b | No | 1
2 4 | b | b | Yes | 1
.
.
.
If I change the values of, for instance, A1 from 3 to 9, then the sum of the column changes to 15; when I change the value of C0 from "c" to "b", the value of its corresponding row value, that is, D0 changes from "No" to "Yes"; Same for col E.
I know you can use xlsxwriter library to write the formulas but I am not able to figure out as to how I can add the formulas in the manner I have stated in the example above.
Any help would be really appreciated, thanks in advance!
Upvotes: 3
Views: 12672
Reputation: 5746
You're best doing all of your formulas you wish to keep via xlsxwriter
and not pandas
.
You would use pandas
if you only wanted to export the result, since you want to preserve the formula, do it when you write your spreadsheet.
The code below will write out the dataframe
and formula to an xlsx
file called test
.
import xlsxwriter
import pandas as pd
from numpy import nan
data = [[2, 'b', 'c', nan, 1], [3, 'c', 'b', nan, 0], [4, 'b', 'b', nan, 1]]
df = pd.DataFrame(data=data, columns=['A', 'B', 'C', 'D', 'E'])
## Send values to a list so we can iterate over to allow for row:column matching in formula ##
values = df.values.tolist()
## Create Workbook ##
workbook = xlsxwriter.Workbook('test.xlsx')
worksheet = workbook.add_worksheet()
row = 0
col = 0
## Iterate over the data we extracted from the DF, generating our cell formula for 'D' each iteration ##
for idx, line in enumerate(values):
d = f'=IF(B{row + 1}=C{row + 1}, "Yes", "No")'
a, b, c, _, e = line
## Write cells into spreadsheet ##
worksheet.write(row, col, a)
worksheet.write(row, col + 1, b)
worksheet.write(row, col + 2, c)
worksheet.write(row, col + 3, d)
worksheet.write(row, col + 4, e)
row += 1
## Write the total sums to the bottom row of the sheet utilising the row counter to specify our stop point ##
worksheet.write(row, 0, f'=SUM(A1:A{row})')
worksheet.write(row, 4, f'=SUM(E1:E{row})')
workbook.close()
Upvotes: 6