Reputation: 1785
My python script generates a pandas dataframe containing some NaN values:
A B C D
0.351741 NaN 0.238705 NaN
0.950817 0.665594 0.671151 NaN
NaN 0.442725 0.658816 NaN
0.155604 0.567044 NaN 0.666576
NaN 0.751562 NaN 0.597252
0.577770 NaN NaN 0.123392
I want to write this df into an excel spreadsheet using xlwings. The important point is that I need to skip the cells set to NaN and preserve the corresponding excel cells. In my case, the cells that I want to preserve can contain live excel formula, values or even empty cells.
If my excel values are:
A B C D
Excel A1 Excel B1 Excel C1 Excel D1
Excel A2 Excel B2 Excel C2 Excel D2
Excel A3 Excel B3 Excel C3 Excel D3
Excel A4 Excel B4 Excel C4 Excel D4
Excel A5 Excel B5 Excel C5 Excel D5
Excel A6 Excel B6 Excel C6 Excel D6
The final output should be:
A B C D
0.351741 Excel B1 0.238705 Excel D1
0.950817 0.665594 0.671151 Excel D2
Excel A3 0.442725 0.658816 Excel D3
0.155604 0.567044 Excel C4 0.666576
Excel A5 0.751562 Excel C5 0.597252
0.577770 Excel B6 Excel C6 0.123392
Usually, I write to Excel using a simple xlwings call:
r = 'A1:D6'
sht.range(r).options(index=False, header=False).value = df
With this call, the NaN would overwrite the excel cell by an empty cell. Loosing the values that I want to preserve. In our previous example, it means that I would obtain this in Excel:
A B C D
0.351741 BLANK 0.238705 BLANK
0.950817 0.665594 0.671151 BLANK
BLANK 0.442725 0.658816 BLANK
0.155604 0.567044 BLANK 0.666576
BLANK 0.751562 BLANK 0.597252
0.577770 BLANK BLANK 0.123392
I checked xlwings documentation and its code. It seems that the parameter skip_blanks
is only implemented for the function paste()
. It is set to False
by default. The only workaround that I currently found is to use a temporary sheet where I save my df. Then, I copy to the clipboard the range where it was saved. Finally, I can paste it where I want using skip_blanks=True
obtaining the expected output:
tmp_sht.range(r).options(index=False, header=False).value = df
tmp_sht.range(r).copy(destination=None)
sht.range(r).paste(paste='values', skip_blanks=True)
I don't like to use this temporary sheet. I suppose xlwings should be able to handle a direct call using the skip_blanks
, something that would look like this:
sht.range(r).options(index=False, header=False, skip_blanks=True).value = df
Is it possible to do it?
Many thanks in advance!
Upvotes: 1
Views: 1932
Reputation: 1785
A better way to solve this issue is based on the idea of @mozway to move the whole process to python by retrieving the values we want to preserve and simply combining the two dataframes:
out = df1.combine_first(df2)
You can retrieve the formulas from the excel file using the formula
parameter of a Range
. Depending on the types of your values, you could simply replace the NaN like this:
formulas = sht.range(r).options(pd.DataFrame).formula
df = df.fillna(formulas) # replace the `NaN` with the formulas
This actually didn't work for me due to some dtypes that the function fillna
cannot handle. But I adapted it by changing the index and columns of the retrieved df to be able to use combine_first:
formulas = pd.DataFrame(list(sht.range(r).formula))
formulas.index = df.index
formulas.columns = df.columns
df = df.combine_first(formulas)
sht.range(r).options(index=False, header=False).value = df
By doing this the live excel formula are not overwritten by their values:
A B C D
0.351741 Excel B1 0.238705 Excel D1
0.950817 0.665594 0.671151 Excel D2
Excel A3 0.442725 0.658816 Excel D3
0.155604 0.567044 Excel C4 0.666576
Excel A5 0.751562 Excel C5 0.597252
0.577770 Excel B6 Excel C6 0.123392
Upvotes: 1