Reputation: 742
I have a dataframe that looks like
column1 column2 column3 colum4 column5
1 r_n_1 r_s_1 r_n_2 r_s_3 r_n_3
2 r_n_1 r_s_1 r_n_4 r_s_4 r_n_5
3 r_n_1 r_s_1 r_n_6 r_s_5 r_n_7
4 r_n_1 r_s_1 r_n_6 r_s_6 r_n_9
5 r_n_10 r_s_7 r_n_11 r_s_8 r_n_12
6 r_n_10 r_s_9 r_n_11 r_s_10 r_n_13
And I would like to merge cells in data frame so I could write in excel that could look like
So basically merge cells that have same value in excel. I am guessing I can use MultiIndex from pandas but I don't know how to do that.
my code to get this data frame is like.
new_list = []
for k1 in remove_empties_from_dict(combined_dict):
curr_dict = remove_empties_from_dict(combined_dict)[k1]
for k2 in curr_dict:
curr_dict_2=curr_dict[k2]
for k3 in curr_dict_2:
curr_dict_3=curr_dict_2[k3]
for k4 in curr_dict_3:
curr_dict_4=curr_dict_3[k4]
new_dict= {'c1': k1, 'c2': k2, 'c3': k3, 'c4': k4,'c5': curr_dict_4}
new_list.append(new_dict)
df = pd.DataFrame(new_list)
Upvotes: 3
Views: 293
Reputation: 967
I couldn't find a direct function that does the merging of cells with similar values so instead, I have written a code that does that.
print(df)
column1 column2 column3 column4 column5
0 r_n_1 r_s_1 r_n_2 r_s_3 r_n_3
1 r_n_1 r_s_1 r_n_4 r_s_4 r_n_5
2 r_n_1 r_s_1 r_n_6 r_s_5 r_n_7
3 r_n_1 r_s_1 r_n_6 r_s_6 r_n_9
4 r_n_10 r_s_7 r_n_11 r_s_8 r_n_12
5 r_n_10 r_s_9 r_n_11 r_s_10 r_n_13
This is the df I have to work with. But in order to do that what I did was, I iterated it once to check which values are similar, and replaced the with a -
. The reason I did not make it NoneType
was because the cells below the table have a NoneType
value, so the further part of the code will keep on iterating infinitely. What I did was:
for i in df.columns:
for j in range(len(df[i])):
for k in range(j+1,len(df[i])):
if df[i][j]== df[i][k]:
df[i][k]='-'
So now my df
looks something like:
print(df)
column1 column2 column3 column4 column5
0 r_n_1 r_s_1 r_n_2 r_s_3 r_n_3
1 - - r_n_4 r_s_4 r_n_5
2 - - r_n_6 r_s_5 r_n_7
3 - - - r_s_6 r_n_9
4 r_n_10 r_s_7 r_n_11 r_s_8 r_n_12
5 - r_s_9 - r_s_10 r_n_13
Now that I have all unique values in the Data Frame, I will check whether the df
element is a valid input or a -
. And the cells that are -
will be merged with its upper value. I did that by:
from openpyxl.workbook import Workbook
exportPath = r'C:\Users\T01144\Desktop\PythonExport.xlsx'
wb= Workbook()
ws=wb.active
rowInd=1
colInd=1
colList=['-', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H','I'] # Continue if there are more columns
for i in df.columns:
for j in range(0,len(df[i])):
if(df[i][j]!='-'):
ws.cell(row=rowInd,column=colInd,value=df[i][j])
else:
count=0
for l in range(j+1,len(df[i])):
count+=1
if df[i][l]!='-':
count-=1
break
ws.merge_cells(str(str(colList[colInd]+str(rowInd-1))+":"+str(colList[colInd]+str(rowInd+count))))
rowInd+=1
colInd+=1
rowInd=1
The output I have now is:
The entire code can be found here.
NOTE: Some of you may get this error after creating the Excel:
We found a problem with some content in 'PythonExport.xlsx'. Do you want us to try to recover as much as we can? If you trust the source of this workbook, click Yes.
Just ignore this error and click Yes.
Upvotes: 1