Balaji
Balaji

Reputation: 43

Adding an empty row with group wise to csv or excel file in python

How to insert an empty line to separate each group in excel or csv. Currently I am using pandas, I am not bale to do this.

CURRENT TABLE:

column1   |   column2   |  column3
----------------------------------
  A       |     23     |  blue
  A       |     23     |  orange
  A       |     45     |  yellow
  A       |     45     |  yellow
  A       |     45     |  blue
  A       |     60     |  green
  A       |     60     |  green
  A       |     75     |  pink

_

DESIRED TABLE

Note: the blank row after each distinct column1

column1   |   column2   |  column3
----------------------------------
  A       |     23     |  blue
  A       |     23     |  orange

  A       |     45     |  yellow
  A       |     45     |  yellow
  A       |     45     |  blue

  A       |     60     |  green
  A       |     60     |  green

  A       |     75     |  pink

Can any one suggest me the way how it is achievable in python.

Upvotes: 3

Views: 1751

Answers (1)

jezrael
jezrael

Reputation: 863246

You can use groupby with custom function where add last empty row. Last use to_csv with parameter index=False for ignoring index.

Notice:

Before writing to csv is df cast to string, because if add NaN row, all integer columns are converted to float

def f(x):
    x.loc[-1] = pd.Series([])
    return x
df = df.astype(str).groupby(['column1','column2'], as_index=False).apply(f)

print (df)
     column1 column2 column3
0  0       A      23    blue
   1       A      23  orange
  -1     NaN     NaN     NaN
1  2       A      45  yellow
   3       A      45  yellow
   4       A      45    blue
  -1     NaN     NaN     NaN
2  5       A      60   green
   6       A      60   green
  -1     NaN     NaN     NaN
3  7       A      75    pink
  -1     NaN     NaN     NaN

#default separator is ,
df.to_csv('file.csv', index=False)
A,23,blue
A,23,orange
,,
A,45,yellow
A,45,yellow
A,45,blue
,,
A,60,green
A,60,green
,,
A,75,pink
,,

#custom separator tab
df.to_csv('file.csv', index=False, sep='\t')
column1 column2 column3
A       23      blue
A       23      orange

A       45      yellow
A       45      yellow
A       45      blue

A       60      green
A       60      green

A       75      pink

For excel use to_excel:

df.to_excel('file.xlsx', index=False)

Upvotes: 3

Related Questions