Dance Party

Reputation: 3713

Pandas insert alternate blank rows

Given the following data frame:

import pandas as pd
import numpy as np
df1=pd.DataFrame({'A':['a','b','c','d'],
                 'B':['d',np.nan,'c','f']})
df1
    A   B
0   a   d
1   b   NaN
2   c   c
3   d   f

I'd like to insert a blank (all-NaN) row before each existing row. The desired result is:

    A   B
0   NaN NaN
1   a   d
2   NaN NaN
3   b   NaN
4   NaN NaN
5   c   c
6   NaN NaN
7   d   f

In reality, I have many rows.

Thanks in advance!

Upvotes: 6

Views: 4964

Answers (3)

piRSquared

Reputation: 294278

Use numpy and pd.DataFrame

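def pir(df):
    # Build an all-NaN block with the same shape as the data.
    nans = np.full(df.shape, np.nan)
    # Put it beside the values, then reshape so NaN rows and data rows alternate.
    data = np.hstack([nans, df.values]).reshape(-1, df.shape[1])
    return pd.DataFrame(data, columns=df.columns)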

pir(df1)
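
To see why the hstack-then-reshape step interleaves the rows, here is a minimal sketch on a small numeric array. The array `vals` is made up for illustration and stands in for `df.values`:

```python
import numpy as np

# Hypothetical 3x2 array standing in for df.values.
vals = np.array([[1.0, 2.0],
                 [3.0, 4.0],
                 [5.0, 6.0]])
nans = np.full(vals.shape, np.nan)

# Side by side, each row becomes [nan, nan, x, y] ...
wide = np.hstack([nans, vals])                   # shape (3, 4)

# ... and reshaping back to 2 columns splits every wide row in two,
# which puts an all-NaN row in front of each original row.
interleaved = wide.reshape(-1, vals.shape[1])    # shape (6, 2)
print(interleaved)
```

Because the reshape reads the wide array row by row, the NaN half of each row always lands immediately before its data half.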


Testing and Comparison

Code

def banana(df):
    df1 = df.set_index(np.arange(1, 2*len(df)+1, 2))
    df2 = pd.DataFrame(index=range(0, 2*len(df1), 2), columns=df1.columns)
    return pd.concat([df1, df2]).sort_index()

def anton(df):
    df = df.set_index(np.arange(1, 2*len(df)+1, 2))
    return df.reindex(index=range(2*len(df)))

def pir(df):
    nans = np.full(df.shape, np.nan)
    data = np.hstack([nans, df.values]).reshape(-1, df.shape[1])
    return pd.DataFrame(data, columns=df.columns)

Results

pd.concat([f(df1) for f in [banana, anton, pir]],
          axis=1, keys=['banana', 'anton', 'pir'])


Timing
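
A rough way to compare the three functions yourself is sketched below with the standard-library `timeit` module. The test frame `big` (the question's four rows repeated 250 times) and the repeat counts are arbitrary assumptions, and absolute numbers will vary with machine and pandas version:

```python
import timeit
import numpy as np
import pandas as pd

def banana(df):
    df1 = df.set_index(np.arange(1, 2*len(df)+1, 2))
    df2 = pd.DataFrame(index=range(0, 2*len(df1), 2), columns=df1.columns)
    return pd.concat([df1, df2]).sort_index()

def anton(df):
    df = df.set_index(np.arange(1, 2*len(df)+1, 2))
    return df.reindex(index=range(2*len(df)))

def pir(df):
    nans = np.full(df.shape, np.nan)
    data = np.hstack([nans, df.values]).reshape(-1, df.shape[1])
    return pd.DataFrame(data, columns=df.columns)

# Hypothetical test frame: the question's data repeated to 1,000 rows.
small = pd.DataFrame({'A': ['a', 'b', 'c', 'd'],
                      'B': ['d', np.nan, 'c', 'f']})
big = pd.concat([small] * 250, ignore_index=True)

timings = {}
for f in (banana, anton, pir):
    # Best of 3 runs, 5 calls per run, reported as seconds per call.
    best = min(timeit.repeat(lambda: f(big), number=5, repeat=3)) / 5
    timings[f.__name__] = best
    print(f"{f.__name__}: {best * 1e3:.3f} ms per call")
```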


Upvotes: 3

Anton Protopopov

Reputation: 31672

I think you could change your index like @bananafish did and then use reindex:

df1.index = range(1, 2*len(df1)+1, 2)
df2 = df1.reindex(index=range(2*len(df1)))

In [29]: df2
Out[29]:
     A    B
0  NaN  NaN
1    a    d
2  NaN  NaN
3    b  NaN
4  NaN  NaN
5    c    c
6  NaN  NaN
7    d    f
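
Note that the snippet above overwrites the index of df1 in place. If the original frame should stay untouched, the same idea can be wrapped in a function that works on a copy. This is only a sketch, and the name `insert_blank_rows` is made up for illustration:

```python
import numpy as np
import pandas as pd

def insert_blank_rows(df):
    """Return a new frame with an all-NaN row before each row of df."""
    # Work on a copy so the caller's index is not modified.
    odd = df.copy()
    # Move the original rows to the odd labels 1, 3, 5, ...
    odd.index = range(1, 2 * len(df) + 1, 2)
    # Reindexing to 0..2n-1 creates the missing even labels as NaN rows.
    return odd.reindex(range(2 * len(df)))

df1 = pd.DataFrame({'A': ['a', 'b', 'c', 'd'],
                    'B': ['d', np.nan, 'c', 'f']})
result = insert_blank_rows(df1)
print(result)
```

One caveat: integer columns are converted to float in the result, because the inserted rows hold NaN.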

Upvotes: 7

bananafish

Reputation: 2917

A bit roundabout but this works:

df1.index = range(1, 2*len(df1)+1, 2)
df2 = pd.DataFrame(index=range(0, 2*len(df1), 2), columns=df1.columns)
df3 = pd.concat([df1, df2]).sort_index()

Upvotes: 2
