Giang Do

Reputation: 93

Pandas - Insert blank row for each group in pandas

I have a dataframe:

import pandas as pd
import numpy as np
df1 = pd.DataFrame({'group': [1, 1, 2, 2, 2],
                    'value': [2, 3, np.nan, 5, 4]})
df1

    group   value
0   1       2
1   1       3
2   2       NaN
3   2       5
4   2       4

I want to add a row after each group, with NaN in the value column. The desired output is:

   group   value
0   1       2
1   1       3
2   1       NaN
3   2       NaN
4   2       5
5   2       4
6   2       NaN

In my real dataset I have many groups and more columns besides value; I want all of them to be NaN in the newly added row.

Thanks a lot for the help

Upvotes: 5

Views: 3659

Answers (4)

Scott Boston

Reputation: 153460

I wanted to get a little creative:

(pd.concat([df1, 
            df1.groupby('group')['value'].apply(lambda x: x.shift(-1).iloc[-1]).reset_index()])
    .sort_values('group')
    .reset_index(drop=True))

Output:

   group  value
0      1    2.0
1      1    3.0
2      1    NaN
3      2    NaN
4      2    5.0
5      2    4.0
6      2    NaN
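
Since the question also mentions more columns that should all be NaN in the added row, the same concat-and-sort idea generalizes if the blank rows carry only the group column; a minimal sketch, assuming the frame is still called df1 and pandas is imported as in the question:

# one blank row per group: only 'group' is filled, so every other
# column becomes NaN when the frames are aligned by concat
blanks = df1[['group']].drop_duplicates()

(pd.concat([df1, blanks], ignore_index=True)
   .sort_values('group', kind='mergesort')  # stable sort keeps the blank row last in each group
   .reset_index(drop=True))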

Upvotes: 3

rafaelc

Reputation: 59274

You can also just groupby + apply, as a one-liner:

df1.groupby('group').apply(lambda gr: gr.append(gr.tail(1).assign(value=np.nan))).reset_index(drop=True)

or, to be explicit:

g = df1.groupby('group')
def f(gr):
    n = gr.tail(1).copy()
    n.value = np.nan
    return gr.append(n)
g.apply(f).reset_index(drop=True)


    group   value
0   1       2.0
1   1       3.0
2   1       NaN
3   2       NaN
4   2       5.0
5   2       4.0
6   2       NaN
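
Note that DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so on newer versions the same idea needs pd.concat instead; a rough sketch of the explicit version above under that assumption:

parts = []
for _, gr in df1.groupby('group'):
    n = gr.tail(1).copy()
    n['value'] = np.nan               # blank out value in the extra row
    parts.append(pd.concat([gr, n]))  # pd.concat replaces the removed gr.append(n)

pd.concat(parts).reset_index(drop=True)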

Upvotes: 4

piRSquared

Reputation: 294298

My version of concat

ii = dict(ignore_index=True)
pd.concat([
    d.append({'group': n}, **ii) for n, d in df1.groupby('group')
], **ii).astype({'group': int})

   group  value
0      1    2.0
1      1    3.0
2      1    NaN
3      2    NaN
4      2    5.0
5      2    4.0
6      2    NaN
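
On pandas 2.0+, where append is gone, the dict can be wrapped in a one-row DataFrame and joined with pd.concat; a sketch under the same assumptions:

ii = dict(ignore_index=True)
pd.concat([
    pd.concat([d, pd.DataFrame([{'group': n}])], **ii) for n, d in df1.groupby('group')
], **ii).astype({'group': int})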

Upvotes: 2

user3483203

Reputation: 51155

concat with append

s = df1.groupby('group')
out = pd.concat([i.append({'value': np.nan}, ignore_index=True) for _, i in s])
out.group = out.group.ffill().astype(int)

apply with append[1]

df1.groupby('group').apply(
    lambda d: d.append({'group': d.name}, ignore_index=True).astype({'group': int})
).reset_index(drop=True)

Both produce:

   group  value
0      1    2.0
1      1    3.0
2      1    NaN
3      2    NaN
4      2    5.0
5      2    4.0
6      2    NaN

[1] This solution brought to you by your local @piRSquared
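
The pandas 2.0 caveat about append applies here as well; the first variant could be sketched with pd.concat like this, keeping the ffill on group:

s = df1.groupby('group')
out = pd.concat([pd.concat([i, pd.DataFrame({'value': [np.nan]})], ignore_index=True) for _, i in s])
out.group = out.group.ffill().astype(int)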

Upvotes: 7
