user2890059

Reputation: 145

Keep columns after a groupby in an empty dataframe

The DataFrame is empty after a query. When I call groupby on it, pandas raises a RuntimeWarning and returns another empty DataFrame with no columns. How can I keep the columns?

import pandas as pd

df = pd.DataFrame(columns=["PlatformCategory","Platform","ResClassName","Amount"])
print(df)

Result:

Empty DataFrame
Columns: [PlatformCategory, Platform, ResClassName, Amount]
Index: []

Then the groupby:

df = df.groupby(["PlatformCategory","Platform","ResClassName"]).sum()
df = df.reset_index(drop=False,inplace=True)
print(df)

Result: sometimes it is None, sometimes it is an empty DataFrame:

Empty DataFrame
Columns: []
Index: []

Why does the empty DataFrame have no columns?

The RuntimeWarning:

/data/pyrun/lib/python2.7/site-packages/pandas/core/groupby.py:3672: RuntimeWarning: divide by zero encountered in log
  if alpha + beta * ngroups < count * np.log(count):
/data/pyrun/lib/python2.7/site-packages/pandas/core/groupby.py:3672: RuntimeWarning: invalid value encountered in double_scalars
  if alpha + beta * ngroups < count * np.log(count):

Upvotes: 12

Views: 7908

Answers (2)

rleelr

Reputation: 1914

Some code that works the same for .sum() whether or not the dataframe is empty:

def groupby_sum(df, groupby_cols):
    groupby = df.groupby(groupby_cols, as_index=False)
    summed = groupby.sum()
    return (groupby.count() if summed.empty else summed).set_index(groupby_cols)

df = groupby_sum(df, ["PlatformCategory", "Platform", "ResClassName"])
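A variant of the same idea, as a minimal sketch rather than a definitive fix: instead of falling back to count(), the result can be reindexed against the original columns, so the output has the same columns whether or not the input is empty. The helper name groupby_sum_keep_columns and the sample values are made up for illustration, and the empty frame is given explicit dtypes as an assumption to keep the sum on a numeric column. Note that reindex re-adds any column the aggregation dropped as an all-NaN column.

```python
import pandas as pd


def groupby_sum_keep_columns(df, groupby_cols):
    # Hypothetical helper: group, sum, then force the original column layout.
    summed = df.groupby(groupby_cols, as_index=False).sum()
    # reindex restores any missing column (filled with NaN) and the original order.
    return summed.reindex(columns=df.columns)


keys = ["PlatformCategory", "Platform", "ResClassName"]

# Empty input with explicit dtypes (assumption: Amount is numeric).
empty = pd.DataFrame({
    "PlatformCategory": pd.Series(dtype="object"),
    "Platform": pd.Series(dtype="object"),
    "ResClassName": pd.Series(dtype="object"),
    "Amount": pd.Series(dtype="float64"),
})
empty_result = groupby_sum_keep_columns(empty, keys)

# Non-empty input with made-up sample rows.
filled = pd.DataFrame({
    "PlatformCategory": ["web", "web", "mobile"],
    "Platform": ["pc", "pc", "ios"],
    "ResClassName": ["img", "img", "video"],
    "Amount": [1.0, 2.0, 5.0],
})
filled_result = groupby_sum_keep_columns(filled, keys)

print(list(empty_result.columns))
print(filled_result)
```

Unlike the function above, this returns the keys as ordinary columns instead of setting them as the index; add .set_index(groupby_cols) if the indexed form is preferred.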

Upvotes: 1

cs95

Reputation: 402263

You need as_index=False, which keeps the grouping keys as regular columns (group_keys=False only affects apply, so it is not required here):

df = df.groupby(["PlatformCategory","Platform","ResClassName"], as_index=False).count()
df

Empty DataFrame
Columns: [PlatformCategory, Platform, ResClassName, Amount]
Index: []

No need to reset your index afterwards.
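As for the None mentioned in the question: reset_index(inplace=True) modifies the frame in place and returns None, so assigning its return value back to df replaces the DataFrame with None. A small sketch with made-up sample data (the column name key and the values are illustrative only):

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({"key": ["a", "a", "b"], "Amount": [1, 2, 3]})
grouped = df.groupby("key").sum()

# With inplace=True the frame is mutated and the call itself returns None.
returned = grouped.reset_index(inplace=True)
print(returned)

# Without inplace, a new DataFrame is returned and can be assigned safely.
regrouped = df.groupby("key").sum().reset_index()
print(regrouped)
```

Either use inplace=True without assigning, or assign the result of a plain reset_index(), but not both at once.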

Upvotes: 6
