Create a new data frame from a GroupBy object in pandas

Question

What I want to do can be expressed in SQL like this:

SELECT v1, v2, COUNT(*) AS v_count FROM my_table GROUP BY 1,2

That is, I want to create a new data frame with three columns: (v1, v2, v_count).

Here is what I tried with pandas:

grp = df.groupby(['v1', 'v2'])  # GROUP BY v1, v2
cnt = grp.count()  # get v_count for each group

But how do I combine them into a new data frame?

Matti John · Accepted Answer

You can select one of the aggregated columns to be v_count and then reset the index since v1 and v2 are in the index, e.g.:

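# Passing a renaming dict to agg() on a selected column was removed in pandas 1.0; named aggregation (pandas >= 0.25) replaces it.
df.groupby(['v1', 'v2'])['v1'].agg(v_count='size').reset_index()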
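
To make the result of this approach concrete, here is a minimal sketch on a made-up frame. The sample data and the name `out` are assumptions for illustration only, and the keyword form of `agg` assumes pandas 0.25 or later:

```python
import pandas as pd

# Hypothetical sample data; only the column names v1 and v2 come from the question.
df = pd.DataFrame({
    "v1": ["a", "a", "b", "b", "b"],
    "v2": [1, 1, 1, 2, 2],
})

# Count rows per (v1, v2) group. The group keys end up in the index,
# so reset_index() moves them back into ordinary columns.
out = df.groupby(["v1", "v2"])["v1"].agg(v_count="size").reset_index()
print(out)
```

The resulting frame has one row per distinct (v1, v2) pair and the three columns v1, v2, v_count, which mirrors the SQL query in the question.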

Alternatively, you can use the as_index keyword argument instead of using reset_index, e.g.:

# With as_index=False, size() returns a frame with a 'size' column (pandas >= 1.1).
df.groupby(['v1', 'v2'], as_index=False).size().rename(columns={'size': 'v_count'})
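
A related variant, not part of the answer above, is to call `size()` on the whole GroupBy object and name the resulting column while resetting the index. The sketch below also contrasts `size()` with `count()`, which is what the question tried: `count()` reports non-null values per column rather than rows per group. The column `x` and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; x contains one missing value on purpose.
df = pd.DataFrame({
    "v1": ["a", "a", "b", "b", "b"],
    "v2": [1, 1, 1, 2, 2],
    "x": [1.0, None, 3.0, 4.0, 5.0],
})

grp = df.groupby(["v1", "v2"])

# size() counts rows per group, like COUNT(*) in SQL.
counts = grp.size().reset_index(name="v_count")

# count() counts non-null values per column, like COUNT(x) in SQL.
non_null = grp["x"].count().reset_index(name="x_non_null")

print(counts)
print(non_null)
```

If the columns other than v1 and v2 may contain missing values, `size()` is the closer match to `COUNT(*)`.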
