Reputation: 501
I have two dataframes with a common index. I would like to group df1 based on a subset of columns in df2.
I know how to group by multiple columns already in df1, like df1.groupby(['col1', 'col2']), and I know how to group on a different series with the same index, like df1.groupby(df2['col1']). Is there a direct way to do something like
>>> df1.groupby(df2[['col1', 'col2']])
# ValueError: Grouper for '<class 'pandas.core.frame.DataFrame'>' not 1-dimensional
Of course, I could do
df1.groupby([df2['col1'], df2['col2']])
but it seems there should be a more direct syntax for this. (Imagine having several grouping columns, etc.)
Upvotes: 2
Views: 2598
Reputation: 371
You need to convert the selected df2 columns into a list of lists (one list per column):
df1.groupby(df2[['col1', 'col2']].T.values.tolist())
This should give you the result you want. It is similar to what @sply88 suggested, but it's cleaner (at least in my opinion).
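A minimal sketch of this approach on made-up data (the frames, the "value" column, and the key values are assumptions for illustration only). Note that .tolist() discards the index, so the keys are matched to df1 by position; this assumes df1 and df2 have their rows in the same order.

```python
import pandas as pd

# Hypothetical sample data: both frames share the index a..d, in the same order.
df1 = pd.DataFrame({"value": [1, 2, 3, 4]}, index=list("abcd"))
df2 = pd.DataFrame(
    {"col1": ["x", "x", "y", "y"], "col2": ["p", "q", "p", "p"]},
    index=list("abcd"),
)

# Transpose so that each inner list holds one grouping column, then group
# df1 by those plain lists (positional alignment, no index involved).
keys = df2[["col1", "col2"]].T.values.tolist()
result = df1.groupby(keys)["value"].sum()
print(result)
```

Because the keys are plain lists, the resulting index levels carry no names.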
Upvotes: 0
Reputation: 893
You could either merge, join, or concat the two dataframes and then group, or use a "more direct syntax" with a list comprehension, e.g.:
many_grouping_columns = ['A', 'B', ...]  # columns found in df2
df1.groupby([df2[col] for col in many_grouping_columns])
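A small sketch of the list-comprehension variant on made-up data (the frames, the "value" column, and the name grouping_columns are hypothetical). Since each key is a Series, pandas aligns it to df1 by index label, so the row order of df2 should not matter here, and the key names carry over to the result.

```python
import pandas as pd

# Hypothetical sample data: same index labels, but df2 is in reverse row order.
df1 = pd.DataFrame({"value": [1, 2, 3, 4]}, index=list("abcd"))
df2 = pd.DataFrame(
    {"col1": ["y", "y", "x", "x"], "col2": ["p", "p", "q", "p"]},
    index=list("dcba"),
)

grouping_columns = ["col1", "col2"]  # columns found in df2

# Each key is a Series taken from df2; alignment to df1 is by index label.
result = df1.groupby([df2[col] for col in grouping_columns])["value"].sum()
print(result)
```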
Upvotes: 1
Reputation:
How about:
gbobj = pd.concat([df1, df2[['col1','col2']]], axis=1).groupby(['col1','col2'])
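As a rough illustration with made-up data (the frames and the "value" column are assumptions), the concat route looks like this. It assumes df1 has no columns named col1 or col2 of its own; otherwise the combined frame would contain duplicate column names.

```python
import pandas as pd

# Hypothetical sample data sharing the index a..d.
df1 = pd.DataFrame({"value": [1, 2, 3, 4]}, index=list("abcd"))
df2 = pd.DataFrame(
    {"col1": ["x", "x", "y", "y"], "col2": ["p", "q", "p", "p"]},
    index=list("abcd"),
)

# Attach the grouping columns to df1 side by side (rows matched on the index),
# then group by column name as usual.
combined = pd.concat([df1, df2[["col1", "col2"]]], axis=1)
gbobj = combined.groupby(["col1", "col2"])
result = gbobj["value"].sum()
print(result)
```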
Upvotes: 1