Apply results of pandas groupby to multiple rows

Question

I have a dataframe df that looks like this:

    PO  SO        Date  Name  Qty
0  123  34  2020-01-05  Carl    5
1  111  55  2020-10-10  Beth    7
2  123  12  2020-02-03  Greg   11
3  101  55  2019-12-03  Carl    3
4  123  34  2020-11-30  Beth   24
5  111  55  2019-04-02  Greg    6
6  202  99  2020-05-06  Beth   19

What I would like to do is replace each row's Date with the minimum Date within its group, where the groups are formed by PO and SO. For instance, there are two rows with a PO of 123 and an SO of 34. Since the minimum Date among these rows is '2020-01-05', both rows should have their Date column set to '2020-01-05'.

Thus the result would look like this:

    PO  SO        Date  Name  Qty
0  123  34  2020-01-05  Carl    5
1  111  55  2019-04-02  Beth    7
2  123  12  2020-02-03  Greg   11
3  101  55  2019-12-03  Carl    3
4  123  34  2020-01-05  Beth   24
5  111  55  2019-04-02  Greg    6
6  202  99  2020-05-06  Beth   19

David Erickson · Accepted Answer

You can use transform with groupby to create a "calculated column", so that you can avoid a messy merge:

import pandas as pd

df = pd.DataFrame({'PO':   [123, 111, 123, 101, 123, 111, 202],
                   'SO':   [34, 55, 12, 55, 34, 55, 99],
                   'Date': ['2020-01-05', '2020-10-10', '2020-02-03', '2019-12-03', '2020-11-30', '2019-04-02', '2020-05-06'],
                   'Name': ['Carl', 'Beth', 'Greg', 'Carl', 'Beth', 'Greg', 'Beth'],
                   'Qty':  [5, 7, 11, 3, 24, 6, 19]})

# Work on a copy so the original df is left untouched.
df_grouped = df.copy()
# transform('min') returns one value per original row: the minimum Date of that row's (PO, SO) group.
df_grouped['Date'] = df_grouped.groupby(['PO', 'SO'])['Date'].transform('min')
df_grouped
Out[1]: 
    PO  SO        Date  Name  Qty
0  123  34  2020-01-05  Carl    5
1  111  55  2019-04-02  Beth    7
2  123  12  2020-02-03  Greg   11
3  101  55  2019-12-03  Carl    3
4  123  34  2020-01-05  Beth   24
5  111  55  2019-04-02  Greg    6
6  202  99  2020-05-06  Beth   19
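
For comparison, here is a rough sketch of the merge-based approach that transform lets you skip, checked against the transform result. Two things here are my own assumptions and not part of the question: the Date column is parsed with pd.to_datetime first (the strings above happen to sort correctly only because they are in ISO year-month-day format), and the names min_date, agg, merged and same are purely illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'PO':   [123, 111, 123, 101, 123, 111, 202],
    'SO':   [34, 55, 12, 55, 34, 55, 99],
    'Date': ['2020-01-05', '2020-10-10', '2020-02-03', '2019-12-03',
             '2020-11-30', '2019-04-02', '2020-05-06'],
})

# Assumption: parse to datetime so 'min' compares dates rather than strings.
df['Date'] = pd.to_datetime(df['Date'])

# transform returns a Series with the same index as df, one value per row.
min_date = df.groupby(['PO', 'SO'])['Date'].transform('min')

# The merge route: aggregate to one row per group, then join back onto the rows.
agg = df.groupby(['PO', 'SO'], as_index=False)['Date'].min()
merged = df.drop(columns='Date').merge(agg, on=['PO', 'SO'], how='left')

# Both routes should give the same per-row minimum dates.
same = (merged['Date'].to_numpy() == min_date.to_numpy()).all()
```

Because the transform result is aligned on the original index, it can be assigned straight back to a column, whereas the merge route needs an intermediate aggregated frame and a join on both keys.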
