hanzgs

Reputation: 1616

How to create new column from groupby column and conditional check on another column in pandas?

I have a pandas DataFrame:

import pandas as pd

data = pd.DataFrame([['TRAN','2019-01-06T21:44:09Z','T'],
                     ['LMI','2019-01-06T19:44:09Z','U'],
                     ['ARN','2019-01-02T19:44:09Z','V'],
                     ['TRAN','2019-01-08T06:44:09Z','T'],
                     ['TRAN','2019-01-06T18:44:09Z','U'],
                     ['ARN','2019-01-04T19:44:09Z','V'],
                     ['LMI','2019-01-05T16:34:09Z','U'],
                     ['ARN','2019-01-08T19:44:09Z','V'],
                     ['TRAN','2019-01-07T14:44:09Z','T'],
                     ['TRAN','2019-01-06T11:44:09Z','U'],
                     ['ARN','2019-01-10T19:44:09Z','V'],
                     ], 
                    columns=['Type', 'Date', 'Decision'])

I need to group by the Type column, find the minimum Date for each type, and create a new column that is "First" on the row holding that minimum date and "Later" on every other row.

I can call data.groupby('Type'), but I don't know how to find min(data['Date']) within the grouped DataFrame and create a new column from it.

My final data should look like this:

['TRAN','2019-01-06T21:44:09Z','T','Later'],
['LMI','2019-01-06T19:44:09Z','U','Later'],
['ARN','2019-01-02T19:44:09Z','V','First'],
['TRAN','2019-01-08T06:44:09Z','T','Later'],
['TRAN','2019-01-06T18:44:09Z','U','Later'],
['ARN','2019-01-04T19:44:09Z','V','Later'],
['LMI','2019-01-05T16:34:09Z','U','First'],
['ARN','2019-01-08T19:44:09Z','V','Later'],
['TRAN','2019-01-07T14:44:09Z','T','Later'],
['TRAN','2019-01-06T11:44:09Z','U','First'],
['ARN','2019-01-10T19:44:09Z','V','Later'],
], 
columns=['Type', 'Date', 'Decision']

Upvotes: 0

Views: 82

Answers (2)

moys

Reputation: 8033

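IIUC, you can use np.where to get your output. transform('min') broadcasts each group's minimum Date back to every row, so the comparison is row by row:

import numpy as np
data['check'] = np.where(data['Date'] > data.groupby('Type')['Date'].transform('min'), 'Later', 'First')
print(data)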

Output

    Type  Date                  Decision  check
0   TRAN  2019-01-06T21:44:09Z  T         Later
1   LMI   2019-01-06T19:44:09Z  U         Later
2   ARN   2019-01-02T19:44:09Z  V         First
3   TRAN  2019-01-08T06:44:09Z  T         Later
4   TRAN  2019-01-06T18:44:09Z  U         Later
5   ARN   2019-01-04T19:44:09Z  V         Later
6   LMI   2019-01-05T16:34:09Z  U         First
7   ARN   2019-01-08T19:44:09Z  V         Later
8   TRAN  2019-01-07T14:44:09Z  T         Later
9   TRAN  2019-01-06T11:44:09Z  U         First
10  ARN   2019-01-10T19:44:09Z  V         Later
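
One caveat: the comparison above is on strings, which happens to work here because every timestamp uses the same ISO 8601 layout and so sorts lexicographically. If the formats might vary, a safer variant could parse the dates first. This is only a sketch on a trimmed copy of the sample data; the names parsed and group_min are just illustrative.

```python
import numpy as np
import pandas as pd

# Trimmed copy of the question's sample data
data = pd.DataFrame(
    [['TRAN', '2019-01-06T21:44:09Z', 'T'],
     ['LMI', '2019-01-06T19:44:09Z', 'U'],
     ['ARN', '2019-01-02T19:44:09Z', 'V'],
     ['TRAN', '2019-01-06T11:44:09Z', 'U'],
     ['LMI', '2019-01-05T16:34:09Z', 'U'],
     ['ARN', '2019-01-04T19:44:09Z', 'V']],
    columns=['Type', 'Date', 'Decision'])

# Parse the ISO 8601 strings so the comparison is on real timestamps
parsed = pd.to_datetime(data['Date'], utc=True)

# Broadcast each group's earliest timestamp back to every row
group_min = parsed.groupby(data['Type']).transform('min')

# Rows equal to their group's minimum are 'First', the rest 'Later'
data['check'] = np.where(parsed == group_min, 'First', 'Later')
print(data)
```

If two rows in a group share the same earliest timestamp, both get 'First' with this approach, same as with the string comparison.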

Upvotes: 1

M_S_N

Reputation: 2810

IIUC you can use this:

data.groupby('Type').agg(First=('Date','first'), Later=('Date','last')).reset_index()
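
Note that this returns one row per Type rather than a new column on the original frame, and that 'first' and 'last' follow row order inside each group, not the date values. A rough sketch of the difference on a trimmed copy of the sample data (by_position and by_value are illustrative names, and 'min'/'max' are swapped in here only for comparison):

```python
import pandas as pd

# Trimmed copy of the question's sample data
data = pd.DataFrame(
    [['TRAN', '2019-01-06T21:44:09Z', 'T'],
     ['LMI', '2019-01-06T19:44:09Z', 'U'],
     ['TRAN', '2019-01-08T06:44:09Z', 'T'],
     ['LMI', '2019-01-05T16:34:09Z', 'U'],
     ['TRAN', '2019-01-06T11:44:09Z', 'U']],
    columns=['Type', 'Date', 'Decision'])

# 'first'/'last' take the first and last row of each group as they appear
by_position = data.groupby('Type').agg(
    First=('Date', 'first'), Later=('Date', 'last')).reset_index()

# 'min'/'max' take the smallest and largest Date value of each group
by_value = data.groupby('Type').agg(
    First=('Date', 'min'), Later=('Date', 'max')).reset_index()

print(by_position)
print(by_value)
```

On unsorted data the two results differ, so sorting by Date first (or using 'min') would be needed to get the earliest date per Type.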

Upvotes: 3
