ddd
ddd

Reputation: 5029

Pandas aggregation: How to generate multiple new columns from one column and vise versa

I am preprocessing the following Pandas dataframe.

a

For the same assetid, I want to find the first and last AccountingDate and create another column isSold based on following logic of AccountingDate.

if AccountingDate of all deals for same assetid is Null:
    isSold = False
else:
    isSold = True

At the same time, I want to find out for each assetid, the Amount of first and last AccountingDate.

In general groupby aggregation in Pandas goes like this:

df.groupby('assetid').agg({'A', func})

However my aggregation functions is lot more complicated, from AccountingDate to generate 3 columns FirstAccountingDate, LastAccountingDate, isSold; from AccountingDate and Amount combined, to generate FirstDealAmount and LastDealAmount.

What is the best way to do this?

Upvotes: 1

Views: 53

Answers (1)

Dawei
Dawei

Reputation: 1066

If you want to get the first date and last date, and check if both of them is null. You can do it by the following code.

print(df)

  assetId       date
0      1,        NaT
1      1, 2018-10-01
2      1, 2018-10-02
3      1, 2018-10-03
4      4,        NaT
5      4,        NaT
def isSold(date_array):
    last = date_array.max()
    first = date_array.min()
    return pd.isnull(last) & pd.isnull(last)

result = df.groupby("assetId").date.apply(isSold).reset_index()
print(result)

It gives:

  assetId   date
0      1,  False
1      4,   True

Upvotes: 1

Related Questions