Reputation: 5029
I am preprocessing the following Pandas
dataframe.
For the same assetid
, I want to find the first and last AccountingDate
and create another column isSold
based on following logic of AccountingDate
.
if AccountingDate of all deals for same assetid is Null:
isSold = False
else:
isSold = True
At the same time, I want to find out for each assetid
, the Amount
of first and last AccountingDate
.
In general groupby
aggregation in Pandas
goes like this:
df.groupby('assetid').agg({'A', func})
However my aggregation functions is lot more complicated, from AccountingDate
to generate 3 columns FirstAccountingDate
, LastAccountingDate
, isSold
; from AccountingDate
and Amount
combined, to generate FirstDealAmount
and LastDealAmount
.
What is the best way to do this?
Upvotes: 1
Views: 53
Reputation: 1066
If you want to get the first date
and last date
, and check if both of them is null
. You can do it by the following code.
print(df)
assetId date
0 1, NaT
1 1, 2018-10-01
2 1, 2018-10-02
3 1, 2018-10-03
4 4, NaT
5 4, NaT
def isSold(date_array):
last = date_array.max()
first = date_array.min()
return pd.isnull(last) & pd.isnull(last)
result = df.groupby("assetId").date.apply(isSold).reset_index()
print(result)
It gives:
assetId date
0 1, False
1 4, True
Upvotes: 1