Reputation: 1741
I want to mark the last record in group ['ru','case','opdate'] based on 'lst_svc'
Here is my attempt (problematic: when the maximum 'lst_svc' is duplicated within a group, both rows get marked).
import pandas as pd
from datetime import datetime
# Create DataFrame
d = {'ru' : pd.Series([1., 1., 1., 1., 3, 3]),
'case' : pd.Series([2., 2., 2., 2., 2, 2]),
'opdate' : pd.Series([datetime(2012, 5, 2), datetime(2012, 5, 2), datetime(2012, 5, 2),datetime(2012, 5, 2), datetime(2012, 5, 3),datetime(2012, 5, 3)]),
'lst_svc': pd.Series([datetime(2012, 5, 2), datetime(2012, 5, 3), datetime(2012, 5, 5),datetime(2012, 5, 5),datetime(2012, 6, 5),])}  # only 5 values, so row 5 becomes NaT
df = pd.DataFrame(d)
# Mark last
df['lastMark'] = (df.groupby(['ru','case','opdate'])['lst_svc'].transform('max') == df['lst_svc']).astype(int)
The DataFrame looks like this:
case lst_svc opdate ru
0 2.0 2012-05-02 2012-05-02 1.0
1 2.0 2012-05-03 2012-05-02 1.0
2 2.0 2012-05-05 2012-05-02 1.0
3 2.0 2012-05-05 2012-05-02 1.0
4 2.0 2012-06-05 2012-05-03 3.0
5 2.0 NaT 2012-05-03 3.0
The (wrong) result of my code looks like this:
case lst_svc opdate ru lastMark
0 2.0 2012-05-02 2012-05-02 1.0 0
1 2.0 2012-05-03 2012-05-02 1.0 0
2 2.0 2012-05-05 2012-05-02 1.0 1
3 2.0 2012-05-05 2012-05-02 1.0 1
4 2.0 2012-06-05 2012-05-03 3.0 1
5 2.0 NaT 2012-05-03 3.0 0
Upvotes: 3
Views: 1309
Reputation: 97301
How about:
import pandas as pd
from datetime import datetime
# Create example DataFrame
d = {'ru' : pd.Series([1., 1., 1., 1., 3, 3]),
'case' : pd.Series([2., 2., 2., 2., 2, 2]),
'opdate' : pd.Series([datetime(2012, 5, 2), datetime(2012, 5, 2), datetime(2012, 5, 2),datetime(2012, 5, 2), datetime(2012, 5, 3),datetime(2012, 5, 3)]),
'lst_svc': pd.Series([datetime(2012, 5, 2), datetime(2012, 5, 3), datetime(2012, 5, 5),datetime(2012, 5, 5),datetime(2012, 6, 5),])}
df = pd.DataFrame(d)
# Mark last
def f(s):
    # Mark the last row of each group with 1, all others with 0
    s2 = pd.Series(0, index=s.index)
    s2.iloc[-1] = 1
    return s2
df["lastMark"] = df.groupby(['ru','case','opdate'])['lst_svc'].apply(f)
The output looks like this:
case lst_svc opdate ru lastMark
0 2.0 2012-05-02 2012-05-02 1.0 0
1 2.0 2012-05-03 2012-05-02 1.0 0
2 2.0 2012-05-05 2012-05-02 1.0 0
3 2.0 2012-05-05 2012-05-02 1.0 1
4 2.0 2012-06-05 2012-05-03 3.0 0
5 2.0 NaT 2012-05-03 3.0 1
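If you'd rather avoid `apply`, here is a vectorized sketch using `groupby.cumcount`: with `ascending=False` it numbers each group's rows from the end, so the last row of every group gets 0. Same example data as above.

```python
import pandas as pd
from datetime import datetime

# Same example frame; the sixth lst_svc value is missing (NaT)
d = {'ru': pd.Series([1., 1., 1., 1., 3, 3]),
     'case': pd.Series([2., 2., 2., 2., 2, 2]),
     'opdate': pd.Series([datetime(2012, 5, 2)] * 4 + [datetime(2012, 5, 3)] * 2),
     'lst_svc': pd.Series([datetime(2012, 5, 2), datetime(2012, 5, 3),
                           datetime(2012, 5, 5), datetime(2012, 5, 5),
                           datetime(2012, 6, 5)])}
df = pd.DataFrame(d)

# cumcount(ascending=False) counts down to 0 within each group,
# so equality with 0 flags each group's last row
df['lastMark'] = (df.groupby(['ru', 'case', 'opdate'])
                    .cumcount(ascending=False)
                    .eq(0)
                    .astype(int))
```

This gives the same marks as the `apply` version and relies only on row position within each group, so NaT values in `lst_svc` are harmless.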
Upvotes: 5