Reputation: 5488
I have the following data, which has a Value, Time and Date column.

Desired output

I would like to create a new Series capturing the rows where the current highest Value occurs each day. That is, looking at the Value column within each day, keep a row whenever it sets a new running maximum: for example, Value was 2, so that was the highest so far and is captured; later Value rose to 3, so that row is captured as well. Please see df.to_dict() below to reproduce this:
df.to_dict()
{'Date': {0: Timestamp('2000-01-01 00:00:00'),
1: Timestamp('2000-01-01 00:00:00'),
2: Timestamp('2000-01-01 00:00:00'),
3: Timestamp('2000-01-02 00:00:00'),
4: Timestamp('2000-01-02 00:00:00'),
5: Timestamp('2000-01-02 00:00:00'),
6: Timestamp('2000-01-03 00:00:00'),
7: Timestamp('2000-01-03 00:00:00'),
8: Timestamp('2000-01-03 00:00:00'),
9: Timestamp('2000-01-04 00:00:00'),
10: Timestamp('2000-01-04 00:00:00'),
11: Timestamp('2000-01-04 00:00:00')},
'Time': {0: datetime.time(9, 0),
1: datetime.time(13, 0),
2: datetime.time(17, 0),
3: datetime.time(9, 0),
4: datetime.time(13, 0),
5: datetime.time(17, 0),
6: datetime.time(9, 0),
7: datetime.time(13, 0),
8: datetime.time(17, 0),
9: datetime.time(9, 0),
10: datetime.time(13, 0),
11: datetime.time(17, 0)},
'Value': {0: 2,
1: 2,
2: 3,
3: 2,
4: 3,
5: 3,
6: 1,
7: 1,
8: 1,
9: 3,
10: 1,
11: 2}}
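For convenience, the dict above can be rebuilt as a DataFrame directly (a minimal sketch, assuming pandas is available; the `Timestamp` and `datetime.time` objects in the repr are reconstructed from their values):

```python
import datetime
import pandas as pd

# Rebuild the example frame from the question's df.to_dict() output.
df = pd.DataFrame({
    'Date': pd.to_datetime(
        ['2000-01-01'] * 3 + ['2000-01-02'] * 3
        + ['2000-01-03'] * 3 + ['2000-01-04'] * 3),
    'Time': [datetime.time(h) for h in (9, 13, 17)] * 4,  # 09:00, 13:00, 17:00 each day
    'Value': [2, 2, 3, 2, 3, 3, 1, 1, 1, 3, 1, 2],
})
```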
Upvotes: 1
Views: 60
Reputation: 29711
IIUC, you need to use cummax to get the cumulative maximum of the Value column within each day, then drop the duplicated entries after grouping with respect to the Date column.
# Per day: running maximum of Value, keeping only the first row of each new maximum
grouped = (df.groupby('Date')
             .apply(lambda x: x['Value'].cummax().drop_duplicates())
             .reset_index())

# 'level_1' holds the original row labels that survived drop_duplicates
print(df[df.index.isin(grouped['level_1'])])
Date Time Value
0 2000-01-01 09:00:00 2
2 2000-01-01 17:00:00 3
3 2000-01-02 09:00:00 2
4 2000-01-02 13:00:00 3
6 2000-01-03 09:00:00 1
9 2000-01-04 09:00:00 3
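An equivalent vectorized sketch that avoids the groupby-apply round trip: compute the per-day running maximum with `cummax`, then keep only the rows where that maximum increases (the shifted comparison is an assumption about how you want ties handled; the first row of each day always passes, since comparing against the shifted NaN yields True):

```python
import datetime
import pandas as pd

df = pd.DataFrame({
    'Date': pd.to_datetime(
        ['2000-01-01'] * 3 + ['2000-01-02'] * 3
        + ['2000-01-03'] * 3 + ['2000-01-04'] * 3),
    'Time': [datetime.time(h) for h in (9, 13, 17)] * 4,
    'Value': [2, 2, 3, 2, 3, 3, 1, 1, 1, 3, 1, 2],
})

cm = df.groupby('Date')['Value'].cummax()      # running max within each day
keep = cm.ne(cm.groupby(df['Date']).shift())   # True where the running max changes
result = df[keep]
```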
Upvotes: 1