Dan Murphy

Reputation: 79

Python Pandas GroupBy Max Date

I have a very simple DataFrame with columns: Index, Person, Item, Date. There are only 4 people and 3 items, with random dates. All Person/Item/Date combinations are unique. I am trying to print a simple pivot-table-like DataFrame using:

import pandas as pd

mydf = pd.read_csv("Test_Data.csv",index_col=[0])
mydf = mydf.sort_values(by=['Date','Item','Person'], ascending=False)

print(mydf.groupby(['Person','Item'])['Date'].max())

However, I noticed that while the structure is what I want, the data is not: it does not return the max date for each Person/Item combination. I thought sorting first would help, but it did not. Do I need to create a temporary DataFrame first and then join to get what I want?

Also to be clear, there are 28 rows of data (all test data) with some People/Items being repeated but with different dates. Index is just 0 through 27.

Upvotes: 1

Views: 839

Answers (1)

Dan Murphy

Reputation: 79

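Figured it out! I should have made sure the Date field was actually recognized as a date. Read straight from the CSV, the column holds strings, so max() compares them character by character rather than chronologically. Converting the column before grouping fixes it: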

mydf['Date'] = pd.to_datetime(mydf['Date'])
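To illustrate the difference, here is a minimal sketch. The inline CSV, the names Ann/Bob, the items, and the M/D/YYYY date format are all made up for the example, since the real Test_Data.csv is not shown; adjust the format string to match the actual file.

```python
import io
import pandas as pd

# Hypothetical stand-in for Test_Data.csv (names, items and dates are invented).
csv_text = """Index,Person,Item,Date
0,Ann,Pen,9/5/2019
1,Ann,Pen,10/1/2019
2,Bob,Cup,12/31/2018
3,Bob,Cup,1/15/2019
"""

raw = pd.read_csv(io.StringIO(csv_text), index_col=[0])

# Date is still text here, so max() picks the string that sorts last.
string_max = raw.groupby(['Person', 'Item'])['Date'].max()

# Convert to real datetimes (format is an assumption about the file).
parsed = raw.copy()
parsed['Date'] = pd.to_datetime(parsed['Date'], format='%m/%d/%Y')

# Now max() returns the chronologically latest date per Person/Item.
date_max = parsed.groupby(['Person', 'Item'])['Date'].max()

print(string_max)
print(date_max)
```

With this sample, the string-based result for Ann/Pen is 9/5/2019 (because "9" sorts after "1"), while the datetime-based result is 2019-10-01. Passing parse_dates=['Date'] to pd.read_csv is another way to get the conversion at load time. The sort_values call from the question is not needed for max() either way.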

Upvotes: 1
