Reputation: 1956
I've searched everywhere and tried everything I could but can't quite get what I want from my data.
Background:
I have a set of data derived from invoice data. I've massaged that data to the point where I have a pandas DataFrame consisting of six columns. The columns and some sample data are shown below.
Data sample can be found in this CSV file.
Each project can have multiple invoices, which is what is causing my issue.
What I want to do:
Aggregate by Project_Type and get the min, max, mean and std of "Age" for each project type. I thought this would be a simple groupby on the Project_Type column, but I can't get the min, max, mean and std functions to work when applied to that groupby.
I'm sure this is a simple issue but nothing I've found has solved it for me.
Any help or pointers appreciated.
Data sample:
Project_ID Project_Type Create_Date Invoice_Dates Age
25098 Computers 1/11/12 0:00 2/6/12 0:00 26 days
25098 Computers 1/11/12 0:00 2/29/12 0:00 49 days
25113 Telecom 1/12/12 0:00 4/30/12 0:00 109 days
25113 Telecom 1/12/12 0:00 6/30/12 0:00 170 days
Upvotes: 0
Views: 777
Reputation: 8483
Eric, I didn't download your file, but I took a swing at it. I would post the first few lines in your question so we don't have to download it.
Yes, groupby() would be a good way to go. You can pass the aggregation functions to agg() as a list, like this:
# 'Age' is the column name shown in the question's data sample
df[['Project_Type', 'Age']].groupby('Project_Type').agg(['min',
                                                         'max',
                                                         'mean',
                                                         'std'])
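For anyone who wants to try this without the CSV, here is a minimal sketch built from the four sample rows in the question. It assumes Age is a timedelta computed as Invoice_Dates minus Create_Date (the question doesn't say how it was derived), and it converts Age to a plain number of days before aggregating, since whether `std` works directly on timedelta columns can depend on the pandas version. The `Age_Days` column and the `stats` name are just illustrative.

```python
import pandas as pd

# Sample rows copied from the question's data sample.
df = pd.DataFrame({
    "Project_ID": [25098, 25098, 25113, 25113],
    "Project_Type": ["Computers", "Computers", "Telecom", "Telecom"],
    "Create_Date": pd.to_datetime(
        ["2012-01-11", "2012-01-11", "2012-01-12", "2012-01-12"]
    ),
    "Invoice_Dates": pd.to_datetime(
        ["2012-02-06", "2012-02-29", "2012-04-30", "2012-06-30"]
    ),
})

# Assumption: Age is the gap between invoice date and creation date.
df["Age"] = df["Invoice_Dates"] - df["Create_Date"]

# Convert the timedelta to an integer day count (hypothetical helper column)
# so that min, max, mean and std all operate on ordinary numbers.
df["Age_Days"] = df["Age"].dt.days

# One row per project type, one column per aggregate.
stats = df.groupby("Project_Type")["Age_Days"].agg(["min", "max", "mean", "std"])
print(stats)
```

Note that this treats every invoice row as its own observation, so a project with many invoices counts more heavily in its type's mean and std. If you'd rather have one value per project, you could aggregate by Project_ID first and then group the result by Project_Type.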
Upvotes: 2