Reputation: 1040
I have extracted the table below from a csv file :
timestamp user_id main_val val1 val2 val3 transport
01/01/2011 1 1 3 1491 0 bus
01/07/2012 1 19 57 4867 5 bus
01/09/2013 1 21 63 3455 5 bus
01/02/2011 2 20 8 2121 5 bus
01/12/2012 2 240 30 3558 3 bus
01/01/2011 3 100 5 3357 3 bus
01/11/2012 3 3100 49 1830 bus
01/12/2013 3 3200 51 4637 4 bus
For this purpose I used the following statement:
import pandas as pd
newnames = ['date','user_id', 'cost', 'val1']
df = pd.read_csv('expenses.csv', names = newnames, header = False)
pivoted = df.pivot('date','user_id')
and now I have the dataframe pivoted containing the table below :
cost cost cost val1 val1 val1
user_id 1 2 3 1 2 3
timestamp
01/01/2011 1 100 3 5
01/02/2011 20 8
01/07/2012 19 57
01/09/2013 21 63
01/11/2012 3100 49
01/12/2012 240 30
01/12/2013 3200 51
How can I now calculate a monthly average cost and val1 for each user_id?
Thanks in advance for your help.
Upvotes: 0
Views: 323
Reputation: 17511
You probably want to use the resample method http://pandas.pydata.org/pandas-docs/version/0.13.1/generated/pandas.DataFrame.resample.html
import pandas as pd
import numpy as np
newnames = ['date','user_id', 'cost', 'val1']
df = pd.read_csv('expenses.csv', names = newnames, header = False)
df['date'] = pd.to_datetime(df['date'])
pivoted = df.pivot('date','user_id')
pivoted.resample('M')
Upvotes: 1