Reputation: 93
A sample CSV data in which the first column is a time stamp (date + time):
2018-01-01 10:00:00,23,43
2018-01-02 11:00:00,34,35
2018-01-05 12:00:00,25,4
2018-01-10 15:00:00,22,96
2018-01-01 18:00:00,24,53
2018-03-01 10:00:00,94,98
2018-04-20 10:00:00,90,9
2018-04-10 10:00:00,45,51
2018-01-01 10:00:00,74,44
2018-12-01 10:00:00,76,87
2018-11-01 10:00:00,76,87
2018-12-12 10:00:00,87,90
I already wrote some codes to do the monthly aggregated values task while waiting for someone to give me some suggestions.
Thanks @moys, anyway!
import pandas as pd
df = pd.read_csv('Sample.txt', header=None, names = ['Timestamp', 'Value 1', 'Value 2'])
df1['Timestamp'] = pd.to_datetime(df1['Timestamp'])
df1['Monthly'] = df1['Timestamp'].dt.to_period('M')
grouper = pd.Grouper(key='Monthly')
df2 = df1.groupby(grouper)['Value 1', 'Value 2'].sum().reset_index()
The output is:
Monthly Value 1 Value 2
0 2018-01 202 275
1 2018-03 94 98
2 2018-04 135 60
3 2018-12 163 177
4 2018-11 76 87
What if there's a dataset with more columns, how to motified the my code to make it automatically working on the dataset which has more columns?
2018-02-01 10:00:00,23,43,32
2018-02-02 11:00:00,34,35,43
2018-03-05 12:00:00,25,4,43
2018-02-10 15:00:00,22,96,24
2018-05-01 18:00:00,24,53,98
2018-02-01 10:00:00,94,98,32
2018-02-20 10:00:00,90,9,24
2018-07-10 10:00:00,45,51,32
2018-01-01 10:00:00,74,44,34
2018-12-04 10:00:00,76,87,53
2018-12-02 10:00:00,76,87,21
2018-12-12 10:00:00,87,90,98
Upvotes: 0
Views: 36
Reputation: 8033
You can do something like below
df.groupby(pd.to_datetime(df['date']).dt.month).sum().reset_index()
Output Here, 'date' column is the month number.
date val1 val2
0 1 202 275
1 3 94 98
2 4 135 60
3 11 76 87
4 12 163 177
Upvotes: 1