Reputation: 37
I have a set of data that has several different columns, with daily data going back several years. The variable is the exact same for each column. I've calculated the daily, monthly, and yearly statistics for each column, and want to do the same, but combining all columns together to get one statistic for each day, month, and year rather than the several different ones I calculated before.
I've been using pandas groupby so far, with something like this:
sum_daily_files = daily_files.groupby(daily_files.Date.dt.day).sum()
sum_monthly_files = daily_files.groupby(daily_files.Date.dt.month).sum()
sum_yearly_files = daily_files.groupby(daily_files.Date.dt.year).sum()
Any suggestions on how I might go about using Pandas - or any other package - to combine the statistics together? Thanks so much!
Edit:
Here's a snippet of my dataframe:
Date site1 site2 site3 site4 site5 site6
2010-01-01 00:00:00 2 0 1 1 0 1
2010-01-02 00:00:00 7 5 1 3 1 1
2010-01-03 00:00:00 3 3 2 2 2 1
2010-01-04 00:00:00 0 0 0 0 0 0
2010-01-05 00:00:00 0 0 0 0 0 1
I had to type it in by hand because I was having trouble copying it over, so my apologies. Basically, it covers six different sites from 2010 to 2019 and details how much snow (in inches) each site received on each day.
Upvotes: 0
Views: 168
Reputation: 120559
(Your question needs some clarification.)
Is this what you want?
all_sum_daily_files = sum_daily_files.sum(axis=1) # or daily_files.sum(axis=1)
all_sum_monthly_files = sum_monthly_files.sum(axis=1)
all_sum_yearly_files = sum_yearly_files.sum(axis=1)
If your data is already daily, there is no need to calculate a daily sum first; you can use daily_files.sum(axis=1) directly.
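For a runnable version, here is a minimal sketch built on the five rows from the question (the column names site1 to site6 come from that snippet; the variable names all_sites, daily_total, monthly_total and yearly_total are my own). It assumes "combining" means adding up snowfall across all sites. Note that grouping by Date.dt.day or Date.dt.month pools the same day-of-month or month-of-year across every year, so this sketch groups by calendar month and calendar year instead, which may or may not be what you are after.

```python
import pandas as pd

# Sample data copied from the snippet in the question.
daily_files = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2010-01-01", "2010-01-02", "2010-01-03", "2010-01-04", "2010-01-05"]
    ),
    "site1": [2, 7, 3, 0, 0],
    "site2": [0, 5, 3, 0, 0],
    "site3": [1, 1, 2, 0, 0],
    "site4": [1, 3, 2, 0, 0],
    "site5": [0, 1, 2, 0, 0],
    "site6": [1, 1, 1, 0, 1],
})

# Move Date into the index so that only the site columns are summed,
# then collapse the six sites into one value per day.
all_sites = daily_files.set_index("Date").sum(axis=1)

# The data is already daily, so the combined daily statistic is this series.
daily_total = all_sites

# One value per calendar month (2010-01, 2010-02, ...), not per month-of-year.
monthly_total = all_sites.groupby(all_sites.index.to_period("M")).sum()

# One value per calendar year.
yearly_total = all_sites.groupby(all_sites.index.year).sum()

print(daily_total)
print(monthly_total)
print(yearly_total)
```

If you want a mean or a max across sites instead of a total, swapping .sum(axis=1) for .mean(axis=1) or .max(axis=1) in the all_sites line should work the same way.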
Upvotes: 1