Bunker

Reputation: 1051

Group by month on both axes

I have data with a createdat timestamp, plus availability data for different dates. I grouped the data by createdat month using:

monthly_availability = summed_availability.groupby(pd.TimeGrouper(freq='M')).sum()


       2015-08-18  2015-08-19  2015-09-09  2015-09-10  2015-09-11  \
createdat                                                                
2015-08-31         1.0         1.0         4.0         6.0         7.0   
2015-09-30         NaN         NaN         2.0         2.0         2.0   
2015-10-31         NaN         NaN         NaN         NaN         NaN   
2015-11-30         NaN         NaN         NaN         NaN         NaN   
2015-12-31         NaN         NaN         NaN         NaN         NaN   

Now I would like to do the same for the columns.

When I try to group axis 1 using the following code:

monthly_availability.groupby(pd.TimeGrouper(freq='M', axis=1), axis=1)

I get the following error:

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'

DataFrame.info() gives:

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 19 entries, 2015-08-31 to 2017-02-28
Freq: M
Columns: 767 entries, 2015-08-18 00:00:00 to 2017-09-12 00:00:00
dtypes: float64(767)
memory usage: 114.0 KB

All the column labels are of a date type.

Upvotes: 0

Views: 123

Answers (1)

jezrael

Reputation: 862601

It seems you need to convert the column names to datetimes first, so that the columns become a DatetimeIndex:

monthly_availability.columns = pd.to_datetime(monthly_availability.columns)
df = monthly_availability.groupby(pd.TimeGrouper(freq='M'), axis=1).sum()
print (df)
createdat   2015-08-31  2015-09-30  2015-10-31  2015-11-30  2015-12-31
createdat                                                             
2015-08-31         1.0         1.0         4.0         6.0         7.0
2015-09-30         NaN         NaN         2.0         2.0         2.0
2015-10-31         NaN         NaN         NaN         NaN         NaN
2015-11-30         NaN         NaN         NaN         NaN         NaN
2015-12-31         NaN         NaN         NaN         NaN         NaN

It is also possible to use pd.Grouper:

df = monthly_availability.groupby(pd.Grouper(freq='M'), axis=1).sum()
print (df)
createdat   2015-08-31  2015-09-30  2015-10-31  2015-11-30  2015-12-31
createdat                                                             
2015-08-31         1.0         1.0         4.0         6.0         7.0
2015-09-30         NaN         NaN         2.0         2.0         2.0
2015-10-31         NaN         NaN         NaN         NaN         NaN
2015-11-30         NaN         NaN         NaN         NaN         NaN
2015-12-31         NaN         NaN         NaN         NaN         NaN
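
As far as I know, newer pandas releases no longer ship pd.TimeGrouper and have deprecated groupby(..., axis=1), so here is a minimal sketch of the same idea that avoids both: convert the column labels to datetimes, transpose, group by the month of each label, and transpose back. The small frame below is a made-up miniature of the one in the question (its values are taken from the first two rows shown there), and the names months and by_month are only illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the frame in the question: the column labels
# are date strings (a plain Index), the rows use a month-end DatetimeIndex.
monthly_availability = pd.DataFrame(
    [[1.0, 1.0, 4.0, 6.0, 7.0],
     [np.nan, np.nan, 2.0, 2.0, 2.0]],
    index=pd.to_datetime(["2015-08-31", "2015-09-30"]),
    columns=["2015-08-18", "2015-08-19", "2015-09-09", "2015-09-10", "2015-09-11"],
)

# Step 1: turn the plain Index of labels into a DatetimeIndex.
monthly_availability.columns = pd.to_datetime(monthly_availability.columns)

# Step 2: build one monthly period per column to use as the grouping key.
months = monthly_availability.columns.to_period("M")

# Step 3: transpose so the dates are on the rows, group, sum, transpose back.
# min_count=1 keeps NaN for groups that contain no values at all
# (the default would return 0.0 for them).
by_month = monthly_availability.T.groupby(months).sum(min_count=1).T

print(by_month)
```

The result has one column per calendar month, labelled with periods such as 2015-08 rather than month-end timestamps. If month-end labels are preferred, by_month.columns.to_timestamp(how="end") is one possible way to convert them, though I have not checked how it behaves on every pandas version.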

Upvotes: 1
