Reputation: 4442
I have dataframe
ID 2016-01 2016-02 ... 2017-01 2017-02 ... 2017-10 2017-11 2017-12
111 12 34 0 12 3 0 0
222 0 32 5 5 0 0 0
I need to count every 12 columns and get
ID 2016 2017
111 46 15
222 32 10
I try to use
(df.groupby((np.arange(len(df.columns)) // 31) + 1, axis=1).sum().add_prefix('s'))
But it returns to all columns But when I try to use
df.groupby['ID']((np.arange(len(df.columns)) // 31) + 1, axis=1).sum().add_prefix('s'))
It returns
TypeError: 'method' object is not subscriptable
How can I fix that?
Upvotes: 3
Views: 5645
Reputation: 21
The above code has a slight syntax error and throws the following error:
ValueError: No axis named 1 for object type
Basically, the groupby condition needs to be wrapped by []
. So I'm rewriting the code correctly for convenience:
new_df = df.groupby([[i//n for i in range(0,m)]], axis = 1).sum()
where n
is the number of columns you want to group together and m
is the total number of columns being grouped. You have to rename the columns after that.
Upvotes: 2
Reputation: 11
If you don't mind losing the labels, you can try this:
new_df = df.groupby([i//n for i in range(0,m)], axis = 1).sum()
where n is the number of columns you want to group together and m is the total number of columns being grouped. You have to rename the columns after that.
Upvotes: 1
Reputation: 862591
First set_index
of all columns without dates:
df = df.set_index('ID')
1. groupby
by split
ed columns and selected first:
df = df.groupby(df.columns.str.split('-').str[0], axis=1).sum()
2. lambda
function for split:
df = df.groupby(lambda x: x.split('-')[0], axis=1).sum()
3. converted columns to datetimes and groupby
years:
df.columns = pd.to_datetime(df.columns)
df = df.groupby(df.columns.year, axis=1).sum()
4. resample
by years
:
df.columns = pd.to_datetime(df.columns)
df = df.resample('A', axis=1).sum()
df.columns = df.columns.year
print (df)
2016 2017
ID
111 46 15
222 32 10
Upvotes: 6