Reputation: 143
I'm looking for a better code to transform my DataFrame
.
My DataFrame
looks like this:
Period LASTDATE PRICE VAT SUM CLIENT
0 2018Q1 31/3/2018 1 2 3 NAME
1 2018Q2 30/6/2018 2 2 4 NAME
2 2018Q3 30/9/2018 3 3 6 NAME
3 2018Q4 31/12/2018 4 4 8 NAME
I want to achieve this:
2018Q1 2018Q2 2018Q3 2018Q4 LASTDATE SUM
NAME 3 4 6 8 31/12/2018 21
So far I split the Period
column from the DataFrame
and make it a header. I believe there's a better way to code this. I searched the documentation of pandas and I believe, set_index()
might help me. Any ideas?
Upvotes: 2
Views: 501
Reputation: 2581
Unfortunately, jezrael's answer won't work if you have more than 1 client and it doesn't give you the client name as the index (as in your example output).
Try the following:
df1 = df.pivot(index='CLIENT', columns='Period', values='SUM')
df_agg = df.groupby('CLIENT').agg({'LASTDATE': 'last', 'SUM': 'sum'})
df_fin = pd.concat([df1, df_agg], axis=1)
In the first line, you reshape the data frame, but lose the LASTDATE value, which you bring back in with the second line (together with the total sum for each client) and then combine the data frames in the last line.
For an expanded imaginary data frame:
Period LASTDATE PRICE VAT SUM CLIENT
0 2018Q1 31/3/2018 1 2 3 NAME1
1 2018Q2 30/6/2018 2 2 4 NAME1
2 2018Q3 30/9/2018 3 3 6 NAME1
3 2018Q4 31/12/2018 4 4 8 NAME1
4 2018Q1 31/3/2018 2 2 1 NAME2
5 2018Q2 30/6/2018 4 2 2 NAME2
6 2018Q3 30/9/2018 6 3 3 NAME2
7 2018Q4 31/12/2018 8 4 4 NAME2
you should get the following:
2018Q1 2018Q2 2018Q3 2018Q4 LASTDATE SUM
CLIENT
NAME1 3 4 6 8 31/12/2018 21
NAME2 1 2 3 4 31/12/2018 10
Tested on pandas 0.24.1.
Upvotes: 4
Reputation: 862701
Create Series
by DataFrame.set_index
for index by Period
, convert to one column DataFrame
by Series.to_frame
, transpose by DataFrame.T
and last add new columns by DataFrame.assign
:
df1 = (df.set_index('Period')['SUM']
.to_frame()
.T
.rename_axis(None, axis=1)
.assign(LASTDATE = df['LASTDATE'].iat[-1], SUM = df['SUM'].sum()))
print (df1)
2018Q1 2018Q2 2018Q3 2018Q4 LASTDATE SUM
SUM 3 4 6 8 31/12/2018 21
Upvotes: 4