Magotte
Magotte

Reputation: 143

How to set the values of a column as columns in dataframe?

I'm looking for a better code to transform my DataFrame. My DataFrame looks like this:

    Period  LASTDATE    PRICE   VAT SUM CLIENT
0   2018Q1  31/3/2018   1       2   3   NAME
1   2018Q2  30/6/2018   2       2   4   NAME
2   2018Q3  30/9/2018   3       3   6   NAME
3   2018Q4  31/12/2018  4       4   8   NAME

I want to achieve this:

    2018Q1  2018Q2  2018Q3  2018Q4  LASTDATE    SUM
NAME    3     4       6        8    31/12/2018  21

So far I split the Period column from the DataFrame and make it a header. I believe there's a better way to code this. I searched the documentation of pandas and I believe, set_index() might help me. Any ideas?

Upvotes: 2

Views: 501

Answers (2)

Arienrhod
Arienrhod

Reputation: 2581

Unfortunately, jezrael's answer won't work if you have more than 1 client and it doesn't give you the client name as the index (as in your example output).
Try the following:

df1 = df.pivot(index='CLIENT', columns='Period', values='SUM')
df_agg = df.groupby('CLIENT').agg({'LASTDATE': 'last', 'SUM': 'sum'})
df_fin = pd.concat([df1, df_agg], axis=1)

In the first line, you reshape the data frame, but lose the LASTDATE value, which you bring back in with the second line (together with the total sum for each client) and then combine the data frames in the last line.
For an expanded imaginary data frame:

   Period    LASTDATE  PRICE  VAT  SUM CLIENT
0  2018Q1   31/3/2018      1    2    3  NAME1
1  2018Q2   30/6/2018      2    2    4  NAME1
2  2018Q3   30/9/2018      3    3    6  NAME1
3  2018Q4  31/12/2018      4    4    8  NAME1
4  2018Q1   31/3/2018      2    2    1  NAME2
5  2018Q2   30/6/2018      4    2    2  NAME2
6  2018Q3   30/9/2018      6    3    3  NAME2
7  2018Q4  31/12/2018      8    4    4  NAME2

you should get the following:

        2018Q1  2018Q2  2018Q3  2018Q4    LASTDATE  SUM
CLIENT                                                 
NAME1        3       4       6       8  31/12/2018   21
NAME2        1       2       3       4  31/12/2018   10

Tested on pandas 0.24.1.

Upvotes: 4

jezrael
jezrael

Reputation: 862701

Create Series by DataFrame.set_index for index by Period, convert to one column DataFrame by Series.to_frame, transpose by DataFrame.T and last add new columns by DataFrame.assign:

df1 = (df.set_index('Period')['SUM']
         .to_frame()
         .T
         .rename_axis(None, axis=1)
         .assign(LASTDATE = df['LASTDATE'].iat[-1], SUM = df['SUM'].sum()))
print (df1)
     2018Q1  2018Q2  2018Q3  2018Q4    LASTDATE  SUM
SUM       3       4       6       8  31/12/2018   21

Upvotes: 4

Related Questions