Aman J
Aman J

Reputation: 1855

How to custom pivot a table in pandas

Code to get the dataframe df

import pandas as pd
from io import StringIO
df = pd.read_csv(StringIO('qi_variable,qi_variable_type,value\nDEBIT_STUCCO_FT303A_KG_MIN,Time,2021-04-01 00:00:10\nDEBIT_STUCCO_FT303A_KG_MIN,Time,2021-04-01 00:00:30\nDEBIT_STUCCO_FT303A_KG_MIN,ValueY,"338,25"\nDEBIT_STUCCO_FT303A_KG_MIN,ValueY,"337,799987792969"\nDEBIT_EAU_MOUSSE_KG_MIN,Time,2021-04-01 00:00:10\nDEBIT_EAU_MOUSSE_KG_MIN,Time,2021-04-01 00:00:30\nDEBIT_EAU_MOUSSE_KG_MIN,ValueY,"55,1691627502441"\nDEBIT_EAU_MOUSSE_KG_MIN,ValueY,"55,3335952758789"\nCORRECTION_MOUSSE,Time,2021-04-01 00:04:12\nCORRECTION_MOUSSE,Time,2021-04-01 00:04:35\nCORRECTION_MOUSSE,ValueY,"1,04863631725311"\nCORRECTION_MOUSSE,ValueY,"1,04946064949036"\n'))

Current dataframe df:

This is my data

Expected Result: I am trying to pivot wide my table to a result as shown below. I tried pd.pivot pd.pivot_table but failed to get it working. Ignore the blank rows, I left those blank to improve the readability.

enter image description here

Upvotes: 2

Views: 353

Answers (2)

Mateusz Dorobek
Mateusz Dorobek

Reputation: 761

df_1 = df[df.qi_variable_type=='Time'].rename(columns={'value': 'Time'})[['qi_variable', 'Time']].reset_index()
df_2 = df[df.qi_variable_type=='ValueY'].rename(columns={'value': 'ValueY'})[['qi_variable', 'ValueY']].reset_index()
df_1.join(df_2, lsuffix='_2')[['qi_variable', 'Time','ValueY']]
    qi_variable                 Time                  ValueY
0   DEBIT_STUCCO_FT303A_KG_MIN  2021-04-01 00:00:10   338,25
1   DEBIT_STUCCO_FT303A_KG_MIN  2021-04-01 00:00:30   337,799987792969
2   DEBIT_EAU_MOUSSE_KG_MIN     2021-04-01 00:00:10   55,1691627502441
3   DEBIT_EAU_MOUSSE_KG_MIN     2021-04-01 00:00:30   55,3335952758789
4   CORRECTION_MOUSSE           2021-04-01 00:04:12   1,04863631725311
5   CORRECTION_MOUSSE           2021-04-01 00:04:35   1,04946064949036

Upvotes: 0

Nk03
Nk03

Reputation: 14949

One way:

df = df.pivot_table(index = 'qi_variable', columns = 'qi_variable_type', values = 'value', aggfunc= list).apply(pd.Series.explode)

Alternative:

df = df.pivot_table(index = ['qi_variable', df.groupby(['qi_variable','qi_variable_type']).cumcount()], columns = 'qi_variable_type', values = 'value', aggfunc= ''.join).reset_index(-1, drop=True)

One more alternative:

df = df.set_index(['qi_variable',df.groupby(['qi_variable', 'qi_variable_type']).cumcount(), 'qi_variable_type']).unstack(-1).reset_index(-1, drop=True)

OUTPUT:

qi_variable_type                           Time            ValueY
qi_variable                                                      
CORRECTION_MOUSSE           2021-04-01 00:04:12  1,04863631725311
CORRECTION_MOUSSE           2021-04-01 00:04:35  1,04946064949036
DEBIT_EAU_MOUSSE_KG_MIN     2021-04-01 00:00:10  55,1691627502441
DEBIT_EAU_MOUSSE_KG_MIN     2021-04-01 00:00:30  55,3335952758789
DEBIT_STUCCO_FT303A_KG_MIN  2021-04-01 00:00:10            338,25
DEBIT_STUCCO_FT303A_KG_MIN  2021-04-01 00:00:30  337,799987792969

Upvotes: 5

Related Questions