Reputation: 1855
Code to get the dataframe df
import pandas as pd
from io import StringIO
df = pd.read_csv(StringIO('qi_variable,qi_variable_type,value\nDEBIT_STUCCO_FT303A_KG_MIN,Time,2021-04-01 00:00:10\nDEBIT_STUCCO_FT303A_KG_MIN,Time,2021-04-01 00:00:30\nDEBIT_STUCCO_FT303A_KG_MIN,ValueY,"338,25"\nDEBIT_STUCCO_FT303A_KG_MIN,ValueY,"337,799987792969"\nDEBIT_EAU_MOUSSE_KG_MIN,Time,2021-04-01 00:00:10\nDEBIT_EAU_MOUSSE_KG_MIN,Time,2021-04-01 00:00:30\nDEBIT_EAU_MOUSSE_KG_MIN,ValueY,"55,1691627502441"\nDEBIT_EAU_MOUSSE_KG_MIN,ValueY,"55,3335952758789"\nCORRECTION_MOUSSE,Time,2021-04-01 00:04:12\nCORRECTION_MOUSSE,Time,2021-04-01 00:04:35\nCORRECTION_MOUSSE,ValueY,"1,04863631725311"\nCORRECTION_MOUSSE,ValueY,"1,04946064949036"\n'))
Current dataframe df
:
Expected Result: I am trying to pivot wide my table to a result as shown below. I tried pd.pivot
pd.pivot_table
but failed to get it working. Ignore the blank rows, I left those blank to improve the readability.
Upvotes: 2
Views: 353
Reputation: 761
df_1 = df[df.qi_variable_type=='Time'].rename(columns={'value': 'Time'})[['qi_variable', 'Time']].reset_index()
df_2 = df[df.qi_variable_type=='ValueY'].rename(columns={'value': 'ValueY'})[['qi_variable', 'ValueY']].reset_index()
df_1.join(df_2, lsuffix='_2')[['qi_variable', 'Time','ValueY']]
qi_variable Time ValueY
0 DEBIT_STUCCO_FT303A_KG_MIN 2021-04-01 00:00:10 338,25
1 DEBIT_STUCCO_FT303A_KG_MIN 2021-04-01 00:00:30 337,799987792969
2 DEBIT_EAU_MOUSSE_KG_MIN 2021-04-01 00:00:10 55,1691627502441
3 DEBIT_EAU_MOUSSE_KG_MIN 2021-04-01 00:00:30 55,3335952758789
4 CORRECTION_MOUSSE 2021-04-01 00:04:12 1,04863631725311
5 CORRECTION_MOUSSE 2021-04-01 00:04:35 1,04946064949036
Upvotes: 0
Reputation: 14949
One way:
df = df.pivot_table(index = 'qi_variable', columns = 'qi_variable_type', values = 'value', aggfunc= list).apply(pd.Series.explode)
Alternative:
df = df.pivot_table(index = ['qi_variable', df.groupby(['qi_variable','qi_variable_type']).cumcount()], columns = 'qi_variable_type', values = 'value', aggfunc= ''.join).reset_index(-1, drop=True)
One more alternative:
df = df.set_index(['qi_variable',df.groupby(['qi_variable', 'qi_variable_type']).cumcount(), 'qi_variable_type']).unstack(-1).reset_index(-1, drop=True)
OUTPUT:
qi_variable_type Time ValueY
qi_variable
CORRECTION_MOUSSE 2021-04-01 00:04:12 1,04863631725311
CORRECTION_MOUSSE 2021-04-01 00:04:35 1,04946064949036
DEBIT_EAU_MOUSSE_KG_MIN 2021-04-01 00:00:10 55,1691627502441
DEBIT_EAU_MOUSSE_KG_MIN 2021-04-01 00:00:30 55,3335952758789
DEBIT_STUCCO_FT303A_KG_MIN 2021-04-01 00:00:10 338,25
DEBIT_STUCCO_FT303A_KG_MIN 2021-04-01 00:00:30 337,799987792969
Upvotes: 5