Reputation: 7591
I'm trying to unpivot two columns inside a pandas dataframe. The transformation I seek would be the inverse of this question.
We start with a dataset that looks like this:
import pandas as pd
import numpy as np
df_orig = pd.DataFrame(data=np.random.randint(255, size=(4,5)),
columns=['accuracy','time_a','time_b','memory_a', 'memory_b'])
df_orig
accuracy time_a time_b memory_a memory_b
0 6 118 170 102 239
1 241 9 166 159 162
2 164 70 76 228 121
3 228 121 135 128 92
I wish to unpivot both themwmory
and time
columns, obtaining this dataset in result:
df
accuracy memory category time
0 6 102 a 118
1 241 159 a 9
2 164 228 a 70
3 228 128 a 121
12 6 239 b 170
13 241 162 b 166
14 164 121 b 76
15 228 92 b 135
So far I have managed to get my desired output using df.melt()
twice plus some extra commands:
df = df_orig.copy()
# Unpivot memory columns
df = df.melt(id_vars=['accuracy','time_a', 'time_b'],
value_vars=['memory_a', 'memory_b'],
value_name='memory',
var_name='mem_cat')
# Unpivot time columns
df = df.melt(id_vars=['accuracy','memory', 'mem_cat'],
value_vars=['time_a', 'time_b'],
value_name='time',
var_name='time_cat')
# Keep only the 'a'/'b' as categories
df.mem_cat = df.mem_cat.str[-1]
df.time_cat = df.time_cat.str[-1]
# Keeping only the colums whose categories match (DIRTY!)
df = df[df.mem_cat==df.time_cat]
# Removing the duplicated category column.
df = df.drop(columns='time_cat').rename(columns={"mem_cat":'category'})
Given how easy it was to solve the inverse question, I believe my code is way too complex. Can anyone do it better?
Upvotes: 2
Views: 493
Reputation: 862591
Use wide_to_long
:
np.random.seed(123)
df_orig = pd.DataFrame(data=np.random.randint(255, size=(4,5)),
columns=['accuracy','time_a','time_b','memory_a', 'memory_b'])
df = (pd.wide_to_long(df_orig.reset_index(),
stubnames=['time','memory'],
i='index',
j='category',
sep='_',
suffix='\w+')
.reset_index(level=1)
.reset_index(drop=True)
.rename_axis(None))
print (df)
category accuracy time memory
0 a 254 109 66
1 a 98 230 83
2 a 123 57 225
3 a 113 126 73
4 b 254 126 220
5 b 98 17 106
6 b 123 214 96
7 b 113 47 32
Upvotes: 3