Reputation: 831
Suppose the below simplified dataframe. (The actual df is much, much bigger.) How does one assign values to a new column f
such that f
is a function of another column (e.,g. e
)? I'm pretty sure one needs to use apply
or map
but never done this with a dataframe that has multiindex columns?
df = pd.DataFrame([[1,2,3,4], [5,6,7,8], [9,10,11,12], [13,14,15,16]])
df.columns = pd.MultiIndex.from_tuples((("a", "d"), ("a", "e"), ("b", "d"), ("b","e")))
df
a b
d e d e
0 1 2 3 4
1 5 6 7 8
2 9 10 11 12
3 13 14 15 16
Desired output:
a b
d e f d e f
0 1 2 1 3 4 1
1 5 6 1 7 8 -1
2 9 10 -1 11 12 -1
3 13 14 -1 15 16 -1
Would like to be able to apply the following lines and assign them to a new column f
. Two problems: First, the last line that contains the apply
doesn't work but hopefully my intent is clear. Second, I'm unsure how to assign values to a new column of a dataframe with a multi index column structure. Would like to be able use functional programming methods.
lt = df.loc(axis=1)[:,'e'] < 8
gt = df.loc(axis=1)[:,'e'] >= 8
conditions = [lt, gt]
choices = [1, -1]
df.loc(axis=1)[:,'f'] = df.loc(axis=1)[:,'e'].apply(np.select(conditions, choices))
Upvotes: 2
Views: 140
Reputation: 79318
nms = [(i, 'f')for i, j in df.columns if j == 'e']
df[nms] = (df.iloc[:, [j == 'e' for i, j in df.columns]] < 8) * 2 - 1
df = df.sort_index(axis=1)
df
a b
d e f d e f
0 1 2 1 3 4 1
1 5 6 1 7 8 -1
2 9 10 -1 11 12 -1
3 13 14 -1 15 16 -1
EDIT:
for a custom ordering:
d = {i:j for j, i in enumerate(df.columns.levels[0])}
df1 = df.loc[:, sorted(df.columns, key = lambda x: d[x[0]])]
IF the whole data is in a way symmetric, you could do:
df.stack(0).assign(f = lambda x: 2*(x.e < 8) - 1).stack().unstack([1,2])
Out[]:
a b
d e f d e f
0 1 2 1 3 4 1
1 5 6 1 7 8 -1
2 9 10 -1 11 12 -1
3 13 14 -1 15 16 -1
Upvotes: 4