Kev

Reputation: 25

Shuffling/permuting columns within subset of column values pandas

I have the following DataFrame with columns A,B,C,D,E:

A    B    C    D    E
a_0  b_0  c_0  1    2
a_0  b_1  c_1  3    4
a_0  b_2  c_2  5    6
a_1  b_1  c_2  7    8
a_1  b_3  c_0  9    10
a_1  b_0  c_3  11   12

How can I permute only the columns D and E within each group of values in column A? For example, I am looking for a permutation like the following:

A    B    C    D    E
a_0  b_0  c_0  3    4
a_0  b_1  c_1  5    6
a_0  b_2  c_2  1    2
a_1  b_1  c_2  11   12
a_1  b_3  c_0  9    10
a_1  b_0  c_3  7    8

Here the columns A, B, C remain as they are, while the (D, E) value pairs are shuffled only among the rows that share the same value in column A.

Upvotes: 1

Views: 39

Answers (1)

Henry Ecker

Reputation: 35646

Try set_index + groupby sample with frac=1, which returns every row of each group in random order:

import pandas as pd

df = pd.DataFrame({
    'A': ['a_0', 'a_0', 'a_0', 'a_1', 'a_1', 'a_1'],
    'B': ['b_0', 'b_1', 'b_2', 'b_1', 'b_3', 'b_0'],
    'C': ['c_0', 'c_1', 'c_2', 'c_2', 'c_0', 'c_3'],
    'D': [1, 3, 5, 7, 9, 11],
    'E': [2, 4, 6, 8, 10, 12]
})

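# Shuffle the (D, E) rows within each group of A. The sampled rows come back
# group by group, so assigning the raw values back by position assumes df is
# already sorted by A (as it is here).
df[['D', 'E']] = (
    df.set_index('A')[['D', 'E']]
    .groupby(level=0)
    .sample(frac=1)
    .to_numpy()
)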

print(df)

Possible df:

     A    B    C   D   E
0  a_0  b_0  c_0   3   4
1  a_0  b_1  c_1   1   2
2  a_0  b_2  c_2   5   6
3  a_1  b_1  c_2   7   8
4  a_1  b_3  c_0  11  12
5  a_1  b_0  c_3   9  10
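
The assignment above writes the sampled values back by position, so it relies on the rows of each A group being contiguous and in sorted order. If that does not hold for your data, one possible variant is to permute the row positions of each group explicitly. This is only a sketch: the names shuffled, rng and cols, the fixed seed, and the reordered sample frame are my own choices for illustration and are not part of the question.

```python
import numpy as np
import pandas as pd

# Same data as in the question, but deliberately not sorted by A
df = pd.DataFrame({
    'A': ['a_1', 'a_0', 'a_1', 'a_0', 'a_0', 'a_1'],
    'B': ['b_1', 'b_0', 'b_3', 'b_1', 'b_2', 'b_0'],
    'C': ['c_2', 'c_0', 'c_0', 'c_1', 'c_2', 'c_3'],
    'D': [7, 1, 9, 3, 5, 11],
    'E': [8, 2, 10, 4, 6, 12],
})

rng = np.random.default_rng(0)  # fixed seed (assumption), only for repeatable runs
cols = df.columns.get_indexer(['D', 'E'])  # integer positions of D and E

shuffled = df.copy()
# groupby(...).indices maps each value of A to the integer positions of its rows
for positions in df.groupby('A').indices.values():
    # Write this group's (D, E) rows back into the same slots in permuted order
    shuffled.iloc[positions, cols] = df.iloc[rng.permutation(positions), cols].to_numpy()

print(shuffled)
```

Each (D, E) pair stays together and never leaves its A group, whatever the row order of the frame. Drop the seed if you want a different permutation on every run.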

Upvotes: 1
