Jay Bhie Santos

Reputation: 31

How to make a loop of random column combinations without repeating the combination in pandas dataframe?

I have a pandas dataframe with 6 columns (A, B, D, E, F, G). I want to pick random combinations of 4 of these columns (e.g. ABDE, ADEF, AEFG) without repeating a combination, and join each combination to my existing dataframe, which contains column 'C' (so one output would be, for example, CABDE). I want to generate all of the combinations this way and save each result as its own dataframe. This is my dfC (the dataframe with column C in it):

             C
0     0.439024
1     0.429268
2     0.429268
3     0.434146
4     0.439024
...
2203  0.346341
2204  0.341463
2205  0.331707
2206  0.312195
2207  0.390244

This is my df6 (the dataframe with columns A, B, D, E, F, G):

             A         B         D         E         F         G
0     0.043902  0.014634  0.356098  0.253659  0.117073  0.112195
1     0.043902  0.058537  0.375610  0.229268  0.141463  0.082927
2     0.058537  0.087805  0.400000  0.234146  0.141463  0.053659
3     0.068293  0.102439  0.429268  0.239024  0.146341  0.034146
4     0.082927  0.102439  0.468293  0.248780  0.151220  0.029268
...
2203  0.063415  0.068293  0.204878  0.312195  0.019512  0.053659
2204  0.053659  0.073171  0.195122  0.307317  0.019512  0.053659
2205  0.063415  0.073171  0.180488  0.302439  0.024390  0.043902
2206  0.073171  0.073171  0.160976  0.302439  0.034146  0.043902
2207  0.092683  0.087805  0.097561  0.287805  0.043902  0.053659

This is my code to get a randomized combination of columns from df6:

dfR = df4.sample(n=4, axis='columns')

This is how I join the dataframe with column C to the sampled columns (dfR):

dfC.join(dfR)

This is the sample output:

             C         D         A         F         B
0     0.439024  0.356098  0.043902  0.117073  0.014634
1     0.429268  0.375610  0.043902  0.141463  0.058537
2     0.429268  0.400000  0.058537  0.141463  0.087805
3     0.434146  0.429268  0.068293  0.146341  0.102439
4     0.439024  0.468293  0.082927  0.151220  0.102439
...
2203  0.346341  0.204878  0.063415  0.019512  0.068293
2204  0.341463  0.195122  0.053659  0.019512  0.073171
2205  0.331707  0.180488  0.063415  0.024390  0.073171
2206  0.312195  0.160976  0.073171  0.034146  0.073171
2207  0.390244  0.097561  0.092683  0.043902  0.087805

But I want to get all of the combinations and save each one as a dataframe. There are 15 combinations, which means 15 new dataframes.

Upvotes: 2

Views: 95

Answers (1)

jezrael

Reputation: 863291

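You can create a dictionary of DataFrames, one for each combination of column names:

from itertools import combinations

cols = ['A', 'B', 'D', 'E', 'F', 'G']
# or take the column names from the DataFrame
cols = df4.columns

# combinations() yields tuples; convert each to a list to select several columns.
# The 'C' prefix gives keys such as 'CABDE'.
d = {"C" + "".join(tup): dfC.join(df4[list(tup)]) for tup in combinations(cols, 4)}

print(d['CABDE'])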
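
If the combinations should also be visited in random order, as the question title suggests, one option is to shuffle the full list of combinations before looping. The sketch below is self-contained and uses tiny made-up frames in place of the real dfC and df4; the values and the name `frames` are assumptions for illustration only. Because `combinations` yields each 4-column subset exactly once, no combination can repeat.

```python
import random
from itertools import combinations

import pandas as pd

# Tiny made-up frames standing in for dfC and df4 (illustrative values only).
dfC = pd.DataFrame({"C": [0.1, 0.2, 0.3]})
df4 = pd.DataFrame({name: [i, i + 1, i + 2] for i, name in enumerate("ABDEFG")})

# Each 4-column subset appears exactly once in this list.
combos = list(combinations(df4.columns, 4))
random.shuffle(combos)  # random visiting order, still no duplicates

frames = {}
for combo in combos:
    key = "C" + "".join(combo)                # e.g. 'CABDE'
    frames[key] = dfC.join(df4[list(combo)])  # list, not tuple, selects columns

print(len(frames))
print(frames["CABDE"].columns.tolist())
```

With 6 columns taken 4 at a time, the dictionary ends up with 15 DataFrames, one per key.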

Upvotes: 1
