Reputation: 31
I have a pandas dataframe that has 6 columns (A, B, D, E, F, G). I want to select combinations of 4 of those columns (e.g. ABDE, ADEF, AEFG) and add each combination to my existing dataframe, which contains column 'C' (so the output would be, for example, CABDE). I want to do this for all of the combinations and save each result as its own dataframe. This is my dfC (the dataframe with column C in it):
C
0 0.439024
1 0.429268
2 0.429268
3 0.434146
4 0.439024
...
2203 0.346341
2204 0.341463
2205 0.331707
2206 0.312195
2207 0.390244
This is my df6 (the dataframe with columns A, B, D, E, F, G):
A B D E F G
0 0.043902 0.014634 0.356098 0.253659 0.117073 0.112195
1 0.043902 0.058537 0.375610 0.229268 0.141463 0.082927
2 0.058537 0.087805 0.400000 0.234146 0.141463 0.053659
3 0.068293 0.102439 0.429268 0.239024 0.146341 0.034146
4 0.082927 0.102439 0.468293 0.248780 0.151220 0.029268
...
2203 0.063415 0.068293 0.204878 0.312195 0.019512 0.053659
2204 0.053659 0.073171 0.195122 0.307317 0.019512 0.053659
2205 0.063415 0.073171 0.180488 0.302439 0.024390 0.043902
2206 0.073171 0.073171 0.160976 0.302439 0.034146 0.043902
2207 0.092683 0.087805 0.097561 0.287805 0.043902 0.053659
This is my code to get a random combination of 4 columns from df6 (named df4 in my code):
dfR = df4.sample(n=4, axis='columns')
This is how I join the dataframe with column C to the sampled columns (dfR):
dfC.join(dfR)
This is the sample output:
C D A F B
0 0.439024 0.356098 0.043902 0.117073 0.014634
1 0.429268 0.375610 0.043902 0.141463 0.058537
2 0.429268 0.400000 0.058537 0.141463 0.087805
3 0.434146 0.429268 0.068293 0.146341 0.102439
4 0.439024 0.468293 0.082927 0.151220 0.102439
...
2203 0.346341 0.204878 0.063415 0.019512 0.068293
2204 0.341463 0.195122 0.053659 0.019512 0.073171
2205 0.331707 0.180488 0.063415 0.024390 0.073171
2206 0.312195 0.160976 0.073171 0.034146 0.073171
2207 0.390244 0.097561 0.092683 0.043902 0.087805
But I want to get all of the combinations and save each one as a dataframe. Choosing 4 of the 6 columns gives 15 combinations, which means 15 new dataframes.
Upvotes: 2
Views: 95
Reputation: 863291
You can create a dictionary of DataFrames, one for each combination of column names:
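from itertools import combinations

# list the column names explicitly
cols = ['A','B','D','E','F','G']
# or take them from the DataFrame
cols = df4.columns
# combinations() yields tuples; pandas treats a tuple as a single key,
# so convert each one to a list before selecting columns.
# Prefix the key with "C" so the names match the question (e.g. 'CABDE').
d = {"C" + "".join(tup): dfC.join(df4[list(tup)]) for tup in combinations(cols, 4)}
print(d['CABDE'])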
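To try the idea without the original data, here is a minimal self-contained sketch. The 5-row random frames are hypothetical stand-ins for the question's dfC and df4, and the name `frames` is only illustrative; the approach itself (itertools.combinations plus DataFrame.join) is the same as above.

```python
from itertools import combinations

import numpy as np
import pandas as pd

# Hypothetical stand-in data: 5 random rows instead of the 2208 in the question.
rng = np.random.default_rng(0)
dfC = pd.DataFrame({"C": rng.random(5)})
df4 = pd.DataFrame(rng.random((5, 6)), columns=list("ABDEFG"))

# One joined DataFrame per 4-column combination, keyed by "C" + the column names.
# list(combo) is needed because pandas treats a tuple as a single column key.
frames = {
    "C" + "".join(combo): dfC.join(df4[list(combo)])
    for combo in combinations(df4.columns, 4)
}

print(len(frames))                       # 15
print(frames["CABDE"].columns.tolist())  # ['C', 'A', 'B', 'D', 'E']
```

Each value in the dictionary is an ordinary DataFrame, so you can loop over `frames.items()` to process or export them one by one.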
Upvotes: 1