Reputation: 47
Good day. I have been looking for a way to compute the sum of every possible combination of the elements of an array in Python, but have not found a real solution so far.
Most questions on this topic only ask about pairwise combinations.
Given a, b, c, d and e as records:
a+b
a+c
a+d
a+e
b+c
b+d
b+e
c+d
c+e
d+e
The combinations I actually need are:
a+b
a+c
a+d
a+e
a+b+c
a+b+d
a+b+e
a+c+d
a+c+e
a+d+e
a+b+c+d
a+b+c+e
a+c+d+e
a+b+c+d+e
b+c
b+d
b+e
b+c+d
b+c+e
b+d+e
c+d
c+e
c+d+e
d+e
So the result I need is the following, but so far I have not found a way to iterate that works for an array of any size:
| COMBINATION | SUM |
|-------------|--------------|
|a+b | XXX|
|a+c | XXX|
|a+d | XXX|
|a+e | XXX|
|a+b+c | XXX|
|a+b+d | XXX|
|a+b+e | XXX|
|a+c+d | XXX|
|a+c+e | XXX|
|a+d+e | XXX|
|a+b+c+d | XXX|
|a+b+c+e | XXX|
|a+c+d+e | XXX|
|a+b+c+d+e | XXX|
|b+c | XXX|
|b+d | XXX|
|b+e | XXX|
|b+c+d | XXX|
|b+c+e | XXX|
|b+d+e | XXX|
|c+d | XXX|
|c+e | XXX|
|c+d+e | XXX|
|d+e | XXX|
Note that the combination size is variable and must go all the way up to the total number of records in the array. Does anyone have any ideas, ideally with the mathematical algorithm that is actually being applied here?
Upvotes: 2
Views: 1458
Reputation: 7863
You can try the following:
import pandas as pd
from itertools import product
a = [2, 6, 7]
df = pd.DataFrame(product([0, 1], repeat=len(a)), columns=a)
df['sum'] = df[a] @ a
print(df)
This gives:
   2  6  7  sum
0  0  0  0    0
1  0  0  1    7
2  0  1  0    6
3  0  1  1   13
4  1  0  0    2
5  1  0  1    9
6  1  1  0    8
7  1  1  1   15
The first three columns of the dataframe are labeled with the elements of the list `a`. A value of 1 in a column indicates that the corresponding element of `a` is used in the sum, and 0 that it is not used. The last column gives the sum of the elements of `a` selected in this way.
If you are interested only in sums that involve at least two elements of `a`, you can select the relevant rows of the dataframe using `df[df[a].sum(axis=1) >= 2]`.
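As a rough check on how many rows that filter should keep (and as the counting rule behind the question), the number of subsets with at least two elements of an n-element list is the total number of subsets minus the empty set and the n single-element subsets:

$$\sum_{k=2}^{n} \binom{n}{k} = 2^n - \binom{n}{0} - \binom{n}{1} = 2^n - n - 1$$

For the three-element list used here that is 8 - 3 - 1 = 4 rows. For the five records a to e in the question it is 32 - 5 - 1 = 26, two more than the 24 combinations listed there (a+b+d+e and b+c+d+e do not appear in that list).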
Edit. The code above produces a dataframe with `2**len(a)` rows. For larger values of `len(a)` this will use a lot of memory. In such cases, it may be better to produce the sums one by one in a loop instead of building a dataframe with all of them. This can be done, for example, as follows:
from itertools import product
import numpy as np
a = [2, 6, 7]
arr = np.array(a)
for p in product([0, 1], repeat=len(a)):
    # p is a 0/1 tuple; p @ arr is the sum of the selected elements
    print(p, p @ arr)
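If only the sums over two or more elements are wanted, the same loop can skip the other selections as it goes. The following is a minimal sketch of that idea using only the standard library; the helper name `subset_sums` and its `min_size` parameter are made up for this illustration and are not part of the code above.

```python
from itertools import product


def subset_sums(values, min_size=2):
    """Yield (mask, total) for every selection of at least min_size values.

    mask is a tuple of 0/1 flags, one per element of values.
    """
    for mask in product([0, 1], repeat=len(values)):
        # sum(mask) counts how many elements this selection uses
        if sum(mask) >= min_size:
            total = sum(v for v, used in zip(values, mask) if used)
            yield mask, total


# Same example list as above
for mask, total in subset_sums([2, 6, 7]):
    print(mask, total)
```

Because it is a generator, only one selection is held in memory at a time, although the loop still visits all `2**len(values)` masks.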
Upvotes: 1
Reputation: 11240
import itertools

records = [2, 6, 7]  # example values only; substitute your own records

# Look at all subsets from size 2 up to the whole thing
for size in range(2, len(records) + 1):
    # Iterate through all subsets of size "size"
    for subset in itertools.combinations(records, size):
        # subset is a tuple of records; here we just print it with its sum
        print(subset, sum(subset))
Edited: Accidentally mistyped `combinations` as `combination`. Sorry.
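To get something shaped like the COMBINATION / SUM table in the question, the same loop can work on labels and look the values up when summing. This is only a sketch: the function name `labelled_sums` and the values assigned to a through e are assumptions made for the example, since the question does not give actual numbers.

```python
from itertools import combinations


def labelled_sums(records):
    """Return (label, total) pairs for every subset of two or more records.

    records maps a label (e.g. "a") to a numeric value.
    """
    names = list(records)
    rows = []
    for size in range(2, len(names) + 1):
        for subset in combinations(names, size):
            label = "+".join(subset)
            total = sum(records[name] for name in subset)
            rows.append((label, total))
    return rows


# Hypothetical values for the records a..e from the question
records = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
rows = labelled_sums(records)

print("| COMBINATION | SUM |")
print("|-------------|-----|")
for label, total in rows:
    print(f"| {label:<11} | {total:>3} |")
```

The rows come out grouped by subset size (all pairs first, then triples, and so on) rather than grouped by first element as in the question; sorting `rows` by label afterwards would change that if the order matters.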
Upvotes: 3