wrangler
wrangler

Reputation: 3576

Sum of values at common indexes in all subsets?

In a recent problem where i have to sum all values at common indexes in all possible subsets of size k in array of size n.

For eg: If

array ={1,2,3}

Its subsets (k=2) will be (x [i] , x [j]) where i < j

1 2
1 3
2 3
Sum:4,8

Firstly I have used recursion (same that of generating all subsets)

int sum_index[k]={0};
void sub_set(int array[],int n,int k,int temp[],int q=0,int r=0)
{
    if(q==k)
    {
       for(int i=0;i<k;i++)
          sum_index[i]+=temp[i];
    }
    else
    {
        for(int i=r;i<n;i++)
        {
            temp[q]=array[i];
            sub_set(value,n,k,temp,q+1,i+1);
        }
    }
}

Problem is its taking too much time then expected .

Then i modified it to...

void sub_set(int array[],int n,int k,int temp[],int q=0,int r=0)
{
    if(q==k)
    {
       return;
    }
    else
    {
        for(int i=r;i<n;i++)
        {
            temp[q]=array[i];
            sum_index[q]+=temp[q];  //or sum_index[q]+=s[i];
            sub_set(value,n,k,temp,q+1,i+1);
        }
    }
}

Still taking too much time!!

Is there any other approach to this problem?? Or any other modification i needed that i am unaware of??

Upvotes: 1

Views: 159

Answers (3)

deviantfan
deviantfan

Reputation: 11414

A solution in O(n^2) instead of O(n!):
First a tiny (:)) bit of explanation, then some code:

I´m going to assume here that your array is sorted (if not, use std::sort first). Additionally, I´m going to work with the array values 1,2,3,4... here, if you array consists arbitrary values (like 2 8 17), you´ll have to think of it as the indices (ie. 1=>2, 2=>8 etc.)

Definition: (x choose y) means the binomial coefficient, how it is calculated is in the link too. If you have an array size a and some k for the subset size, (a choose k) is the number of permutations, eg. 3 for your example: (1,2), (1,3) and (2,3).

You want the sum for each column if you write the permutations under each other, this would be easy if you knew for each column how many times each array element occurs, ie. how many 1´s, 2´s and 3´s for the first, and how many for second column (with k=2).

Here a bigger example to explain: (1,2,3,4,5) and all possible k´s (each in one block):

1
2
3
4
5

12
13
14
15
23
24
25
34
35
45

123
124
125
134
135
145
234
235
245
345

... (didn´t write k=4)

12345

Let´s introduce column indices, 0<=c<k, ie. c=0 means the first column, c=1 the second and so on; and the array size s=5.

So, looking eg. at the k=3-block, you´ll notice that the lines beginning with 1 (column c=0) have all permutations of the values (2,3,4,5) for k=2, more generally a value x in column c has all permutations for values x+1 to s after it. The values from from x+1 to s are s-x different values, and after column c there are k-c-1 more columns. So, for a value x, you can calculate ((s-x) choose (k-c-1)).
Additionally, the first column has only the values 1,2,3, the last two numbers are not here because after this column there are two more columns.

If you do this for the first column, it works well. Eg. with value 1 in the first column of k=3 above:
count(x) = ((s-x) choose (k-c-1)) = (4 choose 2) = 6
and indeed there are six 1 there. Calculate this count for every array value, multiply x*count(x), and sum it up for every x, that´s the result for the first column.

The other columns are a tiny bit harder, because there can be multiple "permutation blocks" of the same number. To start with, the step above needs a small adjustment: You need a muliplier array somewhere, one multiplier for each array value, and in the beginning each multiplier is 1. In the calculation x*count(x) above, take x*count(x)*muliplier(x) instead.

In the k=3-example, 1 in the first column can be followed by 2,3,4, 2 can be followed by 3,4, and 3 by 4. So the 3-based permutations of the second column need to be counted twice, and the 4-based even three times; more generally so many times like there are smaller values in the previos colums. Multiply that to the current multiplier.

... Some code:

#include<iostream>
#include<vector>
#include<algorithm>

using namespace std;

// factorial (x!)
unsigned long long fact(unsigned char x)
{
    unsigned long long res = 1;
    while(x)
    {
        res *= x;
        x--;
    }
    return res;
}

//binomial coefficient (n choose k)
unsigned long long binom(unsigned char n, unsigned char k)
{
    if(!n || !k) return 1;
    return (fact(n) / fact(k)) / fact(n-k);
}

//just for convenience
template<class T> void printvector(std::vector<T> data)
{
    for(auto l : data) cout << l << " ";
    cout << endl;
}

std::vector<unsigned long long> calculate(std::vector<int> data, int k)
{
    std::vector<unsigned long long> res(k, 0); //result data
    std::vector<unsigned long long> multiplier(data.size(), 1);
    if(k < 1 || k > 255 || data.size() < 1) return res; //invalid stuff
    std::sort(data.begin(), data.end()); //as described

    for(int column = 0; column < k; column++) //each column separately
    {
        //count what to multiply to the multiplier array later
        std::vector<unsigned long long> newmultiplier(data.size(), 0);

        //for each array element in this column
        for(int x = column; x <= (data.size() + column - k); x++)
        {
            //core calculation
            res[column] += data[x] * multiplier[x] * binom(data.size() - x - 1, k - column - 1);
            //counting the new multiplier factor
            for(int helper = x + 1; helper < data.size(); helper++)
                newmultiplier[helper]++;
        }
        //calculating new multiplier
        for(int x = 0; x < data.size(); x++)
        {
            if(newmultiplier[x])
                multiplier[x] *= newmultiplier[x];
        }
    }

    return res;
}

int main() {
    printvector(calculate({1,2,3}, 2)); //output 4 8
    return 0;
}

Upvotes: 1

Jarod42
Jarod42

Reputation: 217145

std::next_permutation may help:

std::vector<int> sub_set(const std::vector<int>& a, int k)
{
    std::vector<int> res(k, 0);
    std::vector<bool> p(a.size() - k, false);
    p.resize(a.size(), true);

    do
    {
        int index = 0;
        for (std::size_t i = 0; i != p.size(); ++i) {
            if (p[i]) {
                res[index++] += a[i];
            }
        }
    } while (std::next_permutation(p.begin(), p.end()));
    return res;
}

Live Demo

Upvotes: 0

IronMensan
IronMensan

Reputation: 6831

Instead of iterating through the possible sub-sets, think of it a combinatorics problem.

To use your example of k=2 and {1,2,3}, let's just look at the first value of the result. It has two 1's and one 2. The two 1's correspond to the number one element sets that can be made from {2, 3} and the one 2 corresponds to the number of one element sets that can be made from {3}. A similar arrangement exists for the one 2 and two 3's in the second element of the result and looking at the subsets of the elements that appear before the element being considered.

Things get a bit more complicated when k>2 because then you will have to look for the number of combinations of elements before and after the element being considered, but the basic premise still works. Multiply the number of possible subsets before times the number of subsets afterwards and that will tell you how many times each element contributes to the result.

Upvotes: 1

Related Questions