Eradash

Reputation: 422

Getting all combinations of K or fewer elements from a list of N elements, with a large K

I want to get all combinations of the elements in a list, for a result like this:

List: {1,2,3}

1
2
3
1,2
1,3
2,3

My problem is that I have 180 elements, and I want all combinations of up to 5 elements. In my tests with up to 4 elements it took a long time (2 minutes), but everything worked. With 5 elements, however, I get an out-of-memory exception.

My code presently is this:

public IEnumerable<IEnumerable<Rondin>> getPossibilites(List<Rondin> rondins)
{
    var combin5 = rondins.Combinations(5);
    var combin4 = rondins.Combinations(4);
    var combin3 = rondins.Combinations(3);
    var combin2 = rondins.Combinations(2);
    var combin1 = rondins.Combinations(1);

    return combin5.Concat(combin4).Concat(combin3).Concat(combin2).Concat(combin1).ToList();
}

Using this extension method (taken from this question: Algorithm to return all combinations of k elements from n):

public static IEnumerable<IEnumerable<T>> Combinations<T>(this IEnumerable<T> elements, int k)
{
    return k == 0
        ? new[] { new T[0] }
        : elements.SelectMany((e, i) =>
            elements.Skip(i + 1)
                    .Combinations(k - 1)
                    .Select(c => new[] { e }.Concat(c)));
}
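
For reference, Combinations itself is lazy: nothing is materialized until something enumerates it. A minimal sketch of consuming it one combination at a time, without ever building the full list (the filter and the 144 limit are stand-ins from my code below):

//Hypothetical usage: enumerate combinations one at a time instead of
//materializing them all with ToList(); only the current combination
//is alive at any given moment.
foreach (IEnumerable<Rondin> combo in rondins.Combinations(5))
{
    if (combo.Sum(r => r.longueur) < 144)
    {
        //... use combo, then let it be collected ...
    }
}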

I need to search the list for combinations whose elements add up to a value (within a certain precision), once for each element of another list. Here is all my code for this part:

var possibilites = getPossibilites(opt.rondins);

possibilites = possibilites.Where(p => p.Sum(r => r.longueur + traitScie) < 144);

foreach(BilleOptimisee b in opt.billesOptimisees)
{
    var proches = possibilites
        .Where(p => p.Sum(r => (r.longueur + traitScie)) < b.chute
                 && Math.Abs(b.chute - p.Sum(r => r.longueur)) - (p.Count() * 0.22) < 0.01)
        .OrderByDescending(p => p.Sum(r => r.longueur))
        .FirstOrDefault(); //ElementAt(0) would throw on an empty sequence instead of returning null

    if(proches != null)
    {
        foreach (Rondin r in proches)
        {
            opt.rondins.Remove(r);
            b.rondins.Add(r);
            possibilites = possibilites.Where(p => !p.Contains(r));
        }
    }
}

With the code I have, how can I limit the memory taken by my list? Or is there a better way to search a very big set of combinations?

Please, if my question is not good, tell me why and I will do my best to learn and ask better questions next time ;)

Upvotes: 0

Views: 265

Answers (3)

Eradash

Reputation: 422

I found my solution. I'm writing it here so that other people who have a problem similar to mine have something to work with...

I made a recursive function that looks for a fixed number of possibilities that fit the conditions. When that number of possibilities has been found, I return the list of possibilities, do some calculations with the results, and restart the process. I added a timer to stop the search when it takes too long. Since my condition is based on the sum of the elements, I only build possibilities from distinct values, and I ask for a small number of possibilities each time (like 1).

So the function returns a possibility with a very high precision; I do what I need to do with it, remove its elements from the original list, and call the function again with the same precision until nothing is returned, then I continue with another precision. Once several precisions have been processed, only about 30 elements are left in my list, so I can ask for all the remaining possibilities (that still fit the maximum sum), and that part is much easier than the beginning.

Here is my code (DistinctBy comes from the MoreLINQ package, or from LINQ itself in .NET 6+; timer is a DateTime field of the class):

public List<IEnumerable<Rondin>> getPossibilites(IEnumerable<Rondin> rondins, int nbElements, double minimum, double maximum, int instance = 0, double longueur = 0)
{
    if(instance == 0)
        timer = DateTime.Now;

    List<IEnumerable<Rondin>> liste = new List<IEnumerable<Rondin>>();

    //Get all distinct rondins that can fit into the maximal length
    foreach (Rondin r in rondins.Where(r => r.longueur < (maximum - longueur)).DistinctBy(r => r.longueur).OrderBy(r => r.longueur))
    {
        //Check the current length
        double longueur2 = longueur + r.longueur + traitScie;

        //If the current length is under the maximal length
        if (longueur2 < maximum)
        {
            //Get all the possibilities with all rondins except the current one, and add them to the list
            foreach (IEnumerable<Rondin> poss in getPossibilites(rondins.Where(rondin => rondin.id != r.id), nbElements - liste.Count, minimum, maximum, instance + 1, longueur2).Select(possibilite => possibilite.Concat(new Rondin[] { r })))
            {
                liste.Add(poss);
                if (liste.Count >= nbElements && nbElements > 0)
                    break;
            }

            //If the current length is higher than the minimum, add this rondin on its own to the list
            if (longueur2 >= minimum)
                liste.Add(new Rondin[] { r });
        }

        //If we have enough possibilities, we stop the search
        if (liste.Count >= nbElements && nbElements > 0)
            break;

        //If the search is taking too long, stop and return the list
        if (DateTime.Now.Subtract(timer).TotalSeconds > 30)
            break;
    }
    return liste;
}
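
For completeness, a hypothetical driver for this function, matching the workflow described above (the precision values and the 144 maximum are illustrative, not the exact values I use):

var precisions = new double[] { 0.01, 0.05, 0.25 };
foreach (double precision in precisions)
{
    while (true)
    {
        //Ask for a single possibility whose sum is within the current precision of the maximum
        var possibilites = getPossibilites(opt.rondins, 1, 144 - precision, 144);
        if (possibilites.Count == 0)
            break; //Nothing fits at this precision, move on to the next one

        foreach (Rondin r in possibilites[0])
            opt.rondins.Remove(r); //Consume the chosen rondins
        //... do the required calculations with possibilites[0] here ...
    }
}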

Upvotes: 0

Jay

Reputation: 3355

Let your precision be defined in the imaginary spectrum.

Use a real index to access the leaf and then traverse the leaf with the required precision.

See PrecisLise @ http://net7mma.codeplex.com/SourceControl/latest#Common/Collections/Generic/PrecicseList.cs

While the implementation as linked is not 100% complete, you can find where I used a similar concept here:

http://net7mma.codeplex.com/SourceControl/latest#RtspServer/MediaTypes/RFC6184Media.cs

Using this concept I was able to re-order H.264 access units and their underlying Network Abstraction Layer units in what I consider a very interesting way... beyond being interesting, it also has the potential to be more efficient while using close to the same amount of memory.

For example, 0 can be followed by 0.1 or 0.01 or 0.001; depending on the type of the key in the list (double, float, Vector, among others) you may have the added benefit of using the FPU, or even possibly intrinsics if supported by your processor, making sorting and indexing much faster than would be possible on ordinary sets regardless of the underlying storage mechanism.

Using this concept allows for very interesting ordering... especially if you provide a mechanism to filter the precision.

I was also able to find several bugs in the bit-stream parser of quite a few well known media libraries using this methodology...

Upvotes: 0

amit

Reputation: 178441

Your output list for the combinations of 5 elements alone will have C(180,5) ≈ 1.5*10^9 (that's billion with a b) sublists of size 5. Even with 32-bit integers, neglecting all list overhead and assuming a perfect list with 0 bytes of overhead, that is ~30GB!
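
For concreteness, a quick (illustrative) check of that count and of the raw payload size:

// C(180,5) computed incrementally; every intermediate value is itself
// a binomial coefficient, so each division is exact.
long combos = 1;
for (int i = 0; i < 5; i++)
    combos = combos * (180 - i) / (i + 1);

Console.WriteLine(combos);                   // 1488847536, i.e. ~1.5*10^9
Console.WriteLine(combos * 5 * sizeof(int)); // ~30GB of raw 32-bit payload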

You should reconsider whether you actually need to generate the list the way you do; an alternative is streaming the combinations - i.e. generating them on the fly.

That can be done by creating a function which takes the last combination as an argument and outputs the next one. (To see how it is done, think about incrementing a number by one: you go from the last digit to the first, remembering a "carry over", until you are done.)

A streaming example for choosing 2 out of 4:

start: {4,3}
curr = start {4, 3}
curr = next(curr) {4, 2} // reduce last by one 
curr = next(curr) {4, 1} // reduce last by one
curr = next(curr) {3, 2} // cannot reduce more, reduce the first by one, and set the follower to maximal possible value
curr = next(curr) {3, 1} // reduce last by one
curr = next(curr) {2, 1} // similar to {3,2}
done.

Now you need to figure out how to do it for lists of size 2, then generalize it to arbitrary size - and program your streaming combination generator.
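
A minimal sketch of such a generator in C#, following the decreasing-index scheme traced above (StreamCombinations is an illustrative name, not something from the question's code):

// Streams all k-element combinations of {1..n} as decreasing index
// arrays, deriving each combination from the previous one; only a
// single k-element buffer is ever alive.
public static IEnumerable<int[]> StreamCombinations(int n, int k)
{
    // Start at the maximal combination, e.g. {4,3} for n=4, k=2.
    int[] curr = new int[k];
    for (int i = 0; i < k; i++)
        curr[i] = n - i;

    while (true)
    {
        yield return (int[])curr.Clone();

        // Find the rightmost position that can still be reduced;
        // position i can go no lower than k - i.
        int pos = k - 1;
        while (pos >= 0 && curr[pos] == k - pos)
            pos--;

        if (pos < 0)
            yield break; // reached {k, k-1, ..., 1}: we are done

        // Reduce it by one (the "carry over"), and reset everything to
        // its right to the maximal possible decreasing run.
        curr[pos]--;
        for (int i = pos + 1; i < k; i++)
            curr[i] = curr[pos] - (i - pos);
    }
}

Used as foreach (int[] combo in StreamCombinations(180, 5)) { ... }, this holds only one k-element array at a time, so memory stays constant no matter how many combinations are enumerated; combinations of fewer than 5 elements can be obtained by also running k = 1 through 4.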

Good Luck!

Upvotes: 2
