Storing list of lists efficiently (sieving)

Question

I'm looking for a data structure for a project in C to store a list of lists. I need to be able to access the n-th list given just n (the lists will be accessed out of order). Each individual list contains between 1 and M integers (say M = 25 for concreteness); the outer list contains N of these. On average the lists are much closer to length 1 than to length M: in my example, only 20% have between 5 and 25 elements.

The obvious implementation is an array of length N*M, but this wastes space, and for performance reasons it's important that the structure not take up too much memory. What is a good way to do this?
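For reference, a minimal sketch of that dense N*M baseline, so the memory cost is concrete. The names (`dense_table`, `dense_bytes`, `dense_push`, and so on) are hypothetical. The 64-bit entry type and the use of 0 as an "empty slot" marker are assumptions for illustration, not part of the original design.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum { M = 25 };  /* maximum list length, from the example above */

/* Dense N x M table: row n holds list n, padded with zeros.
   Assumes 0 is never a valid entry (true for prime factors). */
typedef struct {
    size_t    n_lists;  /* N */
    uint64_t *slots;    /* N * M entries */
} dense_table;

/* Bytes the dense layout needs for n_lists lists; 0 if that overflows. */
size_t dense_bytes(size_t n_lists) {
    size_t per_row = (size_t)M * sizeof(uint64_t);
    if (n_lists > SIZE_MAX / per_row) return 0;
    return n_lists * per_row;
}

/* Allocate a zeroed table; returns 0 on success, -1 on failure. */
int dense_init(dense_table *t, size_t n_lists) {
    if (dense_bytes(n_lists) == 0) return -1;
    t->slots = calloc(n_lists, (size_t)M * sizeof(uint64_t));
    if (t->slots == NULL) return -1;
    t->n_lists = n_lists;
    return 0;
}

void dense_free(dense_table *t) {
    free(t->slots);
    t->slots = NULL;
    t->n_lists = 0;
}

/* O(1) random access: pointer to the M slots of list n. */
uint64_t *dense_row(const dense_table *t, size_t n) {
    return t->slots + n * (size_t)M;
}

/* Length of list n: number of slots before the first zero. */
size_t dense_len(const dense_table *t, size_t n) {
    const uint64_t *row = dense_row(t, n);
    size_t len = 0;
    while (len < (size_t)M && row[len] != 0) len++;
    return len;
}

/* Append value to list n; returns -1 if the list is already full. */
int dense_push(dense_table *t, size_t n, uint64_t value) {
    size_t len = dense_len(t, n);
    if (len == (size_t)M) return -1;
    dense_row(t, n)[len] = value;
    return 0;
}
```

With 8-byte entries this layout costs 200 bytes per list regardless of how many entries the list actually holds, which is the waste the question is about.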

Context

I'm writing a factorization sieve. The outer array represents the numbers from Sb + 1 to S(b+1), and each inner list stores the prime factors of one number in that range. The smaller the structure, the larger S can be chosen, which reduces the number of (expensive) divisions.

This also gives another avenue for optimization: store only primes greater than or equal to L. The benefit is that instead of needing floor(log_2(x)) elements in each list, where x is the largest number in the range, you need only floor(log_L(x)). (The example above corresponds to x = 10^12, L = 3.) The downside is that to reconstruct the factorization one needs to do trial division for primes below L.
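The bound floor(log_L(x)) can be checked with integer arithmetic alone. The helper below is a small sketch (the name `max_list_len` is hypothetical) that counts how many times L can be multiplied together without exceeding x.

```c
#include <stdint.h>

/* Largest k with base^k <= x, i.e. floor(log_base(x)).
   Requires base >= 2 and x >= 1. */
unsigned max_list_len(uint64_t x, uint64_t base) {
    unsigned k = 0;
    uint64_t p = 1;  /* invariant: p == base^k and p <= x */
    /* "p * base <= x" written as "p <= x / base" to avoid overflow */
    while (p <= x / base) {
        p *= base;
        k++;
    }
    return k;
}
```

For x = 10^12 this gives 39 slots for L = 2, 25 for L = 3 (matching M = 25 above), and 17 for L = 5.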

In my application each factorization is reconstructed once, so increasing L to the next prime costs (somewhat more than) 10^12 additional divisions in my example; as an order of magnitude, this is 24 to 87 ops each, or 2 to 8 hours in total on a 3 GHz K10. The more efficient the memory structure, the fewer 2 to 8 hour chunks I'll need to spend. (On the flip side, memory structures that take too much CPU work aren't worth it unless they provide a better tradeoff.)
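The 2 to 8 hour figure follows from simple arithmetic, sketched below. This assumes "ops" means clock cycles and that the divisions run back to back on one core; the function name `hours_for_divisions` is hypothetical.

```c
/* Wall-clock hours for `divisions` divisions that each take
   `cycles_each` cycles on a CPU running at `clock_hz` Hz. */
double hours_for_divisions(double divisions, double cycles_each,
                           double clock_hz) {
    double seconds = divisions * cycles_each / clock_hz;
    return seconds / 3600.0;
}
```

With 10^12 divisions at 3 GHz, 24 cycles each comes to 8,000 seconds (about 2.2 hours) and 87 cycles each comes to 29,000 seconds (about 8.1 hours).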
