Asthor
Asthor

Reputation: 979

Efficient way of altering data in an array with threads

I've been trying to figure out the most efficient way where many threads are altering a very big byte array on bit level. For ease of explaining I'll base the question around a multithreaded Sieve of Eratosthenes to ease explaining the question. The code though should not be expected to fully completed as I'll omit certain parts that aren't directly related. The sieve also wont be fully optimised as thats not the direct question. The sieve will work in such a way that it saves which values are primes in a byte array, where each byte contains 7 numbers (we can't alter the first bit due to all things being signed).

Lets say our goal is to find all the primes below 1 000 000 000 (1 billion). As a result we would need an byte array of length 1 000 000 000 / 7 +1 or 142 857 143 (About 143 million).

class Prime {
    int max = 1000000000;    
    byte[] b = new byte[(max/7)+1];

    Prime() {
        for(int i = 0; i < b.length; i++) {
            b[i] = (byte)127;  //Setting all values to 1 at start
        }
        findPrimes();
    }

    /*
     * Calling remove will set the bit value associated with the number
     * to 0 signaling that isn't an prime
     */
    void remove(int i) {
        int j = i/7; //gets which array index to access
        b[j] = (byte) (b[j] & ~(1 << (i%7)));
    }

    void findPrimes() {
        remove(1); //1 is not a prime and we wanna remove it from the start
        int prime = 2;
        while (prime*prime < max) {
            for(int i = prime*2; i < max; i = prime + i) {
                remove(i);
            }
            prime = nextPrime(prime); //This returns the next prime from the list
        }
    }

... //Omitting code, not relevant to question
}

Now we got a basic outline where something runs through all numbers for a certain mulitplication table and calls remove to remove numbers set bits that fits the number to 9 if we found out they aren't primes.

Now to up the ante we create threads that do the checking for us. We split the work so that each takes a part of the removing from the table. So for example if we got 4 threads and we are running through the multiplication table for the prime 2, we would assign thread 1 all in the 8 times tables with an starting offset of 2, that is 4, 10, 18, ...., the second thread gets an offset of 4, so it goes through 6, 14, 22... and so on. They then call remove on the ones they want.

Now to the real question. As most can see that while the prime is less than 7 we will have multiple threads accessing the same array index. While running through 2 for example we will have thread 1, thread 2 and thread 3 will all try to access b[0] to alter the byte which causes an race condition which we don't want.

The question therefore is, whats the best way of optimising access to the byte array.

So far the thoughts I've had for it are:

  1. Putting synchronized on the remove method. This obviously would be very easy to implement but an horrible ideas as it would remove any type of gain from having threads.
  2. Create an mutex array equal in size to the byte array. To enter an index one would need the mutex on the same index. This Would be fairly fast but would require another very big array in memory which might not be the best way to do it
  3. Limit the numbers stored in the byte to prime number we start running on. So if we start on 2 we would have numbers per array. This would however increase our array length to 500 000 000 (500 million).

Are there other ways of doing this in a fast and optimal way without overusing the memory?

(This is my first question here so I tried to be as detailed and thorough as possible but I would accept any comments on how I can improve the question - to much detail, needs more detail etc.)

Upvotes: 2

Views: 214

Answers (2)

rdllopes
rdllopes

Reputation: 5120

I think you could avoid synchronizing threads...

For example, this task:

for(int i = prime*2; i < max; i = prime + i) {
            remove(i);
}

it could be partitioned in small tasks.

for (int i =0; i < thread_poll; i++){
    int totalPos =  max/8; // dividing virtual array in bytes
    int partitionSize  = totalPos /thread_poll; // dividing bytes by thread poll
    removeAll(prime, partitionSize*i*8,  (i + 1)* partitionSize*8);
}
....

// no colisions!!!

void removeAll(int prime, int initial; int max){
    k = initial / prime;
    if (k < 2) k = 2;
    for(int i = k * prime; i < max; i = i + prime) {
        remove(i);
    }
}

Upvotes: 1

Sneftel
Sneftel

Reputation: 41474

You can use an array of atomic integers for this. Unfortunately there isn't a getAndAND, which would be ideal for your remove() function, but you can CAS in a loop:

java.util.concurrent.atomic.AtomicIntegerArray aia;

....

void remove(int i) {
    int j = i/32; //gets which array index to access
    do {
        int oldVal = aia.get(j);
        int newVal = oldVal & ~(1 << (i%32));
        boolean updated = aia.weakCompareAndSet(j, oldVal, newVal);
    } while(!updated);
}

Basically you keep trying to adjust the slot to remove that bit, but you only succeed if nobody else modifies it out from under you. Safe, and likely to be very efficient. weakCompareAndSet is basically an abstracted Load-link/Store conditional instruction.

BTW, there's no reason not to use the sign bit.

Upvotes: 5

Related Questions