tnorgd
tnorgd

Reputation: 1602

How to count identical bytes

I have two long arrays of bytes and I need to compute how many bytes on corresponding positions are identical. My solution (in JAVA) is as follows:

    int sum = 0;
    for(int i = 0;i < t.length;i++)
      if (t[i] == spb[i])
        sum++;

Since this part of my program takes substantial time, I wonder if I can make this any faster

Obviously the length of the two arrays is identical

Upvotes: 1

Views: 243

Answers (4)

James Waldby - jwpat7
James Waldby - jwpat7

Reputation: 8711

Besides using previously-suggested "concurrent threads to compute partial sums for sections of the array" method (which I will comment on in a later paragraph), you can use two simple techniques to speed up the loop: (1) ?: tertiary operator instead of if test, and (2) loop unrolling.

On my old 2GHz system, using gcj java compiler, each of those techniques gives a few percent of speedup. The improvement you get on your machine (if any) may be compiler or jvm dependent.

Code examples:

    if (t[i] == spb[i]) sum++;

goes to

    sum += t[i] == spb[i] ? 1 : 0;

and

    public static int counts (byte[] A, byte[] B) {
       int i, count, L=A.length;
       for (count=i=0; i<L; i++)
          count += A[i]==B[i] ? 1 : 0;
       return count;
    }

goes to (for example)

   public static int counts (byte[] A, byte[] B) {
        final int S=8;
        int i, count, L=A.length-S;
        for(count=i=0; i<L; i+=S) {
            count += A[i+0]==B[i+0] ? 1 : 0;
            count += A[i+1]==B[i+1] ? 1 : 0;
            count += A[i+2]==B[i+2] ? 1 : 0;
            count += A[i+3]==B[i+3] ? 1 : 0;
            count += A[i+4]==B[i+4] ? 1 : 0;
            count += A[i+5]==B[i+5] ? 1 : 0;
            count += A[i+6]==B[i+6] ? 1 : 0;
            count += A[i+7]==B[i+7] ? 1 : 0;
        }
        for (; i<L+S; ++i)
            count += A[i]==B[i] ? 1 : 0;
        return count;
    }

Loop unrolling could be done with S having a value larger or smaller than the S=8 shown above. However, in the tests I ran, S=16 showed little improvement over S=8. Some sample timings, with 202MB arrays:

A. 51094384 matches in 3.421857144 sec. (original loop)
E. 51094384 matches in 3.212364808 sec. (use ?: value)
F. 51094384 matches in 2.953596272 sec. (?: + S=8 unroll)
G. 51094384 matches in 2.949984214 sec. (?: + S=16 unroll)

In this run, E. timing is 6% less than A., while F. and G. are 8% less than E., and 14% less than A. (Other timings, not shown, verified that a previous answer's claim that "requesting the field of an object (t.length) takes more time than a local variable", is irrelevant.)

Regarding use of concurrent threads: Suppose you use three threads. Among other methods, you could let thread i treat contiguous bytes in the i'th third of each array, or you could let thread i handle every third byte, that is, bytes with index mod 3 = i. It would be worthwhile to benchmark the difference in execution time. I would expect it to be different, on different machines, depending on cache sizes and regimes.

Upvotes: 1

MoJ
MoJ

Reputation: 1

int sum = 0;
for(int i = t.length - 1;i >= 0 ;i--)
if (t[i] == spb[i])
    sum++;

In principle, requesting the field of an object (t.length) takes more time than a local variable (i). If you iterate from last to first, the most expensive instruction is executed only once.

Upvotes: 0

Jon Skeet
Jon Skeet

Reputation: 1502386

Nope, you're basically doing the right thing (at least for a single thread - Simon's idea of using multiple threads is a good one). How much time is this taking, and how long are the arrays? It ought to be pretty quick.

You might be able to speed it up by creating a ByteBuffer around the byte array, then using asLongBuffer to create a LongBuffer wrapping it again. You could then check 8 bytes at a time (as longs), only checking a single byte at a time when the long comparison returns false. This would be significantly more complicated code though - and I wouldn't be at all surprised to find that it's actually a lot slower.

Upvotes: 5

Simon Nickerson
Simon Nickerson

Reputation: 43157

If the arrays are very long, you could use multiple concurrent threads to compute partial sums for sections of the array, and then sum up the partial sums.

Upvotes: 6

Related Questions