Reputation: 1602
I have two long arrays of bytes and I need to compute how many bytes at corresponding positions are identical. My solution (in Java) is as follows:
int sum = 0;
for (int i = 0; i < t.length; i++)
    if (t[i] == spb[i])
        sum++;
Since this part of my program takes substantial time, I wonder if I can make it any faster.
Obviously, the lengths of the two arrays are identical.
Upvotes: 1
Views: 243
Reputation: 8711
Besides the previously suggested "concurrent threads to compute partial sums for sections of the array" method (which I will comment on in a later paragraph), you can use two simple techniques to speed up the loop: (1) the ?: ternary operator instead of an if test, and (2) loop unrolling.
On my old 2 GHz system, using the gcj Java compiler, each of those techniques gives a few percent of speedup. The improvement you get on your machine (if any) may be compiler- or JVM-dependent.
Code examples:
if (t[i] == spb[i]) sum++;
goes to
sum += t[i] == spb[i] ? 1 : 0;
and
public static int counts (byte[] A, byte[] B) {
    int i, count, L = A.length;
    for (count = i = 0; i < L; i++)
        count += A[i] == B[i] ? 1 : 0;
    return count;
}
goes to (for example)
public static int counts (byte[] A, byte[] B) {
    final int S = 8;                      // unroll factor
    int i, count, L = A.length - S;
    for (count = i = 0; i < L; i += S) {  // unrolled loop handles S elements per iteration
        count += A[i+0] == B[i+0] ? 1 : 0;
        count += A[i+1] == B[i+1] ? 1 : 0;
        count += A[i+2] == B[i+2] ? 1 : 0;
        count += A[i+3] == B[i+3] ? 1 : 0;
        count += A[i+4] == B[i+4] ? 1 : 0;
        count += A[i+5] == B[i+5] ? 1 : 0;
        count += A[i+6] == B[i+6] ? 1 : 0;
        count += A[i+7] == B[i+7] ? 1 : 0;
    }
    for (; i < L + S; ++i)                // tail loop picks up the remaining elements
        count += A[i] == B[i] ? 1 : 0;
    return count;
}
Loop unrolling could be done with S having a value larger or smaller than the S=8 shown above. However, in the tests I ran, S=16 showed little improvement over S=8. Some sample timings, with 202 MB arrays:
A. 51094384 matches in 3.421857144 sec. (original loop)
E. 51094384 matches in 3.212364808 sec. (use ?: value)
F. 51094384 matches in 2.953596272 sec. (?: + S=8 unroll)
G. 51094384 matches in 2.949984214 sec. (?: + S=16 unroll)
In this run, the E. timing is 6% less than A., while F. and G. are 8% less than E. and 14% less than A. (Other timings, not shown, verified that another answer's claim that "requesting the field of an object (t.length) takes more time than a local variable" is irrelevant here.)
Regarding use of concurrent threads: suppose you use three threads. Among other methods, you could let thread i treat contiguous bytes in the i'th third of each array, or you could let thread i handle every third byte, that is, bytes with index mod 3 = i. It would be worthwhile to benchmark the difference in execution time. I would expect it to be different on different machines, depending on cache sizes and regimes.
Upvotes: 1
Reputation: 1
int sum = 0;
for (int i = t.length - 1; i >= 0; i--)
    if (t[i] == spb[i])
        sum++;
In principle, requesting the field of an object (t.length) takes more time than reading a local variable (i). If you iterate from last to first, the most expensive instruction is executed only once.
Upvotes: 0
Reputation: 1502386
Nope, you're basically doing the right thing (at least for a single thread - Simon's idea of using multiple threads is a good one). How much time is this taking, and how long are the arrays? It ought to be pretty quick.
You might be able to speed it up by creating a ByteBuffer around the byte array, then using asLongBuffer to create a LongBuffer wrapping it again. You could then check 8 bytes at a time (as longs), only checking a single byte at a time when the long comparison returns false. This would be significantly more complicated code though - and I wouldn't be at all surprised to find that it's actually a lot slower.
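A rough sketch of that idea (my illustration, not the answer's code, and assuming the two arrays have equal length) might look like this; whether it actually wins would need measuring:
import java.nio.ByteBuffer;
import java.nio.LongBuffer;

public class LongChunkCount {
    // Compare 8 bytes at a time as longs; fall back to byte-by-byte checking
    // only for chunks that differ and for the trailing bytes.
    public static int counts(byte[] a, byte[] b) {
        LongBuffer la = ByteBuffer.wrap(a).asLongBuffer();
        LongBuffer lb = ByteBuffer.wrap(b).asLongBuffer();
        int count = 0;
        int i = 0;                       // byte index of the current 8-byte chunk
        while (la.hasRemaining()) {
            if (la.get() == lb.get()) {
                count += 8;              // all 8 bytes in this chunk match
            } else {
                for (int j = i; j < i + 8; j++)   // re-check the differing chunk byte by byte
                    count += a[j] == b[j] ? 1 : 0;
            }
            i += 8;
        }
        for (; i < a.length; i++)        // leftover bytes not covered by a full long
            count += a[i] == b[i] ? 1 : 0;
        return count;
    }
}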
Upvotes: 5
Reputation: 43157
If the arrays are very long, you could use multiple concurrent threads to compute partial sums for sections of the array, and then sum up the partial sums.
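For illustration only, here is a minimal sketch of the partial-sum idea expressed with a Java 8+ parallel stream (the class name is my own); the runtime decides how to split the index range into sections and combines the per-section counts:
import java.util.stream.IntStream;

public class ParallelStreamCount {
    // Count matching positions by filtering indices where the bytes agree.
    public static long counts(byte[] a, byte[] b) {
        return IntStream.range(0, a.length)
                        .parallel()
                        .filter(i -> a[i] == b[i])
                        .count();
    }
}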
Upvotes: 6