woodlumhoodlum
woodlumhoodlum

Reputation: 462

Java Optimizing arithmetic and Assignment Operators for large input

I have a piece of code that must run extremely fast in terms of clock speed. The algorithm is already in O(N). It takes 2seconds, it needs to take 1s. For most A.length inputs ~ 100,000 it takes .3s unless a particular line of code is invoked an extreme number of times. (For an esoteric programming challenge)

It uses a calculation of the arithmetic series that 1,2,..N -> 1,3,4,10,15.. that can be represented by n*(n+1)/2 I loop through this equation hundreds of thousands of times. I do not have access to the input, nor can I display it. The only information I am able to get returned is the time it took to run. particularly the equation is:

s+=(n+c)-((n*(n+1))/2);

s and c can have values range from 0 to 1Billion

n can range 0 to 100,000

What is the most efficient way to write this statement in terms of clock speed? I have heard division takes more time then multiplication, but beyond that I could not determine whether writing this in one line or multiple assignment lines was more efficient. Dividing and multiplying versus multiplying and then dividing? Also would creating custom integers types significantly help?

Edit as per request, full code with small input case (sorry if it's ugly, I've just kept stripping it down):

public static void main(String[] args) {

        int A[]={3,4,8,5,1,4,6,8,7,2,2,4};//output 44
        int K=6;
        //long start = System.currentTimeMillis();;
        //for(int i=0;i<100000;i++){
            System.out.println(mezmeriz4r(A,K));
        //}
        //long end = System.currentTimeMillis();;

//      System.out.println((end - start) + " ms");

    }
    public static int mezmeriz4r(int[]A,int K){
        int s=0;
        int ml=s;
        int mxl=s;
        int sz=1;
        int t=s;
        int c=sz;
        int lol=50000;
        int end=A.length;
        for(int i=sz;i<end;i++){
            if(A[i]>A[mxl]){
                mxl=i;
            }else if(A[i]<A[ml]){
                ml=i;
            }
            if(Math.abs(A[ml]-A[mxl])<=K){
                sz++;
                if(sz>=lol)return 1000000000;
                if(sz>1){
                    c+=sz;
                }
            }else{
                if(A[ml]!=A[i]){
                    t=i-ml;
                    s+=(t+c)-((t*(t+1))/(short)2);
                    i=ml;
                    ml++;
                    mxl=ml;
                }else{
                    t=i-mxl;
                    s+=(t+c)-((t*(t+1))/(short)2);
                    i=mxl;
                    mxl++;
                    ml=mxl;
                }
                c=1;
                sz=0;
            }
        }
        if(s>1000000000)return 1000000000;
        return s+c;
    }

Returned from Challenge:

Detected time complexity:

O(N)

test time result

example example test 0.290 s. OK

single single element 0.290 s. OK

double two elements 0.290 s. OK

small_functional small functional tests 0.280 s. OK

small_random small random sequences length = ~100 0.300 s. OK

small_random2 small random sequences length = ~100 0.300 s. OK

medium_random chaotic medium sequences length = ~3,000 0.290 s. OK

large_range large range test, length = ~100,000 2.200 s. TIMEOUT ERROR running time: >2.20 sec., time limit: 1.02 sec.

large_random random large sequences length = ~100,000 0.310 s. OK

large_answer test with large answer 0.320 s. OK

large_extreme all maximal value = ~100,000 0.340 s. OK

Upvotes: 2

Views: 1667

Answers (7)

MrSmith42
MrSmith42

Reputation: 10161

I would try the following and profile the code after each change to check if there is any gain in speed.


replace:

if(Math.abs(A[ml]-A[mxl])<=K)

by

int diff = A[ml]-A[mxl];
if(diff<=K && diff>=-K)

replace

/2

by

>>1

replace

ml++;
mxl=ml;

by

mxl=++ml;

Maybe avoid array access of the same element (internal boundary checks of java may take some time)

So staore at least A[i] in a local varianble.

Upvotes: 1

Mani
Mani

Reputation: 3364

I dont have access to validate all inputs. and time range. but this one runs O(N) for sure. and have improved. run and let me know your feedback.i will provide details if necessary

public static int solution(int[]A,int K){
    int minIndex=0;
    int maxIndex=0;
    int end=A.length;
    int slize = end;
    int startIndex = 0;
    int diff = 0;
    int minMaxIndexDiff = 0;
    for(int currIndex=1;currIndex<end;currIndex++){
        if(A[currIndex]>A[maxIndex]){
            maxIndex=currIndex;
        }else if(A[currIndex]<A[minIndex]){
            minIndex=currIndex;
        }
        if( (A[maxIndex]-A[minIndex]) >K){
            minMaxIndexDiff= currIndex- startIndex;
            if (minMaxIndexDiff > 1){
                slize+= ((minMaxIndexDiff*(minMaxIndexDiff-1)) >> 1);
                if (diff > 0 ) {
                    slize = slize + (diff * minMaxIndexDiff);
                }
            }

            if (minIndex == currIndex){
                diff = currIndex - (maxIndex + 1);
            }else{
                diff = currIndex - (minIndex + 1);
            }
            if (slize > 1000000000) {
                return 1000000000;
            }
            minIndex = currIndex;
            maxIndex = currIndex;
            startIndex = currIndex;
        }
    }
    if ( (startIndex +1) == end){
        return slize;
    }
    if (slize > 1000000000) {
        return 1000000000;
    }
    minMaxIndexDiff= end- startIndex;
    if (minMaxIndexDiff > 1){
        slize+= ((minMaxIndexDiff*(minMaxIndexDiff-1)) >> 1);
        if (diff > 0 ) {
            slize = slize + (diff * minMaxIndexDiff);
        }
    }

    return slize;
}

Upvotes: 1

leventov
leventov

Reputation: 15313

Nested assignments, i. e. instead of

t=i-ml;
s+=(t+c)-((t*(t+1))/(short)2);
i=ml;
ml++;
mxl=ml;

something like

s+=((t=i-ml)+c);
s-=((t*(t+1))/(short)2);
i=ml;
mxl=++ml;

sometimes occurs in OpenJDK sources. It mainly results in replacing *load bytecode instructions with *dups. According to my experiments, it really gives a very little speedup, but it is ultra hadrcore, I don't recommend to write such code manually.

Upvotes: 1

AlexWien
AlexWien

Reputation: 28757

I would try to elimnate this line if(Math.abs(A[ml]-A[mxl])<= by a faster self calculated abs version, which is inlined, not a method call!

The cast to (short) does not help, but try the right shift operator X >>1 instead x / 2

removing the System.out.println() can speed up by factor of 1000. But be carefull otherwise your whole algorithm can be removed by the VM becasue you dont use it. Old code:

for(int i=0;i<100000;i++){
            System.out.println(mezmeriz4r(A,K));
}

New code:

int dummy = 0;
    for(int i=0;i<100000;i++){
          dummy =   mezmeriz4r(A,K);
    }
//Use dummy otherwise optimisation can remove  mezmeriz4r
System.out.print("finished: " + dummy);

Upvotes: 0

Display Name
Display Name

Reputation: 8128

I would create a C version first and see, how fast it can go with "direct access to the metal". Chances are, you are trying to optimize calculation which is already optimized to the limit.

Upvotes: 0

klor
klor

Reputation: 4436

Get rid of the System.out.println() in the for loop :) you will be amazed how much faster your calculation will be

Upvotes: 1

rgettman
rgettman

Reputation: 178303

With a little algebra, you can simply the expression (n+c)-((n*(n+1))/2) to c-((n*(n-1))/2) to remove an addition operation. Then you can replace the division by 2 with a bit-shift to the right by 1, which is faster than division. Try replacing

s+=(n+c)-((n*(n+1))/2);

with

s+=c-((n*(n-1))>>1);

Upvotes: 3

Related Questions