Paul Stearns
Paul Stearns

Reputation: 906

How can I improve Oracle performance when performing a user defined Statistics function?

An associate of mine converted a javascript inverse chi-square routine to Oracle. The good news is that it returns the same results as the javascript routine. The bad news is that where it takes 1.5 seconds to return the result in IE or Chrome, it takes 23 seconds in Oracle. Of that 23 seconds >99% is CPU time.

There are two loops in the routines, an outer loop that executes 36 times for the values we are testing with, and an inner loop that runs 10,753 times for the each iteration of the outer loop. It does the same loop in both JS as it does in Oracle. For each iteration of the inner loop it executes an EXP function and an LN function, both of which are intrinsic in both languages.

I have compiled the Oracle code both Interpreted as well as Native, with little if any change (.045 second difference).

I have three questions;

  1. Why is Oracle so slow/how can I improve it?
  2. Is there an intrinsic inverse chi square function in Oracle.
  3. Does anyone have an inverse chi-square function that does not require the iterative looping (or not as much) as the one I am using?

A bonus question is;

Does anyone have a routine that computes Confidence Intervals in PL/SQL, or a language that could easily be converted to PL/SQL?

As requested, here is the code, which is a bit long (The main routine is CRITCHI for testing P=0.975 and DF=21507.38);

BIGX Number :=20;

FUNCTION POZ(Z IN NUMBER) RETURN NUMBER IS
  Y NUMBER;
  X NUMBER;
  W NUMBER;
  Z_MAX NUMBER;
  XXX NUMBER;
BEGIN  
      Z_MAX:=6.0;


      IF (Z=0) THEN 
          X:= 0.0;
      ELSE 
          Y := 0.5 * ABS(Z);
          IF (Y >= (Z_MAX * 0.5)) THEN 
              X:= 1.0;
          ELSIF (y < 1.0) THEN
              W:= Y * Y;

              X:= ((((((((0.000124818987 * W
                       - 0.001075204047) * W + 0.005198775019) * W
                       - 0.019198292004) * W + 0.059054035642) * W
                       - 0.151968751364) * W + 0.319152932694) * W
                       - 0.531923007300) * W + 0.797884560593) * Y * 2.0;
            ELSE
              Y:= Y-2.0;
              Y:= (((((((((((((-0.000045255659 * Y
                             + 0.000152529290) * Y - 0.000019538132) * Y
                             - 0.000676904986) * Y + 0.001390604284) * Y
                             - 0.000794620820) * Y - 0.002034254874) * Y
                             + 0.006549791214) * Y - 0.010557625006) * Y
                             + 0.011630447319) * Y - 0.009279453341) * Y
                             + 0.005353579108) * Y - 0.002141268741) * Y
                             + 0.000535310849) * Y + 0.999936657524;
          END IF;
      END IF;



      IF (Z>0.0) THEN 
        XXX:=((X + 1.0) * 0.5);
      ELSE
        XXX:= ((1.0 - x) * 0.5);
      END IF;

      RETURN XXX;

END POZ;


FUNCTION EX(X IN NUMBER) RETURN NUMBER IS
BEGIN
  IF (x < -BIGX) THEN
    RETURN 0;
  ELSE
    RETURN EXP(X);
  END IF;
END EX; 


FUNCTION POCHISQ(X IN NUMBER, DF IN NUMBER) RETURN NUMBER IS 
      A NUMBER;
      Y NUMBER;
      S NUMBER;
      E NUMBER;
      C NUMBER;
      Z NUMBER;
      X1 NUMBER;
      EVEN BOOLEAN;                                       /* True if df is an even number */
      LOG_SQRT_PI NUMBER := 0.5723649429247000870717135;  /* log(sqrt(pi)) */
      I_SQRT_PI NUMBER   := 0.5641895835477562869480795;  /* 1 / sqrt(pi) */
  b1 PLS_INTEGER;
  b2 PLS_INTEGER;
  e1 PLS_INTEGER;
  e2 PLS_INTEGER;
BEGIN    
 b1 := DBMS_UTILITY.GET_TIME();
 b2 := DBMS_UTILITY.GET_CPU_TIME();
      X1:=X;
      IF (X1 <= 0.0 OR DF < 1) THEN
          RETURN 1.0;
      END IF;

      A:= 0.5 * X1;
      EVEN:= (MOD(DF,2)=0);

      IF (DF > 1) THEN
          Y := ex(-A);
      END IF;

      IF EVEN THEN
        S:=Y;
      ELSE
        S:=(2.0 * poz(-sqrt(X1)));
      END IF;


      IF (DF > 2) THEN
          X1:= 0.5*(DF-1.0);
          IF EVEN THEN
            Z:=1.0;
          ELSE
            Z:=0.5;
          END IF;

          IF (A > BIGX) THEN
              IF EVEN THEN
                E:=0.0;
              ELSE
                E:=LOG_SQRT_PI;
              END IF;
              C:= LN(A);

              /* Timming snippet */
             e1 := DBMS_UTILITY.GET_TIME() - b1;
             e2 := DBMS_UTILITY.GET_CPU_TIME() - b2;
             --DBMS_OUTPUT.PUT_LINE( '0-GET_TIME elapsed = ' || e1 || ' hsecs.' );
             --DBMS_OUTPUT.PUT_LINE( '0-GET_CPU_TIME elapsed = ' || e2 || ' hsecs.' );
              /* End of Timming snippet */

              WHILE (Z <= X1) 
                LOOP
                  E:= LN(Z) + E;
                  S:=S+EX(C * Z - A - E);
                  Z:=Z+1.0;
              END LOOP;

             e1 := DBMS_UTILITY.GET_TIME() - b1;
             e2 := DBMS_UTILITY.GET_CPU_TIME() - b2;
             --DBMS_OUTPUT.PUT_LINE( '1-GET_TIME elapsed = ' || e1 || ' hsecs. Z= ' || Z );
             --DBMS_OUTPUT.PUT_LINE( '1-GET_CPU_TIME elapsed = ' || e2 || ' hsecs.' );

              RETURN S;
          ELSE
              IF EVEN THEN
                E:=1.0;
              ELSE
                E:=(I_SQRT_PI / sqrt(A));
              END IF;
              C:= 0.0;
              WHILE (Z <= X1) 
                LOOP
                  E:= E * (A / Z);
                  C:= C + E;
                  Z:=Z+ 1.0;
              END LOOP;

             e1 := DBMS_UTILITY.GET_TIME() - b1;
             e2 := DBMS_UTILITY.GET_CPU_TIME() - b2;
             --DBMS_OUTPUT.PUT_LINE( '2-GET_TIME elapsed = ' || e1 || ' hsecs.' );
             --DBMS_OUTPUT.PUT_LINE( '2-GET_CPU_TIME elapsed = ' || e2 || ' hsecs.' );


              RETURN C * Y + S;
          END IF;
      ELSE 

       e1 := DBMS_UTILITY.GET_TIME() - b1;
       e2 := DBMS_UTILITY.GET_CPU_TIME() - b2;
       --DBMS_OUTPUT.PUT_LINE( '3-GET_TIME elapsed = ' || e1 || ' hsecs.' );
       --DBMS_OUTPUT.PUT_LINE( '3-GET_CPU_TIME elapsed = ' || e2 || ' hsecs.' );

          RETURN S;
      END IF;


END POCHISQ;


  /*  CRITCHI  --  Compute critical chi-square value to
                   produce given p.  We just do a bisection
                   search for a value within CHI_EPSILON,
                   relying on the monotonicity of pochisq().  */

FUNCTION CRITCHI(P IN NUMBER, DF IN NUMBER) RETURN NUMBER IS 
    CHI_EPSILON NUMBER:= 0.000001;   /* Accuracy of critchi approximation */
    CHI_MAX NUMBER:= 99999.0;        /* Maximum chi-square value */
    minchisq NUMBER:= 0.0;
    maxchisq NUMBER:= CHI_MAX;
    chisqval NUMBER;
    dummy_count number := 0;
    BEGIN
    IF (p <= 0.0) THEN
        RETURN maxchisq;
    ELSE
        IF (p >= 1.0) THEN
            RETURN 0.0;
        END IF;
    END IF;

    chisqval:= df / sqrt(p);    /* fair first value */
    WHILE ((maxchisq - minchisq) > CHI_EPSILON) 
      LOOP
        if (pochisq(chisqval, df) < p) THEN
            maxchisq:= chisqval;
        ELSE
            minchisq:= chisqval;
        END IF;
      chisqval:= (maxchisq + minchisq) * 0.5;
      dummy_count := dummy_count + 1;
    END LOOP;
    --DBMS_OUTPUT.PUT_LINE('chisqval = ' || chisqval);

    RETURN chisqval;

END CRITCHI;

Upvotes: 2

Views: 384

Answers (1)

APC
APC

Reputation: 146239

For the benefit of future seekers, who might not have the patience to trawl through all the comments, the following optimizations were applied to the program to make it run fast.

  1. All the numeric variables were declared as BINARY_INTEGER(#). Find out more.
  2. Functions were declared as deterministic.
  3. Functions were compiled into native C.

(#) On more modern versions of the database PLS_INTEGER is preferred (simply because BINARY_INTEGER is old and deprecated - it gets converted to PLS_INTEGER under the covers) .


NB - if the OP or @AlexPoole would care to write a similar response, I'll happily upvote their answer and delete this one.

Upvotes: 2

Related Questions