Krystian Rusin
Krystian Rusin

Reputation: 27

Big-O Notation of function

The following function takes a 1D array A of size n as a parameter and returns a 2D array M of size n x n. M stores average values. These average values are calculated from array A between 2 iterative variables (i an j) if i <= j using the formula M[i][j]=(A[i] +...+A[j])/( j-i+1). For example if i=1, and j=3. Then M[0][3] will store the value calculated from M[1][3] = (A[1] + A[2] + A[3])/3-1+1

 public float[][] averageArray(float[] A){
        int n = A.length;
        float sum;
        float[][] M = new float[n][n];

        for(int j = 0; j<n;j++){
            for(int i = 0; i<n; i++){

                if(i<=j) {
                    sum = 0;
                    for (int k = i; k <= j; k++) {
                        sum += A[k];

                    }
                    M[i][j] = sum / (j - i + 1);
                }else{
                    M[i][j] = 0;
                }

                System.out.println("["+ i +"]" + "["+ j +"]: "+ M[i][j]);
            }

        }
        return M;
    }

I'm confused about the big-o notation of this function. I know that the first two for loops produce a big-O of O(n^2) but because the third loop is conditional, I'm not sure how to put that into big-O notation. From tests I found that the number of times the 3rd loop is executed increases with the j value (e.x if j = 3 then the loop executes 3 times, if j = 5 then the loop will execute 5 times).

Upvotes: 0

Views: 157

Answers (2)

Charlie
Charlie

Reputation: 9108

Update: Based on @loganrussell48's analysis and comment below, the conclusion I came up with of n^2 is wrong.


Wrong answer:

Larger factors like n^2 'eclipse' smaller factors like the one you're searching for (which we know is less than n, which is why the answer will be less than n^3).

It's safe to say your big-O is larger than n^2 but smaller than n^3. If you consult a list of time complexities (like https://en.wikipedia.org/wiki/Time_complexity#Table_of_common_time_complexities) you'll see those two complexities are adjacent.

Big-O is all about simplification. Constant factors drop out. Smaller factors drop out. In this case, the third factor of n (n * n * n) is much smaller than n. The condition is met half the time and executes between i and j times (both less than n, so using estimation half of half of n). This third factor of n is now much smaller than n.

Common complexities smaller than n are insignificant compared to n^2.

In this case, n^2 is the important thing you are trying to convey.

graph of common complexities

Upvotes: 0

loganrussell48
loganrussell48

Reputation: 1864

tl;dr when you're not sure how it factors into the big-o runtime, create a summation for the loop, and then convert that summation into a function of n

Sometimes you'll encounter questions like "how many times did the line inside the innermost loop execute in terms of n. It's not quite the runtime analysis, but here's an analysis nonetheless. It will help you get a better picture of the big-o runtime.

It might be helpful to put a counter inside the inner loop and see how many times it gets executed.

It also might be helpful to draw a grid and color the squares where i <= j and j is the row index (since it's the first loop's variable) and i is the column index). When you do this, you'll see that all the colored squares split the square grid into 2 triangles down the diagonal from top-left to bottom right. Anything that falls directly on the line still counts (because you said <= rather than <). The colored squares will be the bottom/left triangle.

This is just a depiction of where the innermost loop will actually do something.

The outer 2 loops will now iterate over each location in the grid. We'll call this the current location. A line of code in the inner loop will now execute once for each colored square above the current location in that column of the grid (and once for the current location if it's colored) each time a new location is determined by the outer 2 loops.

After visualizing this, you can more easily see how to count the number of times this will execute. The first column of the grid has n colored squares. The first time this column will be counted will be when the top left square is chosen(j=0, i=0). The second time will be when (j=1, i=0). SO, let's fill in a grid where the value at each location is the number of times each cell is counted. It will be like this:

[n,  0 ,  0,  0, ... ]
[n-1, n-1, 0, 0, ... ]
[n-2, n-2, n-2, 0, ...]

You can see the picture now. Adding up everything in this grid will now tell you how many times your inner-most loop executed.

  1. First row has 1 n
  2. Second row has 2 (n-1)'s
  3. Third row has 3 (n-2)'s

You can see the pattern of (n-j) * (j+1) as the total for each row.

Sum over the rows to get the total.

You end up with a sum like so:

for(int i = 0; i < n; i++)
    sum += (n-i)*(i+1);
return sum;

inner-most loop executions

That's just for the number of times that the inner-most loop executed. Now for the times the inner-most loop did not get executed. This part is much easier. It's simply the number of non-colored squares in the grid from earlier.

Because it's an n by n grid, n2/2 would seem like the right answer. BUT the main diagonal squares are all colored. n2/2 already counts half of that line, so we have to take out the other half: n/2.

So, the total number of executions would be the for loop sum above, plus half the square of n (non-colored squares), minus half of n (because you just added half of the diagonal that was already colored in the previous plus term).

This ends up looking like enter image description here

The meaning of the first 2 terms is the number of times that the inner-most for-loop did NOT execute.

When I run this code, the following is my results:

  1. n=10
    • inner-loop executions: 220
    • total executions: 265
    • 220 + 102/2 - 10/2 (220 + 50 - 5 = 265)
  2. n=100
    • inner-loop executions: 171700
    • total executions: 176650
  3. n=1000
    • inner-loop executions: 167167000
    • total executions: 167666500
  4. n=10000
    • inner-loop executions: 166716670000
    • total executions: 166766665000
  5. n=100000
    • inner-loop executions: 166671666700000
    • total executions: 166676666650000
  6. n=100000000 (did it just for the lolz, you can already see the pattern)
    • inner-loop executions: 166666671666666700000000
    • total executions: 166666676666666650000000

AND FOR MY FINAL REVEAL, SINCE YOU'VE READ THIS FAR This is an O(n3) function.

I won't get into it too much, but the summation for the number of times that the inner-most for-loop executes simplifies down to enter image description here

Adding in the 2 terms that count the number of times the inner-most loop did NOT execute, group the like-terms for easier simplification and you'll end up with this:

enter image description here

You can use these last 2 formulas to verify the data that I've shared above. You can also empirically verify by adding counters into your loops and running them to see how many times they execute and compare that with the values given by these formulas(for values as large as those I provided you'll need to use BigIntegers or some other arbitrarily large number format). If you don't trust me, or your computer, you can also try throwing some of this information at an online tool which solves equations, such as wolfram alpha. Lastly, if this is from an exam question, you can take this to your professor and showcase the nature of this problem and the exact number of for-loop executions given n is the length of the array A. If all this is wrong, I've at the least shown that it's true for powers of 10. Hope this helps in some way.

Upvotes: 2

Related Questions