Moody
Moody

Reputation: 833

Determining the Big-O of this algorithm

I'm trying to understand what the Big-O is of the code below.

What the code is supposed to do

Essentially, I am trying to select a subset (max size = selectionSize) of random nodes and any edges that exists between them. The selection of random nodes is done in the while loop. After having done that, I want to select any edges that exist between the selected nodes.

What I think it is & why

I think the running time is O = n^2 where n=selectionSize. The reason is: even though I can increase the size of the elements in nodes (e.g. make it 10000), I don't believe it can affect the algorithm since I am only looping through the maximum of selectionSize. The only reason I am a bit worried that this is wrong is because of the while loop, where I select random elements from the list up until I have enough. While this can take quite long (because it is random), I don't think it affects the overall output in terms of time. Edit: ugh on second thoughts... Nevermind... The nodes size does affect it (since node.getNeighbors()can be at most the size of the nodes).. So I think that if the selectionSize is equal to the size of nodes, the running time is O=n^2 where n=size of nodes.

Any tips/hints would be appreciated.

Code

// nodes and selectionSize are my input:
int[] nodes = {1,2,3...,1000}; // 1000 elements
int selectionSize = 500; // This can be at most the size of the elements (in this case 1000)

run_algo(nodes, selectionSize);

public void run_algo(int[] nodes, int selectionSize) {
    randomNodesList = {};

    while(randomNodesList < selectionSize) {
        randomNode = selectRandomNode(nodes); // Assume O(1)

        if(!randomNodesList.exists(randomNode)) { // Assume O(1)
            randomNodesList.push_back(randomNode); // Assume O(1)
        }
    }

    foreach(node in randomNodesList) {
        foreach(neighbor in node.getNeighbors()) { // Max. nodes size (in this case 1000)
            if (!randomNodesList.exists(neighbor)) { // Assume O(1)
                AddEdge(node, neighbor); // Takes O(1) time
            }
        }
    }
}

Upvotes: 3

Views: 215

Answers (2)

Aman
Aman

Reputation: 8985

if selectRandomNode(nodes); works with replacement (the same node can be picked twice), then the big o is not defined, since you have a probably infinite loop (you can end up picking the same node again and again).

If it works without replacement, then it's O(n^2) (in the worst case, every node may be connected to every other node).


Notes on selecting without replacement:

  • Consider the case when you are given an array of size n, say A and an empty array, B. All the elements in A are unique.

  • The task is to fill B with n elements randomly selected from A. It is desired that there should be at least k unique elements in B.

It can be shown that the probability of having more than k unique items increases with increasing n (I have added the equations after the plot).

Thus, in practice, the probability of the loop finishing in a single pass (i.e. in less than n steps) gets large as the difference in n and k increases. It's very intuitive if you think about it, the math is just cherry on the top.

enter image description here


def k_numerator(n, k):
    res = 0
    sign = 1
    for i in range(k, 0, -1):
        pow_term = (i ** n)
        comb_term = comb(k, i)
        prod = pow_term * comb_term
        prod = prod * sign
        res = res + prod
        sign = sign * -1
    return res

def p_exactly_k(n, k):
    """ 
    Returns the probability of `B` containing exactly `k` unique elements
    (also see notes above)
    """

    return (comb(n, k) * k_numerator(n, k)) / (n ** n)

Upvotes: 1

Andreas
Andreas

Reputation: 309

I am not 100% sure if I understand this right. But let's break it down: The while-Loop runs "selectionSize"-times best case and worst case n (where n is the amount of nodes)

Therefore the size of randomNodeList is in O(n). In a simple graph you can have O(n-1) neighbours. So the whole loop must be in O(n^2) (Because of n *(n-1))

axiom is right. It is in fact not possible to find an upper bound for this algorithm. It's nondeterministic. It depends on your random numbers.

Upvotes: 0

Related Questions