Reputation: 833
I'm trying to understand what the Big-O of the code below is.
What the code is supposed to do
Essentially, I am trying to select a subset (max size = selectionSize) of random nodes and any edges that exist between them. The selection of the random nodes is done in the while loop. After that, I want to select any edges that exist between the selected nodes.
What I think it is & why
I think the running time is O(n^2), where n = selectionSize. The reason is: even though I can increase the number of elements in nodes (e.g. make it 10000), I don't believe it can affect the algorithm, since I only loop through at most selectionSize elements. The only reason I am a bit worried this is wrong is the while loop, where I select random elements from the list until I have enough. While this can take quite long (because it is random), I don't think it affects the overall running time.
Edit: ugh, on second thoughts... never mind... the size of nodes does affect it (since node.getNeighbors() can return at most as many neighbors as there are nodes). So I think that if selectionSize is equal to the size of nodes, the running time is O(n^2), where n = the size of nodes.
Any tips/hints would be appreciated.
Code
// nodes and selectionSize are my input:
int[] nodes = {1, 2, 3, ..., 1000}; // 1000 elements
int selectionSize = 500; // can be at most the number of nodes (in this case 1000)
run_algo(nodes, selectionSize);

public void run_algo(int[] nodes, int selectionSize) {
    randomNodesList = {};
    while (randomNodesList.size() < selectionSize) {
        randomNode = selectRandomNode(nodes);        // Assume O(1)
        if (!randomNodesList.exists(randomNode)) {   // Assume O(1)
            randomNodesList.push_back(randomNode);   // Assume O(1)
        }
    }
    foreach (node in randomNodesList) {
        foreach (neighbor in node.getNeighbors()) {  // At most the number of nodes (in this case 1000)
            if (randomNodesList.exists(neighbor)) {  // Assume O(1)
                AddEdge(node, neighbor);             // Takes O(1) time
            }
        }
    }
}
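
For concreteness, here is the same algorithm as a short runnable Python sketch (the adjacency-dict representation of the graph is just an assumption for illustration):

import random

def run_algo(nodes, neighbors, selection_size):
    # Pick selection_size distinct random nodes.
    selected = set()                          # O(1) membership tests on average
    while len(selected) < selection_size:     # random, so no fixed iteration count
        selected.add(random.choice(nodes))    # selectRandomNode, O(1)
    # Collect the edges that run between the selected nodes.
    edges = []
    for node in selected:
        for neighbor in neighbors.get(node, ()):  # at most len(nodes) neighbors
            if neighbor in selected:              # keep only edges inside the subset
                edges.append((node, neighbor))    # AddEdge, O(1)
    return edges

# Example usage on a tiny graph:
nodes = [1, 2, 3, 4]
neighbors = {1: [2, 3], 2: [1, 4], 3: [1], 4: [2]}
print(run_algo(nodes, neighbors, 3))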
Upvotes: 3
Views: 215
Reputation: 8985
If selectRandomNode(nodes) works with replacement (the same node can be picked twice), then the Big-O is not defined, since you have a potentially infinite loop (you can end up picking the same node again and again).
If it works without replacement, then it's O(n^2) (in the worst case, every node may be connected to every other node).
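
To make the distinction concrete, here is a small Python sketch of the two sampling variants (illustrative code, not from the original post):

import random

nodes = list(range(1000))
selection_size = 500

# With replacement: a draw can repeat an already-picked node, so the
# loop has no deterministic upper bound on its number of iterations.
selected = set()
while len(selected) < selection_size:
    selected.add(random.choice(nodes))

# Without replacement: guaranteed to terminate after exactly
# selection_size draws.
selected = random.sample(nodes, selection_size)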
Consider the case when you are given an array of size n, say A, and an empty array, B. All the elements in A are unique. The task is to fill B with n elements randomly selected from A. It is desired that there should be at least k unique elements in B.
It can be shown that the probability of having more than k unique items increases with increasing n (the equations are given below). Thus, in practice, the probability of the loop finishing in a single pass (i.e. in fewer than n steps) gets large as the difference between n and k increases.
It's very intuitive if you think about it; the math is just the cherry on top.
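
For reference, this is the formula that the functions below compute (written out from the code): by inclusion-exclusion, the probability that B contains exactly k unique elements after n draws is

P(exactly k unique) = C(n, k) * [ sum over i = 0..k of (-1)^(k-i) * C(k, i) * i^n ] / n^n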
from math import comb  # Python 3.8+; on older versions use scipy.special.comb

def k_numerator(n, k):
    # Inclusion-exclusion count of the ways n draws can cover
    # a fixed set of k values exactly (every one of the k values hit).
    res = 0
    sign = 1
    for i in range(k, 0, -1):
        pow_term = i ** n
        comb_term = comb(k, i)
        prod = pow_term * comb_term * sign
        res = res + prod
        sign = sign * -1
    return res

def p_exactly_k(n, k):
    """
    Returns the probability of `B` containing exactly `k` unique elements
    (also see notes above)
    """
    return (comb(n, k) * k_numerator(n, k)) / (n ** n)
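
Summing p_exactly_k over all values from k up to n then gives the probability that a single pass of n draws already contains at least k unique elements; a small illustrative check (not part of the original answer):

def p_at_least_k(n, k):
    # Probability that n draws with replacement from n unique items
    # contain at least k distinct values, i.e. one pass of the loop suffices.
    return sum(p_exactly_k(n, j) for j in range(k, n + 1))

print(p_at_least_k(20, 10))  # well above 0.9: one pass usually suffices for k <= n/2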
Upvotes: 1
Reputation: 309
I am not 100% sure if I understand this right, but let's break it down: the while loop runs selectionSize times in the best case, and selectionSize can be at most n (where n is the number of nodes). Therefore the size of randomNodesList is in O(n). In a simple graph a node can have O(n-1) neighbours, so the whole nested loop is in O(n^2) (because of n * (n-1)).
axiom is right: it is in fact not possible to find an upper bound for this algorithm. It's nondeterministic; it depends on your random numbers.
Upvotes: 0