Reputation: 198
I have recently started reading SICP, and I have worked my way through to Section 1.2.3. I am unable to understand some of the details of order of growth. Please bear with me and my overly long question, and please also note that I have never dealt with analyzing algorithms before.
Here are a few pieces of the text from SICP:
Let n be a parameter that measures the size of the problem. In our previous examples we took n to be the number for which a given function is to be computed, but there are other possibilities. For instance, if our goal is to compute an approximation to the square root of a number, we might take n to be the number of digits accuracy required.
Take the following procedure, which was given in Section 1.1.7 for computing square roots by Newton's method:
(define (sqrt x)
  (sqrt-iter 1.0 x))
And this is sqrt-iter:
(define (sqrt-iter guess x)
  (if (good-enough? guess x)
      guess
      (sqrt-iter (improve guess x)
                 x)))
where good-enough? checks whether the guess is a good enough approximation:
(define (good-enough? guess x)
  (< (abs (- (square guess) x)) 0.001))
Now, according to SICP, 0.001 should be n, but shouldn't n be the input to (sqrt x)?
The number of iterations changes based on the input x; that is, the number of iterations required changes with the size of the number. Here is my Python equivalent demonstrating that:
In [33]: sqrt(2)
1
1.5
1.4166666666666665
achieved 1.4142156862745097 in 3 iterations
In [34]: sqrt(4)
1
2.5
2.05
2.000609756097561
achieved 2.0000000929222947 in 4 iterations
In [35]: sqrt(1024)
1
512.5
257.2490243902439
130.61480157022683
69.22732405448895
42.00958563100827
33.19248741685438
32.02142090500024
achieved 32.0000071648159 in 8 iterations
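Roughly, the translation behind those traces looks like the following sketch (good_enough, improve, and sqrt_iter are direct ports of the Scheme procedures; the exact details may differ):

def good_enough(guess, x):
    # same 0.001 tolerance as the Scheme version
    return abs(guess * guess - x) < 0.001

def improve(guess, x):
    # Newton's method: average the guess with x / guess
    return (guess + x / guess) / 2

def sqrt_iter(guess, x, iterations=0):
    if good_enough(guess, x):
        print(f"achieved {guess} in {iterations} iterations")
        return guess
    print(guess)
    return sqrt_iter(improve(guess, x), x, iterations + 1)

def sqrt(x):
    return sqrt_iter(1, x)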
So shouldn't 0.001 be k1 or k2, since it is a constant value that is independent of n (n here being the value we pass to sqrt)?
let R(n) be the amount of resources the process requires for a problem of size n. R(n) might measure the number of internal storage registers used, the number of elementary machine operations performed, and so on. In computers that do only a fixed number of operations at a time, the time required will be proportional to the number of elementary machine operations performed.
Here, SICP says that R(n) might be the number of elementary operations performed, but wouldn't the number of operations performed differ from OS to OS? A Linux machine might do it in one set of steps, a FreeBSD machine in another, and a Windows machine in yet another. You'll see why I am saying this soon.
We say that R(n) has order of growth Θ(f(n)), written R(n) = Θ(f(n)) (pronounced “theta of f(n)”), if there are positive constants k1 and k2 independent of n such that k1 f(n) ≤ R(n) ≤ k2 f(n) for any sufficiently large value of n. (In other words, for large n, the value R(n) is sandwiched between k1 f(n) and k2 f(n).)
Now, from what I understand, we work out the growth of the resources used by doing some algebra after obtaining the values from the functions R(n) and f(n). (Is R a function? Is f what I think it is? I don't know!)
For instance, with the linear recursive process for computing factorial described in Section 1.2.1 the number of steps grows proportionally to the input n. Thus, the steps required for this process grows as Θ(n). We also saw that the space required grows as Θ(n). For the iterative factorial, the number of steps is still Θ(n) but the space is Θ(1)—that is, constant. The tree-recursive Fibonacci computation requires Θ(ϕ^n) steps and space Θ(n), where ϕ is the golden ratio described in Section 1.2.2.
Now this is the part which makes no sense to me: what do I consider a step? Do I count the number of substitutions using the substitution model? Or the number of expressions evaluated? Or each arithmetic operation performed?
I understand that the space is Θ(n) in the recursive procedure (because the interpreter needs to remember one more thing on every recursion) and Θ(1) in the iterative one (as the interpreter just updates a few values and continues). But I don't understand how the Fibonacci computation (tree recursion) takes Θ(n) space.
I also have no clue how the earlier n is related to the number of steps.
So here is the list of all my questions to do with Order of Growth:
Which exact value in an algorithm is n?
What happens in R(n)? (Assuming it's a function, of course.) Does n divide/multiply a value?
What do I consider a step?
How do I compute the order of growth of the steps and space?
In short:
What is Order of Growth, and how do I compute it?
Upvotes: 4
Views: 6188
Reputation: 162
The order of growth of an algorithm is something you must understand if you want to be able to write efficient code. There are many great resources on this topic all around the internet, some more mathematically accurate than others, but here is my own, mathematically loose explanation:
Let's say you have some data, e.g. 100 rows from a database, and you want to process it. Now, what might you actually be doing with the data?
Maybe you want to just print the first row. How hard could that be? Not very: you need just one operation to do it, print(first_row). If you had 1000 rows, you could still print the first row with just one operation. It does not matter how much data you are processing; it will still take the same (constant) amount of time, O(1).
Then we have another very important order of growth, O(log(n)). This is typical for algorithms that search in sorted data. If you have 100 rows of sorted names and you are looking for a particular name, that requires just a few comparisons: each comparison cuts the remaining data in half, so in the worst case you find the name in about 7 comparisons. If you are looking in 1000 sorted rows, you need roughly 10 comparisons. This is very effective for searching in big data, and you need not worry about searching in big datasets of sorted data; it will always be very fast.
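For instance, a textbook binary search with a comparison counter looks roughly like this (the function name and the counter are just for illustration):

def binary_search(sorted_names, target):
    # each comparison halves the remaining range, so about log2(n) comparisons
    low, high = 0, len(sorted_names) - 1
    comparisons = 0
    while low <= high:
        mid = (low + high) // 2
        comparisons += 1
        if sorted_names[mid] == target:
            return mid, comparisons
        elif sorted_names[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1, comparisons

names = sorted(f"name{i:04d}" for i in range(1000))
print(binary_search(names, "name0750"))   # found after about 10 comparisons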
Now if you would like to, for example, print all of the data, you would have to call print on each row. For 100 rows you would have to do 100 operations, and for 1000 rows, 1000 operations. This means the amount of work you need is directly proportional to the amount of data you are processing. This is O(n).
O(n*n) is when you are, for example, trying to find duplicate rows in the data. For that, you need to compare each row with every other row, so you would have to do 10 000 comparisons for 100 rows and 1 000 000 comparisons for 1000 rows! As you can see, the number of operations needed starts to grow very fast, and such growth will already limit the amount of data you can process within a few seconds on a modern computer (which is what you normally want). That is why, when you see two for loops nested inside one another, you should start being careful. This is also called quadratic growth (a special case of polynomial growth); see the sketch below.
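A sketch of that duplicate check with a comparison counter (find_duplicates is just an illustrative name, not from any library):

def find_duplicates(rows):
    # compare each row with every later row: about n*n/2 comparisons
    duplicates, comparisons = [], 0
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            comparisons += 1
            if rows[i] == rows[j]:
                duplicates.append(rows[i])
    return duplicates, comparisons

rows = list(range(100)) + [42]      # 101 rows, one duplicate
print(find_duplicates(rows))        # ([42], 5050) -- thousands of comparisons already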
O(2^n) is exponential growth, and this grows very, very, very ... very fast. Consider a situation where you have 100 rows with random numbers. If you were to find all subsets of those numbers with a total sum of 50, you would have to look at 2^100, i.e. something like 1267650600228229401496703205376 subsets. For 1000 rows, it would be a number with roughly 300 digits. For you as a programmer this means that if you tried to process 1000 numbers like that, you would never get the result.
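A brute-force sketch of that subset search (subsets_summing_to is an illustrative name; only try it on small inputs):

from itertools import combinations

def subsets_summing_to(numbers, target):
    # enumerate all 2**len(numbers) subsets and keep those that hit the target
    hits = []
    for r in range(len(numbers) + 1):
        for combo in combinations(numbers, r):
            if sum(combo) == target:
                hits.append(combo)
    return hits

print(subsets_summing_to([3, 7, 12, 19, 25, 31], 50))   # fine for 6 numbers, hopeless for 100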
O(n!) is factorial complexity, which grows even faster; you will not process any big data set with this kind of algorithm. A typical example is solving the travelling salesman problem with a brute-force search. If there were 100 rows with 100 cities and you wanted to know the order in which to visit them so that you travel the shortest path, you would need to calculate the distance of every possible ordering, and 100! is a number with about 158 digits; for 1000 cities the number would have thousands of digits. These are numbers you cannot even imagine: consider that there are approximately 10^78 to 10^82 atoms in the universe! In other words, on your computer you can optimize a salesman's path for only a handful of cities.
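And a brute-force travelling-salesman sketch (shortest_tour is an illustrative name; six cities already mean 720 orderings to try):

from itertools import permutations
from math import dist   # Euclidean distance, available since Python 3.8

def shortest_tour(cities):
    # try all len(cities)! orderings and keep the shortest round trip
    best_order, best_length = None, float("inf")
    for order in permutations(cities):
        length = sum(dist(order[i], order[i + 1]) for i in range(len(order) - 1))
        length += dist(order[-1], order[0])   # close the loop back to the start
        if length < best_length:
            best_order, best_length = order, length
    return best_order, best_length

print(shortest_tour([(0, 0), (1, 5), (2, 3), (5, 1), (6, 4), (3, 6)]))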
You should be able to sort these orders of growth by how fast they grow. It helps to draw them on a graph and keep that picture in mind whenever you write an algorithm and find yourself nesting for loops inside one another.
Upvotes: 1
Reputation: 748
What you are asking about is a long topic, so I will try to answer briefly. However, this is a basic computer science concept, and at the beginning of any algorithms book you will find a chapter about order of growth (aka Big-O, Big-Omega, and Big-Theta notations). I highly recommend this book if you are interested: https://www.amazon.ca/Introduction-Algorithms-Thomas-H-Cormen/dp/0262033844
To answer your question: scientists were not able to compare their code because atomic operations take different amounts of time on different machines. Tons of factors can impact the running time of code on a machine, such as OS load, OS scheduling, the atomic operations implemented by the CPU, etc. Therefore they came up with a theoretical definition of the order of growth.
One thing that certainly impacts running time is the size of the input of the code, usually denoted by n. Since n can be very large, these notations are also known as asymptotic notations. So when we talk about the order of growth, we assume n is arbitrarily large (we don't care about small input sizes). Simply put, the order of growth is the number of atomic (aka elementary) steps that your program executes. What is atomic? Any operation that takes 1, 2, or any constant number of CPU operations (by constant, I mean it does not depend on n, the input size). Let me give you an example.
This code:
c = a + b
has one atomic step: it does a summation and an assignment. Another example:
for i in range(1, n + 1):
    print(i)
This code has one atomic step, print(i), and repeats it n times (let n be the input size). So we say this program performs n atomic operations (i.e. its order of growth is O(n)).
So, I hope that so far you understand what an atomic operation is and what the order of growth (the number of atomic steps) is. However, it is usually not easy to calculate the order of growth, and lots of math can be involved. For example, this code:
for i in range(1, n + 1):
    for j in range(i, n + 1):
        print(i + j)
In this code, since j depends on i, we will have n + (n-1) + ... + 1 = n*(n+1)/2 atomic operations. If you expand that formula, it is n^2/2 + n/2. Since n^2 has the largest exponent in the result, we only care about that term: for a very large input size that term is dominant, and we say its order of growth is O(n^2) (I know lots of details are missing).
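If you want to convince yourself of that formula, you can count the inner step directly (a small sketch, not part of the analysis above):

def count_inner_steps(n):
    # count how many times print(i + j) would run in the nested loop above
    count = 0
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            count += 1
    return count

for n in (10, 100, 1000):
    print(n, count_inner_steps(n), n * (n + 1) // 2)   # the two counts always agree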
So, when we say program A has order of growth O(n) and program B has order of growth O(n^2), we are sure that, for a large enough input size, program B will be much slower than program A (we can finally compare code without running it on a machine).
So, to summarize: the order of growth is the number of atomic operations when the input size is very large; we don't care about the small operations, only about the biggest chunk of them.
My words here might offend both scientists and engineers. To my scientist friends: I was not mathematically precise when explaining the order of growth here (please read the book I mentioned if you care about the exact mathematical definition). To my engineer friends: yes, those small steps that scientists ignore when calculating Big-O actually matter in practice, but scientists need to simplify first to build common ground and then discuss the details.
Upvotes: 3