Reputation: 235
Consider the following binary search tree, along with the following frequencies of lookups:
13 Key | 13 | 11 | 26 | 1 | 12 | 28
/ \ -----------------------------------------
11 26 Frequency | 26 | 5 | 25 | 1 | 3 | 15
/ \ \
1 12 28
I was given this question:
Calculate the cost of the search tree above for the given search frequencies and show that the search tree isn't optimal for the given search frequencies.
I calculated the cost, but my teacher say I did so incorrectly, but did not explain why.
So what we need to do for calculate cost is check where the first node is. 13 is in level 1
and the frequency of 13 is 26
. So we do 26*1=26
The nodes 11 and 26 are in level 2, the nodes 1, 12 and 28 are in level 3.
In the end we have for cost: 26*1 + 5*2 + 25*2 + 1*3 + 3*3 + 15*3
. My teacher says that this calculation is incorrect, but didn't explain why.
Also, how do you show that a tree isn't optimal? Here is definition I have from our script:
Let
K
be a set of keys andR
a workload. A search treeT
overK
is optimal forR
iffP(T) = min{P(T') | T' is search tree for K}
@templatetypedef Thank you very much for take your time and help!! Your answer is very nice for me I understand many things from it. Here is tree I found it is more optimal than this tree from task:
26
/ \
13 28
/
11
/ \
1 12
Tree above have cost of 143
and this one have 138
. So this one is really more optimal and task is solved :)
Upvotes: 2
Views: 1600
Reputation: 372992
Fundamentally, you're approaching the question of calculating the total lookup time in a BST correctly. You're taking each node in the tree, using the depth to determine the number of comparisons necessary to perform a lookup that ends at that node, multiplying those values by the number of lookups, and summing the results. I didn't meticulously double-check your exact calculations and so it's possible that you missed something, though.
Your second question was about determining whether a binary search tree is optimal for a given set of lookups. You've given the rigorous mathematical definition, but in this case I think it might be a bit easier to explain this at a higher level.
The calculation you did earlier here is a way of starting with a BST and information about what lookups will be performed, then computing a number corresponding to the number of comparisons that will end up being made in the course of performing those lookups. That number essentially tells you how fast those lookups are going to be - higher numbers mean that the lookups take longer, and lower numbers mean that the lookups will take less time.
Now, imagine that you want to make a BST that will take the least total amount of time to perform the lookups in question. In other words, you want the "best" BST for the given set of keys and lookup frequencies. That BST would be one with the lowest total lookup cost, where that cost is calculated using the approach you worked through earlier on. The terminology for a BST with that property - that it has the best lookup speed for those frequencies among all the possible BSTs that you can make - is an optimal BST.
The question here is to show that the tree you have isn't optimal. That means that you need to show that this isn't the best possible tree you can make. One way to do this would be to find an even better tree. So can you find another BST with the same keys where the total lookup time is lower than the one you were given?
Good luck!
Upvotes: 3