Reputation: 3561
I'm required to write a C++ function that, given a range (a,b], returns the number of nodes in an AVL tree that are in that given range, specifically in log(n) time complexity. I can add more fields to the tree's nodes if I need to do so.
I should point out that a,b will not necessarily appear in the tree. For example, if the tree's nodes are: 1,2,5,7,9,10, then running the function using the parameters (3,9] should return 3.
Which algorithm should I use to achieve this?
Upvotes: 0
Views: 6280
Reputation: 2648
First you could implement a split by key operation. That is, given a tree, to perform split(tree, key, ts, tg)
splits the key in two trees; ts
contains the keys less than key
; t2
the greater or equal ones. This operation can be done in O(lg n).
Then, with two splits, the first on a and the second on b you can obtain the desired subset range in O(lg n).
The split could be implemented as follows (pseudo code):
void split(Node * root, const Key & key, Node *& ts, Node *& tg) noexcept
{
if (root == Node::NullPtr)
return;
if (key < KEY(root))
{
Node * r = RLINK(root), * tgaux = Node::NullPtr;
split(LLINK(root), key, ts, tgaux);
insert(tgaux, root); // insert root in tgaux
tg = join_ex(tgaux, r);
}
else
{ // ket greater or equal than key to tg
Node * l = LLINK(root), *tsaux = Node::NullPtr;
split(RLINK(root), key, tsaux, tg));
insert(tsaux, root); // insert root in tsaux
ts = join_ex(l, tsaux);
}
}
The join_ex(t1, t2)
joins two exclusive trees; that is, all the keys of t1 are lesser that any key of tree t2. This join can be implemented in O(lg n) in a similar way to the concatenation described by Knuth in TAOCP V3 6.2.3.
Grosso modo if you want to join l
and r
, then suppose h(l) > h(r). You remove from r
the leftmost node (the minimum). Let j
this join node and r'
the resulting tree (r
- j
). Now you descend by the right side of r
until reaching a node p
such that h(p) - h(r')
equals 0 or 1. At this moment you do
And you treat p
as if this was inserted.
EDIT: I was wrong in interpreting the question. Sorry. I did not see that it was to count not to calculate a set. The following would be my answer. I do not erase what I've written because I think it is useful anyway.
Ami Tavory was right.
If you use extended trees, that is to store the subtree cardinality in each node, then you could easily compute the inorder positios of a key. I usually call to this operation position(key)
. If key
is not in the set then it returns the position that key
had if it was inserted in the tree.
The inorder position of root is the cardinality of left tree.
Now, in order to count the cardinality of [a, b) set you perform position(b) - position(a)
. You could require to do some adjustments if a or b are not present in the tree. But basically is thus.
position(key)
is, I think, "naturally" simple. Supposing that the node cardinality is accessed with COUNT(node)
:
long position(Node * root, const Key & key) noexcept
{
if (r == Node::NullPtr)
return 0;
if (key < KEY(root))
return position(LLINK(r), key, p);
else if (KEY(r) < key)
return position(RLINK(r), key) + COUNT(LLINK(r)) + 1;
else // the root contains key
return COUNT(LLINK(r));
}
Since an avl tree is balanced, position takes O(lg n). So two calls take O(lg n). A non recursive version is simple.
I hope you know to forgive my mistake
Upvotes: 1
Reputation: 76297
This is a famous problem - dynamic order statistcs by tree augmentation.
You basically need to augment your nodes so that when you look at a child pointer, you know how many children are in the child's subtree at time O(1). It's easy to see that this can be done without affecting the complexity.
Once you have that, you can answer any query (between this and that, inclusive/exclusive - all possibilities) by performing two traversals from node to roots. The exact traversals depend on the details (check the functions lower_bound
and upper_bound
in C++ for example).
Upvotes: 2