Reputation:
I have a table will a few hundred services that I am scraping from a third party web service. The services have several million symbols which I'm also getting.
The hierarchy is essentially this;
root
|_service1
| |_service1/sub1*
| |_service1/sub2
| |_service1/sub3
| |_service1/sub4
| | |_service1/sub4/sub1
| | |_**service1/sub4/sub1/leaf1**
| |
| |_service1/sub5
|
|_service2
|_service2/sub1
|_service2/sub2
|_service2/sub3
So what I'm trying to achieve is - given an Id for a leaf, I want to work up the tree till I find the branch were isActive is true then return that value.
In this example, I want to find the active service for the leaf /service1/sub4/sub1/leaf1
which in this case is service1/sub1
Up to now I've been using a Common Table Expression to work from the top down and set an additional ActiveServiceId column on the leaves that are under that branch; with millions of symbols that turns into a very long running process.
I'm only working with a very small subset of the symbols (although I don't know which ones and I need them all) so the known leaf Ids at the time of the query will only be a few hundred.
I've created a SQLFiddle to reflect this hierachy to assist anyone who might be able to help
UPDATE to explain further the current approach. If I decided that service1
is to be the new active service - the leaves under service1 in every sub branch need to bubble up to service1 and I will reset all the child branches so none of them as marked as active
Upvotes: 0
Views: 1656
Reputation: 159
If you have the full path stored as a string for each node you can do the following:
SELECT r.ID
FROM TreeTable r, TreeTable rs
WHERE rs.ID = ?
AND r.Path = Left(rs.path,Len(r.Path))
AND Len(r.Path) = (SELECT Max(Len(t.Path))
FROM TreeTable t, TreeTable s
WHERE s.ID = ?
AND t.Path = Left(s.path,Len(t.Path))
AND t.isActive = true)
Ok it's not pretty and I welcome any suggestions for making it neater - but it should work.
Upvotes: 0
Reputation: 159
You may find it useful to use nested sets to store this data:
http://en.wikipedia.org/wiki/Nested_set_model
I have used this method before to store tree data in a database and found it good for querying data belonging to all nodes under a parent etc.
Using the data stored in nested sets the solution to your problem would be as follows:
SELECT s.ID
FROM TreeTable s
WHERE s.L = (SELECT Max(t.L)
FROM TreeTable t, TreeTable LeafNode
WHERE t.isActive = true
AND t.L < LeafNode.L
AND t.R > LeafNode.R
AND LeafNode.ID = ?)
Upvotes: 0