p.g.
p.g.

Reputation: 325

Query to find the next item in a tree stored as adjacency list in sqlite

Background: sqlite should be used to store information that can be queried using SNMP. SNMP organizes the information in a hierarchical structure of OID and supports 3 types of queries:

  1. value for a single OID
  2. OID and value that is lexicographically next to a given OID.
  3. All OID and values of a subtree

I started with the following table:

PRAGMA foreign_keys = ON;

CREATE TABLE data (
   oid TEXT NOT NULL PRIMARY KEY,
   parent TEXT REFERENCES data(oid) ON DELETE CASCADE CHECK(oid = '1' OR parent IS NOT NULL),
   leaf_id INTEGER NOT NULL,
   value BLOB DEFAULT NULL,
   CHECK(parent IS NULL OR oid = parent || '.' || leaf_id)
);

INSERT INTO data VALUES ('1', NULL, 1, NULL);
INSERT INTO data VALUES ('1.5', '1', 5, NULL);
INSERT INTO data VALUES ('1.5.2', '1.5', 2, 'foo');
INSERT INTO data VALUES ('1.5.11', '1.5', 11, 'foo');
INSERT INTO data VALUES ('1.5.1', '1.5', 1, 'foo');
INSERT INTO data VALUES ('1.3', '1', 3, NULL);
INSERT INTO data VALUES ('1.3.4', '1.3', 4, 'foo');
INSERT INTO data VALUES ('1.3.7', '1.3', 7, 'foo');
INSERT INTO data VALUES ('1.3.5', '1.3', 5, 'foo');
INSERT INTO data VALUES ('1.3.6', '1.3', 6, 'foo');

The idea of storing the last part of the OID as an INT is to be able order all the items with the same parent lexicographically.

The first query type is trivial to write. However - due to my limited experience with SQL - a struggled in writing queries for the second and third case. I think it should be possible with the right WITH RECURSIVE ... SELECT construct. So far I could not find a way to combine the 3 situations (first child, next sibling, next of the parent) and correct sorting did not work either. An additional complexity is that all OIDs with a value of NULL must be ignored.

I would really appreciate if someone could provide the two queries or assist me in writing them.

If the queries would get too complex or impossible to write, another idea would be to add another column 'next' with a 'pointer' to the next item and fill the next values with a trigger.

I don't like to use nested sets however - too complex and slow for inserts/deletes.

Upvotes: 4

Views: 1161

Answers (1)

CL.
CL.

Reputation: 180192

Retrieving a subtree is trivial:

WITH RECURSIVE subtree(oid, value, depth, leaf_id) AS (
    SELECT oid,
           value,
           0 AS depth,
           leaf_id
    FROM data
    WHERE oid = '1'  -- start of subtree
    UNION ALL
    SELECT child.oid,
           child.value,
           parent.depth + 1,
           child.leaf_id
    FROM data AS child
    JOIN subtree AS parent ON child.parent = parent.oid
    ORDER BY depth DESC, leaf_id ASC
)
SELECT oid, value
FROM subtree
WHERE value IS NOT NULL

The depth and leaf_id values are needed only for sorting the results lexicographically.

You should have an index on the parent column.


As for the lexicographical next item, consider first the following CTE, which goes simply up through the tree, but remembers the leaf value of the last level below:

WITH RECURSIVE parents(oid, parent, previous_leaf, step, leaf_id) AS (
    SELECT oid,
           parent,
           -1,
           0 AS step,
           leaf_id
    FROM data
    WHERE oid = '1.3.4'  -- start point
    UNION ALL
    SELECT parent.oid,
           parent.parent,
           child.leaf_id,
           child.step + 1,
           parent.leaf_id
    FROM data AS parent
    JOIN parents AS child ON parent.oid = child.parent
    ORDER BY step
)
SELECT oid, previous_leaf FROM parents

oid         previous_leaf
----------  -------------
1.3.4       -1             (1.)
1.3         4              (2.)
1           3              (3.)

What happens when, for each result row, we search the subtree below the oid value, with the additional restriction that the top-level leaf in that subtree must be larger than the previous_leaf value?

  1. We search for children of 1.3.4. (previous_leaf has no effect.)
  2. We search for larger silbings of 1.3.4, e.g., 1.3.5.
  3. We search for larger silbings of 1.3, e.g., 1.5.

So now we just have to do this subtree search:

WITH RECURSIVE parents(oid, parent, previous_leaf, step, leaf_id) AS (
    ... see above ...
),
subtree(oid, value, depth, leaf_id, previous_leaf, step) AS (
    SELECT oid,
           NULL,            -- interesting items are only *below* top of subtree 
           0 AS depth,
           leaf_id,
           previous_leaf,
           step
    FROM parents
    UNION ALL
    SELECT child.oid,
           child.value,
           parent.depth + 1,
           child.leaf_id,
           -1,              -- previous_leaf mattered only at the top
           parent.step
    FROM data AS child
    JOIN subtree AS parent ON child.parent = parent.oid
    WHERE child.leaf_id > parent.previous_leaf
    ORDER BY step, depth DESC, leaf_id
)
SELECT oid, value
FROM subtree
WHERE value IS NOT NULL
LIMIT 1                     -- only the first item

Upvotes: 2

Related Questions