BMcV
BMcV

Reputation: 613

how to establish a relationship between node and descendants at specific level with SQLAlchemy

I have a table with hierarchical data where each row contains an id, parent_id, level which is the depth, and also a path enumeration hierarchy_hint which is a plain string with JSON-like formatting. For example:

id parent_id level hierarchy_hint
1 null 'level1' '{"level1": "root"}'
2 1 'level2' '{"level1": "root", "level2: "shoes"}'
3 2 'level3' '{"level1": "root", "level2: "shoes", "level3": "athletic"}'

(For the sake of example I used integers, but the ids are UUIDs)

I'd like to use hierarchy_hint to get all descendants of a given Node at a particular level. The model already has a self referential relationship established to get direct descendants (children). The model looks like:

class Node(db.Model):
    id = db.Column(UUID(as_uuid=True), primary_key=True)
    parent_id = db.Column(UUID(as_uuid=True), db.ForeignKey('node.id'))
    children = db.relationship('Node')
    level = db.Column(db.String(length=4000), primary_key=False)
    hierarchy_hint = db.Column(db.String(length=200), primary_key=False)

I've tried adding to the model:

    level_threes = relationship(
        "Node", primaryjoin=and_(
            foreign(hierarchy_hint).like(func.regexp_replace(hierarchy_hint, '}', ',%%')),
            level == 'level3'), viewonly=True)

But I get the error:

sqlalchemy.exc.ArgumentError: Relationship Node.level_threes could not determine any unambiguous local/remote column pairs based on join condition and remote_side arguments.  Consider using the remote() annotation to accurately mark those elements of the join condition that are on the remote side of the relationship.

I suspect that because I'm modifying the hierarchy_hint using regexp_replace I should be using a secondary table somehow but I'm not experienced enough with SQLAlchemy to know what exactly to do.

I'm open also to alternative suggestions for how to do this using recursive queries or otherwise but I would like to be able to access the property in the same way as I access children to avoid refactoring elsewhere.

I'm using Flask, SQLAlchemy, and PostgreSQL.

In plain SQL what I'm trying to do might look like this:

with ancestor as (select hierarchy_hint from node where id='2')
select node.id from node, ancestor
    where node.hierarchy_hint and node.level='level4';

Here we would expect to get all the level4 items within the shoes tree but no level4 items from any other trees nor any level5 etc. items from shoes.

Upvotes: 0

Views: 216

Answers (1)

Ian Wilson
Ian Wilson

Reputation: 9079

I actually think I figured out what you are trying to do. I think you can do something like below. You are probably aware but you might want to double check that your path in hierarchy_hint is deterministic because I'm not sure regular JSON puts the keys in the same order all the time. It seemed to work with dumps when I tested it though. I tried to add some data to make sure it wasn't returning everything in the same level.

Aliased relationship


class Node(Base):
    __tablename__ = 'nodes'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('nodes.id'))
    parent = relationship('Node', backref='children', remote_side='nodes.c.id')
    level = Column(String(length=200), primary_key=False)
    name = Column(String(length=200), primary_key=False)
    hierarchy_hint = Column(String(length=4000), primary_key=False)

# After class is constructed.
aliased_node = aliased(Node)

# Attach the relationship after class is constructed so we can use alias.
Node.level_threes = relationship(
    aliased_node, primaryjoin=and_(
        aliased_node.hierarchy_hint.like(func.regexp_replace(Node.hierarchy_hint, '}', ',%%')),
        aliased_node.level == 'level3'),
    viewonly=True,
    uselist=True,
    remote_side=[aliased_node.level, aliased_node.hierarchy_hint],
    foreign_keys=Node.hierarchy_hint)

Base.metadata.create_all(engine)

node_values = [
    ('level1', 'root', {"level1": "root"}),
    ('level2', 'shoes', {"level1": "root", "level2": "shoes"}),
    ('level3', 'athletic shoes', {"level1": "root", "level2": "shoes", "level3": "athletic"}),
]
node_values2 = [
    ('level2', 'shirts', {"level1": "root", "level2": "shirts"}),
    ('level3', 'athletic shirts', {"level1": "root", "level2": "shirts", "level3": "athletic"}),
]

def build_branch(session, parent, node_values):
    nodes = []
    for level, name, value in node_values:
        parent = Node(level=level, name=name, parent=parent, hierarchy_hint=dumps(value))
        session.add(parent)
        nodes.append(parent)
    return nodes

with Session(engine) as session:
    nodes = []
    # Build the original branch.
    nodes.extend(build_branch(session, None, node_values))
    # Now attach another branch below root.
    nodes.extend(build_branch(session, nodes[0], node_values2))
    print(len(nodes))
    session.commit()

    # Print out to check structure
    nodes = session.query(Node).order_by(Node.id).all()
    for node in nodes:
        print(node.id, node.name, node.parent_id, node.level, node.hierarchy_hint)

    shoes = session.query(Node).filter(Node.name == 'shoes').first()
    athletic_shoes = session.query(Node).filter(Node.name == 'athletic shoes').first()
    athletic_shirts = session.query(Node).filter(Node.name == 'athletic shirts').first()
    # Check relationship...
    assert athletic_shoes in shoes.level_threes, "Athletic shoes should be below shoes"
    assert athletic_shirts not in shoes.level_threes, "Athletic shirts should not be below shoes"

Output


1 None level1 {"level1": "root"}
2 1 level2 {"level1": "root", "level2": "shoes"}
3 2 level3 {"level1": "root", "level2": "shoes", "level3": "athletic"}
4 3 level2 {"level1": "root", "level2": "shirts"}
5 4 level3 {"level1": "root", "level2": "shirts", "level3": "athletic"}

Upvotes: 1

Related Questions