getglad
getglad

Reputation: 2563

Using Postgres @> Operator and json_build_object in SQLAlchemy

I have a query using several Postgres-specific operations:

SELECT 
    a.row_id, 
    a.name
FROM
    a 
JOIN
    b
ON
    b.json_record @> json_build_object('path', json_build_object('to', a.name))::jsonb

My understanding is that the @> operator acts as a comparison, but the comparison methods for JSONB in the SQLAlchemy docs reference only keys, not values.

https://docs.sqlalchemy.org/en/latest/dialects/postgresql.html#sqlalchemy.dialects.postgresql.JSONB.Comparator

I'm not clear on how I could design this query through SQLAlchemy short of using a raw query.

Edit 1

Based on this answer, I gave the below a try.

session \
   .query(A_Table) \
   .join(
      B_Table.json_record.contains({
         'path': {
            'to': A_Table.name
         }
      })
   )

It, however, resulted in an error from the line 'to': A_Table.name:

AttributeError: Neither 'BinaryExpression' object nor 'Comparator' object has an attribute 'selectable'
sqlalchemy/orm/query.py", line 2206, in join
from_joinpoint=from_joinpoint,
File "<string>", line 2, in _join

So I instead attempted

session \
   .query(A_Table) \
   .filter(
      B_Table.json_record.contains({
         'path': {
            'to': A_Table.name
         }
      })
   )

Which at least resulted in a different error, this one with some SQL generation from SQLAlchemy:

sqlalchemy.exc.StatementError: (builtins.TypeError) 
Object of type 'InstrumentedAttribute' is not JSON serializable 
[SQL: 'SELECT a.row_id AS a_id, a.name AS a_name FROM a, b 
WHERE b.json_record @> %(json_record_1)s'] [parameters: [{}]]

This SQL is close to what I was aiming for, and may be acceptable, but the example provided in the answer presumes I know the value ahead of time, when what I want to do is a comparison against the row value. I typically would do:

.filter([a.name == b.json_record['path']['to'].astext])

But I'm also trying to leverage an optimization from a gin index on this JSONB column, which leaves me needing the @> operator.

Edit 2

Based on the answer from Ilja Everilä, I was able to track down the SQLAlchemy method implemented in the source code, and using the sql-json method was able to get the SQL almost there.

session \
   .query(A_Table) \
   .join(
      B_Table.json_record.contains({
         json_sql({'path': json_sql({
            'to': A_Table.name
         }
      })
   )

Giving me the SQL:

SELECT 
    a.row_id, 
    a.name
FROM
    a 
JOIN
    b
ON
    b.json_record @> json_build_object('path', json_build_object('to', a.name))

The problem with this output is that instead of:

json_build_object(..., json_build_object(...))

Valid Postgres syntax should be:

json_build_object(..., json_build_object(...))::jsonb

Both the answer's and the source code's approach relies on the _FunctionGenerator, which can build the function, but it's not clear how something can get appended to the end of the method during the compile when going that route.

Edit 3

NVM - the answer's author pointed out jsonb_build_object(...) would fit with this model w/o the flag.

Upvotes: 1

Views: 4936

Answers (1)

Ilja Everil&#228;
Ilja Everil&#228;

Reputation: 52949

The linked Q/A handles the case of using literal values, as you've noticed. The solution is to combine using contains() in SQLA and jsonb_build_object() in Postgresql as you were trying before:

session.query(A_Table) \
    .filter(
        B_Table.json_record.contains(
            func.jsonb_build_object( 
                'path',
                func.jsonb_build_object('to', A_Table.name)
            )
        )
    )

My understanding is that the @> operator acts as a comparison, but the comparison methods for JSONB in the SQLAlchemy docs reference only keys, not values.

The SQLAlchemy documentation for JSONB.Comparator.contains() seems to be a bit poorly written. Compare

Boolean expression. Test if keys (or array) are a superset of/contained the keys of the argument jsonb expression.

to Postgresql documentation for @>:

Does the left JSON value contain the right JSON path/value entries at the top level?


You could hide the details of building jsonb in a helper function:

def build_jsonb(obj):
    if isinstance(obj, dict):
        pairs = [(k, build_jsonb(v)) for k, v in obj.items()]
        return func.jsonb_build_object(*[arg for p in pairs for arg in p])

    elif isinstance(obj, list):
        return func.jsonb_build_array(*[build_jsonb(v) for v in obj])

    else:
        return obj

and then use it in the original query:

session.query(A_Table) \
    .filter(
        B_Table.json_record.contains(
            build_jsonb({'path': {'to': A_Table.name}})))

If you wish to use the explicit JOIN syntax:

session.query(A_Table).\
    join(B_Table, B_Table.json_record.contains(
        build_jsonb({'path': {'to': A_Table.name}})))

Upvotes: 4

Related Questions