edbras
edbras

Reputation: 4404

Query with JSON array in WHERE in Postgresql

I have a user model with a payment_info JSONB column that contains the following json example:

{
    "customer_id": "cst_K5gCsCkKAU",
    "subscriptions": [
        {
            "status": "active",
            "external_id": "sub_3Q9Q4bP2zW"
        }
    ]
}

I am a newbie with JSON queries, but created the following against the Postgres (PG) db that seem to work, that is: I search for all users that has a certain external_id value:

SELECT payment_info->'subscriptions' as Subscriptions
    FROM public."user"
    , jsonb_array_elements(payment_info->'subscriptions') as subs
    where (subs->>'external_id')::text = 'sub_3Q9Q4bP2zW'

How do I do the same in SQLAlchemy? I tried several thinks that I found on the web (SO) but it does't work. I tried:

  1. JSONB Comparator

    query = misc.setup_query(db_session, User).filter(
        User.payment_info.comparator.contains(
            ('subscriptions', 'external_id') == payment_subscription_id))
    

    That results in following error:

    sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) operator does not exist: jsonb @> boolean
    LINE 3: WHERE "user".payment_info @> false
                                  ^
    HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.
    
  2. json_contains function:

    from sqlalchemy import func
    query = misc.setup_query(db_session, User).filter(
        func.json_contains(User.payment_info,
                           payment_subscription_id,
                           ['subscriptions', 'external_id']))
    

    That results in:

    LINE 3: WHERE json_contains("user".payment_info, 'sub_QxyMEmU', ARRA...
               ^
    HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
    
  3. Path to the key:

    query = misc.setup_query(db_session, User).filter(
        User.payment_info['subscriptions', 'external_id'].astext == payment_subscription_id)
    

    That results in empty result, with the following query:

    SELECT *
    FROM "user" 
    WHERE ("user".payment_info #>> %(payment_info_1)s) = %(param_1)s
    

What am I doing wrong, and how can I make it work? BTW: Do I need to put an index on the external_id? (not present yet)

Upvotes: 2

Views: 2739

Answers (1)

Ilja Everilä
Ilja Everilä

Reputation: 52929

You could pretty much implement your original approach using an aliased function expression:

misc.setup_query(db_session, User).\
    select_from(
        User,
        func.jsonb_array_elements(User.payment_info['subscriptions']).
            alias('subs')).\
    filter(column('subs', type_=JSONB)['external_id'].astext == 'sub_3Q9Q4bP2zW')

which compiles to

SELECT "user".id AS user_id, "user".payment_info AS user_payment_info 
FROM "user", jsonb_array_elements("user".payment_info -> %(payment_info_1)s) AS subs 
WHERE (subs ->> %(subs_1)s) = %(param_1)s

On the other hand you could use the containment operator:

misc.setup_query(db_session, User).\
    filter(User.payment_info['subscriptions'].contains(
        [{'external_id': 'sub_3Q9Q4bP2zW'}]))

Note that the outermost list is required, as it is part of the "path" to check. With that same logic you could omit extracting the array:

misc.setup_query(db_session, User).\
    filter(User.payment_info.contains(
        {'subscriptions': [{'external_id': 'sub_3Q9Q4bP2zW'}]}))

The above @> using approaches are indexable using a GIN index. The 1st requires a functional index, because it extracts the array first:

CREATE INDEX user_payment_info_subscriptions_idx ON "user"
USING GIN ((payment_info -> 'subscriptions'));

The 2nd would require indexing the entire payment_info jsonb column. Creating GIN indexes can be done in SQLAlchemy model definitions with Postgresql-specific index options:

class User(Base):
    ...

Index('user_payment_info_subscriptions_idx',
      User.payment_info['subscriptions'],
      postgresql_using='gin')

As to why the various attempts proved unsuccessful:

  1. You're not supposed to access the comparator directly. It provides the operators for the type. In addition you're passing contains() the result of the expression

     ('subscriptions', 'external_id') == payment_subscription_id
    

    which is False, most likely (depends on what payment_subscription_id is). That is to say it is evaluated in Python.

  2. There is no json_contains() function in Postgresql (unlike MySQL). Use the @> operator, or SQL/JSON path functions, such as jsonb_path_exists().

  3. You have the wrong path. User.payment_info['subscriptions', 'external_id'].astext would match something like {"subscriptions": {"external_id": "foo"}}, but in your data subscriptions references an array.

Upvotes: 1

Related Questions