Mateusz Kubica
Mateusz Kubica

Reputation: 11

How to modify SQLAlchemy SQL statement, right before execution?

Abstract:
For purpose of semester project on my university I need to implement entity-level access module for SQLAlchemy. ORM queries should return only those objects that user has access to. For example:

# python-pseudocode
# SomeModel contains multiple rows in database, but some_user only have acces to some_obj1, some_obj2

session.query(SomeModel).all()
[]  # returns empty list because user is not set

ACL.set_user(some_user)
session.query(SomeModel).all()
['some_obj1', 'some_obj2']  # returns object that user have acces to

Problem:
I have already implemented that, by extending BaseQuery of SQLAlchemy and overriding iterator, pretty much something like explained here, but my lecturer pointed out that is not right approach due to its poor-efficiency. He submited that I am filtering objects after retrieving them from database, so if there are million of rows in DB, and user only have acces to several of them I am retrieving those million objects for nothing. He suggested that I should intecept SQL statement riqht before its execution.

I did some research and I found SQLAlchemy events, before_cursor_execute seems to be good place to intercept statement. I had idea that I could parse statement and inject WHERE <tablename>.id IN <ids_that_user_have_access_to>. Unfortunately, I faced three problems:

  1. I need to be sure that I inject WHERE clause in right place in statement. I have this intuition that it's easy when query is simple, but with more complex queries it might be tricky. Is there way to insert WHERE in place that will always work fine?
  2. When it comes to parsing, statement intercepted in before_cursor_execute event, has this weird formatting with question marks, something like: SELECT sm.id AS sm_id, sm.some_field AS sm_some_field FROM sm WHERE sm.id = ? , parameters associated with statement are in parameters tuple (for exemplary statement it would be (1, )) which is passed to function along with statement. I tested parsing statement with moz_sql_parser but of course, question marks are not valid in SQL statement, so it cannot be parsed. How to parse such statement?
  3. Even when I'll be able to parse statement and inject WHERE clause, how would I know in which position in newly created parameters tuple should I place appropriate parameter?

Upvotes: 0

Views: 2560

Answers (1)

Mateusz Kubica
Mateusz Kubica

Reputation: 11

I am replying to myself but I found suitable solution.

Turned out before_execute, event is better place to intercept and modify query, because uncompiled SQL construct ClauseElement is available. This makes modyfing query much easier.

@event.listens_for(engine, 'before_execute', retval=True)
def intercept(conn, clauseelement, multiparams, params):
    from sqlalchemy.sql.selectable import Select

    # check if it's select statement
    if isinstance(clauseelement, Select):
        # 'froms' represents list of tables that statement is querying
        table = clauseelement.froms[0]

        # adding filter in clause
        clauseelement = clauseelement.where(table.c.id.in_(list_of_ids))

    return clauseelement, multiparams, params

For more, read about before_execute event and expression API.

Upvotes: 1

Related Questions