Reputation: 159
I'm new to backend development and Python. I'm trying to convert a SQL query that has a subquery and an inner join into a SQLAlchemy query.
I'm using Python > 3.5, the "oracle+cx_oracle" driver, and SQLAlchemy==1.3.6.
The SQL query that I need to convert is:
SELECT DISTINCT d.*, t.max_date, t.USER_ID
FROM DATABASE_ONE d
INNER JOIN (
    SELECT USER_ID, MAX(DATE_CREATED) max_date FROM (
        SELECT * FROM DATABASE_ONE
        WHERE USER_ID IN ('1','2','3')
        AND CATEGORY = 'Users'
        AND IS_DELETED IS NULL
    )
    GROUP BY USER_ID
) t ON t.USER_ID = d.USER_ID AND t.max_date = d.DATE_CREATED
Below is my attempt at converting the above SQL query to a SQLAlchemy query.
# --------------------- first query -------------------
first_query = (
Session.query(DatabaseOne)
.filter(
DatabaseOne.user_id.in_(['1', '2', '3']),
DatabaseOne.category == "Obligor Rating Scorecards",
DatabaseOne.is_deleted == None,
)
.subquery("first_query")
)
# -------------------- second query --------------------
second_query = (
Session.query(
DatabaseOne.user_id,
func.max(DatabaseOne.date_created).label("max_date"),
)
.select_from(first_query)
.group_by(DatabaseOne.user_id)
.subquery("second_query")
)
# -------------------- final query --------------------
final_query = (
Session.query(
DatabaseOne, second_query.c.max_date, second_query.c.user_id
)
.join(
second_query,
and_(
second_query.c.user_id == DatabaseOne.user_id,
DatabaseOne.date_created == second_query.c.max_date,
),
)
.distinct(DatabaseOne)
.all()
)
Upvotes: 0
Views: 511
Reputation: 136
Based on your SQL, I think you want, for each of the given user_id values, the rows with the latest DATE_CREATED. First, the SQL can be simplified a little:
SELECT DISTINCT d.*, t.max_date, t.USER_ID
FROM DATABASE_ONE d
INNER JOIN (
    SELECT USER_ID, MAX(DATE_CREATED) max_date
    FROM DATABASE_ONE
    WHERE
        USER_ID IN ('1','2','3')
        AND CATEGORY = 'Users'
        AND IS_DELETED IS NULL
    GROUP BY USER_ID
) t ON t.USER_ID = d.USER_ID AND t.max_date = d.DATE_CREATED
In this form the query is easier to read and to convert into SQLAlchemy.
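To check that dropping the inner `SELECT *` layer does not change the result, here is a small sketch that runs both versions of the SQL against an in-memory SQLite database using only the standard library. The table layout, the `ID` column, and the sample rows are assumptions made up for the demo, and SQLite stands in for Oracle, so treat it as a sanity check rather than proof for your real schema.

```python
import sqlite3

# Hypothetical schema and sample data; only the column names used in the
# query come from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DATABASE_ONE ("
    "ID INTEGER PRIMARY KEY, USER_ID TEXT, CATEGORY TEXT, "
    "DATE_CREATED TEXT, IS_DELETED TEXT)"
)
conn.executemany(
    "INSERT INTO DATABASE_ONE VALUES (?, ?, ?, ?, ?)",
    [
        (1, "1", "Users", "2020-01-01", None),
        (2, "1", "Users", "2020-02-01", None),  # latest row for user 1
        (3, "2", "Users", "2020-01-15", None),  # latest non-deleted row for user 2
        (4, "2", "Users", "2020-03-01", "Y"),   # deleted, ignored by the subquery
        (5, "3", "Other", "2020-01-10", None),  # wrong category, ignored
        (6, "4", "Users", "2020-01-20", None),  # user not in the IN list
    ],
)

# Original form: aggregate over a nested SELECT *.
NESTED = """
SELECT DISTINCT d.*, t.max_date, t.USER_ID
FROM DATABASE_ONE d
INNER JOIN (
    SELECT USER_ID, MAX(DATE_CREATED) max_date FROM (
        SELECT * FROM DATABASE_ONE
        WHERE USER_ID IN ('1','2','3')
        AND CATEGORY = 'Users'
        AND IS_DELETED IS NULL
    )
    GROUP BY USER_ID
) t ON t.USER_ID = d.USER_ID AND t.max_date = d.DATE_CREATED
"""

# Simplified form: filter and aggregate in a single subquery.
FLAT = """
SELECT DISTINCT d.*, t.max_date, t.USER_ID
FROM DATABASE_ONE d
INNER JOIN (
    SELECT USER_ID, MAX(DATE_CREATED) max_date
    FROM DATABASE_ONE
    WHERE USER_ID IN ('1','2','3')
    AND CATEGORY = 'Users'
    AND IS_DELETED IS NULL
    GROUP BY USER_ID
) t ON t.USER_ID = d.USER_ID AND t.max_date = d.DATE_CREATED
"""

nested_rows = sorted(conn.execute(NESTED).fetchall())
flat_rows = sorted(conn.execute(FLAT).fetchall())
print(nested_rows == flat_rows)
print([row[0] for row in flat_rows])
```

Both queries return the same two rows here (IDs 2 and 3), one per user that has a matching, non-deleted row.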
The problem with your second SQLAlchemy query is that it selects from the subquery, while the selected columns and the GROUP BY still refer to the original table. Converted to SQL, the second query looks roughly like this:
SELECT DATABASE_ONE.USER_ID, MAX(DATABASE_ONE.DATE_CREATED) max_date FROM (
SELECT * FROM DATABASE_ONE
WHERE USER_ID IN ('1','2','3')
AND CATEGORY = 'Users'
AND IS_DELETED IS NULL
)
GROUP BY DATABASE_ONE.USER_ID
A better way to write the SQLAlchemy query:
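from sqlalchemy import and_, func

# Latest date_created per user, filtered and grouped in a single subquery.
subquery = (
    Session.query(
        DatabaseOne.user_id,
        func.max(DatabaseOne.date_created).label("max_date"),
    )
    .filter(
        DatabaseOne.user_id.in_(['1', '2', '3']),  # in_() takes one iterable
        DatabaseOne.category == "Obligor Rating Scorecards",  # the SQL above uses 'Users'
        DatabaseOne.is_deleted.is_(None),  # renders IS NULL
    )
    .group_by(DatabaseOne.user_id)
    .subquery()
)
# .c is the subquery's column collection. The join needs an explicit ON clause
# (and == rather than =) because there is no foreign key to infer it from.
rows = (
    Session.query(DatabaseOne, subquery.c.user_id, subquery.c.max_date)
    .select_from(subquery)
    .join(DatabaseOne, and_(
        DatabaseOne.user_id == subquery.c.user_id,
        DatabaseOne.date_created == subquery.c.max_date,
    ))
    .all()
)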
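If you want to try this pattern without an Oracle connection, the following is a minimal, self-contained sketch against an in-memory SQLite database. The `DatabaseOne` model definition, the `id` primary key, the string-typed dates, and the sample rows are all assumptions for illustration, since your real model is not shown in the question.

```python
from sqlalchemy import Column, Integer, String, and_, create_engine, func
from sqlalchemy.orm import sessionmaker

try:  # SQLAlchemy 1.4 and later
    from sqlalchemy.orm import declarative_base
except ImportError:  # SQLAlchemy 1.3
    from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class DatabaseOne(Base):
    """Hypothetical stand-in for the real model."""
    __tablename__ = "database_one"
    id = Column(Integer, primary_key=True)
    user_id = Column(String)
    category = Column(String)
    date_created = Column(String)  # ISO date strings keep the demo simple
    is_deleted = Column(String)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add_all([
    DatabaseOne(id=1, user_id="1", category="Users", date_created="2020-01-01"),
    DatabaseOne(id=2, user_id="1", category="Users", date_created="2020-02-01"),
    DatabaseOne(id=3, user_id="2", category="Users", date_created="2020-01-15"),
    DatabaseOne(id=4, user_id="2", category="Users", date_created="2020-03-01",
                is_deleted="Y"),
    DatabaseOne(id=5, user_id="4", category="Users", date_created="2020-01-20"),
])
session.commit()

# Latest date_created per user among the filtered rows.
latest = (
    session.query(
        DatabaseOne.user_id,
        func.max(DatabaseOne.date_created).label("max_date"),
    )
    .filter(
        DatabaseOne.user_id.in_(["1", "2", "3"]),
        DatabaseOne.category == "Users",
        DatabaseOne.is_deleted.is_(None),
    )
    .group_by(DatabaseOne.user_id)
    .subquery()
)

# Join the table back to the subquery on both user_id and the max date.
results = (
    session.query(DatabaseOne, latest.c.max_date)
    .join(
        latest,
        and_(
            DatabaseOne.user_id == latest.c.user_id,
            DatabaseOne.date_created == latest.c.max_date,
        ),
    )
    .order_by(DatabaseOne.id)
    .all()
)
latest_ids = [row[0].id for row in results]
print(latest_ids)
```

With this sample data the query returns the rows with id 2 and id 3, the newest non-deleted row for each requested user. Printing `str(latest)` is a quick way to see the SQL that SQLAlchemy generates for the subquery.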
Upvotes: 1