Reputation: 8536
How can I structure this sqlalchemy query so that it does the right thing?
I've given everything I can think of an alias, but I'm still getting:
ProgrammingError: (psycopg2.ProgrammingError) subquery in FROM must have an alias
LINE 4: FROM (SELECT foo.id AS foo_id, foo.version AS ...
Also, as IMSoP pointed out, it seems to be trying to turn it into a cross join, but I just want it to join a table with a group by subquery on that same table.
Here is the sqlalchemy:
(Note: I've rewritten it to be a standalone file that is as complete as possible and can be run from a python shell)
from sqlalchemy import create_engine, func, select
from sqlalchemy import Column, BigInteger, DateTime, Integer, String, SmallInteger
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
engine = create_engine('postgresql://postgres:#######@localhost:5435/foo1234')
session = sessionmaker()
session.configure(bind=engine)
session = session()
Base = declarative_base()
class Foo(Base):
__tablename__ = 'foo'
__table_args__ = {'schema': 'public'}
id = Column('id', BigInteger, primary_key=True)
time = Column('time', DateTime(timezone=True))
version = Column('version', String)
revision = Column('revision', SmallInteger)
foo_max_time_q = select([
func.max(Foo.time).label('foo_max_time'),
Foo.id.label('foo_id')
]).group_by(Foo.id
).alias('foo_max_time_q')
foo_q = select([
Foo.id.label('foo_id'),
Foo.version.label('foo_version'),
Foo.revision.label('foo_revision'),
foo_max_time_q.c.foo_max_time.label('foo_max_time')
]).join(foo_max_time_q, foo_max_time_q.c.foo_id == Foo.id
).alias('foo_q')
thing = session.query(foo_q).all()
print thing
generated sql:
SELECT foo_id AS foo_id,
foo_version AS foo_version,
foo_revision AS foo_revision,
foo_max_time AS foo_max_time,
foo_max_time_q.foo_max_time AS foo_max_time_q_foo_max_time,
foo_max_time_q.foo_id AS foo_max_time_q_foo_id
FROM (SELECT id AS foo_id,
version AS foo_version,
revision AS foo_revision,
foo_max_time_q.foo_max_time AS foo_max_time
FROM (SELECT max(time) AS foo_max_time,
id AS foo_id GROUP BY id
) AS foo_max_time_q)
JOIN (SELECT max(time) AS foo_max_time,
id AS foo_id GROUP BY id
) AS foo_max_time_q
ON foo_max_time_q.foo_id = id
and here is the toy table:
CREATE TABLE foo (
id bigint ,
time timestamp with time zone,
version character varying(32),
revision smallint
);
The SQL was I expecting to get (desired SQL) would be something like this:
SELECT foo.id AS foo_id,
foo.version AS foo_version,
foo.revision AS foo_revision,
foo_max_time_q.foo_max_time AS foo_max_time
FROM foo
JOIN (SELECT max(time) AS foo_max_time,
id AS foo_id GROUP BY id
) AS foo_max_time_q
ON foo_max_time_q.foo_id = foo.id
Final note: I'm hoping to get an answer using select() instead of session.query() if possible. Thank you
Upvotes: 23
Views: 31468
Reputation: 7068
subquery in FROM must have an alias
This error means the subquery (on which we're trying to perform a join
) has no alias.
Even if we .alias('t')
it just to satisfy this requirement, we will then get the next error:
missing FROM-clause entry for table "foo"
That's because the join on
clause (... == Foo.id
) is not familiar with Foo
.
It only knows the "left" and "right" tables: t
(the subquery) and foo_max_time_q
.
Instead, select_from
a join of Foo
and foo_max_time_q
.
Replace .join(B, on_clause)
with .select_from(B.join(A, on_clause)
:
]).join(foo_max_time_q, foo_max_time_q.c.foo_id == Foo.id
]).select_from(foo_max_time_q.join(Foo, foo_max_time_q.c.foo_id == Foo.id)
This works here because A INNER JOIN B
is equivalent to B INNER JOIN A
.
To preserve the order of joined tables:
from sqlalchemy import join
and replace .join(B, on_clause)
with .select_from(join(A, B, on_clause))
:
]).join(foo_max_time_q, foo_max_time_q.c.foo_id == Foo.id
]).select_from(join(Foo, foo_max_time_q, foo_max_time_q.c.foo_id == Foo.id)
Alternatives to session.query() can be found here.
Upvotes: 1
Reputation: 473873
You are almost there. Make a "selectable" subquery and join it with the main query via join()
:
foo_max_time_q = select([func.max(Foo.time).label('foo_max_time'),
Foo.id.label('foo_id')
]).group_by(Foo.id
).alias("foo_max_time_q")
foo_q = session.query(
Foo.id.label('foo_id'),
Foo.version.label('foo_version'),
Foo.revision.label('foo_revision'),
foo_max_time_q.c.foo_max_time.label('foo_max_time')
).join(foo_max_time_q,
foo_max_time_q.c.foo_id == Foo.id)
print(foo_q.__str__())
Prints (prettified manually):
SELECT
foo.id AS foo_id,
foo.version AS foo_version,
foo.revision AS foo_revision,
foo_max_time_q.foo_max_time AS foo_max_time
FROM
foo
JOIN
(SELECT
max(foo.time) AS foo_max_time,
foo.id AS foo_id
FROM
foo
GROUP BY foo.id) AS foo_max_time_q
ON
foo_max_time_q.foo_id = foo.id
The complete working code is available in this gist.
Upvotes: 33