SQLAlchemy - subquery in a WHERE clause

Chad Birch picture Chad Birch · Jun 1, 2011 · Viewed 64.9k times · Source

I've just recently started using SQLAlchemy and am still having trouble wrapping my head around some of the concepts.

Boiled down to the essential elements, I have two tables like this (this is through Flask-SQLAlchemy):

class User(db.Model):
    __tablename__ = 'users'
    user_id = db.Column(db.Integer, primary_key=True)

class Posts(db.Model):
    __tablename__ = 'posts'
    post_id = db.Column(db.Integer, primary_key=True)
    user_id = db.Column(db.Integer, db.ForeignKey('users.user_id'))
    post_time = db.Column(db.DateTime)

    user = db.relationship('User', backref='posts')

How would I go about querying for a list of users and their newest post (excluding users with no posts). If I was using SQL, I would do:

SELECT [whatever]
FROM posts AS p
    LEFT JOIN users AS u ON u.user_id = p.user_id
WHERE p.post_time = (SELECT MAX(post_time) FROM posts WHERE user_id = u.user_id)

So I know exactly the "desired" SQL to get the effect I want, but no idea how to express it "properly" in SQLAlchemy.

Edit: in case it's important, I'm on SQLAlchemy 0.6.6.

Answer

zzzeek picture zzzeek · Jun 13, 2011

the previous answer works, but also the exact sql you asked for is written much as the actual statement:

print s.query(User, Posts).\
    outerjoin(Posts.user).\
    filter(Posts.post_time==\
        s.query(
            func.max(Posts.post_time)
        ).
        filter(Posts.user_id==User.user_id).
        correlate(User).
        as_scalar()
    )

I guess the "concept" that isn't necessarily apparent is that as_scalar() is currently needed to establish a subquery as a "scalar" (it should probably assume that from the context against ==).

Edit: Confirmed, that's buggy behavior, completed ticket #2190. In the current tip or release 0.7.2, the as_scalar() is called automatically and the above query can be:

print s.query(User, Posts).\
    outerjoin(Posts.user).\
    filter(Posts.post_time==\
        s.query(
            func.max(Posts.post_time)
        ).
        filter(Posts.user_id==User.user_id).
        correlate(User)
    )