Kazuren
Kazuren

Reputation: 55

Filter for each item in list to check if it is in a list of ID's

I want to get only results where for every game's categories that category is in a list of id's, doing the following does not work. Game.categories is a hybrid_property that gives me the game's categories.

This is the query I am doing expecting to get a list of all the games that have at least one category in the list of ID's I specify.

games = Game.query \
  .filter([category.id for category in Game.categories].in_([1, 2, 3])) \
  .paginate(page, current_app.config['POSTS_PER_PAGE'], False)

My models.py is this. Note that each game's categories depends on the count that category has been defined by a user.

class CategoryUpvote(db.Model):
  id = db.Column(db.Integer, primary_key=True)
  user_id = db.Column(db.Integer, db.ForeignKey('user.id'))
  category_id = db.Column(db.Integer, db.ForeignKey('category.id'))
  game_id = db.Column(db.Integer, db.ForeignKey('game.id'))

class Category(db.Model):
  id = db.Column(db.Integer, primary_key=True)
  name = db.Column(db.String(32))

class Game(db.Model):
  id = db.Column(db.Integer, primary_key=True)
  title = db.Column(db.String(255), index=True, unique=True)

  category_upvotes = db.relationship('CategoryUpvote', backref='category', lazy='dynamic', cascade="save-update, merge, delete")

  def add_upvote(self, category, user):
    if not self.is_upvoted(category, user):
      upvote = CategoryUpvote(category_id=category.id, game_id=self.id, user_id=user.id)
      db.session.add(upvote)
    else:
      upvote = self.category_upvotes.filter(CategoryUpvote.category_id == category.id, CategoryUpvote.user_id == user.id, CategoryUpvote.game_id == self.id).first()
      db.session.delete(upvote)

  def is_upvoted(self, category, user):
    return self.category_upvotes.filter(CategoryUpvote.category_id == category.id, CategoryUpvote.user_id == user.id, CategoryUpvote.game_id == self.id).count() > 0

  @hybrid_property
  def categories(self):
    return Category.query \
      .filter(CategoryUpvote.game_id == self.id, CategoryUpvote.category_id == Category.id) \
      .group_by(Category.id) \
      .order_by(db.func.count(CategoryUpvote.game_id).desc()) \
      .limit(5) \
      .all()

I get this error

'list' object has no attribute 'in_'

How can I somehow run filter for each category a game has? for example

.filter(Game.categories[0].in_([1, 2, 3]), Game.categories[1].in_([1, 2, 3]), Game.categories[2].in_([1, 2, 3])) \

Also if Game had each Category as a foreign key I could do something like this

Game.query.join(Category).filter(Category.id.in_([1, 2, 3]))

How to do something similar with a hybrid_property?

Upvotes: 1

Views: 776

Answers (2)

Ilja Everilä
Ilja Everilä

Reputation: 52949

The attribute Game.categories, though meant as a hybrid_property, does not work in SQL context. When used to create SQL expression objects at class level it still performs an actual query against the database, returning a list of Category objects instead. Most of your errors result from this fact.

In addition it essentially performs a join (using the pre SQL-92 style) between Game and CategoryUpvote, returning 5 top voted categories from all games. It would make the property a lot more usable in general, if it were to return the Query without materializing to a list:

class Game(db.Model):

  @hybrid_property
  def categories(self):
    return Category.query \
      .join(CategoryUpvote) \
      .filter(CategoryUpvote.game_id == self.id) \
      .group_by(Category.id) \
      .order_by(db.func.count(CategoryUpvote.game_id).desc()) \
      .limit(5)

You would then access the list of categories of a Game instance using

game.categories.all()

or simply iterate over it directly. With the hybrid property returning a query you can now use it to generate correlated subqueries, for example. Then you could turn your attempt to filter categories based on a list of ids in to the predicate "the difference between given ids and category ids of a game identified by id is empty", or in other words all given ids are contained in the set of top-5 category ids of a game:

# A portable way to build a (derived) table of ids. Use VALUES
# instead, if supported.
ids = union(*[select([literal(i).label('id')]) for i in [1, 2, 3]])

Game.query.filter(~exists(
    ids.select()
        .except_(Game.categories
            .correlate(Game)
            .with_entities(Category.id)
            .subquery()
            .select())))

Please note that

I want to get only results where for every game's categories that category is in a list of id's

and

This is the query I am doing expecting to get a list of all the games that have at least one category in the list of ID's I specify

are not the same thing, and the example above is written with the former in mind, though the strict interpretation (every category in list) would require using symmetric difference.

In order to get the latter you would query using the predicate "the intersection between given ids and category ids of a game identified by id is not empty", so you would swap the except_() to intersect() and remove the NOT operator ~ in front of exists() – because if something is not empty, a row or rows exists.

Upvotes: 3

Arghya Saha
Arghya Saha

Reputation: 5723

The problem facing is because you are trying to run a query using a generator

category.id.in_([1,2,3]) for category in Game.categories

To give you a small example. When you run the following code you get the desired output of 0 to 5 in a list.

>>> print([a for a in range(5)])
[0, 1, 2, 3, 4]

Now if you do remove the square bracket then

>>> print(a for a in range(5))
<generator object <genexpr> at 0x102829f10>

You can see that the output is a generator expression. Same is the case with your code, since you are not evaluation python is sending it as a generator object

I don't have experience with sqlalchemy, But I feel what you are trying to achieve you can do something like this:

for category in Game.categories
    result.append(category.id.in_([1, 2, 3]))

games = Game.query \
  .filter(Game.id.in_(result)) \
  .paginate(page, current_app.config['POSTS_PER_PAGE'], False)

Upvotes: 1

Related Questions