Reputation: 7391
I have a model A associated with model B via an INNER JOIN:
class A < ActiveRecord::Base
  has_many :bees, as: :bable

  scope :bees, -> {
    joins("INNER JOIN bees AS b ON id = b.bable_id .......")
  }
end
class B < ActiveRecord::Base
  self.table_name = "bees"

  belongs_to :bable, polymorphic: true
end
I need to filter using B's datetime field (created_at), so I declared a new attribute thus:
has bees.created_at, as: :b_created_at
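For reference, that declaration sits inside a Thinking Sphinx index definition along these lines (simplified; the index name and the full-text field below are placeholders, not my real definition):

# app/indices/a_index.rb (simplified sketch)
ThinkingSphinx::Index.define :a, with: :active_record do
  indexes name   # placeholder full-text field

  # :bees is a has_many association, so Thinking Sphinx aggregates
  # the joined timestamps with GROUP_CONCAT into a multi-value attribute
  has bees.created_at, as: :b_created_at
end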
The generated Sphinx query statement now includes:
GROUP_CONCAT(DISTINCT UNIX_TIMESTAMP(bees.`created_at`) SEPARATOR ',') AS `b_created_at`
After indexing, my Sphinx index file size exploded.
Thanks
Upvotes: 0
Views: 231
Reputation: 678
It appears that the indexer is creating, within the index file, a comma-separated list of the created_at timestamps of all bees. As created timestamps are generally unique (!), this indexing is going to create one item for every bee. If you have a lot of bees, this is going to be big.
I would be looking for some way to bypass Sphinx for this part of the query, if that is possible, and get it to add a direct SQL BETWEEN LowDateTs AND HighDateTs against the built-in created_at column instead. I hope this is possible - it will definitely be better than using a text index to find it.
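For example, on the Rails side that bypass could be as simple as filtering the join directly with ActiveRecord, keeping Sphinx out of the date range entirely (the bounds here are placeholders):

low  = 2.weeks.ago
high = Time.zone.now

# plain SQL range condition on bees.created_at, no Sphinx attribute involved
A.joins(:bees).where(bees: { created_at: low..high })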
Hope this is of some help.
Edit:
Speed-reading the Sphinx docs:
[...] WHERE clause. This clause will map both to fulltext query and filters. Comparison operators (=, !=, <, >, <=, >=), IN, AND, NOT, and BETWEEN are all supported and map directly to filters [...]
So the key is to stop it treating the timestamp as a text search and use a BETWEEN instead, which will be vastly more efficient and hopefully stop it trying to use text indexing on this field.
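If you stay within Thinking Sphinx, a ranged attribute filter via :with should give you exactly that, since b_created_at is stored as timestamps rather than text (bounds are placeholders again):

low  = 2.weeks.ago
high = Time.zone.now

# :with filters operate on Sphinx attributes, so this maps to a range
# (BETWEEN-style) filter instead of a full-text match
A.search with: { b_created_at: low..high }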
Upvotes: 1