Alex Kahn

Reputation: 537

Why is my index missing documents with Thinking Sphinx?

I have a simple Thinking Sphinx index defined on my Account model:

define_index do
  indexes display_name
  indexes email_addresses.email_address

  has created_at
  set_property :delta => :datetime, :threshold => 2.minutes
end

(Ignore the delta for now; I'm generating the full index and searching account_core.)

But I'm getting some unexpected results:

>> Account.count
# => 885138

>> Account.search.total_entries
# => 260795

>> Account.search("[email protected]")
# => []

However, on the command line, using the search utility, I'm able to find Lenny:

$ search -c /etc/sphinx/water.sphinx.conf -i account_core [email protected]

index 'account_core': query '[email protected] ': returned 2 matches of 2 total in 0.759 sec

displaying matches:
1. document=3543432, weight=4, sphinx_internal_id=442101, sphinx_deleted=0, class_crc=0, created_at=Mon Apr 11 12:18:08 2011
2. document=5752816, weight=2, sphinx_internal_id=719552, sphinx_deleted=0, class_crc=0, created_at=Tue Dec 27 12:01:12 2011

Indeed, those are Lenny's account IDs.

Why am I not able to find Lenny when searching using Thinking Sphinx? Why is the total_entries number so much smaller than the total rows in the accounts table?

Upvotes: 4

Views: 548

Answers (1)

Alex Kahn

Reputation: 537

It turns out the issue was how Thinking Sphinx handles single table inheritance (STI). TS only returns records whose type corresponds to one of the parent class's subclasses. If type is NULL, the document isn't included in the search results, which would explain why the command-line search utility could still find documents that Account.search could not. We had a lot of records in the accounts table with type=NULL, so both symptoms (the empty result and the low total_entries) had the same cause. After fixing the data, searching works as expected.
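To make the filtering rule concrete, here is a minimal plain-Ruby sketch of the behaviour described above. It is a simulation only, not Thinking Sphinx's actual code: Row, KNOWN_SUBCLASSES, searchable? and the subclass names are all hypothetical and chosen for illustration.

```ruby
# Simulation of the STI filtering described in this answer.
# NOT Thinking Sphinx's implementation; every name here is hypothetical.
Row = Struct.new(:id, :type)

# Assumed subclass names of Account (invented for this example).
KNOWN_SUBCLASSES = %w[PersonalAccount BusinessAccount].freeze

# A row shows up in search results only if its type names a known subclass.
# A nil type (NULL in the database) never matches, so the row is skipped.
def searchable?(row)
  KNOWN_SUBCLASSES.include?(row.type)
end

rows = [
  Row.new(1, "PersonalAccount"),
  Row.new(2, nil),
  Row.new(3, "BusinessAccount"),
  Row.new(4, nil)
]

visible, hidden = rows.partition { |row| searchable?(row) }

puts "rows in table:   #{rows.size}"
puts "rows searchable: #{visible.size}"
puts "hidden row ids:  #{hidden.map(&:id).inspect}"
```

Against the real table, a count such as Account.where(:type => nil).count (assuming an ActiveRecord version that provides where) should show how many rows are affected; a large number there would be consistent with the gap between Account.count and total_entries.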

Thanks to roman3x in #sphinxsearch for pointing me to this.

Upvotes: 1
