user2648990
user2648990

Reputation: 103

Simple Sphinx & mySQL Query

Forgive me for asking what should be a simple question but I am totally new to Sphinx.

I am using Sphinx with a mySQL datastore. The table looks like below with the Title and Content fields indexed by Sphinx.

CREATE TABLE `documents` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `group_id` int(11) NOT NULL,
  `group_id2` int(11) NOT NULL,
  `date_added` datetime NOT NULL,
  `title` varchar(255) NOT NULL,
  `content` text NOT NULL,
  `url` varchar(255) NOT NULL,
  `links` int(11) NOT NULL,
  `hosts` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `url` (`url`)
) ENGINE=InnoDB AUTO_INCREMENT=439043 DEFAULT CHARSET=latin1

Now, if I connect through Sphinx with

mysql -h0 -P9306

I can run a simple query like...

SELECT * FROM test1 WHERE MATCH('test document');

And I will get back a result set like...

+--------+----------+------------+
| id     | group_id | date_added |
+--------+----------+------------+
| 360625 |        1 | 1499727792 |
| 362257 |        1 | 1499727807 |
| 362777 |        1 | 1499727811 |
| 159717 |        1 | 1499717614 |
| 160557 |        1 | 1499717621 |
----------------------------------

When what I actually want is it to return a result set with column values from the documents table (like the URL, Title, Links, Hosts, etc. columns) and, if all possible, sort these by the relevancy of the Sphinx match.

Can that be accomplished in a single query? What might it look like?

Thanks in advance!

Upvotes: 1

Views: 497

Answers (1)

barryhunter
barryhunter

Reputation: 21091

Two (main) options

  1. Take the ids from the SphinxQL result, and run a MySQL Query to get the full details, see http://sphinxsearch.com/info/faq/#row-storage eg SELECT * FROM documents WHERE id IN (3,5,7) ORDER BY FIELD(id,3,5,7) This MySQL query, should be VERY quick, because its a PK lookup, and only retrieving a few rows (ie one page of results) - the heavy lifting of searching the whole table has already been done in first Sphinx Query.

  2. Duplicate all the columns you want to retrieve in the resultset as Attributes. You've already made group_id and date_added as attributes, would need to make more attributes.

    sql_field_string is a very convenient shortcut to make BOTH a Field and an String Attribute from one column. Not available for other column types, but less useful, as numeric columns, are not typically needed as fields anyway.

option 1 is good in it avoids duplicating the data, and saves memory (Sphinx wants to typically hold attributes in memory) - and may be most practical on big datasets.

whereas option 2 is good in that it avoids a second query for each result. But because have a copy of data, it may mean additional complication syncing.

  • Doesn't look relevant in your case, but if say had a 'clicks' column, which you want in increment often (when users click!), and need it in resultset but you don't really need it in sphinx for query purposes, the first option, would allow you only have to increment it in database, and the mysql query would always get the live value. But the second option means having to keep sphinx index in 'sync' at all times)

Upvotes: 2

Related Questions