r2b2
r2b2

Reputation: 1275

Need advice on Sphinx query and index configuration

below are tables I need to index in Sphinx alt text

News has many projects linked to it via the news_projects table...

My question is it advisable to have all the project names in one text field so that it can be searched on?

This is my query :

//query to get the project names and make all project names appear in one field separated by comma.
SELECT
  news.id
  news.headline,
  news.body,
GROUP_CONCAT(DISTINCT projects.project_name) as related_projects
FROM  news 
LEFT JOIN news_projects on news.id = news_projects.news_id
LEFT JOIN projects on news_projects.project_id = projects.id
GROUP BY news.id

This will then output rows with project_name column formatted as 'name,name2,name3'

Im a bit undecided weather to use this or just make the project_id an attribute.. Having the project_name as text is definitely of big help because it can be searched on...

Please.. I need you opinion on this.. thanks a lot!

Upvotes: 0

Views: 516

Answers (2)

Johnny Woo
Johnny Woo

Reputation: 187

You can also specify related_projects field with sql_joined_field. It may be faster because you can remove grouping from the main query.

sql_query = SELECT id, headline, body FROM news

sql_joined_field = related_projects from query; \
    SELECT news_projects.news_id, projects.project_name \
    FROM news_projects \
    JOIN projects ON projects.id = news_projects.project_id \
    ORDER BY news_projects.news_id ASC

If you make delta indexes, though, you'll have to copy delta conditions to all joined fields (same with ranged queries), which can be somewhat of a pain.

Upvotes: 1

pat
pat

Reputation: 16226

If the project name is useful for searching on, then definitely keep it. I would recommend concatenating with a space instead of a comma, though (although it may not matter under the hood for Sphinx when you're using the default settings).

It's probably also useful to have the project ids as a Multi-Value Attribute (MVA) - which means you can limit search results to news items within a specific project. No reason to not have both, really.

Upvotes: 1

Related Questions