Reputation: 3
I have some pretty large tables (~30M rows) that have the following similar structure:
- item_id - uint
- item_text - varchar(255)
- user_id - uint
There are no keys, and you can find records with same item_id and item_text but different user_id:
item_id, item_text, user_id
3, text1, 5
3, text1, 7
4, text2, 5
I'm trying to move from MySQL to Sphinx for doing text searches, so how can I translate something like:
SELECT * FROM table WHERE user_id=123 AND item_text LIKE '%search_string%'. ?!
With the current sphinx config setup the indexer is running out of disk space:
source items
{
...
sql_query_pre = SELECT @id := 0
sql_query = SELECT @id := @id + 1, item_id, item_text, user_id FROM items
sql_attr_uint = user_id
sql_attr_uint = item_id
}
index items_index
{
source = items
path = ...
enable_star = 1
min_prefix_len = 0
min_infix_len = 3
min_word_len = 3
}
Is there a way to use sql_attr_multi to store the user_id values found for each item_id ?
Thanks !
Upvotes: 0
Views: 3188
Reputation: 21091
You could do,
source items {
...
sql_query = SELECT item_id, item_text, GROUP_CONCAT(user_id) AS user_id \
FROM items GROUP BY item_id ORDER BY NULL
sql_attr_multi = uint user_id from field;
}
Also asking yourself, if you really min_infix_len? That vastly inflates the disk size (far more than duplicate rows will). Or maybe try dict=keywords
Upvotes: 3