Dao Kieu Vi

Reputation: 143

Indexing a multivalued field in Solr: choose type text or string?

I've been reading a spring-solr example and I have the following question:

A Product entity has many categories, and I want to index and query them.

In the schema I have two possible configurations:

<field name="categories" type="text_ws" multivalue="true"> (text_ws with a simple tokenizer: WhitespaceTokenizerFactory) (1)
or
<field name="categories" type="string" multivalue="true"> (2)

With option (1), I would send Solr a single string containing all the categories separated by spaces, and use it for both indexing and querying. With option (2), I would send a list of strings, where each string is one category.

I would like to know which option is better for indexing and for querying. For the entity representation, I prefer (2), a list of strings, over (1).

Upvotes: 1

Views: 362

Answers (2)

MatsLindh

Reputation: 52832

A StrField will only match if you have an exact match, or a wildcard query that matches the field. There is no tokenization or filtering performed on the values, and Foo is different to foo. This is fine if all you want is to filter based on a field value or use the field for faceting.

If you want to search in the field, matching just "beyond" to "Bed, Bath & Beyond", you'll want a TextField with different analyzers and tokenizers applied (depending on how and what you want to match).

Which means: it depends on your use case and what you want to achieve. There is also nothing wrong with having the same value in two different fields, where you perform different tokenization or analysis for each field. Use copyField to get the same value into both fields, then query each field depending on how you want to search or what operation you want to perform (i.e. use one for faceting and one for search).
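As a rough sketch of the two-field approach, a schema fragment could look like the following. The field name `categories_text` is hypothetical, and `text_general` is assumed to be defined in your schema (it ships with Solr's default configset, but yours may differ); adjust the analyzer to match how you want to search.

```xml
<!-- Exact, untokenized values: suited to filtering and faceting -->
<field name="categories" type="string" indexed="true" stored="true" multiValued="true"/>

<!-- Hypothetical tokenized copy: suited to searching within category names.
     Assumes a "text_general" field type exists in this schema. -->
<field name="categories_text" type="text_general" indexed="true" stored="false" multiValued="true"/>

<!-- Copy every value sent to "categories" into the tokenized field at index time -->
<copyField source="categories" dest="categories_text"/>
```

With something like this in place, you would still send the categories as a list of strings, facet or filter on `categories` (for example `fq=categories:"Bed, Bath & Beyond"`), and search on the copy (for example `q=categories_text:beyond`).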

Upvotes: 1

Ahmet Arslan

Reputation: 71

It depends on how you want to consume/use the categories field.

By the way, the attribute is multiValued, not multivalue.

Upvotes: 0
