Alaa

Reputation: 4611

Solr: how to search for substring in facets

I have a country field in my Solr DB that lists the countries related to an item. The list of countries is pipe-separated, for example:
<arr name="country"> <str>France | United Kingdom | Norway | UAE </str> </arr>
and another item like
<arr name="country"> <str>Australia | Belgium | Argentina </str> </arr>
Now I need to search for all items related to United Kingdom OR Belgium. I tried this:

http://127.0.0.1:8888/solr/MyDb/select/?q=*:*&version=2.2&start=0&rows=10&indent=on&facet=true&fq=country:United+Kingdom+OR+Belgium

But this didn't work. Could you please guide me on how to do this search?
Thanks for your help.

Upvotes: 0

Views: 1322

Answers (2)

Michael Dillon

Reputation: 32392

For this particular data, multivalued fields are the right answer, but I wanted to say a bit about pipe-separated fields. I have used these quite a bit, but always to flatten an object hierarchy, for instance to represent a currency amount as GBP|75000 or a dimension as ft|14.

In one case I used it to represent a section of an XML document that had various combinations of 7 different tags, so I used a single field with a pipe separated list of 7 items. For example:

Promotion|||December Days||773635554238
|quarterpage|||||883736656534

The one thing about all of these examples is that the position within the item list is fixed, i.e. currency code is always first or the Marketing ID is always last. That means that you can reliably search for things like GBP|* to find all documents with pound sterling currency or *|quarterpage|* to find all documents with quarter page ads.
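As a rough illustration of why fixed positions make these patterns reliable, here is a small Python sketch. It uses the standard library's fnmatchcase only to imitate the shape of the wildcard patterns; it says nothing about how Solr itself evaluates wildcard queries. The document IDs and the USD value are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical documents: ID -> pipe-separated field value.
# "USD|120" and the doc IDs are invented for illustration.
docs = {
    "doc1": "GBP|75000",
    "doc2": "USD|120",
    "doc3": "|quarterpage|||||883736656534",
}

def matching(pattern):
    """Return the sorted IDs of documents whose value matches the glob pattern."""
    return sorted(
        doc_id for doc_id, value in docs.items() if fnmatchcase(value, pattern)
    )

# Currency code is always first, so a prefix pattern is enough.
gbp = matching("GBP|*")

# The ad size sits between pipes, so wildcards are needed on both sides.
ads = matching("*|quarterpage|*")
```

Because the currency code can only ever appear in the first slot, GBP|* cannot accidentally match some other part of the value.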

One weakness of this last example is that you have to be careful to use globally unique terms across all the different uses of the 7 items in the Marketing field, which leads to longer terms and therefore higher RAM usage. It would not work if sometimes cat means category and sometimes cat means catalog.

Upvotes: 1

Jayendra

Reputation: 52779

What analysis is performed on the country field at index and query time?

I would suggest:

Index the countries as a multivalued field instead of separating them with |. Use a fieldType with minimal analysis, or the string field type, for filtering.

<field name="country" type="string" indexed="true" stored="true" multiValued="true"/>

Filter queries should then work with:

fq=country:Norway
fq=country:("United Kingdom" OR Belgium)
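A minimal Python sketch of both halves of this suggestion, assuming you control the indexing code: splitting the existing pipe-separated string into separate values for the multivalued field, and building a quoted fq clause so that multi-word names such as United Kingdom stay a single term. The helper names split_countries and build_country_fq are made up for illustration and are not part of any Solr client library.

```python
from urllib.parse import urlencode

def split_countries(raw):
    """Split a pipe-separated string into trimmed, non-empty values."""
    return [part.strip() for part in raw.split("|") if part.strip()]

def build_country_fq(countries, field="country"):
    """Build an fq clause, quoting each value so multi-word names stay one term."""
    quoted = [
        '"' + c.replace("\\", "\\\\").replace('"', '\\"') + '"'
        for c in countries
    ]
    return f"{field}:(" + " OR ".join(quoted) + ")"

# Values to send as separate entries of the multivalued country field.
values = split_countries("France | United Kingdom | Norway | UAE ")

# Filter query for documents related to either country.
fq = build_country_fq(["United Kingdom", "Belgium"])

# URL-encoded query string to append to the select handler URL.
params = urlencode({"q": "*:*", "rows": 10, "fq": fq})
```

Quoting every value, including single-word ones, is a simplification that keeps the helper short; the unquoted form for Belgium shown above should work just as well.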

Upvotes: 1
