user2075215

Reputation: 379

Solr filter query on document value

My very long query strings are returning a 414 HTTP response, and I'm looking for a way around that. Some queries reach up to 10,000 characters. I could raise the URL length limit in Apache/Jetty, but I'd rather not let anyone send 10,000-character URLs to my web server.

Is there a way in Solr to save a large query string in a document and use it in a filter query?

select?q=*:*&fq=id:123 returns the whole document, but is there a way to use the value of a field in document 123 inside the query?

The field queryValue in the document with id 123 would contain Intersects((LONGSTRING)).

So is there a way to do something like select?q=*:*&fq=foo:{id:123.queryValue}

This would be equivalent to select?q=*:*&fq=foo:Intersects((LONGSTRING)).

Upvotes: 0

Views: 724

Answers (2)

Alexandre Rafalovitch

Reputation: 9789

There are alternatives worth trying before the literal solution you are asking for:

  1. You can POST the query to Solr instead of using GET. The parameters then travel in the request body, so the URL length limit that triggers the 414 does not apply.
  2. If you are sending a long list of ids joined with OR, there are alternative query parsers that handle this more efficiently (e.g. TermsQueryParser).
  3. If you have constant (or semi-constant) query parameters, you could factor them out into defaults on request handlers (in solrconfig.xml). You can create as many request handlers as you want and the defaults can be overridden per request, so this effectively lets you pre-define classes/types of queries.
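As a rough sketch of the first two points, the snippet below builds (but does not send) a POST request whose parameters go in the body, and a filter string for the terms query parser. The URL, the core name mycore and the helper names build_post_request and terms_filter are assumptions made up for illustration; only the {!terms f=...} local-params syntax comes from Solr itself.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: adjust host, port and core name to your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def build_post_request(params):
    """Build a POST request that carries the Solr parameters in the body."""
    body = urlencode(params).encode("utf-8")
    return Request(
        SOLR_SELECT_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def terms_filter(field, values):
    """Build a terms query parser filter for a list of exact values."""
    return "{!terms f=%s}%s" % (field, ",".join(values))


# A filter far longer than a typical URL limit; the URL itself stays short.
long_filter = "foo:Intersects((" + "LONGSTRING" * 1000 + "))"
request = build_post_request({"q": "*:*", "fq": long_filter})

# To actually send it: urllib.request.urlopen(request)
```

The server may still enforce a limit on the size of the POST body, so very large requests are worth testing against your own configuration.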

Upvotes: 1

MatsLindh

Reputation: 52822

Two possibilities:

Joining

You can use the Join query parser to fetch the result from one collection / core and use that to filter results in a different core, but there are several limitations that become relevant with larger installations and data sizes. You'll have to experiment to see if this works for your use case.

The Join Query Parser
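To give an idea of what such a filter could look like, here is a small sketch that builds the fq string. The field names queryValue and foo and the id 123 come from the question; the core name queries and the helper join_filter are hypothetical. Keep in mind that a join matches on the indexed values of the from and to fields, so it does not by itself evaluate a stored query string such as Intersects(...).

```python
def join_filter(from_field, to_field, inner_query, from_index=None):
    """Build a filter string for Solr's join query parser."""
    local_params = "from=%s to=%s" % (from_field, to_field)
    if from_index:
        # fromIndex names the other core; "queries" below is a made-up name.
        local_params += " fromIndex=%s" % from_index
    return "{!join %s}%s" % (local_params, inner_query)


fq = join_filter("queryValue", "foo", "id:123", from_index="queries")
print(fq)  # {!join from=queryValue to=foo fromIndex=queries}id:123
```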

Hashing

As long as you're only doing exact matches, hash the string on the client side both when indexing and when querying. Exactly how you do this will depend on your language of choice. In Python you'd hash the long string with hashlib. With sha256 the resulting string, which you use for indexing and querying, is 64 characters long in hex form or 44 in base64.

Example:

>>> import hashlib
>>> hashlib.sha256(b"long_query_string_here").hexdigest()
'19c9288c069c47667e2b33767c3973aefde5a2b52d477e183bb54b9330253f1e'

You would then store the 19c92... value in Solr, and apply the same transformation to the value you're querying for.

fq=hashed_id:19c9288c069c47667e2b33767c3973aefde5a2b52d477e183bb54b9330253f1e
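Put together, the client side could look roughly like the sketch below. The helper names hash_value and hashed_filter are invented for this example, and hashed_id is simply the field name used above. The hex form is used for the filter because it contains only 0-9 and a-f, so nothing needs escaping in the query.

```python
import base64
import hashlib


def hash_value(value, encoding="hex"):
    """Return the sha256 of a string as hex (64 chars) or base64 (44 chars)."""
    digest = hashlib.sha256(value.encode("utf-8"))
    if encoding == "hex":
        return digest.hexdigest()
    return base64.b64encode(digest.digest()).decode("ascii")


def hashed_filter(value, field="hashed_id"):
    """Build the fq value for an exact match on the hashed field."""
    return "%s:%s" % (field, hash_value(value))


long_string = "Intersects((" + "LONGSTRING" * 1000 + "))"

# Use the same function at index time (to fill hashed_id) and at query time.
fq = hashed_filter(long_string)
print(len(long_string), len(fq))
```

The filter stays the same short length no matter how long the original string is, which is what keeps the request under the URL limit.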

Upvotes: 1
