Reputation: 115
References:
id scheme
Format: id:<namespace>:<document-type>:<key/value-pairs>:<user-specified>
http://docs.vespa.ai/documentation/content/buckets.html
http://docs.vespa.ai/documentation/content/idealstate.html
It's possible to structure data with user-defined bucketing logic by using the 32 LSB of the document id (the n / g selections).
However, the query logic isn't very clear on how to route queries to a specific bucket range decided in advance.
E.g., it is possible to split data by time range (start-time/end-time) if I can define n (a number) that encodes the range. All documents tagged this way will end up in the same bucket (which will then be split on number of documents / size as configured).
However, how do I write a search query against data indexed in this manner? Is it possible to tell the processor to choose a specific bucket, or a range of buckets (in case the distribution algorithm has moved buckets)?
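For illustration, the time-range idea above could be sketched roughly as follows. This is only a sketch under assumptions: the one-hour bucket width, the class and method names, and the namespace/document-type values are all hypothetical; only the id layout (id:<namespace>:<document-type>:n=<number>:<user-specified>) comes from the format quoted above.

```java
import java.time.Instant;

public class TimeBucketId {
    // Hypothetical bucket width: every timestamp within the same hour maps to the same n.
    static final long BUCKET_SECONDS = 3600;

    // Compress a timestamp into the number used in the n=<number> part of the id.
    static long bucketNumber(Instant timestamp) {
        return Math.floorDiv(timestamp.getEpochSecond(), BUCKET_SECONDS);
    }

    // Build an id of the form id:<namespace>:<document-type>:n=<number>:<user-specified>.
    static String documentId(String namespace, String docType, Instant timestamp, String key) {
        return "id:" + namespace + ":" + docType + ":n=" + bucketNumber(timestamp) + ":" + key;
    }

    public static void main(String[] args) {
        // 9000 seconds after the epoch falls in the third hour, so n=2.
        Instant t = Instant.ofEpochSecond(9000);
        System.out.println(documentId("mynamespace", "event", t, "evt-1"));
    }
}
```

All documents whose timestamps fall in the same hour share the same n here, which is the property the question relies on.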
Upvotes: 3
Views: 218
Reputation: 2339
You can choose one bucket in a query by specifying the streaming.groupname
query property.
Either in the http request by adding
&streaming.groupname=[group]
or in a Searcher by
query.properties().set("streaming.groupname","[group]").
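As a rough sketch of the HTTP variant, the request URL could be assembled like this. The endpoint, the query text, and the class and method names are assumptions for illustration; only the streaming.groupname parameter comes from the answer above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class GroupQueryUrl {
    // URL-encode a single parameter value.
    static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Build a search URL restricted to one group via streaming.groupname.
    // The endpoint and query text are supplied by the caller (hypothetical values below).
    static String build(String endpoint, String queryText, String group) {
        return endpoint + "?query=" + enc(queryText)
                + "&streaming.groupname=" + enc(group);
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8080/search/", "error", "group1"));
    }
}
```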
If you want multiple buckets, use the parameter streaming.selection
instead, which accepts any document selection expression: http://docs.vespa.ai/documentation/reference/document-select-language.html
To specify e.g. two buckets, set streaming.selection
(in the HTTP request or a Searcher) to
id.group=="[group1]" or id.group=="[group2]"
See http://docs.vespa.ai/documentation/streaming-search.html
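A selection covering several groups could be built along these lines. This is a sketch, not an official helper: the class and method names are hypothetical, and the clauses are joined with `or` because a document id carries a single group, so a document can match at most one of the clauses.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

public class GroupSelection {
    // Build a document selection expression matching any of the given groups.
    // Note: group names are inserted as-is; quotes inside a name are not escaped.
    static String forGroups(List<String> groups) {
        return groups.stream()
                .map(g -> "id.group==\"" + g + "\"")
                .collect(Collectors.joining(" or "));
    }

    // Render the expression as a URL-encoded streaming.selection request parameter.
    static String queryParameter(List<String> groups) {
        return "streaming.selection="
                + URLEncoder.encode(forGroups(groups), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(forGroups(List.of("group1", "group2")));
        System.out.println(queryParameter(List.of("group1", "group2")));
    }
}
```

In a Searcher, the unencoded string from forGroups would be the value passed to query.properties().set("streaming.selection", ...).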
Note that streaming search should only be used when each query only needs to search one or a few buckets. It avoids building reverse indexes, which is cheaper in that special case (only).
Upvotes: 4
Reputation: 3184
The &streaming.* parameters are described here: http://docs.vespa.ai/documentation/reference/search-api-reference.html#streaming.groupname
This only applies to document types that are configured with mode=streaming. In the default mode, which is index, you cannot control the query routing: http://docs.vespa.ai/documentation/reference/services-content.html#document
Upvotes: 0