Raghu Venmarathoor

Reputation: 868

View all the documents loaded into Vespa

Is there any way to fetch all the documents loaded into Vespa?

I tried querying with regular expressions, but it didn't work as expected.

select * from entity where ID matches "[.]+";

ID is not an attribute, but I also tried an attribute field. Neither query returned any results.

Upvotes: 5

Views: 929

Answers (2)

Jon

Reputation: 2339

For dumping documents, visiting is usually preferable to search. You can visit either with the vespa-visit tool or through the document/v1 REST API.
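As a rough illustration of visiting through document/v1, here is a minimal Python sketch that pages through one document type by following the `continuation` token returned with each response. The endpoint `http://localhost:8080`, the namespace `mynamespace`, and the helper names `fetch_json` and `visit_all` are assumptions for the example, not something from the question. Adjust them to your application.

```python
import json
import urllib.parse
import urllib.request


def fetch_json(url):
    """Fetch a URL and decode the JSON response body."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def visit_all(endpoint, namespace, doctype, fetch=fetch_json, wanted=100):
    """Yield every document of one type by following continuation tokens.

    `endpoint` and `namespace` are placeholders for your own deployment.
    `fetch` is injectable so the paging logic can be tried without a server.
    """
    base = f"{endpoint}/document/v1/{namespace}/{doctype}/docid"
    continuation = None
    while True:
        params = {"wantedDocumentCount": wanted}
        if continuation:
            params["continuation"] = continuation
        page = fetch(base + "?" + urllib.parse.urlencode(params))
        yield from page.get("documents", [])
        # No continuation token in the response means the visit is complete.
        continuation = page.get("continuation")
        if not continuation:
            break
```

Usage would look something like `for doc in visit_all("http://localhost:8080", "mynamespace", "entity"): print(doc["id"])`.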

If you want to use search, use this query to match all documents of a type:

select * from yourdocumenttype where sddocname contains 'yourdocumenttype';

To iterate over all documents this way, it is more efficient to use some field in your document to partition the document set into smaller chunks and query for one chunk at a time. For example, if you have a timestamp field, add a range condition to the query so that each query retrieves the documents for one slice of time.
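The chunking idea could be sketched like this in Python. The helper `slice_queries` and the field name `timestamp` are hypothetical. The sketch only generates one query string per half-open range, and it assumes the field is numeric and usable in range conditions.

```python
def slice_queries(doctype, field, start, end, step):
    """Yield one YQL query per half-open range [low, high) of `field`.

    `field` is assumed to be a numeric field such as a timestamp.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    low = start
    while low < end:
        high = min(low + step, end)
        yield (
            f"select * from {doctype} where sddocname contains '{doctype}' "
            f"and {field} >= {low} and {field} < {high};"
        )
        low = high


# Print the queries for three slices of a hypothetical timestamp field.
for query in slice_queries("entity", "timestamp", 0, 250, 100):
    print(query)
```

Each query still returns only as many hits as the `hits` request parameter allows, so pick a slice size small enough that one slice fits in a single result set.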

(Regular expressions are only supported in streaming mode.)

Upvotes: 5

Kristian Aune

Reputation: 996

To dump all documents from Vespa, use vespa-visit.

"Visit" is a different interface from the search interface. It is built for large data transfers with high throughput, but not necessarily low latency.

Teams use visit to extract a full dump, or a subset by using a selection expression.
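To make the subset case concrete, here is a small Python sketch that builds a document/v1 visit URL carrying a selection expression in the `selection` request parameter. It uses document/v1 rather than vespa-visit so that it stays plain HTTP. The endpoint, the namespace, the `timestamp` field, and the helper name `visit_url` are all assumptions for illustration.

```python
import urllib.parse


def visit_url(endpoint, namespace, doctype, selection):
    """Build a document/v1 visit URL restricted by a selection expression."""
    query = urllib.parse.urlencode({"selection": selection})
    return f"{endpoint}/document/v1/{namespace}/{doctype}/docid?{query}"


# Hypothetical example: only documents whose timestamp field exceeds a value.
url = visit_url(
    "http://localhost:8080",
    "mynamespace",
    "entity",
    "entity.timestamp > 1600000000",
)
print(url)
```

The expression is URL-encoded by `urlencode`, so spaces and comparison operators in the selection are safe to pass as written.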

Upvotes: 3
