suyash308

Reputation: 347

How to use ML models in Vespa.ai?

We are trying to use an ML model in Vespa. We have textual data stored in Vespa. Can somebody help us with the questions below?

  1. An example of an ONNX model trained with scikit-learn and used in Vespa.
  2. Where to add preprocessing steps before model training and prediction with an ONNX model in Vespa, with an example.

Upvotes: 3

Views: 556

Answers (1)

Lester Solbakken

Reputation: 266

This is a very broad question and the answer very much depends on what your goals are. In general, the documentation for using an ONNX model in Vespa can be found here:

https://docs.vespa.ai/documentation/onnx.html

An example that uses an ONNX BERT model for ranking can be found in the Transformers sample application:

https://github.com/vespa-engine/sample-apps/tree/master/transformers

Note that both these links assume that you have an existing model. In general, Vespa is a serving platform and is not usually used in the model training process. As such, Vespa doesn't really care where your model comes from, be that scikit-learn, PyTorch, or any other system. ONNX is a general format for exchanging ML models between systems.
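For example, a model trained with scikit-learn can be exported to ONNX with the skl2onnx package. A minimal sketch, where the model type, feature count, input name, and file name are all illustrative:

```python
# Minimal sketch: train a scikit-learn model and export it to ONNX.
# The model, feature count, input name, and file name are illustrative.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

X, y = make_classification(n_samples=100, n_features=4, random_state=0)
model = LogisticRegression().fit(X, y)

# Declare the input signature: a float tensor with a dynamic batch dimension.
onnx_model = convert_sklearn(
    model, initial_types=[("input", FloatTensorType([None, 4]))]
)
with open("my_model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
```

The resulting my_model.onnx file is what you would place in your Vespa application package and reference from a rank profile, as described in the documentation linked above.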

However, there are some foundational ideas I should get across that may clarify things a bit. Vespa currently considers all ML models to have numeric inputs and outputs (in the form of tensors). This means you can't feed text directly to your model and have text come out on the other side. Most textual data these days is encoded to some numeric representation such as embedding vectors, or, as the BERT example above shows, the text is tokenized so that each token gets its own vector representation. After model computation, embedding vectors or token-set representations can be decoded back to text.
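To make the tensor-in, tensor-out point concrete, here is a minimal sketch that runs the model exported above with the onnxruntime package; the input must be a numeric tensor (for example an embedding vector), never raw text:

```python
# Minimal sketch: inspect and run the exported model with onnxruntime.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("my_model.onnx")
# The model's interface is purely numeric tensors.
print([(i.name, i.shape, i.type) for i in session.get_inputs()])

# A stand-in for an encoded representation, e.g. an embedding vector.
vec = np.random.rand(1, 4).astype(np.float32)
outputs = session.run(None, {"input": vec})
print(outputs)
```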

Vespa currently handles the computational part; the (pre-)processing of encoding/decoding text to embeddings or other representations is currently up to the user. Vespa does offer a rich set of features to help out here, in the form of document and query processors. So you can create a document processor that encodes the text of each incoming document to some representation before storing it. Likewise, a searcher (query processor) can be created that encodes incoming textual queries to a compatible representation before documents are scored against it.
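To illustrate the encode-before-store idea, here is a hypothetical client-side sketch that does the encoding at feed time instead of in a document processor; the encoder model, namespace, document type, and field names are all assumptions, and the embedding field is assumed to be declared as tensor<float>(x[384]) in the schema:

```python
# Hypothetical sketch: encode text to an embedding client-side, then feed
# the document through Vespa's /document/v1 HTTP API. All names here are
# assumptions; a document processor could do the same inside Vespa.
import requests
from sentence_transformers import SentenceTransformer  # one possible encoder

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional vectors

def feed(doc_id: str, text: str) -> None:
    embedding = encoder.encode(text).tolist()
    url = f"http://localhost:8080/document/v1/mynamespace/mydoc/docid/{doc_id}"
    response = requests.post(
        url,
        json={"fields": {"text": text, "embedding": {"values": embedding}}},
    )
    response.raise_for_status()

feed("doc1", "some stored text to encode")
```

A searcher would do the mirror image at query time: encode the query text to the same vector space before documents are ranked against it.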

So, in general, you would train your models outside of Vespa using whatever embedding or tokenization strategies your model requires. When deploying the Vespa application, you add the models along with any required custom processing code, which is then used when feeding or querying Vespa.

If you have a more concrete example of what you are trying to achieve I could be more specific.

Upvotes: 5
