sai_varshittha

Reputation: 323

How to do batch inference for Hugging Face models?

I want to do batch inference on MarianMT model. Here's the code:

from transformers import MarianMTModel, MarianTokenizer

model_name = 'Helsinki-NLP/opus-mt-en-de'
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

src_texts = ["I am a small frog.", "Tom asked his teacher for advice."]
tgt_texts = ["Ich bin ein kleiner Frosch.", "Tom bat seinen Lehrer um Rat."]  # optional

inputs = tokenizer(src_texts, return_tensors="pt", padding=True)
with tokenizer.as_target_tokenizer():
    labels = tokenizer(tgt_texts, return_tensors="pt", padding=True)
inputs["labels"] = labels["input_ids"]
outputs = model(**inputs)

How do I do batch inference?

Upvotes: 2

Views: 722

Answers (1)

rish.uk

Reputation: 93

  1. Create a dataset and, preferably, a DataLoader.
  2. Choose a batch size and build your batches with either the DataLoader or a DataCollator from Hugging Face.
  3. Run the tokenizer on each batch.
  4. Generate output for each batch with model.generate.
  5. Use batch_decode to turn the generated token IDs back into text (a minimal sketch follows below).
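Putting those steps together, here is a minimal sketch, assuming the same Helsinki-NLP/opus-mt-en-de checkpoint as in the question; the extra example sentences and the batch size of 2 are only illustrative:

import torch
from torch.utils.data import DataLoader
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-de"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)
model.eval()

# 1.-2. A plain list of strings works as a dataset; DataLoader yields batches of it.
src_texts = [
    "I am a small frog.",
    "Tom asked his teacher for advice.",
    "The weather is nice today.",
    "She reads a book every evening.",
]
loader = DataLoader(src_texts, batch_size=2)

translations = []
with torch.no_grad():
    for batch in loader:
        # 3. Tokenize the batch with padding so all sequences have equal length.
        inputs = tokenizer(list(batch), return_tensors="pt", padding=True)
        # 4. Generate translations for the whole batch at once.
        generated = model.generate(**inputs)
        # 5. batch_decode turns the generated token IDs back into strings.
        translations.extend(
            tokenizer.batch_decode(generated, skip_special_tokens=True)
        )

print(translations)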

Upvotes: 0
