Tom Lin

Reputation: 110

DSPy: How to get the number of tokens available for the input fields?

This is a cross-post of Issue #1245 from the DSPy GitHub repo. There have been no responses in the past week, and I am working on a project with a tight schedule.

When running a DSPy module with a given signature, I am interested in getting the token count of the "prompt template" that it currently passes to the language model (LM), by which I mean the number of input tokens passed to the LM minus the token count of the input fields. This would count the length of the signature description, the field descriptions, and the few-shot examples. Then, by subtracting the prompt template's token count from the LM's context window, I would get the maximum number of tokens that I can squeeze into the input fields.
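
For reference, the closest I have come to measuring this overhead is to make one throwaway call with empty input fields and count the tokens of the prompt DSPy built. A minimal sketch, assuming a DSPy 2.4-style client whose `lm.history` records the raw prompt string (the history format differs in other versions) and a tiktoken-compatible model:

```python
import dspy
import tiktoken

lm = dspy.OpenAI(model="gpt-3.5-turbo")
dspy.settings.configure(lm=lm)

class Summarize(dspy.Signature):
    """Summarize the given context."""
    context = dspy.InputField()
    summary = dspy.OutputField()

predict = dspy.Predict(Summarize)
predict(context="")  # one throwaway call; renders the full prompt with an empty input field

prompt = lm.history[-1]["prompt"]                   # prompt exactly as sent to the LM
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
template_tokens = len(enc.encode(prompt))           # overhead of the template itself

context_window = 16385                              # gpt-3.5-turbo context size
budget = context_window - template_tokens - 256     # reserve some room for the output
print(template_tokens, budget)
```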

I am interested in this because I am currently building a RAG pipeline that retrieves texts from a database to synthesize the final response. However, the total length of the retrieved texts might exceed the context window of the LM I am using, so an iterative or recursive summarization process is needed to compress the prompt before synthesizing the final response. While I acknowledge that you could simply summarize each chunk of text one by one to be extra cautious about not exceeding the context window, I don't think that would be the most effective approach; a sketch of the packing idea follows.
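
To illustrate, this is roughly the loop I have in mind: greedily pack chunks into batches that fit the budget, summarize batch by batch, and repeat until everything fits in one call. `enc` and `budget` carry over from the sketch above; `pack_chunks` and `compress` are hypothetical helpers, not DSPy APIs:

```python
summarize = dspy.Predict("context -> summary")

def pack_chunks(chunks, budget, enc):
    """Greedily group chunks so each batch stays within `budget` tokens."""
    batches, current, used = [], [], 0
    for chunk in chunks:
        n = len(enc.encode(chunk))
        if current and used + n > budget:
            batches.append("\n\n".join(current))
            current, used = [], 0
        current.append(chunk)
        used += n
    if current:
        batches.append("\n\n".join(current))
    return batches

def compress(chunks, budget, enc):
    """Iteratively summarize batches until the combined text fits the budget.
    Assumes summaries shrink; real code would guard against a single
    chunk that is itself larger than the budget."""
    while sum(len(enc.encode(c)) for c in chunks) > budget:
        chunks = [summarize(context=batch).summary
                  for batch in pack_chunks(chunks, budget, enc)]
    return "\n\n".join(chunks)
```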

I originally built the RAG pipeline entirely in LlamaIndex, where the response is generated by response synthesizers. Note that the compact mode of the response synthesizers tries to pack as many tokens from the retrieved contexts into a single LM call as possible to reduce the number of calls. This is achieved via PromptHelper, which squeezes as many tokens into the fields of the prompt template as possible while keeping their total length within context_window - prompt_template_length.
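
For comparison, this is roughly what driving PromptHelper directly looks like (import paths as of llama-index 0.10.x; earlier and later versions differ):

```python
from llama_index.core import PromptTemplate
from llama_index.core.indices.prompt_helper import PromptHelper

helper = PromptHelper(context_window=4096, num_output=256)
qa_prompt = PromptTemplate(
    "Context:\n{context_str}\n\nAnswer the query.\nQuery: {query_str}\nAnswer: "
)
# repack() fills each output string with as many chunks as fit the
# remaining budget after accounting for the template itself.
packed = helper.repack(qa_prompt, text_chunks=["chunk 1 ...", "chunk 2 ..."])
```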

Now, as I am switching all the prompting to DSPy for more flexibility, I wonder what the best way would be to implement something akin to PromptHelper. I also checked how the LlamaIndex integration for DSPy does this: https://github.com/stanfordnlp/dspy/blob/55510eec1b83fa77f368e191a363c150df8c5b02/dspy/predict/llamaindex.py#L22-L36

It appears that it first converts the signature to a legacy template format. Would this be a good approach to the problem, or are there better alternatives?
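
For completeness, it looks like that same conversion could render the prompt offline, without any LM call, which would give the template-only token count directly. A sketch using DSPy's internal signature_to_template (an internal API that may change between versions; `Summarize` and `enc` are from the first sketch above):

```python
import dsp
from dspy.signatures.signature import signature_to_template

template = signature_to_template(Summarize)
example = dsp.Example(demos=[], context="")  # empty input fields, no few-shot demos
prompt_text = template(example)              # fully rendered prompt string
print(len(enc.encode(prompt_text)))          # template-only token count
```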

Upvotes: 0

Views: 475

Answers (1)

Papa T

Reputation: 1

If you need really good compression, I've used "SentenceSqueezer" on ChatGPT. It usually compresses instruction sets by somewhere in the neighborhood of 60-85%, and it's very simple. It claims not to lose context because it makes an acronym out of each sentence or line; you can tell it to leave periods in place or whatever you desire. It gives pre- and post-compression token counts with a mapping table, and it may do meta-compression using a single symbol. The other one is "Tokenizer GPT Instruction Compressor", also on the same platform, which instead uses single placeholders for words that repeat more than 3 times, I believe, and outputs the same compression stats, mapping table, etc. It may be worth a try. I'm new, so I may be way off on whether this will work for you, but both have worked for me, and I no longer have to worry about how technical my instructions need to be.

Upvotes: 0
