Tom Lin

Reputation: 110

DSPy: How to get the number of tokens available for the input fields?

This is a cross-post of Issue #1245 from the DSPy GitHub repo. There have been no responses in the past week, and I am working on a project with a tight schedule.

When running a DSPy module with a given signature, I am interested in getting the token count of the "prompt template" that it currently passes to the language model (LM), by which I mean the number of input tokens passed to the LM minus the token count of the input fields. This would count the length of the signature description, the field descriptions, and the few-shot examples. Then, by subtracting the prompt template's token count from the LM's context window, I would get the maximum number of tokens that I can squeeze into the input fields.
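
For reference, the closest I have come to measuring this overhead is to make one throwaway call with empty input fields and count the tokens of the prompt DSPy built. A minimal sketch, assuming a DSPy 2.4-style client whose `lm.history` records the raw prompt string (the history format differs in other versions) and a tiktoken-compatible model:

```python
import dspy
import tiktoken

lm = dspy.OpenAI(model="gpt-3.5-turbo")
dspy.settings.configure(lm=lm)

class Summarize(dspy.Signature):
    """Summarize the given context."""
    context = dspy.InputField()
    summary = dspy.OutputField()

predict = dspy.Predict(Summarize)
predict(context="")  # one throwaway call; renders the full prompt with an empty input field

prompt = lm.history[-1]["prompt"]                   # prompt exactly as sent to the LM
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
template_tokens = len(enc.encode(prompt))           # overhead of the template itself

context_window = 16385                              # gpt-3.5-turbo context size
budget = context_window - template_tokens - 256     # reserve some room for the output
print(template_tokens, budget)
```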

I am interested in this because I am currently building a RAG pipeline that retrieves texts from a database to synthesize the final response. However, the total length of the retrieved texts might exceed the context window of the LM I am using, so an iterative or recursive summarization process is needed to compress the prompt before synthesizing the final response. While I acknowledge that you could simply summarize each chunk of text one by one to be extra cautious about not exceeding the context window, I don't think that would be the most effective approach; a sketch of the packing idea follows.
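
To illustrate, this is roughly the loop I have in mind: greedily pack chunks into batches that fit the budget, summarize batch by batch, and repeat until everything fits in one call. `enc` and `budget` carry over from the sketch above; `pack_chunks` and `compress` are hypothetical helpers, not DSPy APIs:

```python
summarize = dspy.Predict("context -> summary")

def pack_chunks(chunks, budget, enc):
    """Greedily group chunks so each batch stays within `budget` tokens."""
    batches, current, used = [], [], 0
    for chunk in chunks:
        n = len(enc.encode(chunk))
        if current and used + n > budget:
            batches.append("\n\n".join(current))
            current, used = [], 0
        current.append(chunk)
        used += n
    if current:
        batches.append("\n\n".join(current))
    return batches

def compress(chunks, budget, enc):
    """Iteratively summarize batches until the combined text fits the budget.
    Assumes summaries shrink; real code would guard against a single
    chunk that is itself larger than the budget."""
    while sum(len(enc.encode(c)) for c in chunks) > budget:
        chunks = [summarize(context=batch).summary
                  for batch in pack_chunks(chunks, budget, enc)]
    return "\n\n".join(chunks)
```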

I originally built the RAG pipeline entirely in LlamaIndex, where the response is generated by response synthesizers. Note that the compact mode of the response synthesizers tries to pack as many tokens from the retrieved contexts into a single LM call as possible to reduce the number of calls. This is achieved via PromptHelper, which squeezes as many tokens into the fields of the prompt template as possible while keeping their total length within context_window - prompt_template_length.
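
For comparison, this is roughly what driving PromptHelper directly looks like (import paths as of llama-index 0.10.x; earlier and later versions differ):

```python
from llama_index.core import PromptTemplate
from llama_index.core.indices.prompt_helper import PromptHelper

helper = PromptHelper(context_window=4096, num_output=256)
qa_prompt = PromptTemplate(
    "Context:\n{context_str}\n\nAnswer the query.\nQuery: {query_str}\nAnswer: "
)
# repack() fills each output string with as many chunks as fit the
# remaining budget after accounting for the template itself.
packed = helper.repack(qa_prompt, text_chunks=["chunk 1 ...", "chunk 2 ..."])
```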

Now, as I am switching all the prompting to DSPy for more flexibility, I wonder what the best way would be to implement something akin to PromptHelper. I also checked how the LlamaIndex integration for DSPy does this: https://github.com/stanfordnlp/dspy/blob/55510eec1b83fa77f368e191a363c150df8c5b02/dspy/predict/llamaindex.py#L22-L36

It appears that it first converts the signature to a legacy template format. Would this be a good approach to the problem, or are there better alternatives?
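
For completeness, it looks like that same conversion could render the prompt offline, without any LM call, which would give the template-only token count directly. A sketch using DSPy's internal signature_to_template (an internal API that may change between versions; `Summarize` and `enc` are from the first sketch above):

```python
import dsp
from dspy.signatures.signature import signature_to_template

template = signature_to_template(Summarize)
example = dsp.Example(demos=[], context="")  # empty input fields, no few-shot demos
prompt_text = template(example)              # fully rendered prompt string
print(len(enc.encode(prompt_text)))          # template-only token count
```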

Upvotes: 0

Views: 475

Answers (1)

Papa T

Reputation: 1

If you need really good compression, I've used "SentenceSqueezer" on ChatGPT. It usually compresses instruction sets by somewhere in the neighborhood of 60-85%, and it's very simple. It claims not to lose context because it makes an acronym out of each sentence or line; you can tell it to leave periods in place or whatever you desire. It gives pre- and post-compression token counts with a mapping table, and it may do meta-compression using a single symbol. The other one is "Tokenizer GPT Instruction Compressor", also on the same platform, which instead uses single placeholders for words that repeat more than 3 times, I believe, and outputs the same compression stats, mapping table, etc. It may be worth a try. I'm new, so I may be way off on whether this will work for you, but both have worked for me, and I no longer have to worry about how technical my instructions need to be.

Upvotes: 0
