Reputation: 93
I'm using NestJS to write a forwarding service for the OpenAI chat completion API. I want to transform the original stream and then forward the transformed stream to the client.
The code looks like this, and it lives inside a NestJS controller:
const completion = await openai.createChatCompletion(
  {
    model: 'gpt-3.5-turbo',
    messages: messages,
    n: 1,
    stream: true,
    max_tokens: 4000,
  },
  { responseType: 'stream' },
);
class TransformerStream extends Transform {
  _transform(chunk, encoding, callback) {
    // If I directly forward the chunk like this, the client can receive chunk by chunk
    this.push(chunk)
    // However, if I use string, the client can't receive chunk by chunk.
    // My original code is to transform the chunk to string and do some transformation, to simplify the question, just use 'data: ping\n' here
    this.push('data: ping\n', 'utf8')
    callback()
  }
}
const transformer = new TransformerStream()
completion.data.pipe(transformer).pipe(res)
On the client side, I'm using axios to request the API, and I'm trying to receive the response chunk by chunk using onDownloadProgress:
axios.post('/api/chat', body, {
  responseType: 'stream',
  onDownloadProgress: progress => {
    console.log(progress)
  }
})
In summary, when I forward the Buffer chunk from the OpenAI API directly, the progress is logged several times. But when I send a string instead, it is logged only once.
Upvotes: 0
Views: 2135
Reputation: 3923
It might be due to the difference between the length of the original chunk
and the length of the string you are trying to write to the stream.
You can consider setting the following headers in your NestJS controller:
Transfer-Encoding: chunked
X-Content-Type-Options: nosniff
Sample code:
res.setHeader('Transfer-Encoding', 'chunked');
res.setHeader('X-Content-Type-Options', 'nosniff');
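As a slightly fuller sketch of the sample above, the two headers can be wrapped in a small helper that is called before anything is piped to the response. The helper name applyStreamingHeaders is hypothetical, and the explicit Content-Type line is my own assumption rather than something tested here; adjust or drop it to match what your client expects.

```javascript
// Hypothetical helper: applies the suggested headers to any object that
// exposes setHeader(name, value), such as the Express response used by NestJS.
function applyStreamingHeaders(res) {
  // Ask for chunked delivery so data can be processed as it arrives
  res.setHeader('Transfer-Encoding', 'chunked');
  // Ask the browser not to sniff the content type from the body
  res.setHeader('X-Content-Type-Options', 'nosniff');
  // Assumption (not part of the sample above): declare the type explicitly,
  // since nosniff relies on the declared Content-Type
  res.setHeader('Content-Type', 'text/event-stream; charset=utf-8');
  return res;
}

// Intended usage inside the controller, before piping:
//   applyStreamingHeaders(res);
//   completion.data.pipe(transformer).pipe(res);
```

The headers have to be set before the first write, because they are sent together with the first chunk of the body.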
Transfer-Encoding tells the browser to start processing the data as it arrives instead of waiting for all the content to be loaded first.
X-Content-Type-Options tells the browser to respect the Content-Type specified by your header instead of trying to guess it from the start of the returned content. Based on my test with the latest Chrome browser, it seems that the initial 1024 bytes are "blocked" until the browser has identified the Content-Type.
You can read more about the behaviour here: What is "X-Content-Type-Options=nosniff"?
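To put the 1024-byte observation in perspective, here is a rough back-of-the-envelope check of how small the test string from the question is. It assumes the threshold really is about 1024 bytes, which is only what I saw in my own Chrome test and not a documented constant.

```javascript
// Assumption: ~1024 bytes is the figure observed in the Chrome test above
const SNIFF_THRESHOLD_BYTES = 1024;

// The string pushed by the transform in the question
const message = 'data: ping\n';

// Size of one write on the wire (UTF-8 encoded)
const messageBytes = Buffer.byteLength(message, 'utf8');

// Number of such writes needed before the threshold is reached
const writesNeeded = Math.ceil(SNIFF_THRESHOLD_BYTES / messageBytes);

console.log(`${messageBytes} bytes per write, ${writesNeeded} writes to reach ${SNIFF_THRESHOLD_BYTES} bytes`);
// prints "11 bytes per write, 94 writes to reach 1024 bytes"
```

If that assumption holds, many short string writes could pile up before the browser reports any progress, which would match the single log described in the question.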
Upvotes: 2