Getting 400 Bad Request error when trying to use the gemini-1.5-pro-preview-0409 API with video in Firebase Functions

I'm trying to use gemini-1.5-pro-preview-0409 with video using the Vertex AI API.

I'm using a Node.js function deployed to Firebase Cloud Functions. My web app calls this function and passes along the video's Cloud Storage URI. I got the code sample from Vertex AI Studio.

When I use the same video and prompt in Vertex AI Studio, it works correctly. When I call the function from my web app, the Cloud Functions logs show a 400 Bad Request error.

Can you please tell me what's going on?

Code is below:

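import {VertexAI} from '@google-cloud/vertexai';
import {onCall} from 'firebase-functions/v2/https';

// projectId is defined elsewhere in my code (read from variables).
const vertex_ai = new VertexAI({project: projectId, location: 'us-central1'});
const model = 'gemini-1.5-pro-preview-0409';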

// Instantiate the models
const generativeModel = vertex_ai.preview.getGenerativeModel({
    model: model,
    generationConfig: {
        'maxOutputTokens': 8192,
        'temperature': 1,
        'topP': 0.95,
    },
    safetySettings: [
        {
            'category': 'HARM_CATEGORY_HATE_SPEECH',
            'threshold': 'BLOCK_MEDIUM_AND_ABOVE'
        },
        {
            'category': 'HARM_CATEGORY_DANGEROUS_CONTENT',
            'threshold': 'BLOCK_MEDIUM_AND_ABOVE'
        },
        {
            'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT',
            'threshold': 'BLOCK_MEDIUM_AND_ABOVE'
        },
        {
            'category': 'HARM_CATEGORY_HARASSMENT',
            'threshold': 'BLOCK_MEDIUM_AND_ABOVE'
        }
    ],
});

const text1 = {text: 'The video provided is a procedure with steps. Summarize the procedure.'};

export const outputSimFromVertexAI = onCall({timeoutSeconds: 900, memory: "1GiB"}, async (request) => {
    console.log('data is ')
    console.log(request.data)

    const videoPath = request.data.videoPath;

    console.log('video path is ', videoPath);
    const video1 = {
        fileData: {
            mimeType: 'video/mp4',
            fileUri: videoPath
        }
    };

    const req = {
        contents: [
            {role: 'user', parts: [video1, text1]}
        ]
    }

    console.log('video is ', video1)
    console.log('text1 is ', text1)

    const result = await generativeModel.generateContent(req);
    const aiResponse = JSON.stringify(await result.response);

    console.log('ai response is')
    console.log(aiResponse)

    return 'response generated successfully';
})

The full error I'm getting is:

Unhandled error ClientError: [VertexAI.ClientError]: got status: 400 Bad Request. {"error":{"code":400,"message":"Request contains an invalid argument.","status":"INVALID_ARGUMENT"}}
    at throwErrorIfNotOK (/workspace/node_modules/@google-cloud/vertexai/build/src/functions/post_fetch_processing.js:32:19)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async generateContent (/workspace/node_modules/@google-cloud/vertexai/build/src/functions/generate_content.js:51:5)
    at async file:///workspace/index.js:2142:20
    at async /workspace/node_modules/firebase-functions/lib/common/providers/https.js:467:26 {
  stackTrace: undefined
}

Upvotes: 2

Answers (2)

user2646187

Reputation: 91

This was a mistake on my end. When you read a file's path from its metadata in Firebase Storage, you get only the object path within the bucket, for example folder1/folder2/file. To build the full URI, you need to prepend 'gs://your-project-id.appspot.com/' so that the result is 'gs://your-project-id.appspot.com/folder1/folder2/file'. I was reading my project ID correctly from variables, but I had left out the '.appspot.com' part of the bucket name, so the fileUri I sent was wrong. Once I fixed this, the model returned results correctly.
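As a minimal sketch of that fix, the helper below builds the full URI from a bucket name and the partial path. The name toGsUri is hypothetical (it is not part of any Firebase or Vertex AI SDK), and the bucket name 'your-project-id.appspot.com' is the placeholder from the example above; check your own bucket name in the Firebase console.

```javascript
// Hypothetical helper, not part of any SDK: builds a gs:// URI
// from a bucket name and an object path inside that bucket.
function toGsUri(bucket, objectPath) {
  if (!bucket) {
    throw new Error('bucket name is required');
  }
  // If the caller already passed a full URI, return it unchanged.
  if (objectPath.startsWith('gs://')) {
    return objectPath;
  }
  // Tolerate a stray "gs://" prefix or trailing slash on the bucket,
  // and a leading slash on the object path.
  const cleanBucket = bucket.replace(/^gs:\/\//, '').replace(/\/+$/, '');
  const cleanPath = objectPath.replace(/^\/+/, '');
  return `gs://${cleanBucket}/${cleanPath}`;
}

// Placeholder values taken from the example above.
const projectId = 'your-project-id';
const videoPath = toGsUri(`${projectId}.appspot.com`, 'folder1/folder2/file');
console.log(videoPath); // gs://your-project-id.appspot.com/folder1/folder2/file
```

The resulting string is what goes into fileData.fileUri in the request from the question.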

Upvotes: 1

Nullsrc

Reputation: 43

(I can't post this as a comment, but feel free to move it to a comment if it fits better there.)

EDIT: Found the GCP documentation for multimodal prompts:

Gemini 1.5 Pro (Preview): Maximum video length when it includes audio is approximately 50 minutes. The maximum video length for video without audio is 1 hour. Maximum videos per prompt is 10. The model is able to use both video and audio data to answer the prompt. For example, summarizes the video using both the visual content and speech in the video.

It's worth verifying that your video meets these requirements.
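A rough sketch of such a check, using only the limits quoted above, could look like the following. The function name checkVideoLimits and the shape of its input are hypothetical, the limits may change, and you would need to get each video's duration and audio flag from somewhere else (for example from a tool that reads media metadata).

```javascript
// Limits as quoted above for Gemini 1.5 Pro (Preview); they may change.
const MAX_SECONDS_WITH_AUDIO = 50 * 60;    // "approximately 50 minutes"
const MAX_SECONDS_WITHOUT_AUDIO = 60 * 60; // 1 hour
const MAX_VIDEOS_PER_PROMPT = 10;

// Hypothetical helper: takes [{durationSeconds, hasAudio}, ...] and
// returns a list of human-readable problems (empty if none were found).
function checkVideoLimits(videos) {
  const problems = [];
  if (videos.length > MAX_VIDEOS_PER_PROMPT) {
    problems.push(`too many videos: ${videos.length} > ${MAX_VIDEOS_PER_PROMPT}`);
  }
  videos.forEach((video, index) => {
    const limit = video.hasAudio ? MAX_SECONDS_WITH_AUDIO : MAX_SECONDS_WITHOUT_AUDIO;
    if (video.durationSeconds > limit) {
      problems.push(`video ${index} is ${video.durationSeconds}s, limit is ${limit}s`);
    }
  });
  return problems;
}

// A 55-minute video with audio exceeds the 50-minute limit.
console.log(checkVideoLimits([{durationSeconds: 3300, hasAudio: true}]));
```

This only covers the documented length and count limits; it does not cover any file-size limit.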

I'm seeing the same response when linking to PDF files stored in cloud storage, but only sometimes. The error occurs when linking to a large file in my dataset, but not when linking to some smaller files.

I can reproduce the 400 error in Vertex AI Studio with gemini-1.5-pro-preview-0409 if I use the file selection option to include the file from a Google Cloud Storage bucket. Surprisingly, if I instead upload a copy of the file directly into the prompt, the error does not occur and I can generate a response successfully.

I think there are some undocumented limitations on files included from Google Cloud Storage. It may be worth trying the same task with a smaller video file to see whether there's a size limit.

Upvotes: 1
