Vian Esterhuizen

Reputation: 3904

Problem uploading files with Gemini API (using PHP)

I'm struggling with the Gemini API, specifically with uploading files. The chat functionality works exactly as expected, including chat history, as long as the code below that adds the file URIs to the payload stays commented out. If I uncomment that code, the chat breaks.

// Add user message to history
$history[] = ["role" => "user", "parts" => [["text" => $userMessage]]];

// Add file URIs to the history as separate parts
// if (!empty($fileUris)) {
//     $history[] = [
//         "role" => "user",
//         "parts" => array_merge(
//             array_map(fn($fileUri) => ["fileData" => $fileUri], $fileUris)
//         )
//     ];
// }

$payload = [
    "contents" => $history,
    "systemInstruction" => [
        "role" => "user",
        "parts" => [
            ["text" => $instructions]
        ]
    ],
    "generationConfig" => [
        "temperature" => 1,
        "topK" => 40,
        "topP" => 0.95,
        "maxOutputTokens" => 8192,
        "responseMimeType" => "text/plain"
    ]
];

$requestBody = json_encode($payload, JSON_UNESCAPED_SLASHES | JSON_PRETTY_PRINT);
file_put_contents('debug_payload.json', $requestBody);

// Candidate models (the variable below holds a model name):
// gemini-2.0-flash-exp
// gemini-exp-1206
// gemini-1.5-pro
// gemini-1.5-flash-8b
$api_version = 'gemini-1.5-flash-8b';

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://generativelanguage.googleapis.com/v1beta/models/{$api_version}:streamGenerateContent?alt=sse&key=" . API_KEY);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, ["Content-Type: application/json"]);
curl_setopt($ch, CURLOPT_POSTFIELDS, $requestBody);
curl_setopt($ch, CURLOPT_WRITEFUNCTION, function ($ch, $chunk) use ($historyFile, &$history, $textResponseLog) {
    $lines = explode("data: ", $chunk);
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line) {
            $json = json_decode($line, true);
            if (isset($json['candidates'][0]['content']['parts'])) {
                foreach ($json['candidates'][0]['content']['parts'] as $part) {
                    if (isset($part['text'])) {
                        $history[] = ["role" => "model", "parts" => [["text" => $part['text']]]];
                        file_put_contents($historyFile, json_encode($history, JSON_PRETTY_PRINT));
                        echo "data: " . json_encode(["response" => $part['text']]) . "\n\n";
                        ob_flush();
                        flush();
                    }
                }
            }
        }
    }
    file_put_contents($textResponseLog, $chunk, FILE_APPEND);
    return strlen($chunk);
});

curl_exec($ch);
curl_close($ch);

This is the object I send:

{
    "contents": [
        {
            "role": "user",
            "parts": [
                {
                    "text": "Hey"
                }
            ]
        },
        {
            "role": "user",
            "parts": [
                {
                    "fileData": {
                        "fileUri": "https://generativelanguage.googleapis.com/v1beta/files/6sr9bb3f4w61",
                        "mimeType": "application/json"
                    }
                },
                {
                    "fileData": {
                        "fileUri": "https://generativelanguage.googleapis.com/v1beta/files/snrd7cowjb45",
                        "mimeType": "application/json"
                    }
                },
                {
                    "fileData": {
                        "fileUri": "https://generativelanguage.googleapis.com/v1beta/files/g38bsh0fw458",
                        "mimeType": "application/json"
                    }
                }
            ]
        }
    ],
    "systemInstruction": {
        "role": "user",
        "parts": [
            {
                "text": "Instructions ..."
            }
        ]
    },
    "generationConfig": {
        "temperature": 1,
        "topK": 40,
        "topP": 0.95,
        "maxOutputTokens": 8192,
        "responseMimeType": "text/plain"
    }
}

And the error I get back is:

Using gemini-1.5-flash-8b or gemini-2.0-flash-exp (neither works; I confirmed both just before posting):

{
  "error": {
    "code": 400,
    "message": "Unable to submit request because it has a mimeType parameter with value application/json, which is not supported. Update the mimeType and try again. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/gemini",
    "status": "INVALID_ARGUMENT"
  }
}

But this is the response I get after uploading a file. The reported mimeType is application/json, and forcing the type to PDF gives me a different error.

{
  "file": {
    "name": "files/zwfo9w8j0v9k",
    "mimeType": "application/json",
    "sizeBytes": "74",
    "createTime": "2025-01-29T20:29:30.655895Z",
    "updateTime": "2025-01-29T20:29:30.655895Z",
    "expirationTime": "2025-01-31T20:29:30.632984185Z",
    "sha256Hash": "MDI0MjRmOGQ5ZDk5YmM2MTMxMDVjMjg5ZTg2NWJjYmUxMTdhN2VjZjc2NzJlZTQ5MjU2MDg5ZGNlMWFmM2I1MQ==",
    "uri": "https://generativelanguage.googleapis.com/v1beta/files/zwfo9w8j0v9k",
    "state": "ACTIVE",
    "source": "UPLOADED"
  }
}

Example code supplied by Google AI Studio:

# TODO: Make the following files available on the local file system.
FILES=("file1.pdf" "file2.pdf" "file3.pdf")
MIME_TYPES=("application/pdf" "application/pdf" "application/pdf")
for i in "${!FILES[@]}"; do
  NUM_BYTES=$(wc -c < "${FILES[$i]}")
  curl "https://generativelanguage.googleapis.com/upload/v1beta/files?key=${API_KEY}" \
    -H "X-Goog-Upload-Command: start, upload, finalize" \
    -H "X-Goog-Upload-Header-Content-Length: ${NUM_BYTES}" \
    -H "X-Goog-Upload-Header-Content-Type: ${MIME_TYPES[$i]}" \
    -H "Content-Type: application/json" \
    -d "{'file': {'display_name': '${FILES[$i]}'}}" \
    --data-binary "@${FILES[$i]}"
  # TODO: Read the file.uri from the response, store it as FILE_URI_${i}
done

...

{
      "role": "user",
      "parts": [
        {
          "fileData": {
            "fileUri": "${FILE_URI_0}",
            "mimeType": "application/pdf"
          }
        },
        {
          "fileData": {
            "fileUri": "${FILE_URI_1}",
            "mimeType": "application/pdf"
          }
        },
        {
          "fileData": {
            "fileUri": "${FILE_URI_2}",
            "mimeType": "application/pdf"
          }
        }
      ]
    }

Upvotes: 0

Views: 200

Answers (1)

McMaco

Reputation: 178

According to the documentation for the configuration settings used when generating a response, the responseMimeType parameter lists only gemini-1.5-pro and gemini-1.5-flash as available models. Try changing your $api_version variable to either of the two.

Specify the appropriate response type to avoid unintended behaviors. For example, if you require a JSON-formatted response, specify application/json and not text/plain.
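
A minimal sketch of that suggestion, assuming the rest of the request code from the question stays unchanged. The helpers `buildEndpoint` and `buildGenerationConfig` and the constant `SUPPORTED_MODELS` are hypothetical names introduced here for illustration, not part of any Google library. The endpoint shape and the config values are copied from the question, and the model list contains only the two models named above.

```php
<?php
// Hypothetical helpers (not part of any Google SDK) illustrating the suggestion above.

// The two models this answer names as listed for responseMimeType.
const SUPPORTED_MODELS = ['gemini-1.5-pro', 'gemini-1.5-flash'];

// Build the streaming endpoint from the question for a given model name.
function buildEndpoint(string $model, string $apiKey): string
{
    if (!in_array($model, SUPPORTED_MODELS, true)) {
        throw new InvalidArgumentException("Model not in the list: {$model}");
    }
    return "https://generativelanguage.googleapis.com/v1beta/models/{$model}"
        . ":streamGenerateContent?alt=sse&key=" . rawurlencode($apiKey);
}

// Build generationConfig with an explicit response MIME type.
function buildGenerationConfig(string $responseMimeType = 'text/plain'): array
{
    return [
        'temperature' => 1,
        'topK' => 40,
        'topP' => 0.95,
        'maxOutputTokens' => 8192,
        'responseMimeType' => $responseMimeType,
    ];
}

// Same variable name as in the question; it holds the model name.
$api_version = 'gemini-1.5-flash';
$url = buildEndpoint($api_version, 'YOUR_API_KEY');
$config = buildGenerationConfig('application/json');
```

In the question's code, `$url` would go to CURLOPT_URL and `$config` would replace the inline "generationConfig" array. Whether this also resolves the fileData mimeType error is not something this sketch can confirm.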

Upvotes: 0
