Timeout error when uploading large file in FastAPI

I have a FastAPI endpoint that receives a file, saves it locally, untars it, and pushes the untarred image to an Artifactory location. For some reason, larger files (and sometimes even smaller ones) time out while uploading, even though the request returns a 200 response. I have followed all the usual suggestions, such as the nginx ones below, without any luck.

import os
import subprocess
import uuid
from shlex import split

import aiofiles
from fastapi import APIRouter, File, Form, HTTPException, UploadFile

router = APIRouter()


@router.post("/upload_container")
async def post_container(file: UploadFile = File(...), new_container_image: str = Form(...), new_tag: str = Form(...)):
    if file.content_type != "application/x-tar":
        raise HTTPException(status_code=400, detail="Can only accept a tarred version of a Docker image")
    # short random filename derived from one segment of a UUID
    file_path = str(uuid.uuid4()).split("-")[1] + ".tar"
    async with aiofiles.open(file_path, "wb") as out_file:
        while content := await file.read(1024):  # async read chunk
            await out_file.write(content)  # async write chunk
    file_path = os.path.abspath(os.path.join(os.getcwd(), file_path))
    command = "podman load --input {}".format(file_path)
    result = subprocess.run(split(command), stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    if result.returncode != 0:
        return {"filename": file.filename, "details": result.stderr, "error": "failed to unwrap"}
    # podman prints e.g. "Loaded image(s): localhost/foo:latest" on the first line
    tagged_image = result.stdout.split("\n")[0].split("Loaded image(s):")[1].strip()
    if os.path.exists(file_path):
        os.remove(file_path)
    # tagandpull() and config_data are defined elsewhere in my project
    if tagandpull(tagged_image, new_container_image, new_tag) is not True:
        return {"message": "Unable to push the image", "image_name": f"{file.filename}"}
    return {
        "filename": file.filename,
        "Details": {
            "new_url": "artifactorylocation.cloud.com/"
            + config_data.output["repos"]["local"]["deploy_repo"]
            + "/"
            + new_container_image,
            "new_tag": new_tag,
        },
    }

Here are my nginx ingress annotations:

annotations: {
    'kubernetes.io/ingress.class': 'nginx',
    'nginx.ingress.kubernetes.io/proxy-body-size': '2400m',
    'nginx.ingress.kubernetes.io/proxy-connect-timeout': '600',
    'nginx.ingress.kubernetes.io/proxy-read-timeout': '600',
    'nginx.ingress.kubernetes.io/proxy-send-timeout': '600'
},

Has anyone faced this issue before?

Upvotes: 2

Views: 471

Answers (2)

Chris

Reputation: 34551

First, I would suggest having a look at this answer and this answer on how to upload large files to a FastAPI backend, and store them chunk by chunk, in a dramatically faster way than by using UploadFile.
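By way of illustration only (the linked answers cover the full multipart case), here is a minimal sketch of the streaming idea, assuming the client sends the raw tar bytes as the request body rather than a multipart form; the /upload_raw path and upload.tar filename are placeholders:

import aiofiles
from fastapi import FastAPI, Request

app = FastAPI()


@app.post("/upload_raw")
async def upload_raw(request: Request):
    # request.stream() yields the request body chunk by chunk,
    # so the whole file is never held in memory at once
    async with aiofiles.open("upload.tar", "wb") as out_file:
        async for chunk in request.stream():
            await out_file.write(chunk)
    return {"detail": "upload complete"}

The client would then upload with something like curl --data-binary @image.tar http://localhost:8000/upload_raw, which sends the file as-is instead of wrapping it in a multipart body.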

Second, as for processing the file (as mentioned in your question), it could take place in a BackgroundTask (see this answer) or a ProcessPool behind the scenes (see this answer), thus separating the upload from the processing logic. A typical setup would be to have the client post data to /task/create, for instance, where you might or might not save the file to disk (depending on your project's requirements, as well as the size of the file and the server's machine specifications). That API call would start processing the file data (whatever that might involve, depending on your case) in the background and immediately return a response to the client with a unique task_ID for the newly created task. The client could then use that task_ID to query the status of the task (i.e., whether the file is still pending, being processed, or has finished processing) at /task/status/{task_ID}, as well as to get any relevant results back once processing is done, by calling /task/results/{task_ID}. You might also want to have a look at how to share variables/objects between HTTP requests.
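To make that flow concrete, here is a rough sketch of the three endpoints, assuming FastAPI's built-in BackgroundTasks and an in-memory dict as the task store (process_file and task_store are illustrative names, not taken from the linked answers; the in-memory dict only works in a single-process deployment, so a real setup would use something like Redis or a database):

import uuid
from fastapi import BackgroundTasks, FastAPI, HTTPException, Request

app = FastAPI()
task_store = {}  # task_ID -> {"status": ..., "result": ...}


def process_file(task_id: str, path: str):
    # placeholder for the real work (podman load, tag, push, ...)
    task_store[task_id]["status"] = "done"
    task_store[task_id]["result"] = {"loaded_from": path}


@app.post("/task/create")
async def create_task(request: Request, background_tasks: BackgroundTasks):
    task_id = str(uuid.uuid4())
    path = f"{task_id}.tar"
    with open(path, "wb") as f:  # for brevity; stream chunk by chunk in practice
        f.write(await request.body())
    task_store[task_id] = {"status": "pending", "result": None}
    background_tasks.add_task(process_file, task_id, path)
    return {"task_ID": task_id}  # client gets this back immediately


@app.get("/task/status/{task_id}")
async def task_status(task_id: str):
    if task_id not in task_store:
        raise HTTPException(status_code=404, detail="unknown task_ID")
    return {"status": task_store[task_id]["status"]}


@app.get("/task/results/{task_id}")
async def task_results(task_id: str):
    task = task_store.get(task_id)
    if not task or task["status"] != "done":
        raise HTTPException(status_code=404, detail="results not ready")
    return task["result"]

The key point is that /task/create returns as soon as the file is on disk, so the client never waits on the heavy podman/push work.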

Please note that passing sensitive information in the query string comes with risks, as mentioned in this answer. If that is not a major concern in your case (and provided you serve your API over HTTPS and have an authentication system in place to protect it, ensuring that only those who are supposed to download the results can do so), you could define the endpoints as explained earlier, i.e., passing the task_ID in the query string; otherwise, please have a look at this answer on how to pass the task_ID in the request body instead (the linked answer also demonstrates how to automate checking the task's status and redirecting the user to another page to download the results once the task has finished processing).
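For example, a minimal version of a body-based status endpoint might look like this (TaskQuery is just an illustrative Pydantic model, not something from the linked answer):

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
task_store = {}  # same store as in the earlier sketch


class TaskQuery(BaseModel):
    task_ID: str


@app.post("/task/status")
async def task_status(query: TaskQuery):
    # the task_ID now travels in the JSON body, not the URL, so it
    # does not end up in proxy or server access logs
    status = task_store.get(query.task_ID, {}).get("status", "unknown")
    return {"task_ID": query.task_ID, "status": status}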

Using the approaches described above should significantly reduce the time it takes to call the /upload_container endpoint you defined and receive a response from the server, hence avoiding any timeouts.

It should also be noted that one could always increase the timeout value, which (for uvicorn) defaults to 5 seconds, as shown in the documentation:

--timeout-keep-alive <int>: Close Keep-Alive connections, if no new data is received within this timeout. Default: 5.

Running uvicorn from the command line, for instance:

> uvicorn main:app --timeout-keep-alive 15

Running uvicorn programmatically, for example:

uvicorn.run(app, timeout_keep_alive=15)

Upvotes: 0

Amias

Reputation: 343

You are doing a lot of work while still inside the request handler, and those steps are simply taking longer than a reasonable HTTP timeout.

Maybe you could decouple the incoming request from the action, so that the incoming request just sets a flag, which is then polled by a service that does the work outside the request.
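A bare-bones version of that idea might look like the following sketch, where the request handler only drops the tar file into a spool directory and returns, and a separate worker process notices it later (the spool path, handle_upload, and the 5-second poll interval are all made-up placeholders):

import os
import time

SPOOL_DIR = "/var/spool/uploads"  # placeholder path; the request handler writes here


def worker_loop():
    """Runs as its own service/process, completely outside any HTTP request."""
    while True:
        for name in os.listdir(SPOOL_DIR):
            if name.endswith(".tar"):
                path = os.path.join(SPOOL_DIR, name)
                handle_upload(path)  # hypothetical: podman load, tag, push
                os.remove(path)      # removing the file clears the "flag"
        time.sleep(5)  # poll interval

The upload endpoint then takes seconds rather than minutes, because all it does is write the file into SPOOL_DIR before responding.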

Upvotes: 0
