mad_scientist

Reputation: 117

How to write a FastAPI endpoint that takes an uploaded .csv file and preprocesses it with pandas

I am trying to create an API endpoint that takes an uploaded .csv file and opens it as a pandas DataFrame, like this:

from fastapi import FastAPI
from fastapi import UploadFile, Query, Form
import pandas as pd

app = FastAPI()

@app.post("/check")
def foo(file: UploadFile):
    df = pd.read_csv(file.file)
    return len(df)

Then I call the API:

import requests

url = 'http://127.0.0.1:8000/check'
file = {'file': open('data/ny_pollution_events.csv', 'rb')}

resp = requests.post(url=url, files=file)
print(resp.json())
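
To see what the client actually sends, the request can be built without sending it. This is a minimal sketch, assuming an in-memory stand-in for the CSV (the sample bytes are made up). It shows that the multipart body carries the file name as metadata next to the file content:

```python
import io

import requests

url = "http://127.0.0.1:8000/check"

# Made-up CSV content standing in for data/ny_pollution_events.csv.
payload = io.BytesIO(b"event,level\nsmog,3\n")
files = {"file": ("ny_pollution_events.csv", payload, "text/csv")}

# Build the request but do not send it, so no server is needed.
prepared = requests.Request("POST", url, files=files).prepare()

print(prepared.headers["Content-Type"])
print(b'filename="ny_pollution_events.csv"' in prepared.body)
```

So the server receives both a name (file.filename) and a content stream (file.file), and only the stream is useful to pandas.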

But I get this error: FileNotFoundError: [Errno 2] No such file or directory: 'ny_pollution_events.csv'

As far as I understand from the docs, pandas can read a .csv file from a file-like object, which file.file is supposed to be. But it seems that read_csv() receives the file name here (not the file object itself) and tries to find that file locally.
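
The part about file-like objects can be checked on its own, outside FastAPI. This is a minimal sketch, assuming made-up CSV bytes wrapped in io.BytesIO as a stand-in for the uploaded stream:

```python
import io

import pandas as pd

# Made-up CSV content; in the endpoint this would come from the upload.
csv_bytes = b"city,pm25\nNew York,12\nAlbany,7\n"

# read_csv accepts any object with a read() method, not only a path.
buffer = io.BytesIO(csv_bytes)
df = pd.read_csv(buffer)

print(len(df))
print(list(df.columns))
```

If this works but the endpoint still fails, the problem is probably in what gets passed to read_csv() (a name instead of a stream) rather than in pandas itself.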

Am I doing something wrong? Can I somehow implement this logic?

Upvotes: 1

Views: 1882

Answers (1)

shashank

Reputation: 26

One approach that works is to save the uploaded file to disk first and then read it with pandas from that path. Don't forget to import shutil. If you don't need to keep the file on disk afterwards, delete it with os.remove(filepath).

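import os
import shutil

import pandas as pd
from fastapi import FastAPI, HTTPException, UploadFile

app = FastAPI()

ALLOWED_EXTENSIONS = (".csv", ".xlsx", ".xls")

@app.post("/check")
def foo(file: UploadFile):
    # Take the extension from the uploaded file name, lower-cased.
    extension = os.path.splitext(file.filename or "")[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        # Raise an HTTP error; returning a (code, message) tuple would be sent as a 200 response.
        raise HTTPException(status_code=400, detail="Please upload xlsx,csv or xls file.")

    # eventid = datetime.datetime.now().strftime('%Y%m-%d%H-%M%S-') + str(uuid4())
    filepath = "location where you want to store file" + extension

    # Copy the uploaded stream to a file on disk.
    with open(filepath, "wb") as buffer:
        shutil.copyfileobj(file.file, buffer)

    try:
        if extension == ".csv":
            df = pd.read_csv(filepath)
        else:
            df = pd.read_excel(filepath)
    except Exception:
        raise HTTPException(status_code=400, detail="File is not proper")

    # Same return value as the endpoint in the question.
    return len(df)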
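
The save, read, delete sequence described above can also be done with a temporary file, so no fixed path is needed. This is a minimal sketch under stated assumptions: read_via_temp_file is a hypothetical helper name, and the sample bytes are made up. In the endpoint it would be called with file.file and the detected extension.

```python
import io
import os
import shutil
import tempfile

import pandas as pd


def read_via_temp_file(fileobj, suffix=".csv"):
    """Copy a binary file-like object to a temp file, parse it, then delete it."""
    # delete=False so the file can be reopened by name after the with block.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        shutil.copyfileobj(fileobj, tmp)
        filepath = tmp.name
    try:
        if suffix == ".csv":
            return pd.read_csv(filepath)
        return pd.read_excel(filepath)
    finally:
        # Clean up whether or not parsing succeeded.
        os.remove(filepath)


# Made-up upload content standing in for file.file.
upload = io.BytesIO(b"event,level\nsmog,3\nhaze,1\n")
df = read_via_temp_file(upload)
print(df.shape)
```

The finally block plays the role of the os.remove(filepath) call mentioned above, and it also runs when the file cannot be parsed.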

Upvotes: 1
