danpelota
danpelota

Reputation: 2275

Storing pandas DataFrames in SQLAlchemy models

I'm building a flask application that allows users to upload CSV files (with varying columns), preview uploaded files, generate summary statistics, perform complex transformations/aggregations (sometimes via Celery jobs), and then export the modified data. The uploaded file is being read into a pandas DataFrame, which allows me to elegantly handle most of the complicated data work.

I'd like these DataFrames along with associated metadata (time uploaded, ID of user uploading the file, etc.) to persist and be available for multiple users to pass around to various views. However, I'm not sure how best to incorporate the data into my SQLAlchemy models (I'm using PostgreSQL on the backend).

Three approaches I've considered:

Given the advantages and drawbacks of each (including those I'm unaware of), is there a preferred way to incorporate pandas DataFrames into SQLAlchemy models?

Upvotes: 13

Views: 2203

Answers (1)

zerocog
zerocog

Reputation: 1763

Go towards the JSON and PostgreSQL solution. I am on a Pandas project that started with the Pickle on file system, and loaded the data into to an class object for the data processing with pandas. However, as the data became large, we played with SQLAlchemy / SQLite3. Now, we are finding that working with SQLAlchemy / PostgreSQL is even better. I think our next step will be JSON. Have fun! Pandas rocks!

Upvotes: 6

Related Questions