Reputation: 1333
I have a pandas data frame which I want to insert it into my Postgres database in my Django project.
The data frame has 5 columns and the Database table has 6 columns and moreover, the data frame columns and DB columns order are not the same.
So, before merging both, do I have to make sure that the order of the columns is the same in both the data frame and DB table? and how pls suggest how do I handle the missing column
Upvotes: 3
Views: 11620
Reputation: 4404
If dataframe has column names same as column names in database, you can insert df directly into database table using dataframe.to_sql()
method with the help of sqlalchemy
for connection:
from myapp.models import Bob
from sqlalchemy import create_engine
from django.conf import settings
db_connection_url = "postgresql://{}:{}@{}:{}/{}".format(
settings.DATABASES['default']['USER'],
settings.DATABASES['default']['PASSWORD'],
settings.DATABASES['default']['HOST'],
settings.DATABASES['default']['PORT'],
settings.DATABASES['default']['NAME'],
)
engine = create_engine(db_connection_url)
df.to_sql(Bob._meta.db_table, engine, if_exists='append', index=False, chunksize=10000)
Missing column will be empty (or database will set default value if it defined at database level, not django level), or you can add missing column to dataframe with required value.
Upvotes: 9
Reputation: 5279
Just do an Explicit Insert ...
If your table has columns in the order of A,B,C,D,E
But your Pandas has them in the order of D,C,B,A (Note no Column E)
Just generate an SQL Insert like (Note I have no Column E)
insert into <TABLE> (D,C,B,A) values (row_iterator.D,row_iterator.C,...)
For the Column E - the best and simplest solution is have a default value in the Db Definition ....
i.e.
CREATE TABLE Bob (
A int NOT NULL,
B int NOT NULL,
C int NOT NULL,
D int NOT NULL,
E int DEFAULT 42
);
Hope that helps
Upvotes: 3