Bharat Guda

Reputation: 117

Loading table to AWS RDS Postgres using Python takes forever

I am reading data from a URL and loading it into AWS RDS Postgres (free tier). The data has about 1.5 million records. Loading it into a local Postgres takes under 10 minutes, but loading it into AWS RDS Postgres takes forever (more than 15 hours) for this single load. How can I improve the performance or speed up the code? Below is what I am using; please suggest some better approaches:

import pandas as pd
from sqlalchemy import create_engine
import zipfile
import urllib.request
from io import BytesIO

# Connection string redacted
pg_engine = create_engine('postgresql://user:password@host:5432/database')

# Download the zip archive into memory
zf1 = zipfile.ZipFile(BytesIO(urllib.request.urlopen('http://wireless.fcc.gov/uls/data/complete/l_market.zip').read()))

# Parse the pipe-delimited MC.dat file
df6_mk = pd.read_csv(zf1.open('MC.dat'), header=None, delimiter='|', index_col=0,
                     names=['record_type', 'unique_system_identifier', 'uls_file_number', 'ebf_number', 'call_sign',
                            'undefined_partitioned_area_id', 'partition_sequence_number', 'partition_lat_degrees',
                            'partition_lat_minutes', 'partition_lat_seconds', 'partition_lat_direction',
                            'partition_long_degrees', 'partition_long_minutes', 'partition_long_seconds',
                            'partition_long_direction', 'undefined_partitioned_area'])

# Write the DataFrame to the mc_mk table, replacing it if it exists
df6_mk.to_sql('mc_mk', pg_engine, if_exists='replace')

Upvotes: 1

Views: 105

Answers (1)

Anonymous Juan

Reputation: 446

I believe the free tier RDS option is limited to an R/W capacity of about 5 operations per second, which will throttle you.
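If the slowdown is per-row round trips rather than instance throughput, batching the writes usually helps. Below is a rough sketch, assuming the same `pg_engine` and `df6_mk` from the question: the first call lets pandas batch rows into multi-row INSERT statements; the second streams each chunk through Postgres `COPY` using a callable `method`, following the pattern shown in the pandas `to_sql` documentation (the `psql_insert_copy` name is just illustrative).

    # Sketch of two ways to reduce per-row round trips to RDS.
    # Assumes pg_engine and df6_mk already exist as in the question.
    import csv
    from io import StringIO

    # Option 1: let pandas emit multi-row INSERTs in chunks instead of
    # one INSERT per row.
    df6_mk.to_sql('mc_mk', pg_engine, if_exists='replace',
                  method='multi', chunksize=1000)

    # Option 2: push each chunk through Postgres COPY (usually the fastest
    # path), via a callable "method" as described in the pandas docs.
    def psql_insert_copy(table, conn, keys, data_iter):
        """Load a chunk of rows with COPY ... FROM STDIN via psycopg2."""
        dbapi_conn = conn.connection  # underlying psycopg2 connection
        with dbapi_conn.cursor() as cur:
            buf = StringIO()
            csv.writer(buf).writerows(data_iter)
            buf.seek(0)
            columns = ', '.join('"{}"'.format(k) for k in keys)
            table_name = ('{}.{}'.format(table.schema, table.name)
                          if table.schema else table.name)
            cur.copy_expert(
                'COPY {} ({}) FROM STDIN WITH CSV'.format(table_name, columns),
                buf)

    df6_mk.to_sql('mc_mk', pg_engine, if_exists='replace',
                  method=psql_insert_copy, chunksize=100000)

The default `to_sql` issues one INSERT per row, so 1.5 million rows over a WAN to RDS means 1.5 million round trips, which is why the same load finishes quickly against a local Postgres but crawls against RDS.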

Upvotes: 1
