duckman

Reputation: 747

Best way to work with a large dataset in Python

I am working with a large financial dataset (15 GB for now, but it will be 200 GB later). What is the best way to work with it? In particular, I want to run some statistical tests and produce some graphs from millisecond-level data. So far I have used sqlite3 for the sake of simplicity, but it seems unable to handle a file of this size. I am using PyCharm (not sure if that helps).

Upvotes: 2

Views: 1000

Answers (1)

Juanín

Reputation: 851

SQLite is not a good option for managing large amounts of data (actually, I wouldn't use SQLite for anything other than prototyping or running tests).

You can try Amazon RDS to host the database (http://aws.amazon.com/es/rds/) and choose one of the database engines Amazon offers.

As for Python, I think you should let the DB engine handle the queries and use Python only to produce the graphs.
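
For example, here is a minimal sketch of that split, assuming a PostgreSQL instance on RDS; the connection string and the `trades` table with `ts` and `price` columns are hypothetical, so substitute your own schema. The idea is to aggregate the millisecond ticks server-side and pull only the small result set into Python for plotting:

    import matplotlib.pyplot as plt
    import pandas as pd
    from sqlalchemy import create_engine

    # Hypothetical connection string; replace host, credentials, and
    # database name with your own RDS instance details.
    engine = create_engine(
        "postgresql://user:password@your-instance.rds.amazonaws.com:5432/markets"
    )

    # Let the database do the heavy lifting: collapse millisecond ticks
    # into one-minute averages server-side, so only a few thousand rows
    # ever cross the network into Python.
    query = """
        SELECT date_trunc('minute', ts) AS minute,
               avg(price)               AS avg_price
        FROM trades
        GROUP BY 1
        ORDER BY 1
    """
    df = pd.read_sql(query, engine)

    # Plot the pre-aggregated result locally.
    df.plot(x="minute", y="avg_price")
    plt.show()

This way the 200 GB never has to fit in your machine's memory; only the aggregates do.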

Upvotes: 1
