Reputation: 11281
I want to take advantage of the speed benefits of holding an SQLite database (via SQLAlchemy) in memory while I go through a one-time process of inserting content, and then dump it to file, stored to be used later.
Consider a bog-standard database created in the usual way:
# in-memory database
e = create_engine('sqlite://')
Is there a quicker way of moving its contents to disc, other than just creating a brand new database and inserting each entry manually?
EDIT:
There is some doubt as to whether or not I'd even see any benefits to using an in-memory database. Unfortunately I already see a huge time difference of about 120x.
This confusion is probably due to me missing out some important detail in the question. Also probably due to a lack of understanding on my part re: caches / page sizes / etc. Allow me to elaborate:
I am running simulations of a system I have set up, with each simulation going through the following stages:
insert
new entries into the database based on the most recent simulation.commit()
.While I only ever make a dozen or so insertions on each simulation run, I do however run millions of simulations, and the results of each simulation need to be available for future simulations to take place. As I say, this read and write process takes considerably longer when running a file-backed database; it's the difference between 6 hours and a month.
Hopefully this clarifies things. I can cobble together a simple python script to outline my process further a little further if necessary.
Upvotes: 7
Views: 5266
Reputation: 1121524
SQLAlchemy and SQLite know how to cache and do batch-inserts just fine.
There is no benefit in using an in-memory SQLite database here, because that database uses pages just like the on-disk version would, and the only difference is that eventually those pages get written to disk for disk-based database. The difference in performance is only 1.5 times, see SQLite Performance Benchmark -- why is :memory: so slow...only 1.5X as fast as disk?
There is also no way to move the in-memory database to a disk-based database at a later time, short of running queries on the in-memory database and executing batch inserts into the disk-based database on two separate connections.
Upvotes: 1