Reputation: 136151
I have an existing SQLite 3 database file on which I need to run some extensive calculations. Doing the calculations from the file is painfully slow, and as the file is not large (approximately 10 MB), there should be no problem loading it into memory.
Is there a Pythonic way to load the existing file into memory in order to speed up the calculations?
Upvotes: 80
Views: 59649
Reputation: 151
Nearly 14 years later, things are better than they were back in 2010. In Python 3.11 you can do the following:
import sqlite3

with open(sqlitefile_path, 'rb') as f:
    conn = sqlite3.connect(":memory:")
    conn.deserialize(f.read())
In 2018, the sqlite3_{de,}serialize() functions were added to SQLite and released in version 3.23. They could be accessed through Python's ctypes, though that's not super Pythonic. The apsw module added support for serialize in 3.36, back in 2021. And support now exists in the built-in sqlite3 library in Python 3.11, released in late 2022.
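The reverse direction is a one-liner in Python 3.11+ too, should you want to persist the results afterwards. A minimal sketch, reusing conn and sqlitefile_path from above:

with open(sqlitefile_path, 'wb') as f:
    f.write(conn.serialize())  # dump the in-memory database back to disk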
Upvotes: 2
Reputation: 2888
Here is the snippet that I wrote for my Flask application:
import sqlite3
from io import StringIO

def init_sqlite_db(app):
    # Read database to tempfile
    con = sqlite3.connect(app.config['SQLITE_DATABASE'])
    tempfile = StringIO()
    for line in con.iterdump():
        tempfile.write('%s\n' % line)
    con.close()
    tempfile.seek(0)

    # Create a database in memory and import from tempfile
    app.sqlite = sqlite3.connect(":memory:")
    app.sqlite.cursor().executescript(tempfile.read())
    app.sqlite.commit()
    app.sqlite.row_factory = sqlite3.Row
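A hypothetical usage sketch (the app setup and the 'app.db' path are assumptions, not part of the snippet):

from flask import Flask

app = Flask(__name__)
app.config['SQLITE_DATABASE'] = 'app.db'  # placeholder path
init_sqlite_db(app)
rows = app.sqlite.execute('SELECT * FROM some_table').fetchall()  # hypothetical table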
Upvotes: 130
Reputation: 3348
This has already been answered before, including code examples at In Python, how can I load a SQLite database completely to memory before connecting to it?.
You do not mention the operating system, but one gotcha of Windows XP is that it defaults to a 10 MB file cache no matter how much memory you have (this made sense in the days when systems came with 64 MB and so on). This message has several links:
Re: [sqlite] ORDER BY is more than 10 times slower with 3.3.8 compared
Upvotes: 4
Reputation: 1398
SQLite supports in-memory databases.
In Python, you would use a :memory: database name for that.
Perhaps you could open two databases (one from the file, and an empty one in memory), migrate everything from the file database into memory, and then use the in-memory database for the calculations.
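A minimal sketch of that migration using ATTACH ('data.db' is a placeholder path; note that CREATE TABLE ... AS SELECT copies rows but not indexes or triggers):

import sqlite3

mem = sqlite3.connect(':memory:')
mem.execute("ATTACH DATABASE 'data.db' AS src")  # placeholder path
tables = [row[0] for row in mem.execute(
    "SELECT name FROM src.sqlite_master "
    "WHERE type='table' AND name NOT LIKE 'sqlite_%'")]
for name in tables:
    # Copies rows only; recreate indexes separately if you need them.
    mem.execute('CREATE TABLE "%s" AS SELECT * FROM src."%s"' % (name, name))
mem.commit()
mem.execute("DETACH DATABASE src")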
Upvotes: -1
Reputation: 3385
The answers of @thinwybk and Crooner are both excellent.
When you have multiple connections to :memory: SQLite databases, for instance when using SQLAlchemy together with the source.backup(dest) function, you may end up not placing the backup into the "right" memory database.
This can be fixed using a proper connection string: Python - How to connect SQLAlchemy to existing database in memory. It does not involve any hack or use of undocumented features.
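A sketch of such a connection string using SQLite's shared-cache URI form, so that the sqlite3 backup target and the SQLAlchemy engine address the same in-memory database ('shared_db' and 'existing.db' are placeholder names):

import sqlite3
from sqlalchemy import create_engine

source = sqlite3.connect('existing.db')  # placeholder path
dest = sqlite3.connect('file:shared_db?mode=memory&cache=shared', uri=True)
source.backup(dest)  # dest must stay open, or the shared in-memory DB vanishes

engine = create_engine('sqlite:///file:shared_db?mode=memory&cache=shared&uri=true')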
Upvotes: 0
Reputation: 28
With the solution of Cenk Alti, I always got a MemoryError with Python 3.7 when the process reached 500 MB. Only with the backup functionality of SQLite 3 (mentioned by thinwybk) was I able to load and save bigger SQLite databases. You can also do the same with just three lines of code, both ways.
Upvotes: 0
Reputation: 3027
If we must use a Python wrapper, then there is no better solution than the two-pass, read-and-write solution. But beginning with version 3.7.17, SQLite has the option of accessing disk content directly using memory-mapped I/O: SQLite mmap.
If you want to use mmap, you have to use the C interface, since no wrapper provides it.
And there is another, hardware solution: the RAM disk. Then you have the convenient file I/O and the speed of memory.
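For what it's worth, with newer builds the mmap_size PRAGMA can be issued through the standard sqlite3 module as well, provided the bundled SQLite is 3.7.17+ and compiled with mmap support. A sketch ('existing.db' and the 256 MiB cap are placeholders):

import sqlite3

con = sqlite3.connect('existing.db')  # placeholder path
con.execute('PRAGMA mmap_size = 268435456')  # map up to 256 MiB of the file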
Upvotes: 7
Reputation: 4753
What about sqlite3.Connection.backup(...)? "This method makes a backup of a SQLite database even while it’s being accessed by other clients, or concurrently by the same connection." Availability: SQLite 3.6.11 or higher. New in Python version 3.7.
import sqlite3
source = sqlite3.connect('existing_db.db')
dest = sqlite3.connect(':memory:')
source.backup(dest)
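When the calculations are done, the same API can persist the results by backing up the other way; a sketch (note this overwrites the contents of the destination database):

# ... run calculations against dest ...
dest.backup(source)  # copy the in-memory database back to existing_db.db
source.close()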
Upvotes: 68
Reputation: 61
Here is a relatively simple way to read a SQLite db into memory. Depending upon your preferences with regard to manipulating data, you either use a pandas DataFrame or write your table to an in-memory sqlite3 database. Similarly, after manipulating your data, you use the same df.to_sql approach to store your results back into a db table.
import sqlite3 as lite
import pandas as pd
from sqlalchemy import create_engine

# In-memory SQLite database via SQLAlchemy
engine = create_engine('sqlite://')
c = engine.connect()
conmem = c.connection

# Here is the connection to <ait.sqlite> residing on disk
con = lite.connect('ait.sqlite', isolation_level=None)

# Read the SQLite table into a pandas DataFrame
sqlx = 'SELECT * FROM "Table"'
df = pd.read_sql(sqlx, con, coerce_float=True, params=None)

# Write the DataFrame to the in-memory database (the deprecated
# flavor= argument was removed in modern pandas)
df.to_sql(con=conmem, name='Table', if_exists='replace')
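And the reverse direction described above, a sketch of storing results back into the on-disk database ('Results' is a hypothetical table name):

df.to_sql(con=con, name='Results', if_exists='replace')  # persist back to ait.sqlite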
Upvotes: 4
Reputation: 363537
sqlite3.Connection.iterdump "[r]eturns an iterator to dump the database in an SQL text format. Useful when saving an in-memory database for later restoration. This function provides the same capabilities as the .dump command in the sqlite3 shell."
Get such an iterator and dump the disk-based database into a memory-based one, and you're ready to compute. When the computation is done, just dump the other way around back to disk.
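A minimal sketch of that round trip ('data.db' and 'results.db' are placeholder paths):

import sqlite3

disk = sqlite3.connect('data.db')
mem = sqlite3.connect(':memory:')
mem.executescript('\n'.join(disk.iterdump()))  # disk -> memory
disk.close()

# ... compute against mem ...

out = sqlite3.connect('results.db')
out.executescript('\n'.join(mem.iterdump()))  # memory -> disk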
Upvotes: 16
Reputation: 44118
First you should try to find out what is causing the slowness you are observing. Are you writing to tables? Are your writes within large enough transactions, so that you don't save needless temporary results to disk? Can you change writes to go to temporary tables (with pragma temp_store=memory)? Can you live with pragma synchronous=off?
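A sketch of those two pragmas from Python ('data.db' is a placeholder path; synchronous=off trades crash safety for speed):

import sqlite3

con = sqlite3.connect('data.db')  # placeholder path
con.execute('PRAGMA temp_store = MEMORY')  # keep temporary tables in RAM
con.execute('PRAGMA synchronous = OFF')    # skip fsync; faster but unsafe on crash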
I don't think this functionality is exposed in the Python module, but SQLite has a backup API that sounds like exactly what you are asking for: a way to copy from one database to another (either of which may be an in-memory database) that works pretty much automatically, without any user-visible enumeration of tables. (Maybe APSW exposes this?)
Another option is to create a RAM disk (if you have sufficient control of the environment) and copy the file there.
Upvotes: 9