Reputation: 11418
I am having a ~90 MB SQLite database on a SSD consisting mostly on message attachments including a BLOB column content, that stores the binary attachment data.
Now I found that the following query
SELECT message_id FROM attachments WHERE length(content) IS NULL;
is 500x faster (0.5ms vs. 250ms) than the original
SELECT message_id FROM attachments WHERE content IS NULL;
Is it true, that both queries are equivalent?
Additional info
Upvotes: 7
Views: 1625
Reputation: 26599
I made a script to benchmark both functions. length(x) IS NULL
is faster unless you have mostly NULL
values.
IS NULL
: 11.343180236999842length(x) IS NULL
: 7.824154090999855IS NULL
: 15.019244787999924length(x) IS NULL
: 7.527420233999919IS NULL
: 6.184766045999822length(x) IS NULL
: 6.448342310000044import sqlite3
import timeit
conn = sqlite3.connect("test.db")
c = conn.cursor()
c.execute("DROP TABLE IF EXISTS test")
c.execute("CREATE TABLE test (data BLOB)")
for i in range(10000):
# Modify this to change data
if i % 2 == 0:
c.execute("INSERT INTO test(data) VALUES (randomblob(1024))")
else:
c.execute("INSERT INTO test(data) VALUES (NULL)")
def timeit_isnull():
c.execute("SELECT data IS NULL AS dataisnull FROM test")
c.fetchall()
def timeit_lenisnull():
c.execute("SELECT length(data) IS NULL AS dataisnull FROM test")
c.fetchall()
print(timeit.timeit(timeit_isnull, number=1000))
print(timeit.timeit(timeit_lenisnull, number=1000))
Upvotes: 3
Reputation: 180172
In SQLite, the length and type of each column value are stored at the beginning of the row.
This allows the length()
and typeof()
functions to be optimized to avoid loading the actual value.
The IS NULL operator has no such optimization (although it would be possible to implement it).
Upvotes: 4
Reputation: 31889
Actually using LENGTH(some_blob_content)
instructs MySQL server 5.5 and above to bypass row-scanning, this is the reason for improved performance of your query, because the data is read directly from meta-data table.
EDIT:
In SQLite during part INSERT
and SELECT
processing, the complete content of each row in the database is encoded as a single BLOB. See here for details.
Also for a string value X, the
length(X)
function returns the number of characters (not bytes) in X prior to the firstNULL
character. Since SQLite strings do not normally containNULL
characters, thelength(X)
function will usually return the total number of characters in the string X. For ablob
value X,length(X)
returns the number of bytes in the blob. If X isNULL
thenlength(X)
is NULL. If X is numeric thenlength(X)
returns the length of a string representation of X.
The above quote from the documentation suggests that, for blob values, the procedure of reading their length is the same as in MySQL, i.e. from meta-data table.
Upvotes: 1