Reputation: 93082
I am implementing a FUSE file system that uses a sqlite3 database for its backend. I do not plan to ever change the database backend as my file system uses sqlite3 as a file format. One of the functions a file system must implement is readdir
function. This function allows the process to iteratively read the contents of a directory by repeatedly calling it and getting the next few directory entries (as many as the buffer can hold). The directory entries are returned may be returned in any order. I want to implement this operation with the following query:
SELECT fileno, name FROM dirents WHERE dirno = ? LIMIT -1 OFFSET ?;
where dirno
is the directory I'm reading from and OFFSET ?
is the number of entries I've already returned. I want to read as many rows as I can fit into the buffer (I cannot predict the count as these are variable-length records depending on the length of the file name) and then reset the query.
Due to the stateless nature of FUSE, keeping open a query and returning the next few rows until the directory has ended is not an option as I cannot reliably detect if the process closes the directory prematurely.
The dirents
table has the following schema:
CREATE TABLE dirents (
dirno INTEGER NOT NULL REFERENCES inodes(inum),
fileno INTEGER NOT NULL REFERENCES inodes(inum),
name TEXT NOT NULL,
PRIMARY KEY (dirno, name)
) WITHOUT ROWID;
In theory, a SELECT
statement yields rows in no defined order. In practice, can I assume that when I execute the same prepared SELECT
statement multiple times with successively larger OFFSET
values, I get the same results as if I read all the data in a single query, i.e. the row order is the same unspecified order each time? An assumption that currently holds is that the database is not modified between queries.
Can I assume that the row order stays reasonably similar when a different query modifies the dirents
table inbetween? Some glitches (e.g. directory entries appearing twice) will of course be observable by the program, but for usability (the main user of readdir
is the ls
command) it is highly useful if the directory listing is usually mostly correct.
If I cannot make these assumptions, what is a better way to reach the desired result?
I know that I could throw in an ORDER BY
clause to make the row order well-defined, but I fear that this might have a strong impact on performance, especially when reading small chunks of a large directory—the directory has to be sorted every time a chunk is read.
Upvotes: 1
Views: 154
Reputation: 93082
If I add an ORDER BY name
clause to the SELECT
query, sqlite3 generates almost identical (except for a stray Noop
) bytecode for the query but a row order is guaranteed:
sqlite> EXPLAIN SELECT fileno, name FROM dirents WHERE dirno = ? LIMIT -1 OFFSET ?;
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 21 0 00
1 Integer -1 1 0 00
2 Variable 2 2 0 00
3 MustBeInt 2 0 0 00
4 SetIfNotPos 2 2 0 00
5 Add 1 2 3 00
6 SetIfNotPos 1 3 -1 00
7 OpenRead 1 3 0 k(2,nil,nil) 02
8 Variable 1 4 0 00
9 IsNull 4 19 0 00
10 Affinity 4 1 0 D 00
11 SeekGE 1 19 4 1 00
12 IdxGT 1 19 4 1 00
13 IfPos 2 18 1 00
14 Column 1 2 5 00
15 Column 1 1 6 00
16 ResultRow 5 2 0 00
17 DecrJumpZero 1 19 0 00
18 Next 1 12 0 00
19 Close 1 0 0 00
20 Halt 0 0 0 00
21 Transaction 0 0 3 0 01
22 TableLock 0 3 0 dirents 00
23 Goto 0 1 0 00
sqlite> EXPLAIN SELECT fileno, name FROM dirents WHERE dirno = ? ORDER BY name LIMIT -1 OFFSET ?;
addr opcode p1 p2 p3 p4 p5 comment
---- ------------- ---- ---- ---- ------------- -- -------------
0 Init 0 22 0 00
1 Noop 0 0 0 00
2 Integer -1 1 0 00
3 Variable 2 2 0 00
4 MustBeInt 2 0 0 00
5 SetIfNotPos 2 2 0 00
6 Add 1 2 3 00
7 SetIfNotPos 1 3 -1 00
8 OpenRead 2 3 0 k(2,nil,nil) 02
9 Variable 1 4 0 00
10 IsNull 4 20 0 00
11 Affinity 4 1 0 D 00
12 SeekGE 2 20 4 1 00
13 IdxGT 2 20 4 1 00
14 IfPos 2 19 1 00
15 Column 2 2 5 00
16 Column 2 1 6 00
17 ResultRow 5 2 0 00
18 DecrJumpZero 1 20 0 00
19 Next 2 13 0 00
20 Close 2 0 0 00
21 Halt 0 0 0 00
22 Transaction 0 0 3 0 01
23 TableLock 0 3 0 dirents 00
24 Goto 0 1 0 00
So I guess I'm going for that ORDER BY name
clause.
Upvotes: 0
Reputation: 1270503
The right solution to this problem is to use order by
. If you are concerned about the performance of the order by, then use an index on the column used for the order by
.
The simplest approach, in my opinion, would be to remove the without rowid
option on the table creation. Then you can just access the table as:
SELECT fileno, name
FROM dirents
WHERE dirno = ?
ORDER BY rowid
LIMIT -1 OFFSET ?;
I realize this adds additional bytes to each row, but it is for a good purpose -- making sure that your queries are correct.
Actually, the best index for this table would be on dirno, rowid, fileno, name
. Given the where
clause, you are doing a full table scan anyway, unless you have an index.
Upvotes: 2