jamzsabb
jamzsabb

Reputation: 1154

Is high read but absolutely no writing on SQLite query normal?

I've been running a query to add a column to a table by matching ID from another table. Both have about 600 million rows so its reasonable that this would take a while, but what is worrying me is that there are high read speeds (~500MB/s) on the disk but sqlite is writing 0B/s according to iotop. The filesize on my .db file has not changed in hours, but adding a column to a 600 million row table should change AT LEAST one byte, no?

Is this normal behavior from SQLite? The machine is pretty powerful, Ubuntu 16 on quad core i7 with 64GB RAM and NVMe SSD. The queries and table schema are below.

ALTER TABLE tableA ADD address TEXT;
UPDATE tableA SET address = (SELECT address FROM tableB WHERE tableA.ID = tx_out.ID);

Table schema:
CREATE TABLE tableA (
    ID TEXT,
    column1 INT,
    column2 TEXT,
);

CREATE TABLE tx_out (
    ID TEXT,
    sequence INT,
    address TEXT
);

Upvotes: 0

Views: 96

Answers (2)

CL.
CL.

Reputation: 180060

Adding a column changes almost nothing on disk; a row with fewer values than the table has columns is assumed to have NULLs in the missing columns.

The UPDATE is extremely slow because the subquery has to scan through the entire tx_out table for every row in tableA. You could speed it up greatly with an index on the tx_out.ID column.

When the database has to rewrite all rows anyway, and you have the disk space, it might be a better idea to create a new table:

INSERT INTO NewTable(ID, col1, col2, address)
SELECT ID, col1, col2, address
FROM tableA
JOIN tableB USING (ID);     -- also needs an index to be fast

Upvotes: 2

Twelfth
Twelfth

Reputation: 7180

Too big for a comment

I've had this run for days with no change...I think it is prone to locking itself in one manner or another, I killed it after it's 3rd day of what appeared to be no changes. I've encountered very similar issues when attempting to add a new index as well, but that one successfully completed in 2 days before I hit my 3 day kill threshhold ;) Possible 3 days just wasn't enough?

My preference now is to create a second table that has the new column, load the table with the data from the old plus the new column, rename the old table to X_oldtablename, rename the new table to the table name. Run tests and drop x_oldtablename after you are confident the new table is working

Upvotes: 1

Related Questions