jony89
jony89

Reputation: 5367

optimizing mysql table with 1.5m records where most are soft deleted

I've got a MySQL table that has around 1.5 million records and the table size is 1.3GB

I am using a soft delete mechanism in that table, which means I've a column deleted_at which indicates whether the row has been deleted and when. if the record is not deleted then deleted_at value is NULL

From these 1.5 million records, only 30K are not soft deleted. that means they are accessed frequently, while the other records are barely accessed, but they are, in some cases.

So this table is heavily used and queried for the none deleted records, and sometimes for the soft deleted records.

I have a BTREE index type for the deleted_at record (with cardinality of 35K). The table becomes heavier with time and obviously it is not a scalable solution.

The table engine is MyISAM. most of the other tables are InnoDB but this table is queried heavily with STORED PROCEDURE, and when I changed to InnoDB the queries were way slower.

I am looking for a solution that will not involve hardware changes. the current hardware is sufficient for that table to have good performance, but this will not be the case once this table will grow more.

Things I thought of :

What other options do I have? can I give priority to some rows in a table with MySQL? memory wise.

I've got 10.3.20-MariaDB and 32GB of RAM

Upvotes: 2

Views: 914

Answers (1)

Bill Karwin
Bill Karwin

Reputation: 562731

MyISAM does not cache rows, it only caches indexes. It relies on the filesystem cache for buffering rows.

So you could try to ensure at least the index is fully loaded into the cache:

  1. Increase key_buffer_size so it's at least as large as your MyISAM indexes for this table. Use SHOW TABLE STATUS to find out the index size.
  2. If you have multiple MyISAM tables, you might need to dedicate a key cache specifically for this table. See CACHE INDEX.
  3. Pre-load the index into the key cache at startup. See LOAD INDEX INTO CACHE.

You may also want to consider multi-column indexes tailored to your queries. Like if you have a query WHERE user_id = 1234 AND deleted_at IS NULL, you should create an index on (user_id, deleted_at).

Which indexes you need depend on the queries you want to optimize.

Frankly, I would split the table so deleted rows are in a second table. That would reduce your table size by 98%, and that might make queries run quick enough that you don't need to use MyISAM anymore.

Upvotes: 4

Related Questions