user3010273

Reputation: 900

Optimizing MySQL database - list of tasks eats up my server

I run a service that takes user requests and adds them to a table named 'queue' that has 3 fields: userid, queuenumber, and processed.

That is where I store the tasks that need to be processed (I can process fewer tasks than I get requests).

Unfortunately, I don't really know much about MySQL optimization, and my processing script (which I run in 50 parallel instances) uses queries like this:

SELECT * FROM `queue` WHERE `processed` = 0 ORDER BY `queue`.`queuenumber`

When any instance of the code takes a row to process, it marks it processed = 1 so it won't be picked up by the other instances.
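
Something like this (a simplified sketch - addressing the row by its queuenumber):

UPDATE `queue` SET `processed` = 1 WHERE `queuenumber` = 42;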

So when I run 50 instances of this code, it becomes heavy on the MySQL database (and I have only 4 GB of RAM).

Because of that, I get stats like these:

Traffic:

Received: 15.8 MiB
Sent: 42.9 GiB

ø per hour:

Received: 3 MiB
Sent: 8.1 GiB

Some stats from phpMyAdmin:

(screenshot of phpMyAdmin status page omitted)

How do I optimize this? I want to be able to run this code in parallel, but without listing the whole table every time.

// EDIT:

The output of SHOW CREATE TABLE is:

CREATE TABLE `queue` (
  `userid` int(11) NOT NULL,
  `queuenumber` int(11) NOT NULL AUTO_INCREMENT,
  `processed` tinyint(1) NOT NULL,
  PRIMARY KEY (`queuenumber`),
  UNIQUE KEY `queuenumber` (`queuenumber`),
  UNIQUE KEY `userid` (`userid`)
) ENGINE=InnoDB AUTO_INCREMENT=121617 DEFAULT CHARSET=latin1

Upvotes: 1

Views: 185

Answers (1)

abasterfield

Reputation: 2282

Implementing queues with an RDBMS is an anti-pattern: http://blog.engineyard.com/2011/5-subtle-ways-youre-using-mysql-as-a-queue-and-why-itll-bite-you

However, if you insist, there are ways to do it consistently and without race conditions.

First, implement a notification (outside of MySQL) for when a new row is enqueued, so you don't have to poll for new data in a tight loop - that polling may well be why your traffic is so high. I have seen UDP multicast used for this, as well as UNIX signals.

Now check that the table is using InnoDB

SHOW CREATE TABLE `queue`;

If you need to change it to InnoDB

ALTER TABLE `queue` ENGINE=InnoDB;

If you have multiple consumers processing jobs, make sure your locking is race-free: the first thing a consumer should do when it encounters a new job is try to claim it by updating the row with its unique process id. In other words, each consumer performs an UPDATE that assigns the row/job to itself by ID.

You will need to have a process column on your table

ALTER TABLE `queue` ADD `process` int unsigned default NULL;

Add an index to help you select the rows efficiently

ALTER TABLE `queue` ADD KEY (`processed`,`process`);

Now your consumers can run the following in their code

UPDATE `queue` SET process = ? WHERE process IS NULL AND processed = 0 ORDER BY `queuenumber` LIMIT 1;

You can set ? to the pid or thread id of the consumer. You can tell whether the UPDATE claimed a row either by checking the affected-rows count in the query response (fastest), or by trying to fetch the row details anyway - if the SELECT below returns 0 rows, the UPDATE did nothing and a different consumer got the job.
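
In plain SQL you can read the affected-rows count with MySQL's ROW_COUNT() function; a minimal sketch (it has to run on the same connection, straight after the UPDATE):

-- run immediately after the UPDATE above, on the same connection
SELECT ROW_COUNT();  -- 1 = this consumer claimed a job, 0 = another consumer got it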

SELECT * FROM `queue` WHERE process = ? AND processed = 0;

When your consumer is finished with the row/job, it can just set processed to 1 on that row (and maybe you should consider an auto-increment primary key on the table so you can accurately address a single row)

ALTER TABLE `queue` ADD `id` bigint unsigned NOT NULL auto_increment FIRST, ADD primary key (`id`);
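
The consumer can then mark a finished job with a single-row update - a sketch, assuming the new id column:

UPDATE `queue` SET processed = 1 WHERE id = ?;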

If your consumers crash for whatever reason you can tell what was going on by looking at the row - if processed = 0 but process has been set then something crashed while processing that row.
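
For example, something along these lines lists the orphaned jobs and releases the ones held by a dead consumer so they can be claimed again (123 stands in for the dead consumer's pid):

-- jobs that were claimed but never finished
SELECT * FROM `queue` WHERE processed = 0 AND process IS NOT NULL;

-- release the jobs of a crashed consumer
UPDATE `queue` SET process = NULL WHERE processed = 0 AND process = 123;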

Upvotes: 2
