Reputation: 7654
Inside a web application (Pyramid) I create certain objects on POST which need some work done on them (mainly fetching something from the web). These objects are persisted to a PostgreSQL database with the help of SQLAlchemy. Since these tasks can take a while, they are not done inside the request handler but rather offloaded to a daemon process on a different host. When the object is created I take its ID (which is a client-side generated UUID) and send it via ZeroMQ to the daemon process. The daemon receives the ID, fetches the object from the database, does its work and writes the result to the database.
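To illustrate, a stripped-down sketch of that flow (names, socket type and address are placeholders, not the real code):
import uuid
import zmq

context = zmq.Context.instance()
sender = context.socket(zmq.PUSH)
sender.connect("tcp://daemon-host:5555")  # daemon lives on another host

def create_entity(request):
    entity_id = uuid.uuid4()              # client-side generated UUID
    # ... build the SQLAlchemy object with this id and add it to the session;
    # pyramid_tm commits only after the handler has returned ...
    sender.send_string(str(entity_id))    # tell the daemon there is work to do
    return {"id": str(entity_id)}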
Problem:
The daemon can receive the ID before its creating transaction is committed. Since we are using pyramid_tm, all database transactions are committed when the request handler returns without an error, and I would rather like to leave it this way. On my dev system everything runs on the same box, so ZeroMQ is lightning fast. On the production system this is most likely not an issue since the web application and the daemon run on different hosts, but I don't want to count on this.
This problem only recently manifested itself since we previously used MongoDB with a write_concern of 2. Having only two database servers, the write on the entity always blocked the web request until the entity was persisted (which obviously is not the greatest idea).
I see multiple possible solutions, but most of them don't satisfy me:
One of them is to use an after_created event to trigger it, and this is really nice since it decouples this process completely and thus eliminates the risk of "forgetting" to tell the daemon to work. I also think that I would still need a READ UNCOMMITTED isolation level on the daemon side, is this correct?
Upvotes: 1
Views: 2426
Reputation: 8783
I would just use PostgreSQL's LISTEN and NOTIFY functionality. The worker can connect to the SQL server (which it already has to do) and issue the appropriate LISTEN. PostgreSQL will then let it know when relevant transactions have finished. Your trigger for generating the notifications in the SQL server could probably even send the entire row in the payload, so the worker doesn't even have to request anything:
CREATE OR REPLACE FUNCTION magic_notifier() RETURNS trigger AS $$
BEGIN
    -- send the freshly inserted row as JSON on the 'stuffdone' channel
    PERFORM pg_notify('stuffdone', row_to_json(new)::text);
    RETURN new;
END;
$$ LANGUAGE plpgsql;
With that, as soon as the worker knows there is work to do, it also has the necessary information, so it can begin without another round-trip.
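For example, once the function is attached to the relevant table (something like CREATE TRIGGER ... AFTER INSERT ON your_table FOR EACH ROW EXECUTE PROCEDURE magic_notifier(); the table name is yours to fill in), a rough worker-side loop with psycopg2 could look like this (a sketch only; connection details are assumptions):
import json
import select
import psycopg2
import psycopg2.extensions

conn = psycopg2.connect("dbname=app user=worker")  # adjust the DSN to your setup
conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)

cur = conn.cursor()
cur.execute("LISTEN stuffdone;")  # channel used by magic_notifier()

while True:
    # wait until the server has something for us (timeout keeps the loop responsive)
    if select.select([conn], [], [], 5) == ([], [], []):
        continue
    conn.poll()
    while conn.notifies:
        notify = conn.notifies.pop(0)
        row = json.loads(notify.payload)  # the whole row, as produced by row_to_json()
        print("got work:", row)           # placeholder for the actual processing
Notifications are only delivered once the inserting transaction has committed, so the worker never sees uncommitted rows.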
Upvotes: 1
Reputation: 7033
This comes close to your second solution:
Create a buffer, drop the ids from your ZeroMQ messages in there and let your worker poll this id-pool regularly. If it fails to retrieve an object for an id from the database, let the id sit in the pool until the next poll; otherwise remove the id from the pool.
You have to deal with the asynchronous behaviour of your system somehow. If the ids consistently arrive before the object is persisted in the database, it doesn't matter whether pooling the ids (and re-polling the same id) reduces throughput, because the bottleneck is earlier.
An upside is that you could run multiple frontends in front of this.
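A minimal sketch of such an id-pool (illustrative only; fetch and process stand in for your database lookup and the actual work):
import time

pending_ids = set()   # ids received over ZeroMQ that have not been processed yet

def drain_pool(fetch, process, interval=1.0):
    # fetch(entity_id) returns the persisted object or None,
    # process(obj) does the actual work; both are placeholders here
    while True:
        for entity_id in list(pending_ids):
            obj = fetch(entity_id)
            if obj is None:
                continue                     # not committed yet, retry on the next poll
            process(obj)
            pending_ids.discard(entity_id)
        time.sleep(interval)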
Upvotes: 0