Metalstorm

Reputation: 3242

Etags used in RESTful APIs are still susceptible to race conditions

Maybe I'm overlooking something simple and obvious here, but here goes:

So one of the features of the Etag header in an HTTP request/response is to enforce concurrency control, namely so that multiple clients cannot overwrite each other's edits of a resource (normally when doing a PUT request). I think that part is fairly well known.

The bit I'm not so sure about is how the backend/API can actually implement this without having a race condition; for example:

Setup:

The problem:

The only fool-proof solution I can think of is to also make the database perform the check, in the update query for example. Am I missing something?

P.S. Tagged as Python due to the frameworks used, but this should be a language/framework-agnostic problem.

Upvotes: 6

Views: 2078

Answers (3)

Mark

Reputation: 19977

You are right that you can still get race conditions if the 'check last etag' and 'make the change' aren't in one atomic operation.

In essence, if your server itself has a race condition, sending etags to the client won't help with that.

You already mentioned a good way to achieve this atomicity:

The only fool-proof solution I can think of is to also make the database perform the check, in the update query for example.

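You could also do something else, such as guarding the check and the write with a mutex, or using an architecture where two threads never deal with the same data. Note that an in-process mutex only helps if every request for that resource is handled by the same process.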

But the database check seems good to me. What you describe about ORM checks might be an addition for better error messages, but is not by itself sufficient as you found.
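As a rough illustration of letting the database perform the check, here is a minimal sketch using Python's built-in sqlite3 module. The `resource` table, the `make_etag` hashing scheme and the `put_resource` helper are assumptions made up for this example, not something from the question; the point is only that the ETag comparison lives in the WHERE clause, so the check and the write are a single statement.

```python
import hashlib
import sqlite3


def make_etag(body: str) -> str:
    # Hypothetical ETag scheme for this sketch: a hash of the stored body.
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def put_resource(conn, resource_id, new_body, if_match):
    """Return 204 if updated, 412 if the ETag is stale, 404 if the row is missing."""
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "UPDATE resource SET body = ?, etag = ? WHERE id = ? AND etag = ?",
            (new_body, make_etag(new_body), resource_id, if_match),
        )
    if cur.rowcount == 1:
        return 204
    # Nothing was updated: tell a stale ETag apart from a missing resource.
    exists = conn.execute(
        "SELECT 1 FROM resource WHERE id = ?", (resource_id,)
    ).fetchone()
    return 412 if exists else 404


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resource (id INTEGER PRIMARY KEY, body TEXT, etag TEXT)")
conn.execute("INSERT INTO resource VALUES (1, 'v1', ?)", (make_etag("v1"),))
conn.commit()

stale = make_etag("v1")  # both clients fetched the resource at the same time
first = put_resource(conn, 1, "edit from client A", stale)
second = put_resource(conn, 1, "edit from client B", stale)
```

Here the first PUT succeeds and the second one, still carrying the old ETag, gets 412 Precondition Failed. An ORM-level check beforehand can still be used for nicer error messages, but the row count of the UPDATE is what decides the outcome.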

Upvotes: 1

Matt Timmermans

Reputation: 59263

This is really a question about how to use ORMs to do updates, not about ETags.

Imagine 2 processes transferring money into a bank account at the same time -- they both read the old balance, add some, then write the new balance. One of the transfers is lost.

When you're writing with a relational DB, the solution to these problems is to put the read + write in the same transaction, and then use SELECT FOR UPDATE to read the data and/or ensure you have an appropriate isolation level set.

The various ORM implementations all support transactions, so getting the read, check and write into the same transaction will be easy. If you set the SERIALIZABLE isolation level, then that will be enough to fix race conditions, but you may have to deal with deadlocks or serialization failures, typically by retrying the transaction.

ORMs also generally support SELECT FOR UPDATE in some way. This will let you write safe code with the default READ COMMITTED isolation level. If you google SELECT FOR UPDATE and your ORM, it will probably tell you how to do it.

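In both cases (SERIALIZABLE isolation level or SELECT FOR UPDATE), the database makes the read, check and write behave as one atomic step. With SELECT FOR UPDATE it does this by locking the entity's row when you read it: if another request tries to lock or update the same row before your transaction commits, it will be forced to wait. With SERIALIZABLE, the exact mechanism depends on the database: some take locks, while others let both transactions run and abort one of them at commit time.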
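To make the read + write in one transaction concrete, here is a small sketch of the bank-account example using Python's built-in sqlite3 module. SQLite has no SELECT FOR UPDATE, so this sketch swaps in `BEGIN IMMEDIATE`, which takes the database write lock at the start of the transaction; on PostgreSQL or MySQL the read would instead be a `SELECT ... FOR UPDATE` inside the transaction. The `account` table and the `transfer_in` helper are made up for illustration.

```python
import sqlite3


def transfer_in(conn, account_id, amount):
    """Read, modify and write the balance inside one write-locked transaction."""
    # SQLite stand-in for SELECT ... FOR UPDATE: take the write lock up front,
    # so no other connection can write between our read and our update.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (account_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no account {account_id}")
        new_balance = row[0] + amount
        conn.execute(
            "UPDATE account SET balance = ? WHERE id = ?",
            (new_balance, account_id),
        )
        conn.execute("COMMIT")
        return new_balance
    except Exception:
        conn.execute("ROLLBACK")
        raise


# isolation_level=None puts the connection in autocommit mode, so the explicit
# BEGIN/COMMIT/ROLLBACK statements above control the transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")

transfer_in(conn, 1, 50)
final_balance = transfer_in(conn, 1, 25)
```

With a file-backed database and several connections, a second `BEGIN IMMEDIATE` would wait for the first transaction to finish (up to the connection's busy timeout), which is the "forced to wait" behaviour described above. The same shape works for the ETag case: read the row, compare the ETag, then write, all before the commit.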

Upvotes: 2

Evan

Reputation: 457

Etag can be implemented in many ways, not just last updated time. If you choose to implement the Etag purely based on last updated time, then why not just use the Last-Modified header?

If you were to encode more information into the Etag about the underlying resource, you wouldn't be susceptible to the race condition that you've outlined above.

The only fool-proof solution I can think of is to also make the database perform the check, in the update query for example. Am I missing something?

That's your answer.


Another option would be to add a version to each of your resources which is incremented on each successful update. When updating a resource, specify both the ID and the version in the WHERE clause, and set version = version + 1 in the same statement. If the resource has been updated since the client last read it, the UPDATE matches no rows, and you can treat an affected-row count of zero as a failed precondition. This eliminates the need for explicit locking.
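A minimal sketch of the version-column approach, again with Python's built-in sqlite3 module. The table layout and the `update_with_version` helper are assumptions for illustration; with an ORM you would look for its optimistic-locking or version-column feature instead of writing the SQL by hand.

```python
import sqlite3


def update_with_version(conn, resource_id, expected_version, new_body):
    """Return True if the update won, False if the version was stale or the row is gone."""
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "UPDATE resource SET body = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_body, resource_id, expected_version),
        )
    # Zero affected rows means someone else bumped the version first.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE resource (id INTEGER PRIMARY KEY, body TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO resource VALUES (1, 'original', 1)")
conn.commit()

# Both clients read version 1, then try to write.
won = update_with_version(conn, 1, 1, "edit from client A")
lost = update_with_version(conn, 1, 1, "edit from client B")
current = conn.execute("SELECT body, version FROM resource WHERE id = 1").fetchone()
```

Only the first write goes through; the second sees zero affected rows and can be answered with 412 Precondition Failed. The version number can also be served directly as the ETag value, so the client's If-Match header maps straight onto `expected_version`.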

Upvotes: 1
