Reputation: 2187
Having distributed serverless application, based on AWS Aurora Serverless MySQL 5.6 and multiple Lambda functions. Some of Lambdas represent writing threads, another are reading treads. For denoting most important details, lets suppose that there is only one table with following structure:
id: bigint primary key autoincrement
key1: varchar(700)
key2: bigint
content: blob
unique(key1, key2)
Writing threads perform INSERTs in following manner: every writing thread generates one entry with key1+key2+content
, where key1+key2
pair is unique, and id
is generating automatically by autoincrement. Some writing threads can fail by DUPLICATE KEY ERROR
, if key1+key2
will have repeating value, but that does not matter and okay.
There also some reading threads, which are polling table and tries to process new inserted entries. Goal of reading thread is retrieve all new entries and process them some way. Amount of reading threads is uncontrolled and they does not communicate with each other and does not write anything in table above, but can write some state in custom table.
Firstly it's seems that polling is very simple - it's enough to reading process to store last id
that has been processed, and continue polling from it, e.g. SELECT * FROM table WHERE id > ${lastId}
. Approach above works well on small load, but does not work with high load by obvious reason: there are some amount of inserting entries, which have not yet appeared in the database, because cluster had not been synchronized at this point.
Let's see what happens in cluster point of view, event if it consists of only two servers A and B.
1) Server A accepts write transaction with entry insertion and acquired autoincrement number 100500
2) Server B accepts write transaction with entry insertion and acquired autoincrement number 100501
3) Server B commits write transaction
4) Server B accepts read transaction, and returns entries with id > 100499
, which is only 100501
entry.
5) Server A commits write transaction.
6) Reading thread receives only 100501
entry and moves lastId
cursor to 100501
. Entry 100500
is lost for current reading thread forever.
QUESTION: Is there way to solve problem above WITHOUT hard-lock tables on all cluster, in some lock-less aware way or something similar?
Upvotes: 1
Views: 705
Reputation: 7620
The issue here is that the local state in each lambda (thread) does not reflect the global state of said table.
As a first call I would try to always consult the table what is the latest ID before reading the entry with that ID.
Have a look at built in function LAST_INSERT_ID()
in MySQL.
[...] the most recently generated ID is maintained in the server on a per-connection basis
Your lambda could be creating connections prior to handler function / method which would make them longer living (it's a known trick, but it's not bomb proof here), but I think new simultaneously executing lambda function would be given a new connection, in which case the above solution would fall apart.
Luckily what you have to do then is to wrap all WRITES and all READS in transactions so that additional coordination will take place when reading and writing simultaneously to the same table.
In your quest you might come across transaction isolation levels and SEERIALIZEABLE would be safest and least perfomant, but apparently AWS Aurora does not support it (I had not verified that statement).
HTH
Upvotes: 2