Reputation: 2491
I am reading the concepts of elasticsearch-7.4 and I got to know about two fields, _seq_no and _version.
As per the documentation:
Version
Returns a version for each search hit.
Sequence Numbers and Primary Term
Returns the sequence number and primary term of the last modification to each search hit.
But this does not make clear when the two will be the same or different for a document.
I created an index test
PUT /test/_doc/_mapping
{
"properties": {
"total_price" : {
"type": "integer"
},
"final_price": {
"type": "integer"
},
"base_price": {
"enabled": false
}
}
}
I am updating the full document using the PUT API.
PUT /test/_doc/2
{
"total_price": 10,
"final_price": 10,
"base_price": 10
}
Both _seq_no and _version are increasing in this case.
On doing partial updates using the UPDATE API,
POST /test/_doc/2/_update
{
"doc" : {
"base_price" : 10000
}
}
Both _seq_no and _version are increasing in this case too.
So, I am unable to find the case when only one field is changing but the other is not.
When will both the fields be different?
Upvotes: 10
Views: 11819
Reputation: 1029
Elasticsearch documents are immutable. This means that whenever you update a document, a new version of that document is created, regardless of whether you use PUT (replacing the entire document) or POST (updating some parts of the document).
Each newly created document is given an incremented version, which is identified by the _version field:
{
"_index": "movies",
"_type": "_doc",
"_id": "109487",
"_version": 14,
"result": "updated",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 17,
"_primary_term": 7
}
Imagine that you have a blog website, and two users request the same blog post (id 1) at the same time: GET https://myblog.com/posts/1
Back to Elasticsearch: the post document has a field named view_count, which stores the total number of views (how many times the post was viewed).
To increment view_count, you first send a GET request to read the current value:
GET /posts/_doc/1
{
"_index": "movies",
"_type": "_doc",
"_id": "109487",
"_version": 12,
"_seq_no": 15,
"_primary_term": 7,
"found": true,
"_source": {
"post": "Lorem ipsum ...",
"title": "My title",
"published_at": "2020-01-01",
"view_count": 10
}
}
Then you update the view_count of post id 1 by incrementing the value returned by the GET by 1:
POST /posts/_update/1
{
"doc": {
"view_count": 11
}
}
There is a problem here.
Since both users hit the same post page at the same time, both of them read the value 10.
Each of them therefore writes 11, so the value 11 is stored. That is incorrect: the document was updated twice (remember, 2 users hit the post at the same time), so the value should be 12.
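The lost update described above can be reproduced without Elasticsearch at all. The following is only a minimal sketch: the plain dictionary `store` is a hypothetical stand-in for the stored post document, and the two reads and writes are interleaved by hand instead of coming from real concurrent requests.

```python
# Hypothetical stand-in for the stored post document (not an Elasticsearch call).
store = {"view_count": 10}

# Both users read before either one writes.
read_by_user_a = store["view_count"]
read_by_user_b = store["view_count"]

# Each one writes back "what I read + 1", unaware of the other write.
store["view_count"] = read_by_user_a + 1
store["view_count"] = read_by_user_b + 1

print(store["view_count"])  # 11, although two views happened
```

The second write silently overwrites the first one, which is exactly the situation the two simultaneous page views create.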
So, how do we solve this issue?
Fortunately, Elasticsearch uses something called Optimistic concurrency control (OCC) (Optimistic concurrency control - Wikipedia).
To ensure that we only update the document if it has not changed since we read it, we send if_primary_term along with if_seq_no (both values are taken from the GET response):
POST /posts/_update/1?if_primary_term=7&if_seq_no=15
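That’s it. If another request has modified the document in the meantime, the values no longer match and Elasticsearch rejects the update with a 409 version conflict, so the client can read the document again and retry.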
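To make the read, conditional write, and retry cycle concrete, here is a small self-contained simulation. It is a sketch under stated assumptions, not Elasticsearch client code: `ToyIndex`, `VersionConflict` and `increment_views` are hypothetical names, and the toy index holds a single document, so its sequence number only ever counts writes to that document.

```python
class VersionConflict(Exception):
    """Stands in for the HTTP 409 version conflict response."""


class ToyIndex:
    """Hypothetical single-document index; not a real client class."""

    def __init__(self, view_count):
        self.source = {"view_count": view_count}
        self.seq_no = 0
        self.primary_term = 1

    def get(self):
        # Return a copy of the source plus the metadata needed for the check.
        return dict(self.source), self.seq_no, self.primary_term

    def update(self, doc, if_seq_no, if_primary_term):
        # Reject the write if the document changed since the caller read it.
        if (if_seq_no, if_primary_term) != (self.seq_no, self.primary_term):
            raise VersionConflict()
        self.source.update(doc)
        self.seq_no += 1


def increment_views(index, stale_read=None, max_retries=3):
    # stale_read lets the demo inject a read that happened earlier.
    read = stale_read or index.get()
    for _ in range(max_retries):
        source, seq_no, term = read
        try:
            index.update({"view_count": source["view_count"] + 1}, seq_no, term)
            return
        except VersionConflict:
            read = index.get()  # re-read the fresh state and try again
    raise RuntimeError("gave up after repeated conflicts")


posts = ToyIndex(view_count=10)
read_a = posts.get()
read_b = posts.get()            # both users read view_count == 10
increment_views(posts, read_a)  # succeeds on the first attempt
increment_views(posts, read_b)  # first attempt conflicts, the retry succeeds
print(posts.source["view_count"])  # 12
```

The second writer is forced to re-read before writing, so neither increment is lost.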
Upvotes: 4
Reputation: 217434
Sequence numbers were introduced in ES 6.0.0. Just before that release came out, they were very well explained in this blog article.
But in summary,
_version is a sequential number that counts the number of times a document was updated
_seq_no is a sequential number that counts the number of operations that happened on the index (more precisely, on the shard that holds the document)
So if you create a second document, you'll see that _version and _seq_no will be different.
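The difference between the two counters can be sketched with a toy model. This is an illustration only, not how Elasticsearch is implemented: `ToyShard` is a hypothetical class that keeps one counter per document (playing the role of _version) and one counter shared by the whole shard (playing the role of _seq_no).

```python
from dataclasses import dataclass, field


@dataclass
class ToyShard:
    """Hypothetical model of one shard; not Elasticsearch code."""
    next_seq_no: int = 0
    docs: dict = field(default_factory=dict)  # doc id -> (version, seq_no)

    def index(self, doc_id):
        # The version starts at 1 for a new document and grows with each write to it.
        version = self.docs.get(doc_id, (0, None))[0] + 1
        # The sequence number is shared by every write on the shard.
        seq_no = self.next_seq_no
        self.next_seq_no += 1
        self.docs[doc_id] = (version, seq_no)
        return version, seq_no


shard = ToyShard()
history = [
    shard.index("a"),  # create a
    shard.index("b"),  # create b
    shard.index("a"),  # update a
    shard.index("c"),  # create c
]
for version, seq_no in history:
    print(f"_version={version} _seq_no={seq_no}")
```

In this model the two numbers only move in lockstep while a single document is being written, which is why updating one document repeatedly makes both appear to increase together.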
Let's create three documents:
POST test/_doc/_bulk
{"index": {}}
{"test": 1}
{"index": {}}
{"test": 2}
{"index": {}}
{"test": 3}
In the response, you'll get the payload below.
{
"took" : 166,
"errors" : false,
"items" : [
{
"index" : {
"_index" : "test",
"_type" : "_doc",
"_id" : "d2zbSW4BJvP7VWZfYMwQ",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1,
"status" : 201
}
},
{
"index" : {
"_index" : "test",
"_type" : "_doc",
"_id" : "eGzbSW4BJvP7VWZfYMwQ",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 1,
"_primary_term" : 1,
"status" : 201
}
},
{
"index" : {
"_index" : "test",
"_type" : "_doc",
"_id" : "eWzbSW4BJvP7VWZfYMwQ",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1,
"status" : 201
}
}
]
}
As you can see:
_seq_no is 0 (first index operation)
_seq_no is 1 (second index operation)
_seq_no is 2 (third index operation)
Upvotes: 20