Reputation: 101
I'm trying to get statistics/counts on indices in my Elasticsearch cluster (1.2.1). I was using the Indices Stats API (_stats endpoint) to get the total number of primary documents and their size on disk. However, I started experimenting with the Count API (_count endpoint) and noticed that the values do not match.
What is the difference between these values? It's not entirely clear from the documentation, though one clue there indicates that the value returned from Indices Stats can change when the index is refreshed. This makes me wonder if it is a lower-level value from the Lucene layer.
Indices Stats API
localhost:9200/my_index/_stats
...snip...
"_all" : {
  "primaries" : {
    "docs" : {
      "count" : 8284,
      "deleted" : 87
    }
  }
}
...snip...
Count API
localhost:9200/my_index/_count
{
  "count" : 6854,
  "_shards" : {
    "total" : 40,
    "successful" : 40,
    "failed" : 0
  }
}
Upvotes: 10
Views: 4716
Reputation: 217454
Actually, the docs.count you get back from the Indices Stats API also includes the nested documents present in the index, so it will always be greater than or equal to the count you get back from the Count API, which only returns the count of top-level documents, i.e. documents that would be returned from a search query.
So, judging by the numbers you posted, it looks like your index contains documents with fields whose type is nested in the mapping. Sounds correct?
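To illustrate why the two numbers diverge, here is a minimal Python sketch of the counting model described above. It does not talk to Elasticsearch; it only assumes that each object under a nested field is stored as its own hidden Lucene document alongside the top-level one. The function name, the sample documents, and the "comments" field are hypothetical, and only a single level of nesting is handled.

```python
def count_docs(docs, nested_fields):
    """Return (top_level_count, lucene_count) for a list of source documents.

    Simplified model: every object found under a field listed in
    nested_fields counts as one extra hidden document.
    """
    top_level = len(docs)
    nested = 0
    for doc in docs:
        for field in nested_fields:
            value = doc.get(field, [])
            # A single nested object is treated like a one-element list.
            if isinstance(value, dict):
                value = [value]
            nested += len(value)
    return top_level, top_level + nested


# Hypothetical sample data: "comments" is assumed to be mapped as nested.
docs = [
    {"title": "a", "comments": [{"text": "x"}, {"text": "y"}]},
    {"title": "b", "comments": [{"text": "z"}]},
    {"title": "c"},
]

top_level, lucene_total = count_docs(docs, ["comments"])
print(top_level)     # what _count would roughly correspond to
print(lucene_total)  # what docs.count in _stats would roughly correspond to
```

Under this model, the numbers in the question would imply 8284 - 6854 = 1430 hidden nested documents, provided nested fields are in fact the cause of the gap.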
Upvotes: 28