Puneet Pant

Reputation: 1048

How to get total number of documents in Marklogic database?

I have around 20 lakh (2 million) documents in a MarkLogic database. My search application needs the total number of matching documents for pagination. To get the total I am using

xdmp:estimate(cts:search(doc(), $query))

where $query is built from several queries combined in a cts:and-query. But I am not getting the correct total. When $query is empty, the estimate is much higher than the total number of documents in the database. When I use

xdmp:estimate(doc())

I get the correct total, but it is a static total that does not change with the query. I want the total for the results of a particular query, which is why I pass $query as an argument, but then the total is wrong. fn:count() returns the correct total, but with around 20 lakh documents it is not usable, because fn:count() is much slower than xdmp:estimate().

How can I get the correct total number of documents matching the search term entered by the user?

Upvotes: 2

Views: 2647

Answers (2)

Clark Richey

Reputation: 392

I don't understand your question. Do you want the total number of documents in the database OR the total number of documents matching your search?

xdmp:estimate is the right way to go, but it is only an ESTIMATE: it uses only the indexes to produce a count. If the query can be completely resolved from the indexes, the estimate will be 100% correct. If it can't (that is, the query requires filtering), the estimate will be off by some amount. Compare fn:count(cts:search(doc(), $query)) to xdmp:estimate(cts:search(doc(), $query)). If the results are significantly different for a given query, then you either need to turn on additional indexing to support that query or live with the difference.
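As a rough sketch of that comparison, something like the following could be run in Query Console. The cts:word-query and the word "example" are placeholders I made up, so substitute your real $query. The "unfiltered" option on cts:search is included so you can see the index-only result set next to the filtered one.

```xquery
xquery version "1.0-ml";

(: Hypothetical query for illustration only; use your own cts:and-query here. :)
let $query := cts:word-query("example")
return (
  (: Index-only count: fast, counts matching fragments. :)
  xdmp:estimate(cts:search(doc(), $query)),

  (: Count of the unfiltered result set: index resolution only, no filtering. :)
  fn:count(cts:search(doc(), $query, "unfiltered")),

  (: Count of the filtered result set: accurate, but has to touch every candidate. :)
  fn:count(cts:search(doc(), $query))
)
```

If the first and last numbers differ, the query is not fully resolvable from the indexes you currently have enabled. Run this on a small sample first, since the filtered count is the slow one.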

Upvotes: 1

mblakele

Reputation: 7840

To understand what is going on here, start by reading the architecture whitepaper from http://resources.marklogic.com/library/media/inside-marklogic

Now try this test case:

xdmp:estimate(doc()),
xdmp:estimate(cts:search(doc(), ()))

The first expression will count the number of documents in the database. The second expression will count the number of document fragments in the database. So if the results are different, you probably have fragment roots or fragment parents configured. Some special documents also create extra fragments: I think spelling dictionaries and thesaurus documents do this. That would explain an estimate with an empty $query coming out higher than the document count.

If you want to restrict the estimate to XML document roots, specify the document root QName(s) in the searchable expression, or use /* if you do not care about the root element name.

xdmp:estimate(cts:search(/*, ()))

You could also use the cts:query argument to specify a QName that only appears in the documents you want to count.
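Here is a minimal sketch of how both suggestions might be combined with a user query for pagination. The query, the page variables, and the element name "article" are all assumptions for illustration and are not taken from the question. Note that /* only matches documents with an element root, so text and binary documents are not counted.

```xquery
xquery version "1.0-ml";

(: Hypothetical inputs; replace with your own query and paging parameters. :)
let $query := cts:word-query("example")
let $start := 1
let $page-size := 10

(: Option 1: restrict the searchable expression to document root elements. :)
let $total := xdmp:estimate(cts:search(/*, $query))

(: Option 2: require an element that only appears in the documents you want.
   "article" is an assumed root element name. cts:and-query(()) matches everything,
   so the element-query only tests for the presence of the element. :)
let $total-by-qname :=
  xdmp:estimate(
    cts:search(doc(),
      cts:and-query((
        cts:element-query(xs:QName("article"), cts:and-query(())),
        $query
      ))))

return (
  $total,
  $total-by-qname,
  (: One page of results, using the same searchable expression as the estimate. :)
  cts:search(/*, $query)[$start to $start + $page-size - 1]
)
```

Using the same searchable expression for the estimate and for the page of results keeps the total consistent with what the user can actually page through. The estimate is still index-only, so it can differ from a filtered count if the query needs filtering.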

Upvotes: 10
