bosari

Reputation: 2000

Sum of 3 queries does not equal the total doc count in a database?

I have 3 scenarios:

  1. Get the count of all docs that contain both element xyz and element abc. Here I also need the counts broken down by value: for example, the count of docs where xyz has the value lala and abc has the value lili, and so on for every combination of xyz and abc values. Both elements must exist in the doc.
  2. Get the count of all docs that contain element xyz but not element abc. Here I need the count for each possible value of xyz.
  3. Get the count of docs that do not contain element xyz.

Added together, these 3 counts should equal the total doc count of the database.

Note: The database is huge, so the query has to be fast. I can compromise slightly on precision. I need to avoid wildcard searches. Please help.

    xdmp:estimate(cts:search(fn:doc(), cts:and-query((
      cts:element-query(xs:QName("meta:xyz"), cts:true-query()),
      cts:element-query(xs:QName("meta:abc"), cts:true-query())
    ))))

This returns a different result than the sum I get when I take every value combination from cts:value-tuples and pass them one by one to:

let $x := local:get-doc-count-for-localname-source(
  cts:value-tuples((
    cts:element-reference(xs:QName("meta:xyz")),
    cts:element-reference(xs:QName("meta:abc"))
  ))
)
let $y := fn:sum(($x))
return xdmp:estimate(cts:search(fn:doc(), cts:and-query((
  cts:element-query(xs:QName("meta:xyz"), cts:true-query()),
  cts:element-query(xs:QName("meta:abc"), cts:true-query())
))))

Upvotes: 1

Views: 94

Answers (1)

ehennum

Reputation: 7335

In general, the universal index can yield fast estimates: pass a query to cts.estimate() in Server-Side JavaScript, or wrap a cts:search() expression in xdmp:estimate() in XQuery.

An estimate of documents containing both FIRST_ELEMENT and SECOND_ELEMENT:

cts.estimate(
  cts.andQuery([
    cts.elementQuery('FIRST_ELEMENT', cts.trueQuery()),
    cts.elementQuery('SECOND_ELEMENT', cts.trueQuery())
  ]))

An estimate of documents containing FIRST_ELEMENT but not SECOND_ELEMENT:

cts.estimate(
  cts.andQuery([
    cts.elementQuery('FIRST_ELEMENT', cts.trueQuery()),
    cts.notQuery(
      cts.elementQuery('SECOND_ELEMENT', cts.trueQuery())
    )
  ]))

An estimate of documents not containing FIRST_ELEMENT:

cts.estimate(
  cts.notQuery(
    cts.elementQuery('FIRST_ELEMENT', cts.trueQuery())
  ))

An estimate of all documents in the database:

cts.estimate(
  cts.trueQuery()
  )
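To check whether the three counts really add up to the total, the estimates above can be collected in one place. This is only a minimal sketch: the function name partitionEstimates and the idea of passing cts in as a parameter are my own assumptions (the parameter just makes the function easy to try with a stand-in object), and it assumes cts.estimate() returns a plain number. Estimates come from the indexes without filtering, so a small nonzero difference is possible.

```javascript
// Sketch only: collect the three partition estimates and the total.
// `partitionEstimates` is a hypothetical helper name. `cts` is passed in
// as a parameter (an assumption); in MarkLogic Server-Side JavaScript,
// pass the built-in global `cts` object.
function partitionEstimates(cts, first, second) {
  const hasFirst = cts.elementQuery(first, cts.trueQuery());
  const hasSecond = cts.elementQuery(second, cts.trueQuery());

  // Documents containing both elements.
  const both = cts.estimate(cts.andQuery([hasFirst, hasSecond]));
  // Documents containing the first element but not the second.
  const firstOnly = cts.estimate(
    cts.andQuery([hasFirst, cts.notQuery(hasSecond)]));
  // Documents that do not contain the first element.
  const noFirst = cts.estimate(cts.notQuery(hasFirst));
  // All documents in the database.
  const total = cts.estimate(cts.trueQuery());

  return {
    both,
    firstOnly,
    noFirst,
    total,
    // Zero when the three partitions add up exactly to the total.
    difference: total - (both + firstOnly + noFirst)
  };
}
```

In Query Console this would be called as partitionEstimates(cts, 'FIRST_ELEMENT', 'SECOND_ELEMENT'); a difference of 0 means the three scenarios cover the database exactly.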

Hope that helps.

Upvotes: 1
