Lax

Reputation: 1

MarkLogic: find all document URIs that contain a null node for a given XPath

XQuery (MarkLogic): I am having trouble getting all document URIs for a given XPath that contains a null node. Let me know if anybody can suggest how to do it.

<Person id="1">
  <Details>
    <Contact>
      <Name>Bob</Name>
      <City>Oakland</City>
    </Contact>
    <OtherInfo>
      <Cars>
        <Car>
          <Brand>Honda<Brand>
          <Model>Accord</Model>
          <Brand/>
        </Car>
      </Cars>
    </OtherInfo>
  </Details>
</Person>


<Person id="2">
  <Details>
    <Contact>
      <Name>Chris</Name>
      <City>Buffalo</City>
    </Contact>
    <OtherInfo>
      <Cars>
        <Car/>
      </Cars>
    </OtherInfo>
  </Details>
</Person>

I am looking to find all docs that do not have any element value for Car, that is, those where Car is an empty node.

For the XPath /Person/Details/OtherInfo/Cars/Car, this should return only the doc with id = 2.

Upvotes: 0

Views: 1416

Answers (2)

mholstege

Reputation: 4912

If a car that is present always has child elements, something like this:

/Person/Details/OtherInfo/Cars/Car[empty(*)]
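
To get document URIs rather than the Car elements themselves, the path above can be wrapped as in the following minimal sketch. It assumes the documents live in a MarkLogic database and uses xdmp:node-uri to read the URI of the document that contains each matching node; fn:distinct-values is only there in case one document holds several empty Car elements.

```xquery
xquery version "1.0-ml";

(: Collect the URI of every document with at least one Car
   element that has no child elements. :)
fn:distinct-values(
  for $car in /Person/Details/OtherInfo/Cars/Car[fn:empty(*)]
  return xdmp:node-uri($car)
)
```

On a large database this is a filtered path expression, so it may be slow.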

Upvotes: 3

DALDEI

Reputation: 3732

In XML (and MarkLogic) there is no such thing as 'null' or a 'null node'. This can be pedantic, or it can be problematic, depending on what you assume 'null' or 'null node' actually means.

A few possibilities:

- Car element does not exist
- Car element exists but has no text content nodes
- Car element exists and has only whitespace content
- Car element exists and has a schema defining it to be Simple Content and has only ignorable whitespace content
- Car element is schema-defined to not allow child nodes
- Car element is explicitly annotated with an xsi:nil attribute and is schema validated
- Car element exists but has element content (or other markup like PI)
- Car element exists but has only attribute content
- Car element exists but has no node content whatsoever (totally empty)
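
Several of these meanings map onto different XPath predicates. The sketch below is only an illustration: the in-memory sample elements, the `check` result element, and its attribute names are made up for the example. Nothing here is schema validated, so the xsi:nil test is a plain attribute comparison and the schema-dependent cases are not covered.

```xquery
xquery version "1.0-ml";
declare namespace xsi = "http://www.w3.org/2001/XMLSchema-instance";

(: Hypothetical in-memory samples, one per interpretation. :)
let $samples := (
  <Cars/>,                                      (: no Car at all :)
  <Cars><Car/></Cars>,                          (: totally empty :)
  <Cars><Car>{"   "}</Car></Cars>,              (: whitespace-only text :)
  <Cars><Car xsi:nil="true"/></Cars>,           (: xsi:nil attribute, not validated :)
  <Cars><Car id="7"/></Cars>,                   (: attribute content only :)
  <Cars><Car><Brand>Honda</Brand></Car></Cars>  (: element content :)
)
for $cars at $i in $samples
let $car := $cars/Car
return
  <check sample="{$i}"
         missing="{fn:empty($car)}"
         no-text="{fn:exists($car[fn:empty(text())])}"
         whitespace-only="{fn:exists($car[text()][fn:empty(*)][fn:normalize-space(.) eq ''])}"
         nil="{fn:exists($car[@xsi:nil eq 'true'])}"
         attributes-only="{fn:exists($car[@*][fn:empty(node())])}"
         totally-empty="{fn:exists($car[fn:empty(node()) and fn:empty(@*)])}"/>
```

Running it in Query Console shows which predicate fires for which sample, which helps in deciding what 'null' should mean for your data.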

Your XML sample is malformed, but I presume you mean an empty Car element, which would imply the last meaning as the likely one (exists but is empty).

Checking for the non-existence of something is not easy to do efficiently. It is not explicitly indexed but only implied by the absence of matches, which can be inefficient to search for with pure XPath if your database has many documents.

A pure XPath for the expression might be:

doc()[ /Person/Details/OtherInfo/Cars/Car[ empty(node()) ] ]/fn:document-uri(.)

I suggest using a cts:query instead, since it is more likely to be optimized. For example, assuming Car can only occur as a child of Cars:

  cts:search(doc(),
    cts:element-value-query(
      xs:QName("Car"),
      "") )/fn:document-uri(.)

This queries for all Car elements with the text value of "", which is the text value of a simple element with no child nodes.
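
One way to check this behaviour on your own MarkLogic version without loading any documents is cts:contains, which evaluates a cts:query against in-memory nodes. The two sample elements below are hypothetical; the sketch returns one boolean per sample, so you can see whether the empty-string value query matches the empty Car and skips the populated one.

```xquery
xquery version "1.0-ml";

(: Same query as in the answer: Car elements whose value is "". :)
let $query := cts:element-value-query(xs:QName("Car"), "")

(: Hypothetical samples: an empty Car and a Car with element content. :)
let $samples := (
  <Cars><Car/></Cars>,
  <Cars><Car><Brand>Honda</Brand></Car></Cars>
)
for $sample in $samples
return cts:contains($sample, $query)
```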

Depending on whether you have a schema, and on your index and database settings, you might be able to run an unfiltered query, which is faster:

  cts:search(doc(),
    cts:element-value-query(
      xs:QName("Car"),
      ""), "unfiltered" )/fn:document-uri(.)

But you need to validate whether your configuration and data produce accurate unfiltered results. You can check that on a sample set of data by comparing fn:count() and xdmp:estimate() to see if they match, but that is not a guarantee that results will stay accurate for data added later. To be sure, you need to study the docs on "filtered vs unfiltered searches" or stick to filtered (default) searches.
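
A minimal sketch of that comparison follows, assuming the same Car query as above. fn:count over the default (filtered) cts:search gives the exact number of matching documents, while xdmp:estimate answers from the indexes alone, which is what an unfiltered search would return. The `comparison` element and its attribute names are made up for the example.

```xquery
xquery version "1.0-ml";

let $query := cts:element-value-query(xs:QName("Car"), "")

(: Exact count: the filtered search verifies every candidate document. :)
let $filtered := fn:count(cts:search(fn:doc(), $query))

(: Index-only count: no filtering, so it can over-count. :)
let $estimated := xdmp:estimate(cts:search(fn:doc(), $query))

return
  <comparison filtered="{$filtered}"
              estimated="{$estimated}"
              match="{$filtered eq $estimated}"/>
```

If the two numbers differ on your sample data, unfiltered results are not accurate for this query under the current index settings.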

Upvotes: 1
