Ronald
Ronald

Reputation: 207

Query in Marklogic DB to filter documents for the last 24 hours

Is there a way to filter out documents in marklogic based on a date field?

Something like:

select * from data-hub-STAGING where load_date > '2022-01-12'

Upvotes: 1

Views: 220

Answers (1)

Mads Hansen
Mads Hansen

Reputation: 66781

It is possible to issue such a query, as long as there is a date or dateTime being tracked and indexed.

For any MarkLogic database, there is a database option to enable "maintain last modified", which will set and maintain a document property prop:last-modified that can be indexed and range-queries applied.

Your example table was "data-hub-STAGING". If you are using a MarkLogic Data Hub, the datahubCreatedOn field is available with a dateTime range index.

https://docs.marklogic.com/datahub/5.6/flows/about-flows.html#about-flows__pre_ay4_frh_ypb

For every content object outputted by a Data Hub step, regardless of the step type, Data Hub will add the following document metadata keys and values to the document wrapped by the content object:

  • datahubCreatedOn = the date and time at which the document is written

Of course, any other date/dateTime element, JSON field, or document property can also be indexed and then used to filter docs as well.

Then the appropriate range query can be applied to a search in JavaScript, XQuery, Optic, SQL, etc.

For example, in JavaScript searching by field datahubCreatedOn with cts.fieldRangeQuery():

cts.search(cts.fieldRangeQuery("datahubCreatedOn", ">", 
  new Date(Date.now() - 86400 * 1000).toISOString()))

and in XQuery searching by prop:last-modified with cts:element-range-query():

cts:search(doc(), cts:element-range-query(xs:QName("prop:last-modified"), ">",
 current-dateTime() + xs:dayTimeDuration("-PT24H")))

You should even be able to execute the SQL in your question, if you were to create a SQL Template View.

Upvotes: 1

Related Questions