Dharmendra Kumar Singh

Reputation: 195

CORB is giving error for PRE-BATCH-MODULE

I am running a CoRB job, but it fails with an error about the $URI external variable. Below are the CoRB properties and the code for each module.

THREAD-COUNT=4
URIS-MODULE=get-uri.xqy
PROCESS-MODULE=report.xqy
PROCESS-TASK=com.marklogic.developer.corb.ExportBatchToFileTask
EXPORT-FILE-NAME=report.xml
PRE-BATCH-MODULE=preProces.xqy
PRE-BATCH-TASK=com.marklogic.developer.corb.PreBatchUpdateFileTask

get-uri.xqy code:

let $uris := cts:uris((), (), cts:collection-query("InvoiceHistory"))
return (count($uris), $uris)

preProces.xqy code:

declare variable $URI as xs:string external;

(: Retrieve all relevant records from the current document :)
let $records := fn:doc($URI)/records

(: Group records by creation_date and calculate the sum of doc_count for each group :)
let $grouped-records :=
                  for $date in distinct-values($records//document/@creation_date)
                  let $total := sum($records/document[@creation_date = $date]/@doc_count/xs:integer(.))
                  return <group date="{$date}" total-docs="{$total}"/>

  (: Serialize the grouped records as XML and store in a temporary collection :)
 let $temp-doc :=
               <results>{ $grouped-records }</results>

 return xdmp:document-insert("/temp/preprocessed.xml", $temp-doc)

report.xqy code:

declare namespace fn = "http://www.w3.org/2005/xpath-functions";
declare variable $URI as xs:string external;

(: Retrieve the preprocessed data :)
let $preprocessed := doc("/temp/preprocessed.xml")/results

(: Generate the final report :)
let $final-report :=
                <results>{
                           for $group in $preprocessed/group
                           order by $group/@date
                           return <result date="{$group/@date}" total-docs="{$group/@total-docs}"/>
                  }</results>

return $final-report

CORB Error:

com.marklogic.developer.corb.CorbException: Undefined external variable at URI:
    at com.marklogic.developer.corb.AbstractTask.wrapProcessException(AbstractTask.java:426)
    at com.marklogic.developer.corb.AbstractTask.handleRequestException(AbstractTask.java:373)
    at com.marklogic.developer.corb.AbstractTask.invokeModule(AbstractTask.java:202)
    at com.marklogic.developer.corb.PreBatchUpdateFileTask.call(PreBatchUpdateFileTask.java:63)
    at com.marklogic.developer.corb.PreBatchUpdateFileTask.call(PreBatchUpdateFileTask.java:30)
    at com.marklogic.developer.corb.Manager.runPreBatchTask(Manager.java:790)
    at com.marklogic.developer.corb.Manager.populateQueue(Manager.java:857)
    at com.marklogic.developer.corb.Manager.run(Manager.java:603)
    at com.marklogic.developer.corb.Manager.main(Manager.java:140)
Caused by: com.marklogic.xcc.exceptions.XQueryException: XDMP-EXTVAR: (err:XPDY0002) declare variable $URI as xs:string external; -- Undefined external variable fn:QName("","URI")

Where is it going wrong? Can anyone please suggest a fix?

Upvotes: 1

Views: 109

Answers (1)

Mads Hansen

Reputation: 66714

The PRE-BATCH-MODULE runs before CoRB starts processing the URIs that were selected. It is not passed a $URI the way the PROCESS-MODULE is.

So preProces.xqy has no value set for its $URI external variable, and it throws the XDMP-EXTVAR error when it is executed.

It looks like preProces.xqy should actually be the PROCESS-MODULE, which is invoked once for each of the selected URIs. However, you are inserting the results at a static URI, so each invocation would overwrite the content written by the previous one (and it would be very slow, since locks on that URI would effectively make the job single-threaded).

If you are trying to generate a consolidated XML report, consider returning the XML fragment, i.e. return <results>{ $grouped-records }</results>, instead of inserting it into the database. Then set PRE-BATCH-MODULE=INLINE-XQUERY|'<results>' and POST-BATCH-MODULE=INLINE-XQUERY|'</results>' to write the opening and closing wrapper tags, and all of the content will be written to the report.xml output file by the ExportBatchToFileTask.

THREAD-COUNT=4
URIS-MODULE=get-uri.xqy
PROCESS-MODULE=preProces.xqy
PROCESS-TASK=com.marklogic.developer.corb.ExportBatchToFileTask
EXPORT-FILE-NAME=report.xml
PRE-BATCH-MODULE=INLINE-XQUERY|'<results>'
PRE-BATCH-TASK=com.marklogic.developer.corb.PreBatchUpdateFileTask
POST-BATCH-MODULE=INLINE-XQUERY|'</results>'
POST-BATCH-TASK=com.marklogic.developer.corb.PostBatchUpdateFileTask

CoRB will execute the process module and return results for each $URI that is selected by the URIS-MODULE. If you are looking to generate an aggregate result across all of the documents, then you would need to process the result XML file that is generated from this CoRB job.
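
As a rough illustration, preProces.xqy reworked as the PROCESS-MODULE might look something like the sketch below. It reuses the grouping logic from the question and simply returns the fragment instead of calling xdmp:document-insert. The version declaration is an assumption, and the outer <results> element written by the PRE-BATCH and POST-BATCH modules would enclose one of these fragments per document.

```xquery
xquery version "1.0-ml";

(: CoRB sets $URI for each URI returned by the URIS-MODULE :)
declare variable $URI as xs:string external;

(: Records in the document currently being processed :)
let $records := fn:doc($URI)/records

(: Per-document totals of doc_count, grouped by creation_date :)
let $grouped-records :=
  for $date in fn:distinct-values($records//document/@creation_date)
  let $total := fn:sum($records/document[@creation_date = $date]/@doc_count/xs:integer(.))
  return <group date="{$date}" total-docs="{$total}"/>

(: Return the fragment so ExportBatchToFileTask appends it to the export file :)
return <results>{ $grouped-records }</results>
```

Because nothing is written to the database, no locks are taken on a shared URI and the threads can run independently. The totals are still per document, so dates that appear in more than one document will show up in more than one fragment.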

However, if you created range-indexes on those two attributes then you could easily generate the aggregate report in a single query with something like this:

let $counts-by-date := 
  cts:value-co-occurrences(
    cts:element-attribute-reference(xs:QName("document"), xs:QName("creation_date")), 
    cts:element-attribute-reference(xs:QName("document"), xs:QName("doc_count")), 
    "map")
return
  <results>{
    for $date in map:keys($counts-by-date)
    return <result date="{$date}" ingested-docs="{sum(map:get($counts-by-date, $date))}"/>  
  }</results>

Upvotes: 0
