Sravan

Reputation: 21

ERROR mapreduce.MarkLogicInputFormat: com.marklogic.xcc.exceptions.XQueryException: XDMP-DOCROOTTEXT - MLCP Error Using Query Filter

I get the error below when I try to export a collection from a MarkLogic database with MLCP using the -query_filter option.

The full error output and the configuration I used for the export job are included below.

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/opt/MarkLogic/mlcp-10.0.8.2/lib/hadoop-auth-2.7.2.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
24/10/14 04:26:31 INFO contentpump.ContentPump: Job name: local_928393857_1
24/10/14 04:26:31 ERROR mapreduce.MarkLogicInputFormat: com.marklogic.xcc.exceptions.XQueryException: XDMP-DOCROOTTEXT: xdmp:unquote("collection_name_SerializedQuery.txt") -- Invalid root text "collection_name_SerializedQuery.txt" at  line 1
 [Session: user=username, cb={default} [ContentSource: user=username, cb={none} [provider: SSLconn address=hostname/x.x.x.x:****, pool=1/64]]]
 [Client: XCC/10.0-8, Server: XDBC/10.0-11]
on line 1
expr: xdmp:unquote("collection_name_SerializedQuery.txt"),
in xdmp:eval("for $f in xdmp:forest-open-replica(xdmp:database-forests(xdmp:da...")
in /MarkLogic/hadoop.xqy, on line 32
expr: xdmp:unquote("collection_name_SerializedQuery.txt"),
in hadoop:get-splits("", "fn:collection("gps-temporal")", "cts:query(xdmp:unquote('collection_name_SerializedQuery.txt')/*)")
on line 5
expr: xdmp:unquote("collection_name_SerializedQuery.txt")
24/10/14 04:26:31 ERROR mapreduce.MarkLogicInputFormat: Query: xquery version "1.0-ml";
fn:exists(xdmp:get-request-header('x-forwarded-for'));
import module namespace hadoop = "http://marklogic.com/xdmp/hadoop" at "/MarkLogic/hadoop.xqy";
xdmp:host-name(xdmp:host()),
hadoop:get-splits('', 'fn:collection("gps-temporal")',"cts:query(xdmp:unquote('collection_name_SerializedQuery.txt')/*)"),
"REDACT",0,let $repf := fn:function-lookup(xs:QName('hadoop:get-splits-with-replica'),0)
return if (exists($repf)) then $repf() else ()
,0,"AUDIT",
let $f :=
    fn:function-lookup(xs:QName('xdmp:group-get-audit-event-type-enabled'), 2)
return
    if (not(exists($f)))
    then ()
    else
        let $group-id := xdmp:group()
        let $enabled-event := $f($group-id,("mlcp-copy-export-start", "mlcp-copy-export-finish"))
        let $mlcp-start-enabled :=
                if ($enabled-event[1]) then "mlcp-copy-export-start" else ()
        let $mlcp-finish-enabled :=
                if ($enabled-event[2]) then "mlcp-copy-export-finish" else ()
        return ($mlcp-start-enabled, $mlcp-finish-enabled)
24/10/14 04:26:31 ERROR contentpump.LocalJobRunner: Error getting input splits:
24/10/14 04:26:31 ERROR contentpump.LocalJobRunner: com.marklogic.xcc.exceptions.XQueryException: XDMP-DOCROOTTEXT: xdmp:unquote("collection_name_SerializedQuery.txt") -- Invalid root text "collection_name_SerializedQuery.txt" at  line 1
 [Session: user=sc799-sa, cb={default} [ContentSource: user=sc799-sa, cb={none} [provider: SSLconn address=mlg-gpi-uat.ntrs.com/10.33.132.50:8062, pool=1/64]]]
 [Client: XCC/10.0-8, Server: XDBC/10.0-11]
on line 1
expr: xdmp:unquote("collection_name_SerializedQuery.txt"),
in xdmp:eval("for $f in xdmp:forest-open-replica(xdmp:database-forests(xdmp:da...")
in /MarkLogic/hadoop.xqy, on line 32
expr: xdmp:unquote("collection_name_SerializedQuery.txt"),
in hadoop:get-splits("", "fn:collection("gps-temporal")", "cts:query(xdmp:unquote('collection_name_SerializedQuery.txt')/*)")
on line 5
expr: xdmp:unquote("collection_name_SerializedQuery.txt")

I used the following options and query filter:

options_file:

-host
<hostname>
-port
****
-ssl
true
-username
username
-password
******
-collection_filter
collection_name
-query_filter
collection_SerializedQuery.txt
-output_file_path
<path>
-output_type
archive
-thread_count
8

query_filter file:

<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:cts="http://marklogic.com/cts"><cts:uri>/collection_name</cts:uri>
  <and-query>
    <collection-query>
      <collection>collection_name</collection>
    </collection-query>
    <json-property-scope-query>
      <property-name>metadata</property-name>
      <query>
        <json-property-value-query>
          <property-name>archivalDate</property-name>
          <value>2023-11-30</value>
          <value>2023-11-29</value>
        </json-property-value-query>
      </query>
    </json-property-scope-query>
  </and-query>
</query>

I ran the following command:

sh mlcp.sh export -options_file optionsFile.txt

Upvotes: 1

Views: 48

Answers (1)

Mads Hansen

Reputation: 66781

The error that was reported:

24/10/14 04:26:31 ERROR mapreduce.MarkLogicInputFormat: com.marklogic.xcc.exceptions.XQueryException: XDMP-DOCROOTTEXT: xdmp:unquote("collection_name_SerializedQuery.txt") -- Invalid root text "collection_name_SerializedQuery.txt" at line 1

This tells you that executing xdmp:unquote("collection_name_SerializedQuery.txt") on the string value of the filename did not produce a valid document. You can reproduce this by running that expression in Query Console.
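As a rough local analogy (this is Python's standard XML parser, not MarkLogic's xdmp:unquote, and the function name is made up for illustration), you can see the same distinction between a filename string and an XML string:

```python
import xml.etree.ElementTree as ET


def looks_like_serialized_query(value: str) -> bool:
    """Return True if the string parses as XML with a root element.

    Hypothetical helper: it only mimics the idea that the option value
    is parsed as XML text, it does not call MarkLogic.
    """
    try:
        ET.fromstring(value)
        return True
    except ET.ParseError:
        return False


# A filename is just text, so parsing it as XML fails.
print(looks_like_serialized_query("collection_name_SerializedQuery.txt"))

# An actual XML string parses fine.
print(looks_like_serialized_query('<query xmlns="http://marklogic.com/cts"/>'))
```

The first call prints False and the second prints True, which mirrors why the filename is rejected as "invalid root text".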

Your options specified the name of a file:

-query_filter
collection_SerializedQuery.txt

However, -query_filter expects the serialized query itself (the XML as the option value), not the name of a file that contains it.

It should instead be:

-query_filter
<query xmlns="http://marklogic.com/cts"><uri>/collection_name</uri><and-query><collection-query><collection>collection_name</collection></collection-query><json-property-scope-query><property-name>metadata</property-name><query><json-property-value-query><property-name>archivalDate</property-name><value>2023-11-30</value><value>2023-11-29</value></json-property-value-query></query></json-property-scope-query></and-query></query>
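If you would rather keep the query in a readable, indented file, a small script can flatten it into that one-line form before you paste it into the options file. This is only a sketch: the function names are hypothetical, it assumes each option value must sit on a single line (as in the options file shown in the question), and dropping the XML declaration is my choice to match the one-line value above.

```python
import re


def flatten_query(xml_text: str) -> str:
    """Drop the XML declaration and collapse the query onto one line."""
    # Remove a leading <?xml ...?> declaration if present.
    body = re.sub(r"^\s*<\?xml[^>]*\?>", "", xml_text)
    # Remove whitespace that sits between tags, then trim both ends.
    return re.sub(r">\s+<", "><", body).strip()


def query_filter_option(xml_text: str) -> str:
    """Build the two options-file lines: the flag, then its value."""
    return "-query_filter\n" + flatten_query(xml_text) + "\n"


# Shortened sample query, used here only to keep the example small.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<query xmlns="http://marklogic.com/cts">
  <collection-query>
    <collection>collection_name</collection>
  </collection-query>
</query>
"""

print(query_filter_option(sample), end="")
```

In practice you would read your own query file into a string and pass it to query_filter_option, then copy the two printed lines into optionsFile.txt in place of the filename.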

Refer to the documentation for examples of how this is done: https://docs.marklogic.com/11.0/guide/mlcp-guide/en/exporting-content-from-marklogic-server/controlling-what-is-exported,-copied,-or-extracted/example--exporting-documents-matching-a-query.html

Upvotes: 0
