Fetching all the document URIs in MarkLogic using the Java Client API

Question

I am trying to fetch all the documents from a database without knowing their exact URIs. I found this query:

    DocumentPage documents = docMgr.read();
    while (documents.hasNext()) {
        DocumentRecord document = documents.next();
        System.out.println(document.getUri());
    }

But I do not have specific URIs; I want all the documents.

Sam Mefford · Accepted Answer

The first step is to enable the URI lexicon on the database.

You could eval some XQuery and run cts:uris() (or server-side JS and run cts.uris()):

    ServerEvaluationCall call = client.newServerEval()
        .xquery("cts:uris()");
    for ( EvalResult result : call.eval() ) {
        String uri = result.getString();
        System.out.println(uri);
    }

This approach has two drawbacks: (1) the user needs the privileges required for server-side eval, and (2) there is no pagination.

If you have a small number of documents, you don't need pagination. But for a large number of documents pagination is recommended. Here's some code using the search API and pagination:

    // do the next eight lines just once
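    String options =
        "<options xmlns='http://marklogic.com/appservices/search'>" +
        "  <values name='uris'>" +
        "    <uri/>" +
        "  </values>" +
        "</options>";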
    QueryOptionsManager optionsMgr = client.newServerConfigManager().newQueryOptionsManager();
    optionsMgr.writeOptions("uriOptions", new StringHandle(options));

    // run the following each time you need to list all uris
    QueryManager queryMgr = client.newQueryManager();
    long pageLength = 10000;
    queryMgr.setPageLength(pageLength);
    ValuesDefinition query = queryMgr.newValuesDefinition("uris", "uriOptions");
    // the following "and" query just matches all documents
    query.setQueryDefinition(new StructuredQueryBuilder().and());
    int start = 1;
    boolean hasMore = true;
    Transaction transaction = client.openTransaction();
    try {
        while ( hasMore ) {
            CountedDistinctValue[] uriValues =
                queryMgr.values(query, new ValuesHandle(), start, transaction).getValues();
            for (CountedDistinctValue uriValue : uriValues) {
                String uri = uriValue.get("string", String.class);
                //System.out.println(uri);
            }
            start += uriValues.length;
            // this is the last page if uriValues is smaller than pageLength
            hasMore = uriValues.length == pageLength;
        }
    } finally {
        transaction.commit();
    }

The transaction is only necessary if you need a guaranteed "snapshot" list isolated from adds/deletes happening concurrently with this process. Since it adds some overhead, feel free to remove it if you don't need such exactness.
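
The stopping rule in the loop above (advance `start` by the number of values returned, and stop at the first page shorter than `pageLength`) can be tried in isolation. The following is only a minimal sketch using plain Java collections, not the MarkLogic API: `PageSource`, `PagedListing`, `collectAll`, and `inMemory` are hypothetical names invented for this illustration, and the in-memory source merely stands in for the `queryMgr.values(...)` call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the values call: returns at most pageLength
// items beginning at the 1-based position `start`.
interface PageSource {
    List<String> fetch(long start, long pageLength);
}

class PagedListing {
    // Number of fetch calls made by the most recent collectAll() run.
    static int requests;

    // Same loop shape as the answer: advance `start` by the size of the
    // page just read and stop at the first page shorter than pageLength.
    static List<String> collectAll(PageSource source, long pageLength) {
        List<String> all = new ArrayList<>();
        long start = 1;
        boolean hasMore = true;
        requests = 0;
        while (hasMore) {
            List<String> page = source.fetch(start, pageLength);
            requests++;
            all.addAll(page);
            start += page.size();
            hasMore = page.size() == pageLength;
        }
        return all;
    }

    // In-memory source over a fixed list, for demonstration only.
    static PageSource inMemory(List<String> uris) {
        return (start, pageLength) -> {
            int from = (int) Math.min(start - 1, uris.size());
            int to = (int) Math.min(from + pageLength, uris.size());
            return new ArrayList<>(uris.subList(from, to));
        };
    }

    public static void main(String[] args) {
        List<String> uris = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            uris.add("/doc/" + i + ".xml");
        }
        List<String> result = collectAll(inMemory(uris), 10);
        System.out.println(result.size() + " URIs in " + requests + " requests");
    }
}
```

With 25 items and a page length of 10, this sketch reads pages of 10, 10, and 5 and then stops. When the total is an exact multiple of the page length, the same rule makes one extra request that comes back empty, which is a small cost of not needing a separate count up front.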
