Praneet Sinha
Praneet Sinha

Reputation: 31

Ingest Data in batches to RDBMS from Marklogic with DMSDK

I have a requirement to insert a large amount of data from MarkLogic to RDBMS using DMSDK

Below is my code sample

ArrayList<ArrayList<String>> batch = new ArrayList<ArrayList<String>>();
DatabaseClient client = DatabaseClientFactory.newClient(config.getmlHost(), config.getmlPort(), new DatabaseClientFactory.BasicAuthContext(dbConfig.getuser(), dbConfig.getpassword()));
QueryManager queryMgr = client.newQueryManager();
StructuredQueryBuilder sb = queryMgr.newStructuredQueryBuilder();
StructuredQueryDefinition criteria = sb.and(sb.collection("collection1"),sb.collection("collection2"))
DataMovementManager dmm = client.newDataMovementManager();
QueryBatcher batcher = dmm.newQueryBatcher(criteria)
        .withBatchSize(10)
        .withThreadCount(12)
        .onUrisReady(
                        new ExportListener()
                        .onDocumentReady(doc -> {
                    logger.info("URI received : " + doc.getUri());
                    try {
                        //Getting data From xml and adding it into a arraylist for batch creation
                        ArrayList<String> getDataXml = new GetDataXml().GetDatafromXml(doc.getContent(new DOMHandle()),
                                dbuilder, xPath, ColumnNames);
                        batch.add(getDataXml);

                    } catch (Exception e) {
                        logger.error("Error in the Code", e);
                    }
                })).onQueryFailure(exception -> {
                    logger.error(exception);
                });
        dmm.startJob(batcher);
        batcher.awaitCompletion();
        dmm.stopJob(batcher);
        Class.forName("Driver Name");

        //connecting to RDBMS
        Connection conn = DriverManager.getConnection(DB_URL, USER, PASS)
        PreparedStatement pstmt = conn.prepareStatement("INSERT INTO DBNAME VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?)");

        //Creating Batches PreparedStatement.addBatch()
        for(ArrayList<String> eachObject : batch) {             
            createPreparedStatement(pstmt, eachObject).addBatch();
        }

        //
        int[] result = pstmt.executeBatch();
        logger.info("Total Records Inserted " + result.length);
        oracle.closeConnect(oracleConn);

public PreparedStatement createPreparedStatement(PreparedStatement pstmt, ArrayList<String> eachObject)
            throws SQLException {
        for (int i = 0; i < eachObject.size(); i++) {
            pstmt.setString(i + 1, eachObject.get(i));
        }
        return pstmt;
    }

This code only gets data from MarkLogic and it is not inserted into the RDBMS database after 1 batch is completed,can any one point my mistak in the code. Thanks in Advance.

Upvotes: 0

Views: 100

Answers (1)

ehennum
ehennum

Reputation: 7335

Consider created a prepared statement before starting the job and, within the onDocumentReady() listener:

  1. extracting one or more values from the document,
  2. setting the placeholders on the prepared statement to the values, and
  3. executing the prepared statement.

The downside of the strategy of accumulating all documents in an array is that the array could use up all available memory and that throughput should be better if database operations are interleaved.

Hoping that helps,

Upvotes: 1

Related Questions