Reputation: 29
I am trying to build a data pipeline from SQL Server (running on a GCP VM) to BigQuery using Cloud Data Fusion. I have completed the setup and configuration,
but when I run the pipeline it fails with the errors below. I searched online but could not find an answer.
I was able to create a Data Fusion pipeline from GCS to BigQuery, and it works fine, but this SQL Server to BigQuery pipeline fails.
Could anyone please help me with this?
Here are the error details:
2020-01-10 13:00:47,528 - WARN [Thread-95:o.a.h.m.LocalJobRunner@589] - job_local976595976_0001
java.lang.Exception: java.lang.NullPointerException
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:491) ~[hadoop-mapreduce-client-common-2.9.2.jar:na]
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:551) ~[hadoop-mapreduce-client-common-2.9.2.jar:na]
java.lang.NullPointerException: null
    at org.apache.hadoop.mapreduce.lib.db.DataDrivenDBInputFormat.createDBRecordReader(DataDrivenDBInputFormat.java:281) ~[hadoop-mapreduce-client-core-2.9.2.jar:na]
    at io.cdap.plugin.db.batch.source.DataDrivenETLDBInputFormat.createDBRecordReader(DataDrivenETLDBInputFormat.java:124) ~[1578661227434-0/:na]
    at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.createRecordReader(DBInputFormat.java:245) ~[hadoop-mapreduce-client-core-2.9.2.jar:na]
    at io.cdap.cdap.etl.batch.preview.LimitingInputFormat.createRecordReader(LimitingInputFormat.java:51) ~[cdap-etl-core-6.1.0.jar:na]
    at io.cdap.cdap.internal.app.runtime.batch.dataset.input.MultiInputFormat.createRecordReader(MultiInputFormat.java:92) ~[na:na]
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:521) ~[hadoop-mapreduce-client-core-2.9.2.jar:na]
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) ~[hadoop-mapreduce-client-core-2.9.2.jar:na]
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) ~[hadoop-mapreduce-client-core-2.9.2.jar:na]
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:270) ~[hadoop-mapreduce-client-common-2.9.2.jar:na]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_232]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_232]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_232]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_232]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_232]
2020-01-10 13:00:50,841 - ERROR [MapReduceRunner-phase-1:i.c.c.i.a.r.ProgramControllerServiceAdapter@97] - MapReduce Program 'phase-1' failed.
java.lang.IllegalStateException: MapReduce JobId job_local976595976_0001 failed
    at com.google.common.base.Preconditions.checkState(Preconditions.java:176) ~[com.google.guava.guava-13.0.1.jar:na]
    at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService.run(MapReduceRuntimeService.java:416) ~[na:na]
    at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52) ~[com.google.guava.guava-13.0.1.jar:na]
    at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService$2$1.run(MapReduceRuntimeService.java:450) [na:na]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_232]
2020-01-10 13:00:50,842 - ERROR [MapReduceRunner-phase-1:i.c.c.i.a.r.ProgramControllerServiceAdapter@98] - MapReduce program 'phase-1' failed with error: MapReduce JobId job_local976595976_0001 failed. Please check the system logs for more details.
java.lang.IllegalStateException: MapReduce JobId job_local976595976_0001 failed
    at com.google.common.base.Preconditions.checkState(Preconditions.java:176) ~[com.google.guava.guava-13.0.1.jar:na]
    at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService.run(MapReduceRuntimeService.java:416) ~[na:na]
    at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52) ~[com.google.guava.guava-13.0.1.jar:na]
    at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService$2$1.run(MapReduceRuntimeService.java:450) [na:na]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_232]
2020-01-10 13:00:50,916 - ERROR [WorkflowDriver:i.c.c.d.SmartWorkflow@552] - Pipeline '0f084034-33a9-11ea-95f6-8e2648ebe039' failed.
2020-01-10 13:00:51,225 - ERROR [WorkflowDriver:i.c.c.i.a.r.w.WorkflowProgramController@89] - Workflow service 'workflow.default.0f084034-33a9-11ea-95f6-8e2648ebe039.DataPipelineWorkflow.20288f05-33a9-11ea-a505-8e2648ebe039' failed.
java.lang.IllegalStateException: MapReduce JobId job_local976595976_0001 failed
    at com.google.common.base.Preconditions.checkState(Preconditions.java:176) ~[com.google.guava.guava-13.0.1.jar:na]
    at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService.run(MapReduceRuntimeService.java:416) ~[na:na]
    at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52) ~[com.google.guava.guava-13.0.1.jar:na]
    at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService$2$1.run(MapReduceRuntimeService.java:450) ~[na:na]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_232]
Upvotes: 2
Views: 1306
Reputation: 1
UPDATE: This is a known issue, fixed in 6.1.2: https://issues.cask.co/browse/CDAP-16453
"I get the same error with MySQL 5.x. Strangely enough, if you deploy the pipeline and then run it, it works. I'm thinking about decoupling the pipelines into small SQL-to-storage pipelines, with the big pipeline in the outgoing flow."
Regards, Virgilio
Upvotes: 0
Reputation: 5243
According to the logs you posted, the run fails with a java.lang.NullPointerException, which means a null value was used where an object was required somewhere in the application's run path.
Assuming you have configured the JDBC driver successfully, I recommend checking the source database properties in your pipeline to find the undefined field. It may be the Import Query property, which imports data from the specified table through a SELECT query.
The query must include $CONDITIONS if the number of splits to generate is greater than 1:
SELECT * FROM <table> WHERE $CONDITIONS
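To make the rule above concrete, here is a minimal Python sketch of a pre-flight check you could run on the property values before starting the pipeline. It is not part of Cloud Data Fusion or CDAP: the function names `check_import_query` and `expand_conditions`, and the `id` column in the usage note, are hypothetical. The second function only illustrates the general idea that the `$CONDITIONS` placeholder is swapped for a per-split range predicate; the exact predicate the plugin generates may differ.

```python
# Placeholder that the import query must contain when more than one split is used.
SUBSTITUTE_TOKEN = "$CONDITIONS"


def check_import_query(import_query, num_splits):
    """Return a list of problems found in the source settings (hypothetical helper)."""
    problems = []
    if not import_query or not import_query.strip():
        problems.append("Import Query is empty")
        return problems
    if num_splits > 1 and SUBSTITUTE_TOKEN not in import_query:
        problems.append(
            "Import Query must contain $CONDITIONS when the number of splits is greater than 1"
        )
    return problems


def expand_conditions(import_query, split_by, lower, upper):
    """Illustrate how one split's range could replace the placeholder (assumed form)."""
    condition = f"{split_by} >= {lower} AND {split_by} < {upper}"
    return import_query.replace(SUBSTITUTE_TOKEN, condition)
```

For example, `check_import_query("SELECT * FROM t", 2)` reports the missing placeholder, while `expand_conditions("SELECT * FROM t WHERE $CONDITIONS", "id", 0, 100)` shows what a single split's query might look like once the placeholder is filled in.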
Upvotes: 0