Reputation: 726
I am trying to use Sqoop2 to copy data from an Oracle 11g2 server to HDFS.
The link to Oracle seems to work, as it will complain if I use invalid credentials. The definition is as follows:
link with id 14 and name OLink (Enabled: true, Created by xxx at 2/9/16 2:48 PM, Updated by xxx at 2/11/16 10:08 AM)
Using Connector generic-jdbc-connector with id 4
Link configuration
JDBC Driver Class: oracle.jdbc.driver.OracleDriver
JDBC Connection String: jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=localhost)(PORT=5999)))(CONNECT_DATA=(SERVER=DEDICATED)(SID=abc)))
Username: xxx
Password:
JDBC Connection Properties:
(The unusual port number is there because, for now, I need to use a reverse tunnel to access the database. It will be fixed soon.)
The job definition is the following:
Job with id 2 and name Test OLink (Enabled: true, Created by xxx at 2/9/16 2:56 PM, Updated by xxx at 2/11/16 10:58 AM)
Using link id 14 and Connector id 4
From database configuration
Schema name: xxx
Table name: t_name
Table SQL statement:
Table column names:
Partition column name: CL_ID
Null value allowed for the partition column: false
Boundary query:
Throttling resources
Extractors: 3
Loaders: 3
ToJob configuration
Override null value:
Null value:
Output format: TEXT_FILE
Compression format: DEFAULT
Custom compression format:
Output directory: /tmp
When I start the job (with verbose mode set to true), it lists all the column names and types of the remote table (meaning the connectivity to Oracle is OK), but then the job fails, for example:
2016-02-11 10:44:42 UTC: BOOTING - Progress is not available
2016-02-11 10:44:59 UTC: RUNNING - 0.00 %
2016-02-11 10:45:09 UTC: RUNNING - 0.00 %
2016-02-11 10:45:19 UTC: RUNNING - 0.00 %
2016-02-11 10:45:29 UTC: FAILED
Exception: Job Failed with status:3
Stack trace: Task failed task_1450719316904_0239_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
The logs show the following:
2016-02-11 10:44:59,651 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1450719316904_0239: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:56320, vCores:37> knownNMs=5
2016-02-11 10:45:04,775 INFO [Socket Reader #1 for port 54706] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1450719316904_0239 (auth:SIMPLE)
2016-02-11 10:45:04,803 INFO [IPC Server handler 5 on 54706] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1450719316904_0239_m_000002 asked for a task
2016-02-11 10:45:04,803 INFO [IPC Server handler 5 on 54706] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1450719316904_0239_m_000002 given task: attempt_1450719316904_0239_m_000000_0
2016-02-11 10:45:06,494 INFO [IPC Server handler 5 on 54706] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1450719316904_0239_m_000000_0 is : 0.0
2016-02-11 10:45:06,503 FATAL [IPC Server handler 6 on 54706] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1450719316904_0239_m_000000_0 - exited : org.apache.sqoop.common.SqoopException: MAPRED_EXEC_0017:Error occurs during extractor run
at org.apache.sqoop.job.mr.SqoopMapper.run(SqoopMapper.java:99)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.sqoop.common.SqoopException: GENERIC_JDBC_CONNECTOR_0001:Unable to get a connection
at org.apache.sqoop.connector.jdbc.GenericJdbcExecutor.<init>(GenericJdbcExecutor.java:59)
at org.apache.sqoop.connector.jdbc.GenericJdbcExtractor.extract(GenericJdbcExtractor.java:50)
at org.apache.sqoop.connector.jdbc.GenericJdbcExtractor.extract(GenericJdbcExtractor.java:38)
at org.apache.sqoop.job.mr.SqoopMapper.run(SqoopMapper.java:95)
... 7 more
Caused by: java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:489)
at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:553)
at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:254)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:32)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:528)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:215)
at org.apache.sqoop.connector.jdbc.GenericJdbcExecutor.<init>(GenericJdbcExecutor.java:51)
... 10 more
Caused by: oracle.net.ns.NetException: The Network Adapter could not establish the connection
at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:439)
at oracle.net.resolver.AddrResolution.resolveAndExecute(AddrResolution.java:454)
at oracle.net.ns.NSProtocol.establishConnection(NSProtocol.java:693)
at oracle.net.ns.NSProtocol.connect(NSProtocol.java:251)
at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1140)
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:340)
... 17 more
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at oracle.net.nt.TcpNTAdapter.connect(TcpNTAdapter.java:149)
at oracle.net.nt.ConnOption.connect(ConnOption.java:133)
at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:405)
... 22 more
The software versions are
Any clue on how to troubleshoot this?
Upvotes: 2
Views: 829
Reputation: 4466
When the Sqoop job starts up, the machine you are running the Sqoop command on connects to the Oracle machine to query the tables and build the Sqoop job.
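When the map-reduce phase starts running, each datanode in the cluster that is running a map-reduce task will need to connect to the database itself. From those errors (the final "Connection refused" comes from a map task), it looks like your datanodes cannot connect to Oracle, but the machine you are initiating the job from can. Note that your connection string uses HOST=localhost and PORT=5999: on a datanode, localhost is the datanode itself, so unless the tunnel is also open there, nothing will be listening on that port.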
Can you confirm connectivity from all the datanodes to Oracle?
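One way to check this is to run a small TCP probe on each datanode. The following is only a minimal sketch, assuming Python 3 is available on the nodes; `can_connect` is a hypothetical helper name, and the host and port are simply the values from the connection string in the question. It only tests whether the port accepts a TCP connection, not whether Oracle credentials work.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    # Values taken from the JDBC connection string in the question;
    # replace them with whatever host/port the map tasks will really use.
    host, port = "localhost", 5999
    status = "reachable" if can_connect(host, port) else "NOT reachable"
    print(f"{socket.gethostname()}: {host}:{port} is {status}")
```

If this reports "NOT reachable" on the datanodes but "reachable" on the machine where you start the job, the map tasks are failing for the same reason as in your stack trace.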
Upvotes: 1