Tharunkumar Reddy

Reputation: 2813

Sqoop "Socket timeout detected: Read timed out" while reading data from Mainframe and inserting into Hive

Please see my problem below and help me resolve it.

I have a Sqoop statement which brings data from a Mainframe into Hadoop (Hive) using the Attunity JDBC driver.

    sqoop import \
    --connect "jdbc:attconnect://X.X.1.1:8883;DefTdpName=PROD_E;OneTdpMode=1" \
    --driver "com.attunity.jdbc.NvDriver" \
    --query "SELECT * FROM how_record WHERE \$CONDITIONS" \
    --fields-terminated-by "\t" \
    --hive-import \
    --hive-overwrite \
    --delete-target-dir \
    --null-string '\\N' \
    --null-non-string '\\N' \
    --hive-delims-replacement '' \
    --target-dir "path/how_record" \
    --fetch-size 50000 \
    --hive-table "how_record" -m 1

When I run the above Sqoop statement for the first time, I keep getting a socket timeout error.

    ERROR manager.SqlManager: Error executing statement:
    java.sql.SQLException: JBC0088E: JBC0002E: Socket timeout detected: Read timed out
        at com.attunity.jdbc.NvSQLWarning.createSQLException(Unknown Source)
        at com.attunity.comm.XmsgComm.xMsgThrowTimeoutEx(Unknown Source)
        at com.attunity.comm.XmsgComm.treatInterruptedIOException(Unknown Source)
        at com.attunity.comm.XmsgComm.xMsgPerformCall(Unknown Source)
        at com.attunity.comm.Daemon.GetServer(Unknown Source)
        at com.attunity.navapi.SysTdp.getRemoteSysTdp(Unknown Source)
        at com.attunity.jdbc.NvConnectionBase.open(Unknown Source)
        at com.attunity.jdbc.NvConnectPropertiesBase.openConnection(Unknown Source)
        at com.attunity.jdbc.NvDriverCoreBase.connect(Unknown Source)
        at com.attunity.jdbc.NvDriverCoreBase.connect(Unknown Source)
        at java.sql.DriverManager.getConnection(DriverManager.java:664)
        at java.sql.DriverManager.getConnection(DriverManager.java:270)

But when I run it a second time (immediately afterwards), the error goes away and Sqoop completes successfully.

I face this issue only on the first run.

I went through a few blogs, and the error may be related to connection pooling in the JDBC driver.
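In the meantime I can wrap the import in a shell retry loop, so the second attempt (which currently always succeeds) happens automatically. A minimal sketch; the retry count and sleep interval are arbitrary choices:

    # Wrap the import so a timed-out first attempt is retried automatically.
    run_import() {
        sqoop import \
            --connect "jdbc:attconnect://X.X.1.1:8883;DefTdpName=PROD_E;OneTdpMode=1" \
            --driver "com.attunity.jdbc.NvDriver" \
            --query "SELECT * FROM how_record WHERE \$CONDITIONS" \
            --fields-terminated-by "\t" \
            --hive-import --hive-overwrite --delete-target-dir \
            --null-string '\\N' --null-non-string '\\N' \
            --hive-delims-replacement '' \
            --target-dir "path/how_record" \
            --fetch-size 50000 \
            --hive-table "how_record" -m 1
    }

    # Retry count (3) and sleep (30s) are arbitrary.
    for attempt in 1 2 3; do
        if run_import; then
            echo "Import succeeded on attempt $attempt"
            break
        fi
        echo "Attempt $attempt failed; retrying in 30 seconds..."
        sleep 30
    done

But I would prefer to understand and fix the root cause rather than retry around it.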

My background is mostly in data; I don't have much experience on the Java side.

Can someone please help me find the actual issue and a resolution?

Thanks in advance!!

Upvotes: 3

Views: 1075

Answers (2)

yammanuruarun

Reputation: 403

Refer to 24.1. Sqoop Supported Databases.

Refer to the Attunity driver configuration documentation.

Try specifying the appropriate driver using the --driver option in the sqoop import command.

Download nvjdbc2.jar and place it in the appropriate location for your cluster, as specified in the Sqoop documentation.

Class name to use: com.attunity.jdbc.NvDriver

While JDBC is a compatibility layer that allows a program to access many different databases through a common API, slight differences in the SQL language spoken by each database may mean that Sqoop can’t use every database out of the box, or that some databases may be used in an inefficient manner.

Even if Sqoop supports a database internally, you may still need to install the database vendor’s JDBC driver in your $SQOOP_HOME/lib path on your client. Sqoop can load classes from any jars in $SQOOP_HOME/lib on the client and will use them as part of any MapReduce jobs it runs; unlike older versions, you no longer need to install JDBC jars in the Hadoop library path on your servers.
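For example, a quick way to install and verify the driver jar (a sketch; the $SQOOP_HOME path varies by distribution, e.g. /usr/lib/sqoop on some clusters):

    # Copy the Attunity driver jar into Sqoop's client lib directory.
    # $SQOOP_HOME varies by distribution; adjust the path as needed.
    cp nvjdbc2.jar "$SQOOP_HOME/lib/"

    # Confirm the jar is in place and contains the driver class.
    ls "$SQOOP_HOME/lib/" | grep -i nvjdbc
    jar tf "$SQOOP_HOME/lib/nvjdbc2.jar" | grep NvDriver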

Upvotes: 2

Prerak Jain

Reputation: 29

Have you looked at the load on your database server? I am guessing that is where the bottleneck lies. This configuration is useful only if you can scale your database server, or if you have multiple servers, each with a different set of tables.
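If the mainframe side can handle parallel readers, you could also try more than one mapper; a sketch, where record_id is a hypothetical split column (substitute a real, evenly distributed numeric column from how_record):

    # Hypothetical sketch: spread the read across 4 mappers instead of 1.
    # record_id is a placeholder split column; use a real column.
    sqoop import \
        --connect "jdbc:attconnect://X.X.1.1:8883;DefTdpName=PROD_E;OneTdpMode=1" \
        --driver "com.attunity.jdbc.NvDriver" \
        --query "SELECT * FROM how_record WHERE \$CONDITIONS" \
        --split-by record_id \
        --target-dir "path/how_record" \
        --hive-import --hive-table "how_record" \
        -m 4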

Upvotes: 0
