I'm a newbie to Sqoop and HDFS, and I'm trying to migrate a table from pgAdmin 4 (PostgreSQL) to HDFS using Sqoop. I spent a day fixing errors, but I've now hit one that I can't find a solution for anywhere:
Sqoop Command:
sqoop import --connect 'jdbc:postgresql://192.168.1.166:5432/SIC2024_BuiNamQuan ssl=disable&sslfactory=org.postgresql.ssl.NonValidatingFactory' \
  --username 'postgres' -P 'root' \
  --table 'table_data' \
  --target-dir '/user/sic2024_BuiNamQuan/hive/warehouse/Capstonesic2024_BuiNamQuan ' \
  -m 1
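Two things in the command look off, independent of the error below (a hedged sketch; please verify against your own environment): the JDBC URL seems to be missing the `?` separator before its parameters (there is a space instead, and the usual PostgreSQL JDBC parameter is `sslmode=disable`), and `-P` takes no argument — it prompts for the password interactively, so a literal password would need `--password`. There is also a trailing space inside the `--target-dir` path. A cleaned-up form might look like:

```shell
# Sketch of a corrected invocation; host, database, and paths are
# taken from the question and assumed unchanged.
sqoop import \
  --connect 'jdbc:postgresql://192.168.1.166:5432/SIC2024_BuiNamQuan?sslmode=disable&sslfactory=org.postgresql.ssl.NonValidatingFactory' \
  --username postgres --password root \
  --table table_data \
  --target-dir /user/sic2024_BuiNamQuan/hive/warehouse/Capstonesic2024_BuiNamQuan \
  -m 1
```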
Logs:
2024-08-04 23:53:08,371 INFO manager.SqlManager: Using default fetchSize of 1000
2024-08-04 23:53:08,371 INFO tool.CodeGenTool: Beginning code generation
2024-08-04 23:53:08,919 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "table_data" AS t LIMIT 1
2024-08-04 23:53:08,993 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
Note: /tmp/sqoop-h-user/compile/b2036ee14b5cd61907f970a49bad7e35/table_data.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2024-08-04 23:53:10,832 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-h-user/compile/b2036ee14b5cd61907f970a49bad7e35/table_data.jar
2024-08-04 23:53:10,840 WARN manager.PostgresqlManager: It looks like you are importing from postgresql.
2024-08-04 23:53:10,842 WARN manager.PostgresqlManager: This transfer can be faster! Use the --direct
2024-08-04 23:53:10,842 WARN manager.PostgresqlManager: option to exercise a postgresql-specific fast path.
2024-08-04 23:53:10,856 INFO mapreduce.ImportJobBase: Beginning import of table_data
2024-08-04 23:53:10,859 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2024-08-04 23:53:11,034 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
2024-08-04 23:53:11,636 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2024-08-04 23:53:11,737 INFO impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
2024-08-04 23:53:11,977 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
2024-08-04 23:53:11,977 INFO impl.MetricsSystemImpl: JobTracker metrics system started
2024-08-04 23:53:12,396 INFO db.DBInputFormat: Using read commited transaction isolation
2024-08-04 23:53:12,474 INFO mapreduce.JobSubmitter: number of splits:1
2024-08-04 23:53:12,803 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1055552489_0001
2024-08-04 23:53:12,828 INFO mapreduce.JobSubmitter: Executing with tokens: []
2024-08-04 23:53:13,229 INFO mapred.LocalDistributedCacheManager: Creating symlink: /tmp/hadoop-h-user/mapred/local/job_local1055552489_0001_ffa4f31b-dd61-497c-8274-20f0fd911760/libjars <- /home/h-user/libjars/*
2024-08-04 23:53:13,266 WARN fs.FileUtil: Command 'ln -s /tmp/hadoop-h-user/mapred/local/job_local1055552489_0001_ffa4f31b-dd61-497c-8274-20f0fd911760/libjars /home/h-user/libjars/*' failed 1 with: ln: failed to create symbolic link '/home/h-user/libjars/*': No such file or directory
2024-08-04 23:53:13,266 WARN mapred.LocalDistributedCacheManager: Failed to create symlink: /tmp/hadoop-h-user/mapred/local/job_local1055552489_0001_ffa4f31b-dd61-497c-8274-20f0fd911760/libjars <- /home/h-user/libjars/*
2024-08-04 23:53:13,267 INFO mapred.LocalDistributedCacheManager: Localized file:/tmp/hadoop/mapred/staging/h-user1055552489/.staging/job_local1055552489_0001/libjars as file:/tmp/hadoop-h-user/mapred/local/job_local1055552489_0001_ffa4f31b-dd61-497c-8274-20f0fd911760/libjars
2024-08-04 23:53:13,375 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
2024-08-04 23:53:13,375 INFO mapreduce.Job: Running job: job_local1055552489_0001
2024-08-04 23:53:13,380 INFO mapred.LocalJobRunner: OutputCommitter set in config null
2024-08-04 23:53:13,413 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 2
2024-08-04 23:53:13,413 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2024-08-04 23:53:13,414 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
2024-08-04 23:53:13,630 INFO mapred.LocalJobRunner: Waiting for map tasks
2024-08-04 23:53:13,631 INFO mapred.LocalJobRunner: Starting task: attempt_local1055552489_0001_m_000000_0
2024-08-04 23:53:13,666 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 2
2024-08-04 23:53:13,668 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2024-08-04 23:53:13,742 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
2024-08-04 23:53:13,810 INFO db.DBInputFormat: Using read commited transaction isolation
2024-08-04 23:53:13,813 INFO mapred.MapTask: Processing split: 1=1 AND 1=1
2024-08-04 23:53:13,823 INFO mapred.LocalJobRunner: map task executor complete.
2024-08-04 23:53:13,838 WARN mapred.LocalJobRunner: job_local1055552489_0001
java.lang.Exception: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class table_data not found
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class table_data not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2638)
at org.apache.sqoop.mapreduce.db.DBConfiguration.getInputClass(DBConfiguration.java:403)
at org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat.createDBRecordReader(DataDrivenDBInputFormat.java:270)
at org.apache.sqoop.mapreduce.db.DBInputFormat.createRecordReader(DBInputFormat.java:266)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:527)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.ClassNotFoundException: Class table_data not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2542)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2636)
... 12 more
2024-08-04 23:53:14,379 INFO mapreduce.Job: Job job_local1055552489_0001 running in uber mode : false
2024-08-04 23:53:14,381 INFO mapreduce.Job: map 0% reduce 0%
2024-08-04 23:53:14,382 INFO mapreduce.Job: Job job_local1055552489_0001 failed with state FAILED due to: NA
2024-08-04 23:53:14,399 INFO mapreduce.Job: Counters: 0
2024-08-04 23:53:14,401 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
2024-08-04 23:53:14,403 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 2.7383 seconds (0 bytes/sec)
2024-08-04 23:53:14,403 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2024-08-04 23:53:14,403 INFO mapreduce.ImportJobBase: Retrieved 0 records.
2024-08-04 23:53:14,403 ERROR tool.ImportTool: Import failed: Import job failed!
I've tried everything I could find on the internet, but nothing worked: the error persists and I have no idea what to do next.
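For what it's worth, the stack trace says the map task cannot load the `table_data` class that Sqoop generated and compiled into `/tmp/sqoop-h-user/compile/...`, and the earlier `WARN fs.FileUtil` line shows the `libjars` symlink failing because `/home/h-user/libjars` does not exist. A hedged workaround (directory names are assumptions based on the log) is to create that directory and/or tell Sqoop where to put the generated code so it ends up on the job classpath:

```shell
# The log shows the libjars symlink target is missing; create it.
mkdir -p /home/h-user/libjars

# Optionally pin the generated class/jar to the current directory so the
# local job runner can find it (--bindir/--class-name are standard Sqoop
# codegen options).
sqoop import ... --bindir . --class-name table_data ...
```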