Reputation: 719
The documentation at https://hudi.apache.org/docs/syncing_metastore is not very straightforward.
I've spent a lot of time trying to get this tool working. Whether I run it from the CLI (run_sync_tool.sh) or from IntelliJ (running HiveSyncTool directly), I always get a ClassNotFoundException, each time for a different class.
The first exception is ClassNotFoundException: org.slf4j.LoggerFactory. OK, I added that dependency explicitly, but the errors continue with other classes.
In IntelliJ this happens because almost all dependencies are declared with provided scope. I had to change them to compile.
After resolving those exceptions I receive:
java.lang.NoSuchMethodError: 'org.apache.parquet.schema.LogicalTypeAnnotation org.apache.parquet.schema.Type.getLogicalTypeAnnotation()'
This looks like an incompatibility between the Parquet and Avro libraries. I tried different versions, but without success.
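A NoSuchMethodError like this usually means a class was loaded from a different library version than the one the caller was compiled against, often because two jars on the classpath ship the same class. As a rough diagnostic sketch (not part of Hudi; the function name jars_containing and the directory you point it at are my own assumptions), the following Python snippet lists every jar in a directory tree that contains a given class:

```python
import zipfile
from pathlib import Path


def jars_containing(class_name, jar_dir):
    """Return the paths of all jars under jar_dir that contain class_name.

    class_name is a fully qualified Java class name, for example
    "org.apache.parquet.schema.Type". This helper is hypothetical and
    only inspects jar contents; it does not know the actual classpath order.
    """
    # A class a.b.C is stored inside a jar as the entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(jar_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(str(jar))
        except zipfile.BadZipFile:
            # Skip files that end in .jar but are not valid archives
            continue
    return hits
```

Calling jars_containing("org.apache.parquet.schema.Type", "/path/to/libs") and getting more than one hit would suggest that two copies of the class are competing; getting none for org.slf4j.LoggerFactory would match the ClassNotFoundException above.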
The main question is: is there an easy way to run this tool? I don't believe it should be necessary to add missing dependencies or change Maven scopes. This is really weird.
Thanks in advance
Upvotes: 1
Views: 416
Reputation: 1653
See my answer at https://github.com/apache/incubator-xtable/discussions/457#discussioncomment-9659748.
The gist: download the following from mavenrepository.com:
org.apache.hudi:hudi-hive-sync-bundle:0.14.1
com.amazonaws:aws-java-sdk-s3:1.11.271
org.apache.hadoop:hadoop-client:2.10.2
org.apache.hadoop:hadoop-aws:2.10.2
You'll also need a Hive 2.3.10 and a Hadoop 2.10.2 installation.
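If it helps, here is a small Python sketch that turns those group:artifact:version coordinates into direct jar URLs. It assumes the standard Maven Central layout at https://repo1.maven.org/maven2 (my assumption, not the site named above), and the helper name jar_url is made up for illustration. It only builds URLs for the named jars and does not resolve transitive dependencies.

```python
# Assumption: jars are fetched from Maven Central's standard repository layout.
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"

COORDINATES = [
    "org.apache.hudi:hudi-hive-sync-bundle:0.14.1",
    "com.amazonaws:aws-java-sdk-s3:1.11.271",
    "org.apache.hadoop:hadoop-client:2.10.2",
    "org.apache.hadoop:hadoop-aws:2.10.2",
]


def jar_url(coordinate, base=MAVEN_CENTRAL):
    """Build the jar URL for a 'group:artifact:version' coordinate."""
    group, artifact, version = coordinate.split(":")
    # The group id becomes a directory path: org.apache.hudi -> org/apache/hudi
    group_path = group.replace(".", "/")
    return f"{base}/{group_path}/{artifact}/{version}/{artifact}-{version}.jar"


for coordinate in COORDINATES:
    print(jar_url(coordinate))
```

Each printed URL can then be downloaded with whatever tool you prefer and placed where your Hive and Hadoop installation picks up extra jars.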
Upvotes: 1