vish

Reputation: 67

How to migrate data from local on-premises HDFS to Azure storage

I want to move data from my on-premises HDFS cluster to my Azure HDInsight cluster.

I tried the distcp command, but it does not recognize the Data Lake Storage path.

Upvotes: 0

Views: 3506

Answers (1)

CHEEKATLAPRADEEP

Reputation: 12788

Steps for connecting on-premises Hadoop to Azure Data Lake Store (ADLS):

Step 1: Create the Azure Data Lake Store account.

Step 2: Create the identity used to access Azure Data Lake Store.

Step 3: Modify core-site.xml on your on-premises Hadoop cluster.

Step 4: Test connectivity to Azure Data Lake Store from the on-premises Hadoop cluster.

Step 5: Use DistCp to transfer the data from on-premises Hadoop to Azure Data Lake Store.

Syntax: hadoop distcp <HDFS_Path> <ADLS_PATH>

Example: hadoop distcp README.txt adl://mydatalakename.azuredatalakestore.net/
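Steps 4 and 5 can be sketched as a small script. This is only an illustration under stated assumptions: core-site.xml already holds the ADLS credentials from step 3, the `hadoop` client is on the PATH, and the variable names (`ADLS_ACCOUNT`, `SRC_PATH`, `DEST_DIR`, `RUN`) and the `adls_uri` helper are hypothetical names chosen for this example. By default it only prints the commands; set `RUN=1` to execute them.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical defaults; replace with your own account name and paths.
ADLS_ACCOUNT="${ADLS_ACCOUNT:-mydatalakename}"
SRC_PATH="${SRC_PATH:-README.txt}"
DEST_DIR="${DEST_DIR:-/}"   # must start with "/"

# Build the adl:// URI for an ADLS account and an absolute path.
adls_uri() { printf 'adl://%s.azuredatalakestore.net%s' "$1" "$2"; }

DEST_URI="$(adls_uri "$ADLS_ACCOUNT" "$DEST_DIR")"

if [ "${RUN:-0}" = "1" ]; then
  # Step 4: listing the target fails early if the adl:// scheme or the
  # credentials are not configured on this cluster.
  hadoop fs -ls "$DEST_URI"
  # Step 5: copy the data with DistCp.
  hadoop distcp "$SRC_PATH" "$DEST_URI"
else
  # Dry run: show what would be executed.
  echo "hadoop fs -ls $DEST_URI"
  echo "hadoop distcp $SRC_PATH $DEST_URI"
fi
```

If the listing in step 4 fails with an unknown-scheme or authentication error, DistCp will fail the same way, so it is worth rechecking the step 3 configuration before starting a large copy.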

For more details, refer to "Connecting On-premise Hadoop to Azure Data Lake Store" and "Migrate on-premise Apache Hadoop cluster to Azure HDInsight - data migration best practices".

Hope this helps.

Upvotes: 1
