user8587005

Reputation: 65

When I run DistCp, do the mappers run on the source or the destination cluster?

I am running DistCp in Hadoop to load data from a dev cluster to a production cluster. Where do the resources for the job come from: the source cluster or the destination cluster?

Upvotes: 1

Views: 503

Answers (2)

roh

Reputation: 1053

DistCp uses the resources of the environment where you initiate the job, i.e. the cluster on which you run the distcp command.

Side note: you can initiate the job from either the source or the destination cluster, as long as you specify the correct source and destination paths.

Upvotes: 1

Deepan Ram

Reputation: 850

DistCp launches MapReduce jobs on the cluster it is run on/from. You can use the YARN UI on that cluster to monitor the job's progress and resource utilization.

For example, if you are copying from a Prod cluster to a Dev cluster and are worried about resource utilization, you can run the DistCp job on the Dev cluster and have it "pull" the data from the Prod cluster.
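To make the "pull" setup concrete, here is a minimal sketch that only builds the command line you would run from a node on the Dev cluster. The NameNode hosts, port, and paths are made up for illustration, and the helper `build_distcp_command` is hypothetical; `-m` is the standard DistCp option that caps the number of simultaneous map tasks. The sketch prints the command rather than executing it.

```python
import shlex


def build_distcp_command(source_uri, dest_uri, max_maps=None):
    """Return the argv list for a DistCp run; nothing is executed here."""
    cmd = ["hadoop", "distcp"]
    if max_maps is not None:
        # -m limits how many map tasks copy data at the same time
        cmd += ["-m", str(max_maps)]
    # Source comes first, destination second
    cmd += [source_uri, dest_uri]
    return cmd


# Hypothetical hosts and paths: run this from the Dev cluster so that
# the map tasks use Dev's YARN resources and read ("pull") from Prod.
cmd = build_distcp_command(
    "hdfs://prod-nn:8020/data/events",
    "hdfs://dev-nn:8020/data/events",
    max_maps=10,
)
print(shlex.join(cmd))
```

Submitting the printed command from the Dev side keeps the map tasks on Dev; the Prod cluster still serves the reads, so lowering `-m` is one way to limit the load placed on it.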

Upvotes: 1
