Load and process data in parallel inside Hadoop

Question

I am using Hadoop to process big data. I first load the data into HDFS and then run jobs, but everything runs sequentially. Is it possible to do this in parallel? For example, running 3 jobs and 2 data loads for other jobs at the same time on my cluster.

Cheers

RojoSam · Accepted Answer

If your cluster has enough resources to run the jobs in parallel, then yes. But make sure the work of each job doesn't interfere with the others. For example, loading data into a path while a running job is reading from that same path won't work as you expect.
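As a rough illustration of the idea, here is a minimal Python sketch that launches 2 loads and 3 jobs concurrently from a client machine. It assumes the `hdfs` and `hadoop` commands are on the PATH; all paths, the jar name, and the class name are hypothetical. The loads write to different paths than the jobs read, so they do not interfere with each other.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_command(args):
    """Run one command and return its exit code (0 means success)."""
    return subprocess.run(args).returncode


def run_all(commands, runner=run_command, max_workers=5):
    """Run up to max_workers commands concurrently.

    Returns the runner results in the same order as the input commands.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(runner, commands))


# Hypothetical local files and HDFS destinations, for illustration only.
LOADS = [
    ["hdfs", "dfs", "-put", "/local/batch4.csv", "/data/incoming/batch4.csv"],
    ["hdfs", "dfs", "-put", "/local/batch5.csv", "/data/incoming/batch5.csv"],
]

# Hypothetical jar, class, input and output paths. The inputs were loaded
# earlier, so none of them is being written by the loads above.
JOBS = [
    ["hadoop", "jar", "app.jar", "com.example.Job", "/data/ready/batch1.csv", "/out/batch1"],
    ["hadoop", "jar", "app.jar", "com.example.Job", "/data/ready/batch2.csv", "/out/batch2"],
    ["hadoop", "jar", "app.jar", "com.example.Job", "/data/ready/batch3.csv", "/out/batch3"],
]
```

Calling `run_all(LOADS + JOBS)` would start all five commands and return their exit codes. The threads only wait on the client side; the actual work is scheduled by the cluster.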

If there aren't enough resources, Hadoop will queue the jobs until resources become available, in an order that depends on the configured scheduler.
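If the scheduler on your cluster is set up with several queues, a job can be directed to one of them with the `mapreduce.job.queuename` property. The sketch below only builds the command line. It assumes a `hadoop jar <jar> <class> ...` command and a job driver that accepts generic `-D` options (for example, one run through `ToolRunner`); the queue name `etl` and the other values are hypothetical.

```python
def with_queue(job_args, queue):
    """Return a copy of a 'hadoop jar <jar> <class> ...' command that sets the queue.

    Generic options such as -D go right after the main class,
    before the job's own arguments.
    """
    return job_args[:4] + ["-D", f"mapreduce.job.queuename={queue}"] + job_args[4:]


# Hypothetical jar, class, paths and queue name, for illustration only.
job = ["hadoop", "jar", "app.jar", "com.example.Job", "/data/ready/batch1.csv", "/out/batch1"]
queued_job = with_queue(job, "etl")
```

Whether the queue exists, and how much of the cluster it gets, depends on how your scheduler is configured.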
