Reputation: 436
I've configured our Hadoop cluster with mapred_map_tasks_max set to 6, and as expected, I see 6 mapred processes running when kicking off Pig jobs.
I am, however, a bit surprised to see the CPU usage on some of these individual processes exceed 100%, sometimes reaching 1000%+. Does MapReduce default to multiple threads? Could this be something in Pig itself?
All I could find online was some information about a setting (mapred.map.runner.class), but it doesn't appear to be set to a multi-threaded runner in any way.
Thanks.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2630 mapred 20 0 53.4g 2.8g 12m S 218.1 4.5 1:17.32 java
2553 mapred 20 0 53.4g 2.8g 12m S 110.7 4.5 1:25.07 java
2636 mapred 20 0 53.4g 2.8g 12m S 110.4 4.5 1:11.58 java
2437 mapred 20 0 53.5g 5.6g 12m S 108.1 8.8 3:46.52 java
2353 mapred 20 0 53.5g 5.2g 12m S 101.1 8.3 3:35.27 java
2239 mapred 20 0 53.5g 5.8g 12m S 82.6 9.3 3:54.47 java
Upvotes: 1
Views: 1875
Reputation: 9073
It is possible with Hadoop to use a multi-threaded mapper (see http://kickstarthadoop.blogspot.com/2012/02/enable-multiple-threads-in-mapper-aka.html). As far as I know, Pig doesn't support multi-threaded jobs (although you can call Pig servers from multiple threads... https://issues.apache.org/jira/browse/PIG-240).
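For reference, enabling a multi-threaded mapper with the old (mapred) API, which is what the mapred.map.runner.class property in the question belongs to, is just a matter of job configuration. A minimal sketch (MyJob and MyMapper are hypothetical stand-ins for your own classes):

```java
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.MultithreadedMapRunner;

// Old (mapred) API. Setting the map runner class here is what
// would make mapred.map.runner.class show up as multi-threaded;
// if it was never set, map tasks use the default single-threaded
// MapRunner.
JobConf conf = new JobConf(MyJob.class);
conf.setMapperClass(MyMapper.class);            // your own Mapper
conf.setMapRunnerClass(MultithreadedMapRunner.class);
// Number of map threads per task JVM for MultithreadedMapRunner.
conf.setInt("mapred.map.multithreadedrunner.threads", 4);
```

Since the question's cluster doesn't set this, the high %CPU is unlikely to come from a multi-threaded map runner.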
That said, Pig will by default run multiple mappers/reducers on the same host, one mapper/reducer per available core.
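Also note that %CPU in top is summed over all threads of a process, and even a nominally single-threaded map task JVM carries extra housekeeping threads (GC, JIT compiler, RPC), which can push one java process past 100% on a multi-core box. On Linux you can count a task's threads directly; substitute a mapper PID from the top output above (e.g. 2630) — the shell's own PID ($$) is used here only so the command runs as-is:

```shell
# /proc/<pid>/status reports the thread count of a process.
PID=$$
grep '^Threads:' /proc/"$PID"/status
```

Running `top -H -p <pid>` similarly breaks the aggregate %CPU down into per-thread lines.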
Upvotes: 2