I'm looking into tuning MapR Hadoop via Ansible templates.
It is easy enough to scale a setting to the number of CPU threads found on a system. For example, to set the maximum number of reduce tasks to 1/4 of the threads:
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>{{ (ansible_processor_vcpus / 4)|int }}</value>
</property>
One resource suggests that the number of map / reduce tasks should be scaled to the number of disks on the system. I don't see any comparable variable for that.
There is an ansible_devices fact, a dictionary keyed by device name (sda, sdb, etc.). Perhaps I can count its entries? Perhaps apply a filter so that I'm only counting the disks which are available to Hadoop?
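To make the idea concrete, here is a rough sketch of what I have in mind. It assumes that every device whose name looks like sdX is a disk Hadoop may use, and that one map task per disk is the right ratio. Both are guesses on my part, and the regular expression would need adjusting for a host where, say, sda is the OS disk. The match test is the one Ansible adds to its templates.

```xml
{# Assumption: every sdX device is a Hadoop data disk; adjust the regex as needed. #}
{% set hadoop_disks = ansible_devices.keys() | select('match', '^sd[a-z]+$') | list %}
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  {# Assumption: one map task per matching disk. #}
  <value>{{ hadoop_disks | length }}</value>
</property>
```

On a host with sda, sdb and sdc this should render a value of 3, but I don't know whether matching on the device name is a reliable way to pick out the Hadoop disks.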