Reputation: 121
I'm interested in modifying how a job's input splits are allocated to particular nodes.
I went through the JobInProgress code in Hadoop but couldn't work out how the actual allocation happens.
How are the input splits of a job distributed across the nodes of a cluster?
Which Hadoop source files do I need to go through to understand the allocation?
Upvotes: 2
Views: 178
Reputation: 33495
Each input format, such as MultiFileInputFormat, implements the InputFormat#getSplits() method, which is where the InputSplits are calculated.
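To give a feel for what a getSplits() implementation typically computes, here is a minimal plain-Java sketch. It is not Hadoop code and uses no Hadoop classes: the class name SplitSketch, the long[] {start, length} representation of a split, and the numbers used are all assumptions made for illustration. It is loosely modeled on the file-based approach of cutting a file into block-sized byte ranges, with a small slop factor so a short tail is folded into the last split. Other input formats, including MultiFileInputFormat, may calculate their splits differently.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    // Assumed slop factor: a tail shorter than 10% of a split is merged into the last split.
    static final double SPLIT_SLOP = 1.1;

    // Clamp the block size between a minimum and maximum split size.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Cut a file of fileLength bytes into splits, each returned as {start, length}.
    // This stands in for the list of InputSplits a real getSplits() would return.
    static List<long[]> getSplits(long fileLength, long splitSize) {
        List<long[]> splits = new ArrayList<>();
        long remaining = fileLength;
        // Emit full-size splits while more than SPLIT_SLOP splits' worth of data is left.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits.add(new long[] {fileLength - remaining, splitSize});
            remaining -= splitSize;
        }
        // Whatever is left (possibly slightly more than one split) becomes the final split.
        if (remaining != 0) {
            splits.add(new long[] {fileLength - remaining, remaining});
        }
        return splits;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: a 300-byte file with a 128-byte block size.
        long splitSize = computeSplitSize(128, 1, Long.MAX_VALUE);
        for (long[] s : getSplits(300, splitSize)) {
            System.out.println("start=" + s[0] + " length=" + s[1]);
        }
    }
}
```

Note that this only covers how splits are calculated. It says nothing about which node ends up running the task for a given split, which is the other half of the question.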
Upvotes: 1