Reputation: 368
In a Hadoop Reducer, I would like to create and emit new keys under specific conditions, and I'd like to ensure that these keys are unique.
The pseudo-code for what I want goes like:
@Override
protected void reduce(WritableComparable key, Iterable<Writable> values, Context context)
        throws IOException, InterruptedException {
    // do stuff:
    // ...
    // write original key:
    context.write(key, data);
    // write extra key:
    if (someConditionIsMet) {
        WritableComparable extraKey = createNewKey();
        context.write(extraKey, moreData);
    }
}
So I now have two questions:
The extra key has to be unique across all reducers - both for application reasons and because I think it would otherwise violate the contract of the reduce stage. What is a good way to generate a key that is unique across reducers (and possibly across jobs)?
Maybe get reducer/job IDs and incorporate that into key generation?
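Roughly what I have in mind is something like this (just a sketch using a Text key; the encoding and the local counter are made up by me, and I don't know where the job/reducer IDs would actually come from, hence the question):

// Hypothetical key factory: combine a job identifier, a reducer identifier
// and a counter that is local to this reducer task, so that no two reducers
// (or two jobs) can ever emit the same extra key.
private long localCounter = 0;

private Text createNewKey(String jobId, int reducerId) {
    return new Text(jobId + "_" + reducerId + "_" + (localCounter++));
}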
Upvotes: 2
Views: 1011
Reputation: 30089
You can get the current task attempt via the Context.getTaskAttemptID() method and then pull out the reducer ID number with TaskAttemptID.getTaskID().getId().
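For example, something along these lines inside your Reducer (sketch only; the Text encoding and the per-task counter are just one way to assemble a unique key):

import org.apache.hadoop.io.Text;

// field in the Reducer class
private long counter = 0;  // local to this reducer task

private Text createNewKey(Context context) {
    // task number of this reducer, unique among the job's reducers
    int reducerId = context.getTaskAttemptID().getTaskID().getId();
    // include the job ID as well if the key must also be unique across jobs
    String jobId = context.getJobID().toString();
    return new Text(jobId + "_" + reducerId + "_" + (counter++));
}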
Upvotes: 2