Reputation: 23918
In Hadoop Streaming, is there a way to get the ID of the node handling a given task?
By way of analogy, this snippet gives the name of the input file for the task:
#!/usr/bin/env python
import os
map_input_file = os.environ["map_input_file"]  # environment values are already strings
I'm looking for something like os.environ["map_node_id"]. Any unique handle to the node would work...
Upvotes: 0
Views: 447
Reputation: 10650
You can get the hostname of the node running the task by calling socket.gethostname() in your mapper/reducer:
import socket
...
node = socket.gethostname()
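For what it's worth, here is a minimal sketch of a complete streaming mapper that prefixes every record with a handle for the node it ran on. The helper name `node_handle` is made up for this example. The lookup of `mapreduce_task_attempt_id` / `mapred_task_id` is an assumption: it relies on Streaming exporting job configuration properties as environment variables with dots turned into underscores (the same mechanism that gives you `map_input_file` in the question), and the exact name may differ between Hadoop versions. If neither variable is set, the handle is just the hostname.

```python
#!/usr/bin/env python
import os
import socket
import sys


def node_handle(environ=None):
    """Return a handle for the node (and task attempt, if known)."""
    if environ is None:
        environ = os.environ
    host = socket.gethostname()
    # Assumption: the task attempt ID is exported under one of these names.
    # Neither is guaranteed to exist, so fall back to the bare hostname.
    attempt = (environ.get("mapreduce_task_attempt_id")
               or environ.get("mapred_task_id"))
    if attempt:
        return "%s/%s" % (host, attempt)
    return host


def main(stdin=sys.stdin, stdout=sys.stdout):
    handle = node_handle()
    for line in stdin:
        # Emit the node handle as the key and the original record as the value.
        stdout.write("%s\t%s\n" % (handle, line.rstrip("\n")))


if __name__ == "__main__":
    main()
```

The hostname alone is enough if you only need to tell nodes apart. The task attempt part is only useful when several tasks run on the same node and you want to distinguish them.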
Upvotes: 1