Abe

Reputation: 23918

Hadoop streaming: get node ID

In Hadoop streaming, is there a way to get the ID of the node handling a given task?

By way of analogy, this snippet gives the name of the input file for the task:

#!/usr/bin/env python
import os
# Hadoop streaming exposes job configuration values as environment variables.
map_input_file = os.environ["map_input_file"]

I'm looking for something like os.environ["map_node_id"]. Any unique handle to the node would work...

Upvotes: 0

Views: 447

Answers (1)

Lorand Bendig

Reputation: 10650

You can get the hostname of the node running the task by calling socket.gethostname() in your mapper or reducer:

import socket
...
node = socket.gethostname()
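For illustration, here is a minimal sketch of how that call might be used in a streaming mapper to tag every record with the node that processed it. The function name tag_with_node and the tab-separated output format are assumptions made for this example, not part of Hadoop or of the original answer.

```python
import socket


def tag_with_node(lines, node=None):
    """Yield each input line prefixed with a node name and a tab.

    tag_with_node is a hypothetical helper written for this example.
    """
    # socket.gethostname() returns the hostname of the machine running
    # this process, i.e. the node executing the current task.
    if node is None:
        node = socket.gethostname()
    for line in lines:
        # Strip the trailing newline so the caller controls line endings.
        yield "%s\t%s" % (node, line.rstrip("\n"))
```

In a mapper script you would iterate over tag_with_node(sys.stdin) and print each result. Note that a hostname is only a unique handle if every node in the cluster has a distinct hostname.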

Upvotes: 1
