user2401464

Reputation: 537

Hadoop MapReduce Python command-line arguments

In my Python mapper code, I need to access the 'path' given in -input 'path'. How can I read it from within the Python code?

Upvotes: 0

Views: 599

Answers (1)

zsxwing

Reputation: 20836

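Hadoop Streaming exposes the path of the input file that the current map task is reading as an environment variable, so you can get it from os.environ. For example: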

import os
input_file = os.environ['map_input_file']
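The direct lookup above raises KeyError when the variable is missing, for instance when the mapper is tested locally outside Hadoop. A minimal sketch of a more forgiving lookup follows. It assumes that newer Hadoop releases export the same value as mapreduce_map_input_file (from the renamed property mapreduce.map.input.file); the helper name current_input_file is hypothetical and not part of any Hadoop API.

```python
import os


def current_input_file(environ=None, default=None):
    """Return the path of the file the current map task reads, or `default`."""
    env = os.environ if environ is None else environ
    # Assumption: newer releases use the mapreduce_* name, older ones map_input_file.
    for key in ("mapreduce_map_input_file", "map_input_file"):
        value = env.get(key)
        if value:
            return value
    # Neither variable is set, e.g. when the script runs outside Hadoop.
    return default
```

In a mapper this could be called once at start-up, for example input_file = current_input_file(default="unknown"), before looping over sys.stdin.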

You can also read other JobConf parameters from os.environ. Note: during the execution of a streaming job, the names of the "mapred" parameters are transformed. The dots ( . ) become underscores ( _ ). For example, mapred.job.id becomes mapred_job_id and mapred.jar becomes mapred_jar. To get these values in a streaming job's mapper or reducer, use the parameter names with underscores. See Configured Parameters.
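The renaming rule described above can be wrapped in a small helper so that the mapper or reducer can keep using the dotted parameter names. This is only a sketch: the function name jobconf is hypothetical, and it applies just the dot-to-underscore rule quoted here.

```python
import os


def jobconf(name, default=None, environ=None):
    """Look up a job parameter by its dotted name, e.g. 'mapred.job.id'."""
    env = os.environ if environ is None else environ
    # Streaming exports mapred.job.id as mapred_job_id, so translate the dots.
    return env.get(name.replace(".", "_"), default)
```

For example, jobconf("mapred.job.id") reads os.environ["mapred_job_id"] if it is set and returns None otherwise.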

I also found a post that you may find useful: A Guide to Python Frameworks for Hadoop.

Upvotes: 1
