code_ada

Reputation: 884

boto does not like EMR BootstrapAction parameter

I'm trying to launch an AWS EMR cluster using the boto library, and the basic launch works well.

Because I need to install some required Python libraries, I tried to add a bootstrap action using boto.emr.bootstrap_action.

But it gives the error below:

Traceback (most recent call last):
  File "run_on_emr_cluster.py", line 46, in <module>
    steps=[step])
  File "/usr/local/lib/python2.7/dist-packages/boto/emr/connection.py", line 552, in run_jobflow
    bootstrap_action_args = [self._build_bootstrap_action_args(bootstrap_action) for bootstrap_action in bootstrap_actions]
  File "/usr/local/lib/python2.7/dist-packages/boto/emr/connection.py", line 623, in _build_bootstrap_action_args
    bootstrap_action_params['ScriptBootstrapAction.Path'] = bootstrap_action.path
AttributeError: 'str' object has no attribute 'path'

Code below:

from boto.emr.connection import EmrConnection
conn = EmrConnection('...', '...')

from boto.emr.step import StreamingStep
step = StreamingStep(name='mapper1',
    mapper='s3://xxx/mapper1.py',
    reducer='s3://xxx/reducer1.py',
    input='s3://xxx/input/',
    output='s3://xxx/output/')


from boto.emr.bootstrap_action import BootstrapAction
bootstrap_action = BootstrapAction(name='install related packages',path="s3://xxx/bootstrap.sh", bootstrap_action_args=None)

job = conn.run_jobflow(name='emr_test',
    log_uri='s3://xxx/logs',
    master_instance_type='m1.small',
    slave_instance_type='m1.small',
    num_instances=1,
    action_on_failure='TERMINATE_JOB_FLOW',
    keep_alive=False,
    bootstrap_actions='[bootstrap_action]',
    steps=[step])

What's the proper way of passing bootstrap arguments?

Upvotes: 0

Views: 657

Answers (1)

garnaat

Reputation: 45846

You are passing the bootstrap_actions argument as a literal string rather than as a list containing the BootstrapAction object you just created. Try this:

job = conn.run_jobflow(name='emr_test',
    log_uri='s3://xxx/logs',
    master_instance_type='m1.small',
    slave_instance_type='m1.small',
    num_instances=1,
    action_on_failure='TERMINATE_JOB_FLOW',
    keep_alive=False,
    bootstrap_actions=[bootstrap_action],
    steps=[step])

Notice that the `bootstrap_actions` argument is different here: it is now a list containing the object, with no quotes around it.
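To see why the quoted version fails, here is a minimal sketch that runs without boto installed. It assumes, based on the traceback in the question, that boto loops over bootstrap_actions and reads .path from each element. FakeBootstrapAction and collect_paths are hypothetical stand-ins written for this illustration; they are not part of boto.

```python
from collections import namedtuple

# Hypothetical stand-in for boto.emr.bootstrap_action.BootstrapAction,
# so this sketch runs without boto installed.
FakeBootstrapAction = namedtuple(
    "FakeBootstrapAction", ["name", "path", "bootstrap_action_args"]
)


def collect_paths(bootstrap_actions):
    """Imitate the loop shown in the traceback: read .path from each element."""
    return [action.path for action in bootstrap_actions]


action = FakeBootstrapAction(
    "install related packages", "s3://xxx/bootstrap.sh", None
)

# A real list: each element is the object itself, so .path resolves.
good_paths = collect_paths([action])

# A quoted string: iterating over it yields single characters such as '[',
# and a str has no .path attribute.
try:
    collect_paths("[bootstrap_action]")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(good_paths)
print(error_message)
```

Iterating over the string hands the loop one character at a time, which is why the error mentions a 'str' object rather than a list.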

Upvotes: 2
