letsc

Reputation: 2567

How do I list all running EMR clusters using Boto?

How do I list all the running clusters in my AWS account using boto? From the command line I can get them with:

aws emr list-clusters --profile my-profile --region us-west-2 --active

I want to do the same with boto3, but the following code does not return any clusters:

import boto3

session = boto3.Session(profile_name='my-profile')

client = session.client('emr', region_name= 'us-west-2')

response = client.list_clusters(
    ClusterStates=['RUNNING']
)

print(response)

Result:

{u'Clusters': [], 'ResponseMetadata': {'RetryAttempts': 0, 'HTTPStatusCode': 200, 'RequestId': '577f3961-bdc80772f266', 'HTTPHeaders': {'x-amzn-requestid': '577f3961-34e5-11e7-a12a-bdc80772f266', 'date': 'Tue, 09 May 2017 18:28:47 GMT', 'content-length': '15', 'content-type': 'application/x-amz-json-1.1'}}}

Upvotes: 6

Views: 11002

Answers (5)

Lamanus

Reputation: 13541

Here is the paginator solution.

import boto3

# Use a separate name so the boto3 module itself is not shadowed.
session = boto3.session.Session(region_name='ap-northeast-2')
emr = session.client('emr')

page_iterator = emr.get_paginator('list_clusters').paginate(
    ClusterStates=['RUNNING','WAITING']
)

for page in page_iterator:
    for item in page['Clusters']:
        print(item['Id'])

The result is

j-21*****
j-3S*****

Upvotes: 11

rushabh25

Reputation: 41

A single list_clusters call returns at most 50 records. If you have more than that, you need to pass the Marker from each response into the next list_clusters call to page through the full cluster list. After that you can filter the results on name, for example names containing 'something'.
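A minimal sketch of that Marker loop, assuming the client is a boto3 EMR client. The helper names list_all_clusters and filter_by_name are made up for illustration and are not part of boto3.

```python
def list_all_clusters(client, states):
    """Collect clusters from every page by following the Marker token."""
    clusters = []
    kwargs = {"ClusterStates": states}
    while True:
        response = client.list_clusters(**kwargs)
        clusters.extend(response.get("Clusters", []))
        marker = response.get("Marker")
        if not marker:
            # No Marker in the response means this was the last page.
            break
        kwargs["Marker"] = marker
    return clusters


def filter_by_name(clusters, fragment):
    """Keep only clusters whose Name contains the given fragment."""
    return [c for c in clusters if fragment in c.get("Name", "")]
```

With a real client this would be called as list_all_clusters(boto3.client('emr'), ['RUNNING', 'WAITING']). The paginator from the answer above does the same Marker bookkeeping for you.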

Upvotes: 1

Alkesh Mahajan

Reputation: 479

Try this:

import boto3
client = boto3.client("emr")
running_clust = client.list_clusters(ClusterStates=['WAITING'])

print(running_clust)

Pass whichever states you need, for example 'WAITING', 'RUNNING', or 'STARTING'.

Upvotes: 0

Puneetha B M

Reputation: 41

The cluster is initially in the WAITING state. When jobs are running against the cluster, it changes to the RUNNING state. In your case, the call will only return an Id if there is at least one job running on the cluster.

Change the filter to:

ClusterStates=['WAITING', 'RUNNING'] 
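To match the CLI command from the question more closely: as far as I know from the AWS CLI documentation, --active is a shortcut for the states STARTING, BOOTSTRAPPING, RUNNING, WAITING and TERMINATING. Here is a rough sketch that passes that same list to boto3. The names ACTIVE_STATES and list_active_clusters are my own, not part of boto3.

```python
# States covered by the CLI's --active shortcut (taken from the AWS CLI docs;
# treat this list as an assumption and check it against your CLI version).
ACTIVE_STATES = ["STARTING", "BOOTSTRAPPING", "RUNNING", "WAITING", "TERMINATING"]


def list_active_clusters(client):
    """Return (cluster id, state) pairs for every active cluster."""
    paginator = client.get_paginator("list_clusters")
    found = []
    for page in paginator.paginate(ClusterStates=ACTIVE_STATES):
        for cluster in page["Clusters"]:
            found.append((cluster["Id"], cluster["Status"]["State"]))
    return found
```

Pass it the same client as in the question, i.e. session.client('emr', region_name='us-west-2').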

Upvotes: 4

Adam Owczarczyk

Reputation: 2862

From the docs:

Provides the status of all clusters visible to this AWS account.

This means you probably don't have access to list those clusters with the session credentials you provided. Try the credentials that the AWS CLI is using and see if that works.
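One way to check this, as a sketch: ask STS which identity the boto3 session resolves to and compare it with the output of aws sts get-caller-identity --profile my-profile. The helper name whoami is hypothetical. It assumes the session is a boto3.Session and that the credentials are allowed to call STS.

```python
def whoami(session):
    """Report the account, ARN and default region the session resolves to."""
    identity = session.client("sts").get_caller_identity()
    return {
        "Account": identity["Account"],
        "Arn": identity["Arn"],
        # None here means the session has no default region configured.
        "Region": session.region_name,
    }
```

For the question's setup that would be whoami(boto3.Session(profile_name='my-profile')). If the account or ARN differs from what the CLI reports, the two are not using the same credentials.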

Upvotes: 0
