Bhavuk Chawla

Reputation: 222

How to list RDDs defined in Spark shell?

In both the "spark-shell" and "pyspark" shells, I have created many RDDs, but I cannot find any way to list all the RDDs available in my current Spark shell session.

Upvotes: 7

Views: 2492

Answers (1)

zero323

Reputation: 330413

In Python you can simply try to filter globals by type:

def list_rdds():
    from pyspark import RDD
    # Collect the names of all global variables that are RDD instances
    return [k for (k, v) in globals().items() if isinstance(v, RDD)]

list_rdds()
# []

rdd = sc.parallelize([])
list_rdds()
# ['rdd']

In the Scala REPL you should be able to use `$intp.definedTerms` / `$intp.typeOfTerm` in a similar way.
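As a rough, untested sketch of that idea (this relies on REPL internals: `$intp` is the interpreter instance available only inside the Scala REPL session, so the exact calls may vary between Scala/Spark versions):

```scala
// Inside spark-shell only: $intp is the REPL's interpreter (assumption:
// an IMain-like instance exposing definedTerms and typeOfTerm).
$intp.definedTerms
  .map(_.toString)                                         // names of all terms defined in the session
  .filter(name => $intp.typeOfTerm(name).toString.contains("RDD"))
```

The idea mirrors the Python version: enumerate every term the session has defined, then keep only those whose type mentions `RDD`.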

Upvotes: 8
