Reputation: 503
I'm using Spark 1.3.1.
I am trying to view the values of a Spark dataframe column in Python. With a Spark dataframe I can do `df.collect()` to view the contents of the dataframe, but as best as I can see there is no such method for a Spark dataframe column.
For example, the dataframe `df` contains a column named `'zip_code'`. I can do `df['zip_code']` and it returns a `pyspark.sql.dataframe.Column` type, but I can't find a way to view the values in `df['zip_code']`.
Upvotes: 50
Views: 146161
Reputation: 432
You can simply write:
`df.select('your_column_name').show()`
In your case here, it will be:
`df.select('zip_code').show()`
Upvotes: 10
Reputation: 22711
To view the complete (untruncated) content, take the rows and print them yourself; in Python:
`for row in df.select("raw").take(1): print(row)`
(`show` will only give you a truncated overview.)
Upvotes: 5
Reputation: 330453
You can access the underlying `RDD` and map over it:
`df.rdd.map(lambda r: r.zip_code).collect()`
You can also use `select` if you don't mind the results being wrapped in `Row` objects:
df.select('zip_code').collect()
Finally, if you simply want to inspect the content, the `show` method should be enough:
df.select('zip_code').show()
Upvotes: 60