Reputation: 831
I am running Spark 1.5.1. On startup I have a HiveContext
available as sqlContext
but I set
sqlContext2 = SQLContext(sc)
I create a pipelined RDD by parsing a list of strings to JSON
data = points.map(lambda line: json.loads(line))
I then try to convert this into a dataframe using
DF = sqlContext2.createDataFrame(data).collect()
This runs perfectly, but when I run type(DF)
it says that it is a list.
How is this possible? How is a list coming out of createDataFrame()?
Upvotes: 1
Views: 1255
Reputation: 40380
That's because when you call collect()
on a DataFrame, it returns a list containing all of the elements (Rows) in that DataFrame.
If you want just a DataFrame, df = sqlContext.createDataFrame(data)
is enough.
There is no need for sqlContext2
here.
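To make the difference concrete, here is a minimal sketch that keeps the DataFrame and the collected rows in separate variables. It assumes a running PySpark shell where sc and sqlContext already exist; the helper names parse_lines and build_dataframe and the sample JSON strings are made up for illustration and are not from the question.

```python
import json


def parse_lines(lines):
    """Parse an iterable of JSON strings into a list of dicts (plain Python, no Spark)."""
    return [json.loads(line) for line in lines]


def build_dataframe(sc, sql_context, lines):
    """Hypothetical helper: return both the DataFrame and its collected rows.

    Assumes `sc` is a SparkContext and `sql_context` is the SQLContext or
    HiveContext that the shell provides as `sqlContext`.
    """
    # Same parsing step as in the question: one dict per JSON string.
    data = sc.parallelize(lines).map(lambda line: json.loads(line))
    df = sql_context.createDataFrame(data)  # a DataFrame, still distributed
    rows = df.collect()                     # a plain Python list of Row objects
    return df, rows


# Hypothetical sample input, only to show the shape of the data.
sample = ['{"name": "a", "value": 1}', '{"name": "b", "value": 2}']
parsed = parse_lines(sample)
```

In the shell, df, rows = build_dataframe(sc, sqlContext, sample) should leave df as a DataFrame (so df.show() and df.printSchema() work) and rows as a list. Note that Spark 1.x may print a deprecation warning when inferring a schema from dicts, so converting each dict to a pyspark.sql.Row first is an option.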
Upvotes: 1