Reputation: 305
The following code used to work for me, but it no longer does. I now get this error:
AttributeError: 'DataFrame' object has no attribute 'toDF'
from pyspark import SparkContext
from pyspark.sql import SQLContext

if __name__ == "__main__":
    sc = SparkContext(appName="test")
    sqlContext = SQLContext(sc)
    df = sqlContext.read.format('com.databricks.spark.csv').\
        options(header='false', delimiter=',', inferSchema='true').load('test')
    ### rename columns
    df = df.toDF('a', 'b', 'c')
    ...
    sc.stop()
Upvotes: 3
Views: 15591
Reputation: 13
If you are working with Spark 1.6, use this code to convert an RDD into a DataFrame:
from pyspark.sql import SQLContext, Row
sqlContext = SQLContext(sc)
df = sqlContext.createDataFrame(rdd)
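If you want to assign names to the columns, map each record to a Row first and then build the DataFrame from the result:
df = sqlContext.createDataFrame(rdd.map(lambda p: Row(ip=p[0], time=p[1], zone=p[2])))
Here ip, time, and zone are the column names.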
Upvotes: 0
Reputation: 305
I figured it out. It looks like it has to do with our Spark version: the code works with 1.6.
Upvotes: 1
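If upgrading the cluster is not an option, one possible workaround is to rename the columns without relying on `toDF` on the DataFrame. The sketch below is only an illustration: `rename_columns` is a hypothetical helper name, not part of PySpark. It assumes the object exposes `columns` and `withColumnRenamed`, and that none of the new names collide with an old name that has not been renamed yet.

```python
def rename_columns(df, new_names):
    """Rename every column of a Spark DataFrame by position.

    Hypothetical helper: uses df.toDF(*names) when the attribute exists,
    otherwise chains withColumnRenamed one column at a time.
    """
    old_names = list(df.columns)
    if len(old_names) != len(new_names):
        raise ValueError(
            "expected %d column names, got %d" % (len(old_names), len(new_names))
        )

    # Preferred path: the installed Spark version provides DataFrame.toDF.
    if hasattr(df, "toDF"):
        return df.toDF(*new_names)

    # Fallback path: rename columns one by one.
    # Assumption: a new name never equals a not-yet-renamed old name.
    for old, new in zip(old_names, new_names):
        df = df.withColumnRenamed(old, new)
    return df
```

In the question's script, the failing line would then become `df = rename_columns(df, ['a', 'b', 'c'])`.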