user3610141

Reputation: 305

pyspark AttributeError: 'DataFrame' object has no attribute 'toDF'

The following code used to work for me, but it no longer does. I now get this error:

AttributeError: 'DataFrame' object has no attribute 'toDF'

from pyspark import SparkContext
from pyspark.sql import SQLContext

if __name__ == "__main__":
  sc = SparkContext(appName="test")
  sqlContext = SQLContext(sc)

  df = sqlContext.read.format('com.databricks.spark.csv').\
    options(header='false',delimiter=',',inferSchema='true').load('test')

  ### rename columns
  df = df.toDF('a','b','c')
  ...
  sc.stop()

Upvotes: 3

Views: 15591

Answers (2)

Hamid Ali

Reputation: 13

If you are working with Spark 1.6, use this code to convert an RDD into a DataFrame:

from pyspark.sql import SQLContext, Row
sqlContext = SQLContext(sc)
df = sqlContext.createDataFrame(rdd)

If you want to give the columns names, map each record to a Row before creating the DataFrame:

df = sqlContext.createDataFrame(rdd.map(lambda p: Row(ip=p[0], time=p[1], zone=p[2])))

ip, time, and zone are the column names in this example.

Upvotes: 0

user3610141

Reputation: 305

I figured it out. It looks like the problem was our Spark version: the same code works with Spark 1.6.
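
For anyone stuck on an older Spark where DataFrame.toDF is not available, here is a minimal sketch of one workaround: renaming the columns one at a time with withColumnRenamed, which I believe has been in PySpark since 1.3. The helper name rename_columns is my own and not part of PySpark, and it assumes the new names do not clash with any existing column names.

```python
from functools import reduce


def rename_columns(df, new_names):
    """Rename every column of `df`, in order, to the names in `new_names`.

    Hypothetical helper, not a PySpark API. It relies only on `df.columns`
    and `df.withColumnRenamed(old, new)`.
    Assumes the new names do not clash with existing column names.
    """
    old_names = list(df.columns)
    new_names = list(new_names)
    if len(old_names) != len(new_names):
        raise ValueError(
            "expected %d names, got %d" % (len(old_names), len(new_names))
        )
    # Apply one rename per column, passing the result of each step to the next.
    return reduce(
        lambda acc, pair: acc.withColumnRenamed(pair[0], pair[1]),
        zip(old_names, new_names),
        df
    )
```

In the script from the question, this would stand in for the toDF line, for example df = rename_columns(df, ['a', 'b', 'c']).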

Upvotes: 1
