user3331966

Reputation: 152

Converting string list to Python dataframe - pyspark python sparksql

I have the following Python / Pyspark code:

sql_command = ''' query '''
df = spark.sql(sql_command)
ls_colnames = df.schema.names
ls_colnames
# ['id', 'level1', 'level2', 'level3', 'specify_facts']

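from pyspark.sql.types import StructType, StructField, StringType

# Schema with a single non-nullable string column
cSchema = StructType([
    StructField("colname", StringType(), False)
])
df_colnames = spark.createDataFrame(dataset_array, schema=cSchema)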

  File "/opt/mapr/spark/spark-2.1.0/python/pyspark/sql/types.py", line 1366, in _verify_type
    raise TypeError("StructType can not accept object %r in type %s" % (obj, type(obj)))
TypeError: StructType can not accept object 'id' in type <class 'str'>

What can I do to get the column names as a Spark DataFrame?

Upvotes: 1

Views: 3027

Answers (1)

Neeraj Bhadani

Reputation: 3110

I am not sure I have understood your question correctly, but if you are trying to create a DataFrame from the given list, you can use the code below.

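from pyspark.sql import Row

# sc and sqlContext are assumed to be an existing SparkContext and SQLContext
# (the pyspark shell defines both)
l = ['id', 'level1', 'level2', 'level3', 'specify_facts']
rdd1 = sc.parallelize(l)
# Wrap each string in a Row so that it becomes a one-column record
row_rdd = rdd1.map(lambda x: Row(x))
sqlContext.createDataFrame(row_rdd, ['col_name']).show()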
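
As for the TypeError in the question: a StructType schema describes a whole row, so each element passed to createDataFrame has to be row-like (a tuple, list, or Row), and a bare string such as 'id' is rejected. Below is a minimal sketch that keeps the schema from the question and wraps each name in a one-element tuple. The helper names to_rows and colnames_to_df are made up for illustration, and spark is assumed to be an existing SparkSession.

```python
def to_rows(names):
    """Wrap each column name in a one-element tuple so it counts as a row."""
    return [(name,) for name in names]


def colnames_to_df(spark, names):
    """Build a one-column DataFrame from a list of strings.

    `spark` is assumed to be an existing SparkSession; the function name
    and the "colname" column are illustrative choices, not a Spark API.
    """
    # Imported here so that to_rows() can be tried without Spark installed.
    from pyspark.sql.types import StructType, StructField, StringType

    schema = StructType([StructField("colname", StringType(), False)])
    return spark.createDataFrame(to_rows(names), schema=schema)


ls_colnames = ['id', 'level1', 'level2', 'level3', 'specify_facts']
rows = to_rows(ls_colnames)
print(rows[:2])  # [('id',), ('level1',)]
```

With a running session this could be called as colnames_to_df(spark, df.schema.names).show(). Passing an atomic type instead of a StructType, as in spark.createDataFrame(ls_colnames, StringType()), should also work, but the resulting column is then named value rather than colname.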

Hope it Helps.

Regards,

Neeraj

Upvotes: 3
