AJIT SONAWANE

Reputation: 61

PySpark: How to parse and get field names from a DataFrame schema's StructType object

I have created a DataFrame from a Hive table and want to retrieve its field (column) names.

>>> a = df.schema
>>> a
StructType(List(StructField(empid, IntegerType, true), StructField(empname,StringType, true)))

How can I retrieve the field names (empid, empname) from this object?

Upvotes: 4

Views: 18960

Answers (2)

stack0114106

Reputation: 8791

You can also use df.columns to get the column names as a list. It returns the same names as df.schema.fieldNames():

>>> spark.version
u'2.4.0.cloudera2'
>>>
>>> df = spark.sql("select 10 empid, 's' empname from range(1)")

>>> df.schema
StructType(List(StructField(empid,IntegerType,false),StructField(empname,StringType,false)))

>>> df.schema.fieldNames()
['empid', 'empname']

>>> df.columns
['empid', 'empname']
>>>

Upvotes: 1

user10551349

Reputation: 161

Use pyspark.sql.types.StructType.fieldNames:

fieldNames()

Returns all field names in a list.

>>> from pyspark.sql.types import StructType, StructField, StringType
>>> struct = StructType([StructField("f1", StringType(), True)])
>>> struct.fieldNames()
['f1']

Upvotes: 16
