SK15

Reputation: 43

How to extract column name and data types from Glue Dynamic Dataframe?

I am trying to extract the column names and data types from a Glue dynamic frame so that I can use them in Spark SQL. For example:

persons = glueContext.create_dynamic_frame.from_catalog(
    database="legislators",
    table_name="customer_table")

persons.printSchema()

The output is

root
|-- cust_no: long
|-- name: string
|-- address: string
|-- zip: long

How can I extract the column names and data types from the dynamic frame? I want to trim only the string columns, not the longs, and then use the columns in Spark SQL:

spark.sql(""" SELECT cust_no, trim(name),trim(address),zip....""")

Please advise how to achieve this.

Upvotes: 2

Views: 7454

Answers (1)

ruifgmonteiro

Reputation: 719

You can convert it to a Spark DataFrame and read its dtypes attribute.

persons.toDF().dtypes

This gives you a list of tuples, each holding a column name and its data type. Note that dtypes reports long columns as bigint, even though printSchema() shows them as long:

[('cust_no', 'bigint'), ('name', 'string'), ('address', 'string'), ('zip', 'bigint')]
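To get from that list to the query in the question, one option is to build the SELECT clause from the tuples and wrap only the string columns in trim(). The following is a minimal sketch, not a definitive implementation: the helper name build_trimmed_select and the temp view name "persons" are assumptions made for illustration, and the dtypes list is hard-coded here so the snippet runs without a Glue or Spark session.

```python
def build_trimmed_select(dtypes, table_name):
    """Build a SELECT that trims string columns and passes the rest through.

    dtypes is a list of (column_name, type_name) tuples, in the shape
    returned by a Spark DataFrame's dtypes attribute.
    """
    select_items = []
    for col_name, col_type in dtypes:
        if col_type == "string":
            # Trim string columns and keep the original column name.
            select_items.append(f"trim(`{col_name}`) AS `{col_name}`")
        else:
            # Leave non-string columns (e.g. bigint) untouched.
            select_items.append(f"`{col_name}`")
    return f"SELECT {', '.join(select_items)} FROM {table_name}"


# Sample input mirroring the table in the question (hard-coded for illustration).
dtypes = [
    ("cust_no", "bigint"),
    ("name", "string"),
    ("address", "string"),
    ("zip", "bigint"),
]

query = build_trimmed_select(dtypes, "persons")
print(query)

# In a Glue job this would be wired up roughly as follows
# (the view name "persons" is an assumption):
#   df = persons.toDF()
#   df.createOrReplaceTempView("persons")
#   trimmed = spark.sql(build_trimmed_select(df.dtypes, "persons"))
```

The helper only compares the type name against "string", so it works regardless of how the other types are spelled.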

Upvotes: 1
