Federico Gentile

Reputation: 5940

How to get schema without loading table data in Databricks?

I am working on Databricks and I use Spark to load and publish data to a SQL database. One of the tasks I need to do is to get the schema of a table in my database and therefore see the data types of each column. The only way I have been able to do it so far is by loading the whole table and then extracting the schema.

df_tableA = spark.read.format("jdbc") \
        .option("url", datasource_url) \
        .option("dbtable", table_name) \
        .option("user", dbuser) \
        .option("password", dbpassword) \
        .option("driver", driver) \
        .load()

However, my goal is to get just the schema without loading the entire table, since I want to speed up the process and avoid overloading the memory.

Would you be able to suggest a smart and elegant way to achieve my goal?

Upvotes: 0

Views: 4056

Answers (1)

pltc

Reputation: 6082

Normally, load does not pull the table into memory: Spark reads lazily and only fetches rows when an action runs. But if you want to be explicit about it, you can pass a dummy query to dbtable, like .option("dbtable", "(select * from table where 1 = 2) t"), so the database returns no rows and Spark still infers the schema.
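For example, a minimal sketch reusing the connection options from the question (it assumes datasource_url, table_name, dbuser, dbpassword and driver are already defined as in your snippet):

# Push down a query that returns no rows, so only the schema is fetched.
schema_query = f"(select * from {table_name} where 1 = 2) t"

df_schema_only = spark.read.format("jdbc") \
        .option("url", datasource_url) \
        .option("dbtable", schema_query) \
        .option("user", dbuser) \
        .option("password", dbpassword) \
        .option("driver", driver) \
        .load()

# The DataFrame is empty, but its schema reflects the table's column types.
df_schema_only.printSchema()
schema = df_schema_only.schema

Because the pushed-down query matches no rows, the database does almost no work, and you still get the same schema object you would get from loading the full table.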

Upvotes: 3
