uttam

Reputation: 455

PySpark DataFrame: write JDBC with dynamic creation of a table with a given schema

Is there a way to dynamically create a table with a given schema from a PySpark DataFrame, the way we can with the pandas DataFrame's to_sql method?

Similarly, can we create a table with a given schema when writing a PySpark DataFrame over JDBC?

df.write.jdbc("jdbc:postgresql://localhost:5432/postgres", "sample_data1", mode="overwrite", properties=prop)

In the above code, how can we specify the schema used to generate the table?

Upvotes: 2

Views: 4178

Answers (1)

Som

Reputation: 6338

I think you are searching for these options:

createTableOptions

This is a JDBC writer-related option. If specified, it allows setting database-specific table and partition options when creating a table (e.g., CREATE TABLE t (name string) ENGINE=InnoDB). This option applies only to writing.
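For instance, a minimal sketch (assuming a MySQL target; the connection details and table name here are hypothetical) might look like:

# Hypothetical sketch: pass storage-engine clauses through createTableOptions;
# Spark appends this string to the CREATE TABLE statement it generates.
df.write \
    .option("createTableOptions", "ENGINE=InnoDB DEFAULT CHARSET=utf8") \
    .jdbc("jdbc:mysql://localhost:3306/mydb", "sample_table",
          mode="overwrite",
          properties={"user": "username", "password": "password"})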

createTableColumnTypes

The database column data types to use instead of the defaults when creating the table. Data type information should be specified in the same format as CREATE TABLE columns syntax (e.g., "name CHAR(64), comments VARCHAR(1024)"). The specified types should be valid Spark SQL data types. This option applies only to writing.

Example:

# Specifying create table column data types on write
jdbcDF.write \
    .option("createTableColumnTypes", "name CHAR(64), comments VARCHAR(1024)") \
    .jdbc("jdbc:postgresql:dbserver", "schema.tablename",
          properties={"user": "username", "password": "password"})

Ref: Spark documentation, "JDBC To Other Databases"

Upvotes: 3
