Reputation: 455
Is there a way to dynamically create tables with a given schema from a PySpark DataFrame, like we do with a pandas DataFrame's to_sql method?
In other words, can we create a table with a given schema from a PySpark DataFrame?
df.write.jdbc("jdbc:postgresql://localhost:5432/postgres", "sample_data1", mode="overwrite", properties=prop)
In the above code, how can we specify the schema used to generate the table we want?
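(Here prop is the usual JDBC connection properties dict; a minimal sketch, with placeholder credentials:)
prop = {
    "user": "username",                 # placeholder credentials
    "password": "password",
    "driver": "org.postgresql.Driver",  # PostgreSQL JDBC driver class
}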
Upvotes: 2
Views: 4178
Reputation: 6338
I think you are searching for these options:
createTableOptions
This is a JDBC writer related option. If specified, this option allows setting of database-specific table and partition options when creating a table (e.g., CREATE TABLE t (name string) ENGINE=InnoDB). This option applies only to writing.
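For example, a minimal sketch (not from the original answer): since ENGINE=InnoDB is a MySQL clause, this assumes a MySQL target, with a placeholder URL, table name, and credentials.
# Hypothetical example: pass MySQL-specific options for the generated CREATE TABLE
df.write \
    .option("createTableOptions", "ENGINE=InnoDB DEFAULT CHARSET=utf8mb4") \
    .jdbc("jdbc:mysql://localhost:3306/mydb", "sample_data1",
          mode="overwrite",
          properties={"user": "username", "password": "password",
                      "driver": "com.mysql.cj.jdbc.Driver"})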
createTableColumnTypes
The database column data types to use instead of the defaults, when creating the table. Data type information should be specified in the same format as CREATE TABLE columns syntax (e.g., "name CHAR(64), comments VARCHAR(1024)"). The specified types should be valid Spark SQL data types. This option applies only to writing.
# Specifying create table column data types on write
jdbcDF.write \
    .option("createTableColumnTypes", "name CHAR(64), comments VARCHAR(1024)") \
    .jdbc("jdbc:postgresql:dbserver", "schema.tablename",
          properties={"user": "username", "password": "password"})
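Combining this with the snippet from your question (a sketch; the column list name VARCHAR(128), value INTEGER is a hypothetical schema, and prop is the properties dict from the question):
# Overwrite drops and recreates the table, applying the given column types
df.write \
    .option("createTableColumnTypes", "name VARCHAR(128), value INTEGER") \
    .jdbc("jdbc:postgresql://localhost:5432/postgres", "sample_data1",
          mode="overwrite", properties=prop)
With mode="overwrite", Spark drops any existing table and recreates it, so the resulting table is generated with the column types you specified.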
Ref - Spark documentation: JDBC To Other Databases
Upvotes: 3