horatio1701d

Reputation: 9169

Custom Data Types for DataFrame columns when using Spark JDBC

I know I can use a custom dialect to get a correct mapping between my database and Spark, but how can I create a custom table schema with specific field data types and lengths when I use Spark's jdbc write options? I would like to have granular control over my table schemas when I load a table from Spark.

Upvotes: 4

Views: 5390

Answers (2)

DataBassDrop

Reputation: 11

https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html

You can use the createTableColumnTypes option.

Here is the example from the documentation.

Specifying create table column data types on write

jdbcDF.write \
    .option("createTableColumnTypes", "name CHAR(64), comments VARCHAR(1024)") \
    .jdbc("jdbc:postgresql:dbserver", "schema.tablename",
          properties={"user": "username", "password": "password"})
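Note that createTableColumnTypes only affects the CREATE TABLE statement Spark issues when it creates the target table, and the specified types must be valid Spark SQL data types; it does not change the schema of a table that already exists.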

Upvotes: 1

Alper t. Turker

Reputation: 35249

There is minimal flexibility for writes, provided by the createTableColumnTypes option, but if you want

to have granular control over my table schemas when I load a table from Spark.

you might have to implement your own JdbcDialect. It is an internal developer API and, as far as I can tell, it is not pluggable, so you may need customized Spark binaries (it might be possible to use JdbcDialects.registerDialect, but I haven't tried this).
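For illustration, here is a minimal sketch of such a dialect in Scala, assuming a PostgreSQL target; the MyPostgresDialect name and the specific type mappings are invented for the example, and registering it through JdbcDialects.registerDialect is the untried route mentioned above.

import java.sql.Types

import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects, JdbcType}
import org.apache.spark.sql.types._

// Illustrative dialect; adjust canHandle and the type mappings to your database.
object MyPostgresDialect extends JdbcDialect {

  // Claim the JDBC URLs this dialect should handle.
  override def canHandle(url: String): Boolean =
    url.startsWith("jdbc:postgresql")

  // Spark -> database column types used when Spark creates the table.
  override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
    case StringType  => Some(JdbcType("VARCHAR(255)", Types.VARCHAR))
    case BooleanType => Some(JdbcType("BOOLEAN", Types.BOOLEAN))
    case _           => None // fall back to Spark's default mapping
  }

  // Database -> Spark types used on read.
  override def getCatalystType(
      sqlType: Int,
      typeName: String,
      size: Int,
      md: MetadataBuilder): Option[DataType] =
    if (typeName == "bpchar") Some(StringType) else None
}

// Register so Spark consults this dialect for matching JDBC URLs.
JdbcDialects.registerDialect(MyPostgresDialect)

Whether this takes effect without patched binaries depends on your Spark version, so treat it as a starting point rather than a guaranteed solution.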

Upvotes: 3
