Reputation: 81
We have a PostgreSQL table in which one of the columns has type UUID. How do we write a UUID field from a Spark Dataset (using Java) to the PostgreSQL DB? We cannot find a uuid type in org.apache.spark.sql.types.DataTypes.
Please advise.
Upvotes: 8
Views: 6156
Reputation: 5312
As already pointed out, despite these resolved issues (10186, 5753) there is still no supported uuid Postgres data type as of Spark 2.3.0.
However, there's a workaround: use Spark's SaveMode.Append and set the Postgres JDBC property stringtype to unspecified, so that the driver sends string values untyped and the server infers the uuid type from the target column. In short, it works like:
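import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions

val props = Map(
  JDBCOptions.JDBC_DRIVER_CLASS -> "org.postgresql.Driver", // the "driver" option
  "url" -> url,
  "user" -> user,
  // send strings untyped so Postgres can cast them to the column type (uuid)
  "stringtype" -> "unspecified"
)
// Append writes into the existing table instead of letting Spark create it
yourData.write.mode(SaveMode.Append)
  .format("jdbc")
  .options(props)
  .option("dbtable", tableName)
  .save()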
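Since the question asks about Java, here is a minimal sketch of the same connection settings as a java.util.Properties object. The class name UuidJdbcProps, the build helper, and the password property are illustrative assumptions that do not appear in the original answer; only the stringtype=unspecified setting and the driver class are taken from it.

```java
import java.util.Properties;

// Hypothetical helper, not part of Spark or the PostgreSQL driver.
class UuidJdbcProps {

    // Builds the connection properties used for the write.
    static Properties build(String user, String password) {
        Properties props = new Properties();
        props.setProperty("driver", "org.postgresql.Driver");
        props.setProperty("user", user);
        // Assumption: the database requires a password.
        props.setProperty("password", password);
        // Send strings untyped so PostgreSQL can cast them to uuid.
        props.setProperty("stringtype", "unspecified");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("spark_user", "secret");
        System.out.println(props.getProperty("stringtype"));
    }
}
```

With Spark on the classpath, these properties could then be passed to the writer, for example yourData.write().mode(SaveMode.Append).jdbc(url, tableName, props), where the uuid values are held in an ordinary String column of the Dataset.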
The table should already exist, with the uuid column defined with type uuid. If you try to have Spark 2.3.0 create this table though, you will again hit a wall:
yourData.write.mode(SaveMode.Overwrite)
  .format("jdbc")
  .options(props)
  .option("dbtable", tableName)
  .option("createTableColumnTypes", "some_uuid_column_name uuid")
  .save()
Result:
DataType uuid is not supported.(line 1, pos 21)
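Because Spark cannot create the uuid column itself, one option is to create the table over plain JDBC before the Append write. The following is only a sketch under assumptions: the class UuidTableSetup, its method names, and the single-column layout are hypothetical, and the identifiers are concatenated without escaping, so they must come from trusted input.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper for pre-creating the target table.
class UuidTableSetup {

    // Builds the DDL; table and column names are assumed to be trusted.
    static String createTableSql(String table, String uuidColumn) {
        return "CREATE TABLE IF NOT EXISTS " + table
                + " (" + uuidColumn + " uuid)";
    }

    // Runs the DDL against PostgreSQL; requires a reachable database
    // and the PostgreSQL JDBC driver on the classpath.
    static void createTable(String url, String user, String password,
                            String table, String uuidColumn) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement()) {
            stmt.execute(createTableSql(table, uuidColumn));
        }
    }

    public static void main(String[] args) {
        // Only prints the statement; no database connection is opened here.
        System.out.println(createTableSql("events", "some_uuid_column_name"));
    }
}
```

Once the table exists, the SaveMode.Append write shown earlier in this answer can target it.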
Upvotes: 6
Reputation:
Yes, you are right, there is no UUID data type in Spark SQL. Treating the values as String should work, because the connector will convert the String to UUID.
I haven't tried with PostgreSQL, but when I used Cassandra (and Scala) it worked perfectly.
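To illustrate the string approach in Java, here is a small sketch that converts between java.util.UUID and the String form that would be stored in the Dataset column. The class and method names are hypothetical. Note that UUID.fromString accepts some non-canonical inputs, so this is a convenience round trip, not a strict validator.

```java
import java.util.UUID;

// Hypothetical helper for keeping UUIDs as strings in a Spark column.
class UuidStrings {

    // The canonical lowercase text form, suitable for a String column.
    static String toColumnValue(UUID id) {
        return id.toString();
    }

    // Parses a column value back into a UUID; throws
    // IllegalArgumentException if the text cannot be parsed.
    static UUID fromColumnValue(String value) {
        return UUID.fromString(value);
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        String text = toColumnValue(id);
        // Round trip: parsing the text yields an equal UUID.
        System.out.println(fromColumnValue(text).equals(id));
    }
}
```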
Upvotes: 2