Venu

Reputation: 81

Can't find uuid in org.apache.spark.sql.types.DataTypes

We have a PostgreSQL table that has a UUID column. How do we send a UUID field from a Spark dataset (using Java) to the PostgreSQL DB? We cannot find a uuid type in org.apache.spark.sql.types.DataTypes.

Please advise.

Upvotes: 8

Views: 6156

Answers (2)

ecoe

Reputation: 5312

As already pointed out, despite these resolved issues (10186, 5753) there is still no supported uuid Postgres data type as of Spark 2.3.0.

However, there is a workaround: write with Spark's SaveMode.Append and set the Postgres JDBC driver property stringtype to unspecified, so that string values are sent untyped and the server infers the target type (here, uuid). In short, it works like this:

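    import org.apache.spark.sql.SaveMode
    import org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions

    val props = Map(
      JDBCOptions.JDBC_DRIVER_CLASS -> "org.postgresql.Driver", // the "driver" option
      "url" -> url,
      "user" -> user,
      // pgjdbc setting: send strings untyped so the server can cast them to uuid
      "stringtype" -> "unspecified"
    )

    // Append into the table, which must already exist with its uuid column
    yourData.write.mode(SaveMode.Append)
      .format("jdbc")
      .options(props)
      .option("dbtable", tableName)
      .save()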
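
Since the question asks about Java, here is a minimal sketch of the same connection properties built with java.util.Properties. The class name, method name, and the password entry are assumptions for illustration and are not part of the snippet above. The Spark call itself appears only as a comment, because it needs Spark on the classpath.

```java
import java.util.Properties;

// Hypothetical helper; the class and method names are illustrative only.
class PostgresUuidWriteProps {

    // Builds the JDBC connection properties used by the workaround.
    static Properties build(String user, String password) {
        Properties props = new Properties();
        props.setProperty("driver", "org.postgresql.Driver");
        props.setProperty("user", user);
        // Assumption: the database requires a password (not shown in the Scala snippet).
        props.setProperty("password", password);
        // pgjdbc setting: strings are sent untyped so the server can cast them to uuid.
        props.setProperty("stringtype", "unspecified");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("spark_user", "secret");
        System.out.println(props.getProperty("stringtype"));
        // With Spark on the classpath, the write would look roughly like:
        // dataset.write().mode(SaveMode.Append).jdbc(url, tableName, props);
    }
}
```

The properties object can be passed to DataFrameWriter.jdbc(url, table, connectionProperties); the uuid values stay as plain strings in the dataset.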

The table should be created beforehand, with the uuid column already defined as type uuid. If you try to have Spark 2.3.0 create the table instead, you will hit a wall again:

    yourData.write.mode(SaveMode.Overwrite)
        .format("jdbc")
        .options(props)
        .option("dbtable", tableName)
        .option("createTableColumnTypes", "some_uuid_column_name uuid")
        .save()

Result:

    DataType uuid is not supported.(line 1, pos 21)

Upvotes: 6

user4078581

Yes, you are right, there is no UUID datatype in SparkSQL. Treating them as String should work because the connector will convert the String to UUID.

I haven't tried with PostgreSQL, but when I used Cassandra (and Scala) it worked perfectly.
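
If the column is kept as a String, it may help to normalize each value before writing, so that malformed input fails early rather than at insert time. The following is a small sketch using only java.util.UUID; the class and method names are hypothetical.

```java
import java.util.UUID;

// Hypothetical helper; names are illustrative only.
class UuidStrings {

    // Parses the input with java.util.UUID and returns its lowercase
    // 36-character string form. Throws IllegalArgumentException when the
    // input cannot be parsed as a UUID.
    static String canonical(String raw) {
        return UUID.fromString(raw.trim()).toString();
    }

    public static void main(String[] args) {
        System.out.println(canonical("123E4567-E89B-12D3-A456-426614174000"));
    }
}
```

Note that UUID.fromString is fairly lenient about group lengths on some Java versions, so this is a normalization step more than a strict format check.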

Upvotes: 2
