Reputation: 6462
I'm trying to connect spark to a PostgreSQL database using Postgis.
I'm connecting it that way:
val jdbcDF = sc.read.format("jdbc").options(
Map("url" -> "jdbc:postgresql://dbHost:5432/ticketapp?user=user&password=password",
"dbtable" -> "events",
"columnNames" -> "name")
).load()
The problem is that Spark doesn't know the Geometry
type. (I get the following error: SQLException: Unsupported type 1111
)
Is there a way to make it work?
If not, is there a way to still connect to this table without using the problematic field that is of type geometry? Can I exclude this field in order to make it work without having to make a dump of the table without this field?
Upvotes: 2
Views: 1285
Reputation: 1163
Unfortunately, Spark SQL does not know about Geometry type [1, 2]. If you do not need the problematic field, you can exclude it.
With jdbc, you can specify a subquery followed by a temporary table name instead of only a table name [3]. This is useful if you wish to select a subset of your table, or in this case exclude a field. Since there is no way in SQL to select all except some columns in a table [4], you will need to list explicitly the columns you want to keep, like this:
val jdbcDF = sc.read.format("jdbc").options(
Map("url" -> "jdbc:postgresql://dbHost:5432/ticketapp?user=user&password=password",
"dbtable" -> "(select col1,col2 from events) tmp")
).load()
[1] https://issues.apache.org/jira/browse/SPARK-11526
[2] https://github.com/harsha2010/magellan/issues/31
[3] http://metricbrew.com/get-data-from-databases-with-apache-spark-jdbc/
[4] http://www.postgresonline.com/journal/archives/41-How-to-SELECT-ALL-EXCEPT-some-columns-in-a-table.html
Upvotes: 4