Reputation: 291
I am importing data into Spark from MySQL through JDBC, and one of the columns has a TIME type (SQL type TIME, JDBC type java.sql.Time) with large hour values (e.g. 168:03:01). Spark converts them to timestamp format, which causes an error when reading a three-digit hour. How do I deal with the TIME type in Spark?
Upvotes: 4
Views: 1337
Reputation: 330303
Probably your best shot at the moment is to cast the data before it is actually read by Spark and parse it in your application. The JDBC data source allows you to pass a valid subquery as the dbtable option or table argument, which means you can do something like this:
sqlContext.read.format("jdbc").options(Map(
"url" -> "xxxx",
"dbtable" -> "(SELECT some_field, CAST(time_field AS TEXT) FROM table) tmp",
))
and then use a combination of built-in functions to convert it in Spark to a type that suits your application.
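For example, here is a minimal sketch assuming the cast column arrives as a string such as "168:03:01" (df and time_field are the placeholder names from the read above): split the string on ":" and convert it to a total number of seconds.

import org.apache.spark.sql.functions.{col, split}

// time_field is a string like "168:03:01"; hours may exceed 24
val parts = split(col("time_field"), ":")

val converted = df.withColumn(
  "time_seconds",
  parts.getItem(0).cast("long") * 3600 +  // hours, possibly three digits
  parts.getItem(1).cast("long") * 60 +    // minutes
  parts.getItem(2).cast("long")           // seconds
)

A long column of seconds sidesteps the timestamp conversion entirely while remaining easy to compare, sort, and aggregate.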
Upvotes: 2