squad21

Reputation: 73

Connect to a local PostgreSQL database using Spark and Scala

I am running Scala version 2.12.1. Using IntelliJ, how can I connect to my local PostgreSQL database with Spark and run SQL commands to manipulate tables? I am running into a lot of version conflicts, so would it also be possible to include the dependencies?

Upvotes: 3

Views: 6821

Answers (1)

oh54

Reputation: 498

I suggest you use the latest Spark, i.e. 2.2.0. For what you want to do you need the spark-core, spark-sql and PostgreSQL JDBC driver dependencies.

For Spark use these two:

https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11/2.2.0

https://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.11/2.2.0

For the postgresql driver this one will likely do fine:

https://mvnrepository.com/artifact/org.postgresql/postgresql/9.4.1212
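
If you are using sbt, a minimal build.sbt pulling in those three artifacts could look like the sketch below. Note that the Spark 2.2.0 artifacts linked above are published for Scala 2.11, so the Scala version here is an assumption that differs from the 2.12.1 mentioned in the question:

// build.sbt -- minimal sketch; Spark 2.2.0 is published for Scala 2.11, not 2.12
scalaVersion := "2.11.11"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.2.0",
  "org.apache.spark" %% "spark-sql"  % "2.2.0",
  "org.postgresql"   %  "postgresql" % "9.4.1212"
)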

Spark can connect to relational databases through JDBC; there is a section on this in the Spark documentation: https://spark.apache.org/docs/latest/sql-programming-guide.html#jdbc-to-other-databases

From the same documentation:

// Loading data from a JDBC source
val jdbcDF = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://host/database")
  .option("dbtable", "schema.tablename")
  .option("user", "username")
  .option("password", "password")
  .load()

Obviously you would need to use the URL that points at your own database; for the PostgreSQL connection URL format see https://jdbc.postgresql.org/documentation/80/connect.html
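
Putting it together, a minimal self-contained sketch run from IntelliJ might look like the following. The host, port, database, table and credential values are placeholders you would replace with your own:

import org.apache.spark.sql.SparkSession

object PostgresExample {
  def main(args: Array[String]): Unit = {
    // Local SparkSession; master set to local[*] so it runs directly from IntelliJ
    val spark = SparkSession.builder()
      .appName("postgres-example")
      .master("local[*]")
      .getOrCreate()

    // Read a table over JDBC (placeholder connection details)
    val jdbcDF = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://localhost:5432/mydb")
      .option("dbtable", "public.mytable")
      .option("user", "username")
      .option("password", "password")
      .load()

    // Register the DataFrame as a temporary view so you can run SQL against it
    jdbcDF.createOrReplaceTempView("mytable")
    spark.sql("SELECT * FROM mytable LIMIT 10").show()

    // Writing results back to Postgres goes through df.write.format("jdbc") in the same way
    spark.stop()
  }
}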

Upvotes: 5
