Reputation: 2083
I am trying to learn a Scala-Spark JDBC program on IntelliJ IDEA. In order to do that, I have created a Scala SBT Project and the project structure looks like:
Before writing the JDBC connection parameters in the class, I first tried loading a properties file which contain all my connection properties and trying to display if they are loading properly as below:
connection.properties content:
devUserName=username
devPassword=password
gpDriverClass=org.postgresql.Driver
gpDevUrl=jdbc:url
Code:
package com.yearpartition.obj
import java.io.FileInputStream
import java.util.Properties
import org.apache.spark.sql.SparkSession
import org.apache.log4j.{Level, LogManager, Logger}
import org.apache.spark.SparkConf
object PartitionRetrieval {
var conf = new SparkConf().setAppName("Spark-JDBC")
val properties = new Properties()
properties.load(new FileInputStream("connection.properties"))
val connectionUrl = properties.getProperty("gpDevUrl")
val devUserName=properties.getProperty("devUserName")
val devPassword=properties.getProperty("devPassword")
val gpDriverClass=properties.getProperty("gpDriverClass")
println("connectionUrl: " + connectionUrl)
Class.forName(gpDriverClass).newInstance()
def main(args: Array[String]): Unit = {
val spark = SparkSession.builder().enableHiveSupport().config(conf).master("local[2]").getOrCreate()
println("connectionUrl: " + connectionUrl)
}
}
Content of build.sbt:
name := "YearPartition"
version := "0.1"
scalaVersion := "2.11.8"
libraryDependencies ++= {
val sparkCoreVer = "2.2.0"
val sparkSqlVer = "2.2.0"
Seq(
"org.apache.spark" %% "spark-core" % sparkCoreVer % "provided" withSources(),
"org.apache.spark" %% "spark-sql" % sparkSqlVer % "provided" withSources(),
"org.json4s" %% "json4s-jackson" % "3.2.11" % "provided",
"org.apache.httpcomponents" % "httpclient" % "4.5.3"
)
}
Since I am not writing or saving data into any file and trying to display the values of properties file, I executed the code using following:
SPARK_MAJOR_VERSION=2 spark-submit --class com.yearpartition.obj.PartitionRetrieval yearpartition_2.11-0.1.jar
But I am getting file not found exception as below:
Caused by: java.io.FileNotFoundException: connection.properties (No such file or directory)
I tried to fix it in vain. Could anyone let me know what is the mistake I am doing here and how can I correct it ?
Upvotes: 0
Views: 727
Reputation: 497
You must write to full path of your connection.properties file (file:///full_path/connection.properties) and in this option when you submit a job in cluster if you want to read file the local disk you must save connection.properties file on the all server in the cluster to same path. But in other option, you can read the files from HDFS. Here is a little example for reading files on HDFS:
@throws[IOException]
def readFileFromHdfs(file: String): org.apache.hadoop.fs.FSDataInputStream = {
val conf = new org.apache.hadoop.conf.Configuration
conf.set("fs.default.name", "HDFS_HOST")
val fileSystem = org.apache.hadoop.fs.FileSystem.get(conf)
val path = new org.apache.hadoop.fs.Path(file)
if (!fileSystem.exists(path)) {
println("File (" + path + ") does not exists.")
null
} else {
val in = fileSystem.open(path)
in
}
}
Upvotes: 1