Reputation: 3199
I am using spark-sql 2.4.1, the Jackson jars, and Java 8.
In my Spark job I read a few configuration properties from a "conditions.yml" file, which is placed in the "resources" folder of my Java project, as shown below:
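ObjectMapper mapper = new ObjectMapper(new YAMLFactory());
try {
    // Load conditions.yml from the classpath (the project's resources folder)
    driverConfig = mapper.readValue(
        Configuration.class.getClassLoader().getResourceAsStream("conditions.yml"),
        Configuration.class);
} catch (IOException e) {
    throw new UncheckedIOException("Could not read conditions.yml", e);
}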
If I want to pass "conditions.yml" from outside when submitting the Spark job, how do I pass it, and where should the file be placed?
My program currently reads it from the "resources" directory, i.e. .getResourceAsStream("conditions.yml"). If I pass this file through spark-submit, will the job read the copy in resources or the one at the external path?
If I want to pass it as an external file, do I need to change the code above?
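One way to make the loading code work with both an external file and the bundled resource is sketched below. It uses only the JDK; `ConfigLocator` and `open` are hypothetical names introduced for illustration, and the fallback order (file system first, then classpath) is an assumption, not something Spark prescribes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: prefers an external file, falls back to a bundled resource.
final class ConfigLocator {
    private ConfigLocator() {}

    /**
     * Opens the config from the file system if the path points to a regular file,
     * otherwise looks for a classpath resource with the same name.
     */
    static InputStream open(String pathOrResource) throws IOException {
        Path external = Paths.get(pathOrResource);
        if (Files.isRegularFile(external)) {
            return Files.newInputStream(external);
        }
        InputStream bundled =
            ConfigLocator.class.getClassLoader().getResourceAsStream(pathOrResource);
        if (bundled == null) {
            throw new IOException("Config not found on disk or classpath: " + pathOrResource);
        }
        return bundled;
    }
}
```

The returned stream could then be handed to the existing call, for example mapper.readValue(ConfigLocator.open("conditions.yml"), Configuration.class), so the rest of the code stays unchanged.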
Updated Question:
In my Spark driver program I read the properties file whose name is passed as a program argument. It is loaded as below:
Config props = ConfigFactory.parseFile(new File(args[0]));
In the shell script that runs my Spark job, I submit it as below:
$SPARK_HOME/bin/spark-submit \
--master yarn \
--deploy-mode cluster \
--name MyDriver \
--jars "/local/jars/*.jar" \
--files hdfs://files/application-cloud-dev.properties,hdfs://files/condition.yml \
--class com.sp.MyDriver \
--executor-cores 3 \
--executor-memory 9g \
--num-executors 5 \
--driver-cores 2 \
--driver-memory 4g \
--driver-java-options -Dconfig.file=./application-cloud-dev.properties \
--conf spark.executor.extraJavaOptions=-Dconfig.file=./application-cloud-dev.properties \
--conf spark.driver.extraClassPath=. \
--driver-class-path . \
ca-datamigration-0.0.1.jar application-cloud-dev.properties condition.yml
Error:
The properties are not loaded. What is wrong here? What is the correct way to pass program arguments to a Spark job written in Java?
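One thing worth checking: in YARN cluster mode, files listed in --files are normally localized into the container's working directory under their base names, so the program argument should be the bare file name. The sketch below is an illustration under that assumption, not a confirmed fix. It resolves an argument against a working directory and fails with an explicit message when the file is missing; `ShippedFileArgs` and `require` are hypothetical names.

```java
import java.io.File;

// Hypothetical helper for validating file arguments shipped alongside the job.
final class ShippedFileArgs {
    private ShippedFileArgs() {}

    /**
     * Returns the file named by args[index], looked up by base name inside workDir.
     * Fails fast with a descriptive message instead of silently loading nothing.
     */
    static File require(File workDir, String[] args, int index) {
        if (args.length <= index) {
            throw new IllegalArgumentException("Missing program argument #" + index);
        }
        // Keep only the base name: shipped files are assumed to sit directly in workDir.
        String baseName = new File(args[index]).getName();
        File file = new File(workDir, baseName);
        if (!file.isFile()) {
            throw new IllegalStateException(
                baseName + " not found in " + workDir.getAbsolutePath());
        }
        return file;
    }
}
```

With this in place, the driver could call ConfigFactory.parseFile(ShippedFileArgs.require(new File("."), args, 0)), and a missing file would surface as an exception in the driver log rather than as an empty configuration.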
Upvotes: 1
Views: 2320
Reputation: 686
You have to use the --files option of the spark-submit command to pass any files along with the job.
The syntax is:
--files /home/user/config/my-file.yml
If the file is on HDFS, provide the HDFS path instead.
On YARN this should place the file in the container's working directory, which is on the classpath, so your code should be able to find it from the driver.
Reading the file can be implemented with something like this:
import java.util.Properties
import scala.io.Source

// Loads a properties file from the root of the classpath
def readProperties(propertiesPath: String): Properties = {
  val url = getClass.getResource("/" + propertiesPath)
  assert(url != null, s"Could not create URL to read $propertiesPath properties file")
  val source = Source.fromURL(url)
  val properties = new Properties
  properties.load(source.bufferedReader)
  properties
}
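Since the question uses Java, a rough Java counterpart of the Scala helper above might look like the following. It is a minimal sketch using only the JDK; `ClasspathProperties` and `read` are hypothetical names, and passing the class loader explicitly is a design choice made here for testability.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical Java counterpart of the Scala readProperties helper.
final class ClasspathProperties {
    private ClasspathProperties() {}

    /** Loads a properties file that is visible to the given class loader. */
    static Properties read(ClassLoader loader, String resourceName) throws IOException {
        // ClassLoader lookups take the name without a leading slash.
        try (InputStream in = loader.getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new IOException("Could not find " + resourceName + " on the classpath");
            }
            Properties properties = new Properties();
            properties.load(in);
            return properties;
        }
    }
}
```

In the driver this could be called as ClasspathProperties.read(MyDriver.class.getClassLoader(), "application-cloud-dev.properties"), assuming the file really is visible on the driver's classpath.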
Hope that is what you are looking for.
Upvotes: 2