Reputation: 1065
I'm attempting to set driver log levels when launching jobs in Dataproc (https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.jobs#LoggingConfig). Launching is done via a Java program using the dataproc SDK.
LoggingConfig loggingConfig = new LoggingConfig();
loggingConfig.put("driverLogLevels", Collections.singletonMap("root", "ERROR"));
com.google.api.services.dataproc.model.SparkJob sparkJob =
    new com.google.api.services.dataproc.model.SparkJob()
        .setMainClass(mainClass)
        .setJarFileUris(jarFileUris)
        .setArgs(args)
        .setProperties(properties)
        .setLoggingConfig(loggingConfig);
Job job = new Job().setPlacement(new JobPlacement().setClusterName(clusterName)).setSparkJob(sparkJob);
// omitted irrelevant code
Dataproc dp = new Dataproc.Builder(httpTransport, jsonFactory, credential).setApplicationName(jobName).build();
SubmitJobRequest request = new SubmitJobRequest().setJob(job);
return dp.projects().regions().jobs().submit(googleProject, "global", request).execute();
This launches successfully, but the log4j configuration is not applied:
log4j:ERROR Could not read configuration file from URL [file:/tmp/[guid]/driver_log4j.properties].
java.io.FileNotFoundException: /tmp/[guid]/driver_log4j.properties (No such file or directory)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at java.io.FileInputStream.<init>(FileInputStream.java:93)
at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:557)
at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
at org.apache.spark.internal.Logging$class.initializeLogging(Logging.scala:117)
at org.apache.spark.internal.Logging$class.initializeLogIfNecessary(Logging.scala:102)
at org.apache.spark.deploy.yarn.ApplicationMaster$.initializeLogIfNecessary(ApplicationMaster.scala:736)
at org.apache.spark.internal.Logging$class.log(Logging.scala:46)
at org.apache.spark.deploy.yarn.ApplicationMaster$.log(ApplicationMaster.scala:736)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:751)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
log4j:ERROR Ignoring configuration file [file:/tmp/[guid]/driver_log4j.properties].
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
where [guid] is a GUID that differs for every job. Logging falls back to the (verbose) default configuration.
How can I get this config to take effect? What is the most elegant and robust way to adjust Spark driver log levels on Dataproc? I have a fallback, but I'd rather use a method that's not liable to change out from under me.
Upvotes: 1
Views: 665
Reputation: 1202
The official way to set the log level is the method described in your link; see the Dataproc docs. I believe the way to invoke this from the Java SDK is to pass the flag through the setArgs(...) call of your builder. So in your case you would want to add:
args.add("--driver-log-levels");
args.add("root=ERROR");
like so:
args.add("--driver-log-levels");
args.add("root=ERROR");
com.google.api.services.dataproc.model.SparkJob sparkJob =
    new com.google.api.services.dataproc.model.SparkJob()
        .setMainClass(mainClass)
        .setJarFileUris(jarFileUris)
        .setArgs(args)
        .setProperties(properties)
        .setLoggingConfig(loggingConfig);
Job job = new Job().setPlacement(new JobPlacement().setClusterName(clusterName)).setSparkJob(sparkJob);
// omitted irrelevant code
Dataproc dp = new Dataproc.Builder(httpTransport, jsonFactory, credential).setApplicationName(jobName).build();
SubmitJobRequest request = new SubmitJobRequest().setJob(job);
return dp.projects().regions().jobs().submit(googleProject, "global", request).execute();
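For illustration, here is a minimal, self-contained sketch of prepending the two flag entries to the args list before it goes into setArgs(...). Note that --my-app-flag is a hypothetical application argument, and the helper buildArgs is mine, not part of the SDK:

```java
import java.util.ArrayList;
import java.util.List;

public class DriverLogArgs {
    // Prepends the log-level flags to the job's argument list.
    // Everything after the two flag entries is passed through unchanged.
    static List<String> buildArgs(List<String> applicationArgs) {
        List<String> args = new ArrayList<>();
        args.add("--driver-log-levels"); // flag name, as in the gcloud CLI
        args.add("root=ERROR");          // package=level pairs
        args.addAll(applicationArgs);    // your program's own arguments follow
        return args;
    }

    public static void main(String[] argv) {
        // prints [--driver-log-levels, root=ERROR, --my-app-flag]
        System.out.println(buildArgs(List.of("--my-app-flag")));
    }
}
```

The resulting list is what you would hand to setArgs(args) in the SparkJob builder above.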
I'm not sure what you mean when you call this a feature that's liable to change out from under you; this should be a stable feature.
Upvotes: 1