Malte

Reputation: 955

How to load a Typesafe Config from a file on HDFS?

I am using Typesafe's ConfigFactory to load the config for my Scala application. I do not want to include the config files in my jar, but load them from an external HDFS filesystem instead. However, I cannot find a simple way to load the config from the FSDataInputStream object I get from Hadoop:

//get HDFS file
val hadoopConfig: Configuration = sc.hadoopConfiguration
val fs: FileSystem = org.apache.hadoop.fs.FileSystem.get(hadoopConfig)
val file: FSDataInputStream = fs.open(new Path("hdfs://SOME_URL/application.conf"))
//read config from hdfs
val config: Config = ConfigFactory.load(file.readUTF())

However, this throws an EOFException. Is there an easy way to convert the FSDataInputStream object into the required java.io.File? I found Converting from FSDataInputStream to FileInputStream, but this would be pretty cumbersome for such a simple task.

Upvotes: 4

Views: 5268

Answers (4)

DharmenduR

Reputation: 11

I fixed the issue with the code below. Assume configPath is the HDFS path where the .conf file is available, e.g. hdfs://mount-point/abc/xyz/details.conf:

import com.typesafe.config._
import org.apache.hadoop.fs.{FileSystem, Path}
import java.io.InputStreamReader

val configPath = "hdfs://sparkmainserver:8020/file.conf"
val fs = FileSystem.get(new org.apache.hadoop.conf.Configuration())
val reader = new InputStreamReader(fs.open(new Path(configPath)))
val config: Config = ConfigFactory.parseReader(reader)

Then you can use config.getString("variable_name") to extract and use the variables/parameters. Before this, you need the Typesafe Config dependency in your sbt or Maven build file.
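For reference, the sbt declaration might look like the line below (the version number is an assumption; use whatever matches your project):

```scala
// build.sbt
libraryDependencies += "com.typesafe" % "config" % "1.4.2"
```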

Upvotes: 1

Yohan Chung

Reputation: 539

You should be able to load a .conf file in HDFS using the following code:

ConfigFactory.parseFile(new File("application.conf"));

Please keep in mind that the .conf file should be placed in the same directory as your app file (e.g. the jar file in Spark).
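One way the file can end up in the application's working directory is Spark's --files option, which ships a local file alongside the job (the paths, class name, and jar name below are hypothetical):

```shell
# Ship application.conf with the job so it lands in the container's working
# directory, where parseFile(new File("application.conf")) can find it
spark-submit --files /local/path/application.conf --class com.example.Main app.jar
```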

Upvotes: 1

Quang Gia Le

Reputation: 1

Here is what I did with Spark application:

  /**
    * Load a Typesafe config from an HDFS file location.
    * @param sparkContext the active SparkContext
    * @param confHdfsFileLocation HDFS path to the .conf file
    * @return the parsed Config
    */
  def loadHdfsConfig(sparkContext: SparkContext, confHdfsFileLocation: String): Config = {
    // wholeTextFiles returns (fileName, fileContent) pairs;
    // for a single file this is an array of one element
    val appConf: Array[(String, String)] = sparkContext.wholeTextFiles(confHdfsFileLocation).collect()
    val appConfStringContent = appConf(0)._2
    ConfigFactory.parseString(appConfStringContent)
  }

Then, in your code, just use:

val config = loadHdfsConfig(sparkContext, confHdfsFileLocation)
config.getString("key-here")

I hope it helps.

Upvotes: 0

Alexey Romanov

Reputation: 170839

Using ConfigFactory.parseReader should work (but I haven't tested it):

import java.io.InputStreamReader

val reader = new InputStreamReader(file)
val config = try {
  ConfigFactory.parseReader(reader)
} finally {
  reader.close()
}
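Putting this together, a small helper along these lines (a sketch; the helper name is mine and it is untested against a real cluster) wraps any InputStream, including the FSDataInputStream returned by fs.open, and specifies the charset explicitly:

```scala
import java.io.{InputStream, InputStreamReader}
import java.nio.charset.StandardCharsets
import com.typesafe.config.{Config, ConfigFactory}

// Parse a Config from any InputStream (e.g. the FSDataInputStream returned
// by fs.open), closing the reader when done
def loadConfig(stream: InputStream): Config = {
  val reader = new InputStreamReader(stream, StandardCharsets.UTF_8)
  try ConfigFactory.parseReader(reader)
  finally reader.close()
}
```

Usage would then be `val config = loadConfig(fs.open(new Path("hdfs://SOME_URL/application.conf")))`.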

Upvotes: 6
