Reputation: 125
TL:DR
Is there a way to read a Scala/Java properties file from a Databricks file system?
Or, is there a way to convert a spark data frame Rows into a set of text key/value pairs (that Scala will understand)?
Full Problem:
The properties file is not local, it's on the Databricks cluster. Attempts to read a file from "dbfs:/" or "/dbfs" fail to find the file when using the scala.io.Source
library. My guess is that Scala Source has no ability to recognize the URI for the Databricks file system(?).
I'm able to read the file into a Spark Dataframe however, but attempts to populate a java.utils.Properties
object fail with an error that it doesn't accept the Spark Dataframe "ROW" type. I've tried changing the data frame to an Array and List, but run into the same type mismatch. java.util.List[org.apache.spark.sql.Row]
for example, is what I get when converting the data frame to a list. I'm guessing that means dataFrameObject.collectAsList()
makes a list of spark rows instead of a text list of key/value pairs.
Obviously I'm new to Scala... If there isn't a way to read/load my properties file directly from DBFS, is there a way to convert the spark Row to a key/value pairs - or a byteStream?
Cheers and thanks, Simon
Upvotes: 1
Views: 869
Reputation: 87069
If you're using full version of the Databricks, not community edition, then you should be able to access files on DBFS via /dbfs/_the_rest_of_your_path_without_dbfs:/_...
But if you can't access /dbfs/...
, then you can still load properties as following:
text
format that converts every line in the file into individual row.getString(0)
to fetch first element of the row), and then merging all lines together using the mkString
val path_to_file = "dbfs:/something...."
val df = spark.read.format("text").load(path_to_file)
val allTextg = df.collect().map(_.getString(0)).mkString("\n")
val reader = new java.io.StringReader(allText)
val props = new java.util.Properties()
props.load(reader)
reader.close()
and you can check that properties are loaded with
props.list(System.out)
Upvotes: 1