Reputation: 25
Code:
val xmlDf: DataFrame = spark.read
  .format("xml")
  .option("nullValue", "")
  .xml(df.select("payload").map(x => x.getString(0)))
warning: method xml in class XmlDataFrameReader is deprecated (since 0.13.0): Use XmlReader directly
  .xml(df.select("payload").map(x => x.getString(0)))
Upvotes: 0
Views: 259
Reputation: 81
Are you trying to read an XML file into a DataFrame, or to parse XML that is stored in a column of an existing DataFrame (nested XML)?
Please try either:
spark.read
  .format("xml")
  .option("rowTag", "book")
  .load("books.xml")
or:
import com.databricks.spark.xml.functions.from_xml
import com.databricks.spark.xml.schema_of_xml
import spark.implicits._
val df = ... /// DataFrame with XML in column 'payload'
val payloadSchema = schema_of_xml(df.select("payload").as[String])
val parsed = df.withColumn("parsed", from_xml($"payload", payloadSchema))
https://github.com/databricks/spark-xml (Compatible with Spark 2.4.x and 3.x, with Scala 2.12.)
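If you want to keep the original shape of the call and just follow what the warning suggests ("Use XmlReader directly"), something along these lines might work. This is only a sketch, assuming spark-xml 0.13.0 or later, where `XmlReader` can be constructed from an options map and exposes `xmlDataset(spark, dataset)`. The object name `XmlReaderSketch`, the helper `parsePayloads`, and the sample rows are made up for illustration and are not from the question.

```scala
import com.databricks.spark.xml.XmlReader
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

object XmlReaderSketch {

  // Hypothetical helper: parses the XML strings in column "payload"
  // by calling XmlReader directly instead of the deprecated
  // spark.read.xml(Dataset[String]) implicit.
  def parsePayloads(spark: SparkSession, df: DataFrame): DataFrame = {
    import spark.implicits._

    // Same data the question passed to .xml(...), as a Dataset[String].
    val payloads: Dataset[String] = df.select("payload").as[String]

    // Options go into the reader's constructor (assumed API, see above).
    new XmlReader(Map("nullValue" -> ""))
      .xmlDataset(spark, payloads)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .appName("xml-reader-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows standing in for the question's `df`.
    val df = Seq(
      "<book><title>A</title><price>1.5</price></book>",
      "<book><title>B</title><price></price></book>"
    ).toDF("payload")

    val xmlDf = parsePayloads(spark, df)
    xmlDf.printSchema()
    xmlDf.show(truncate = false)

    spark.stop()
  }
}
```

The difference from the `from_xml` approach is that this returns a new DataFrame built only from the XML strings, whereas `from_xml` adds a parsed struct column next to the existing columns of `df`.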
Upvotes: 1