Vikram Pawar

Reputation: 25

Migrating Apache Spark XML from 2.11 to 2.12 gives the warning below. How do I use XmlReader directly?

Code:

val xmlDf: DataFrame = spark.read
  .format("xml")
  .option("nullValue", "")
  .xml(df.select("payload").map(x => x.getString(0)))

warning: method xml in class XmlDataFrameReader is deprecated (since 0.13.0): Use XmlReader directly
  .xml(df.select("payload").map(x => x.getString(0)))

Upvotes: 0

Views: 259

Answers (1)

chomar.c

Reputation: 81

Are you trying to read an XML file into a DataFrame, or to parse XML stored in a column of an existing DataFrame (nested XML)?

Please try either:

// Read an XML file, producing one row per <book> element
spark.read
  .format("xml")
  .option("rowTag", "book")
  .load("books.xml")

or:

import com.databricks.spark.xml.functions.from_xml
import com.databricks.spark.xml.schema_of_xml
import spark.implicits._
val df = ... // DataFrame with XML in column 'payload'
val payloadSchema = schema_of_xml(df.select("payload").as[String])
val parsed = df.withColumn("parsed", from_xml($"payload", payloadSchema))

https://github.com/databricks/spark-xml (Compatible with Spark 2.4.x and 3.x, with Scala 2.12.)
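
Since the warning asks for XmlReader directly, one possible replacement for the deprecated .xml(Dataset[String]) call is to construct an XmlReader yourself and pass it the Dataset of XML strings. This is a minimal sketch, assuming spark-xml 0.13.0 or later, where XmlReader takes an options Map in its constructor and exposes xmlDataset(spark, dataset). The sample payload row is made up for illustration, so please check the method names against the version you have installed.

```scala
import com.databricks.spark.xml.XmlReader
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Reuses the active session in spark-shell or a notebook, otherwise starts a local one.
val spark: SparkSession = SparkSession.builder()
  .appName("xml-reader-sketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for the question's DataFrame.
val df: DataFrame = Seq("<book><title>A</title><price></price></book>").toDF("payload")

// One XML document per element, taken from the 'payload' column.
val payloads: Dataset[String] = df.select("payload").as[String]

// Options go into the constructor Map (assumed API, spark-xml >= 0.13.0).
val xmlDf: DataFrame = new XmlReader(Map("nullValue" -> ""))
  .xmlDataset(spark, payloads)

xmlDf.printSchema()
```

In your own job, drop the sample df and pass the payload column of your existing DataFrame instead.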

Upvotes: 1
