Reputation: 577
I need to know how to parse XML in Spark. I am receiving streaming data from Kafka and need to parse each streamed message.
Here is my Spark code to receive the data:
directKafkaStream.foreachRDD(rdd -> {
    rdd.foreach(s -> {
        // s is a (key, value) tuple; s._2 is the message payload
        System.out.println("&&&&&&&&&&&&&&&&&" + s._2);
    });
});
And the output:
<root>
<student>
<name>john</name>
<marks>90</marks>
</student>
</root>
How do I parse these XML elements?
Upvotes: 2
Views: 2080
Reputation: 577
Thanks, everyone. The problem is solved. Here is the solution:
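// Assumes Apache Xerces (xercesImpl) is on the classpath for DOMParser.
import org.apache.xerces.parsers.DOMParser;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

String xml = "<name>xyz</name>";
DOMParser parser = new DOMParser();
try {
    parser.parse(new InputSource(new java.io.StringReader(xml)));
    Document doc = parser.getDocument();
    // Text content of the root element, here "xyz"
    String message = doc.getDocumentElement().getTextContent();
    System.out.println(message);
} catch (Exception e) {
    // parse() can throw SAXException or IOException
    e.printStackTrace();
}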
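For the full message from the question (a <root> containing <student> elements), the same idea can be sketched with the JDK's built-in DocumentBuilder instead of Xerces' DOMParser. This is only a minimal sketch: the class name StudentXmlParser and the helpers childText and parseStudents are hypothetical names chosen for illustration, and the sketch assumes every message is a complete XML document shaped like the sample above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class StudentXmlParser {

    // Hypothetical helper: text of the first <tag> element under parent, or null if absent.
    public static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    // Parses one XML message and returns a "name=marks" string for each <student>.
    public static List<String> parseStudents(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Reject DOCTYPE declarations, since the messages arrive from an external source.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            List<String> result = new ArrayList<>();
            NodeList students = doc.getElementsByTagName("student");
            for (int i = 0; i < students.getLength(); i++) {
                Element student = (Element) students.item(i);
                result.add(childText(student, "name") + "=" + childText(student, "marks"));
            }
            return result;
        } catch (Exception e) {
            // Wrap parser errors so callers do not need to handle checked exceptions.
            throw new IllegalArgumentException("Could not parse XML message", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><student><name>john</name><marks>90</marks></student></root>";
        for (String line : parseStudents(xml)) {
            System.out.println(line); // prints "john=90"
        }
    }
}
```

Inside the streaming job, something like StudentXmlParser.parseStudents(s._2) could be called within rdd.foreach. The parser is created inside the method on each call, so no parser object has to be serialized and shipped to the executors.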
Upvotes: 3
Reputation: 704
Since you are processing streaming data, it may help to use Databricks' spark-xml library for XML data processing.
Reference: https://github.com/databricks/spark-xml
Upvotes: 2