Reputation: 6647
I was reading one of the interview question about parsing an XML.
I wrote a very high level and incomplete skeleton of the algorithm , and was looking for some help in writing a simple algo [I assume one exists, because this was asked as an interview question so should be doable in 45 mins I guess].
Here is my attempt:
// Assume well-formedness
public static Node parseXML(String xml)
{
Node root = new XMLParse().new Node();
while(!helper.endOfElements())
{
// Returns name of root element
root.name = helper.getName(xml);
// Returns value of root element
root.name = helper.getValue(xml);
// returns each child in a String and returns all such children as
// a String Array
// Basically splits based on <> </> elements and return that as a String
String[] children = helper.getChildren(xml);
if(children.length!=0)
{
root.childList = new ArrayList<XMLParse.Node>();
for(int i=0; i<children.length;i++)
{
root.childList.add(parseXML(children[i]));
}
}
}
return root;
}
class Node
{
String name;
String value;
List<Node> childList;
public String getName()
{
return name;
}
public String getValue()
{
return value;
}
public List<Node> getChildList()
{
return childList;
}
}
Class helper()
{
// Returns the name of the root of the xml
public static String getName(String XML);
// Returns the value of the root of the xml
public static String getValue(String XML)
// Splits the XML into top level childern of the root of the passed XML
public static String[] getChildren(String XML)
}
I am hoping someone can give me a pseudo-code/code for doing this or may be provide an easy way of implementing the helper function in my algo.
I know there are built in classes to do this like in here , but using them would beat the purpose I guess. Also, many things used in this link are just interfaces so I couldnt find any implementation of say docBuilder.parse (new File("book.xml"))
method.
Upvotes: 2
Views: 11093
Reputation: 1129
About the actual question
This is an example of an interview question that is designed to expose the inexperience of the applicant. An applicant that hasn't been around the industry much (or who just put XML on their resume without ever having implemented a project) might eagerly go off and start churning code.
An applicant that has successfully navigated a few interviews or completed a few projects that use XML will step back and make a few observations
About the solution
In an interview setting I would answer something along the lines of:
I don't think it will be productive to write an actual parser here during the interview. This is because standards compliant parsers are time consuming to write and validate and there are already plenty of implementations for different languages. However, I will describe how to use two popular parsers and how I have used them on past projects. I'll also discuss the pros and cons of each approach.
The first approach that can be used when parsing XML is to treat the entire document as a tree of nodes. This type of parser is called a DOM parser. I used a DOM parser on {insert relevant project experience}. We used a DOM parser because we needed to access different parts of the document at the same time.
The second approach that can be used when parsing XML is to treat the document as a stream of events or nodes. This type of approach is called a SAX parser, I used a SAX parser on {insert relevant project experience}. We used a SAX parser because we couldn't fit the entire document in memory.
{insert discussion on pros and cons}
Further reading
http://www.cs.nmsu.edu/~epontell/courses/XML/material/xmlparsers.html
Upvotes: 21