Reputation: 25
Here's a basic XML document:
<h1>My Heading</h1>
<p align = "center"> My paragraph
<img src="smiley.gif" alt="Smiley face" height="42" width="42"></img>
<img src="sad.gif" alt="Sad face" height="45" width="45"></img>
<img src="funny.gif" alt="Funny face" height="48" width="48"></img>
</p>
<p>My para</p>
What I am trying to do is find an element and all of its attributes, and save the attribute name and attribute value for each element. Here's my code so far:
private Map <String, String> tag = new HashMap <String,String> ();
public Map <String, String> findElement () {
try {
FileReader fRead = new FileReader (sourcePage);
BufferedReader bRead = new BufferedReader (fRead);
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance ();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder ();
Document doc = docBuilder.parse(new FileInputStream (new File (sourcePage)));
XPathFactory xFactory = XPathFactory.newInstance ();
XPath xPath = xFactory.newXPath ();
NodeList nl = (NodeList) xPath.evaluate("//img/@*", doc, XPathConstants.NODESET);
for( int i=0; i<nl.getLength (); i++) {
Attr attr = (Attr) nl.item(i);
String name = attr.getName();
String value = attr.getValue();
tag.put (name,value);
}
bRead.close ();
fRead.close ();
}
catch (Exception e) {
e.printStackTrace();
System.err.println ("An error has occured.");
}
The problem appears when I look for the img attributes, because the attribute names are identical across elements. A HashMap is not suitable for this, since it overwrites values stored under the same key. Maybe I'm using the wrong expression to find all the attributes. Is there another way to get the attribute names and values of the nth img element?
Upvotes: 1
Views: 1479
Reputation: 9103
You can use a map inside the map:
Map<Map<Integer, String>, String> // Integer = some index (0, 1, ...); String1 (the value of the inner Map) = src; String2 (the value of the outer Map) = smiley.gif
OR
You can invert it and keep that in mind when using it, like:
Map<String, String> // String1 = key = smiley.gif, String2 = value = src
Upvotes: 0
Reputation: 38424
First, let's level the field a little. I cleaned up your code to get a compiling starting point: I removed the unnecessary code and fixed the method based on my best guess of what it is supposed to do. I also generalized it a little so that it accepts one tagName parameter. It's still the same code and makes the same mistake, but now it compiles (Java 7 features are used for convenience; switch back to Java 6 if you want). I also split the try-catch into multiple blocks just for the sake of it:
public Map<String, String> getElementAttributesByTagName(String tagName) {
Document document;
try (InputStream input = new FileInputStream(sourcePage)) {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
document = docBuilder.parse(input);
} catch (IOException | ParserConfigurationException | SAXException e) {
throw new RuntimeException(e);
}
NodeList attributeList;
try {
XPath xPath = XPathFactory.newInstance().newXPath();
attributeList = (NodeList)xPath.evaluate("//" + tagName + "/@*", document, XPathConstants.NODESET);
} catch (XPathExpressionException e) {
throw new RuntimeException(e);
}
Map<String, String> tagInfo = new HashMap<>();
for (int i = 0; i < attributeList.getLength(); i++) {
Attr attribute = (Attr)attributeList.item(i);
tagInfo.put(attribute.getName(), attribute.getValue());
}
return tagInfo;
}
When run against your example code above, it returns:
{height=48, alt=Funny face, width=48, src=funny.gif}
The solution depends on your expected output. You either want
1. the attributes of only one of the <img> elements (say, the first one), or
2. a list of all <img> elements and their attributes.
For the first solution, it's enough to change your XPath expression to
//descendant::img[1]/@*
or
"//descendant::" + tagName + "[1]/@*"
with the tagName parameter. Beware that this is not the same as //img[1]/@*, even though it returns the same element in this particular case.
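To see why the two forms differ, here is a minimal sketch. It assumes a hypothetical sample document (not the one from the question) with two <p> parents, and it uses the single-slash form /descendant::img[1], which the XPath 1.0 specification contrasts with //img[1]. The class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FirstImgDemo {
    // Hypothetical sample: two <p> parents, each with its own <img> children.
    static final String XML =
        "<root>"
        + "<p><img src=\"a.gif\"/><img src=\"b.gif\"/></p>"
        + "<p><img src=\"c.gif\"/></p>"
        + "</root>";

    // Parses the sample and returns how many nodes the expression selects.
    static int countMatches(String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xPath.evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Every <img> that is the first <img> child of its parent: a.gif and c.gif.
        System.out.println(countMatches("//img[1]"));            // 2
        // Only the first <img> of the whole document: a.gif.
        System.out.println(countMatches("/descendant::img[1]")); // 1
    }
}
```

In the question's document all three <img> elements share one parent, which is why both forms happen to select the same element there.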
When changed this way, the method returns:
{height=42, alt=Smiley face, width=42, src=smiley.gif}
which are the correct attributes of the first <img> element.
Note that you don't even have to use an XPath expression for this kind of work. Here's a non-XPath version:
public Map<String, String> getElementAttributesByTagNameNoXPath(String tagName) {
Document document;
try (InputStream input = new FileInputStream(sourcePage)) {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
document = docBuilder.parse(input);
} catch (IOException | ParserConfigurationException | SAXException e) {
throw new RuntimeException(e);
}
Node node = document.getElementsByTagName(tagName).item(0);
if (node == null) {
return new HashMap<>(); // no element with this tag name in the document
}
NamedNodeMap attributeMap = node.getAttributes();
Map<String, String> tagInfo = new HashMap<>();
for (int i = 0; i < attributeMap.getLength(); i++) {
Node attribute = attributeMap.item(i);
tagInfo.put(attribute.getNodeName(), attribute.getNodeValue());
}
return tagInfo;
}
The second solution requires a few changes. We want to return the attributes of all <img> elements in the document. Multiple elements means we'll use a List holding multiple Map<String, String> instances, where every Map represents one <img> element.
A complete XPath version, in case you actually need a more complex XPath expression:
public List<Map<String, String>> getElementsAttributesByTagName(String tagName) {
Document document;
try (InputStream input = new FileInputStream(sourcePage)) {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
document = docBuilder.parse(input);
} catch (IOException | ParserConfigurationException | SAXException e) {
throw new RuntimeException(e);
}
NodeList nodeList;
try {
XPath xPath = XPathFactory.newInstance().newXPath();
nodeList = (NodeList)xPath.evaluate("//" + tagName, document, XPathConstants.NODESET);
} catch (XPathExpressionException e) {
throw new RuntimeException(e);
}
List<Map<String, String>> tagInfoList = new ArrayList<>();
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
NamedNodeMap attributeMap = node.getAttributes();
Map<String, String> tagInfo = new HashMap<>();
for (int j = 0; j < attributeMap.getLength(); j++) {
Node attribute = attributeMap.item(j);
tagInfo.put(attribute.getNodeName(), attribute.getNodeValue());
}
tagInfoList.add(tagInfo);
}
return tagInfoList;
}
To get rid of the XPath part, you can simply switch it to a one-liner:
NodeList nodeList = document.getElementsByTagName(tagName);
Both versions, when run against your test case above with an "img" parameter, return this (formatted for clarity):
[ {height=42, alt=Smiley face, width=42, src=smiley.gif},
{height=45, alt=Sad face, width=45, src=sad.gif },
{height=48, alt=Funny face, width=48, src=funny.gif } ]
which is a correct list of all the <img> elements.
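Since the question asks about the nth <img> element specifically, here is a minimal sketch of that lookup using the same getElementsByTagName approach. It assumes the question's images are wrapped in a hypothetical <body> root (an XML parser needs a single root element) and parses from a string instead of sourcePage; the class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NthElementDemo {
    // The images from the question, wrapped in an assumed <body> root.
    static final String XML =
        "<body><p>"
        + "<img src=\"smiley.gif\" alt=\"Smiley face\" height=\"42\" width=\"42\"></img>"
        + "<img src=\"sad.gif\" alt=\"Sad face\" height=\"45\" width=\"45\"></img>"
        + "<img src=\"funny.gif\" alt=\"Funny face\" height=\"48\" width=\"48\"></img>"
        + "</p></body>";

    // Returns the attributes of the element at the given zero-based index,
    // or an empty map when there is no such element.
    static Map<String, String> attributesOfNth(String tagName, int index) throws Exception {
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        NodeList nodes = document.getElementsByTagName(tagName);
        if (index < 0 || index >= nodes.getLength()) {
            return Collections.emptyMap();
        }
        NamedNodeMap attributeMap = nodes.item(index).getAttributes();
        Map<String, String> tagInfo = new LinkedHashMap<>();
        for (int i = 0; i < attributeMap.getLength(); i++) {
            Node attribute = attributeMap.item(i);
            tagInfo.put(attribute.getNodeName(), attribute.getNodeValue());
        }
        return tagInfo;
    }

    public static void main(String[] args) throws Exception {
        // Index 1 is the second <img> element.
        System.out.println(attributesOfNth("img", 1).get("src")); // sad.gif
    }
}
```

Returning an empty map for an out-of-range index is just one possible choice; throwing an exception would work equally well if a missing element should count as an error.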
Upvotes: 1
Reputation: 5521
Try using:
Map <String, ArrayList<String>> tag = new HashMap <String, ArrayList<String>> ();
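A minimal sketch of how such a map could be filled, assuming the question's //img/@* expression, a hypothetical <body> root around the sample images, and Java 8 for computeIfAbsent (the class and method names are made up for illustration):

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AttributeMultimapDemo {
    // The images from the question, wrapped in an assumed <body> root.
    static final String XML =
        "<body><p>"
        + "<img src=\"smiley.gif\" alt=\"Smiley face\" height=\"42\" width=\"42\"></img>"
        + "<img src=\"sad.gif\" alt=\"Sad face\" height=\"45\" width=\"45\"></img>"
        + "<img src=\"funny.gif\" alt=\"Funny face\" height=\"48\" width=\"48\"></img>"
        + "</p></body>";

    // Groups every attribute value under its attribute name, so repeated
    // names are appended to a list instead of overwriting each other.
    static Map<String, List<String>> collect(String tagName) throws Exception {
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        NodeList attributes = (NodeList) xPath.evaluate(
                "//" + tagName + "/@*", document, XPathConstants.NODESET);
        Map<String, List<String>> tag = new HashMap<>();
        for (int i = 0; i < attributes.getLength(); i++) {
            Attr attribute = (Attr) attributes.item(i);
            tag.computeIfAbsent(attribute.getName(), key -> new ArrayList<>())
               .add(attribute.getValue());
        }
        return tag;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collect("img").get("src"));
    }
}
```

Note that this layout groups values by attribute name, not by element: it no longer records which <img> a value came from unless every element carries the same attributes and the lists are read position by position.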
Upvotes: 0