djechlin

Reputation: 60778

javax.swing HTML parser not picking up img tags

This HTML:

<td height="79" valign="top" width="70">
            <a href="http://e.livinghuntington.com/HS?a=stuff" target="_blank" title="Follow us on Twitter: http://twitter.com/#!/HuntingtonLive"> link link link <img alt="Follow us on Twitter: http://twitter.com/#!/HuntingtonLive" border="0" height="79" src="http://webe.emv3.com/livinghuntington/images/tt.png" style="display:block;" width="70"/></a>
        </td>
</table>
<table>

and this code:

public void handleStartTag(Tag tag, MutableAttributeSet attr, int pos) {
    // Log every start tag the parser reports.
    System.err.println("tag = " + tag);
}

Gives this output:

tag = td
tag = a
tag = table

I tried a few variations. If I nest a link inside the link (I don't even know whether that is valid HTML), the parser correctly picks up the inner link. If I pull the image out of the link, it still doesn't pick up the img. As far as I can tell, it never picks up img tags at all. Is there an error in my code, is there a workaround, or is this an irreparable problem with the HTML parser, meaning I need to drop it and use a different one?

Upvotes: 1

Views: 302

Answers (1)

djechlin

Reputation: 60778

The issue was that img is a simple (empty) tag, so it is never passed to handleStartTag(). The handler to override is handleSimpleTag().
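For anyone hitting the same thing, here is a minimal sketch of what that looks like. It assumes the parser is driven through ParserDelegator with an HTMLEditorKit.ParserCallback, which the question does not show; the class name ImgTagDemo, the helper collectImageSources, and the sample HTML are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical demo class, not part of the original code.
public class ImgTagDemo {

    // Parses the given HTML and returns the src attribute of every img tag.
    static List<String> collectImageSources(String html) throws IOException {
        List<String> sources = new ArrayList<>();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attr, int pos) {
                // Only tags with content (td, a, table, ...) arrive here.
                System.err.println("start tag = " + tag);
            }

            @Override
            public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attr, int pos) {
                // Empty tags such as img and br arrive here instead.
                System.err.println("simple tag = " + tag);
                if (tag == HTML.Tag.IMG) {
                    Object src = attr.getAttribute(HTML.Attribute.SRC);
                    if (src != null) {
                        sources.add(src.toString());
                    }
                }
            }
        };

        // The third argument (ignoreCharSet = true) skips charset handling.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return sources;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample input, loosely modeled on the HTML in the question.
        String html = "<table><tr><td>"
                + "<a href=\"http://example.com/\">link <img src=\"tt.png\" alt=\"x\"></a>"
                + "</td></tr></table>";
        System.out.println(collectImageSources(html));
    }
}
```

If you need both kinds of tags in one place, you can have handleSimpleTag() and handleStartTag() call a shared method of your own.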

Upvotes: 2
