ian

Reputation: 303

Jsoup: Parsing HTML out of a piece of JavaScript

Does anyone know how to get the HTML out of a JavaScript onmouseover event with Jsoup? That might sound vague, so here is the code:

<table onmouseover="showHoverInfo('', '<a href="somelink"><b>sometext</b>/a><br /> Some other text <br /> <a href="some other link"><b>Some text</b></a>')"

And it goes on. What I would like to know is: how do I get the HTML code out of the showHoverInfo() call using Jsoup?

All help is appreciated.

Upvotes: 0

Views: 557

Answers (1)

acdcjunior

Reputation: 135762

You can read the onmouseover attribute with .attr() and then process the resulting string (the example below uses a regex) to extract the parameter value you want:

import java.util.regex.*;
import org.jsoup.Jsoup;
import org.jsoup.nodes.*;

public class JSoupGetAttributeExample {
    public static void main(String[] args) {
        Document doc = Jsoup.parse("<html><body><div>example</div>" +
        "<table id='myTable' onmouseover=\"showHoverInfo('', '<a href=\\\'somelink\\\'><b>sometext</b>/a><br /> Some other text <br /> <a href=\\\'some other link\\\'><b>Some text</b></a>')\" >" +
        "   <tr>" +
        "       <td>"+
        "       </td>"+
        "   </tr>" +
        "</table>" +
        "</body></html>");
        Element myTable = doc.getElementById("myTable");
        String onmouseover = myTable.attr("onmouseover");
        System.out.println("onmouseover ATTRIBUTE: "+onmouseover);

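        /* String processing to get the HTML (second) parameter.
           The lazy group (.*?) captures everything after the "', '" separator,
           up to the first "')" that follows. */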
        String secondParameter = null;
        Pattern p = Pattern.compile("showHoverInfo\\('.*', '(.*?)'\\)");
        Matcher m = p.matcher(onmouseover);
        if (m.find()) {
            secondParameter = m.group(1);
        }
        System.out.println("\nHTML PARAMETER: "+secondParameter);
    }
}

Output:

onmouseover ATTRIBUTE: showHoverInfo('', '<a href=\'somelink\'><b>sometext</b>/a><br /> Some other text <br /> <a href=\'some other link\'><b>Some text</b></a>')

HTML PARAMETER: <a href=\'somelink\'><b>sometext</b>/a><br /> Some other text <br /> <a href=\'some other link\'><b>Some text</b></a>
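
Note that the extracted parameter still contains the JavaScript escapes (\'), and the pattern above stops at the first ') it sees, so it would cut the value short if the HTML itself contained that sequence. A slightly more defensive variant is sketched below, assuming the arguments are always single-quoted JavaScript string literals. The class name HoverInfoArgs and the helper stringArguments are made up for this illustration, and the unescaping is deliberately simplified: it only strips the backslash and does not interpret sequences such as \n or \uXXXX.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HoverInfoArgs {
    // One single-quoted JavaScript string literal; a backslash escapes the next character,
    // so \' inside the literal does not end the match.
    private static final Pattern JS_STRING = Pattern.compile("'((?:[^'\\\\]|\\\\.)*)'");

    /** Hypothetical helper: returns every single-quoted argument in the attribute, unescaped. */
    static List<String> stringArguments(String attributeValue) {
        List<String> result = new ArrayList<>();
        Matcher m = JS_STRING.matcher(attributeValue);
        while (m.find()) {
            // Simplified unescape: drop the backslash in front of any escaped character.
            result.add(m.group(1).replaceAll("\\\\(.)", "$1"));
        }
        return result;
    }

    public static void main(String[] args) {
        // Same shape as the attribute value returned by myTable.attr("onmouseover")
        String onmouseover = "showHoverInfo('', '<a href=\\'somelink\\'><b>sometext</b></a>')";
        List<String> found = stringArguments(onmouseover);
        System.out.println("HTML PARAMETER: " + found.get(1));
    }
}
```

Once the string is unescaped, it should be possible to hand it back to Jsoup, for example with Jsoup.parseBodyFragment(html) followed by select("a[href]"), to read the links and their text instead of working on the raw string.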

Upvotes: 2
