Chris

Reputation: 392

How to fetch frames with Jsoup?

<html>
    <head></head>
    <frameset cols="180,590,*" border="0">  
        <frame src="test.html" name="main" noresize="" scrolling="no" marginwidth="0" marginheight="0">
        <frame src="http://www.test.com/my.php" name="right" noresize="" scrolling="auto" marginwidth="0" marginheight="0">
            #document    <!-- what is this? -->
                <html>
                    <head>
                        <title>TEST</title>
                    </head>
                    <body></body>
                </html>
        </frame>
    </frameset>
</html>


I'm parsing a webpage, but I've run into a problem.
What is #document?
And how can I parse the <html> below #document using Jsoup?

Upvotes: 4

Views: 1485

Answers (1)

Stephan

Reputation: 43013

And how can I parse <html> below #document using Jsoup?

You can think of #document as a "virtual" element: it is how browser developer tools display the document loaded into a frame. It is not present in the actual HTML source, so Jsoup won't see it either.

What you want is to fetch each frame's document with Jsoup. See below:

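import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

Document doc = ...; // HTML page containing the frameset

// first() picks the matching <frame>; absUrl("src") resolves its src against the page's base URI
Document mainFrameDocument = Jsoup.connect(doc.select("frame[name=main]").first().absUrl("src")).get();

Document rightFrameDocument = Jsoup.connect(doc.select("frame[name=right]").first().absUrl("src")).get();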
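
The first frame's src in the question is the relative path test.html, which is why an absolute URL is needed before connecting. As a rough illustration of that resolution step, here is a minimal sketch using only java.net.URI from the standard library. The page URL http://www.test.com/index.html, the class name FrameUrlResolver and the helper resolve are assumptions made for the example; they are not part of Jsoup, and Jsoup's own resolution may differ in edge cases.

```java
import java.net.URI;

public class FrameUrlResolver {

    // Hypothetical helper: resolves a frame's src against the URL of the
    // page that contains the frameset.
    static String resolve(String pageUrl, String src) {
        return URI.create(pageUrl).resolve(src).toString();
    }

    public static void main(String[] args) {
        // Assumed location of the frameset page (not given in the question).
        String pageUrl = "http://www.test.com/index.html";

        // Relative src: resolved against the page URL.
        System.out.println(resolve(pageUrl, "test.html"));
        // prints http://www.test.com/test.html

        // Absolute src: returned unchanged.
        System.out.println(resolve(pageUrl, "http://www.test.com/my.php"));
        // prints http://www.test.com/my.php
    }
}
```

With Jsoup, this only works if the Document knows its base URI, for example because it was loaded with Jsoup.connect(url).get() or parsed with Jsoup.parse(html, baseUri). Without one, absUrl returns an empty string for relative paths.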

Upvotes: 3
