User7071

Reputation: 77

How would I retrieve certain information from a webpage using its ID value?

In vb.net, I can download a webpage as a string like this:

    Using ee As New System.Net.WebClient()
        Dim reply As String = ee.DownloadString("https://pastebin.com/eHcQRiff")
        MessageBox.Show(reply)
    End Using

Would it be possible to specify the ID of an element on the webpage so that the reply contains only the content of that element (the code box)?

Example:

The RAW Paste Data box on https://pastebin.com/eHcQRiff has id="paste_code" and contains the following text:

Test=1
Test=2

Is there any way to get the WebClient to output only that exact text, using the ID (or any other method)?

Upvotes: 1

Views: 39

Answers (1)

Sunil

Reputation: 3424

You can use the HtmlAgilityPack library:

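    ' Load HTML that was previously saved to disk (VB.NET string literals take no @ prefix)
    Dim document As New HtmlAgilityPack.HtmlDocument()
    document.Load("C:\YourDownloadedHtml.html")

    ' InnerText returns the text content of the element whose id is "paste_code"
    Dim text As String = document.GetElementbyId("paste_code").InnerText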

Some more sample code:
(Tested with HtmlAgilityPack 1.6.10.0)

    Dim html As String = "<TD width=""50%""><DIV align=right>Name :<B> </B></DIV></TD><TD width=""50%""><div id='i1'>SomeText</div></TD><TR vAlign=center>"
    Dim htmlDoc As New HtmlAgilityPack.HtmlDocument()
    htmlDoc.LoadHtml(html) ' Load directly from an HTML string
    Dim name As String = htmlDoc.DocumentNode.SelectSingleNode("//td/div[@id='i1']").InnerText
    Console.WriteLine(name)

Output:
SomeText
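
To tie this back to the question, here is a minimal sketch that feeds the string returned by WebClient.DownloadString straight into LoadHtml and then looks the element up by ID. The module name PasteScraper and the helper ExtractTextById are hypothetical names chosen for illustration, and the sketch assumes the page still exposes id="paste_code" as described in the question and that the HtmlAgilityPack NuGet package is referenced.

```vbnet
Imports System.Net

Module PasteScraper
    ' Hypothetical helper: returns the inner text of the element with the
    ' given id, or Nothing when the page contains no such element.
    Function ExtractTextById(html As String, elementId As String) As String
        Dim doc As New HtmlAgilityPack.HtmlDocument()
        doc.LoadHtml(html) ' Parse the HTML from a string, no file needed

        Dim node As HtmlAgilityPack.HtmlNode = doc.GetElementbyId(elementId)
        If node Is Nothing Then Return Nothing

        ' Convert entities such as &amp; back to plain characters
        Return HtmlAgilityPack.HtmlEntity.DeEntitize(node.InnerText)
    End Function

    Sub Main()
        Using client As New WebClient()
            ' URL and id are taken from the question; the id is an assumption
            ' about the page's current markup.
            Dim page As String = client.DownloadString("https://pastebin.com/eHcQRiff")
            Dim text As String = ExtractTextById(page, "paste_code")
            Console.WriteLine(If(text, "Element not found"))
        End Using
    End Sub
End Module
```

Checking for Nothing before reading InnerText avoids a NullReferenceException if the site changes its markup or the ID is misspelled.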

Upvotes: 1
