Reputation: 21
Using regex in VB.NET, how would I extract the href and the cost?
I have tried various options and have just learned that regex syntax can differ between languages, which means I have wasted two days trying to figure it out.
<div class="single-album" id="m-1_1184">
<span class="album-time link-text">
<a class="album-link tag-b b-ltxt-album b-sec-b b-tab-toy"
href="/cx/1.1184"
title="album | 5 cost">13£50</a>
</span>
<span class="separator">|</span>
</div>
Upvotes: 1
Views: 96
Reputation: 460278
I would really advise against using regex to parse HTML. Instead, use HtmlAgilityPack.
Then it's simple and safe:
' Needs these at the top of the file: Imports System.IO (File) and Imports System.Linq (Last)
Dim html As String = File.ReadAllText("C:\Temp\html.txt") ' I've used this text file for your input
Dim doc = New HtmlAgilityPack.HtmlDocument()
doc.LoadHtml(html)
' Select the first <a> whose class attribute matches this exact string
Dim aHref As HtmlAgilityPack.HtmlNode = doc.DocumentNode.SelectSingleNode("//a[@class='album-link tag-b b-ltxt-album b-sec-b b-tab-toy']")
If aHref IsNot Nothing Then
    Dim href As String = aHref.GetAttributeValue("href", "") ' /cx/1.1184
    Dim title As String = aHref.GetAttributeValue("title", "")
    Dim costs As String = title.Split("|"c).Last().Trim() ' 5 cost
End If
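If the page holds several of these links, the same approach might be extended roughly as follows. This is only a sketch under a few assumptions: the module name `AlbumLinks` is made up, the markup is inlined from the question instead of being read from a file, and `contains(@class, 'album-link')` is my guess at a looser selector that would survive changes to the other class names. It also reads the link text (`13£50`) through `InnerText`, in case that is the cost you meant.

```vbnet
Imports System
Imports System.Linq
Imports HtmlAgilityPack

' Hypothetical module name, used only for this illustration.
Module AlbumLinks
    Sub Main()
        ' Sample markup copied from the question.
        Dim html As String =
            "<div class=""single-album"" id=""m-1_1184"">" &
            "<span class=""album-time link-text"">" &
            "<a class=""album-link tag-b b-ltxt-album b-sec-b b-tab-toy"" " &
            "href=""/cx/1.1184"" title=""album | 5 cost"">13£50</a>" &
            "</span><span class=""separator"">|</span></div>"

        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' Assumption: a substring match on the class attribute is specific enough here.
        Dim links = doc.DocumentNode.SelectNodes("//a[contains(@class, 'album-link')]")

        ' SelectNodes returns Nothing when no node matches.
        If links IsNot Nothing Then
            For Each link As HtmlNode In links
                Dim href As String = link.GetAttributeValue("href", "")
                Dim title As String = link.GetAttributeValue("title", "")
                ' Text after the last "|" in the title attribute.
                Dim cost As String = title.Split("|"c).Last().Trim()
                ' Visible text of the link.
                Dim text As String = link.InnerText.Trim()
                Console.WriteLine($"{href} | {cost} | {text}")
            Next
        End If
    End Sub
End Module
```

Note that `contains()` is a plain substring test, so it would also match a class such as `album-link-old`; keep the exact-match XPath from above if that matters for your pages.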
Upvotes: 1