Reputation: 21
I'm scraping a page with a lot of content on it, including about 50 blocks like the one I have posted below.
HTML Block
<li>
<dl>
<dd>
<a href="/wow/en/item/113987" class="color-q4" data-item="pl=100&cc=5&bl=566">
<span class="icon-frame frame-18 " style='background-image: url("http://media.blizzard.com/wow/icons/18/inv_misc_trinket6oih_lanternb1.jpg");'>
</span>
</a>Obtained <a href="/wow/en/item/113987" class="color-q4" data-item="pl=100&cc=5&bl=566">Battering Talisman</a>.
</dd>
<dt>22 hours ago</dt>
</dl>
</li>
The code I'm using now only searches for this line:
Obtained <a href="/wow/en/item/113987" class="color-q4" data-item="pl=100&cc=5&bl=566">Battering Talisman</a>.
How can I get my MatchCollection to return the full HTML block as 1 match?
Dim explorer As New WowExplorer(WowDotNetAPI.Region.EU, Locale.en_GB, "apikey")
' Download the character's feed page.
Dim Request As System.Net.HttpWebRequest = System.Net.HttpWebRequest.Create("http://eu.battle.net/wow/en/character/" & Me.Realm & "/" & Me.Name & "/feed")
Dim Response As System.Net.HttpWebResponse = Request.GetResponse
Dim sr As System.IO.StreamReader = New System.IO.StreamReader(Response.GetResponseStream())
Dim SourceCode As String = sr.ReadToEnd
' This pattern matches only the "Obtained <a ...>" line of each block.
Dim Item_ As New System.Text.RegularExpressions.Regex( _
"Obtained <a href=""/wow/en/item/.*"" class=""color-q4"".*")
Dim matche_name As MatchCollection = Item_.Matches(SourceCode)
For Each Match As Match In matche_name
' The item ID is the fifth "/"-separated piece, up to its closing quote.
Dim ItemID As String
Dim ID_Match As String = Match.Value.Split("/").GetValue(4)
ItemID = ID_Match.Split("""").GetValue(0)
Me.Items.Add(explorer.GetItem(ItemID, ItemSource))
Next
Upvotes: 2
Views: 503
Reputation: 626871
Here is some sample code showing how to get those strings in two ways: with XDocument and XPath, and with regex (I added a second <li> to emulate the HTML you might have):
Dim dds As List(Of String), dts As List(Of String)
dds = New List(Of String)
dts = New List(Of String)
Dim str As String = "<li> <dl> <dd> <a href=""/wow/en/item/113987"" class=""color-q4"" data-item=""pl=100&cc=5&bl=566""> <span class=""icon-frame frame-18 "" style='background-image: url(""http://media.blizzard.com/wow/icons/18/inv_misc_trinket6oih_lanternb1.jpg"");'> </span> </a>Obtained <a href=""/wow/en/item/113987"" class=""color-q4"" data-item=""pl=100&cc=5&bl=566"">Battering Talisman</a>.</dd> <dt>22 hours ago</dt> </dl> </li>"
str += "<li> <dl> <dd> <a href=""/wow/en/item/113987"" class=""color-q4"" data-item=""pl=100&cc=5&bl=566""> <span class=""icon-frame frame-18 "" style='background-image: url(""http://media.blizzard.com/wow/icons/18/inv_misc_trinket6oih_lanternb1.jpg"");'> </span> </a>Obtained <a href=""/wow/en/item/113987"" class=""color-q4"" data-item=""pl=100&cc=5&bl=566"">New Talisman</a>.</dd> <dt>10 hours ago</dt> </dl> </li>"
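' XPATH WAY (needs Imports System.Xml.Linq and Imports System.Xml.XPath)
' The raw "&" characters in the data-item attributes are not well-formed XML,
' so escape them first; this simple Replace assumes the input has no existing entities.
Dim xDoc As XDocument = XDocument.Parse("<?xml version= '1.0'?><root>" + str.Replace("&", "&amp;") + "</root>")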
dds = xDoc.XPathSelectElements("//dd").Select(Function(m) m.Value).ToList()
dts = xDoc.XPathSelectElements("//dt").Select(Function(m) m.Value).ToList()
' REGEX WAY
dds = New List(Of String)
dts = New List(Of String)
Dim rx As Regex = New Regex("(?s)</a>([^<]*?)<a\s[^>]*?>([^<]*?)</a>([^<\r\n]*)")
Dim matches As IEnumerable(Of Match) = rx.Matches(str).Cast(Of Match)()
dds = (From match In matches
Select match.Groups(1).Value + match.Groups(2).Value + match.Groups(3).Value).ToList()
Dim rxDt As Regex = New Regex("(?s)<dt>\s*([^<]*?)\s*</dt>")
Dim matches_dts As IEnumerable(Of Match) = rxDt.Matches(str).Cast(Of Match)()
dts = (From match In matches_dts
Select match.Groups(1).Value).ToList()
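If you want each whole <li> block back as one match, as the question asks, one option is a single pattern with RegexOptions.Singleline and named groups. The following is only a minimal sketch, assuming every block follows the layout posted in the question. The html string, the module name, and the group names id, name, and when are stand-ins made up for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module FeedBlockDemo
    Sub Main()
        ' Hypothetical stand-in for the downloaded page source.
        Dim html As String =
            "<li> <dl> <dd> <a href=""/wow/en/item/113987"" class=""color-q4""></a>" &
            "Obtained <a href=""/wow/en/item/113987"" class=""color-q4"">Battering Talisman</a>." &
            "</dd> <dt>22 hours ago</dt> </dl> </li>"

        ' Singleline lets "." cross line breaks.
        ' (?:(?!</li>).)*? consumes as little as possible and can never
        ' run past a closing </li>, so one match cannot span two blocks.
        Dim blockRx As New Regex(
            "<li>\s*<dl>\s*<dd>(?:(?!</li>).)*?" &
            "Obtained\s*<a\s+href=""/wow/en/item/(?<id>\d+)""[^>]*>(?<name>[^<]*)</a>" &
            "(?:(?!</li>).)*?<dt>\s*(?<when>[^<]*?)\s*</dt>\s*</dl>\s*</li>",
            RegexOptions.Singleline)

        For Each m As Match In blockRx.Matches(html)
            ' m.Value holds the full <li>...</li> block; the groups hold the parts.
            Console.WriteLine(m.Groups("id").Value & " | " &
                              m.Groups("name").Value & " | " &
                              m.Groups("when").Value)
        Next
    End Sub
End Module
```

With this approach the item ID comes straight from the id group, so the Split calls are no longer needed. If the real page's markup varies between blocks, an HTML parser is likely to be more reliable than any regex.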
Upvotes: 1