Phantom

Reputation: 51

Filter out Html Number Via Regex

I'm trying to filter the HTML source of a page down to the lines that contain a specific number.

The problem is that the number can be anything between 100 and 25000, in increments of 50.

How can I check for every value between 100 and 25000 without writing 100 lines of code? txtAmount.Text contains the website's HTML source.

This is what I have so far:

    Const Amount = "(<td class=""text-right"">100</td>)|(<td class=""text-right"">150</td>)|(<td class=""text-right"">200</td>)|(<td class=""text-right"">etc</td>)"

    Dim qty As New List(Of String)
    qty = txtAmount.Lines.ToList
    For i As Integer = qty.Count - 1 To 0 Step -1
        If Not Regex.IsMatch(qty(i), Amount) Then
            qty.RemoveAt(i)
        End If
    Next
    txtAmount.Lines = qty.ToArray



HTML:
    <td class="text-right">100</td> <=== I need to get this number

Upvotes: 1

Views: 71

Answers (1)

ElektroStudios

Reputation: 20484

You could use the \d metacharacter to capture only digits (a run of 3 to 5 digits in this example):

' Requires: Imports System.Text.RegularExpressions
Dim html As String =
    "<td class=""text-right"">Quantity</td> <td class=""text-right"">100</td>"

' The named group "value" captures 3 to 5 digits inside a text-right cell.
Dim rgx As New Regex(".+text-right.+>(?<value>\d{3,5})<.+", RegexOptions.Singleline)

' Run the match once and reuse the result.
Dim m As Match = rgx.Match(html)
If m.Success Then
    Dim value As Integer = CInt(m.Groups("value").Value)
    Console.WriteLine(value) ' 100 (or whatever other digits exist in the html field.)
End If
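
Note that \d{3,5} only limits the number of digits, so it also accepts values such as 125 or 99999. If the 100 to 25000 range and the step of 50 matter, one option is to capture the digits with the regex and do the range check in code. The following is a minimal sketch of that idea, not a drop-in solution: the module name, the HasValidAmount helper and the sample lines are hypothetical, and it assumes each cell looks exactly like the `<td class="text-right">` line in the question.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module QuantityFilter

    ' Captures the digits inside a <td class="text-right"> cell.
    ' Assumption: the cell markup matches the question's example exactly.
    Private ReadOnly CellPattern As New Regex(
        "<td class=""text-right"">(?<value>\d{3,5})</td>")

    ' Hypothetical helper: True when the line contains a cell whose number
    ' is between 100 and 25000 (inclusive) and is a multiple of 50.
    Function HasValidAmount(line As String) As Boolean
        For Each m As Match In CellPattern.Matches(line)
            Dim value As Integer = Integer.Parse(m.Groups("value").Value)
            If value >= 100 AndAlso value <= 25000 AndAlso value Mod 50 = 0 Then
                Return True
            End If
        Next
        Return False
    End Function

    Sub Main()
        ' Made-up sample lines standing in for txtAmount.Lines.
        Dim lines As String() = {
            "<td class=""text-right"">Quantity</td>",
            "<td class=""text-right"">100</td>",
            "<td class=""text-right"">125</td>",
            "<td class=""text-right"">25000</td>",
            "<td class=""text-right"">25050</td>"
        }

        ' Keep only the lines that pass the range check.
        Dim kept As New List(Of String)
        For Each line As String In lines
            If HasValidAmount(line) Then kept.Add(line)
        Next

        ' Prints only the lines containing 100 and 25000.
        For Each line As String In kept
            Console.WriteLine(line)
        Next
    End Sub

End Module
```

In the form from the question, the same check could replace the long alternation: loop over txtAmount.Lines, keep the lines for which HasValidAmount returns True, and assign the result back with txtAmount.Lines = kept.ToArray().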

Upvotes: 1
