Reputation: 978
I am making a small "home" application using VB. As the title says, I want to grab a part of the text from a local HTML file and use it as a variable, or put it in a textbox.
I have tried something like this...
    Private Sub Open_Button_Click(sender As Object, e As EventArgs) Handles Open_Button.Click
        Dim openFileDialog As New OpenFileDialog()
        openFileDialog.CheckFileExists = True
        openFileDialog.CheckPathExists = True
        openFileDialog.FileName = ""
        openFileDialog.Filter = "All|*.*"
        openFileDialog.Multiselect = False
        openFileDialog.Title = "Open"
        If openFileDialog.ShowDialog() = Windows.Forms.DialogResult.OK Then
            ' Read the whole file into a string (uses the dialog declared above)
            Dim fileReader As String = My.Computer.FileSystem.ReadAllText(openFileDialog.FileName)
            TextBox.Text = fileReader
        End If
    End Sub
The result is that the whole HTML code is loaded into the textbox. What should I do to grab only a specific part of the HTML file's code? Let's say I want to grab only the word "text" from this span: <span id="something">This is a text!!!</span>
Upvotes: 0
Views: 1019
Reputation: 18310
Using an HTML parser is highly recommended due to the HTML language's many nested tags (see this question for example).
However, finding the contents of a single tag using Regex is possible without major problems if the HTML is well formed.
This would be what you need (the function is case-insensitive):
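    ' Requires "Imports System.Text.RegularExpressions" at the top of the file.
    ' Returns LookFor if it occurs inside the span with the given id, otherwise "".
    Public Function FindTextInSpan(ByVal HTML As String, ByVal SpanId As String, ByVal LookFor As String) As String
        Dim m As Match = Regex.Match(HTML, "(?<=<span.+id=""" & SpanId & """.*>.*)" & LookFor & "(?=.*<\/span>)", RegexOptions.IgnoreCase)
        ' Regex.Match never returns Nothing; a failed match has Success = False.
        Return If(m.Success, m.Value, "")
    End Function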
The parameters of the function are:
HTML: The HTML code as a string.
SpanId: The id of the span (e.g. in <span id="hello">, hello is the id).
LookFor: The text to look for inside the span.
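If the goal is to get everything inside the span rather than to check for a known word, a capture group can return the contents directly. This is only a minimal sketch and not part of the function above: GetSpanContents and the module name are hypothetical, it assumes the id attribute is written with double quotes, and it will stop at the first </span>, so nested spans are not handled.

```vb
Imports System
Imports System.Text.RegularExpressions

Module SpanRegexSketch
    ' Hypothetical helper (an assumption, not the answer's function): returns
    ' everything between <span id="..."> and the next </span>, or "" if no match.
    Function GetSpanContents(ByVal html As String, ByVal spanId As String) As String
        ' Regex.Escape keeps characters such as "." in the id from acting as regex syntax.
        Dim pattern As String = "<span\b[^>]*\sid=""" & Regex.Escape(spanId) & """[^>]*>(.*?)</span>"
        ' Singleline lets "." match line breaks, so the contents may span several lines.
        Dim m As Match = Regex.Match(html, pattern, RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        Return If(m.Success, m.Groups(1).Value, "")
    End Function

    Sub Main()
        Dim html As String = "<p>Intro</p><span id=""something"">This is a text!!!</span>"
        Console.WriteLine(GetSpanContents(html, "something"))
        Console.WriteLine(GetSpanContents(html, "missing") = "")
    End Sub
End Module
```

With the sample string, the first call prints This is a text!!! and the second line prints True, because an unknown id yields an empty string.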
Online test: http://ideone.com/luGw1V
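For the parser route recommended at the top of this answer, something along these lines might work. It is a sketch under assumptions: the answer does not name a parser, so the third-party HtmlAgilityPack NuGet package is my own pick here, and the sample HTML string is made up for illustration.

```vb
Imports System

Module SpanParserSketch
    Sub Main()
        ' Sample input (an assumption); in the question this would be the file contents.
        Dim html As String = "<p>Intro</p><span id=""something"">This is a text!!!</span>"

        ' Fully qualified to avoid a clash with System.Windows.Forms.HtmlDocument.
        Dim doc As New HtmlAgilityPack.HtmlDocument()
        doc.LoadHtml(html)

        ' GetElementbyId returns Nothing when no element has that id.
        Dim span As HtmlAgilityPack.HtmlNode = doc.GetElementbyId("something")
        Dim text As String = If(span IsNot Nothing, span.InnerText, "")

        Console.WriteLine(text)
    End Sub
End Module
```

Unlike the regex, a parser copes with nested tags and attributes in any order, at the cost of an extra dependency.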
Upvotes: 1
Reputation: 993
I make the following assumptions on this answer.
I'd do something like this:
    ' get the html document (openFileDialog is the dialog from the question)
    Dim fileReader As String = My.Computer.FileSystem.ReadAllText(openFileDialog.FileName)
    ' split the html text based on the span element
    Dim fileSplit As String() = fileReader.Split(New String() {"<span id=""something"">"}, StringSplitOptions.None)
    ' get the last part of the text (Last/First need System.Linq, imported by default in most VB projects)
    fileReader = fileSplit.Last()
    ' we now need to trim everything after the close tag
    fileSplit = fileReader.Split(New String() {"</span>"}, StringSplitOptions.None)
    ' get the first part of the text
    fileReader = fileSplit.First()
    ' the fileReader variable should now contain the contents of the span tag with id "something"
Note: this code is untested and I typed it in the Stack Exchange mobile app, so there might be some autocorrect typos in it.
You might want to add in some error validation such as making sure that the span element only occurs once, etc.
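As a rough idea of what that validation could look like, here is a minimal sketch. TryGetSpanContents and the module name are hypothetical, and it assumes the opening tag appears in the file exactly as written in the openTag string.

```vb
Imports System

Module SpanSplitSketch
    ' Hypothetical helper: returns the span contents, or Nothing when the
    ' opening tag is missing, appears more than once, or is never closed.
    Function TryGetSpanContents(ByVal html As String, ByVal openTag As String) As String
        Dim parts As String() = html.Split(New String() {openTag}, StringSplitOptions.None)
        ' Exactly one occurrence of the opening tag yields exactly two parts.
        If parts.Length <> 2 Then Return Nothing

        ' Keep only the text before the first closing tag.
        Dim closeIndex As Integer = parts(1).IndexOf("</span>", StringComparison.OrdinalIgnoreCase)
        If closeIndex < 0 Then Return Nothing
        Return parts(1).Substring(0, closeIndex)
    End Function

    Sub Main()
        Dim openTag As String = "<span id=""something"">"
        Console.WriteLine(TryGetSpanContents("<span id=""something"">This is a text!!!</span>", openTag))
        Console.WriteLine(TryGetSpanContents("<p>no span here</p>", openTag) Is Nothing)
    End Sub
End Module
```

The first call prints This is a text!!! and the second line prints True, since the input has no matching span. Returning Nothing lets the caller tell "not found" apart from an empty span.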
Upvotes: 1