Guillermo

Reputation:

How to get the content from HTML using VB6

How can I get the text content out of an HTML string, removing the tags around it?

I am looking for an example using VB6.

Upvotes: 1

Views: 4745

Answers (3)

MarkJ

Reputation: 30398

You can use Internet Explorer's HTML engine (MSHTML) as a COM object, without showing anything on screen. For example, to get a plain-text version of the HTML:

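Public Function Html2Text(ByVal Data As String) As String
   Dim obj As Object
   ' Errors are ignored, so the function returns "" if parsing fails.
   On Error Resume Next
   ' "htmlfile" is the ProgID of the MSHTML document object.
   Set obj = CreateObject("htmlfile")
   obj.Open
   obj.Write Data
   obj.Close   ' finish the document stream before reading from it
   Html2Text = obj.Body.InnerText
End Function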

You could also walk the element tree to do something more complicated.

Credit: Karl Peterson in Visual Studio Magazine.
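
As a rough illustration of walking the element tree, the sketch below collects the visible text of every link in an HTML string. It is not from the original article: the function name LinkTexts is hypothetical, and it assumes the late-bound "htmlfile" object exposes getElementsByTagName on your machine, as MSHTML normally does.

```vb
' Hypothetical helper (not part of the credited code):
' returns the text of every <a> element found in Data.
Public Function LinkTexts(ByVal Data As String) As Collection
    Dim doc As Object
    Dim el As Object
    Dim result As Collection

    Set result = New Collection

    ' Parse the HTML in memory with MSHTML; nothing is shown on screen.
    Set doc = CreateObject("htmlfile")
    doc.Open
    doc.Write Data
    doc.Close

    ' Visit each anchor element and keep only its text content.
    For Each el In doc.getElementsByTagName("a")
        result.Add el.innerText
    Next el

    Set LinkTexts = result
End Function
```

Swapping "a" for another tag name, or reading a different property such as el.getAttribute("href"), should follow the same pattern.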

Upvotes: 5

Peter

Reputation: 3008

The HTML may be malformed, which makes it very difficult to remove the tags reliably with regular expressions. An alternative is to load Internet Explorer as a COM object in VB, load the HTML document into it, and use it to walk the parsed element tree.
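
A minimal sketch of that approach, assuming Internet Explorer is installed and can be automated on the machine (it has been retired on recent Windows versions). The function name PageText is hypothetical; the value 4 is READYSTATE_COMPLETE.

```vb
' Hypothetical helper: loads Url in a hidden Internet Explorer
' instance and returns the plain text of the page body.
Public Function PageText(ByVal Url As String) As String
    Dim ie As Object

    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = False
    ie.Navigate Url

    ' Wait until the page has finished loading (4 = READYSTATE_COMPLETE).
    Do While ie.Busy Or ie.ReadyState <> 4
        DoEvents
    Loop

    ' The browser has already repaired any malformed markup,
    ' so the parsed document can be read directly.
    PageText = ie.Document.body.innerText

    ' Close the hidden browser so it does not linger in the background.
    ie.Quit
    Set ie = Nothing
End Function
```

From ie.Document you can also call methods such as getElementsByTagName to walk individual elements instead of taking the whole body text.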

Upvotes: 0

Pooria

Reputation: 780

You can use regular expressions: build a pattern and extract the data you want from the HTML. This page explains how to use regular expressions in VB6: http://www.regular-expressions.info/vb.html
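
As a rough sketch of the regular-expression route, the function below deletes anything that looks like a tag. The name StripTags and the pattern are illustrative assumptions, not taken from the linked page. The pattern is deliberately naive: it will misbehave on malformed markup, on a ">" inside an attribute value, and it leaves script and style contents in place.

```vb
' Hypothetical helper: removes everything between "<" and ">".
Public Function StripTags(ByVal Html As String) As String
    Dim re As Object

    ' Late-bound VBScript regular expression object.
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True          ' replace every match, not just the first
    re.IgnoreCase = True
    re.Pattern = "<[^>]*>"    ' "<", then any run of non-">" characters, then ">"

    StripTags = re.Replace(Html, "")
End Function
```

For example, StripTags("<p>Hello <b>world</b></p>") returns "Hello world". HTML entities such as &amp;amp; are not decoded by this approach.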

Upvotes: 2
