Nobodyukno

Reputation: 21

Extract portion of HTML from website?

I'm trying to use VBA in Excel to navigate a site with Internet Explorer and download an Excel file for each day.

After looking through the site's HTML, it looks like each day's page has a similar structure, but part of the link seems completely random. That random part is fixed for a given page, though: it does not change each time you load the page.

The following portion of the HTML code contains the unique string:

<a href="#" onClick="showZoomIn('222698519','b1a9134c02c5db3c79e649b7adf8982d', event);return false;

The part starting with "b1a" is what is used in the website link. Is there any way to extract this part of the page and assign it to a variable that I can then use to build the link?

Upvotes: 0

Views: 106

Answers (1)

Matteo NNZ

Reputation: 12665

Since you don't show your code, I'll answer in general terms too:

1) Get all the link elements (<a>) with Set allLinks = ie.document.getElementsByTagName("a"). This returns a collection containing every link in the document.

2) Identify the link that contains the information you want. Let's say it's the 4th one (if its position varies, you can inspect each link's properties to find it):

Set myLink = allLinks(3) '<- 4th : index = 3 (starts from zero)

3) Extract the token with Split. Splitting the onClick text on single quotes puts the second quoted argument at index 3:

myToken = Split(myLink.onClick, "'")(3)

Of course, you can be more concise if the link containing the token is always at the same position, e.g. always the 4th link:

myToken = Split(ie.document.getElementsByTagName("a")(3).onClick,"'")(3)
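If the link's position is not fixed, the steps above can be combined into a search over all links. The following is only a minimal sketch: the helper names ExtractZoomToken and FindZoomToken are made up for illustration, and it assumes the page is already fully loaded in ie and that the token is always the second quoted argument of showZoomIn. It reads outerHTML rather than onClick, on the assumption that outerHTML is a plain string that always includes the onclick attribute text.

```vba
Option Explicit

' Hypothetical helper: returns the second quoted argument of a
' showZoomIn(...) call found in sourceText, or "" if there is no such call.
Function ExtractZoomToken(ByVal sourceText As String) As String
    Dim startPos As Long
    Dim parts() As String

    startPos = InStr(1, sourceText, "showZoomIn(", vbTextCompare)
    If startPos = 0 Then Exit Function

    ' Split on single quotes from the call onward:
    ' parts(1) is the first argument, parts(3) is the second.
    parts = Split(Mid$(sourceText, startPos), "'")
    If UBound(parts) >= 3 Then ExtractZoomToken = parts(3)
End Function

' Hypothetical helper: scans every <a> element of an already loaded
' document and returns the first token found, or "" if none matches.
Function FindZoomToken(ByVal doc As Object) As String
    Dim link As Object
    Dim token As String

    For Each link In doc.getElementsByTagName("a")
        ' Assumption: outerHTML contains the onclick attribute as text.
        token = ExtractZoomToken(link.outerHTML)
        If Len(token) > 0 Then
            FindZoomToken = token
            Exit Function
        End If
    Next link
End Function
```

You would then call myToken = FindZoomToken(ie.document) and check that myToken is not empty before concatenating it into the link.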

Upvotes: 1
