Reputation: 309
I'm trying to get the scores from the soccer world cups since 1930, in this link are the match ups listed and by clicking in the team a new window pop up with the summary I'm looking for. My problem is that in the HTML source code I don't see a <a>
tag with the href inside of it as usually is, i'm trying to simulate a click event with the following VBA code:
Link: http://www.flashscore.com/soccer/world/world-cup-1990/results/
Vba code:
Sub Test()
Dim URL As String
Dim IE As InternetExplorer
Dim HTMLdoc As HTMLDocument
Dim TDelements As IHTMLElementCollection
Dim TDelement As HTMLTableCell
Dim i As Integer
URL = "http://www.flashscore.com/soccer/world/world-cup-1990/results/"
Set IE = New InternetExplorer
With IE
.Navigate URL
.Visible = True
While .Busy Or .ReadyState <> READYSTATE_COMPLETE: DoEvents: Wend
Set HTMLdoc = .Document
End With
Set TDelements = HTMLdoc.getElementsByTagName("td")
For Each TDelement In TDelements
If TDelement.Title = "Click for match detail!" Then
TDelement.Click
End If
Next
IE.Quit
End Sub
Upvotes: 1
Views: 8617
Reputation: 14169
I'd do this one differently, as what you're trying to achieve requires a massive overhead and might take you on a wild goose chase of links, actions, loops, and whatnot. Instead of emulating the actions, it's much better emulating the results of the actions.
Take for example the Germany - Argentina match. When you click on it, it opens this pop-up:
Of particular note here is the URL of the pop-up. Intuitively enough, the URL has some weird combination that might possibly be a unique identifier. The first thing we check is the tag for that row in the table we just clicked.
Check the highlighted part: it resembles the address in our pop-up closely, albeit there are four (4) extra characters at the beginning. Collapsing the row and cross-checking against other rows, we come to the conclusion that the rows use a similar format:
Removing the g_1_ from the id
attribute of the tr
tag gives us the unique identifier for every match. I'll be taking the sanitized id
from the Italy - England match and I will input it into the generic URL. We'll do this in another tab as the pop-up's address bar is locked:
Voila. The page is open in another tab, and from there you can get what you want from the match summary. You can even check the head-to-head by changing #match-summary
to #h2h;overall
.
Because it's safer. Since you're using Excel, you have the sheets at your control. The safest way is to record all these id
attributes from each row into, say, a column. Then loop over that column, concatenating it properly to produce the target URLs.
Then you loop over the URLs and get what you want from the EXACT pages you're looking for. There's no need anymore for the original link/table at this point since almost all the details are in the "summary" pages anyway.
Hope this helps and good luck.
As I feel a bit guilty not posting any code, here's my attempt at producing the id
values for the World Cup section. Note that I took some liberties with the code, so modify it to suit your approach better. I will only be providing the way to get the id
values. All else is up to you.
Sub Test()
Dim URL As String
Dim IE As New InternetExplorer
Dim HTMLdoc As HTMLDocument
Dim dictObj As Object: Set dictObj = CreateObject("Scripting.Dictionary")
Dim tRowID As String
URL = "http://www.flashscore.com/soccer/world/world-cup-1990/results/"
With IE
.Navigate URL
.Visible = True
Do Until .ReadyState = READYSTATE_COMPLETE: DoEvents: Loop
Set HTMLdoc = .Document
End With
With HTMLdoc
'Target the set of tables.
Set tblSet = .getElementById("fs-results")
'Index starts at 0. World Cup Playoffs is (0), World Cup is (1),
'Qualifiers is (2), etc... Targeting tbody right away, as there is
'only one (1) tbody tag per row anyway.
Set mTbl = tblSet.getElementsByTagName("tbody")(1)
Set tRows = mTbl.getElementsByTagName("tr")
With dictObj
'If if value is not yet in dictionary, store it.
For Each tRow In tRows
'Remove the first four (4) characters.
tRowID = Mid(tRow.getAttribute("id"), 5)
If Not .Exists(tRowID) Then
.Add tRowID, Empty
End If
Next tRow
End With
End With
'Print out each of the ID values.
For Each Key In dictObj
Debug.Print Key
Next Key
Set IE = Nothing
End Sub
Which returns the following id
values for the World Cup table (not the Playoffs!):
04LfR6G7
6RNojjg8
nww8zBgL
t0xCZU8R
dKY0Pgu3
jsY4ODf9
K4YcDdti
2qM7lD3j
bXvdGk2k
tdU0F9He
65hR8YfM
lUsW7hAS
jBqaxXO8
YZY3yivF
n1HjSn11
hvInTSoe
rHUdQZPd
xfS51jmd
ncyhHVnq
4b34kglp
O0kz7LJH
lpQE9EuG
UHlQLBvS
fNQAAfQ9
CKpewD92
EmJrU8Wl
KnT4ETW1
nRnnVlHr
b3T12WXk
jwUc3CIq
n9RDMiPL
pAjW714B
GznS8sk5
GWR6BzB3
UXW8NX9F
beX1CGec
Hope this helps you some more. Kindly make sure you read the comments. Good luck!
Upvotes: 3