Reputation: 702
I'm trying to get player names from the scoring table on the http://www.pgatour.com/leaderboard.html page, but searching for them with getElementsByTagName in PowerShell returns nothing:
$HTML = Invoke-WebRequest -Uri http://www.pgatour.com/leaderboard.html
$HTML.ParsedHtml.getElementsByTagName("a") | where { $_.className -like '*expansion*' }
Searching in the browser's developer tools with the CSS selector .name.expansion returns the player names I need, but as far as I know there is no way to search by CSS selector in PowerShell.
I also tried $HTML.AllElements, but had no luck.
Please advise on the best way to accomplish this. Thanks!
Upvotes: 1
Views: 2269
Reputation: 67
The problem is that you don't get the same page in PowerShell as in the browser. To check this, try the following code:
$HTML = Invoke-WebRequest -Uri "http://www.pgatour.com/leaderboard.html"
$HTML.Content > leaderboard.html
Then open leaderboard.html in a browser. As you can see, it shows the message
It appears your browser may be outdated. For the best website experience, we recommend updating your browser.
and the leaderboard is missing. What you can try is to get the content via IE:
$ie = New-Object -ComObject InternetExplorer.Application
#$ie.Visible = $true   # uncomment to watch the page load
$ie.Navigate("http://www.pgatour.com/leaderboard.html")
while ($ie.ReadyState -ne 4) { Start-Sleep -Milliseconds 100 }   # wait until the page is ready
Start-Sleep -Seconds 30   # wait for the leaderboard to load
$ahrefs = $ie.Document.getElementsByTagName("a")
$names = $ahrefs | Where-Object { $_.className -eq "name expansion" }
$names | ForEach-Object { Write-Host $_.textContent }
$ie.Quit()   # close the hidden IE process when done
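A side note on the filter: the selector .name.expansion from the question matches elements that carry both classes in any order, possibly with extra classes, whereas -eq "name expansion" matches only that exact string. Below is a minimal sketch of a token-based check. Test-HasCssClasses is a hypothetical helper name I made up, not a built-in cmdlet, and it only helps once the links are actually present in the document.

```powershell
# Hypothetical helper (not a built-in cmdlet): returns $true when the
# className string contains every required class, in any order.
function Test-HasCssClasses {
    param(
        [string]$ClassName,      # raw className value, e.g. "name expansion"
        [string[]]$Required      # classes that must all be present
    )
    # Split on whitespace and drop empty tokens.
    $tokens = @($ClassName -split '\s+' | Where-Object { $_ })
    foreach ($class in $Required) {
        if ($tokens -notcontains $class) { return $false }
    }
    return $true
}

# Possible usage with the $ahrefs collection from the answer:
# $names = $ahrefs | Where-Object { Test-HasCssClasses $_.className 'name', 'expansion' }
```

Note that -notcontains is case-insensitive in PowerShell, so this is slightly looser than a real CSS class match.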
Note that the IE-based solution above is extremely slow, mainly because of the fixed 30-second wait for the leaderboard to load.
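One way to cut the waiting time might be to poll for the player links instead of always sleeping for a fixed period. Here is a rough sketch of a generic polling helper. Wait-Until is a hypothetical function name, not a built-in cmdlet, and the default timeout and poll interval are arbitrary assumptions. I have not measured how much time this saves on this particular page.

```powershell
# Hypothetical helper (not a built-in cmdlet): runs $Condition repeatedly
# until it returns a truthy value or the timeout expires.
function Wait-Until {
    param(
        [scriptblock]$Condition,        # check to repeat
        [int]$TimeoutSeconds = 30,      # assumed upper bound on the wait
        [int]$PollMilliseconds = 500    # assumed pause between checks
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ((Get-Date) -lt $deadline) {
        if (& $Condition) { return $true }
        Start-Sleep -Milliseconds $PollMilliseconds
    }
    return $false   # timed out before the condition became true
}

# Possible usage in place of the fixed 30-second sleep, with $ie from the answer:
# $loaded = Wait-Until {
#     @($ie.Document.getElementsByTagName("a") |
#         Where-Object { $_.className -eq "name expansion" }).Count -gt 0
# }
```

The helper returns $false on timeout, so the caller can decide whether to retry or give up rather than silently reading an empty list.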
Upvotes: 1