Reputation: 129
Sorry for my limited knowledge of PowerShell. I am trying to read HTML content from a website and output it as a CSV file. Right now I can successfully download the whole HTML code with my PowerShell script:
$url = "http://cloudmonitor.ca.com/en/ping.php?vtt=1392966369&varghost=www.yahoo.com&vhost=_&vaction=ping&ping=start";
$Path = "$env:userprofile\Desktop\test.txt"
$ie = New-Object -com InternetExplorer.Application
$ie.visible = $true
$ie.navigate($url)
while($ie.ReadyState -ne 4) { start-sleep -s 10 }
#$ie.Document.Body.InnerText | Out-File -FilePath $Path
$ie.Document.Body | Out-File -FilePath $Path
$ie.Quit()
I get HTML code, something like this:
........
<tr class="light-grey-bg">
<td class="right-dotted-border">Stockholm, Sweden (sesto01):</td>
<td class="right-dotted-border"><span id="cp20">Okay</span>
</td>
<td class="right-dotted-border"><span id="minrtt20">21.8</span>
</td>
<td class="right-dotted-border"><span id="avgrtt20">21.8</span>
</td>
<td class="right-dotted-border"><span id="maxrtt20">21.9</span>
</td>
<td><span id="ip20">2a00:1288:f00e:1fe::3001</span>
</td>
</tr>
........
But what I really want is to get the content and output it to a CSV file like this:
Stockholm Sweden (sesto01),Okay,21.8,21.8,21.9,2a00:1288:f00e:1fe::3001
........
What command can help me achieve this task?
Upvotes: 4
Views: 13898
Reputation: 72630
This was interesting for me too; thanks for pointing out the CA site. I wrote this quickly, so it needs improvements.
Here is a way to do it using Html-Agility-Pack. In the following, I assume that HtmlAgilityPack.dll is in an Html-Agility-Pack subdirectory of the directory containing the script file.
# PingFromTheCloud.ps1
$url = "http://cloudmonitor.ca.com/en/ping.php?vtt=1392966369&varghost=www.silogix.fr&vhost=_&vaction=ping&ping=start";
$Path = "c:\temp\Pingtest.htm"
$ie = New-Object -com InternetExplorer.Application
$ie.visible = $true
$ie.navigate($url)
while($ie.ReadyState -ne 4) { start-sleep -s 10 }
#$ie.Document.Body.InnerText | Out-File -FilePath $Path
$ie.Document.Body | Out-File -FilePath $Path
$ie.Quit()
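# Load the HtmlAgilityPack assembly from the Html-Agility-Pack folder next to this script
Add-Type -Path "$(Split-Path -parent $PSCommandPath)\Html-Agility-Pack\HtmlAgilityPack.dll"
# Parse the page that was saved above
$webGraber = New-Object -TypeName HtmlAgilityPack.HtmlWeb
$webDoc = $webGraber.Load("c:\temp\Pingtest.htm")
# Select the results table by its XPath within the saved file
$Thetable = $webDoc.DocumentNode.ChildNodes.Descendants('table') | where {$_.XPath -eq '/div[3]/div[1]/div[5]/table[1]/table[1]'}
# Collect the rows of that table
$trDatas = $Thetable.ChildNodes.Elements("tr")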
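# Delete any previous output; do not complain if the file does not exist yet
Remove-Item "c:\temp\Pingtest.csv" -ErrorAction SilentlyContinue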
foreach ($trData in $trDatas)
{
    # Collect the trimmed text of every cell in this row
    $cells = @(foreach ($tdData in $trData.Elements("td")) { $tdData.InnerText.Trim() })
    # Skip rows without <td> cells (for example header rows), then write one CSV line per row
    if ($cells.Count -gt 0)
    {
        ($cells -join ',') | Out-File -FilePath "c:\temp\Pingtest.csv" -Append
    }
}
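If installing HtmlAgilityPack is not an option, the row-to-CSV step can be sketched on its own with regular expressions instead. This is only a rough sketch, not a robust HTML parser: it assumes the rows look like the sample in the question (no nested tables, well-formed `<tr>`/`<td>` tags), and the variable names `$html`, `$rows`, `$cells` and `$lines` are just illustrative. It also strips the comma and the trailing colon from the location cell, since that is what the desired output in the question shows.

```powershell
# Sample row copied from the question; in practice $html would hold the saved page text.
$html = @'
<tr class="light-grey-bg">
<td class="right-dotted-border">Stockholm, Sweden (sesto01):</td>
<td class="right-dotted-border"><span id="cp20">Okay</span>
</td>
<td class="right-dotted-border"><span id="minrtt20">21.8</span>
</td>
<td class="right-dotted-border"><span id="avgrtt20">21.8</span>
</td>
<td class="right-dotted-border"><span id="maxrtt20">21.9</span>
</td>
<td><span id="ip20">2a00:1288:f00e:1fe::3001</span>
</td>
</tr>
'@

# Find every <tr>...</tr> block; (?s) lets '.' span newlines, (?i) ignores case.
$rows = [regex]::Matches($html, '(?is)<tr\b[^>]*>(.*?)</tr>')

$lines = foreach ($row in $rows) {
    $cells = [regex]::Matches($row.Groups[1].Value, '(?is)<td\b[^>]*>(.*?)</td>') |
        ForEach-Object {
            # Drop inner tags such as <span>, then trim surrounding whitespace
            $text = ($_.Groups[1].Value -replace '<[^>]+>', '').Trim()
            # Assumption: remove the trailing colon and embedded commas so the
            # cell cannot break the comma-separated layout
            $text.TrimEnd(':') -replace ',', ''
        }
    # One CSV line per table row
    $cells -join ','
}

$lines
# prints: Stockholm Sweden (sesto01),Okay,21.8,21.8,21.9,2a00:1288:f00e:1fe::3001
```

Piping `$lines` to `Out-File` would then write the CSV file. For anything beyond a quick one-off, a real HTML parser such as the HtmlAgilityPack approach above is likely the safer choice.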
Upvotes: 1