Reputation: 71
I'm an ASP.NET C# developer and I want to download a CSV file from a PHP site. Apologies if this has already been covered, but in this case the link triggers a JavaScript form submit.
Right now we log in on the login page and then enter a date on the search page. The results page shows the results in HTML and also has a "download" link which, when clicked, serves a CSV file.
We would like to "pull" the CSV file periodically via some screen scraping / download of the file.
The "download" link uses the following JavaScript to submit a form back to the server to download the CSV file:
javascript:document.aForm.action='download.php'; setTarget();document.aForm.submit();
I want to set up a C# console application that runs periodically to log in and download the CSV file.
Can I use the WebClient.DownloadData method to access this CSV file? A code sample would be appreciated.
Can I do some programmatic scraping to log in, fill in some search criteria, and get to the results page to download the CSV?
What is recommended for this? HTML Agility Pack? Any links or code snippets are greatly appreciated.
Thanks for any assistance.
Upvotes: 1
Views: 1849
Reputation: 3911
Yes, you can use the WebClient class, and for downloading a file you can use the WebClient.DownloadFile method.
Alternatively, you can use curl through libcurl.NET; for reference you can follow this post.
To log in to a specific page you can try this:
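WebClient client = new WebClient();
client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705;)");
// UploadData sets no content type itself; PHP only fills $_POST for form-encoded bodies.
client.Headers.Add("Content-Type", "application/x-www-form-urlencoded");
// POST the login fields; field1/field2 stand in for the real form field names.
byte[] bret = client.UploadData("http://www.website.com/post.php", "POST",
System.Text.Encoding.ASCII.GetBytes("field1=value1&field2=value2") );
string sret = System.Text.Encoding.ASCII.GetString(bret);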
Once you have a successful response, you can parse the response string to get the hyperlink, and then use WebClient.DownloadFile to download the file:
try
{
client.DownloadFile("http://www.xyz.com/download.php","abc.csv");
Console.WriteLine("File Saved.");
}
catch (WebException we)
{
Console.WriteLine(we.Message + "\n" + we.Status.ToString());
}
catch (NotSupportedException ne)
{
Console.WriteLine(ne.Message);
}
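Parsing the response for the link could look something like the sketch below. It assumes the page contains a javascript: link of the form quoted in the question and pulls out whatever is assigned to the form's action. LinkParser and ExtractFormAction are names made up for this illustration, and for anything beyond a simple pattern like this an HTML parser such as HTML Agility Pack is probably the safer choice.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LinkParser
{
    // Returns the value assigned to ".action" in the first javascript: link found
    // in the HTML (e.g. "download.php"), or null if nothing matches.
    // Hypothetical helper: it only understands the simple  x.action='...'  pattern.
    public static string ExtractFormAction(string html)
    {
        Match m = Regex.Match(html, @"\.action\s*=\s*['""]([^'""]+)['""]");
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

You would call it as LinkParser.ExtractFormAction(sret) on the response string from the login/search POST and combine the result with the site's base URL.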
I guess in your case the JavaScript on that hyperlink is doing a POST to download.php, so instead of using WebClient.DownloadFile you can simply do the POST manually using WebClient, as shown above for the login.
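One thing to watch for: a plain WebClient does not carry cookies from one request to the next, so the PHP session created at login would normally be lost before the download request. A common workaround is to subclass WebClient and attach a shared CookieContainer to every request. The following is only a sketch under assumptions: login.php, the field names username, password and date, and the CsvFetcher class are all placeholders I made up, so replace them with whatever the real forms use (view the page source to find the input names).

```csharp
using System;
using System.Collections.Specialized;
using System.IO;
using System.Net;

// WebClient subclass that reuses one CookieContainer for all of its requests,
// so the session cookie set at login is sent along with later requests.
public class CookieAwareWebClient : WebClient
{
    public CookieContainer Cookies { get; } = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request is HttpWebRequest http)
        {
            http.CookieContainer = Cookies;
        }
        return request;
    }
}

public static class CsvFetcher
{
    // Logs in, then posts the search form to download.php and saves the reply.
    // URLs and form field names are assumptions, not taken from the real site.
    public static void Fetch(string baseUrl, string user, string password,
                             string date, string outputPath)
    {
        using (var client = new CookieAwareWebClient())
        {
            // UploadValues form-encodes the fields and sets the content type.
            var login = new NameValueCollection
            {
                { "username", user },
                { "password", password }
            };
            client.UploadValues(baseUrl + "/login.php", "POST", login);

            // Same client, same cookies: this mimics document.aForm.submit().
            var search = new NameValueCollection { { "date", date } };
            byte[] csv = client.UploadValues(baseUrl + "/download.php", "POST", search);
            File.WriteAllBytes(outputPath, csv);
        }
    }
}
```

From the console application's Main you would call something like CsvFetcher.Fetch("http://www.website.com", "user", "pass", "2012-01-31", "abc.csv") inside a try/catch for WebException, and schedule the executable with Windows Task Scheduler. If the real form has hidden fields, they most likely need to be posted as well.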
Upvotes: 1