araf

Reputation: 169

How can I download a file such as a .doc or .pdf from the internet to my hard drive using C#?

How can I download a file (for example a .doc or .pdf) from the internet to my hard drive using C#?

Upvotes: 2

Views: 581

Answers (5)

BrokenGlass

Reputation: 160862

Use the WebClient class:

using(WebClient wc = new WebClient())
wc.DownloadFile("http://a.com/foo.pdf", @"D:\foo.pdf");

Edit based on comments:

Based on your comments, I think you are trying to download files (e.g. PDFs) that are linked from an HTML page. In that case you can:

  1. Download the page (with WebClient, see above)

  2. Use the HtmlAgilityPack to find all the links within the page that point to pdf files

  3. Download the pdf files
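The three steps above could be sketched roughly as follows. This is only an illustration under some assumptions: the class and method names (PdfLinkDownloader, FindPdfLinks, DownloadPdfs) are made up for this example, the page URL and target folder are supplied by the caller, the HtmlAgilityPack package is referenced, and files are named after the last segment of each link.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using HtmlAgilityPack;

// Hypothetical helper class; the names here are illustrative only.
public static class PdfLinkDownloader
{
    // Step 2: return the absolute URLs of all links whose path ends in ".pdf".
    public static List<Uri> FindPdfLinks(string html, Uri pageUri)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<Uri>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
            return result;

        foreach (var anchor in anchors)
        {
            string href = anchor.GetAttributeValue("href", "");
            Uri absolute;
            // Resolve relative links such as "/docs/a.pdf" against the page URL.
            if (Uri.TryCreate(pageUri, href, out absolute)
                && absolute.AbsolutePath.EndsWith(".pdf", StringComparison.OrdinalIgnoreCase))
            {
                result.Add(absolute);
            }
        }
        return result;
    }

    public static void DownloadPdfs(string pageUrl, string targetFolder)
    {
        var pageUri = new Uri(pageUrl);
        using (var wc = new WebClient())
        {
            // Step 1: download the page itself.
            string html = wc.DownloadString(pageUri);

            // Step 3: download every PDF that the page links to.
            foreach (Uri pdf in FindPdfLinks(html, pageUri))
            {
                string fileName = Path.GetFileName(pdf.LocalPath);
                wc.DownloadFile(pdf, Path.Combine(targetFolder, fileName));
            }
        }
    }
}
```

Calling PdfLinkDownloader.DownloadPdfs with a page URL and an existing folder would then save each linked PDF into that folder. Note that two links with the same file name would overwrite each other in this sketch.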

I am developing a crawler where, if I specify a keyword (e.g. "SHA algorithm") and select the .pdf or .doc option, it should download files of the selected format into a target folder.

Based on your clarification, this is a solution that uses Google to get the search results:

DownloadSearchHits("SHA", "pdf");

...

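// Requires: using System; using System.Linq; using System.Net; using HtmlAgilityPack;
public static void DownloadSearchHits(string searchTerm, string fileType)
{
    using (WebClient wc = new WebClient())
    {
        // Escape the search term so spaces and special characters survive in the query string
        string html = wc.DownloadString(string.Format("http://www.google.com/search?q={0}+filetype%3A{1}", Uri.EscapeDataString(searchTerm), fileType));
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches, so check before querying
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
            return;

        // Match the requested file type instead of a hard-coded ".pdf"
        var fileLinks = anchors
                            .Select(link => link.Attributes["href"].Value)
                            .Where(href => href.EndsWith("." + fileType, StringComparison.OrdinalIgnoreCase))
                            .ToList();

        int index = 0;
        foreach (string fileUrl in fileLinks)
        {
            wc.DownloadFile(fileUrl, 
                            string.Format(@"C:\download\{0}.{1}", 
                                            index++, 
                                            fileType));
        }
    }
}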

In general, though, you should ask about a particular problem you have with an implementation you already have. Based on your question, you are still a long way from being able to implement a standalone crawler.

Upvotes: 3

NerdFury

Reputation: 19214

Use WebClient.DownloadFile:

http://msdn.microsoft.com/en-us/library/system.net.webclient.downloadfile.aspx

    using (var client = new WebClient())
    {
        // DownloadFile returns void, so there is no result to assign
        client.DownloadFile(url, filename);
    }

Upvotes: 0

Lou Franco

Reputation: 89162

Use WebClient.DownloadFile() from the System.Net namespace.

Upvotes: 1

Henk Holterman

Reputation: 273209

using (var client = new System.Net.WebClient())
{
    client.DownloadFile( "url", "localFilename");
}

Upvotes: 5

Scott Chamberlain

Reputation: 127543

The simplest way is to use WebClient.DownloadFile.

Upvotes: 3
