Arti

Reputation: 3071

Use wget command to download multiple files at different locations

I want to download multiple files, say www.google.com, yahoo.com and gmail.com, to three different locations using wget. How should I go about it? Please help me out.

I am doing all this through C#:

        ProcessStartInfo startInfo = new ProcessStartInfo("CMD.exe");
        Process p = new Process();
        startInfo.RedirectStandardInput = true;
        startInfo.UseShellExecute = false;
        startInfo.RedirectStandardOutput = true;
        startInfo.RedirectStandardError = true;
        p = Process.Start(startInfo);

        p.StandardInput.WriteLine(@"wget --output-document=C:\1.xml xyz.com/a.xml");
        p.StandardInput.WriteLine(@"wget --output-document=C:\2.xml xyz.com/b.xml");
        p.StandardInput.WriteLine(@"wget --output-document=C:\3.xml xyz.com/c.xml");

        p.StandardInput.WriteLine(@"EXIT");
        string output = p.StandardOutput.ReadToEnd();
        string error = p.StandardError.ReadToEnd();
        p.WaitForExit();
        p.Close();

This is not working. I would like to know if there are any other ways of downloading multiple files using wget.

Upvotes: 0

Views: 4546

Answers (2)

paxdiablo

Reputation: 882446

If you're just talking about retrieving each file from a different location, but still doing it sequentially, you just change the URI in the wget command to point to a different location.

If you want concurrent downloads rather than sequential, you would have to start three separate processes and have them download one file each. These processes could run side by side, but I'd probably only consider this for large files (which an XML file probably is not).
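As a minimal sketch of the three-process idea, assuming wget is on the PATH: each download gets its own process, all three are started before any is waited on, and the streams are left unredirected. The class name ConcurrentWget, the helper BuildArguments, and the paths and URLs (borrowed from the question's pattern) are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class ConcurrentWget
{
    // Builds the wget argument string for one download (hypothetical helper).
    public static string BuildArguments(string outputPath, string url)
    {
        return "--output-document=\"" + outputPath + "\" \"" + url + "\"";
    }

    static void Main()
    {
        // Placeholder output paths and URLs following the question's pattern.
        string[][] jobs =
        {
            new[] { @"C:\1.xml", "http://xyz.com/a.xml" },
            new[] { @"C:\2.xml", "http://xyz.com/b.xml" },
            new[] { @"C:\3.xml", "http://xyz.com/c.xml" }
        };

        // Start every wget process first so the downloads overlap.
        var running = new List<Process>();
        foreach (string[] job in jobs)
        {
            var info = new ProcessStartInfo("wget", BuildArguments(job[0], job[1]))
            {
                UseShellExecute = false // launch wget itself, not via the shell
            };
            running.Add(Process.Start(info));
        }

        // Only now wait for each one to finish.
        foreach (Process p in running)
        {
            p.WaitForExit();
            Console.WriteLine("wget exited with code " + p.ExitCode);
            p.Dispose();
        }
    }
}
```

Because nothing is redirected here, each wget writes its progress straight to the console, so the output of the three downloads will be interleaved.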

If you're having trouble getting the commands to run at all, the first thing I would do is ditch cmd.exe and its standard input. There's no reason why you can't have a process run wget directly. Or, if you really only want to start the one process, you could write the commands to a temporary file and use a single process, cmd /c tempfile.cmd, to run it.
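The temporary-file variant might look roughly like this. It is a sketch under assumptions: wget is on the PATH, the class name BatchWget and helper WriteBatchFile are made up for illustration, and the commands are the ones from the question.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class BatchWget
{
    // Writes the commands to a temporary .cmd file and returns its path
    // (hypothetical helper).
    public static string WriteBatchFile(string[] commands)
    {
        string path = Path.Combine(Path.GetTempPath(),
                                   Path.GetRandomFileName() + ".cmd");
        File.WriteAllLines(path, commands);
        return path;
    }

    static void Main()
    {
        string batch = WriteBatchFile(new[]
        {
            @"wget --output-document=C:\1.xml xyz.com/a.xml",
            @"wget --output-document=C:\2.xml xyz.com/b.xml",
            @"wget --output-document=C:\3.xml xyz.com/c.xml"
        });

        try
        {
            // One cmd.exe process runs the whole file; nothing is fed
            // through standard input and no streams are redirected.
            var info = new ProcessStartInfo("cmd.exe", "/c \"" + batch + "\"")
            {
                UseShellExecute = false
            };
            using (Process p = Process.Start(info))
            {
                p.WaitForExit();
                Console.WriteLine("cmd exited with code " + p.ExitCode);
            }
        }
        finally
        {
            File.Delete(batch); // clean up the temporary file
        }
    }
}
```

The downloads still run sequentially here, since cmd executes the file one line at a time.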


However, there may be a totally different problem you're having unrelated to what you've shown, because that exact code with three echo statements in place of your wget ones runs fine, generating the correct output, at least in Visual C# Express 2010.

And, in fact, once I got my GnuWin32 wget on to the path, the following worked as well, getting real documents off the net and placing them in my top-level directory:

using System;
using System.Diagnostics;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            ProcessStartInfo startInfo = new ProcessStartInfo("cmd.exe");
            Process p = new Process();
            startInfo.RedirectStandardInput = true;
            startInfo.UseShellExecute = false;
            startInfo.RedirectStandardOutput = true;
            startInfo.RedirectStandardError = true;
            p = Process.Start(startInfo);

            p.StandardInput.WriteLine(
                @"wget --output-document=c:\q1.txt http://www.ibm.com");
            p.StandardInput.WriteLine(
                @"wget --output-document=c:\q2.txt http://www.microsoft.com");
            p.StandardInput.WriteLine(
                @"wget --output-document=c:\q3.txt http://www.borland.com");

            p.StandardInput.WriteLine(@"exit");

            string output = p.StandardOutput.ReadToEnd();
            string error = p.StandardError.ReadToEnd();
            p.WaitForExit();
            p.Close();
        }
    }
}

Here's the proof, the single window partway through the Microsoft download:

[Image: console window partway through the Microsoft download]

So, bottom line, what you have shown us is not inherently unworkable, as evidenced by the image above. My only suggestion is to start looking around at other things, such as the version of wget you're using (GnuWin32 or Cygwin).


Now, things get interesting with larger files, as you've stated in one of your comments. If I change all three URIs to http://download.microsoft.com/download/5/F/C/5FC4F80C-242D-423B-9A11-9510A013152D/Dolphins.themepack, a file of 12,889,103 bytes, the code above hangs at about 18% of the first download (around the 2.3M mark).

However, if I change the commands so that they have >nul: 2>nul: on the end, the download goes through without issue, so I suspect it's most likely an issue with the way wget writes its output (without newlines). It also works fully if you don't use redirection on the output and error streams, which strengthens that assertion.
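One possible explanation, offered as an assumption rather than a confirmed diagnosis of the hang above, is the deadlock the .NET documentation warns about: the child blocks writing to a full redirected pipe while the parent is still blocked in ReadToEnd on the other stream. If that is what is happening, draining both streams through event handlers should avoid it while still capturing the output. The class name DrainedWget and the method Run are hypothetical, and the sketch starts wget directly on the assumption that it is on the PATH.

```csharp
using System;
using System.Diagnostics;
using System.Text;

static class DrainedWget
{
    // Runs a program and collects stdout and stderr through event handlers,
    // so neither redirected pipe is left unread while the child is running.
    public static int Run(string fileName, string arguments,
                          StringBuilder output, StringBuilder error)
    {
        var info = new ProcessStartInfo(fileName, arguments)
        {
            UseShellExecute = false,
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            CreateNoWindow = true
        };

        using (var p = new Process { StartInfo = info })
        {
            // e.Data is null once the stream has been closed.
            p.OutputDataReceived += (s, e) => { if (e.Data != null) output.AppendLine(e.Data); };
            p.ErrorDataReceived += (s, e) => { if (e.Data != null) error.AppendLine(e.Data); };

            p.Start();
            p.BeginOutputReadLine();
            p.BeginErrorReadLine();

            // The parameterless overload also waits for the handlers to finish.
            p.WaitForExit();
            return p.ExitCode;
        }
    }

    static void Main()
    {
        var output = new StringBuilder();
        var error = new StringBuilder();
        // Placeholder output path and URL in the style of the code above.
        int code = Run("wget", @"--output-document=c:\q1.txt http://www.ibm.com",
                       output, error);
        Console.WriteLine("exit code " + code);
        Console.WriteLine(error.ToString());
    }
}
```

If the captured text is not needed at all, leaving the streams unredirected or appending >nul: 2>nul: to each command, as described above, is the simpler route.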

Upvotes: 2

Anickyan

Reputation: 414

Well, first of all, you're on Windows. wget is part of the GNU Operating System. Unless you've installed a Windows build of wget, this won't work. You are probably better off downloading the pages yourself, with something like the HttpClient class.

But if you have a form of wget installed, what is not working? And how do you want it to work? Your question is not very detailed; you just ask how to go about it, and provide a seemingly fine solution.
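For the HttpClient route, a rough sketch might look like the following. It assumes .NET 4.5 or later (where System.Net.Http is available); the class name HttpDownload and the method DownloadAsync are made up for illustration, and the URLs and output paths are placeholders taken from the other answer.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

static class HttpDownload
{
    // A single shared client is reused for all requests.
    static readonly HttpClient Client = new HttpClient();

    // Fetches one URL and writes the response body to outputPath
    // (hypothetical helper). A non-success status code or an invalid
    // URL surfaces as an exception on the returned task.
    public static async Task DownloadAsync(string url, string outputPath)
    {
        byte[] body = await Client.GetByteArrayAsync(url);
        File.WriteAllBytes(outputPath, body);
    }

    static void Main()
    {
        // Start all three downloads, then wait for them together.
        Task.WaitAll(
            DownloadAsync("http://www.ibm.com", @"C:\q1.txt"),
            DownloadAsync("http://www.microsoft.com", @"C:\q2.txt"),
            DownloadAsync("http://www.borland.com", @"C:\q3.txt"));
        Console.WriteLine("all downloads finished");
    }
}
```

This removes the dependency on an external wget binary and on cmd.exe altogether, at the cost of holding each response in memory before it is written to disk.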

Upvotes: 0
