FrinkTheBrave

Reputation: 3958

How to download a whole folder of files/subfolders from the web in PowerShell

I can download a single file from the web using:

$wc = New-Object System.Net.WebClient
$wc.DownloadFile("http://blah/root/somefile.ext", "C:\Downloads\www\blah\root\somefile.ext")

But how do I download all the files, including subfolders? Something like the following would be nice...

$wc.DownloadFile("http://blah/root/", "C:\Downloads\www\blah\root\")

The root folder itself appears as a directory listing in IE, you know, like:

[To Parent Directory]
                01 July 2012    09:00       1234 somefile.ext
                01 July 2012    09:01       1234 someotherfile.ext

As a bonus, how would I download just the files in the root folder, ignoring subfolders?

Upvotes: 10

Views: 56208

Answers (2)

LuciusAgarthy

Reputation: 97

This is an addition to @FrinkTheBrave's answer, showing how I run his awesome script:

  • save the script to a file, for example "DLfilesFromSite.ps1"

  • run PowerShell as administrator

  • cd to the folder with the script:

    cd c:\scripts

  • import the script:

    Import-Module .\DLfilesFromSite.ps1

  • initialize the WebClient:

    $webClient = New-Object System.Net.WebClient

  • set the encoding for files with diacritics:

    $webClient.Encoding = [System.Text.Encoding]::UTF8

  • call the function with its parameters:

    Copy-Folder "https://www.example.cz/source/folder/" "C:\destination\folder" $True

I learned a lot about PowerShell scripting and passing arguments from this article.
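Putting those steps together, the whole session looks roughly like this (the URL and destination folder are just the example values from above):

    # run from an elevated PowerShell prompt
    cd c:\scripts

    # load the Copy-Folder function from the saved script
    Import-Module .\DLfilesFromSite.ps1

    # the function expects a WebClient object named $webClient to already exist
    $webClient = New-Object System.Net.WebClient
    $webClient.Encoding = [System.Text.Encoding]::UTF8

    # $True copies subfolders as well
    Copy-Folder "https://www.example.cz/source/folder/" "C:\destination\folder" $True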

Upvotes: 3

FrinkTheBrave

Reputation: 3958

Here's what I came up with based on Andy's suggestion (with plenty of help from Google, of course):

####################################################################################################
# This function copies a folder (and optionally, its subfolders)
#
# When copying subfolders it calls itself recursively
#
# Requires WebClient object $webClient defined, e.g. $webClient = New-Object System.Net.WebClient
#
# Parameters:
#   $source      - The url of folder to copy, with trailing /, e.g. http://website/folder/structure/
#   $destination - The folder to copy $source to, with trailing \ e.g. D:\CopyOfStructure\
#   $recursive   - True if subfolders of $source are also to be copied or False to ignore subfolders
#   Return       - None
####################################################################################################
Function Copy-Folder([string]$source, [string]$destination, [bool]$recursive) {
    if (!$(Test-Path($destination))) {
        New-Item $destination -type directory -Force
    }

    # Get the file list from the web page
    $webString = $webClient.DownloadString($source)
    $lines = [Regex]::Split($webString, "<br>")
    # Parse each line, looking for files and folders
    foreach ($line in $lines) {
        if ($line.ToUpper().Contains("HREF")) {
            # File or Folder
            if (!$line.ToUpper().Contains("[TO PARENT DIRECTORY]")) {
                # Not Parent Folder entry
                # Split on the quotes around the href attribute, then on the angle brackets around the link text
                $items = [Regex]::Split($line, """")
                $items = [Regex]::Split($items[2], "(>|<)")
                $item = $items[2]
                if ($line.ToLower().Contains("&lt;dir&gt;")) {
                    # Folder
                    if ($recursive) {
                        # Subfolder copy required
                        Copy-Folder "$source$item/" "$destination$item/" $recursive
                    } else {
                        # Subfolder copy not required
                    }
                } else {
                    # File
                    $webClient.DownloadFile("$source$item", "$destination$item")
                }
            }
        }
    }
}

No guarantees of course, but it worked for the site I was interested in.
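For the bonus part of the question, the same function can be called with the third parameter set to $False so that only files directly in the root folder are downloaded and subfolders are ignored. A minimal sketch, reusing the example paths from the question:

    # the function needs $webClient defined beforehand
    $webClient = New-Object System.Net.WebClient
    # $False skips subfolders, so only files directly under /root/ are downloaded
    Copy-Folder "http://blah/root/" "C:\Downloads\www\blah\root\" $False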

Upvotes: 8
