PerryW

Reputation: 1436

Copy subset of files keeping folder structure using PowerShell

Looking for some PowerShell help with a copying challenge.

I need to copy all MS Office files from a fairly large NAS (over 4 million of them, a little over 5 TB) to another drive, retaining the existing folder structure wherever a file is copied.

I have a text file of all the common Office file types (about 40 of them) - extns.txt

At this stage, being a good StackExchanger, I'd post the script I've got so far, but I've spent the best part of a day on this and not only is what I've got embarrassingly awful, I suspect that even the basic algorithm is wrong.

I started to gci the entire tree on the old NAS, once for each file type. Then I thought it would be better to traverse once and compare every file to the list of valid types. Then I got into a complete mess about rebuilding the folder structure. I started by splitting on '\' and iterating through the path, then wasted an hour searching because I thought I remembered reading about a simple way to duplicate a path if it doesn't exist.
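
For illustration, the single-pass idea could be sketched roughly as below. This is a minimal sketch, not a tested migration script: the function name Copy-MatchingFiles and its parameters are hypothetical, it assumes PowerShell 5 or later, and it assumes extns.txt holds one extension per line in any of the forms "docx", ".docx" or "*.docx". The folder structure is rebuilt by letting New-Item with -Force create the missing part of each target path, so no manual splitting on '\' is involved.

```powershell
function Copy-MatchingFiles {
    param(
        [string]   $SourceRoot,   # root of the tree to scan
        [string]   $DestRoot,     # root under which the structure is recreated
        [string[]] $Extensions    # entries such as "docx", ".docx" or "*.docx"
    )

    # Case-insensitive lookup set, with every entry normalised to ".ext"
    $wanted = [System.Collections.Generic.HashSet[string]]::new([System.StringComparer]::OrdinalIgnoreCase)
    foreach ($ext in $Extensions) {
        $clean = $ext.Trim().TrimStart('*', '.')
        if ($clean) { [void]$wanted.Add(".$clean") }
    }

    # Resolved source root, and the length of the prefix to strip from each file path
    $rootPath     = (Resolve-Path -LiteralPath $SourceRoot).ProviderPath
    $prefixLength = $rootPath.TrimEnd('\', '/').Length

    # One pass over the tree; files are streamed through the pipeline, not collected first
    Get-ChildItem -LiteralPath $rootPath -Recurse -File -Force |
        Where-Object { $wanted.Contains($_.Extension) } |
        ForEach-Object {
            $relative  = $_.FullName.Substring($prefixLength).TrimStart('\', '/')
            $target    = Join-Path $DestRoot $relative
            $targetDir = Split-Path -Path $target -Parent

            # -Force creates any missing parent folders and leaves an existing folder alone
            New-Item -ItemType Directory -Path $targetDir -Force | Out-Null
            Copy-Item -LiteralPath $_.FullName -Destination $target

            $target   # emit the destination path so the caller can log or count it
        }
}
```

Assuming the paths from the question, it might be called as Copy-MatchingFiles -SourceRoot 'D:\Source' -DestRoot 'E:\Dest' -Extensions (Get-Content 'D:\extns.txt'). Because a folder is only created when a matching file is about to be copied into it, this approach should not leave empty folders behind.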

Another alternative is to dump out a 4-million-line text file of all the files (with full path) I want to copy (this is easy, as I imported the entire structure into SQL Server to analyse what was there) and use that as a list of sources.
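
A list-driven copy of that kind could look something like the following. It is only a sketch under stated assumptions: the function name Copy-FromList and its parameters are hypothetical, the list is assumed to hold one full file path per line, and every listed path is assumed to begin with the source root exactly as it is passed in.

```powershell
function Copy-FromList {
    param(
        [string] $ListPath,     # text file with one full source path per line
        [string] $SourceRoot,   # prefix to strip from each listed path
        [string] $DestRoot      # root under which the structure is recreated
    )

    # Source root with exactly one trailing separator, so "D:\Source2" does not match "D:\Source"
    $prefix = $SourceRoot.TrimEnd('\', '/') + [System.IO.Path]::DirectorySeparatorChar

    # Get-Content streams the file line by line, so the whole list is never held in memory
    Get-Content -LiteralPath $ListPath | ForEach-Object {
        $source = $_.Trim()
        if (-not $source) { return }   # skip blank lines

        # Skip anything outside the source root rather than guessing where it belongs
        if (-not $source.StartsWith($prefix, [System.StringComparison]::OrdinalIgnoreCase)) {
            Write-Warning "Outside source root, skipped: $source"
            return
        }

        # The list may be stale, so check that the file still exists
        if (-not (Test-Path -LiteralPath $source -PathType Leaf)) {
            Write-Warning "Missing, skipped: $source"
            return
        }

        $relative = $source.Substring($prefix.Length).TrimStart('\', '/')
        $target   = Join-Path $DestRoot $relative

        # -Force creates any missing parent folders and leaves an existing folder alone
        New-Item -ItemType Directory -Path (Split-Path -Path $target -Parent) -Force | Out-Null
        Copy-Item -LiteralPath $source -Destination $target

        $target   # emit the destination path so the caller can log or count it
    }
}
```

The trade-off is that the NAS is not traversed again, at the cost of depending on how current the exported list is; files added or removed since the export are either missed or reported as warnings.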

I'm not expecting a 'please write the codez for me' answer but some pointers/thoughts on the best way to approach this would be appreciated.

Upvotes: 1

Views: 212

Answers (1)

Karthick Ganesan

Reputation: 385

I'm not sure this is the best approach, but the script below is at least a passable solution.

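$sourceRootPath = "D:\Source"
$DestFolderPath = "E:\Dest"

# Read the extension list, ignoring blank lines
$extensions = Get-Content "D:\extns.txt" | Where-Object { $_.Trim() }

# Normalise every entry to "*.ext", whether it was listed as "ext", ".ext" or "*.ext"
$extensions = $extensions | ForEach-Object { $_.Trim() -replace '^\*?\.?', '*.' }

# Create the destination first: Copy-Item treats an existing and a missing destination folder differently
New-Item -ItemType Directory -Path $DestFolderPath -Force | Out-Null

$copiedItemsList = New-Object System.Collections.ArrayList

foreach ( $ext in $extensions ) {
    $copiedItems = Copy-Item -Path $sourceRootPath -Filter $ext -Destination $DestFolderPath -Container -Recurse -PassThru
    $copiedItems | ForEach-Object { $copiedItemsList.Add($_) | Out-Null }
}

# De-duplicate by full path, since the same folder can be returned on more than one pass
$copiedItemsList = $copiedItemsList | Sort-Object -Property FullName -Unique

# Remove empty 'deletable' folders that get created while maintaining the folder structure with Copy-Item's -Container switch.
# The loop repeats because removing a folder can leave its parent empty. -Force makes hidden files count as content.
While ( $DeletableFolders = $copiedItemsList | Where-Object { $_.PSIsContainer -and (Test-Path -LiteralPath $_.FullName) -and -not (Get-ChildItem -LiteralPath $_.FullName -Force | Select-Object -First 1) } ) {
    $DeletableFolders | ForEach-Object { Remove-Item -LiteralPath $_.FullName -Confirm:$false }
}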

Copy-Item's -Container switch preserves the folder structure for us. However, folders that contain no matching files can end up empty in the destination with this approach.

So I collect the copied objects in an ArrayList named $copiedItemsList, then use it at the end of the script to find and remove the empty 'deletable' folders.

Hope this helps!

Upvotes: 1
