m226thirteen

Reputation: 13

PowerShell script produces incomplete results; removing displayed commas from -join output

I’m very new to scripting and I am hoping someone can lend some advice.

I took it upon myself to create a PowerShell script that will be useful for project management. The script analyzes a file path to determine whether it’s too long and checks that the file name is less than 50 characters. It also checks for illegal characters, among other things. All results are initially written to a text file.

In the PowerShell script, I originally had steps 1, 2 and 3-9 produce 3 separate documents, but I ultimately decided to have all the steps write to a single text file. In the last step, that text file is imported in order to create a CSV. The file paths are sorted alphabetically, and any path that returned 2 different results is merged into a single row. It is this CSV that users will reference in order to manually correct any errors before their media is archived.

Here are the issues I am encountering.

  1. The text file produced by steps 1-9 is incomplete. When I check the text file, it stops in the middle of writing a path. The CSV created in step 10 has incomplete results as well.

  2. In step 10, I used the -join operator to consolidate the results in the CSV. If a file name has two different results, I would like to merge them into 1 line. I used a comma as the delimiter, but when I do this, multiple commas are displayed in the CSV. Is there a way to prevent the commas from being displayed?

param(
 # $pathToScan Enter the path to scan for file length
 [Parameter(Mandatory=$True,Position=0)]
 [ValidateNotNull()]
 [string]$pathToScan,
 #Character Limit to be set
 [Parameter(Position=1)] 
 #[ValidateNotNull()] 
 [int]$charLimit = 250
)


# File outputs
#$outputEmptyPath = "C:\temp\EmptyPaths_$(Get-Date -Format yyyyMMdd).csv"
#$outputCharPath = "C:\temp\SpecialChars_$(Get-Date -Format yyyyMMdd).csv"
$outputFilePath = "C:\temp\PathLengths_$(Get-Date -Format yyyyMMdd).txt"

# Use this switch to display to screen.  Can ignore this for now.
$writeToConsoleAsWell = $false   # Writing to the console will be much slower.

# Open a new file stream (nice and fast) and write all the paths and their lengths to it.
$outputFileDirectory = Split-Path $outputFilePath -Parent
if (!(Test-Path $outputFileDirectory)) { New-Item $outputFileDirectory -ItemType Directory }
$streamPath = New-Object System.IO.StreamWriter($outputFilePath, $false)
#$streamChar = New-Object System.IO.StreamWriter($outputFilePath, $false)
#$streamEmpty = New-Object System.IO.StreamWriter($outputFilePath, $false)

# STEP 1 - Check for empty paths.
((Get-ChildItem -Path $pathToScan -Recurse | Where-Object {$_.PSIsContainer -eq $True}) | Where-Object {$_.GetFiles().Count -eq 0 -and $_.GetDirectories().Count -eq 0}) | ForEach-Object {
      $emptyPaths = $_.FullName

      if ($emptyPaths) {
          $streamPath.WriteLine("$emptyPaths , empty folder")
        }
}

# STEP 2 - Show for long paths. (default=250 characters)
Get-ChildItem -Path $pathToScan -Recurse | Select-Object -Property BaseName, FullName, @{Name="FullNameLength";Expression={($_.FullName.Length)}} | Sort-Object -Property FullNameLength -Descending | ForEach-Object {
    $fileName = $_.BaseName
    $filePath = $_.FullName
    $length = $_.FullNameLength

    if ($length -gt $charLimit) {
        $string = "$length : $filePath"
    
        # Write to the Console.
        if ($writeToConsoleAsWell) { Write-Host $string }

        #Write to the file.
        $streamPath.WriteLine("$filepath ,, file path too long")
    }

    #STEP 3 - Check for special characters. Allowed characters are Alphanumerics, single space, dashes, underscores, periods
    if ($filename -match '[^a-zA-Z0-9 -_.]') {
        $streamPath.WriteLine("$filepath ,,, has special characters")
    }

    #STEP 4 - Check for double spaces, dashes, periods and underscores
    if ($filepath -match '\s{2,}|\-{2,}|\.{2,}|_{2,}') {
        $streamPath.WriteLine("$filepath ,,,, has double spaces/dashes/periods/underscores")
    }
    
    #STEP 5 - check for more than 50 characters
    if ($filename -match '[^\\]{51,}\.[a-zA-Z0-9]{2,7}$') {
        $streamPath.WriteLine("$filepath ,,,,, exceeds 50 characters")
   }

    #STEP 6 - check for empty space at end of file or folder name
    if ($filename -match '(\s+$)|(\s+\\)|(\s+\.)') {
        $streamPath.WriteLine("$filepath ,,,,,, name has space at end")
   }
   
    #STEP 7 - check for zip and other archived files
    if ($filename -match '(?i)\.zip$|(?i)\.tar.gz$|(?i)\.gz$|(?i)__MACOSX$') {
        $streamPath.WriteLine("$filepath ,,,,,,, unzip files before archiving")
   }
      
    #step 8 - check for cache and render files
    if ($filename -match '(?i)\.cfa$|(?i)\.pek$|(?i)\.xmp$') {
        $streamPath.WriteLine("$filepath ,,,,,,,, delete cache and render files")
   }

    #step 9 - check for Windows hidden files
    if ($filename -match '(?i)\._|thumbs.db') {
        $streamPath.WriteLine("$filepath ,,,,,,,,, delete hidden files")
   }
} 
    Start-Sleep -Seconds 30

#step 10 - Merge and sort results
Import-Csv -Path "C:\temp\PathLengths_$(Get-Date -Format yyyyMMdd).txt" -Header 'Path', 'Empty Folder', 'Long File Path', 'Special Characters', 'Double Spaces, Dashes, Periods and Underscores', 'Exceeds 50 Characters', 'Name Has Space at End', 'Zip File', 'Cache and Render Files', 'Windows Hidden Files' | sort 'Path' | Group-Object 'Path' | ForEach-Object {
    [PsCustomObject]@{
        'Path' = $_.Name
        'Empty Folder' = $_.Group.'Empty Folder' -join ',' 
        'Long File Path' = $_.Group.'Long File Path' -join ',,'
        'Special Characters' = $_.Group.'Special Characters' -join ',,,'
        'Double Spaces, Dashes, Periods and Underscores' = $_.Group.'Double Spaces, Dashes, Periods and Underscores' -join ',,,,'
        'Exceeds 50 Characters' = $_.Group.'Exceeds 50 Characters' -join ',,,,,'
        'Name Has Space at End' = $_.Group.'Name Has Space at End' -join ',,,,,,'
        'Zip File' = $_.Group.'Zip File' -join ',,,,,,,'
        'Cache and Render Files' = $_.Group.'Cache and Render Files' -join ',,,,,,,,'
        'Windows Hidden Files' = $_.Group.'Windows Hidden Files' -join ',,,,,,,,,'        
    }
} | Export-Csv "C:\temp\PathLengths_$(Get-Date -Format yyyyMMdd)-SORTED_FINAL.csv" -Delimiter ',' -NoTypeInformation

Upvotes: 0

Views: 400

Answers (1)

Theo

Reputation: 61148

You shouldn't create a temporary csv-like text file and convert it to CSV afterwards, when you can build the CSV straight away using PSCustomObject and Export-Csv.
Writing a csv-like temp file by hand makes it very easy to get the number of commas wrong, which leads to misaligned fields.
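
As for the text file stopping in the middle of a path: that symptom is consistent with the StreamWriter never being flushed or closed. The script in the question creates $streamPath but never calls Close() or Dispose() on it, so whatever is still sitting in the writer's buffer is not written to disk, and Start-Sleep does not change that. A minimal sketch of the pattern, assuming a throwaway temp file ($demoFile and the sample line are hypothetical, not names from the script):

```powershell
# Sketch only: $demoFile is a hypothetical temp path, not one used by the original script.
$demoFile = Join-Path ([System.IO.Path]::GetTempPath()) 'StreamWriterDemo.txt'

# $false = overwrite instead of append, as in the question's script.
$writer = New-Object System.IO.StreamWriter($demoFile, $false)
try {
    foreach ($i in 1..1000) {
        $writer.WriteLine("C:\some\path\file$i.txt ,, file path too long")
    }
}
finally {
    # Dispose flushes the remaining buffer and releases the file handle,
    # even if one of the writes above throws.
    $writer.Dispose()
}

# Read the file back only after the writer has been disposed.
$lineCount = @(Get-Content -Path $demoFile).Count
```

If you keep a StreamWriter at all, anything that reads the file (such as Import-Csv) should run only after the writer has been disposed.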

Also, I would suggest not using commas in the CSV headers (and perhaps shortening them, but that is up to you).
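
On the second issue in the question: the extra commas in the merged CSV are most likely the -join delimiters themselves. When a path has two rows and a column is empty in one of them, joining the values still puts the delimiter next to the empty string. A rough sketch with made-up sample rows ($rows and the clip.mov path are hypothetical) that drops empty values before joining:

```powershell
# Two rows for the same path, shaped like Import-Csv output (hypothetical sample data).
$rows = @(
    [PsCustomObject]@{ Path = 'C:\media\clip.mov'; 'Long File Path' = 'file path too long'; 'Special Characters' = '' }
    [PsCustomObject]@{ Path = 'C:\media\clip.mov'; 'Long File Path' = ''; 'Special Characters' = 'has special characters' }
)

# Joining every value, including the empty one, leaves a stray delimiter in the result.
$naive = $rows.'Long File Path' -join ','

# Filtering out empty strings first keeps only the text that is actually there.
$clean = ($rows.'Long File Path' | Where-Object { $_ }) -join ', '
```

Here $naive ends with a trailing comma, while $clean holds just the message. With the approach further down, each path produces a single object, so no join is needed at all.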

Try:

param(
 # $pathToScan Enter the path to scan for file length
 [Parameter(Mandatory = $true, Position = 0)]
 [ValidateScript({ Test-Path -Path $_ -PathType Container })]
 [string]$pathToScan,

 #Character Limit to be set
 [int]$charLimit = 250
)

# File output
$outputFile = "C:\temp\PathLengths_$(Get-Date -Format yyyyMMdd).csv"

# collect custom objects
$result = Get-ChildItem -Path $pathToScan -Recurse -Force | ForEach-Object {
    # create the output object.
    $obj = [PsCustomObject]@{
        'Path'                                     = $_.FullName
        'ObjectType'                               = if ($_.PSIsContainer) {'Folder'} else {'File'}
        'Empty Folder'                             = $null
        'Long File Path'                           = $null
        'Special Characters'                       = $null
        'Double Spaces_Dashes_Periods_Underscores' = $null
        'Exceeds 50 Characters'                    = $null
        'Name Has Space at End'                    = $null
        'Zip File'                                 = $null
        'Cache and Render Files'                   = $null
        'Windows Hidden Files'                     = $null
    }
    # STEP 1 - Check for empty paths.
    if ($_.PSIsContainer -and $_.GetFileSystemInfos().Count -eq 0) { $obj.'Empty Folder' = 'empty folder' }

    # STEP 2 - Show for long paths. (default=250 characters)
    if ($_.FullName.Length -gt $charLimit) { $obj.'Long File Path' = 'path too long' }

    #STEP 3 - Check for special characters. Allowed characters are Alphanumerics, single space, dashes, underscores, periods
    if ($_.BaseName -match '[^-a-z0-9 _.]') { $obj.'Special Characters' = 'has special characters' }

    #STEP 4 - Check for double spaces, dashes, periods and underscores
    if ($_.BaseName -match '[-\s._]{2,}') { $obj.'Double Spaces_Dashes_Periods_Underscores' = 'has double spaces/dashes/periods/underscores' }

    #STEP 5 - check for more than 50 characters
    # This is a weird check.. Why not simply if ($_.Name.Length -gt 50) ???
    if ($_.Name -match '[^\\]{51,}\.[a-z0-9]{2,7}$') { $obj.'Exceeds 50 Characters' = 'exceeds 50 characters' }

    #STEP 6 - check for empty space at end of file or folder name
    if ($_.Name -match '\s$') { $obj.'Name Has Space at End' = 'name ends in whitespace' }
    
    # these are for files only:
    if (!$_.PSIsContainer) {
        #STEP 7 - check for zip and other archived files
        if ($_.Name -match '\.zip$|\.tar|\.gz$|__MACOSX$') { $obj.'Zip File' = 'unzip files before archiving' }
  
        #STEP 8 - check for cache and render files
        if ('.cfa', '.pek', '.xmp' -contains $_.Extension) { $obj.'Cache and Render Files' = 'delete cache and render files' }

        #STEP 9 - check for Windows hidden files
        if ($_.Attributes -band [System.IO.FileAttributes]::Hidden) { $obj.'Windows Hidden Files' = 'delete hidden files' }
    }

    # output the object, only if there is some value of interest
    # (the first two properties 'Path' and 'ObjectType' are general info, so we disregard those here)
    if (($obj.PsObject.Properties | Select-Object -Skip 2).Value -join '' -ne '') {
        $obj
    }
}

if ($result) { $result | Export-Csv -Path $outputFile -NoTypeInformation }

As you can see I have changed some of the tests:

  • removed (?i) on regex -match, because that is case-insensitive by default
  • moved the - to the front in regexes like [^-a-z0-9 _.], because otherwise it is interpreted as a regex range instead of a literal minus character
  • changed the test for hidden files

Upvotes: 1
