user1801422

Reputation: 23

Parse filenames, create folder structure

I have a script that needs to be changed due to a miscommunication. We have workstations on the production floor that create files with the following naming structure: 04_R_____"109402"0076_9999992_35_401_"01_20121107"_134029_0667.I00.asd (the quoted parts of the file name are the parts that have to be parsed).

I have already created an array with the first part of the file names, and the PowerShell script is able to parse that data. However, based on the second part of the file name, a folder structure has to be created by part number, test bench number (01, 02, 03, etc.) and then by date. A folder should be created only if it does not already exist and there is a matching file.

My current script filters by the prefix (which is wrong) and creates all folders daily (not on a match). I would like to use a substring to skip a fixed number of characters and pick up 01, 02, 03, etc. Would it be possible to avoid reinventing the wheel and use my current code with a few changes? All of my test code is included; any help or modifications would be greatly appreciated!

Code:

$source ="\\127.0.0.1\baunhof\*"
$archive = "\\127.0.0.1\error\\"
#$past=(Get-date).AddDays(-2)

$destination ="\\127.0.0.1\TestFolder1\\"
$destination1="\\127.0.0.1\TestFolder2\\"
$destination2="\\127.0.0.1\TestFolder3\\"
$destination3="\\127.0.0.1\TestFolder4\\"
#array for all destinations
$destination_array=@("$destination", "$destination1", "$destination2", "$destination3")

#creates folder yyyy/mm/dd
#$today = (Get-date -format yyyy/MM/dd)
#new-item -type directory ($today)
$DTS = ( get-date ).ToString('yyyy/MM/dd')

#array for file prefix
$File_Array_8HP70=@("*108701*")
$File_Array_8HP70X=@("*108702*")
$File_Array_9HP48=@("*109401*", "*1094080*", "*1094090*")
$File_Array_9HP48X=@("*109402*", "*1094091*", "*1094082*", "*1094092*")

#test bench number array filter
$test_bench_01=@("*_01_*")
$test_bench_02=@("*_02_*")
$test_bench_03=@("*_03_*")
$test_bench_04=@("*_04_*")

#Error log function: will write to application on server
function Write-EventLog {
  param([string]$msg = "Default Message", [string]$type="Information")
  $log = New-Object System.Diagnostics.EventLog
  $log.set_log("Application")
  $log.set_source("PSscript")
  $log.WriteEntry($msg,$type)
}

Write-Eventlog "Acoustic file parse program has started"

# if statement checks if $destination_array[0] is false then new item
$destination_array[0] = "\\127.0.0.1\TestFolder1\today\" 
If (!(Test-Path -path $destination_array[0])) {
  new-item -type directory "\\127.0.0.1\TestFolder1\$DTS"
  new-item -type directory "\\127.0.0.1\TestFolder1\P01\$DTS"
  new-item -type directory "\\127.0.0.1\TestFolder1\P02\$DTS"
  new-item -type directory "\\127.0.0.1\TestFolder1\P03\$DTS"
  new-item -type directory "\\127.0.0.1\TestFolder1\P04\$DTS"
}

$destination_array[1] = "\\127.0.0.1\TestFolder2\today\"
If (!(Test-Path -path $destination_array[1])) {
  new-item -type directory "\\127.0.0.1\TestFolder2\$DTS\"
  new-item -type directory "\\127.0.0.1\TestFolder2\P01\$DTS"
  new-item -type directory "\\127.0.0.1\TestFolder2\P02\$DTS"
  new-item -type directory "\\127.0.0.1\TestFolder2\P03\$DTS"
  new-item -type directory "\\127.0.0.1\TestFolder2\P04\$DTS"
}

$destination_array[2] = "\\127.0.0.1\TestFolder3\today\"
If (!(Test-Path -path $destination_array[2])) {
  new-item -type directory "\\127.0.0.1\TestFolder3\$DTS\"
  new-item -type directory "\\127.0.0.1\TestFolder3\P01\$DTS"
  new-item -type directory "\\127.0.0.1\TestFolder3\P02\$DTS"
  new-item -type directory "\\127.0.0.1\TestFolder3\P03\$DTS"
  new-item -type directory "\\127.0.0.1\TestFolder3\P04\$DTS"
}

$destination_array[3] = "\\127.0.0.1\TestFolder4\today\"
If (!(Test-Path -path $destination_array[3])) {
  new-item -type directory "\\127.0.0.1\TestFolder4\$DTS\"
  new-item -type directory "\\127.0.0.1\TestFolder4\P01\$DTS"
  new-item -type directory "\\127.0.0.1\TestFolder4\P02\$DTS"
  new-item -type directory "\\127.0.0.1\TestFolder4\P03\$DTS"
  new-item -type directory "\\127.0.0.1\TestFolder4\P04\$DTS"
}

$destination="\\127.0.0.1\TestFolder1\$DTS"
$destination1="\\127.0.0.1\TestFolder2\$DTS"
$destination2="\\127.0.0.1\TestFolder3\$DTS"
$destination3="\\127.0.0.1\TestFolder4\$DTS"
$destination_array=@("$destination", "$destination1", "$destination2", "$destination3")

# filter works below - need to use array

#$files = get-childitem $source -filter "108701*" -recurse
#foreach ($file in $files)
#{move-item $file.fullname $destination_array[0] -force}

$File_Array_8HP70_start = $File_Array_8HP70 | % {$_+"*"} 
$files = get-childitem $source -include $File_Array_8HP70_start -recurse
foreach ($file in $files) {
  move-item $file.fullname $destination_array[0] -force
}
#filter test bench
$files01 = gci $destination_array[0] -filter "01_*" -recurse
$files02 = gci $destination_array[0] -filter "02_*" -recurse
$files03 = gci $destination_array[0] -filter "03_*" -recurse          
$files04 = gci $destination_array[0] -filter "04_*" -recurse

$destination_array[0]="\\127.0.0.1\TestFolder1\P01\$DTS"
foreach ($file in $files01) {
  move-item $file.fullname $destination_array[0] -force
}
$destination_array[0]="\\127.0.0.1\TestFolder1\P02\$DTS"
foreach ($file in $files02) {
  move-item $file.fullname $destination_array[0] -force
}
$destination_array[0]="\\127.0.0.1\TestFolder1\P03\$DTS"
foreach ($file in $files03) {
  move-item $file.fullname $destination_array[0] -force
}
$destination_array[0]="\\127.0.0.1\TestFolder1\P04\$DTS"
foreach ($file in $files04) {
  move-item $file.fullname $destination_array[0] -force
}

$File_Array_8HP70X_start = $File_Array_8HP70X | % {$_+"*"}
$files = get-childitem $source -include $File_Array_8HP70X_start -recurse
foreach ($file in $files) {
  move-item $file.fullname $destination_array[1] -force
}
#$files02 = gci $destination_array[1] -filter "02_*" -recurse
$files01 = gci $destination_array[1] -filter "01_*" -recurse
$files02 = gci $destination_array[1] -filter "02_*" -recurse
$files03 = gci $destination_array[1] -filter "03_*" -recurse          
$files04 = gci $destination_array[1] -filter "04_*" -recurse

$destination_array[1]="\\127.0.0.1\TestFolder2\P01\$DTS"
foreach ($file in $files01) {
  move-item $file.fullname $destination_array[1] -force
}
$destination_array[1]="\\127.0.0.1\TestFolder2\P02\$DTS"
foreach ($file in $files02) {
  move-item $file.fullname $destination_array[1] -force
}
$destination_array[1]="\\127.0.0.1\TestFolder2\P03\$DTS"
foreach ($file in $files03) {
  move-item $file.fullname $destination_array[1] -force
}
$destination_array[1]="\\127.0.0.1\TestFolder2\P04\$DTS"
foreach ($file in $files04) {
  move-item $file.fullname $destination_array[1] -force
}

$File_Array_9HP48_start = $File_Array_9HP48 | % {$_+"*"}
$files = get-childitem $source -include $File_Array_9HP48_start -recurse
foreach ($file in $files) {
  move-item $file.fullname $destination_array[2] -force
}
#$files03 = gci $destination_array[2] -filter "03_*" -recurse
$files01 = gci $destination_array[2] -filter "01_*" -recurse
$files02 = gci $destination_array[2] -filter "02_*" -recurse
$files03 = gci $destination_array[2] -filter "03_*" -recurse
$files04 = gci $destination_array[2] -filter "04_*" -recurse

$destination_array[2]="\\127.0.0.1\TestFolder3\P01\$DTS"
foreach ($file in $files01) {
  move-item $file.fullname $destination_array[2] -force
}
$destination_array[2]="\\127.0.0.1\TestFolder3\P02\$DTS"
foreach ($file in $files02) {
  move-item $file.fullname $destination_array[2] -force
}
$destination_array[2]="\\127.0.0.1\TestFolder3\P03\$DTS"
foreach ($file in $files03) {
  move-item $file.fullname $destination_array[2] -force
}
$destination_array[2]="\\127.0.0.1\TestFolder3\P04\$DTS"
foreach ($file in $files04) {
  move-item $file.fullname $destination_array[2] -force
}

$File_Array_9HP48X_start = $File_Array_9HP48X | % {$_+"*"}
$files = get-childitem $source -include $File_Array_9HP48X_start -recurse
foreach ($file in $files) {
  move-item $file.fullname $destination_array[3] -force
}
#$files04 = gci $destination_array[3] -filter "04_*" -recurse
$files01 = gci $destination_array[3] -filter "01_*" -recurse
$files02 = gci $destination_array[3] -filter "02_*" -recurse
$files03 = gci $destination_array[3] -filter "03_*" -recurse
$files04 = gci $destination_array[3] -filter "04_*" -recurse

$destination_array[3]="\\127.0.0.1\TestFolder4\P01\$DTS"
foreach ($file in $files01) {
  move-item $file.fullname $destination_array[3] -force
}
$destination_array[3]="\\127.0.0.1\TestFolder4\P02\$DTS"
foreach ($file in $files02) {
  move-item $file.fullname $destination_array[3] -force
}
$destination_array[3]="\\127.0.0.1\TestFolder4\P03\$DTS"
foreach ($file in $files03) {
  move-item $file.fullname $destination_array[3] -force
}
$destination_array[3]="\\127.0.0.1\TestFolder4\P04\$DTS"
foreach ($file in $files04) {
  move-item $file.fullname $destination_array[3] -force
}
#move files to c:\Error if older than 2 days
$file_2 = gci $source -recurse|where {$_.LastWriteTime -lt (get-date).AddDays(-2)}
foreach ($file in $file_2) {
  move-item $file.fullname $archive -force
}

Write-Eventlog "Acoustic file parse program has completed"

Upvotes: 1

Views: 2651

Answers (1)

Ansgar Wiechers

Reputation: 200523

You're trying to do everything by hand. Don't.

Let PowerShell do the work for you:

$DTS = (Get-Date).ToString('yyyy/MM/dd')

$parts_lists = @(
  @("108701"),
  @("108702"),
  @("109401", "1094080", "1094090"),
  @("109402", "1094091", "1094082", "1094092")
)

$destination_dirs = @(
  "\\127.0.0.1\TestFolder1",
  "\\127.0.0.1\TestFolder2",
  "\\127.0.0.1\TestFolder3",
  "\\127.0.0.1\TestFolder4"
)

# The following regular expression defines 2 sub-matches for parts list
# and test bench.
$re = "^\d{2}_[A-Z]___(\d{6})\d{4}_\d{7}_\d{2}_\d{3}_(\d{2})_\d{8}_\d{6}_\d{4}\.[A-Z]\d{2}\.asd$"

Get-ChildItem $source -Recurse | ? { $_.Name -match $re } | % {
  # process only files that match the given regular expression

  # iterate over all 4 parts lists
  for ($i = 0; $i -le 3; $i++) {
    if ( $parts_lists[$i] -contains $matches[1] ) {
      # if the first sub-match (the parts list number) is found in the current
      # parts list, construct a destination path from the corresponding base
      # directory, the test bench number and the date.
      $dest = Join-Path $destination_dirs[$i] -ChildPath "P$($matches[2])\$DTS"

      # Create the destination if it doesn't exist. Creating it here ensures
      # that a destination folder is only created when there's actually a
      # file going into it.
      if ( -not (Test-Path -LiteralPath $dest) ) {
        New-Item -Type Directory $dest
      }

      # Move the file ...
      Move-Item $_.FullName $dest -Force
      # ... and exit from the for-loop (no need to check other parts lists
      # once we found a match).
      break
    }
  }
}

I'm not sure if I fully understood your code, so my sample code may need some tuning, but it should give you the general idea.
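One spot that may need such tuning: the expression captures exactly 6 digits for the parts list number, while some entries in the parts lists (e.g. "1094080") have 7 digits, so -contains can never match those. A minimal sketch of one possible adjustment follows. It assumes the whole ten-digit block is captured with (\d{10}) instead of (\d{6})\d{4}, and that a parts list entry matches when the block starts with it. Get-PartsListIndex is a hypothetical helper name, not part of the code above.

```powershell
$parts_lists = @(
  @("108701"),
  @("108702"),
  @("109401", "1094080", "1094090"),
  @("109402", "1094091", "1094082", "1094092")
)

# Hypothetical helper: returns the index of the first parts list that has an
# entry the number block starts with, or -1 if no list matches.
function Get-PartsListIndex {
  param([string]$NumberBlock, [object[]]$PartsLists)

  for ($i = 0; $i -lt $PartsLists.Count; $i++) {
    foreach ($prefix in $PartsLists[$i]) {
      if ($NumberBlock.StartsWith($prefix)) { return $i }
    }
  }
  return -1
}

$index6 = Get-PartsListIndex '1094020076' $parts_lists   # 6-digit entry "109402"
$index7 = Get-PartsListIndex '1094080076' $parts_lists   # 7-digit entry "1094080"
$none   = Get-PartsListIndex '9999990000' $parts_lists   # no matching entry
```

The returned index could then be used to pick the base directory from $destination_dirs in place of the for-loop with -contains. Whether prefix matching is the right rule depends on how the part numbers are actually assigned.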

One thing you need to be aware of is that the date format yyyy/MM/dd will give you a date string using the regional date separator, e.g. on systems with a US locale it will produce the date string 2013/02/15, whereas on systems with a German locale the date string would be 2013.02.15. If you want the date parts separated by forward slashes (which PowerShell will interpret as path separators when you use that date in a path), you need to escape the forward slashes in the format string: yyyy\/MM\/dd.
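The difference can be checked without changing the system locale by passing a culture to ToString explicitly. This is only an illustrative sketch: the fixed date and the de-DE culture object are chosen for the example and are not part of the script above.

```powershell
# Fixed example date so the result does not depend on when this is run.
$date   = Get-Date -Year 2013 -Month 2 -Day 15
$german = [System.Globalization.CultureInfo]::GetCultureInfo('de-DE')

# Unescaped "/" is a placeholder for the date separator of the culture.
$unescaped = $date.ToString('yyyy/MM/dd', $german)

# "\/" is a literal forward slash, regardless of the culture.
# (Single-quoted PowerShell strings pass the backslashes through unchanged.)
$escaped = $date.ToString('yyyy\/MM\/dd', $german)   # 2013/02/15

"unescaped: $unescaped"
"escaped:   $escaped"
```

With the escaped format, the string always contains forward slashes, so using it in a path yields one folder level each for year, month and day.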

Edit: The regular expression has 2 purposes:

  • to restrict processing to only those files that match the pattern, and
  • to give access to the parts list and test bench parts of the file name.

The pattern is derived from the example file name you gave:

04_R___1094020076_9999992_35_401_01_20121107_134029_0667.I00.asd

  • ^: The beginning of the string.
  • \d{2}_: 2 digits followed by an underscore.
  • [A-Z]___: A single capital letter followed by 3 underscores.
  • (\d{6}): A group of 6 digits (representing the parts list number). The group can later be accessed via $matches[1].
  • \d{4}_: 4 digits followed by an underscore.
  • \d{7}_: 7 digits followed by an underscore.
  • \d{2}_: 2 digits followed by an underscore.
  • \d{3}_: 3 digits followed by an underscore.
  • (\d{2})_: A group of 2 digits (representing the test bench machine number) followed by an underscore. The group can later be accessed via $matches[2].
  • \d{8}_: 8 digits (the date) followed by an underscore.
  • \d{6}_: 6 digits followed by an underscore.
  • \d{4}\.: 4 digits followed by a single dot.
  • [A-Z]\d{2}: A single uppercase letter followed by 2 digits.
  • \.asd: A single dot followed by the lowercase letters a, s and d (the extension).
  • $: The end of the string.
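As a quick check, the pattern can be tried against the example file name on its own, outside the pipeline. A small sketch; the variable names $name, $part and $bench are only for illustration.

```powershell
$re   = '^\d{2}_[A-Z]___(\d{6})\d{4}_\d{7}_\d{2}_\d{3}_(\d{2})_\d{8}_\d{6}_\d{4}\.[A-Z]\d{2}\.asd$'
$name = '04_R___1094020076_9999992_35_401_01_20121107_134029_0667.I00.asd'

if ($name -match $re) {
  # -match fills the automatic $matches variable on success.
  $part  = $matches[1]   # parts list number: 109402
  $bench = $matches[2]   # test bench number: 01
  "part=$part bench=$bench"
} else {
  "no match"
}
```

Note that -match is case-insensitive by default, so it would also accept a lowercase letter or an .ASD extension; use -cmatch if the case has to match exactly.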

Upvotes: 1
