Reputation: 33
I have several hundred documents inside of a few directories. All have a common naming structure but the values differ:
10_03022014_229_14_12-9663 5930 4454.pdf
10_03022014_230_19_4-574 1564 1452 177.pdf
What I am attempting to do is create a CSV based upon this data, strip some whitespace, and remove the PDF filename extension in PowerShell. The end result would look something like this:
10,03022014,229,14,12,966359304454
10,03022014,230,19,4,57415641452177
All of these values are alpha-numeric except for the final one which is barcode data.
To complicate things further, the output file needs a similar naming structure: the first two "values" followed by a date and time stamp.
For example, the output would be 10_03022014_datestamp_timestamp.csv if the files in the directory start with 10_03022014.
Any recommendations would be greatly appreciated!
Upvotes: 0
Views: 2654
Reputation: 54981
An alternative solution:
# Get the PDF files
Get-ChildItem -Filter "*.pdf" |
    # Group files that belong to the same CSV file (same first two tokens)
    Group-Object -Property @{e={$_.BaseName.Split("_")[0,1] -join "_"}} |
    # For each CSV group
    ForEach-Object {
        # Generate the CSV file name
        $path = "$($_.Name)_$((Get-Date).ToString("MMddyyyy_HHmm")).csv"
        # Format the content and save it
        $_.Group | ForEach-Object { $_.BaseName -replace " " -replace '[-_]',"," } | Set-Content -Path $path
    }
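To make the grouping step above easier to follow, here is a minimal sketch that runs the same key expression on plain strings instead of files. The third name is a hypothetical addition, only there to produce a second group; `$baseNames` and `$groups` are illustrative variable names.

```powershell
# Base names from the question, plus one hypothetical name with a different prefix.
$baseNames = @(
    '10_03022014_229_14_12-9663 5930 4454',
    '10_03022014_230_19_4-574 1564 1452 177',
    '11_03022014_231_20_5-123 4567'   # hypothetical, not from the question
)

# The group key is the first two underscore-delimited tokens, rejoined with "_".
$groups = $baseNames | Group-Object -Property { ($_ -split '_')[0,1] -join '_' }

# Each group's Name becomes the prefix of one CSV file.
$groups | ForEach-Object { '{0}: {1} file(s)' -f $_.Name, $_.Count }
```

Every file whose name starts with the same two tokens ends up in the same group, so each group maps to exactly one output CSV.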
Upvotes: 2
Reputation: 871
The file name processing seems straightforward enough. I believe you're just replacing underscores and hyphens with commas and removing spaces from the file's base name. The following should get you the reformatted strings, at least for the two file names you provided:
Get-ChildItem -Filter '*.pdf' |
ForEach-Object { $_.BaseName -Replace '[-_]', ',' -Replace ' ', '' }
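As a quick check that needs no files on disk, the same two replacements can be applied to the sample names from the question as plain strings. This is only a sketch; `$baseNames` and `$csvLines` are illustrative names.

```powershell
# Base names from the question (what BaseName returns once ".pdf" is dropped).
$baseNames = @(
    '10_03022014_229_14_12-9663 5930 4454',
    '10_03022014_230_19_4-574 1564 1452 177'
)

# Replace hyphens and underscores with commas, then remove the spaces.
$csvLines = $baseNames | ForEach-Object { $_ -replace '[-_]', ',' -replace ' ', '' }

$csvLines
# 10,03022014,229,14,12,966359304454
# 10,03022014,230,19,4,57415641452177
```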
I'm still not exactly clear on what you mean about the csv file name. Once you clarify that, I'd be happy to help with that as well.
I think this is closer to what you're looking to do:
# Generate the '_date_time.csv' suffix once, so every file gets the same stamp.
$fileSuffix = "_" + (Get-Date -Format 'yyyyMMdd_HHmm') + ".csv"
Get-ChildItem -Filter '*.pdf' |
    ForEach-Object {
        # Get the first two underscore-delimited tokens of the PDF file name.
        $filePrefix = $_.Name.Split('_')[0,1] -join '_'
        # Perform the requisite replacements on the PDF base name.
        $string = $_.BaseName -replace '[-_]', ',' -replace ' ', ''
        # Append the string to the CSV file; prefix + suffix form the file name.
        $string | Out-File -Append -FilePath ($filePrefix + $fileSuffix)
    }
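To see what the resulting CSV name looks like, here is a small sketch that builds it from one of the sample file names. The timestamp is a fixed, hypothetical value so the result is reproducible; in real use you would call Get-Date without those arguments. `$when` and `$csvName` are illustrative names.

```powershell
# Hypothetical fixed timestamp (2014-03-02 13:45) instead of the current time.
$when = Get-Date -Year 2014 -Month 3 -Day 2 -Hour 13 -Minute 45 -Second 0

# One format string yields both the date and the time part of the suffix.
$fileSuffix = '_' + $when.ToString('yyyyMMdd_HHmm') + '.csv'

# First two underscore-delimited tokens of a sample file name from the question.
$filePrefix = '10_03022014_229_14_12-9663 5930 4454.pdf'.Split('_')[0,1] -join '_'

$csvName = $filePrefix + $fileSuffix
$csvName
# 10_03022014_20140302_1345.csv
```

Capturing the timestamp once also avoids the date and time parts coming from two different moments if the script happens to run across a minute or midnight boundary.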
Upvotes: 1