Matt

Reputation: 33

Parse file names in directory to output to csv with timestamp?

I have several hundred documents in a few directories. All share a common naming structure, but the values differ:

10_03022014_229_14_12-9663 5930 4454.pdf

10_03022014_230_19_4-574 1564 1452 177.pdf

Using PowerShell, I want to create a CSV from these file names: split out the values, strip the whitespace, and remove the .pdf extension. The end result would look something like this:

10,03022014,229,14,12,966359304454

10,03022014,230,19,4,57415641452177

All of these values are alphanumeric except for the final one, which is barcode data.

To complicate things further, the output file needs a similar naming structure: the first two "values" followed by a date and time stamp.

For example, the file would be named 10_03022014_datestamp_timestamp.csv if the files in the directory start with 10_03022014.

Any recommendations would be greatly appreciated!

Upvotes: 0

Views: 2654

Answers (2)

Frode F.

Reputation: 54981

An alternative solution:

#Get pdf-files
Get-ChildItem -Filter "*.pdf" |
#Group files that belong to the same csv-file
Group-Object -Property @{e={$_.BaseName.Split("_")[0,1] -join ("_")}} |
#Foreach csv-group
ForEach-Object {
    #Generate csv-filename
    $path = "$($_.Name)_$((Get-Date).ToString("MMddyyyy_HHmm")).csv"
    #Format content and save
    $_.Group | ForEach-Object { $_.BaseName -replace ' ', '' -replace '[-_]', ',' } | Set-Content -Path $path
}
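To see how the grouping key above is derived without touching the file system, here is a minimal sketch that runs the same split-and-join on the two base names from the question. The `$names`, `$groups`, and `$summary` variables are illustrative only and are not part of the original script.

```powershell
# Sample base names taken from the question (extension already removed).
$names = @(
    '10_03022014_229_14_12-9663 5930 4454',
    '10_03022014_230_19_4-574 1564 1452 177'
)

# Group on the first two underscore-delimited tokens, which is the same key
# used to build the csv file name.
$groups = $names | Group-Object -Property { ($_ -split '_')[0,1] -join '_' }

# One summary line per csv-group.
$summary = $groups | ForEach-Object { '{0}: {1} file(s)' -f $_.Name, $_.Count }
$summary
```

Both sample names share the prefix 10_03022014, so they should land in a single group and therefore in the same csv file.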

Upvotes: 2

jscott

Reputation: 871

The file name processing seems straightforward enough. I believe you're just replacing underscores and hyphens with commas and removing spaces from the file's base name. The following should get you the reformatted strings, at least for your two provided values:

Get-ChildItem -Filter '*.pdf' |
    ForEach-Object { $_.BaseName -Replace '[-_]', ',' -Replace ' ', '' }
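As a quick check that needs no PDF files on disk, the same replacements can be run against the two sample names as plain strings. This is only a sketch: `$samples` and `$rows` are illustrative names, and `[System.IO.Path]::GetFileNameWithoutExtension` stands in for the `BaseName` property that `Get-ChildItem` would provide.

```powershell
# File names copied from the question.
$samples = @(
    '10_03022014_229_14_12-9663 5930 4454.pdf',
    '10_03022014_230_19_4-574 1564 1452 177.pdf'
)

$rows = foreach ($name in $samples) {
    # Drop the extension, as BaseName would for a real file object.
    $base = [System.IO.Path]::GetFileNameWithoutExtension($name)
    # Hyphens and underscores become commas, then spaces are removed.
    $base -replace '[-_]', ',' -replace ' ', ''
}

$rows
```

The two resulting strings should match the desired output lines given in the question.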

I'm still not exactly clear on what you mean about the csv file name. Once you clarify that, I'd be happy to help with that as well.


I think this is closer to what you're looking to do:

# Generate '_date_time.csv' string.
$fileSuffix = "_" + (Get-Date -Format 'yyyyMMdd_HHmm') + ".csv"

Get-ChildItem -Filter '*.pdf' |
    ForEach-Object {
        # Get the first two tokens, underscore delimited, of PDF file name.
        $filePrefix = $_.Name.Split('_')[0,1] -Join('_')
        # Perform requisite replacements on PDF file name
        $string = $_.BaseName -Replace '[-_]', ',' -Replace ' ', ''
        # Write string out to CSV file, concat prefix/suffix to generate name.
        $string | Out-File -Append -FilePath $($filePrefix + $fileSuffix)
    }

Upvotes: 1
