HelicanV

Reputation: 41

Remove extraneous characters from a filename

I have been given a task that is a little over my head: take a repository of files, remove the excess garbage characters from each filename, and save the renamed files in a different directory.

Examples of the filenames are:

100-expresstoll.pdf
1000-2012-09-29.jpg
10000-2014-01-15_14.03.22.jpg
10001-2014-01-15_19.05.24.jpg
10002-2014-01-15_21.30.23.jpg
10003-2014-01-16_07.33.54.jpg
10004-2014-01-16_13.33.21.jpg
10005-Feb 4, 2014.jpeg
10006-O'Reilly_Media,_Inc..pdf

The first group of digits at the beginning of each name is the record ID, which must be retained along with the file's extension. Everything between the record ID and the file extension needs to be dropped.

For example, the final names for the first three files would be:

100.pdf
1000.jpg
10000.jpg

I have read Removing characters and Rearranging filenames in addition to other postings, but the combination of a variable-length ID at the front, a variable number of intermediary characters to be removed, and varying file extension types has really tossed this beyond my limited PowerShell reach.

Upvotes: 3

Views: 2229

Answers (4)

mklement0

Reputation: 440431

Probably the most idiomatic way of solving this is as follows (assumes that all files of interest - and no others - are in the current dir.):

Get-ChildItem -File | Rename-Item -NewName { ($_.BaseName -split '-')[0] + $_.Extension }

Add common parameter -WhatIf to the Rename-Item command to preview the renaming operation.

Note that Rename-Item always renames items in their current location; to (also) move them, use Move-Item.

If a target with the same name already exists, Rename-Item reports a non-terminating error for each such case (without aborting overall processing).
Note that this could also happen if an input filename contains no -, as that would result in an attempt to rename a file to itself.

Explanation:

  • Get-ChildItem -File outputs [System.IO.FileInfo] objects representing the files in the current directory, which are passed through the pipeline (|) to Rename-Item.

  • Passing a script block ({ ... }) to Rename-Item's -NewName parameter executes the contained code for each input object, where $_ represents the input object at hand.

    • Note that this virtually undocumented but frequently used technique is called a script-block parameter [value], where a parameter that is designed to take pipeline input can be bound with a script block that processes the input indirectly.
  • ($_.BaseName -split '-')[0] extracts the first token from each input filename's base name (the filename without its extension), using - as the separator.

  • +, because the LHS is a string, performs string concatenation.

  • $_.Extension extracts the filename extension from each input filename.
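Since the question also asks for the renamed files to land in a different folder, the same script-block technique can be sketched with Move-Item, whose -Destination parameter likewise accepts pipeline input. This is only a minimal sketch: the function name Move-RecordFile and its parameters are made up for illustration, and the destination folder is assumed to exist already.

```powershell
# Hypothetical helper (not part of PowerShell): moves every file from
# $SourceDir to $DestDir, keeping only the record ID and the extension.
function Move-RecordFile {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        [Parameter(Mandatory)] [string] $SourceDir,
        [Parameter(Mandatory)] [string] $DestDir   # assumed to exist already
    )

    Get-ChildItem -LiteralPath $SourceDir -File |
        Move-Item -Destination {
            # New name = first '-'-separated token of the base name + extension.
            Join-Path $DestDir (($_.BaseName -split '-')[0] + $_.Extension)
        }
}
```

Calling it as Move-RecordFile -SourceDir . -DestDir D:\test -WhatIf (paths are placeholders) should preview the moves without touching any file; drop -WhatIf to perform them.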

Upvotes: 2

JosefZ

Reputation: 30248

Another approach, without regular expressions. Both of the following examples use the risk-mitigation parameter -WhatIf for debugging purposes; remove it to perform the actual operation.

Rename files:

Get-ChildItem -File | ForEach-Object {
    $oldFile = $_.FullName
    $newName = $_.BaseName.Split('-')[0] + $_.Extension
    if ($_.Name -ne $newName) {
        Rename-Item -Path $oldFile -NewName $newName -WhatIf
    }
}

Rename and move files:

$newDest = 'D:\test'                       ### change to fit your circumstances
Get-ChildItem -File | ForEach-Object {
    $oldFile = $_.FullName
    $newName = $_.BaseName.Split('-')[0] + $_.Extension
    $newFile = Join-Path -Path $newDest -ChildPath $newName
    if ( -not ( Test-Path -Path $newFile ) ) {
        Move-Item -Path $oldFile -Destination $newFile -WhatIf
    }
}

Upvotes: 3

lit

Reputation: 16266

I know this is not PowerShell, but if you just want something that works, here is a cmd batch file. The ECHO in front of COPY only prints each command; remove it to actually copy the files.

SETLOCAL ENABLEDELAYEDEXPANSION

SET "OLDDIR=C:\Users\lit\files"
SET "NEWDIR=C:\Users\lit\newdir"

FOR /F "usebackq tokens=*" %%a IN (`DIR /A:-D /B "%OLDDIR%\*"`) DO (
    FOR /F "usebackq delims=- tokens=1" %%b IN (`ECHO %%a`) DO (SET "BN=%%b")
    SET "EXT=%%~xa"
    ECHO COPY /Y "%OLDDIR%\%%~a" "%NEWDIR%\!BN!!EXT!"
)

Upvotes: 0

Don Cruickshank

Reputation: 5948

You can use the -replace operator to do this kind of string manipulation:

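Get-ChildItem -File | ForEach-Object {
    $old_name = $_.FullName
    # Keep the leading digits ($1) and the final extension ($2); drop everything in between.
    $new_name = $_.Name -replace '([0-9]+).*(\.[^.]*)$', '$1$2'

    Rename-Item $old_name $new_name
}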

The regular expression is the trick here:

  1. ([0-9]+) means match a series of digits (1 or more digits)
  2. .* means match any run of characters (greedily, so it extends as far as the rest of the pattern allows)
  3. (\.[^.]*) means match a period followed by zero or more characters other than a period
  4. $ means that the match must reach the end of the string

The first and third are special in that they are surrounded by parentheses, which makes them capture groups: you can refer to the text they matched with dollar notation ($1 and $2) in the replacement string.
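To see what the pattern does without touching any files, it can be applied to a few of the sample names from the question as plain strings. This is just an illustrative sketch; the variable names $names and $result are made up here.

```powershell
# A few of the question's sample names, as plain strings (no files involved).
$names = '100-expresstoll.pdf',
         '10000-2014-01-15_14.03.22.jpg',
         "10006-O'Reilly_Media,_Inc..pdf"

# -replace applied to an array processes each element in turn.
$result = $names -replace '([0-9]+).*(\.[^.]*)$', '$1$2'
$result
# 100.pdf
# 10000.jpg
# 10006.pdf
```

Because .* is greedy and [^.]* cannot contain a period, the second group always ends up holding only the last extension, even when the middle of the name contains periods of its own.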

Upvotes: 3
