Reputation: 4160
I have a list of pdf files (from daily processing), some with date stamps of various formatting, some without.
Example:
$f = @("testLtr06-09-02.pdf", "otherletter.pdf","WelcomeLtr043009.pdf")
I am trying to remove the datestamp by stripping out dashes, then replacing any run of consecutive digits (4 or more; I may change this to 6) with the string "DATESTAMP".
So far I have this:
$d = $f | foreach {$_ -replace "-", ""} | foreach { $_ -replace ([regex]::Matches($_ , "\d{4,}")), "DATESTAMP"}
echo $d
The output:
testLtrDATESTAMP.pdf
DATESTAMPoDATESTAMPtDATESTAMPhDATESTAMPeDATESTAMPrDATESTAMPlDATESTAMPeDATESTAMPtDATESTAMPtDATESTAMPeDATESTAMPrDATESTAMP.DATESTAMPpDATESTAMPdDATESTAMPfDATESTAMP
WelcomeLtrDATESTAMP.pdf
It works fine if the file name has a datestamp, but when there is none, the -replace seems to break and inserts DATESTAMP around every character. Is there a way to fix this? I tried to change it to a foreach loop, but I couldn't figure out how to get true/false from the regex.
Thanks in advance.
Upvotes: 1
Views: 4309
Reputation: 54605
$_ -replace ([regex]::Matches($_ , "\d{4,}")), "DATESTAMP"
means: in $_, replace every occurrence of ([regex]::Matches($_ , "\d{4,}")) with "DATESTAMP".
In a filename with no datestamp (that is, no run of at least 4 consecutive digits) there is no match, so the expression ends up as "" (an empty string) when it is used as the pattern.
Thus every empty string gets replaced with DATESTAMP. And such an empty string "" sits at the start of the string and after every character.
That's why you get this long string with every character surrounded by DATESTAMP.
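The effect described above can be reproduced in isolation. This is only a small sketch; the variable names $demo, $m and $asPattern are made up for illustration and are not part of the original question.

```powershell
# An empty pattern matches before the first character and after every character.
$demo = 'abc' -replace '', 'X'
$demo                                   # XaXbXcX

# No run of 4+ digits here, so the match collection is empty...
$m = [regex]::Matches('otherletter.pdf', '\d{4,}')
$m.Count                                # 0

# ...and an empty collection becomes an empty string when converted to text.
$asPattern = "$m"
$asPattern.Length                       # 0
```

So for a name without a datestamp, the original pipeline is effectively running -replace with "" as its pattern.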
To check whether your string contains a \d{4,} match at all,
you should be able to use
[regex]::IsMatch($_, "\d{4,}")
I'm no PowerShell user, but this line alone should do the job. I'm not sure, though, whether the if can be used in a pipeline, or whether the assignment and the echo $d are needed:
$f | foreach-object {$_ -replace "-", ""} | foreach-object {if ($_ -match "\d{4,}") { $_ -replace "\d{4,}", "DATESTAMP"} else { $_ }}
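As a hedged follow-up to the line above: -replace returns its input unchanged when the pattern finds nothing, so the if/else is probably not needed once the pattern is passed as a plain string. A minimal sketch using the sample list from the question, with the two replacements chained in one script block:

```powershell
# Sample data from the question.
$f = @("testLtr06-09-02.pdf", "otherletter.pdf", "WelcomeLtr043009.pdf")

# Strip dashes first, then replace any run of 4 or more digits.
# Names without such a run pass through untouched.
$d = $f | ForEach-Object { ($_ -replace '-', '') -replace '\d{4,}', 'DATESTAMP' }

$d
# testLtrDATESTAMP.pdf
# otherletter.pdf
# WelcomeLtrDATESTAMP.pdf
```

The assignment to $d is only needed if the result is used again later; without it the pipeline output is simply printed.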
Upvotes: 2
Reputation: 126732
You can simply do:
PS > $f -replace "(\d{2}-){2}\d{2}|\d{4,}","DATESTAMP"
testLtrDATESTAMP.pdf
otherletter.pdf
WelcomeLtrDATESTAMP.pdf
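A short sketch of why this pattern works without stripping dashes first: the first alternative, (\d{2}-){2}\d{2}, matches a dashed stamp such as 06-09-02, and the second, \d{4,}, matches an undashed run of 4 or more digits. One side effect is that dashes elsewhere in a name are left alone. The file name my-file06-09-02.pdf below is a made-up example, not from the question.

```powershell
$pattern = '(\d{2}-){2}\d{2}|\d{4,}'

# Hypothetical name with a dash that is not part of the datestamp.
$kept = 'my-file06-09-02.pdf' -replace $pattern, 'DATESTAMP'
$kept                                   # my-fileDATESTAMP.pdf

# -replace applied to an array processes each element in turn.
$all = @("testLtr06-09-02.pdf", "otherletter.pdf") -replace $pattern, 'DATESTAMP'
$all
# testLtrDATESTAMP.pdf
# otherletter.pdf
```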
Upvotes: 4