Reputation: 25
I have a directory full of files with content similar to the sample below. I want to copy everything after //TEST: and before //, along with the date and time and the IPO, into a CSV.
IPO 7 604 1148 17 - Psuedo text here doesnt mean anything just filler text, beep, boop.txt werqwerwqerw erqwerwqer 2. (test) On 7 July 2017 at 0600Z, wqerwqerwqerwerwqerqwerwqjeroisduhsuf //TEST: 37MGUI2974027//, sdfajsfjiosauf sadfu (test2) On 7 July 2017 at 0600Z, blah blah //TEST: 89MTU34782374// blah blah text here //TEST: GHO394749374// (this is uneeded)
Now, each file has multiple instances of this data, and there may be hundreds of them.
I want to output it into a CSV similar to this:
89MTU34782374, 3 July 2016 at 0640Z, IPO 7 604 1148 17
I have successfully created that with the following, and I feel like I'm on the right track:
$x = "D:\New folder\"
$s = Get-Content $x
$ipo = [regex]::Match($s,'IPO([^/)]+?) -').Groups[1].Value
$test = [regex]::Matches($s,'//TEST: ([^/)]+?)//').Groups[1].Value
$date = [regex]::Matches($s,' On([^/)]+?),').Groups[1].Value
Write-Host $test"," $date"," IPO $ipo
However, I am having trouble getting it to find every instance in the file and print each one on a new line. I should also note that every text file is formatted the same way as the sample above.
Besides the trouble printing each match on its own line, I also can't figure out how to do this for multiple files.
I have tried the following, but it seems to find the terms from the first file only and print them once for each file in the directory:
$files = Get-ChildItem "D:\New folder\*.txt"
$s = Get-Content $files
for ($i=0; $i -lt $files.Count; $i++) {
$ipo = [regex]::Match($s,'IPO([^/)]+?) -').Groups[1].Value
$test = [regex]::Matches($s,'//TEST: ([^/)]+?)//').Groups[1].Value
$date = [regex]::Matches($s,' On([^/)]+?),').Groups[1].Value
Write-Host $test"," $date"," IPO $ipo
}
Does anyone have any ideas on how this could be done?
I did a bad job of explaining this. Every document has an IPO number. Every TEST string has a date/time associated with it. There may be other TEST strings, but they can be ignored; they are unneeded without a date/time. I could clean it up easily if they got included in the product, though. Every TEST+date/time combo should carry the IPO number of the document it came from.
Upvotes: 1
Views: 288
Reputation: 200283
If the date and the //TEST: ...// substring always appear as pairs and in the same order, you should be able to extract both values with a single regular expression. Try something like this:
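Get-ChildItem "D:\New folder\*.txt" | ForEach-Object {
    # Join the lines into one string so a match can span line breaks.
    $s = (Get-Content $_.FullName) -join ' '
    # One IPO per document: take the first match.
    $ipo = [regex]::Match($s,'(IPO .+?) -').Groups[1].Value
    # Each match pairs a date with the //TEST: ...// that follows it.
    [regex]::Matches($s,' On (.+?),[\s\S]*?//TEST: (.+?)//') | ForEach-Object {
        # [pscustomobject] keeps the columns in the order listed here.
        [pscustomobject]@{
            Test = $_.Groups[2].Value
            Date = $_.Groups[1].Value
            IPO  = $ipo
        }
    }
} | Export-Csv 'C:\path\to\output.csv' -NoTypeInformation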
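To see what the pairing pattern captures, here is a minimal sketch that runs it against an abbreviated copy of the sample line from the question, held in a string instead of read from a file. The variable names $sample and $rows are illustrative only, and the filler text has been shortened.

```powershell
# Abbreviated copy of the sample text from the question (held in memory for illustration).
$sample = 'IPO 7 604 1148 17 - filler text 2. (test) On 7 July 2017 at 0600Z, filler //TEST: 37MGUI2974027//, filler (test2) On 7 July 2017 at 0600Z, blah blah //TEST: 89MTU34782374// blah blah //TEST: GHO394749374// (unneeded)'

# One IPO per document: take the first match only.
$ipo = [regex]::Match($sample, '(IPO .+?) -').Groups[1].Value

# Each match pairs a date with the first //TEST: ...// that follows it.
$rows = foreach ($m in [regex]::Matches($sample, ' On (.+?),[\s\S]*?//TEST: (.+?)//')) {
    [pscustomobject]@{
        Test = $m.Groups[2].Value
        Date = $m.Groups[1].Value
        IPO  = $ipo
    }
}

# Show the result as CSV text instead of writing a file.
$rows | ConvertTo-Csv -NoTypeInformation
```

This should yield two rows, one per dated entry. The trailing //TEST: GHO394749374// is skipped because no " On <date>," precedes it.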
Upvotes: 2
Reputation: 10019
Like so? Most of your code seems fine, if I understand your question.
It's the loop that is incorrect: you repeat the same thing once per file found, but never refer to the individual files. Also, $s = ... should be inside the loop to get the content of each file.
$files = Get-ChildItem "D:\New folder\*.txt"
foreach ($file in $files) {
    # Read the current file inside the loop, not once before it.
    $s = Get-Content $file.FullName
    $ipo = [regex]::Match($s,'IPO([^/)]+?) -').Groups[1].Value
    $test = [regex]::Matches($s,'//TEST: ([^/)]+?)//').Groups[1].Value
    $date = [regex]::Matches($s,' On([^/)]+?),').Groups[1].Value
    Write-Host "$test, $date, IPO $ipo"
}
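Note that .Groups[1] picks the capture from the first match only, so the loop above prints one TEST and one date per file. A possible way to print every dated pair is sketched below. Get-TestLine is a hypothetical helper name, the combined pattern is an assumption that each date is followed by its //TEST: ...// marker with no other slash in between (as in the sample), and the sample string is an abbreviated stand-in for a file's text.

```powershell
# Hypothetical helper: returns one "TEST, date, IPO" line per dated TEST entry in $Text.
function Get-TestLine {
    param([string]$Text)

    # Same IPO pattern as above; Trim() drops the leading space the capture includes.
    $ipo = [regex]::Match($Text, 'IPO([^/)]+?) -').Groups[1].Value.Trim()

    # Assumption: each date is followed by its //TEST: ...// with no other slash in between.
    foreach ($m in [regex]::Matches($Text, ' On([^/)]+?),[^/]*//TEST: ([^/)]+?)//')) {
        '{0}, {1}, IPO {2}' -f $m.Groups[2].Value, $m.Groups[1].Value.Trim(), $ipo
    }
}

# Abbreviated stand-in for the text of one file.
$sample = 'IPO 7 604 1148 17 - filler (test) On 7 July 2017 at 0600Z, filler //TEST: 37MGUI2974027//, filler (test2) On 7 July 2017 at 0600Z, blah //TEST: 89MTU34782374// blah //TEST: GHO394749374//'

Get-TestLine -Text $sample
```

Inside the foreach loop above, calling Get-TestLine -Text ((Get-Content $file.FullName) -join ' ') in place of the three regex lines and Write-Host would then print one line per pair for each file.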
Upvotes: 1