Reputation: 57
I'm building a script to compare lines from two text files, but I got something wrong when looping through each line of the two files. I don't know why, but the code doesn't do what I intended. Here is a sample of the text in the files I'm processing.
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Begin calculating H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013's file hashes on Friday 06/19/2020 19:03:26.576 +07:00.
The size of the folder to compute is 4001554359.
The number of the files calculating is 31
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
File Name: "Agnetha - Abba & After.mp3" File size: 85118223 File Hash: 05B4C42DB852A49C11CB3F03817F149363275EA512ED8A441846B238C48E04CC
File Name: "Bill Bailey's Jungle Hero.zip" File size: 110091242 File Hash: 96608B2BCB84DAD25E71EBD82727E9DE7309D7FDA1B6FD2AEE10CCF0F3CF0C5C
File Name: "David Attenborough's Galapagos.zip" File size: 121768208 File Hash: 531643D6800AC61B34D66FD1BDEA64B025E3E27D563BF3743B502D56105F9057
File Name: "Dogging Tales.mp3" File size: 49675908 File Hash: 4F427746C4EE6D7D6B3989D541254AE3D37C89E9174BDF1944DED08D3B6448B5
File Name: "Hawking.mp3" File size: 130586456 File Hash: 27B7278A28397DFB6223FBCE4C25B530E87EC29E036CA26E0872E50872FB1021
File Name: "King Alfred and the Anglo Saxons.zip" File size: 165390328 File Hash: 8ACBEF319A5C529332CE9087EE7FCC6A78BA0CCEA00A0B6F32D01BEB04DF7873
File Name: "Me & My Guide Dog.zip" File size: 130311390 File Hash: 6257749C627AF302C8946010EBD2560352486556D9572D358EDB0349A3B41CC6
File Name: "Natural Curiosities.zip" File size: 394964664 File Hash: 1D9B9D144E9A77D04AC1FFE061FA866C48A209DCC32953D585BCE15277B573F1
File Name: "Natural World - Meerkats, Secrets of an Animal Superstar.mp3" File size: 56517900 File Hash: C3199B35DEC8A2E71A672CF714B2D928DC1CF89F958B742E136DCC7E3BC22741
I want to read the file above, save its lines in an array, and process each line. Here is the relevant piece of my script.
$hashListFile1 = "XXXXXX" #some path to the text file above
$hashListFile2 = "YYYYYY" #similar like $hashListFile1
#The above variables is the paths to the text files I'm processing.
#Continue to store the content of the 2 text files to 2 variables below respectly.
$hashListFile1Content = Get-Content -Path "$hashListFile1"
$hashListFile2Content = Get-Content -Path "$hashListFile2"
#Declare 2 arrays to store the content I'm extracting from the 2 texts.
$hashList1 = @()
$hashList2 = @()
$currentTimeStamp = Get-Date -Format "dddd MM/dd/yyyy HH:mm:ss.fff K"
$hashList1ComputeLocation = $null
#Above is the variable used to store the path extracted from the first text line in the text above.
#This path "H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013".
$hashList2ComputeLocation = $null #similar for the text file number 2
Write-Host $hashListFile1Content[5] # <= It worked when tested
Write-Host $hashListFile1Content[13] # <= worked when tested
pause
foreach ($file1Line in $hashListFile1Content) {
#loop through each item store the text lines in the array above
if ($file1Line -match "Begin calculating ") {
#capture the line that contains the text "Begin calculating H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013's file hashes...."
#It didn't work.
Write-Host $file1Line # <= test if it captured the item but it doesn't
Pause
$hashList1ComputeLocation = [regex]::Matches($fileLine, "(^Begin\scalculating\s)(.*)(\'s\sfile\shashes\son\s)(.*$)").Groups[2].Value
#I want to extract the path "H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013" but nothing is captured
Write-Host $hashList1ComputeLocation # <= nothing showed
pause
}
#continue to process the data lines
Elseif ($file1Line -match "File Name: ") {
Write-Host $file1Line
$fileName = [regex]::Matches($file1Line, "^File\sName:\s)(.*?)(\s\s\sFile\ssize:\s)(.*$)").Groups[2].Value
$fileSize = [regex]::Matches($file1Line, "(^.*)(File\ssize:\s)(.*?)(\s\s\sFile\sHash:\s)(.*$)").Groups[3].Value
$fileHash = [regex]::Matches($file1Line, "(^.*)(\s\s\sFile\sHash:\s)(.*$)").Groups[3].Value
$dataLine = @{
"File Name" = $fileName
"File Size" = $fileSize
"File Hash" = $fileHash
}
$fileInfoLine = New-Object PSObject -Property $dataLine
#Write-Host $fileInfoLine
#pause
$hashList1 += $fileInfoLine
}
Write-Host $hashList1
pause
}
Please tell me why the foreach ($file1Line in $hashListFile1Content) loop in the script above did not work.
Thank you.
Upvotes: 1
Views: 139
Reputation: 61028
I'm not quite sure what your aim is in comparing, but I would parse the files in a single loop, resulting in an array of two PsCustomObject arrays like this:
$filesToParse = 'D:\Test\test1.txt', 'D:\Test\test2.txt'
# create two regex strings, one for the location, the other for the file details
$rxLocation = '^Begin calculating\s+(.+)''s file hashes.*'
$rxDetails = '^File Name:\s+"(?<name>.*)"\s+File size:\s+(?<size>\d+)\s+File Hash:\s+(?<hash>[A-F0-9]+)'
$result = $filesToParse | ForEach-Object {
    $folder = $null
    switch -Regex -File $_ {
        $rxLocation {
            $folder = $Matches[1]
            Write-Host "Location: $folder"
        }
        $rxDetails {
            # output an object
            [PsCustomObject]@{
                'Location'  = $folder
                'File Name' = $matches['name']
                'File Size' = $matches['size']
                'File Hash' = $matches['hash']
            }
        }
        default {}
    }
}
# output the complete parsed stuff on screen
$result
# or write to CSV file
$result | Export-Csv -Path 'D:\Test\ParsedResultys.csv' -UseCulture -NoTypeInformation
Result of the above on screen:
Location                                   File Name                                                    File Size File Hash
--------                                   ---------                                                    --------- ---------
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 Agnetha, Frida and the rest.mp3                              85118223  05B4C42DB852A49C11CB3F03817F149363275EA512ED8A441846B238C48E04CC
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 Bill Bailey's Jungle Hero.zip                                110091242 96608B2BCB84DAD25E71EBD82727E9DE7309D7FDA1B6FD2AEE10CCF0F3CF0C5C
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 David Attenborough's Galapagos.zip                           121768208 531643D6800AC61B34D66FD1BDEA64B025E3E27D563BF3743B502D56105F9057
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 Dogging Tales.mp3                                            49675908  4F427746C4EE6D7D6B3989D541254AE3D37C89E9174BDF1944DED08D3B6448B5
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 Hawking.mp3                                                  130586456 27B7278A28397DFB6223FBCE4C25B530E87EC29E036CA26E0872E50872FB10AA
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 King Alfred and the Anglo Saxons.zip                         165390328 8ACBEF319A5C529332CE9087EE7FCC6A78BA0CCEA00A0B6F32D01BEB04DF7873
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 Me & My Guide Dog.zip                                        130311390 6257749C627AF302C8946010EBD2560352486556D9572D358EDB0349A3B41CC6
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 Natural Curiosities.zip                                      394964664 1D9B9D144E9A77D04AC1FFE061FA866C48A209DCC32953D585BCE15277B573F1
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 Natural World - Meerkats, Secrets of an Animal Superstar.mp3 56517900  C3199B35DEC8A2E71A672CF714B2D928DC1CF89F958B742E136DCC7E3BC22741
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 Agnetha - Abba & After.mp3                                   85118223  05B4C42DB852A49C11CB3F03817F149363275EA512ED8A441846B238C48E04CC
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 Bill Bailey's Jungle Hero.zip                                110091242 96608B2BCB84DAD25E71EBD82727E9DE7309D7FDA1B6FD2AEE10CCF0F3CF0C5C
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 David Attenborough's Galapagos.zip                           121768208 531643D6800AC61B34D66FD1BDEA64B025E3E27D563BF3743B502D56105F9057
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 Dogging Tales.mp3                                            49675908  4F427746C4EE6D7D6B3989D541254AE3D37C89E9174BDF1944DED08D3B6448B5
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 Hawking.mp3                                                  130586456 27B7278A28397DFB6223FBCE4C25B530E87EC29E036CA26E0872E50872FB1021
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 King Alfred and the Anglo Saxons.zip                         165390328 8ACBEF319A5C529332CE9087EE7FCC6A78BA0CCEA00A0B6F32D01BEB04DF7873
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 Me & My Guide Dog.zip                                        130311390 6257749C627AF302C8946010EBD2560352486556D9572D358EDB0349A3B41CC6
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 Natural Curiosities.zip                                      394964664 1D9B9D144E9A77D04AC1FFE061FA866C48A209DCC32953D585BCE15277B573F1
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 Natural World - Meerkats, Secrets of an Animal Superstar.mp3 56517900  C3199B35DEC8A2E71A672CF714B2D928DC1CF89F958B742E136DCC7E3BC22741
Next, to compare the items in both $result elements, you can use Compare-Object, something like:
Compare-Object -ReferenceObject $result[0] -DifferenceObject $result[1] -Property 'File Name', 'File Size', 'File Hash'
Which will output the differences:
File Name File Size File Hash SideIndicator
--------- --------- --------- -------------
Bill Bailey's Jungle Hero.zip 110091242 96608B2BCB84DAD25E71EBD82727E9DE7309D7FDA1B6FD2AEE10CCF0F3CF0C5C =>
Agnetha, Frida and the rest.mp3 85118223 05B4C42DB852A49C11CB3F03817F149363275EA512ED8A441846B238C48E04CC <=
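Objects written inside ForEach-Object are emitted to the pipeline one at a time, so depending on how $result is built it may arrive as one flat array rather than one array per file. A minimal sketch for that case, which splits the objects by their 'Location' before comparing, could look like this. The shortened sample data and the variable names $locations, $first, $second and $diff are assumptions made for illustration only.

```powershell
# Hypothetical stand-in for $result: a flat array with entries from two locations.
$result = @(
    [PsCustomObject]@{ 'Location' = 'H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020'; 'File Name' = 'Dogging Tales.mp3'; 'File Size' = '49675908';  'File Hash' = '4F427746C4EE6D7D6B3989D541254AE3D37C89E9174BDF1944DED08D3B6448B5' }
    [PsCustomObject]@{ 'Location' = 'H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020'; 'File Name' = 'Hawking.mp3';       'File Size' = '130586456'; 'File Hash' = '27B7278A28397DFB6223FBCE4C25B530E87EC29E036CA26E0872E50872FB10AA' }
    [PsCustomObject]@{ 'Location' = 'H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013'; 'File Name' = 'Dogging Tales.mp3'; 'File Size' = '49675908';  'File Hash' = '4F427746C4EE6D7D6B3989D541254AE3D37C89E9174BDF1944DED08D3B6448B5' }
    [PsCustomObject]@{ 'Location' = 'H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013'; 'File Name' = 'Hawking.mp3';       'File Size' = '130586456'; 'File Hash' = '27B7278A28397DFB6223FBCE4C25B530E87EC29E036CA26E0872E50872FB1021' }
)

# Collect the distinct locations in order of first appearance.
$locations = @($result.Location | Select-Object -Unique)

# Split the flat list into one set of objects per location.
$first  = @($result | Where-Object { $_.Location -eq $locations[0] })
$second = @($result | Where-Object { $_.Location -eq $locations[1] })

# Compare the two sets on everything except the location.
$diff = Compare-Object -ReferenceObject $first -DifferenceObject $second -Property 'File Name', 'File Size', 'File Hash'
$diff
```

With this sample data only the two 'Hawking.mp3' entries are reported, because their hashes differ between the two locations.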
Edit
From your comment, I gather this needs a bit of explanation.
The first line in the code puts the complete paths and file names of your two text files in an array, $filesToParse, so we can loop over them both.
Next, we define two regex strings. The first one is meant to capture the location from the line starting with "Begin calculating" (in your example file, this finds "H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013"). The next one captures the relevant parts of each file line (the file name, the file size and the hash value). This regex stores these parts in named captures to make things more readable.
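To illustrate how the named captures are read back, here is a minimal sketch. It reuses the $rxDetails pattern from the code above and one line from the question's sample file; the variable names $line, $name, $size, $hash and $summary are only illustrative.

```powershell
# One data line as it appears in the question's sample file.
$line = 'File Name: "Hawking.mp3" File size: 130586456 File Hash: 27B7278A28397DFB6223FBCE4C25B530E87EC29E036CA26E0872E50872FB1021'

# The details pattern, with one named capture group per field.
$rxDetails = '^File Name:\s+"(?<name>.*)"\s+File size:\s+(?<size>\d+)\s+File Hash:\s+(?<hash>[A-F0-9]+)'

if ($line -match $rxDetails) {
    # After a successful -match, the automatic $Matches hashtable holds the captures by name.
    $name = $Matches['name']
    $size = [long]$Matches['size']
    $hash = $Matches['hash']
    $summary = '{0} is {1} bytes, hash {2}' -f $name, $size, $hash
    $summary
}
```

Reading the fields by name avoids having to count parentheses to find the right group index.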
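Then it's time to loop over the two text files and parse out the information. A very fast way to do this is to use switch -Regex -File <filename> (<filename> is represented by the automatic variable $_).
What that does is iterate over every line in the text file and check whether the line matches one of the regexes we created.
If a line matches the first regex ($rxLocation), we capture the path in the variable $folder for later use.
If a line matches the second regex ($rxDetails), we output an object with the captured details, together with the $folder.
If a line matches neither regex, we do nothing (default {}).
The objects we output all get collected in a variable called $result.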
$result will finally be an array (of two items, one for each textfile), where each item has an array of the objects.
Finally, by using one more cmdlet, Compare-Object, we can see whether the two items in $result differ when we compare the properties 'File Name', 'File Size' and 'File Hash' (the 'Location' will of course always be different, so we don't compare that).
If there are differences, the SideIndicator column shows on which side each differing item was found (=> for the difference object, <= for the reference object). The difference might be just the file name, or the size and/or hash value.
To test, I copied your example file and made some small changes in the second one, to prove it works.
Hope that explains it.
Upvotes: 0
Reputation: 175
There's a typo in the line that assigns $hashList1ComputeLocation: it should be $file1Line instead of $fileLine:
$hashList1ComputeLocation = [regex]::Matches($fileLine, "(^Begin\scalculating\s)(.*)(\'s\sfile\shashes\son\s)(.*$)").Groups[2].Value
                                             ^^^^^^^^^
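As a quick check of that fix, the following sketch runs the extraction against the header line from the question. It is only an illustration: it swaps in [regex]::Match with a single capture group instead of the original [regex]::Matches call, and $match is a name introduced here.

```powershell
# Header line copied from the question's sample file.
$file1Line = "Begin calculating H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013's file hashes on Friday 06/19/2020 19:03:26.576 +07:00."

# Match against $file1Line (not the undefined $fileLine) and capture the path in group 1.
$match = [regex]::Match($file1Line, "^Begin\scalculating\s(.*)'s\sfile\shashes\son\s")

# Only read the group when the pattern actually matched.
$hashList1ComputeLocation = if ($match.Success) { $match.Groups[1].Value } else { $null }

Write-Host $hashList1ComputeLocation
```

Because $fileLine was never assigned, the original call searched an empty value and Groups[2].Value came back empty, which explains why nothing was printed.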
Also, in the if statement that checks for "Begin calculating ", -match uses regex, which is a bit overkill considering you have -like, which does simple wildcard matching. Mind the * at the end of the string:
if ($file1Line -like "Begin calculating*") {
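A small sketch of the difference between the two operators, using the header line from the question. The variable names are illustrative only.

```powershell
# Header line copied from the question's sample file.
$line = "Begin calculating H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013's file hashes on Friday 06/19/2020 19:03:26.576 +07:00."

# -like uses wildcards and must cover the whole string, hence the trailing *.
$likeWithStar    = $line -like 'Begin calculating*'   # True
$likeWithoutStar = $line -like 'Begin calculating'    # False: no wildcard, so the rest of the line is not covered

# -match uses a regular expression and succeeds on a partial match.
$matchResult = $line -match 'Begin calculating'       # True

$likeWithStar, $likeWithoutStar, $matchResult
```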
One last thing, which is more about personal preference: since you have a set "line format", I would try to use Select-String as much as possible to locate certain "special lines" instead of iterating over the entire file.
Select-String -Path $hashListFile1 -Pattern "Begin calculating" | Select-Object -ExpandProperty Line
Select-String -Path $hashListFile1 -Pattern "File Name:" | Select-Object -ExpandProperty Line
You could use these two lines to get the "Begin calculating" line and an array of "File Name:" lines, which you could then iterate over. This would be more concise in my opinion, and easier to debug. Your code does work though, so take it with a grain of salt.
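A self-contained sketch of that approach follows. It writes three sample lines to a temporary file as a stand-in for the real hash list, so the names $sampleFile, $headerLine, $fileLines and $names are assumptions for illustration, not part of the original script.

```powershell
# Create a temporary file holding a few lines in the question's format.
$sampleFile = [System.IO.Path]::GetTempFileName()
@(
    "Begin calculating H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013's file hashes on Friday 06/19/2020 19:03:26.576 +07:00."
    'File Name: "Dogging Tales.mp3" File size: 49675908 File Hash: 4F427746C4EE6D7D6B3989D541254AE3D37C89E9174BDF1944DED08D3B6448B5'
    'File Name: "Hawking.mp3" File size: 130586456 File Hash: 27B7278A28397DFB6223FBCE4C25B530E87EC29E036CA26E0872E50872FB1021'
) | Set-Content -Path $sampleFile

# Pull out the header line and the data lines without looping over the whole file by hand.
$headerLine = Select-String -Path $sampleFile -Pattern '^Begin calculating' | Select-Object -ExpandProperty Line
$fileLines  = @(Select-String -Path $sampleFile -Pattern '^File Name:' | Select-Object -ExpandProperty Line)

# Iterate only over the data lines and collect the quoted file names.
$names = foreach ($fileLine in $fileLines) {
    if ($fileLine -match '^File Name:\s+"(?<name>.*)"\s+File size:') { $Matches['name'] }
}
$names

# Clean up the temporary file.
Remove-Item -Path $sampleFile
```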
Upvotes: 1