mymicrosoftmylife

Reputation: 57

PowerShell - Loop through the lines of an array read from a text file fails

I'm building a script to compare lines from two text files, but something goes wrong when I loop through each line of the two files. I don't know why, but the loop doesn't do what the code is meant to do. Here is a sample of the text in the files I'm processing.

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

Begin calculating H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013's file hashes on Friday 06/19/2020 19:03:26.576 +07:00. 
The size of the folder to compute is 4001554359. 
The number of the files calculating is 31

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------


File Name: "Agnetha - Abba & After.mp3"   File size: 85118223   File Hash: 05B4C42DB852A49C11CB3F03817F149363275EA512ED8A441846B238C48E04CC
File Name: "Bill Bailey's Jungle Hero.zip"   File size: 110091242   File Hash: 96608B2BCB84DAD25E71EBD82727E9DE7309D7FDA1B6FD2AEE10CCF0F3CF0C5C
File Name: "David Attenborough's Galapagos.zip"   File size: 121768208   File Hash: 531643D6800AC61B34D66FD1BDEA64B025E3E27D563BF3743B502D56105F9057
File Name: "Dogging Tales.mp3"   File size: 49675908   File Hash: 4F427746C4EE6D7D6B3989D541254AE3D37C89E9174BDF1944DED08D3B6448B5
File Name: "Hawking.mp3"   File size: 130586456   File Hash: 27B7278A28397DFB6223FBCE4C25B530E87EC29E036CA26E0872E50872FB1021
File Name: "King Alfred and the Anglo Saxons.zip"   File size: 165390328   File Hash: 8ACBEF319A5C529332CE9087EE7FCC6A78BA0CCEA00A0B6F32D01BEB04DF7873
File Name: "Me & My Guide Dog.zip"   File size: 130311390   File Hash: 6257749C627AF302C8946010EBD2560352486556D9572D358EDB0349A3B41CC6
File Name: "Natural Curiosities.zip"   File size: 394964664   File Hash: 1D9B9D144E9A77D04AC1FFE061FA866C48A209DCC32953D585BCE15277B573F1
File Name: "Natural World - Meerkats, Secrets of an Animal Superstar.mp3"   File size: 56517900   File Hash: C3199B35DEC8A2E71A672CF714B2D928DC1CF89F958B742E136DCC7E3BC22741

I want to read the file above, save the lines in an array and process each line. Here is the relevant part of my script.

$hashListFile1 = "XXXXXX"  #some path to the text file above
$hashListFile2 = "YYYYYY"  #similar like $hashListFile1

#The variables above hold the paths to the text files I'm processing.

#Store the content of the 2 text files in the 2 variables below, respectively.

$hashListFile1Content = Get-Content -Path "$hashListFile1"
$hashListFile2Content = Get-Content -Path "$hashListFile2"


#Declare 2 arrays to store the content I'm extracting from the 2 texts.

$hashList1 = @()
$hashList2 = @()


$currentTimeStamp = Get-Date -Format "dddd MM/dd/yyyy HH:mm:ss.fff K"
$hashList1ComputeLocation = $null 

#Above is the variable used to store the path extracted from the first text line in the text above. 
#This path "H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013".

$hashList2ComputeLocation = $null    #similar for the text file number 2


Write-Host $hashListFile1Content[5]   # <= It worked when tested
Write-Host $hashListFile1Content[13]  # <= worked when tested
pause


foreach ($file1Line in $hashListFile1Content) {   
#loop through each item store the text lines in the array above

if ($file1Line -match "Begin calculating ") {     
#capture the line that contains the text "Begin calculating H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013's file hashes...." 
#It didn't work.


Write-Host $file1Line   # <= test if it captured the item but it doesn't
Pause

$hashList1ComputeLocation = [regex]::Matches($fileLine, "(^Begin\scalculating\s)(.*)(\'s\sfile\shashes\son\s)(.*$)").Groups[2].Value

#I want to extract the path "H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013" but nothing is captured

Write-Host $hashList1ComputeLocation  # <= nothing showed
pause

}

#continue to process the data lines

Elseif ($file1Line -match "File Name: ") { 
Write-Host $file1Line
$fileName = [regex]::Matches($file1Line, "^File\sName:\s)(.*?)(\s\s\sFile\ssize:\s)(.*$)").Groups[2].Value
$fileSize = [regex]::Matches($file1Line, "(^.*)(File\ssize:\s)(.*?)(\s\s\sFile\sHash:\s)(.*$)").Groups[3].Value
$fileHash = [regex]::Matches($file1Line, "(^.*)(\s\s\sFile\sHash:\s)(.*$)").Groups[3].Value
$dataLine = @{
"File Name" = $fileName
"File Size" = $fileSize
"File Hash" = $fileHash
}
$fileInfoLine = New-Object PSObject -Property $dataLine
#Write-Host $fileInfoLine
#pause
$hashList1 += $fileInfoLine
}
Write-Host $hashList1
pause
}

Please tell me why the foreach ($file1Line in $hashListFile1Content) loop in the script above does not work. Thank you.

Upvotes: 1

Views: 139

Answers (2)

Theo

Reputation: 61028

I'm not quite sure what your aim is in comparing, but I would parse the files in a single loop, resulting in one flat array of PsCustomObjects, each carrying the location it was read from, like this:

$filesToParse = 'D:\Test\test1.txt', 'D:\Test\test2.txt'

# create two regex strings, one for the location, the other for the file details
$rxLocation = '^Begin calculating\s+(.+)''s file hashes.*'
$rxDetails  = '^File Name:\s+"(?<name>.*)"\s+File size:\s+(?<size>\d+)\s+File Hash:\s+(?<hash>[A-F0-9]+)'

$result = $filesToParse | ForEach-Object {
    $folder = $null
    switch -Regex -File $_ {
        $rxLocation { 
            $folder = $Matches[1]
            Write-Host "Location: $folder"
        }
        $rxDetails  {
            # output an object
            [PsCustomObject]@{
                'Location'  = $folder
                'File Name' = $matches['name']
                'File Size' = $matches['size']
                'File Hash' = $matches['hash']
            }
        }
        default {}
    }
}

# output the complete parsed stuff on screen
$result

# or write to CSV file
$result | Export-Csv -Path 'D:\Test\ParsedResultys.csv' -UseCulture -NoTypeInformation

Result of the above on screen:

Location                                   File Name                                                    File Size File Hash                                                       
--------                                   ---------                                                    --------- ---------                                                       
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 Agnetha, Frida and the rest.mp3                              85118223  05B4C42DB852A49C11CB3F03817F149363275EA512ED8A441846B238C48E04CC
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 Bill Bailey's Jungle Hero.zip                                110091242 96608B2BCB84DAD25E71EBD82727E9DE7309D7FDA1B6FD2AEE10CCF0F3CF0C5C
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 David Attenborough's Galapagos.zip                           121768208 531643D6800AC61B34D66FD1BDEA64B025E3E27D563BF3743B502D56105F9057
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 Dogging Tales.mp3                                            49675908  4F427746C4EE6D7D6B3989D541254AE3D37C89E9174BDF1944DED08D3B6448B5
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 Hawking.mp3                                                  130586456 27B7278A28397DFB6223FBCE4C25B530E87EC29E036CA26E0872E50872FB10AA
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 King Alfred and the Anglo Saxons.zip                         165390328 8ACBEF319A5C529332CE9087EE7FCC6A78BA0CCEA00A0B6F32D01BEB04DF7873
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 Me & My Guide Dog.zip                                        130311390 6257749C627AF302C8946010EBD2560352486556D9572D358EDB0349A3B41CC6
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 Natural Curiosities.zip                                      394964664 1D9B9D144E9A77D04AC1FFE061FA866C48A209DCC32953D585BCE15277B573F1
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2020 Natural World - Meerkats, Secrets of an Animal Superstar.mp3 56517900  C3199B35DEC8A2E71A672CF714B2D928DC1CF89F958B742E136DCC7E3BC22741
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 Agnetha - Abba & After.mp3                                   85118223  05B4C42DB852A49C11CB3F03817F149363275EA512ED8A441846B238C48E04CC
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 Bill Bailey's Jungle Hero.zip                                110091242 96608B2BCB84DAD25E71EBD82727E9DE7309D7FDA1B6FD2AEE10CCF0F3CF0C5C
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 David Attenborough's Galapagos.zip                           121768208 531643D6800AC61B34D66FD1BDEA64B025E3E27D563BF3743B502D56105F9057
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 Dogging Tales.mp3                                            49675908  4F427746C4EE6D7D6B3989D541254AE3D37C89E9174BDF1944DED08D3B6448B5
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 Hawking.mp3                                                  130586456 27B7278A28397DFB6223FBCE4C25B530E87EC29E036CA26E0872E50872FB1021
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 King Alfred and the Anglo Saxons.zip                         165390328 8ACBEF319A5C529332CE9087EE7FCC6A78BA0CCEA00A0B6F32D01BEB04DF7873
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 Me & My Guide Dog.zip                                        130311390 6257749C627AF302C8946010EBD2560352486556D9572D358EDB0349A3B41CC6
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 Natural Curiosities.zip                                      394964664 1D9B9D144E9A77D04AC1FFE061FA866C48A209DCC32953D585BCE15277B573F1
H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013 Natural World - Meerkats, Secrets of an Animal Superstar.mp3 56517900  C3199B35DEC8A2E71A672CF714B2D928DC1CF89F958B742E136DCC7E3BC22741

Next, to compare the entries of the two files, split $result by Location and hand both sets to Compare-Object. Something like this (it assumes the two files report different locations, as my test files do):

$sets = $result | Group-Object -Property Location
Compare-Object -ReferenceObject $sets[0].Group -DifferenceObject $sets[1].Group -Property 'File Name', 'File Size', 'File Hash'

For the test data shown above, this reports the entries that differ between the two files: the renamed first file ('Agnetha, Frida and the rest.mp3' versus 'Agnetha - Abba & After.mp3') and 'Hawking.mp3', whose hash differs. Each differing entry is listed once per side, marked with a SideIndicator of <= or =>.


Edit

From your comment, I gather this needs a bit of explanation.

The first line in the code puts the full paths of your two text files in an array, $filesToParse, so we can loop over them both.

Next, we define two regex strings. The first one captures the location from the line starting with "Begin calculating" (in your example file, this finds "H:\THE LIBRARY\DREAMWORKS\DOCUMENTARY\2013"). The next one captures the relevant parts of each file line (the file name, the file size and the hash value). This regex stores these parts in named captures to make things more readable.
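As a small illustration of the named captures, here is a sketch that runs the details regex against a single line copied from the sample file. The variable names $line and $info, and the [long] cast on the size, are my own additions for the demo and are not part of the script above.

```powershell
# One line copied from the sample text file in the question.
$line = 'File Name: "Hawking.mp3"   File size: 130586456   File Hash: 27B7278A28397DFB6223FBCE4C25B530E87EC29E036CA26E0872E50872FB1021'

# Same pattern as $rxDetails; (?<name>...) creates a named capture group.
$rxDetails = '^File Name:\s+"(?<name>.*)"\s+File size:\s+(?<size>\d+)\s+File Hash:\s+(?<hash>[A-F0-9]+)'

if ($line -match $rxDetails) {
    # After a successful -match, the automatic variable $Matches holds the captures.
    $info = [PsCustomObject]@{
        'File Name' = $Matches['name']
        'File Size' = [long]$Matches['size']   # assumption: cast so sizes compare as numbers
        'File Hash' = $Matches['hash']
    }
    $info
}
```

Reading the captures by name keeps the code readable even if the order of the groups in the pattern changes later.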

Then it's time to loop over the two text files and parse out the information. A very fast way to do that is switch -Regex -File <filename> (here, <filename> is supplied by the automatic variable $_). It iterates over every line in the text file and checks whether the line matches the regexes we created.

  • If the line matches the location line ("Begin calculating"), we store the location in variable $folder for later use.
  • If the line matches the details regex, we take the Name, Size and Hash values from that and output an object with these values, including the location value we captured earlier in $folder.
  • If the line does not match either regex, we do nothing and so skip that line (default {})
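The three cases above can be tried out in isolation with a throwaway file. This is only a sketch: the temporary file, its three made-up lines, the shortened patterns and the variable names $tmp and $rows are assumptions for the demo.

```powershell
# Write three sample lines to a temporary file (hypothetical test data).
$tmp = [System.IO.Path]::GetTempFileName()
@(
    "Begin calculating C:\Demo's file hashes on Friday"
    'some line that matches neither pattern'
    'File Name: "a.txt"   File size: 10   File Hash: ABCDEF'
) | Set-Content -Path $tmp

$folder = $null
$rows = switch -Regex -File $tmp {
    # location line: remember the folder for the detail lines that follow
    '^Begin calculating\s+(.+)''s file hashes' { $folder = $Matches[1] }
    # details line: emit one object, tagged with the last location seen
    '^File Name:\s+"(?<name>.*)"' {
        [PsCustomObject]@{ Location = $folder; Name = $Matches['name'] }
    }
    # anything else: emit nothing, so the line is skipped
    default { }
}
Remove-Item -Path $tmp
$rows
```

Only the details line produces output, so $rows ends up holding a single object whose Location was filled in from the earlier "Begin calculating" line.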

The objects we output all get collected in a variable called $result.
$result will finally be one flat array holding the objects from both text files; the 'Location' property tells you which file each object came from.
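Note that a pipeline merges everything the script block emits into one flat list, regardless of how many input items there were. A small sketch with made-up data (the names $flat and $perFile are illustrative) shows this, and how Group-Object can split such a list again by a shared property:

```powershell
# Two input items, two objects emitted per item.
$flat = 'file1', 'file2' | ForEach-Object {
    [PsCustomObject]@{ Location = $_; Name = 'a.txt' }
    [PsCustomObject]@{ Location = $_; Name = 'b.txt' }
}
$flat.Count          # 4 objects in one flat array, not 2 nested arrays

# Group the flat list by Location to get one group per source again.
$perFile = $flat | Group-Object -Property Location
$perFile.Count       # 2 groups
$perFile | ForEach-Object { '{0}: {1} item(s)' -f $_.Name, $_.Count }
```

Each group exposes its members through the Group property, which is what you would pass on when you need the entries of one file only.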

Finally, by using one more cmdlet, Compare-Object, we can see whether the entries from the two files differ when we compare the properties 'File Name', 'File Size' and 'File Hash'. (The 'Location' will of course always be different, so we don't compare that.)

  • If there is no output from that at all, no differences were found: both lists contain the same file names, sizes and hashes.
  • If this command outputs anything, differences were found and they will show up on your screen (the SideIndicator will show either => or <=). The difference might be just the file name, or the size and/or hash value.
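Both outcomes can be seen with two tiny hand-made lists. This is a sketch with invented names and hashes; $first, $second and $diff are illustrative variable names.

```powershell
# Two small, made-up lists standing in for the parsed entries of each file.
$first = @(
    [PsCustomObject]@{ 'File Name' = 'a.txt'; 'File Hash' = 'AAAA' }
    [PsCustomObject]@{ 'File Name' = 'b.txt'; 'File Hash' = 'BBBB' }
)
$second = @(
    [PsCustomObject]@{ 'File Name' = 'a.txt'; 'File Hash' = 'AAAA' }
    [PsCustomObject]@{ 'File Name' = 'b.txt'; 'File Hash' = 'FFFF' }   # hash changed
)

$diff = Compare-Object -ReferenceObject $first -DifferenceObject $second -Property 'File Name', 'File Hash'

if (-not $diff) {
    'No differences found.'
}
else {
    # b.txt is reported twice: once with '<=' (first list) and once with '=>' (second list)
    $diff | Sort-Object -Property SideIndicator | Format-Table -AutoSize
}
```

An entry that is identical on both sides (a.txt here) is not reported at all unless you add -IncludeEqual.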

To test, I copied your example file and made some small differences in the second one, to prove it works.

Hope that explains it.

Upvotes: 0

anto418

Reputation: 175

There's a typo in line 43 of your script: it should be $file1Line instead of $fileLine:

    $hashList1ComputeLocation = [regex]::Matches($fileLine, "(^Begin\scalculating\s)(.*)(\'s\sfile\shashes\son\s)(.*$)").Groups[2].Value
                                                  ^^^^^^^^ 
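The typo goes unnoticed because, by default, PowerShell treats a variable that was never assigned as $null instead of raising an error. One way to surface such mistakes early is Set-StrictMode. A minimal sketch, where the shortened sample line, the simplified pattern and the names $location and $outcome are assumptions for the demo:

```powershell
Set-StrictMode -Version Latest

$file1Line = "Begin calculating H:\Demo's file hashes on Friday"

try {
    # $fileLine (without the 1) was never assigned, so strict mode raises an error here
    $location = [regex]::Match($fileLine, "^Begin calculating\s+(.+)'s file hashes").Groups[1].Value
    $outcome = "Extracted: $location"
}
catch {
    $outcome = "Caught: $($_.Exception.Message)"
}
$outcome
```

With strict mode on, the misspelled variable stops the statement with an error message that names the variable, instead of silently producing an empty result.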

Also, in line 35, -match does regex matching, which is a bit overkill here; -like does simple wildcard matching. Mind the * at the end of the string:

    if ($file1Line -like "Begin calculating*") {     
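To make the difference between the two operators concrete, here is a short sketch using a shortened, made-up sample line ($line is an illustrative name):

```powershell
$line = "Begin calculating H:\Demo's file hashes on Friday"

# -like uses wildcards and has to match the whole string, hence the trailing *
$line -like 'Begin calculating*'      # True
$line -like 'Begin calculating'       # False: without a wildcard the whole line must be equal

# -match uses a regex and succeeds if the pattern is found anywhere in the string
$line -match 'calculating'            # True
$line -match '^calculating'           # False: ^ anchors the pattern to the start of the line
```

So -like 'Begin calculating*' only accepts lines that start with that text, while -match 'Begin calculating ' would also accept it in the middle of a line.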

One last thing, which is more a matter of personal preference: since your lines have a fixed format, I would try to use Select-String as much as possible to locate the "special lines" instead of iterating over the entire file.

    Select-String -Path $hashListFile1 -Pattern "Begin calculating" | Select-Object -ExpandProperty Line
    Select-String -Path $hashListFile1 -Pattern "File Name:" | Select-Object -ExpandProperty Line

You could use these two lines to get the "Begin calculating" line and an array of "File Name:" lines, which you could then iterate over. This would be more concise in my opinion, and easier to debug. Apart from the typo, your loop does work though, so take this with a grain of salt.
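A self-contained sketch of that approach, using a temporary file with three made-up lines so it can run on its own. The names $path, $header and $fileLines are illustrative, and -SimpleMatch is my addition to treat the patterns as literal text.

```powershell
# Hypothetical sample file so the sketch can run without the real hash list.
$path = [System.IO.Path]::GetTempFileName()
@(
    "Begin calculating H:\Demo's file hashes on Friday"
    'File Name: "a.txt"   File size: 10   File Hash: ABCDEF'
    'File Name: "b.txt"   File size: 20   File Hash: 123456'
) | Set-Content -Path $path

# -SimpleMatch treats the pattern as literal text instead of a regex
$header    = Select-String -Path $path -Pattern 'Begin calculating' -SimpleMatch |
                 Select-Object -ExpandProperty Line
$fileLines = @(Select-String -Path $path -Pattern 'File Name:' -SimpleMatch |
                 Select-Object -ExpandProperty Line)

Remove-Item -Path $path
"Header: $header"
"Data lines: $($fileLines.Count)"
```

Wrapping the second call in @( ) keeps $fileLines an array even when only one line matches, which makes a later foreach behave the same in both cases.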

Upvotes: 1
