Sv3n

Reputation: 69

PowerShell script that searches for a string in a .txt and if it finds it, looks for the next line containing another string and does a job with it

I have the line

Select-String -Path ".\*.txt" -Pattern "6,16" -Context 20 | Select-Object -First 1

that would return 20 lines of context looking for a pattern of "6,16".

I need to look for the next line containing the string "ID number:" after the line of "6,16", read what is the text right next to "ID number:", find if this exact text exists in another "export.txt" file located in the same folder (so in ".\export.txt"), and see if it contains "6,16" on the same line as the one containing the text in question.

I know it may seem confusing, but what I mean is for example:

example.txt:5218: ID number:0002743284

should tell me whether a line like this exists:

export.txt:9783: 0002743284 *some text on the same line for example* 6,16

Upvotes: 0

Views: 2522

Answers (2)

codewario

Reputation: 21418

There's a lot wrong with both what you're expecting and the code you've tried, so let's break it down and get to the solution. Kudos for attempting this on your own. First, here's the solution; read below the code for an explanation of what you were doing wrong and how to arrive at the code I've written:

# Get matching lines plus the following line from the example.txt seed file
$seedMatches = Select-String -Path .\example.txt -Pattern "6,\s*16" -Context 0, 1

# Obtain the ID number from the line following each match
$idNumbers = foreach( $match in $seedMatches ) {
  $postMatchFields = $match.Context.PostContext -split ":\s*"

  # Note: .IndexOf(object) is case-sensitive when looking for strings
  # Returns -1 if not found
  $idFieldIndex = $postMatchFields.IndexOf("ID number")

  # Return the "ID number" to `$idNumbers` if "ID number" is found in $postMatchFields
  if( $idFieldIndex -gt -1 ) {
    $postMatchFields[$idFieldIndex + 1]
  }
}

# Match lines in export.txt where both the $id and "6,16" appear
$exportMatches = foreach( $id in $idNumbers ) {
  Select-String -Path .\export.txt -Pattern "^(?=.*\b$id\b)(?=.*\b6,\s*16\b).*$"
}

mklement0's answer essentially condenses this into less code, but I wanted to break this down fully.


First, Select-String -Path ".\*.txt" will look in all .txt files in the current directory. You'll want to narrow that down to the seed file (the file that supplies the IDs to look up in the other file). For this example, I'll use example.txt and export.txt, the paths you've used elsewhere in your question, without globbing on filenames.

Next, -Context returns the lines surrounding each match. You only care about the line after the match, so 0, 1 should suffice for -Context (0 lines before, 1 line after the match).

Finally, I've added \s* to the -Pattern to match on whitespace, should the 16 ever be padded from the ,. So now we have our Select-String command ready to go:

$seedMatches = Select-String -Path .\example.txt -Pattern "6,\s*16" -Context 0, 1
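
To see what -Context 0, 1 hands back, here is a minimal sketch against a throwaway file. The sample lines, the temp file name, and the $samplePath / $demoMatch variables are made up for illustration; your real example.txt will look different.

```powershell
# Hypothetical sample data written to a temp file; the real example.txt layout is an assumption.
$samplePath = Join-Path ([System.IO.Path]::GetTempPath()) 'context-demo.txt'
Set-Content -LiteralPath $samplePath -Value @(
  'header line'
  'amount: 6, 16'
  'ID number:0002743284'
  'trailing line'
)

# 0 lines before, 1 line after each match
$demoMatch = Select-String -LiteralPath $samplePath -Pattern '6,\s*16' -Context 0, 1

$demoMatch.Line                  # the matching line itself: amount: 6, 16
$demoMatch.Context.PostContext   # the line that follows it: ID number:0002743284

# Clean up the temp file
Remove-Item -LiteralPath $samplePath
```

Note that PostContext is always an array of strings, even when only one line of trailing context was requested.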

Next, we will need to loop over the matching results from the seed file. You can use foreach or ForEach-Object, but I'll use foreach in the example below.

For each $match in $seedMatches we'll need to get the $idNumbers from the lines following each match. When $match is ToString()'d, it will spit out the matched line and any surrounding context lines. Since we only have one line following the match for our context, we can grab $match.Context.PostContext for this.

Now we can get the ID number. $match.Context.PostContext holds the raw text of the following line (ID number:0002743284 in your example, without the example.txt:5218: prefix that Select-String prints). We can split it into an array of strings by using the -split operator on the :\s* pattern (\s* matches any amount of whitespace, including none). Once we have this, we can get the index of "ID number" and take the value of the field immediately following it. That gives us our $idNumbers. I'll also add some protection below to ensure the "ID number" field is actually found before continuing.

$idNumbers = foreach( $match in $seedMatches ) {
  $postMatchFields = $match.Context.PostContext -split ":\s*"

  # Note: .IndexOf(object) is case-sensitive when looking for strings
  # Returns -1 if not found
  $idFieldIndex = $postMatchFields.IndexOf("ID number")

  # Return the "ID number" to `$idNumbers` if "ID number" is found in $postMatchFields
  if( $idFieldIndex -gt -1 ) {
    $postMatchFields[$idFieldIndex + 1]
  }
}
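
If you want to poke at the split-and-lookup step in isolation, something like the following should do. The $line value is a made-up stand-in for one element of $match.Context.PostContext, and the trimming at the end is only needed under the assumption that your file indents its lines.

```powershell
# Hypothetical context line, standing in for one element of $match.Context.PostContext
$line = 'ID number:0002743284'

$fields = $line -split ':\s*'            # 'ID number', '0002743284'
$index  = $fields.IndexOf('ID number')   # 0
$fields[$index + 1]                      # 0002743284

# Case matters: 'ID Number' is not found
$fields.IndexOf('ID Number')             # -1

# Leading whitespace also defeats the exact comparison,
# so trim first if the file indents its lines (an assumption about your data)
$indented = ('   ID number:0002743284').Trim() -split ':\s*'
$indented.IndexOf('ID number')           # 0
```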

Now that we have $idNumbers, we can look in export.txt for each ID number and "6,\s*16" on the same line, once again using Select-String. This time, I'll put the code first since it's nothing new, then explain the regex a bit:

$exportMatches = foreach( $id in $idNumbers ) {
  Select-String -Path .\export.txt -Pattern "^(?=.*\b$id\b)(?=.*\b6,\s*16\b).*$"
}

$exportMatches will now contain the lines which contain both the target ID number and the 6,16 value on the same line. Note that the order wasn't specified, so the expression uses positive lookaheads to find both the $id and 6,16 values regardless of their order in the line. I won't break down the exact expression, but if you plug ^(?=.*\b0123456789\b)(?=.*\b6,\s*16\b).*$ into https://regexr.com, it will break down and explain the regex pattern in detail.
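
For a quick sanity check of the lookahead pattern without touching any files, you can run it against a few strings with -match. The ID and the sample lines here are invented for the test and only guess at what export.txt might contain.

```powershell
# Hypothetical ID and sample lines; the real export.txt layout is an assumption.
$id = '0002743284'
$pattern = "^(?=.*\b$id\b)(?=.*\b6,\s*16\b).*$"

$lines = @(
  '0002743284 some text 6,16'    # ID first, then the value
  '6, 16 some text 0002743284'   # value first, then the ID
  '0002743284 some text 7,20'    # ID only
  '00027432845 some text 6,16'   # longer number that merely starts with the ID
)

# With an array on the left, -match returns the elements that match
$bothPresent = $lines -match $pattern
$bothPresent   # only the first two lines
```

The last sample line is rejected because of the \b after $id, which is what keeps a longer number from counting as a hit.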


The full code is at the top of this answer.

Upvotes: 2

mklement0

Reputation: 438178

If I understand the question correctly, you're looking for something like:

Select-String -List -Path *.txt -Pattern '\b6,16\b' -Context 0, 20 |
  ForEach-Object {
    if ($_.Context.PostContext -join "`n" -match '\bID number:(\d+)') {
      Select-String -List -LiteralPath export.txt -Pattern "$($Matches[1]).+$($_.Pattern)"
    }
  }
  • Select-String's -List switch limits the matching to one match per input file; -Context 0,20 also includes the 20 lines following the matching one in the output (but none (0) before).

    • Note that I've placed \b, a word-boundary assertion at either end of the search pattern, 6,16, to rule out accidental false positives such as 96,169.
  • $_.Context.PostContext contains the array of lines following the matching line (which itself is stored in $_.Line):

    • -join "`n" joins them into a multi-line string, so as to ensure that the subsequent -match operation reports the captured results in the automatic $Matches variable, notably reporting the ID number of interest in $Matches[1], the text captured by the first (and only) capture group ((\d+)).
  • The captured ID is then used in combination with the original search pattern to form a regex that looks for both on the same line, and is passed to a second Select-String call that searches through export.txt

    • Note: An object representing the matching line, if any, is output by default; to return just $true or $false, replace -List with -Quiet.
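
The points above about -join, $Matches and \b can be tried out on plain strings. This is only a sketch: $postContext is a made-up stand-in for $_.Context.PostContext, and the helper variable names are mine.

```powershell
# Hypothetical context lines, standing in for $_.Context.PostContext
$postContext = @('some other line', 'ID number:0002743284', 'one more line')

# Array on the left: -match acts as a filter and returns the matching elements
$filtered = $postContext -match '\bID number:(\d+)'
$filtered                                   # ID number:0002743284

# Single string on the left: -match returns a Boolean and fills $Matches
$capturedId = $null
if (($postContext -join "`n") -match '\bID number:(\d+)') {
  $capturedId = $Matches[1]
}
$capturedId                                 # 0002743284

# The word boundaries keep the pattern from matching inside a longer number
$loose  = '96,169' -match '6,16'            # True
$strict = '96,169' -match '\b6,16\b'        # False
$loose
$strict
```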

Upvotes: 3
