trevoirwilliams

Reputation: 478

Split Files in PowerShell with Delimiter

I currently have a PowerShell script that iterates through bundled MT103 files. For each file, it checks for a flag and decides whether or not to forward the file. The problem is that MT177 (undesired) information sometimes comes bundled with the desired file, and the file gets forwarded to the drop point anyway.

How can I modify my PowerShell script to detect this and split the file on the delimiter '{-'?

Multiple payments in a file are separated by a line break. For example:

{-
MT103 payment 1
-}
{-
MT103 payment 2
-}

The desire is to split this file into multiple files and then process them individually.

Each resulting file should contain one of these blocks:

{-
MT103 payment 1
-}
{-
MT103 payment 2
-}

Upvotes: 3

Views: 3594

Answers (3)

trevoirwilliams

Reputation: 478

This is the code I ended up with:

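# Sample input. $destination is assumed to be defined elsewhere as the output path prefix.
$Data = "{- MT103 payment 1 -} {- MT103 payment 2 -}"
[string[]]$Array = $Data.Split("{")
if ($Array.Count -gt 1) {
  # Element 0 is the (empty) text before the first '{', so start at index 1.
  for ($i = 1; $i -lt $Array.Count; $i++) {
    # Restore the brace removed by Split() and write each block to its own file.
    "{" + $Array[$i] | Out-File "$destination-$i.fin"
  }
}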

I split the data on the opening brace '{', then prepend the brace to each resulting piece and write each reconstructed string to its own output file:

{- MT103 payment 1 -} 
{- MT103 payment 2 -}
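
As an alternative sketch (not the code above), the brace does not need to be added back if the split happens at a zero-width lookahead, because the delimiter then stays part of each piece. This assumes that '{-' appears only at the start of a block; the variable name $Blocks is chosen here purely for illustration.

```powershell
$Data = "{- MT103 payment 1 -} {- MT103 payment 2 -}"

# (?=\{-) matches the empty position just before each '{-', so nothing is
# consumed and every piece keeps its delimiter.
# -ne '' drops any empty entry; Trim() removes the space between blocks.
$Blocks = $Data -split '(?=\{-)' -ne '' | ForEach-Object { $_.Trim() }

$Blocks
```

Each element of $Blocks could then be piped to Out-File in the same way as in the loop above.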

Upvotes: 1

mklement0

Reputation: 437081

# Create sample input file:
@'
{-
MT103 payment 1
-}
{-
MT103 payment 2
-}
'@ > file.txt

$index = 1

# Split the file into blocks and write them to "outFile<index>.txt" files.
(Get-Content -Raw file.txt) -split '(?s)({-.+?-})\r?\n' -ne '' | 
  Set-Content -LiteralPath { 'outFile{0}.txt' -f $script:index++ }
  • Get-Content -Raw reads the entire input file into a single, multi-line string.
  • -split splits that string into blocks of {-...-} lines:

    • Regex (?s)({-.+?-})\r?\n captures a single block, followed by a newline; inline option s ((?s)) ensures that . also matches newlines, for multi-line matching.

      • Note that even though -split by default doesn't include what the separator regex matched in the resulting array, using a capture group ((...)) does cause inclusion of what it matches.

      • If you wanted to match more strictly by only finding {- and -} on their own lines, use the following regex instead: (?sm)(^{-$.+?^-}$)\r?\n

    • -ne '' filters out empty entries resulting from the -split operation.
  • Passing a delay-bind script block ({ ... }) to Set-Content's -LiteralPath parameter allows determining an output file path on a per-input object basis:

    • 'outFile{0}.txt' -f $script:index++ outputs outFile1.txt for the first string (block of lines), outFile2.txt for the second, and so on.

    • Because delay-bind script blocks run in a child scope, you cannot directly increment $index in the caller's scope:

      • $script:index is a convenient way to refer to the variable in the script scope.
      • However, if your code is inside a function, use the following, more robust - but more cumbersome - reference to whatever the parent scope is: (Get-Variable -Scope 1 index).Value++
      • See this answer for details.
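
The function-scoped variant described in the last bullet points could look roughly like the sketch below. The function name Split-PaymentFile and its parameters are hypothetical and not part of the original code; the regex and the Get-Variable reference are the ones given above, and the output directory is assumed to exist already.

```powershell
# Hypothetical helper: the function and parameter names are illustrative only.
function Split-PaymentFile {
  param(
    [Parameter(Mandatory)] [string] $Path,    # input file containing {- ... -} blocks
    [Parameter(Mandatory)] [string] $OutDir   # existing directory for the output files
  )

  $index = 1  # local to the function, so $script:index would not refer to it

  (Get-Content -Raw -LiteralPath $Path) -split '(?s)({-.+?-})\r?\n' -ne '' |
    Set-Content -LiteralPath {
      # The delay-bind script block runs in a child scope; -Scope 1 targets the
      # parent (function) scope, so the increment persists across input objects.
      Join-Path $OutDir ('outFile{0}.txt' -f (Get-Variable -Scope 1 index).Value++)
    }
}
```

Calling Split-PaymentFile -Path file.txt -OutDir . should then write outFile1.txt, outFile2.txt, and so on into the current directory, one file per block.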

Upvotes: 6

user11829835

Reputation:

EDITED: As I understand it, you need to split on a delimiter and remove the undesired data.

Something like the following:

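$Data = "{- MT103 payment 1 -} {- MT103 payment 2 -}"
# Use the -split operator: in Windows PowerShell, String.Split('{-') treats its
# argument as a set of characters and would also split on every '-'.
[Collections.ArrayList]$Array = $Data -split '\{-' -ne ''
# Iterate backwards so that RemoveAt() does not shift elements not yet checked.
for ($i = $Array.Count - 1; $i -ge 0; $i--) {
    if ($Array[$i] -imatch "MT177") {
        $Array.RemoveAt($i)
    }
}
# Print result
$Array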
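
The same idea can also be sketched as a single pipeline, without an explicit loop. The sample string below is hypothetical: its middle block only stands in for unwanted MT177 content, and the variable name $Kept is illustrative.

```powershell
# Hypothetical sample: the middle block stands in for undesired MT177 data.
$Data = "{- MT103 payment 1 -} {- MT177 notice -} {- MT103 payment 2 -}"

# Split on the delimiter, drop empty entries, discard blocks mentioning MT177,
# then put the delimiter (removed by -split) back in front of each kept block.
$Kept = @(
    $Data -split '\{-' -ne '' |
        Where-Object { $_ -notmatch 'MT177' } |
        ForEach-Object { '{-' + $_.TrimEnd() }
)

# Print result
$Kept
```

Because -notmatch is case-insensitive by default, it behaves like the -imatch test in the loop version.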

Upvotes: 1
