Reputation: 99
I'm trying to merge HUNDREDS of .rtf files using PowerShell.
Here's the format: a bunch of formatting codes (the RTF header) followed by the part I want:
{\rtf1\ansi {\fonttbl{\f0 Arial;}}{\colortbl\red255\green255\blue255;}{\stylesheet
}\paperw11685\paperh1560\margl600\margr600\margt600\margb600\pard\plain\f0\fs28\cf0
\ql\li75\ri75\fi0\b Instructions: }
In this case, I wish to retain "Instructions:"
{\rtf1\ansi {\fonttbl{\f0 Arial;}}{\colortbl\red255\green255\blue255;}{\stylesheet
}\paperw10530\paperh1920\margl600\margr600\margt600\margb600\pard\plain\f0\fs28\cf0
\ql\li75\ri75\fi0\b You will be presented with fifty (50) questions which are ran
domly selected from a pool of hundreds of questions. }
In this case I wish to retain "You will be presented with fifty (50) questions which are randomly selected from a pool of hundreds of questions."
The PowerShell script is this:
$files = (dir *.rtf)
$outfile = "AllQuestions.rtf"
$files | %{
$_.Name | Add-Content $outfile
$MyVar = Get-Content $_.Name
$MyVar=$MyVar -replace ".*b\s","" | Add-Content $outfile
}
My intent was to replace everything UP TO "\b " with nothing (""). I used /.*b\s/ (forward slashes as delimiters; .* = "any character, zero or more times"; b\s = the letter b followed by a whitespace character). I'm partially successful; it only strips a portion, changing
{\rtf1........cf0
\ql\li75\ri75\fi0\b Instructions: }
to
{\rtf1........cf0
Instructions: }
This makes me think there's a linefeed after cf0. I tried to strip out all the linefeeds with
-replace "\n*",""
but that didn't change the string.
What I want is to discard ALL of the preceding text (from {\rtf1.... to right before the final text) and be left with just that end text. At this point I'll accept the trailing "}" and remove it in a subsequent -replace.
Upvotes: 0
Views: 99
Reputation: 68273
You can use a multiline regex:
$text = (@'
{\rtf1\ansi {\fonttbl{\f0 Arial;}}{\colortbl\red255\green255\blue255;}{\stylesheet
}\paperw10530\paperh1920\margl600\margr600\margt600\margb600\pard\plain\f0\fs28\cf0
\ql\li75\ri75\fi0\b You will be presented with fifty (50) questions which are randomly selected from a pool of hundreds of questions. }
'@)
$text -replace '(?ms).+\\b([^}]+)}.*','$1'
You will be presented with fifty (50) questions which are randomly selected from a pool of hundreds of questions.
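Use the -Raw switch with Get-Content (available in PowerShell 3.0 and later) to read the file as a single multi-line string. Without it, Get-Content returns an array of lines and -replace is applied to each line separately, so the pattern can never match across a line break: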
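# All .rtf files in the current directory
$files = (dir *.rtf)
$outfile = "AllQuestions.rtf"
$files | %{
    # Write the file name ahead of its extracted text
    $_.Name | Add-Content $outfile
    # -Raw returns the whole file as one string instead of an array of lines
    $MyVar = Get-Content $_.Name -Raw
    # Keep only the text between the last "\b" and the closing brace
    $MyVar -replace '(?ms).+\\b([^}]+)}.*','$1' | Add-Content $outfile
}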
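To see why -Raw makes the difference, here is a minimal sketch that reads the same sample both ways. The scratch file name (sample-question.rtf in the temp folder) and the variable names are made up for illustration; the sample text and the pattern are the ones from this thread.

```powershell
# Hypothetical scratch file holding one of the question's sample documents.
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-question.rtf'
$lines = @(
    '{\rtf1\ansi {\fonttbl{\f0 Arial;}}{\colortbl\red255\green255\blue255;}{\stylesheet'
    '}\paperw11685\paperh1560\margl600\margr600\margt600\margb600\pard\plain\f0\fs28\cf0'
    '\ql\li75\ri75\fi0\b Instructions: }'
)
Set-Content -Path $sample -Value $lines

$pattern = '(?ms).+\\b([^}]+)}.*'

# Default: an array with one string per line; -replace runs on each element separately.
$byLine = Get-Content $sample
$byLineResult = $byLine -replace $pattern, '$1'

# -Raw: one string that still contains the line breaks, so the pattern sees the whole file.
$raw = Get-Content $sample -Raw
$rawResult = ($raw -replace $pattern, '$1').Trim()

"Lines read without -Raw: $($byLine.Count)"
"Elements after per-line replace: $($byLineResult.Count)"
"Result with -Raw: '$rawResult'"

Remove-Item $sample
```

Without -Raw you get three strings in and three strings out, which is why the header lines survive in the question's output and why replacing "\n" does nothing: the line breaks are not part of any of the strings. With -Raw the result is just 'Instructions:' once the surrounding spaces are trimmed.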
Upvotes: 0
Reputation: 13425
You can use a lookbehind regex. I added a capturing group (.*) and a non-capturing group (?: }) so that it matches exactly up to the closing }:
(?<=\\b )(.*)(?: })$
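A rough sketch of how that pattern could be applied in PowerShell, assuming the wanted text sits on a single line (here the last line of the first sample). The variable names are illustrative only.

```powershell
# Last line of the first sample file from the question.
$line = '\ql\li75\ri75\fi0\b Instructions: }'

# The lookbehind requires "\b " immediately before the match without consuming it;
# group 1 is the text, and the non-capturing group swallows the trailing " }".
$pattern = '(?<=\\b )(.*)(?: })$'

$text = $null
if ($line -match $pattern) {
    # -match fills the automatic $Matches table; index 1 is the first capturing group.
    $text = $Matches[1]
}
$text
```

This prints Instructions: with no surrounding spaces or brace. Because . does not match a line feed by default, text that wraps over several lines would not be captured in full this way.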
Upvotes: 1
Reputation: 2282
Replace this:
.*?\\b(?!.*?\\b)[ ]*([^}]+)
With:
$1
Example:
$MyVar -replace $regex,'$1'
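Spelled out, that might look like the sketch below. It assumes $MyVar holds only the line that contains the wanted text; applied to a whole file, the \blue255 control word in the color table may satisfy the pattern as well. The second -replace for the trailing brace is an addition for illustration, along the lines the question suggests.

```powershell
# Pattern from above: skip everything up to the last "\b" on the line,
# skip any spaces after it, then capture up to (not including) the closing brace.
$regex = '.*?\\b(?!.*?\\b)[ ]*([^}]+)'

# Assumed input: the single line that holds the text to keep.
$MyVar = '\ql\li75\ri75\fi0\b Instructions: }'

# The match stops before "}", so the brace is still there after this step.
$stripped = $MyVar -replace $regex, '$1'

# Follow-up step: drop the trailing brace and surrounding whitespace.
$clean = ($stripped -replace '\}\s*$', '').Trim()

$stripped
$clean
```

The first value still ends in " }", the second is the bare text.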
Upvotes: 0
Reputation: 533
Try this regex ($ anchors the match at the end of the line) to get the "Instructions:" or "You will be presented with fifty (50) questions which are randomly selected from a pool of hundreds of questions." portion:
\\b(.*)}$
Upvotes: 0