jazaddict

Reputation: 99

What REGEX pattern will get me the last portion of a string?

I'm trying to merge HUNDREDS of .rtf files using PowerShell.

Here's the format: a bunch of CSS-style formatting stuff followed by the part I want:

 {\rtf1\ansi {\fonttbl{\f0 Arial;}}{\colortbl\red255\green255\blue255;}{\stylesheet
}\paperw11685\paperh1560\margl600\margr600\margt600\margb600\pard\plain\f0\fs28\cf0
\ql\li75\ri75\fi0\b Instructions: }

In this case, I wish to retain "Instructions:"

{\rtf1\ansi {\fonttbl{\f0 Arial;}}{\colortbl\red255\green255\blue255;}{\stylesheet
}\paperw10530\paperh1920\margl600\margr600\margt600\margb600\pard\plain\f0\fs28\cf0
\ql\li75\ri75\fi0\b You will be presented with fifty (50) questions which are ran
domly selected from a pool of hundreds of questions. }

In this case, I wish to retain "You will be presented with fifty (50) questions which are randomly selected from a pool of hundreds of questions."

The PowerShell script is this:

$files = (dir *.rtf)
$outfile = "AllQuestions.rtf"
$files | %{
$_.Name | Add-Content  $outfile 
$MyVar = Get-Content $_.Name    
$MyVar=$MyVar -replace ".*b\s","" | Add-Content  $outfile 
}

My intent was to replace everything UP TO "\b " with nothing (""). I used /.*b\s/ (forward slashes as delimiters; .* = "anything, zero or more times"; b\s = the letter b followed by a whitespace character). I'm partially successful; it's stripping a portion:

{\rtf1........cf0
\ql\li75\ri75\fi0\b Instructions: }

to

{\rtf1........cf0 
Instructions: }

This makes me think there's a linefeed after cf0. I tried to strip out all the linefeeds:

-replace "\n*",""  

That didn't change the string.

But I want to dump ALL of the preceding string (from the {\rtf1.... to right before the final text) and be left with just that end text. At this point I'll take the trailing "}" and dump it in a subsequent -replace.

Upvotes: 0

Views: 99

Answers (4)

mjolinor

Reputation: 68273

You can use a multiline regex:

$text = (@'
{\rtf1\ansi {\fonttbl{\f0 Arial;}}{\colortbl\red255\green255\blue255;}{\stylesheet
}\paperw10530\paperh1920\margl600\margr600\margt600\margb600\pard\plain\f0\fs28\cf0
\ql\li75\ri75\fi0\b You will be presented with fifty (50) questions which are randomly selected from a pool of hundreds of questions. }
'@)

$text -replace '(?ms).+\\b([^}]+)}.*','$1'

 You will be presented with fifty (50) questions which are randomly selected from a pool of hundreds of questions. 

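Use the -Raw switch with Get-Content to read each file as a single multi-line string instead of an array of lines:

$files = Get-ChildItem *.rtf
$outfile = "AllQuestions.rtf"
$files | ForEach-Object {
    # Write the file name, then the text that follows the last \b in the file
    $_.Name | Add-Content $outfile
    $MyVar = Get-Content $_.Name -Raw   # -Raw returns the whole file as one string
    $MyVar -replace '(?ms).+\\b([^}]+)}.*','$1' | Add-Content $outfile
}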
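Note that the captured group keeps the spaces on either side of the text. A minimal sketch of one way to trim them, assuming a hypothetical helper named Get-RtfTrailingText (the name is mine, not part of the answer) and the first sample from the question:

```powershell
# Hypothetical helper: applies the same regex as above, then trims the
# spaces left around the captured text. If the pattern does not match,
# -replace returns the input unchanged (only trimmed here).
function Get-RtfTrailingText {
    param([string]$Rtf)
    ($Rtf -replace '(?ms).+\\b([^}]+)}.*', '$1').Trim()
}

# First sample from the question, joined with explicit newlines.
$sample = '{\rtf1\ansi {\fonttbl{\f0 Arial;}}{\colortbl\red255\green255\blue255;}{\stylesheet',
          '}\paperw11685\paperh1560\margl600\margr600\margt600\margb600\pard\plain\f0\fs28\cf0',
          '\ql\li75\ri75\fi0\b Instructions: }' -join "`n"

$result = Get-RtfTrailingText $sample
$result   # Instructions:
```

Because .+ is greedy, the match backtracks to the last \b in the text, so the earlier \blue255 in the color table is skipped.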

Upvotes: 0

radar

Reputation: 13425

You can use a lookbehind regex. I added a capturing group (.*) and a non-capturing group (?: }) so that it matches exactly up to the closing }:

(?<=\\b )(.*)(?: })$
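As a rough illustration of how this pattern could be used from PowerShell, here is a sketch that assumes the wanted text sits on a single line (as it does when Get-Content is called without -Raw), since . does not cross a newline by default:

```powershell
# Last line of the first sample in the question.
$line = '\ql\li75\ri75\fi0\b Instructions: }'

# -match fills the automatic $Matches table on success;
# group 1 is the text between "\b " and the closing " }".
if ($line -match '(?<=\\b )(.*)(?: })$') {
    $text = $Matches[1]
}
$text   # Instructions:
```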

Upvotes: 1

walid toumi

Reputation: 2282

Replace this:

.*?\\b(?!.*?\\b)[ ]*([^}]+)

With:

$1

Example

$MyVar -replace $regex,'$1'
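A small sketch of that call on one line of input, assuming $regex holds the pattern above. The pattern stops before the closing brace, so the brace survives the first replacement; the second -replace that removes it is my own addition, along the lines the question suggests:

```powershell
# Pattern from this answer; the sample is the last line of the first file.
$regex = '.*?\\b(?!.*?\\b)[ ]*([^}]+)'
$line  = '\ql\li75\ri75\fi0\b Instructions: }'

# First pass keeps the captured text; the trailing "}" is still there.
$step1 = $line -replace $regex, '$1'

# Second pass (an assumption, not part of the answer) drops the brace
# and any whitespace around it.
$text = $step1 -replace '\s*\}\s*$', ''
$text   # Instructions:
```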


Upvotes: 0

ariscris

Reputation: 533

Try this regex ($ refers to the end of a line) to get the "Instructions:" or "You will be presented with fifty (50) questions which are randomly selected from a pool of hundreds of questions." portion:

\\b(.*)}$
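For example, applied to a single line with -match, a sketch might look like this. It assumes the line contains only one \b, because the match starts at the first \b it finds, and it trims the spaces the group captures:

```powershell
# Last line of the first sample in the question.
$line = '\ql\li75\ri75\fi0\b Instructions: }'

# Group 1 is everything between "\b" and the final "}", spaces included.
if ($line -match '\\b(.*)}$') {
    $text = $Matches[1].Trim()
}
$text   # Instructions:
```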

Upvotes: 0
