kei

Reputation: 20471

PowerShell - Finding and modifying multiple text in string

I've got the following text for example (stored as $test):

\u003c/p\u003e\n\u003cp\u003e\u003c/p\u003e\n\u003cimg src=\"/sites/mysite/SiteCollectionImages/banner.jpg\" alt=\"\" style=\"float:none;height: auto;width: auto\"/\u003e\n\u003cp\u003eMore Meat, Less Waste, Means More Value For Your Dollar.\u003c/p\u003e\n\u003cp\u003e\u003c/p\u003e\n\u003cp\u003eWhen substituting Emu meat in your recipes or planning your serving portions, keep in mind that low fat Emu meat will not shrink like other meats. You get more of what you pay for with no bones, exterior fat, or gristle. Emu meat is very shelf stable especially if vacuum packaged. Properly vacuum packaged meat will keep fresh in your refrigerator for up to 4 weeks, and up to 6-9 months in your freezer.\u003c/p\u003e\n\u003cp\u003e\u003c/p\u003e\n\u003cimg src=\"/sites/mysite/SiteCollectionImages/logo.jpg\" alt=\"\" style=\"float:none;height: auto;width: auto\"/\u003e\n\u003cp\u003e\u003c/p\u003e\n\u003cp\u003e\u003c/p\u003e\n\u003cp\u003e

I would like to update the image URL between each img src=\" and the closing \" (to something like /sites/newSite/newLibrary/originalFilename.v2.jpg).

How would I go about doing these replacements in PowerShell using regex?

I've tried $test -replace '(?<=img src=\")(?<imgUrl>\")', ' ' to start and even that doesn't do any replacements for me.


Update

I was able to capture what I needed to replace by using $test -replace '(?<=img src=\\")(.+?)(?=\\")', '$1' (Thanks to @user1390638)

I wanted to apply a function to $1 before replacement so I had to do this to make it work:

[regex]::Replace($test, '(?<=img src=\\")(.+?)(?=\\")', {param($match) someFunction($match.Groups[1].Value) })
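
For illustration, here is a minimal sketch of that call with a concrete function in place of someFunction. Convert-ImageUrl is a hypothetical name, the sample string is a shortened stand-in for $test, and the target folder and .v2 suffix are only taken from the example above:

```powershell
# Shortened sample in the same escaped form as $test (a literal backslash before each quote).
$test = '\u003cimg src=\"/sites/mysite/SiteCollectionImages/banner.jpg\" alt=\"\"/\u003e\n\u003cimg src=\"/sites/mysite/SiteCollectionImages/logo.jpg\" alt=\"\"/\u003e'

# Hypothetical helper: keeps the file name, swaps the folder, and adds a ".v2" suffix.
function Convert-ImageUrl {
    param([string]$Url)
    $name = [System.IO.Path]::GetFileNameWithoutExtension($Url)
    $ext  = [System.IO.Path]::GetExtension($Url)
    '/sites/newSite/newLibrary/{0}.v2{1}' -f $name, $ext
}

$pattern = '(?<=img src=\\")(.+?)(?=\\")'

# The script block is converted to a MatchEvaluator and runs once per match.
$result = [regex]::Replace($test, $pattern, {
    param($match)
    Convert-ImageUrl -Url $match.Groups[1].Value
})
$result
```

Because the lookbehind and lookahead are zero-width, only the URL itself is handed to the function, and the surrounding img src=\" and \" stay untouched.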

Upvotes: 1

Views: 117

Answers (2)

user1390638

Reputation: 190

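Your regex is wrong: in a regex, \" matches only a quote, so your lookbehind never sees the literal backslash that precedes each quote in your text. To match the string between the two delimiters, use the regex below.

  • The first part, (?<=img src=\\"), finds the img src=\" text inside your variable. Note that the \ character has to be escaped as \\ here.
  • The second part, .+?, matches everything in between. The ? makes it non-greedy, so it stops at the first position where the last part can match.
  • The last part, (?=\\"), means "up to the next \"". The backslash has to be part of the lookahead, otherwise the match swallows it and the replacement drops the \ before the closing quote.

(?<=img src=\\").+?(?=\\")

Assuming your text is assigned to the $text variable:

$text -replace '(?<=img src=\\").+?(?=\\")', '/sites/newSite/newLibrary/originalFilename.v2.jpg'

-replace already replaces every match, so a single call updates all img tags. To apply several different patterns, chain the operator: $text -replace ... -replace ...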
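
If the original file name has to survive the replacement, capture groups can carry it over. This is a minimal sketch, assuming the shortened sample below stands in for $text and that every URL ends in a file name with an extension. The target folder and the .v2 suffix simply follow the question's example:

```powershell
# Shortened sample in the same escaped form as the question's text.
$text = '\u003cimg src=\"/sites/mysite/SiteCollectionImages/banner.jpg\" alt=\"\"/\u003e\n\u003cimg src=\"/sites/mysite/SiteCollectionImages/logo.jpg\" alt=\"\"/\u003e'

# [^"\\]*/     the old folder path, up to the last slash
# ([^/"\\]+)   group 1: file name without extension
# (\.\w+)      group 2: extension, including the dot
# (?=\\")      the closing backslash-quote stays outside the match
$pattern = '(?<=img src=\\")[^"\\]*/([^/"\\]+)(\.\w+)(?=\\")'

# Single quotes keep PowerShell from expanding $1 and $2; the regex engine substitutes them.
$updated = $text -replace $pattern, '/sites/newSite/newLibrary/$1.v2$2'
$updated
```

One call rewrites both img tags, since -replace acts on every match in the string.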

Upvotes: 2

Ash

Reputation: 3246

Regex

(?<=img src=\\\").+?(?=\\)

(?<=img src=\\\") # Lookbehind: preceded by `img src=\"`, escaping both `\` and `"`  
.+?               # Lazily matches everything between the two lookarounds  
(?=\\)            # Lookahead: up to the next backslash, again escaping the `\`

If you want to capture the url first to do other stuff with it, you can create a new regex object to check your matches.

$obj = $test  # the text from the question; load it however you like
$regex = [regex]::new('(?<=img src=\\\").+?(?=\\)')
$urlMatches = $regex.Matches($obj)  # avoid the name $matches, which is an automatic variable

You can then compute the new path for each match and use the match values to replace the text. As an example, here is a Scramble function that shuffles the path segments:

function Scramble {
    Param(
        [parameter(ValueFromPipeline=$true)][string]$InputObject
    )

    # Drop the empty element produced by the leading "/" and shuffle the remaining segments.
    $split = $InputObject -split "/" | Select-Object -Skip 1
    return "/" + (($split | Get-Random -Count $split.Count) -join "/")
}

foreach ($match in $urlMatches) {
    # .NET String.Replace swaps the exact string found, so no second regex is needed.
    # Strings are immutable, so the result has to be assigned back to $obj.
    $obj = $obj.Replace($match.Value, ($match.Value | Scramble))
}
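
A variation on the loop above, as a sketch only: collect each distinct URL once and map it to a new path instead of scrambling it. The sample string, the target folder, and the choice to keep the file name are assumptions made for illustration. The pattern is the one from the question's update:

```powershell
# Shortened sample in the same escaped form as the question's text.
$obj = '\u003cimg src=\"/sites/mysite/SiteCollectionImages/banner.jpg\" alt=\"\"/\u003e\n\u003cimg src=\"/sites/mysite/SiteCollectionImages/logo.jpg\" alt=\"\"/\u003e'
$regex = [regex]::new('(?<=img src=\\").+?(?=\\")')

# Collect each distinct URL once, so no replacement is attempted twice.
$urls = $regex.Matches($obj) | ForEach-Object { $_.Value } | Sort-Object -Unique

foreach ($url in $urls) {
    # Hypothetical target: same file name, new library folder.
    $newUrl = '/sites/newSite/newLibrary/' + [System.IO.Path]::GetFileName($url)
    # Strings are immutable, so the result must be assigned back.
    $obj = $obj.Replace($url, $newUrl)
}
$obj
```

Note that String.Replace swaps every occurrence of the URL in the text, not only those inside an img tag. If that matters, a regex replacement anchored on img src=\" is the safer choice.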

Upvotes: 1
