S P

Reputation: 15

Extracting a URL from a string with regex and PowerShell

I'm using PowerShell and regex. I'm scraping a web page result into a variable, but I can't seem to extract a generated URL from that variable.

This is the content (the actual URL varies):

&quot;https://api16-something-c-text.sitename.com/aweme/v2/going/?video_id=v12044gd0666c8ohtdbc77u5ov2cqqd0&amp;

My pattern, $reg = "([^&]*)&;$", always returns false.

I've been trying -match and Select-String with regex but I'm in need of guidance.

Upvotes: 0

Views: 442

Answers (2)

It really depends on what format the content is in.

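The pattern reads as follows: (?<=\&quot;) is a lookbehind that requires "&quot;" immediately before the match; (.*?) lazily matches any number of non-newline characters; and (?=\&amp;) is a lookahead that requires "&amp;" immediately after the match. Neither lookaround is included in the returned value.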

Here's a fair start:

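# Lookbehind for "&quot;", lazy match, lookahead for "&amp;"
$pattern = "(?<=\&quot;)(.*?)(?=\&amp;)"
$someText = "&quot;https://api16-something-c-text.sitename.com/aweme/v2/going/?video_id=v12044gd0666c8ohtdbc77u5ov2cqqd0&amp;"
# [regex]::Match returns a Match object; .Value holds the matched text
$newText = [regex]::Match($someText, $pattern)
$newText.Value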

Returns:

https://api16-something-c-text.sitename.com/aweme/v2/going/?video_id=v12044gd0666c8ohtdbc77u5ov2cqqd0
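
Since the question mentions -match, here is a minimal sketch of the same extraction with that operator. It assumes the variable holds the entity-encoded text shown above; the group name "url" is just a label chosen for this example.

```powershell
# Sample input copied from this answer; real scraped content will differ.
$someText = '&quot;https://api16-something-c-text.sitename.com/aweme/v2/going/?video_id=v12044gd0666c8ohtdbc77u5ov2cqqd0&amp;'

# -match returns $true or $false for a single string and, on success,
# fills the automatic $Matches hashtable with the capture groups.
if ($someText -match '&quot;(?<url>.*?)&amp;') {
    $url = $Matches['url']   # named group "url" (hypothetical name for this sketch)
    $url
}
```

The capture group does here what the lookarounds do in the pattern above: the surrounding "&quot;" and "&amp;" have to be present, but only the text between them is kept.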

Upvotes: 0

mklement0

Reputation: 439277

I suggest using a -replace operation:

$str = '&quot;https://api16-something-c-text.sitename.com/aweme/v2/going/?video_id=v12044gd0666c8ohtdbc77u5ov2cqqd0&amp;' 

$str -replace '^&quot;(.+)&amp;$', '$1'
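
Note that the ^ and $ anchors mean the -replace above only applies when the string is exactly one URL wrapped in "&quot;" and "&amp;"; if the pattern does not match, -replace returns the input unchanged. If the variable instead holds a larger chunk of the page, a sketch along these lines with Select-String might be a starting point. The $page sample and its example.com URLs are made up for illustration, and each match stops at the first "&amp;", as in the question.

```powershell
# Hypothetical sample standing in for a larger scraped page.
$page = 'a=&quot;https://example.com/one?x=1&amp;b &quot;https://example.com/two?y=2&amp;'

# -AllMatches collects every match on a line, not just the first one.
# The lookarounds keep the surrounding entities out of the matched text.
$urls = $page |
    Select-String -Pattern '(?<=&quot;)https?://.*?(?=&amp;)' -AllMatches |
    ForEach-Object { $_.Matches.Value }

$urls
```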

Upvotes: 1
