RiddleMeThis

Reputation: 895

Regex to capture a URL

I've extracted a URL from a website in this string form:

@{href=http://download.company.net/file.exe}[0]

I can't figure out a pattern to get this part out of it: http://download.company.net/file.exe so that I can use it as the URL to download the file.

As I see it, the logic would be to first match "http" as the beginning of the string, allow a wildcard in between, and then match "}" without including it in the final output. So IDK ...[http]*\} (I know that this "syntax" of mine is totally wrong, but you get the idea)

The reason I don't want to include "exe" in the pattern is that the file extension could be "msi", and I want it to be more universal. A good, comprehensive PS regex article would also help me greatly (with my inexperience in mind); I haven't found any that is "newbie friendly" or comprehensive enough to understand this topic.

Upvotes: 1

Views: 1312

Answers (2)

Saleem

Reputation: 8988

I'd use -cmatch (case-sensitive) or -imatch (case-insensitive), like this:

if ($content -imatch '(?<=href=).*(?=})') {
    $result = $matches[0]
} else {
    $result = ''
}

For the test data, it returns

http://download.company.net/file.exe
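
The logic described in the question (start at "http", stop just before "}") can also be written without lookarounds. This is only a sketch; it assumes the URL always starts with http:// or https:// and never contains a closing brace, and the variable names are just illustrative:

```powershell
# Sample input taken from the question
$content = '@{href=http://download.company.net/file.exe}[0]'

# 'https?://' anchors the match at the scheme;
# '[^}]+' then consumes every character up to (not including) the first '}'
if ($content -match 'https?://[^}]+') {
    $result = $Matches[0]
} else {
    $result = ''
}

$result
```

Because nothing in the pattern refers to the file extension, the same expression should behave the same way for .exe and .msi links.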

Upvotes: 1

Martin Brandl

Reputation: 58981

You can use either [regex]::Match or -replace.

In the following example, I capture everything after href= that is not a closing curly bracket }:

'@{href=http://download.company.net/file.exe}[0]' -replace '@{href=([^}]+).*', '$1'

Output:

http://download.company.net/file.exe
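
For completeness, here is a rough sketch of the [regex]::Match variant with the same capture group. The variable names $text, $m and $url are only placeholders for illustration:

```powershell
# Sample input taken from the question
$text = '@{href=http://download.company.net/file.exe}[0]'

# Group 1 captures one or more characters after 'href=' that are not '}'
$m = [regex]::Match($text, 'href=([^}]+)')

if ($m.Success) {
    $url = $m.Groups[1].Value
} else {
    $url = ''
}

$url
```

Unlike -replace, which returns the input unchanged when the pattern does not match, checking $m.Success makes it possible to tell a failed match apart from a real URL.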

Upvotes: 1
