Bob Wintemberg

Reputation: 3252

Returning a portion of a regular expression match

This question shows my ignorance of regular expressions. I've never understood them quite well enough.

If I wanted to match, for instance, just the URL portion of an <a> tag in HTML, what would I need to do?

My regular expression to get the entire tag is:

<A[^>]*?HREF\s*=\s*[""']?([^'"" >]+?)[ '""]?>

I have no idea how to get the URL out of that, and I don't know where to look in the regular expression documentation to figure it out.

Upvotes: 1

Views: 182

Answers (4)

Andrew Hare

Reputation: 351476

I switched things up a bit - try something like this:

<a[^>]*href="([^"]*).*>

Upvotes: 0

Rahul

Reputation: 13056

You can use parentheses (round brackets) to group parts of a regular expression match. In this case, put parentheses around the URL part and then refer to that group later by its number. See here for exactly how to do this.
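As a rough sketch of grouping, assuming Python's built-in re module (the question does not name a library): the pattern and the sample HTML below are simplified illustrations, not the asker's exact regex, and this is not a substitute for a real HTML parser.

```python
import re

# One capture group (the parentheses) around the attribute value.
# Simplified pattern for illustration; it only handles double-quoted values.
href_pattern = re.compile(r'<a\b[^>]*?\bhref\s*=\s*"([^"]*)"', re.IGNORECASE)

# Hypothetical input, made up for this example.
html = '<a href="/home">Home</a> <A HREF="http://example.com/">Example</A>'

# When the pattern has exactly one group, findall returns that group's
# text for every match instead of the whole matched tag.
urls = href_pattern.findall(html)
print(urls)
```

Other libraries expose the same idea under different names, so check how yours returns captured groups.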

Upvotes: 1

SingleNegationElimination

Reputation: 156138

Exactly how depends on the regex library you're using, but the way to do it is with a grouped expression. You actually already have one in your example, since grouped expressions are the parenthesized parts. The href attribute value is your first group (your zeroth group is the whole match).
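To make the group numbering concrete, here is a minimal sketch in one particular library, assuming Python's re module. It reuses the pattern from the question with each doubled "" written as a single ", adds case-insensitive matching as an assumption, and runs against a made-up sample tag.

```python
import re

# The question's pattern, with each doubled "" written as a single ".
# re.IGNORECASE is an added assumption so lowercase <a href=...> also matches.
pattern = re.compile(r"""<A[^>]*?HREF\s*=\s*["']?([^'" >]+?)[ '"]?>""", re.IGNORECASE)

# Hypothetical input, made up for this example.
html = '<p>See <a class="ext" href="http://example.com/page">this page</a>.</p>'

match = pattern.search(html)
if match:
    whole_tag = match.group(0)  # group 0: everything the pattern matched
    url = match.group(1)        # group 1: only the first parenthesized part
    print(whole_tag)
    print(url)
```

Here group 0 is the whole opening tag and group 1 is just the URL, which is the part the question asks for.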

Upvotes: 2

Suroot

Reputation: 4423

If you are programming in Perl, you can use the $1 variable, which holds the text captured by the first group, inside an if() statement. For example:

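# $HREF holds the HTML text; /i makes the match case-insensitive.
if ( $HREF =~ /<A[^>]*?HREF\s*=\s*[""']?([^'"" >]+?)[ '""]?>/i ) {
    print $1;    # text captured by the first pair of parentheses
}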

Upvotes: 3
