Reputation: 5433
I'm trying to do something I thought would be simple, but no luck. The goal is to grab the href value from any tag. Example:
Source Material:
<link href="http://www.somesite.com/test.css" rel="stylesheet" type="text/css">
RegEx attempting:
<link[^>]*href=["{1}](.*?)["{1}][^>]*>
It seems valid at http://regexpal.com/, but it isn't working when I try it at http://www.solmetra.com/scripts/regex/index.php.
Any ideas?
Upvotes: 0
Views: 2883
Reputation: 270637
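Looks like you have the {1} inside a character class [] when it should really follow it. Inside the brackets, {1} is not a quantifier: the class ["{1}] matches any single one of the characters ", {, 1, or }. Actually, the quantifier isn't even necessary, since matching exactly once is implicit. Instead, you should use [^"]* to match everything up to the next quote: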
<link[^>]*href="([^"]*)"[^>]*>
Note: You're only attempting to match double-quoted href attributes. This will require modification if you expect to encounter any single-quoted attributes.
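One possible modification, sketched here in Python since the question doesn't name a language: capture the opening quote and require the same character to close the value with a backreference. The names HREF_RE and extract_hrefs are made up for illustration, and the \s before href is an added assumption so that attributes such as data-href are not matched by accident.

```python
import re

# Group 1 captures the opening quote (" or '); \1 requires the same
# character to close the value. Group 2 is the href value itself.
HREF_RE = re.compile(r"""<link[^>]*\shref=(["'])(.*?)\1[^>]*>""", re.IGNORECASE)

def extract_hrefs(html):
    """Return the href values of all <link> tags found in html."""
    return [m.group(2) for m in HREF_RE.finditer(html)]

sample = '<link href="http://www.somesite.com/test.css" rel="stylesheet" type="text/css">'
print(extract_hrefs(sample))
```

This still won't handle unquoted attribute values or a > inside an attribute, which is part of why the parser route is safer.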
Obligatory public service announcement: It is better to use a proper HTML parsing library to parse HTML and retrieve attributes than to try parsing it with regular expressions.
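As a rough sketch of the parser approach, assuming Python and its standard-library html.parser module (the question doesn't say which language is in use, and the class name LinkHrefParser is just illustrative):

```python
from html.parser import HTMLParser

class LinkHrefParser(HTMLParser):
    """Collect href values from <link> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser lowercases
        # tag and attribute names and strips the surrounding quotes.
        if tag == "link":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)

parser = LinkHrefParser()
parser.feed('<link href="http://www.somesite.com/test.css" rel="stylesheet" type="text/css">')
print(parser.hrefs)
```

The parser deals with single quotes, double quotes, and attribute order on its own, so none of that needs to be encoded in a pattern.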
Upvotes: 2