Reputation: 242
I am using the following expression to select all hyperlinks:
//a[@href]
How can I write an expression that selects only the hyperlinks which match this format?
Here http://abc.com/articles/ is constant and the article number increases.
Upvotes: 0
Views: 1088
Reputation: 9541
<a\s.*?href=(?:["'](http://abc\.com/articles/([0-9]+))["']).*?>(.*?)</a>
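As a rough sketch of how a pattern of this shape could be applied, here is a small Python example. The sample markup and the names `html` and `LINK_RE` are made up for illustration. The sketch also swaps the `.*?` inside the tag for `[^>]*`, on the assumption that a match should not be allowed to run from one tag into the next.

```python
import re

# Hypothetical sample markup, used only for illustration.
html = (
    '<p><a class="x" href="http://abc.com/articles/41">First</a> '
    '<a href="http://abc.com/about">About</a> '
    "<a href='http://abc.com/articles/1057'>Second</a></p>"
)

# Group 1: full URL, group 2: article number, group 3: link text.
# [^>]* keeps the attribute part of the match inside a single <a ...> tag.
LINK_RE = re.compile(
    r'<a\s[^>]*?href=["\'](http://abc\.com/articles/([0-9]+))["\'][^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)

matches = LINK_RE.findall(html)
for url, number, text in matches:
    print(url, number, text)
```

Regexes over HTML are fragile (nested tags, unusual quoting), so this is only reasonable for markup whose shape is known in advance.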
UPDATE:
If you need the XPath expression, here it is:
//a[starts-with(@href,'http://abc.com/articles/')]
This returns all links whose href attribute starts with 'http://abc.com/articles/'. I hope this answers your question.
Upvotes: 1
Reputation: 76898
It's a bit overkill, but this is the regex I use in my apps to find URLs in plain text:
(\b(?:(?:https?|ftp|file)://|www\.|ftp\.)(?:\([-A-Z0-9+&@#/%=~|\$\?!:,\.]*\)|[-A-Z0-9+&@#/%=~|\$\?!:,\.])*(?:\([-A-Z0-9+&@#/%=~|\$\?!:,\.]*\)|[A-Z0-9+&@#/%=~|\$]))
Upvotes: 0
Reputation: 282865
That expression looks like XPath, not a regex. A regex for that particular URL would look like:
^http://abc\.com/articles/\d+$
But I guess you'll have to use your XPath query to find the hyperlinks, then filter them based on the href attribute using that regex.
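A minimal sketch of that two-step approach in Python, assuming the page is well-formed XML/XHTML so the standard library's `xml.etree.ElementTree` can parse it (real-world HTML often is not, and would need an HTML parser instead). The function name `article_links` and the sample markup are hypothetical. `fullmatch` is used in place of the `^...$` anchors.

```python
import re
import xml.etree.ElementTree as ET

# Same pattern as above; fullmatch() makes the ^ and $ anchors unnecessary.
ARTICLE_RE = re.compile(r"http://abc\.com/articles/\d+")


def article_links(markup):
    """Return href values of <a> elements that point at a numbered article."""
    root = ET.fromstring(markup)
    # Step 1: XPath-style query for every <a> that has an href attribute.
    hrefs = [a.get("href") for a in root.findall(".//a[@href]")]
    # Step 2: keep only the hrefs that match the article URL pattern.
    return [h for h in hrefs if ARTICLE_RE.fullmatch(h)]


# Hypothetical sample document, used only for illustration.
sample = """<html><body>
<a href="http://abc.com/articles/12">twelve</a>
<a href="http://abc.com/articles/">index</a>
<a href="http://abc.com/articles/345">three four five</a>
<a name="top">anchor without href</a>
<a href="http://example.org/articles/7">elsewhere</a>
</body></html>"""

links = article_links(sample)
print(links)
```

Splitting the work this way leaves the parsing to the XML library and uses the regex only on the attribute value, which is where it is reliable.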
Upvotes: 1