Reputation: 2800
I want a regex for the HTML below that produces the following result:
Actions and Sci-Fi
<a href="/?genre=Action">Actions</a> <a href="/?genre=Sci-Fi">Sci-Fi</a>
Upvotes: 4
Views: 44
Reputation: 174834
Don't parse HTML with regex. If you insist, you could use the regex below and read the text inside the anchor tags from capture group 1.
<a\s[^<>]*>([^<>]*)<\/a>
Explanation:
<a           '<a'
\s           whitespace (\n, \r, \t, \f, and " ")
[^<>]*       any character except: '<', '>' (0 or more times)
>            '>'
(            group and capture to \1:
  [^<>]*       any character except: '<', '>' (0 or more times)
)            end of \1
<            '<'
\/           '/'
a>           'a>'
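As a rough illustration of reading group 1, here is a minimal Python sketch. Python is my assumption, since the question does not name a language, and the helper name `anchor_texts` is made up for the example. The `\/` escape is written as a plain `/` because Python does not need it.

```python
import re

# Pattern from the answer above; group 1 captures the text between <a ...> and </a>.
ANCHOR_TEXT = re.compile(r'<a\s[^<>]*>([^<>]*)</a>')

def anchor_texts(html):
    """Return the captured text of every simple anchor tag, in document order."""
    # With exactly one capture group, findall returns a list of that group's strings.
    return ANCHOR_TEXT.findall(html)

sample = '<a href="/?genre=Action">Actions</a> <a href="/?genre=Sci-Fi">Sci-Fi</a>'
print(" and ".join(anchor_texts(sample)))  # prints: Actions and Sci-Fi
```

Because the captured part is `[^<>]*`, an anchor whose content contains another tag (for example `<a href="x"><b>bold</b></a>`) is not matched at all, which is one reason a real HTML parser is the safer choice.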
Upvotes: 4