Need help forming a regular expression in Perl

Question

I need some suggestions on parsing HTML content. I need to extract the id of each <a> tag inside a div and store it in a specific variable. I have tried to write a regular expression for this, but it picks up the ids of the <a> tags in every div. I need to store only the ids of the <a> tags that are inside one specific div.

The HTML content is



-
aaa

-
bbb

.
.
.




-
ccc

-
ddd

-
eee

.
.

Any suggestions are welcome. Thanks in advance.

Update: the regex I have used:

if($content=~m/sel_cat " id="([^<]*?)"/is){}

while($content=~m/sel_cat " id="([^<]*?)"/igs){}

amon · Accepted Answer

There are many great HTML parsers around. I like the Mojo suite, which lets me use CSS selectors to pick out a part of the DOM:

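use feature 'say';   # enables say()
use Mojo::DOM;

my $dom = Mojo::DOM->new($html_content);

# Print the text of every <a class="sel_cat"> element
say for $dom->find('a.sel_cat')->map('all_text')->each;
# Or, looping over the elements explicitly:
# say $_->all_text for $dom->find('a.sel_cat')->each;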

Output:

aaa
bbb
ccc
ddd
eee

Or for the IDs:

say for $dom->find('a.sel_cat')->map(attr => 'id')->each;
# Or, looping over the elements explicitly:
# say $_->attr('id') for $dom->find('a.sel_cat')->each;

Output:

sel_cat_10018
sel_cat_10007
sel_cat_10016
sel_cat_10011
sel_cat_10025

If you only want those ids in the part_two div, use the selector #part_two a.sel_cat.
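
As a rough sketch of that last point, the snippet below collects the ids from one div only. The markup is an assumption made for illustration: the original HTML is not shown here, so the list structure, the part_one id, and the pairing of link texts to ids are hypothetical, inferred from the class name in the question's regex and from the outputs above.

```perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;

# Hypothetical markup: the real page is not available, so the structure
# and the "part_one" id are assumptions made for this example.
my $html_content = <<'HTML';
<div id="part_one">
  <ul>
    <li><a class="sel_cat " id="sel_cat_10018">aaa</a></li>
    <li><a class="sel_cat " id="sel_cat_10007">bbb</a></li>
  </ul>
</div>
<div id="part_two">
  <ul>
    <li><a class="sel_cat " id="sel_cat_10016">ccc</a></li>
    <li><a class="sel_cat " id="sel_cat_10011">ddd</a></li>
    <li><a class="sel_cat " id="sel_cat_10025">eee</a></li>
  </ul>
</div>
HTML

my $dom = Mojo::DOM->new($html_content);

# Descendant selector: only <a class="sel_cat"> elements inside #part_two
my @ids = $dom->find('#part_two a.sel_cat')->map(attr => 'id')->each;

say for @ids;
```

With this sample input, @ids holds only the three ids from the second div, which is the variable the question asks for. The ids from the first div are never matched, so no post-filtering is needed.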
