I'm having a bit of a problem with a regex in Perl.
Assume I'm getting a string with URIs embedded somewhere in it. I'd like to store every unique URI.
My problem is that the URIs in that string might have different formats. Some might be `mylightsaber24.com`, others might be `http://www.companyabc.co.uk` or even `www.thisisawebsite.com/index.html?someparameters`.
For that reason, both `Regexp::Common qw/URI/` and `Regexp::Common qw/net/` failed me :(
Any pointers?
Thanks so much!
Bonus points for identifying that `www.nomansland.com` and `nomansland.com` are basically the same entry.
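To make the requirement concrete, here is a minimal sketch of one possible approach in core Perl, using a hand-rolled pattern rather than Regexp::Common. The names `$uri_re`, `normalize_uri` and `unique_uris` are hypothetical, and the pattern rests on a loud assumption: anything that looks like a dotted host name ending in an alphabetic top-level domain counts as a URI, with an optional `http://` or `https://` in front. That means it will also pick up things like file names (`notes.txt`), so treat it as a starting point rather than a complete URI parser.

```perl
use strict;
use warnings;

# Hypothetical pattern (an assumption, not taken from Regexp::Common):
# optional scheme, dotted host with an alphabetic TLD, optional port and path.
my $uri_re = qr{
    (?<![\w\@.-])                            # do not start mid-word or inside an e-mail address
    (?:https?://)?                           # optional scheme
    (?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+   # one or more host labels, each followed by a dot
    [a-z]{2,}                                # top-level domain
    (?::\d+)?                                # optional port
    (?:/[^\s<>"']*)?                         # optional path and query, up to whitespace
}xi;

# Reduce a matched URI to a comparison key: no scheme, lower-case host,
# no leading "www.", no trailing sentence punctuation, no bare trailing slash.
sub normalize_uri {
    my ($uri) = @_;
    $uri =~ s/[.,;:!?)]+\z//;
    $uri =~ s{\Ahttps?://}{}i;
    my ($host, $rest) = $uri =~ m{\A([^/:]+)(.*)\z}s;
    $host = lc $host;
    $host =~ s/\Awww\.//;
    $rest = '' if $rest eq '/';
    return $host . $rest;
}

# Return the normalized URIs found in the text, in order of first appearance.
sub unique_uris {
    my ($text) = @_;
    my (%seen, @unique);
    while ($text =~ /($uri_re)/g) {
        my $key = normalize_uri($1);
        push @unique, $key unless $seen{$key}++;
    }
    return @unique;
}

my $text = 'See mylightsaber24.com, http://www.companyabc.co.uk or '
         . 'www.thisisawebsite.com/index.html?someparameters. '
         . 'Also www.nomansland.com and nomansland.com.';
print "$_\n" for unique_uris($text);
```

Using the normalized form as a hash key is what makes `www.nomansland.com` and `nomansland.com` collapse into one entry. Whether a `www.` prefix, the scheme, or a trailing slash should really be ignored is a judgment call that depends on the data.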