msallge

Reputation: 109

Perl regex to match URIs in a string

I'm having a bit of a problem with a regex in Perl.

Assume I'm getting a string with URIs embedded somewhere in it. I'd like to store every unique URI.

My problem is that URIs in that string might have different formats. Some might be mylightsaber24.com, others might be http://www.companyabc.co.uk or even www.thisisawebsite.com/index.html?someparameters.

For that reason, both Regexp::Common qw/URI/ and Regexp::Common qw/net/ failed me :(

Any pointers?

Thanks so much!

Bonus points for identifying that www.nomansland.com and nomansland.com are basically the same entry.

Upvotes: 2

Views: 570

Answers (1)

Pavel Vlasov

Reputation: 3465

What about these CPAN modules:
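
If pulling in a module is not an option, here is a rough core-Perl sketch of the idea. It is only a heuristic, not a full URI parser: the pattern, the helper name unique_uris, and the choice to drop the scheme and a leading "www." are all my assumptions, and it will also pick up things that merely look like host names (file names such as notes.txt, the domain part of an email address).

```perl
use strict;
use warnings;

# Hypothetical helper: pull host-like tokens out of free text and return
# the unique ones, treating "www.example.com" and "example.com" as equal.
sub unique_uris {
    my ($text) = @_;

    # Assumed pattern: optional http(s) scheme, a dotted host name ending
    # in a TLD of two or more letters, then an optional path/query that
    # runs up to the next whitespace, quote or angle bracket.
    my $uri_re = qr{
        (?:https?://)?                  # optional scheme
        ((?:[a-z0-9-]+\.)+[a-z]{2,})    # host, captured in $1
        (/[^\s<>"']*)?                  # optional path and query, in $2
    }xi;

    my (%seen, @unique);
    while ($text =~ /$uri_re/g) {
        my ($host, $path) = (lc $1, $2 // '');
        $host =~ s/^www\.//;            # fold "www." variants together
        $path =~ s/[.,;:!?)]+$//;       # drop trailing sentence punctuation
        $path = '' if $path eq '/';     # "example.com/" equals "example.com"
        my $key = $host . $path;
        push @unique, $key unless $seen{$key}++;
    }
    return @unique;
}

my $text = 'See mylightsaber24.com, http://www.companyabc.co.uk and '
         . 'www.nomansland.com plus nomansland.com.';
print "$_\n" for unique_uris($text);
```

The %seen hash does the de-duplication, and stripping the scheme and the "www." prefix before building the key is what makes www.nomansland.com and nomansland.com collapse into one entry. Whether those two really are the same site is an assumption you would have to be comfortable with.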

Upvotes: 2
