C0deAttack

Reputation: 24667

Matching URLs in text except those enclosed by square brackets

I'm trying to create a regex so I can identify URLs in text.

Possible (likely) test cases:

Only the lines without square brackets should match, and only the URL should be matched, not the whole line. In case it was unclear, the bold text in the list above is what I would like the regex to match.

The current regex I've worked out is:

(^|[^\[ ])(https?://\S+)

Only the first 2 lines match. I can't figure out how to make the other lines without square brackets match.

I've used groups because I'll be replacing the match with some HTML later, but I need to get the regex working properly first.

I've been using this online tool to help me build and test the regex: http://gskinner.com/RegExr/

Upvotes: 2

Views: 608

Answers (3)

orlp

Reputation: 117701

A modified, working version of your regex:

([^\S\]](https?:\/\/[^\]\s]+)[^\S\]]|^(https?:\/\/[^\]\s]+)$)

Rubular
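
A minimal sketch of how this pattern could be exercised in Python, assuming the standard `re` module and a few made-up sample lines (the asker's original test cases are not reproduced here). The URL lands in group 2 when it is surrounded by whitespace and in group 3 when it is the whole line, so the hypothetical helper `extract_url` checks both.

```python
import re

# Pattern from the answer above. [^\S\]] matches a character that is neither
# non-whitespace nor "]", which in practice means a whitespace character.
PATTERN = re.compile(
    r'([^\S\]](https?:\/\/[^\]\s]+)[^\S\]]|^(https?:\/\/[^\]\s]+)$)'
)

def extract_url(line):
    """Return the URL from group 2 or group 3, or None (hypothetical helper)."""
    match = PATTERN.search(line)
    if match is None:
        return None
    # Group 2: URL between whitespace. Group 3: URL that is the entire line.
    return match.group(2) or match.group(3)

# Made-up sample lines, not the asker's original test cases.
for line in ["http://example.com",
             "see http://example.com for details",
             "[http://example.com]"]:
    print(repr(line), "->", extract_url(line))
```

As written, a URL that ends a longer line with no trailing whitespace satisfies neither alternative, so it may be worth testing that case against your own data.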

Upvotes: 1

Hun1Ahpu

Reputation: 3355

This should work:

(?<=^[^\[\]]*)(https?://\S+)(?=[^\[\]]*$)

With [^\[\]]* you say that any characters except square brackets may appear before and after your link. This uses a positive lookbehind and a positive lookahead to check that there are no brackets on the line.
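
The lookbehind here is variable-width, which some engines accept (.NET, for example) and others reject; Python's built-in `re` module raises an error for it. Below is a rough sketch of the same idea in two steps, assuming Python and made-up sample lines: first reject any line that contains a square bracket, then pick out the URLs. The helper name `urls_outside_brackets` is hypothetical, and this is a swapped-in approach rather than the regex above.

```python
import re

# Same URL sub-pattern as in the regex above.
URL = re.compile(r'https?://\S+')

def urls_outside_brackets(line):
    """Return all URLs on a line, or [] if the line has any square bracket."""
    # Step 1: plays the role of the [^\[\]]* lookbehind/lookahead checks.
    if '[' in line or ']' in line:
        return []
    # Step 2: collect every URL on the bracket-free line.
    return URL.findall(line)

# Made-up sample lines, not the asker's original test cases.
print(urls_outside_brackets("see http://example.com for details"))
print(urls_outside_brackets("[http://example.com]"))
```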

Upvotes: 0

codaddict

Reputation: 455122

You can also use a negative lookahead assertion to ensure the line does not contain a pair of square brackets, using the regex:

^(?!.*\[.*\]).*(https?://\S+)

Rubular link
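
A minimal sketch of this pattern in Python, assuming the standard `re` module, made-up sample lines, and a plain anchor tag as the HTML replacement (the question does not say what HTML is wanted). The helpers `find_url` and `linkify` are hypothetical names, and HTML escaping of the URL is left out for brevity.

```python
import re

# Pattern from the answer above: the negative lookahead rejects any line
# that contains "[" followed later by "]"; group 1 captures the URL.
PATTERN = re.compile(r'^(?!.*\[.*\]).*(https?://\S+)')

def find_url(line):
    """Return the captured URL, or None if the line is rejected."""
    match = PATTERN.search(line)
    return match.group(1) if match else None

def linkify(line):
    """Wrap the captured URL in an anchor tag, leaving other text as-is."""
    match = PATTERN.search(line)
    if match is None:
        return line
    url = match.group(1)
    # Splice around group 1 so the text before and after the URL is kept.
    return (line[:match.start(1)]
            + '<a href="{0}">{0}</a>'.format(url)
            + line[match.end(1):])

# Made-up sample lines, not the asker's original test cases.
print(linkify("Visit http://example.com today"))
print(linkify("[http://example.com]"))
```

Because the leading `.*` is greedy, a line with several URLs would capture only the last one, so a per-URL loop may be needed in that case.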

Upvotes: 1
