danneth

Reputation: 2791

RegEx - character not before match

I understand the concepts of RegEx, but this is more or less the first time I've actually been trying to write some myself.

As part of a project, I'm attempting to pick out strings that match a certain domain (actually an array of domains, but let's keep it simple).

At first I started out with this:

url.match('www.example.com')

But I noticed I was also getting input like this:

http://www.someothersite.com/page?ref=http://www.example.com

These rows will of course match for www.example.com but I wish to exclude them. So I was thinking along these lines: Only match rows that contain www.example.com, but not after a ? character. This is what I came up with:

var reg = new RegExp("[^\\?]*" + url + "(\\.*)", "gi"); 

However, this does not seem to work. Any suggestions would be greatly appreciated, as I fear I've used up what little knowledge I have on the matter.

Edit: Some clarifications.

Upvotes: 1

Views: 2027

Answers (1)

SilentGhost

Reputation: 320039

Edit: here is the modified regex for an arbitrary domain:

RegExp("(^|\\s)(https?://)?(\\w+\\.)?" + url, "gi");

The idea is to match the URL only at the start of the string or directly after a whitespace character, which makes it impossible for the match to be inside the query string.
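
To check this against the two kinds of input from the question, here is a minimal sketch. The helper names escapeRegExp and domainMatcher are hypothetical and not part of the original answer. The sketch also makes two assumptions beyond the answer: it escapes the domain so that the dots match literally, and it drops the "g" flag, since a global regex keeps state in lastIndex between test() calls.

```javascript
// Hypothetical helper (not from the original answer): escape regex
// metacharacters so that "." in the domain matches a literal dot.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build the pattern from the answer for a single domain.
// Assumption: only the "i" flag is used, so repeated test() calls are stateless.
function domainMatcher(domain) {
  return new RegExp("(^|\\s)(https?://)?(\\w+\\.)?" + escapeRegExp(domain), "i");
}

var reg = domainMatcher("www.example.com");

// The domain is at the start of the string, so this should match.
var direct = reg.test("http://www.example.com/page");

// The domain only appears after "?ref=", with no whitespace before it,
// so this should not match.
var inQuery = reg.test("http://www.someothersite.com/page?ref=http://www.example.com");

console.log(direct, inQuery);
```

For the array of domains mentioned in the question, the same function could be called once per domain.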

Upvotes: 1
