mrinterested

Reputation: 155

RegEx with JavaScript matches more than it should

Fairly simple HTML (the ellipses indicate that there's more code):

...Profile">
 Some text
 </a>...

Using an online RegEx tester for JavaScript (http://regexpal.com/), I can extract "Some text" (note that it contains newlines) with the following expression:

(?=Profile">)[\s\S]*(?=</a)

(Unfortunately, look-behinds are not supported by JavaScript, so I also match the leading Profile"> and remove it later.) The problem, however, is that the code below

var ShowContent = document.getElementById(id);
ShowContent = ShowContent.innerHTML;
var patt3=/Profile">[\s\S]*(?=<)/;
var GetName=patt3.exec(ShowContent);
alert(GetName);

doesn't extract what the online tester shows: it also includes all of the HTML that comes after "Some text" (i.e., not only the closing </a but everything after it as well).

Does anyone have any suggestions?

Upvotes: 0

Views: 76

Answers (2)

Niet the Dark Absol

Reputation: 324790

You would probably be better off making the quantifier ungreedy. Try this regex:

/Profile">([\s\S]*?)(?=<)/
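([\s\S]*?)">
To see why the quantifier matters, here is a minimal sketch comparing the greedy and ungreedy forms. The sample string `html` is made up for illustration and only stands in for whatever `ShowContent.innerHTML` actually returns on the real page.

```javascript
// Hypothetical sample input; the real page's markup will differ.
var html = '<a href="#" title="Profile">\n Some text\n </a><p>More <b>markup</b></p>';

// Greedy: [\s\S]* consumes to the end of the string, then backtracks
// only as far as the LAST "<", so trailing HTML ends up in the match.
var greedy = /Profile">([\s\S]*)(?=<)/.exec(html);

// Ungreedy: [\s\S]*? expands one character at a time and stops
// at the FIRST "<" that follows Profile">.
var lazy = /Profile">([\s\S]*?)(?=<)/.exec(html);

// exec returns null when nothing matches, so guard before indexing.
var greedyText = greedy ? greedy[1] : '';
var lazyText = lazy ? lazy[1].trim() : '';

console.log(JSON.stringify(greedyText));
console.log(JSON.stringify(lazyText));
```

With this sample, the greedy capture runs through `</a>` and the following tags, while the ungreedy capture contains only the link text plus surrounding whitespace, which `trim()` removes.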

Upvotes: 0

Rob W

Reputation: 349222

When you're certain that the supplied string does not contain possible pitfalls (e.g. <input value='Profile">'>), replace [\s\S]* with [^<]* (anything but a <):

var patt3 = /Profile">([^<]*)/;
var getName = patt3.exec(ShowContent);
getName = getName ? getName[1] : ''; // If no match has been found -> empty string

alert(getName);

(I also replaced GetName with getName, because a variable name starting with a capital letter usually indicates a constructor. Stick to the conventions, and do not start non-constructor names with a capital.)
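The same approach can be wrapped in a small function so the null check and whitespace trimming live in one place. This is only a sketch: the name `extractProfileText` and the sample strings are assumptions for illustration, not part of the original code.

```javascript
// Hypothetical helper: returns the text that follows Profile"> up to
// (but not including) the next "<", or '' when there is no match.
function extractProfileText(html) {
    var match = /Profile">([^<]*)/.exec(html);
    // match[0] is the whole match (including Profile">);
    // match[1] is the captured text only.
    return match ? match[1].trim() : '';
}

// Made-up inputs modelled on the snippet in the question.
var found = extractProfileText('...Profile">\n Some text\n </a>...');
var missing = extractProfileText('<p>no marker here</p>');

console.log(JSON.stringify(found));
console.log(JSON.stringify(missing));
```

Using the capture group rather than the whole match also removes the need to strip `Profile">` afterwards.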

Upvotes: 2
