user2064000

Getting too many matches for regex

I wrote a regex tester in JS. However, it appears that for some regexes, I get multiple matches.

For example, for the content hello, world, the regex hello.* is reported to match hello, world. However, if the regex is changed to (hello|goodbye).*, the reported matches are hello, world and hello, whereas it should be hello, world only.

<!DOCTYPE html>
<html>
    <head>
        <title>Regex tester</title>
        <meta http-equiv="content-type" content="text/html; charset=UTF-8">
    </head>
    <body>
        <script type="text/javascript">
            function resetform() {
                document.getElementById("results").innerHTML = "";
            }

            function escapetags(str) {
                // Use global regexes so every occurrence is escaped, not just the first.
                return str.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
            }

            function check() {
                if (!document.form1.re.value) {
                    document.getElementById("results").innerHTML = '<p style="color:red"><b>Error: No regular expression specified</b></p>';
                    return;
                }
                if (!document.form1.str.value) {
                    document.getElementById("results").innerHTML = '<p style="color:red"><b>Error: No content specified</b></p>';
                    return;
                }
                var pattern,
                modifiers = "";
                if (document.form1.nocase.checked) {
                    modifiers = "i";
                }
                if (document.form1.global.checked) {
                    modifiers = modifiers + "g";
                }
                try {
                    if (modifiers) {
                        pattern = new RegExp(document.form1.re.value, modifiers);
                    } else {
                        pattern = new RegExp(document.form1.re.value);
                    }
                } catch (excpt) {
                    document.getElementById("results").innerHTML = '<p style="color:red"><b>Error: Invalid regular expression</b></p>';
                    return;
                }
                var matches = pattern.exec(document.form1.str.value);
                if (matches == null) {
                    document.getElementById("results").innerHTML = '<p><b>Regular expression did not match with content</b></p>';
                } else {
                    document.getElementById("results").innerHTML = '<p><b>Regular expression matched with content</b></p><p>Matches:</p>';
                    for (var index = 0; index < matches.length; index++) {
                        document.getElementById("results").innerHTML += escapetags(matches[index]) + '<br>';
                    }
                }
            }
        </script>
        <h1>Regex tester</h1>
        <form name="form1">
            <p>Regex:</p>
            <input type="text" name="re" size="65"><br>
            <input type="checkbox" name="nocase">Case insensitive
            <input type="checkbox" name="global">Global
            <p>Content:</p>
            <textarea name="str" rows="8" cols="65"></textarea><br><br>
            <input type="button" value="Check" onclick="check();">
            <input type="button" value="Reset" onclick="reset();resetform();">
        </form>
        <div id="results"></div>
    </body>
</html>

Can anyone help me find the problem in my code?

Thanks in advance.

Upvotes: 0

Views: 310

Answers (3)

amrinder007

Reputation: 1475

I think you want something like this:

var a = new RegExp("hello, world"); // or your own pattern
var b = "hello, world";
if (a.test(b)) {
    // the pattern matched: do your stuff
} else {
    // the pattern did not match: do your stuff
}

test() only reports whether the given pattern matches.

Upvotes: 0

Matt Burland

Reputation: 45135

The .exec() method of a JavaScript regex returns an array with the entire matched string as the first element, followed by any captured groups as subsequent elements. When you use the regex:

(hello|goodbye).*

The brackets define a capture group, so your returned array will be

[0] = hello, world
[1] = hello

As Loamhoof suggests below, you can add ?: at the start of the group to make it non-capturing if you don't want that extra element.
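Here is a minimal sketch of the difference, using the input from the question. The variable names are only illustrative:

```javascript
// Input string from the question.
var text = "hello, world";

// Capturing group: exec() returns the whole match plus one entry per group.
var capturing = /(hello|goodbye).*/.exec(text);
// capturing[0] is the whole match, capturing[1] is the first captured group.

// Non-capturing group (?:...): only the whole match is returned.
var nonCapturing = /(?:hello|goodbye).*/.exec(text);

console.log(capturing.length, nonCapturing.length);
```

The loop in the question prints every element of the returned array, which is why the captured group shows up as a second "match".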

Upvotes: 1

Loamhoof

Reputation: 8293

"(hello|goodbye).* then the reported matches are hello, world and hello"

No, the second "match" is just the result of your capturing group (what's between the parentheses). Ignore it, or make the group non-capturing: (?:hello|goodbye)
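If you'd rather ignore the groups in the tester itself, something along these lines might work. It is only a sketch: wholeMatches is a hypothetical helper, not part of the question's code, and it assumes you want each whole match listed once, including repeated matches when the Global box is ticked.

```javascript
// Hypothetical helper: collect only whole matches (element 0 of each
// exec() result) and skip the captured groups.
function wholeMatches(pattern, text) {
    var results = [];
    var match;
    if (!pattern.global) {
        // Without the g flag exec() does not advance, so call it once.
        match = pattern.exec(text);
        if (match !== null) {
            results.push(match[0]);
        }
        return results;
    }
    // With the g flag, each exec() call resumes from pattern.lastIndex.
    pattern.lastIndex = 0;
    while ((match = pattern.exec(text)) !== null) {
        results.push(match[0]);
        // Step forward manually after an empty match to avoid looping forever.
        if (match[0].length === 0) {
            pattern.lastIndex++;
        }
    }
    return results;
}

console.log(wholeMatches(/(hello|goodbye).*/, "hello, world"));
```

In check(), you could then loop over the array returned by wholeMatches(pattern, document.form1.str.value) instead of over the raw exec() result.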

Upvotes: 4
