Jiman Sahariah

Reputation: 110

JavaScript regex: split by semicolon except in &amp;

I need to split a string on semicolons, but &amp; must remain intact in the result.

Example:

"Rajagiri School of Engineering &amp; Technology;Indian School of Mines University;"

Output should be:

['Rajagiri School of Engineering &amp; Technology', 'Indian School of Mines University']

Upvotes: 0

Views: 4244

Answers (2)

Toan Nguyen

Reputation: 936

If you want to match everything outside of "real" semicolons:

(?:&amp;|[^;])+

would work. Or (?:&\w+;|[^;])+ if more than just &amp; entities are to be expected.
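As a minimal sketch of the matching approach in JavaScript, assuming the input from the question, the more general pattern can be used with match() and the global flag. The `|| []` fallback is an addition here, since match() returns null when nothing matches.

```javascript
var data = "Rajagiri School of Engineering &amp; Technology;Indian School of Mines University;";

// Each match is a run of entities ("&name;") or non-semicolon characters.
// The entity alternative is listed first so "&amp;" is consumed as a unit.
var result = data.match(/(?:&\w+;|[^;])+/g) || [];

console.log(result);
```

Because this collects the pieces instead of splitting, the trailing semicolon does not produce an empty element.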

If your regex engine supports split operations and lookbehind, you can instead split on semicolons that are not preceded by &amp:

(?<!&amp);

To also allow other entities, as above, (?<!&\w+); can be used if your regex implementation supports unbounded repetition inside lookbehind assertions. Many don't; .NET does, and so do JavaScript engines that implement ES2018 lookbehind.
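A minimal sketch of the variable-length lookbehind, assuming a JavaScript engine with ES2018 lookbehind support. The sample string is made up for illustration and is not from the question.

```javascript
// Hypothetical input containing two different named entities.
var data = "Tom &amp; Jerry;A &lt; B;";

// Split on ";" only when it is not directly preceded by "&" plus word characters.
// filter(Boolean) drops the empty string left by the trailing semicolon.
var result = data.split(/(?<!&\w+);/).filter(Boolean);

console.log(result);
```

One caveat of this pattern: a plain "&" followed directly by letters and a semicolon (for example "AT&T;") also counts as an entity, so no split happens there.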

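In JavaScript:

var data = "Rajagiri School of Engineering &amp; Technology;Indian School of Mines University;";
// Pass a RegExp, not a string: split() matches a string separator literally.
var regex = /(?<!&amp);/;
// filter(Boolean) drops the empty string produced by the trailing semicolon.
var result = data.split(regex).filter(Boolean);
console.log(result);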

<p id="demo">Click the button to change the text in this paragraph.</p>
<button onclick="myFunction()">Try it</button>

<script>
  function myFunction() {
    var data = "Rajagiri School of Engineering &amp; Technology;Indian School of Mines University;";
    // A RegExp literal is required; a string separator would be matched literally.
    var regex = /(?<!&amp);/;
    var result = data.split(regex).filter(Boolean);
    document.getElementById("demo").innerHTML = result;
  }
</script>
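Note that String.prototype.split only treats its separator as a pattern when it is a RegExp object; a string separator is matched literally. A small sketch of the difference, using a shortened sample string that is an assumption for illustration:

```javascript
var data = "a&amp;b;c";

// String separator: the text "(?<!&amp);" is searched for literally and never found.
var literal = data.split("(?<!&amp);");

// RegExp separator: the lookbehind is interpreted as a pattern.
var pattern = data.split(/(?<!&amp);/);

console.log(literal); // one element: the whole input, unsplit
console.log(pattern); // two elements: "a&amp;b" and "c"
```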

Upvotes: 1

anubhava

Reputation: 786031

You can use replace with a callback that keeps the semicolon when it is preceded by &amp and turns every other semicolon into a newline, then split on the newline.

var str = "Rajagiri School of Engineering &amp; Technology;Indian School of Mines University;";

var arr = str.replace(/(&amp)?;/g, function($0, $1) {
  // Keep "&amp;" intact; turn every other semicolon into a newline separator.
  return $1 === "&amp" ? $1 + ";" : "\n";
}).split("\n").filter(Boolean);

Output:

["Rajagiri School of Engineering &amp; Technology",
 "Indian School of Mines University"]
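The same idea can be wrapped in a small function that needs no lookbehind and also handles other named entities. This is only a sketch: the name splitOutsideEntities is hypothetical, and it assumes the input itself contains no newline characters, since "\n" serves as the temporary separator.

```javascript
// Hypothetical helper, not part of the original answer.
function splitOutsideEntities(str) {
  return str
    .replace(/(&\w+)?;/g, function (match, entity) {
      // If the semicolon closes "&name", return the match unchanged;
      // otherwise replace the bare semicolon with the "\n" sentinel.
      return entity ? match : "\n";
    })
    .split("\n")
    .filter(Boolean); // drop the empty piece left by a trailing semicolon
}

var parts = splitOutsideEntities(
  "Rajagiri School of Engineering &amp; Technology;Indian School of Mines University;"
);
console.log(parts);
```

Numeric entities such as &#38; are not covered, because \w does not match "#".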

Upvotes: 1
