cube

Reputation: 1802

REGEXP in .split()

Ok, I am trying to find this DOM pattern:

 <div>
    <br>
    </div>

from my contenteditable div which typically looks like this, with multiple spans:

<div id="edit" contenteditable="true">
    <span>text</span>   
    <span>text</span> 
    //and more spans maybe 
     <div>
        <br>
     </div>
</div>

The line of code that I am using is:

return string.split(/\r\n?|\n|<div>(.*?)<br>(.*?)<\/div>,gis/);

The problem is this portion of the regex: <div>(.*?)<br>(.*?)<\/div>,gis. It never matches, even though the pattern exists. For clarity's sake, the return runs in a loop across the input text, triggered by the input change event on my contenteditable div. I need an array version of the text, delimited everywhere the pattern occurs. No library for this, please.

Upvotes: 0

Views: 133

Answers (4)

Benjamin Gruenbaum

Reputation: 276306

Here is a solution that does not involve any external library and is easy to understand.

For starters, let's grab the edit div itself:

var $edit = document.getElementById("edit");

Now, we create a small function to iterate through our DOM. There are plenty of ways to do this; here is the way Douglas Crockford did it in his book "JavaScript: The Good Parts", iirc:

function walkTheDOM(node, func) {
    func(node);
    node = node.firstChild;
    while (node) {
        walkTheDOM(node, func);
        node = node.nextSibling;
    }
}

This function visits node and every one of its descendants in the DOM, and runs func on each of them.

The only thing remaining is to call it on our $edit div from before.

walkTheDOM($edit, function (node) {
    if (node.nodeName.toLowerCase() === "div") { // we got a div
        if (node.innerHTML.trim() === "<br>") { // whose inner HTML is just <br>
            console.log("GOT", node); // log the matching node
        }
    }
});

Here is a fiddle of it all working

After you've done all the work of finding it, you can easily extract whichever text/data you want from the rest of the data. See this question on why parsing HTML with regular expressions is generally a bad idea.
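To get from "finding the div" to the array the question asks for, something along these lines might work. This is only a sketch: isBreakDiv and splitAtBreakDivs are hypothetical names, not part of the code above, and it assumes the delimiter divs are direct children of the editable div, as in the question's markup.

```javascript
// Hypothetical helper: true for a <div> whose only non-whitespace child is a <br>.
function isBreakDiv(node) {
    if (node.nodeType !== 1 || node.nodeName.toLowerCase() !== "div") {
        return false;
    }
    // Ignore whitespace-only text nodes when inspecting the div's children.
    var kids = Array.prototype.filter.call(node.childNodes, function (child) {
        return !(child.nodeType === 3 && child.textContent.trim() === "");
    });
    return kids.length === 1 && kids[0].nodeName.toLowerCase() === "br";
}

// Hypothetical helper: concatenates the text of the container's direct
// children and starts a new chunk at every delimiter div.
function splitAtBreakDivs(container) {
    var chunks = [""];
    Array.prototype.forEach.call(container.childNodes, function (child) {
        if (isBreakDiv(child)) {
            chunks.push(""); // delimiter found: start a new chunk
        } else {
            chunks[chunks.length - 1] += child.textContent;
        }
    });
    return chunks;
}
```

In a browser this would be called as splitAtBreakDivs(document.getElementById("edit")). Whitespace between the spans ends up in the chunks too, so you may want to trim each entry.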

Upvotes: 1

bfavaretto

Reputation: 71918

The flags should go outside:

return string.split(/\r\n?|\n|<div>(.*?)<br>(.*?)<\/div>/gis);

I'm not very good with regex, but that seems too greedy to me also. I believe it will match any div that contains a br, not only the ones that just contain a br. And if they are nested, it should match the outermost one. I'd tackle this problem by traversing the DOM, as suggested in the comments.
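A quick way to see why the original never matched: anything written inside the slashes is part of the pattern, so ",gis" was being searched for literally. The snippet below is a small check of that, using a made-up sample string. The s flag is left out here because it is only available in ES2018 and later engines.

```javascript
// Flags written inside the slashes are ordinary pattern characters.
var misplaced = /<div>(.*?)<br>(.*?)<\/div>,gis/;
var fixed = /<div>(.*?)<br>(.*?)<\/div>/gi;

var sample = "one<div><br></div>two"; // hypothetical input

console.log(misplaced.flags);         // "" (no flags were set)
console.log(misplaced.test(sample));  // false: sample has no literal ",gis"
console.log(misplaced.test("one<div><br></div>,gis")); // true
console.log(fixed.flags);             // "gi"
console.log(fixed.test(sample));      // true
```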

Upvotes: 0

Pochemuchkin

Reputation: 3

1) Regexp flags should be after closing "/"

2) Use [\S\s]* instead of .*

3) "<text" is erroneous html code because "<" should be replaced by "&lt;"
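To illustrate point 2, here is a small sketch with a made-up input where the br sits on its own line, as in the question's markup. It uses the lazy form [\s\S]*? so the match stays as short as possible, which is my choice rather than part of the list above.

```javascript
// Hypothetical input: the <br> sits on its own line inside the <div>.
var html = "a<div>\n  <br>\n</div>b";

var dotPattern = /<div>.*?<br>.*?<\/div>/;           // "." does not match "\n"
var anyPattern = /<div>[\s\S]*?<br>[\s\S]*?<\/div>/; // "[\s\S]" matches any character

console.log(dotPattern.test(html));  // false
console.log(anyPattern.test(html));  // true
console.log(html.split(anyPattern)); // ["a", "b"]
```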

Upvotes: 0

ZachB

Reputation: 15366

I see a few potential issues: (1) You want your flags (gis) outside of the // marks. (2) Your first use of | needs parentheses to match \r, \n or \r\n. You probably don't need these at all though. (3) I'm not sure why you have an alternate here: \n|<div>. (4) s isn't a flag that I'm aware of.

This should do the trick:

/<div>(.*?)<br>(.*?)<\/div>/gi
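One thing to watch when passing a pattern like this to .split(): capturing groups get spliced into the resulting array. The sketch below shows that on a made-up input, next to a variant without capture groups that only allows whitespace around the br. The name delimiter is hypothetical, and the stricter pattern is an assumption about what the asker wants to match.

```javascript
var input = "one<div><br></div>two"; // hypothetical input

// Capturing groups: the (empty) captures are spliced into the result.
console.log(input.split(/<div>(.*?)<br>(.*?)<\/div>/gi));
// ["one", "", "", "two"]

// No capturing groups, and only whitespace allowed around the <br>.
var delimiter = /<div>\s*<br>\s*<\/div>/gi;
console.log(input.split(delimiter));
// ["one", "two"]
console.log("one<div>\n  <br>\n</div>two".split(delimiter));
// ["one", "two"]
```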

Upvotes: 0
