Reputation: 3759
I am facing a performance issue when calling a method that replaces an innerHTML text using a regular expression:
function getReplacedText(textToReplace) {
return textToReplace.replace(/\<img src=[\"|\']([\S\s]+\\)*([\S\s]+).png[\"|\']\/\>/i,"*$2*");
}
The objective of this replacement is to retrieve the innerHTML of a contentEditable div in a keyup handler function and replace each img tag with the name of its file. The replacement is needed to determine whether the resulting text exceeds the maximum length allowed for the editable div.
function keyupHandler(event) {
var myEditableDiv = document.getElementById("editableDiv");
const currentText = getReplacedText(myEditableDiv.innerHTML);
if (currentText.length >= 750) { //750 is the max length
event.preventDefault();
}
}
For example, the desired output for abc <img src="assets\test\1F619.png"> def would be abc *1F619* def.
When I don't call getReplacedText, I don't have any performance problem. Could you please suggest a better approach or a better use of the regular expression?
This is an example of the text to replace when performance begins to degrade:
dsd<img src="assets\test\1F619.png"/><img src="assets\test\1F619.png"/><img src="assets\test\1F629.png"/><img src="assets\test\1F630.png"/>sdfsdfsdffsdf<img src="assets\test\1F629.png"/>sdfsdsdfsdf<img src="assets\test\1F627.png"/><img src="assets\test\1F631.png"/>sdfsdfsdf<img src="assets\test\1F631.png"/>sdfsdfsdf<img src="assets\test\1F632.png"/>sdfsdfs<img src="assets\test\1F629.png"/><img src="assets\test\1F629.png"/>sdfs<img src="assets\test\1F631.png"/>df<img src="assets\test\1F632.png"/>sdfsdfsdf
Upvotes: 0
Views: 153
Reputation:
You don't need a DOM to parse HTML tags.
A regular expression is the fastest way to do it, and the one below won't choke on possibly malformed HTML.
Find
/<img(?=\s)(?=(?:[^>"']|"[^"]*"|'[^']*')*?\ssrc\s*=\s*(?:(['"])(?:(?!\1)[\S\s])*?((?:(?!\1|\\)[\S\s])*?)\.png\s*\1))\s+(?:"[\S\s]*?"|'[\S\s]*?'|[^>]?)+>/
Replace *$2*
https://regex101.com/r/bCYXV1/1
Explained
 # Begin 'img' tag
 < img
 (?= \s )
 (?=                                # Assertion (a pseudo atomic group)
      (?: [^>"'] | " [^"]* " | ' [^']* ' )*?
      \s src \s* = \s*              # src attribute
      (?:
           ( ['"] )                 # (1), Quote
           (?:
                (?! \1 )
                [\S\s]
           )*?
           (                        # (2 start)
                (?:
                     (?! \1 | \\ )
                     [\S\s]
                )*?
           )                        # (2 end)
           \.png                    # find the 'png' file
           \s*
           \1
      )
 )
 # Have the png file, just match the rest of tag
 \s+
 (?: " [\S\s]*? " | ' [\S\s]*? ' | [^>]? )+
 >                                  # End img tag
var input = "dsd<img src=\"assets\\test\\1F619.png\"><img src=\"assets\\test\\1F619.png\"><img src=\"assets\\test\\1F629.png\"><img src=\"assets\\test\\1F630.png\">sdfsdfsdffsdf<img src=\"assets\\test\\1F629.png\">sdfsdsdfsdf<img src=\"assets\\test\\1F627.png\"><img src=\"assets\\test\\1F631.png\">sdfsdfsdf<img src=\"assets\\test\\1F631.png\">sdfsdfsdf<img src=\"assets\\test\\1F632.png\">sdfsdfs<img src=\"assets\\test\\1F629.png\"><img src=\"assets\\test\\1F629.png\">sdfs<img src=\"assets\\test\\1F631.png\">df<img src=\"assets\\test\\1F632.png\">sdfsdfsdf";
console.log(input.replace(/<img(?=\s)(?=(?:[^>"']|"[^"]*"|'[^']*')*?\ssrc\s*=\s*(?:(['"])(?:(?!\1)[\S\s])*?((?:(?!\1|\\)[\S\s])*?)\.png\s*\1))\s+(?:"[\S\s]*?"|'[\S\s]*?'|[^>]?)+>/g
,"\n*$2*"));
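To tie this back to the question, the same pattern could be dropped into getReplacedText, with the g flag so that every tag is replaced and `*$2*` as the replacement. This is only a minimal sketch: exceedsMaxLength and IMG_PNG_TAG are hypothetical names, and the 750 limit is the one quoted in the question.

```javascript
// The pattern from this answer, with the g flag so every <img> tag is replaced.
const IMG_PNG_TAG = /<img(?=\s)(?=(?:[^>"']|"[^"]*"|'[^']*')*?\ssrc\s*=\s*(?:(['"])(?:(?!\1)[\S\s])*?((?:(?!\1|\\)[\S\s])*?)\.png\s*\1))\s+(?:"[\S\s]*?"|'[\S\s]*?'|[^>]?)+>/g;

// Same name as the question's function: each matching tag becomes *name*.
function getReplacedText(textToReplace) {
  return textToReplace.replace(IMG_PNG_TAG, '*$2*');
}

// Hypothetical helper; 750 is the max length mentioned in the question.
function exceedsMaxLength(html, maxLength = 750) {
  return getReplacedText(html).length >= maxLength;
}

console.log(getReplacedText('abc <img src="assets\\test\\1F619.png"> def'));
// → abc *1F619* def
```

In the keyup handler, exceedsMaxLength(myEditableDiv.innerHTML) would then stand in for the inline length comparison.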
Upvotes: 1
Reputation: 371138
Avoid using regular expressions to parse HTML. Use DOMParser instead: find the <img> tags, and replace each of them with a text node containing only the last part of the src:
const input = String.raw`dsd<img src="assets\test\1F619.png"><img src="assets\test\1F619.png"><img src="assets\test\1F629.png"><img src="assets\test\1F630.png">sdfsdfsdffsdf<img src="assets\test\1F629.png">sdfsdsdfsdf<img src="assets\test\1F627.png"><img src="assets\test\1F631.png">sdfsdfsdf<img src="assets\test\1F631.png">sdfsdfsdf<img src="assets\test\1F632.png">sdfsdfs<img src="assets\test\1F629.png"><img src="assets\test\1F629.png">sdfs<img src="assets\test\1F631.png">df<img src="assets\test\1F632.png">sdfsdfsdf`;
const doc = new DOMParser().parseFromString(input, 'text/html');
doc.querySelectorAll('img[src]').forEach((img) => {
img.replaceWith(' ' + img.src.match(/[^\/]+(?=\.png$)/)[0] + ' ');
});
console.log(doc.body.innerHTML);
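Note that img.src returns the resolved URL rather than the attribute as written. A possible variant, sketched here under a few assumptions, reads the raw attribute with getAttribute('src') and extracts the name with a small pure helper. The names pngBaseName and getReplacedTextViaDom are hypothetical, the `*` wrapping follows the output format requested in the question, and DOMParser is assumed to be available (a browser environment).

```javascript
// Hypothetical helper: returns the file name without its directory or ".png" extension.
// Both "/" and "\" are treated as separators, since the question's src values use backslashes.
function pngBaseName(src) {
  const lastSegment = src.split(/[\\/]/).pop();
  return lastSegment.replace(/\.png$/i, '');
}

// Hypothetical DOM-based counterpart of the question's getReplacedText.
// Only works where DOMParser exists, e.g. in a browser.
function getReplacedTextViaDom(html) {
  const doc = new DOMParser().parseFromString(html, 'text/html');
  doc.querySelectorAll('img[src]').forEach((img) => {
    // getAttribute returns the attribute exactly as written in the markup
    img.replaceWith('*' + pngBaseName(img.getAttribute('src')) + '*');
  });
  return doc.body.innerHTML;
}
```

Unlike the match(...)[0] call above, the helper does not throw when a src does not end in .png; it simply returns the last path segment unchanged.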
Upvotes: 2