Reputation: 16367
I am allowing my users to wrap words with "*", "/", "_", and "-" as a shorthand way to indicate they'd like to bold, italicize, underline, or strike through their text. Unfortunately, when the page is filled with text using this markup, I'm seeing a noticeable (borderline acceptable) slowdown.
Here's the JavaScript I wrote to handle this task. Can you please provide feedback on how I could speed things up?
function handleContentFormatting(content) {
content = handleLineBreaks(content);
var bold_object = {'regex': /\*(.|\n)+?\*/i, 'open': '<b>', 'close': '</b>'};
var italic_object = {'regex': /\/(?!\D>|>)(.|\n)+?\//i, 'open': '<i>', 'close': '</i>'};
var underline_object = {'regex': /\_(.|\n)+?\_/i, 'open': '<u>', 'close': '</u>'};
var strikethrough_object = {'regex': /\-(.|\n)+?\-/i, 'open': '<del>', 'close': '</del>'};
var format_objects = [bold_object, italic_object, underline_object, strikethrough_object];
for( obj in format_objects ) {
content = handleTextFormatIndicators(content, format_objects[obj]);
}
return content;
}
//@param obj --- an object with 3 properties:
// 1.) the regex to search with
// 2.) the opening HTML tag that will replace the opening format indicator
// 3.) the closing HTML tag that will replace the closing format indicator
function handleTextFormatIndicators(content, obj) {
while(content.search(obj.regex) > -1) {
var matches = content.match(obj.regex);
if( matches && matches.length > 0) {
var new_segment = obj.open + matches[0].slice(1,matches[0].length-1) + obj.close;
content = content.replace(matches[0],new_segment);
}
}
return content;
}
Upvotes: 1
Views: 278
Reputation: 45589
Add the g flag to your regexes (so the flags become /ig) and remove the while loop.
Replace your for(obj in format_objects) loop with a normal for loop, because format_objects is an array.
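To illustrate the first point, here is a minimal sketch. The sample string and variable names are made up for illustration and are not from your code. It compares a single replace call with and without the g flag:

```javascript
// Hypothetical sample input, not taken from the question.
var sample = 'some *bold* text and *more bold* text';

// Without the g flag only the first match is replaced,
// which is why the original code had to loop.
var once = sample.replace(/\*([^*]+)\*/, '<b>$1</b>');

// With the g flag every match is replaced in a single call.
var all = sample.replace(/\*([^*]+)\*/g, '<b>$1</b>');

console.log(once);
console.log(all);
```

The second call does the work of the whole while loop, without re-scanning the string from the start after every replacement.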
Okay, I took the time to write an even faster and simplified solution, based on your code:
function handleContentFormatting(content) {
content = handleLineBreaks(content);
var bold_object = {'regex': /\*([^*]+)\*/ig, 'replace': '<b>$1</b>'},
italic_object = {'regex': /\/(?!\D>|>)([^\/]+)\//ig, 'replace': '<i>$1</i>'},
underline_object = {'regex': /\_([^_]+)\_/ig, 'replace': '<u>$1</u>'},
strikethrough_object = {'regex': /\-([^-]+)\-/ig, 'replace': '<del>$1</del>'};
var format_objects = [bold_object, italic_object, underline_object, strikethrough_object],
i = 0, foObjSize = format_objects.length;
for( i; i < foObjSize; i++ ) {
content = handleTextFormatIndicators(content, format_objects[i]);
}
return content;
}
//@param obj --- an object with 2 properties:
// 1.) the regex to search with
// 2.) the replace string
function handleTextFormatIndicators(content, obj) {
return content.replace(obj.regex, obj.replace);
}
This will work with both nested and non-nested formatting boundaries. You can omit the function handleTextFormatIndicators altogether if you want to, and do the replacements inline inside handleContentFormatting.
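As a rough sketch of that inline variant: formatInline is a hypothetical name, and the handleLineBreaks step is left out here because its implementation isn't shown in the question.

```javascript
// Sketch only: formatInline is a hypothetical name, and the
// handleLineBreaks call from the question is omitted on purpose.
function formatInline(content) {
  var formats = [
    { regex: /\*([^*]+)\*/g, replace: '<b>$1</b>' },
    { regex: /\/(?!\D>|>)([^\/]+)\//g, replace: '<i>$1</i>' },
    { regex: /_([^_]+)_/g, replace: '<u>$1</u>' },
    { regex: /-([^-]+)-/g, replace: '<del>$1</del>' }
  ];
  // One global replace per format, applied in order.
  for (var i = 0; i < formats.length; i++) {
    content = content.replace(formats[i].regex, formats[i].replace);
  }
  return content;
}

console.log(formatInline('*bold /italic/ text*'));
```

Keep in mind that these patterns treat every "-" and "/" as a marker, so ordinary hyphenated words or URLs in the text would also be picked up.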
Upvotes: 1
Reputation: 33928
You can do things like:
function formatText(text){
return text.replace(
/\*([^*]*)\*|\/([^\/]*)\/|_([^_]*)_|-([^-]*)-/gi,
function(m, tb, ti, tu, ts){
if(typeof(tb) != 'undefined')
return '<b>' + formatText(tb) + '</b>';
if(typeof(ti) != 'undefined')
return '<i>' + formatText(ti) + '</i>';
if(typeof(tu) != 'undefined')
return '<u>' + formatText(tu) + '</u>';
if(typeof(ts) != 'undefined')
return '<del>' + formatText(ts) + '</del>';
return 'ERR('+m+')';
}
);
}
This will work fine on nested tags, but not on overlapping tags, which are invalid anyway.
Example at http://jsfiddle.net/m5Rju/
Upvotes: 1
Reputation: 413976
Your code is forcing the browser to do a whole lot of repeated, wasted work. The approach you should be taking is this:
As to how to combine the regular expressions, well, it's not very pretty in JavaScript but it looks like this. First, you need a regex for a string of zero or more "uninteresting" characters. That should be the first capturing group in the regex. Next should be the alternates for the target strings you're looking for. Thus the general form is:
var tokenizer = /(uninteresting pattern)?(?:(target 1)|(target 2)|(target 3)| ... )?/;
When you match that against the source string, you'll get back a result array that will contain the following:
result[0] - entire chunk of string (not used)
result[1] - run of uninteresting characters
result[2] - either an instance of target type 1, or undefined
result[3] - either an instance of target type 2, or undefined
...
Thus you'll know which kind of replacement target you saw by checking which of the target groups is non-empty. (Note that in your case the targets can conceivably overlap; if you intend for that to work, then I suspect you'll have to approach this as a full-blown parsing problem.)
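Here is one way the tokenizer idea could look for your four markers. This is only a sketch under some assumptions: formatTokens is a hypothetical name, the input is assumed to contain no HTML yet (so the lookahead from your italic regex is dropped), and nested or overlapping markers are not handled.

```javascript
// Sketch only: formatTokens is a hypothetical name.
// Group 1 is the run of "uninteresting" characters (no marker characters);
// groups 2-5 capture the text inside *...*, /.../, _..._ and -...-.
function formatTokens(text) {
  var tokenizer = /([^*\/_-]+)?(?:\*([^*]+)\*|\/([^\/]+)\/|_([^_]+)_|-([^-]+)-)?/g;
  return text.replace(tokenizer, function (match, plain, bold, italic, underline, strike) {
    // Groups that did not take part in the match are undefined.
    var out = plain || '';
    if (bold !== undefined) {
      out += '<b>' + bold + '</b>';
    } else if (italic !== undefined) {
      out += '<i>' + italic + '</i>';
    } else if (underline !== undefined) {
      out += '<u>' + underline + '</u>';
    } else if (strike !== undefined) {
      out += '<del>' + strike + '</del>';
    }
    return out;
  });
}

console.log(formatTokens('plain *bold* and /it/'));
```

The string is walked once from left to right, and a stray marker with no partner is left in place because it only ever produces an empty match.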
Upvotes: 1