Reputation: 7853
We have a TinyMCE script on one of our pages that allows users to paste text segments from Word into it. We've noticed that when pasting from Word documents, some additional, unwanted CSS-like code is prepended in the text area. It looks like this:
@font-face
{
font-family: "Arial";
}
@font-face
{
font-family: "Cambria Math";
}
@font-face
{
font-family: "Cambria";
}
p.MsoNormal, li.MsoNormal, div.MsoNormal
{
margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Arial;
}
strong { }
.MsoChpDefault
{
font-size: 10pt;
font-family: Cambria;
}
div.WordSection1
{
page: WordSection1;
}
We currently have a PHP script that uses a regular expression to delete this data before it is saved. However, we want this data deleted on paste, so that the user never sees it.
I've added the following regular expression to the onPaste plugin of TinyMCE:
/@font(.*)\{(.*)\}/i
However, it doesn't delete anything. If I remove the last escaped bracket \}, it removes parts of the code but not the whole thing. So the expression seems to be in the right place, but it does not seem to be formed correctly.
Basically, I'm looking for a valid JavaScript regular expression that will delete everything from @font to the last curly bracket }.
Upvotes: 2
Views: 579
Reputation: 905
I agree with Sean Kinsey, but you also need to account for line breaks: in JavaScript, . does not match newlines or carriage returns. I would use [\s\S] instead of . to capture those characters as well. Here is an example that you can try out on jsbin or another dynamic JavaScript tester:
// An array of lines of the css code.
var cssCode = [];
cssCode.push('@font-face');
cssCode.push('{');
cssCode.push(' font-family: "Arial";');
cssCode.push('}');
cssCode.push('@font-face');
cssCode.push('{');
cssCode.push(' font-family: "Cambria Math";');
cssCode.push('}');
// A string with new line characters separating each line in the array.
cssCode = cssCode.join("\n");
// Show the matches.
alert(cssCode.match(/@font[\s\S]*?{[\s\S]*?}/g));
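To delete the matches rather than list them, the same lazy pattern can be passed to replace. This is only a sketch: the sample string is invented, the braces are escaped for safety, and the trailing \s* is an addition that swallows the line break left behind by each block. Note that it removes only blocks that start with @font, so rules such as p.MsoNormal would remain.

```javascript
// Invented sample: two @font-face blocks followed by the user's text.
var pasted = '@font-face\n{\nfont-family: "Arial";\n}\n' +
             '@font-face\n{\nfont-family: "Cambria";\n}\n' +
             'Hello';

// Lazy quantifiers stop at the first closing brace, so each block is
// removed on its own; the g flag repeats the replacement for every block.
var cleaned = pasted.replace(/@font[\s\S]*?\{[\s\S]*?\}\s*/g, '');

console.log(cleaned); // Hello
```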
Upvotes: 0
Reputation: 50205
The dot (.) in a JavaScript RegExp matches all characters EXCEPT line breaks. JavaScript originally had no s flag to make the dot match line breaks (one was added in ES2018, so older browsers do not support it). The workaround is the character set [\s\S], which matches any whitespace character (including line breaks) and any non-whitespace character. Therefore the following RegExp will delete everything from @font to the last curly bracket }:
yourText = yourText.replace(/@font[\s\S]*\{[\s\S]*\}/i,'');
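As a rough check, the sketch below runs that expression on a shortened version of the pasted styles from the question. The helper name stripWordCss and the trailing "Hello from Word" text are made up for illustration. Because both quantifiers are greedy, everything between the first @font and the last } in the whole text is removed, so this assumes the user's own text contains no } after the style block.

```javascript
// Hypothetical helper: deletes everything from the first "@font"
// up to and including the last "}" in the text.
function stripWordCss(text) {
  return text.replace(/@font[\s\S]*\{[\s\S]*\}/i, '');
}

// Shortened copy of the Word styles from the question, plus invented user text.
var pasted = [
  '@font-face',
  '{',
  'font-family: "Arial";',
  '}',
  'p.MsoNormal, li.MsoNormal, div.MsoNormal',
  '{',
  'margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Arial;',
  '}',
  'div.WordSection1',
  '{',
  'page: WordSection1;',
  '}',
  'Hello from Word'
].join('\n');

// trim() drops the line break that followed the last closing brace.
var cleaned = stripWordCss(pasted).trim();

console.log(cleaned); // Hello from Word
```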
Upvotes: 3
Reputation: 38046
This works just fine
"@font-face {...}".match(/@font.*?{.*?}/g);
["@font-face {...}"]
It is important to use the ? as the * is a greedy quantifier. Not doing so would cause a single match to occur, starting with the first @font and ending with the last }.
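The difference is easy to see on a small input with two blocks. The sample string below is invented for illustration and kept on one line, since . does not cross line breaks; the braces are escaped here only to be safe.

```javascript
// Invented single-line sample with two @font-face blocks.
var css = '@font-face { font-family: "Arial"; } @font-face { font-family: "Cambria"; }';

// Lazy quantifiers: each block is matched separately.
var lazy = css.match(/@font.*?\{.*?\}/g);
console.log(lazy.length); // 2

// Greedy quantifiers: one match from the first @font to the last }.
var greedy = css.match(/@font.*\{.*\}/g);
console.log(greedy.length); // 1
```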
Upvotes: 0