Reputation: 14955
I want to replace HTML tags and newline characters with a <br> tag. To do so, I have used the following code, but it does not replace \r\n.
const newText = text
  .replace(/<script.*?<\/script>/g, '<br>')
  .replace(/<style.*?<\/style>/g, '<br>')
  .replace(/(<([^>]+)>)/ig, "<br>")
  .replace(/(?:\r\n|\r|\n)/g, '<br>')
<div class="text-danger ng-binding" ng-bind-html="message.causedBy ">javax.xml.ws.soap.SOAPFaultException: Response was of unexpected text/html ContentType. Incoming portion of HTML stream: \r\n\r\n\r\n\r\n500 - Internal server error.\r\n\r\n\r\n\r\n<div><h1>Server Error</h1></div>\r\n<div>\r\n <div class="\"content-container\"">\r\n <h2>500 - Internal server error.</h2>\r\n <h3>There is a problem with the resource you are looking for, and it cannot be displayed.</h3>\r\n </div>\r\n</div>\r\n\r\n\r\n\n\t</div>
I would appreciate your help. (:
Upvotes: 0
Views: 4314
Reputation: 29109
This works for me. Is each '\r' in your CRLFs a single escaped character, or two characters, '\' and 'r'?
If you have HTML elements with characters \n and \r, they are literal, and that would be really odd inside a div unless you are displaying source code. Plain ol' line breaks will end up as expected with a single escape character.
Also, it's not clear whether your source is pulled from an element or is static text.
You might have to escape the backslashes in your regex to handle the literal case:
replace(/(?:\\r\\n|\\r|\\n)/g, '<br>')
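To check which case applies, it may help to compare the two kinds of input side by side. The sketch below is only an illustration: the names `real`, `literal` and `toBr` are made up for this example, and it assumes the text is an ordinary JavaScript string.

```javascript
// Real line breaks: the JS parser turns \r\n into the CR and LF characters.
const real = "line1\r\nline2";
// Literal text: a backslash followed by 'r', then a backslash followed by 'n'.
const literal = "line1\\r\\nline2";

// Hypothetical helper: handles both literal "\r\n" text and real line breaks.
function toBr(s) {
  return s
    .replace(/\\r\\n|\\r|\\n/g, "<br>") // literal backslash sequences
    .replace(/\r\n|\r|\n/g, "<br>");    // real CR/LF characters
}

console.log(real.length);          // 12
console.log(literal.length);       // 14
console.log(/\r\n/.test(literal)); // false
console.log(toBr(real));           // line1<br>line2
console.log(toBr(literal));        // line1<br>line2
```

Be aware that the literal branch also rewrites any other backslash-n or backslash-r in the text (a Windows path such as C:\new, for instance), so it is probably best enabled only when the input is known to contain escaped line breaks.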
const text = `
<div class="text-danger ng-binding" ng-bind-html="message.causedBy ">javax.xml.ws.soap.SOAPFaultException: Response was of unexpected text/html ContentType. Incoming portion of HTML stream: \r\n\r\n\r\n\r\n500 - Internal server error.\r\n\r\n\r\n\r\n<div><h1>Server Error</h1></div>\r\n<div>\r\n <div class="\"content-container\"">\r\n <h2>500 - Internal server error.</h2>\r\n <h3>There is a problem with the resource you are looking for, and it cannot be displayed.</h3>\r\n </div>\r\n</div>\r\n\r\n\r\n\n\t</div>`
const newText = text
  .replace(/<script.*?<\/script>/g, '<br>')
  .replace(/<style.*?<\/style>/g, '<br>')
  .replace(/(<([^>]+)>)/ig, "<br>")
  .replace(/(?:\r\n|\r|\n)/g, '<br>')
  //.replace(/(?:\\r\\n|\\r|\\n)/g, '<br>')
console.log(newText)
const text2 = document.getElementById('text').innerHTML
const newText2 = text2
  .replace(/<script.*?<\/script>/g, '<br>')
  .replace(/<style.*?<\/style>/g, '<br>')
  .replace(/(<([^>]+)>)/ig, "<br>")
  .replace(/(?:\r\n|\r|\n)/g, '<br>')
  //.replace(/(?:\\r\\n|\\r|\\n)/g, '<br>')
console.log(newText2)
<div id='text'>
This
is
<script>// nothing here </script>
a
div
These are literal \r\n\r\n and will not get escaped unless you uncomment the special case.
</div>
Upvotes: 1
Reputation: 324690
You can't parse [X]HTML with regex. Because HTML can't be parsed by regex. Regex is not a tool that can be used to correctly parse HTML.
Instead, you have a parser at your fingertips. Use it!
var tmp = document.createElement('div');
tmp.innerHTML = text;
// replace all start/end tags with <br> for... some reason, I guess!
Array.from(tmp.getElementsByTagName("*")).forEach(function(elem) {
  // ignore <br> tags
  if (elem.nodeName.match(/^br$/i)) {
    // do nothing
  }
  // replace <script> and <style>, contents included, with a single <br>
  else if (elem.nodeName.match(/^(?:script|style)$/i)) {
    elem.parentNode.replaceChild(document.createElement('br'), elem);
  }
  // replace element with its contents and place a <br> before and after
  else {
    elem.parentNode.insertBefore(document.createElement('br'), elem);
    while (elem.firstChild) {
      elem.parentNode.insertBefore(elem.firstChild, elem);
    }
    elem.parentNode.replaceChild(document.createElement('br'), elem);
  }
});
var html = tmp.innerHTML;
// since replacing newlines with <br> is a string operation, go ahead and use regex for that
// (the g flag makes it replace every newline, not just the first one)
html = html.replace(/\r?\n/g, "<br />");
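For the final string step, keep in mind that `replace` with a regular expression substitutes only the first match unless the pattern carries the `g` flag. A small sketch with a made-up sample string (the variable names are illustrative only) shows the difference:

```javascript
// Sample input with one CRLF and one bare LF.
const sample = "a\r\nb\nc";

// Without the g flag, only the first line break is replaced.
const firstOnly = sample.replace(/\r?\n/, "<br />");
// With the g flag, every line break is replaced.
const all = sample.replace(/\r?\n/g, "<br />");

// JSON.stringify makes any remaining newline visible as \n.
console.log(JSON.stringify(firstOnly)); // "a<br />b\nc"
console.log(JSON.stringify(all));       // "a<br />b<br />c"
```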
Upvotes: 1
Reputation: 37367
Just replace everything that matches the pattern (<[^>]+>|\r|\n) with an empty string.
It is a simple alternation, where \r is a carriage return and \n is a newline character (so it removes all line breaks, which are sometimes combinations of \r and \n).
<[^>]+> matches an HTML tag, that is, everything from a < up to the next >.
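As a rough sketch of this pattern in use (the sample string and variable names are made up for illustration): with an empty replacement the order of the alternatives does not matter, but if the replacement is <br>, matching \r and \n separately produces two <br> tags for each CRLF. Listing \r\n first, as in the last variant, avoids that.

```javascript
const input = "<p>one</p>\r\n<p>two</p>";

// The pattern as given, replaced with an empty string.
const stripped = input.replace(/(<[^>]+>|\r|\n)/g, "");

// Same pattern with <br> as the replacement: the CRLF yields two <br> tags.
const naive = input.replace(/(<[^>]+>|\r|\n)/g, "<br>");

// Variant that treats \r\n as one unit, so a CRLF yields a single <br>.
const withBr = input.replace(/<[^>]+>|\r\n|\r|\n/g, "<br>");

console.log(stripped); // onetwo
console.log(naive);    // <br>one<br><br><br><br>two<br>
console.log(withBr);   // <br>one<br><br><br>two<br>
```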
Upvotes: 0