Reputation: 727
How do I tell if an HTML string contains content (text, images, video tags, etc.) and not just tags (for instance, an empty table, empty divs, spaces, &nbsp;, etc.)?
I need to be able to do this in JavaScript, in the browser, and it needs to support IE8. I've come to the conclusion that parsing the HTML is the best way to go about this. If there is another way that could work, I would be interested in that as well. Regex is not acceptable.
Critically, I need this to not run JavaScript while it is checking. Things like <script>alert(1)</script> and <img src=x onerror=alert(1)/> should not alert. This has been the major sticking point for IE8. IE9 has document.implementation.createHTMLDocument, and IE10 and later have DOMParser for HTML, neither of which will run JS, but I can't find a solution for IE8.
I think the best thing to find would be a JavaScript-based HTML parser, but all of the ones I have looked at are for Node or do not support IE8.
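For reference, once an inert parsed document is available by whatever means, the content check itself could be sketched as a plain tree walk. This is only a sketch under assumptions: the names hasContent, hasVisibleText and CONTENT_TAGS are made up for illustration, and the list of tags that count as content is a guess that would need adjusting. The regex is used only to test a text node for non-whitespace characters, not to parse HTML, and the code sticks to ES3 features so it should run in IE8.

```javascript
// Hypothetical set of elements that count as content even with no text inside.
var CONTENT_TAGS = {
  IMG: true, VIDEO: true, AUDIO: true, IFRAME: true,
  OBJECT: true, EMBED: true, CANVAS: true, SVG: true
};

// True if the string has any character other than whitespace or U+00A0 (nbsp).
// U+00A0 is listed explicitly because old IE does not include it in \s.
function hasVisibleText(text) {
  return /[^\s\u00A0]/.test(text || "");
}

// Recursively checks a DOM-like node (nodeType, nodeName, nodeValue, childNodes).
function hasContent(node) {
  if (node.nodeType === 3 || node.nodeType === 4) { // text or CDATA node
    return hasVisibleText(node.nodeValue);
  }
  if (node.nodeType !== 1 && node.nodeType !== 9 && node.nodeType !== 11) {
    return false; // comments, processing instructions, etc.
  }
  var name = (node.nodeName || "").toUpperCase();
  if (name === "SCRIPT" || name === "STYLE") {
    return false; // script and style text is not visible content
  }
  if (CONTENT_TAGS[name] === true) {
    return true;
  }
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    if (hasContent(children[i])) {
      return true;
    }
  }
  return false;
}
```

It would be called on the root of the parsed document, for example hasContent(doc.body) or hasContent(xmlDocument.documentElement).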
Upvotes: 0
Views: 189
Reputation: 4024
You can use this to parse an HTML string in IE8:
var xmlDocument = new ActiveXObject('Microsoft.XMLDOM');
xmlDocument.async = false;
xmlDocument.loadXML(str);
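A slightly more defensive sketch of the same call is below. Keep in mind that loadXML is an XML parser, so a string that is not well-formed XML (unquoted attributes, unclosed tags, or &nbsp; without a declaration) ends up reported through parseError instead of being parsed. The function name parseWithXmlDom and the createDoc parameter are hypothetical, added only so the function can be exercised outside IE; loadXML, async and parseError.errorCode are the MSXML members.

```javascript
// Parses str with MSXML and returns the document, or null if str is not
// well-formed XML. createDoc is a hypothetical hook for supplying the
// document object; by default the ActiveX XML document is created.
function parseWithXmlDom(str, createDoc) {
  var doc = createDoc ? createDoc() : new ActiveXObject("Microsoft.XMLDOM");
  doc.async = false;          // load synchronously
  var ok = doc.loadXML(str);  // returns false when parsing fails
  if (!ok || (doc.parseError && doc.parseError.errorCode !== 0)) {
    return null;
  }
  return doc;
}
```

In IE8 it would simply be called as parseWithXmlDom(str), and a null result means the string could not be parsed as XML.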
To detect the IE version, use this function:
// Returns the version of Windows Internet Explorer, or -1
// (indicating the use of another browser).
function getInternetExplorerVersion() {
    var rv = -1; // Return value assumes failure.
    if (navigator.appName == 'Microsoft Internet Explorer') {
        var ua = navigator.userAgent;
        // Capture the version number that follows "MSIE " in the user agent.
        var match = /MSIE ([0-9]{1,}[.0-9]{0,})/.exec(ua);
        if (match != null) {
            rv = parseFloat(match[1]);
        }
    }
    return rv;
}
and usage:
var ver = getInternetExplorerVersion();
if (ver > -1) {
    // Compare with ===; the original "ver = 8.0" assigned 8 and was always true.
    if (ver === 8.0) {
        var xmlDocument = new ActiveXObject('Microsoft.XMLDOM');
        xmlDocument.async = false;
        xmlDocument.loadXML(str);
    }
}
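As an alternative to sniffing the user agent, the choice could be keyed on what the browser actually exposes. The sketch below is only an illustration: pickParser and its env argument are hypothetical names, and the returned strings are just labels. It prefers document.implementation.createHTMLDocument (mentioned in the question as available from IE9) and falls back to the ActiveX XML DOM when only that is present.

```javascript
// Returns a label for the inert parsing strategy that looks available.
// env is a hypothetical stand-in for the browser's global object (window).
function pickParser(env) {
  var impl = env.document && env.document.implementation;
  if (impl && impl.createHTMLDocument) {
    return "createHTMLDocument"; // IE9 and later, and other current browsers
  }
  if (typeof env.ActiveXObject !== "undefined") {
    return "xmldom";             // older IE: fall back to Microsoft.XMLDOM
  }
  return "none";
}
```

In a page this would be called as pickParser(window), and the result decides which parsing branch to run.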
Upvotes: 1