Reputation: 424
I was reading a book and came across this line: "The HTML parser doesn't know about your JavaScript code; it treats it like any other text". So if we write:
<script type="text/javascript">
alert("first");
var string = "</script>";
</script>
we get an error, because the "</script>" inside the string works as a closing tag for the HTML parser. The script element is closed at that point and then executed, which gives:
Uncaught SyntaxError: Unexpected token ILLEGAL
Even the first alert() is not executed, and I don't know why. But my main question is: if "</script>" is treated as a tag, then when we write something like this:
var str = "<h1> hello world </h1>";
why doesn't this render "hello world" on screen? According to the previous example, the HTML parser should treat that string as HTML tags as well, but it does not. Can anyone explain this to me? Sorry for my bad English :(
Upvotes: 5
Views: 550
Reputation: 424
I think I got the answer. According to https://www.w3.org/TR/html4/types.html#type-cdata:
Although the STYLE and SCRIPT elements use CDATA for their data model, for these elements, CDATA must be handled differently by user agents. Markup and entities must be treated as raw text and passed to the application as is. The first occurrence of the character sequence "</" (end-tag open delimiter) is treated as terminating the end of the element's content. In valid documents, this would be the end tag for the element.
So the text of style and script elements uses CDATA for its data model, and that text is passed directly to the application (the JavaScript interpreter for scripts and, I think, the layout engine for CSS).
The first occurrence of the character sequence "</" is treated as terminating the element's content.
So when we write:
var string = "</script>";
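the "</" sequence is treated as the end of the element, and the text content before it ( var string = " ) is passed to the JavaScript interpreter. That string literal is not terminated (the closing " is missing), so the interpreter reports an error, and the remaining "; is treated as plain text. Because the script as a whole fails to parse, none of it runs, which is why even the first alert() is not executed. To solve this, since the specification says the "</" sequence is the terminator, we can write: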
var string = "<\/script>";
Here the HTML parser does not understand JavaScript, so it does not interpret the escape sequence; to the HTML parser, <, \ and / are just three separate characters. There are many other variations that break up the "</" sequence, for example:
var str = "< /script>";
(Notice the space between < and /. I don't know whether this is allowed by the standard, but it works. The resulting string contains that space, though.)
var str = "<" + "/script>";
There is also one more thing to remember:
var str = "</scr" + "ipt>";
also works, because according to:
https://stackoverflow.com/a/236106/3810909
In practice browsers only end parsing a CDATA script block on an actual </script> close-tag.
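To make this concrete, here is a minimal sketch in plain JavaScript. The helper scriptTextSeenByEngine is a made-up name, not a browser API, and it only approximates the behaviour quoted above by assuming the script content ends at the first real "</script" close tag; real parsers handle more cases than this.

```javascript
// Hypothetical helper (not a browser API): returns the part of `html`
// that would be handed to the JavaScript engine, assuming the script
// content ends at the first "</script" followed by whitespace, "/" or ">".
function scriptTextSeenByEngine(html) {
  const open = "<script>";
  const start = html.indexOf(open) + open.length;
  const rest = html.slice(start);
  const end = rest.search(/<\/script[\s\/>]/i);
  return end === -1 ? rest : rest.slice(0, end);
}

// Built by concatenation so this code could itself be inlined in a page.
const closeTag = "<" + "/script>";

// The problematic page and the page using the escaped form.
const broken = '<script>var string = "' + closeTag + '";' + closeTag;
const fixed = '<script>var string = "<\\/script>";' + closeTag;

console.log(scriptTextSeenByEngine(broken)); // var string = "
console.log(scriptTextSeenByEngine(fixed));  // var string = "<\/script>";

// Inside a JavaScript string literal "\/" is just "/", so the workarounds
// yield the intended value, except the variant with a space.
console.log("<\/script>" === closeTag);     // true
console.log("</scr" + "ipt>" === closeTag); // true
console.log("< /script>" === closeTag);     // false, the space is kept
```

In the first case the engine receives an unterminated string literal, which is where the syntax error comes from. In the second case the whole statement reaches the engine intact.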
Thanks, and sorry for my weak English.
Upvotes: 2
Reputation: 6808
You should understand how browsers work and how HTML and JavaScript are processed. Here is a good read: How browsers work.
The text below is from that link.
The tokenization algorithm
The algorithm's output is an HTML token. The algorithm is expressed as a state machine. Each state consumes one or more characters of the input stream and updates the next state according to those characters. The decision is influenced by the current tokenization state and by the tree construction state. This means the same consumed character will yield different results for the correct next state, depending on the current state. The algorithm is too complex to describe fully, so let's see a simple example that will help us understand the principle.
Basic example - tokenizing the following HTML:
<html>
<body>
Hello world
</body>
</html>
The initial state is the "Data state". When the "<" character is encountered, the state is changed to "Tag open state". Consuming an "a-z" character causes creation of a "Start tag token", and the state is changed to "Tag name state". We stay in this state until the ">" character is consumed. Each character is appended to the new token name. In our case the created token is an "html" token. When the ">" tag is reached, the current token is emitted and the state changes back to the "Data state". The "<body>" tag will be treated by the same steps. So far the "html" and "body" tags were emitted. We are now back at the "Data state". Consuming the "H" character of "Hello world" will cause creation and emitting of a character token; this goes on until the "<" of "</body>" is reached. We will emit a character token for each character of "Hello world". We are now back at the "Tag open state". Consuming the next input "/" will cause creation of an "end tag token" and a move to the "Tag name state". Again we stay in this state until we reach ">". Then the new tag token will be emitted and we go back to the "Data state". The "</html>" input will be treated like the previous case.
The same applies to the </script> tag too. That's how it works.
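The walkthrough quoted above can be sketched as a tiny state machine. This is only an illustration covering the three states it names ("Data", "Tag open", "Tag name"); the function name tokenize and the token shapes are made up for this example, and attributes, comments and the script-specific states of a real tokenizer are left out. The input is written on one line so that no whitespace character tokens appear between the tags.

```javascript
// Simplified tokenizer: handles only plain start tags, end tags and text.
function tokenize(input) {
  const tokens = [];
  let state = "data";
  let current = null; // tag token currently being built

  for (const ch of input) {
    switch (state) {
      case "data":
        if (ch === "<") {
          state = "tagOpen";
        } else {
          // Every other character is emitted as its own character token.
          tokens.push({ type: "character", data: ch });
        }
        break;
      case "tagOpen":
        if (ch === "/") {
          current = { type: "endTag", name: "" };
        } else {
          current = { type: "startTag", name: ch.toLowerCase() };
        }
        state = "tagName";
        break;
      case "tagName":
        if (ch === ">") {
          tokens.push(current); // emit the finished tag token
          current = null;
          state = "data";
        } else {
          current.name += ch.toLowerCase();
        }
        break;
    }
  }
  return tokens;
}

const tokens = tokenize("<html><body>Hello world</body></html>");

const tags = tokens
  .filter((t) => t.type !== "character")
  .map((t) => (t.type === "endTag" ? "/" : "") + t.name)
  .join(" ");
console.log(tags); // html body /body /html

const text = tokens
  .filter((t) => t.type === "character")
  .map((t) => t.data)
  .join("");
console.log(text); // Hello world
```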
Upvotes: 2