Reputation: 3139
I'm looking into HTML5 and I'm puzzled why it goes so easy on well-formedness.
<div id="main">
<DIV ID="main">
<DIV id=main>
are all valid and produce the same result. I thought with XHTML we moved to XML compliant code at no cost (I don't count closing tags as a cost!). Now the HTML5 spec looks to be written by lazy coders and/or anarchists. The result is that from the start of HTML5 we have two versions: HTML5 and the XML compliant XHTML5. Would you consider it an asset if C suddenly allowed you to write a for construct in the following ways?
for(i = 0; i < 10; i++) {
for(i = o; i < 1o; i++) { // you can use "o" instead of "0"
for(i = 0, i < 10, i++) { // commas instead of semicolons are alright!
Frankly, as an XHTML coder for many moons, I feel a bit insulted by the HTML5 spec.
Wadya think?
Steven
edit:
Mind the "wadya": would you as a customer accept a letter with "wadya" written instead of "What do you"? :-)
Upvotes: 3
Views: 2814
Reputation: 499002
HTML 5 is not an XML dialect like XHTML is.
What made HTML so popular was the fact that it tolerated mistakes, so just about anyone could write an HTML page.
XHTML made it much more difficult, and it didn't get widely adopted. At the same time, further development of HTML/XHTML stagnated, so an industry group, the WHATWG, formed, started work on the next generation of HTML, and decided to revert to a non-XML standard for HTML 5.
Since XML is stricter than HTML, you can always write your HTML to be XML compliant: keep element and attribute names in lower case, quote attribute values, close every element, and use correct XML escaping where needed.
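As a rough illustration of that checklist, here is a minimal sketch in Python that asks the standard library's XML parser whether a fragment is well-formed. The helper name `is_well_formed_xml` and the sample fragments are made up for this example. Note that it only checks XML well-formedness, not XHTML validity, so an upper-case `<DIV>` with quoted attributes and a matching close tag still passes.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(markup: str) -> bool:
    """Return True if the fragment parses as a well-formed XML document.

    Hypothetical helper for illustration; it says nothing about whether
    the element names are valid XHTML.
    """
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


samples = [
    '<div id="main"></div>',   # quoted attribute, closed element
    '<DIV ID="main"></DIV>',   # well-formed XML, but not lower-case XHTML names
    '<DIV id=main></DIV>',     # unquoted attribute value
    '<div id="main">',         # missing closing tag
    '<p>a &amp; b<br/></p>',   # escaped ampersand, self-closed br
    '<p>a & b<br></p>',        # bare ampersand and unclosed br
]

for sample in samples:
    print(f"{sample!r}: well-formed XML = {is_well_formed_xml(sample)}")
```

Markup that passes this check and also sticks to lower-case names can be served as HTML or as XML.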
Upvotes: 7
Reputation: 1199
HTML was never intended to convey media, and therefore never intended for any kind of marketing or merchandising. HTML was only intended to convey text and to provide some sort of descriptive structure upon the text it was describing. The people originally using HTML were professors and scientists who needed the ability to describe their communications in more depth than line breaks and quotes would allow. In other words, HTML was only intended to be a document storage mechanism. Keep in mind there were no web browsers at this time.
HTML was first made popular with the release of the web browsers. Initially web browsers were just text parsers that provided a handy GUI for navigating the hyperlinking between documents, but this changed almost immediately. At this time there was still no actual standard for HTML. There was the list of tags and a description of mechanisms initially created for HTML, which, along with an understanding of SGML, was all that was required to create an HTML parser.
With web browsers came the immediate demand to extend HTML in ways HTML never intended. It was at this point that the inventors and original users completely lost control of the web. Tags were added, such as center and font, and tables became the primary mechanism for laying things out on a page instead of describing data. Web browsers supplied a media demand completely orthogonal to the intentions of HTML. Marketing people, being what they are, care very much for the appearance and expressive nature of communications and don't give a crap for the technology which makes such communication possible. As a result parsers became more lax to accommodate the incompetent. You have to understand that HTML was already lax because there were no standard parsing rules and SGML, due to being so very obtuse, encourages a lax nature outside of parsing instruction tags.
It's not that these early technology pioneers were stupid, although it's easy to argue the contrary; they simply had other priorities. When the web went mainstream there was an immediate obsession to conquer specific business niches in this new medium. All costs were driven towards marketing, market share, traffic acquisition, and brand awareness. Many web businesses operate today with similar agendas, but today's web is not a fair comparison. In the 90s marketing was all that mattered and technology costs were absolutely ignored. The problem was so widespread and the surge of investment so grand that it completely defied all rational rules of economics. This is why there was an implosion. The only web businesses that survived this crash were those that confronted their technology costs up front or those that channeled investment monies into technology expenses as opposed to additional marketing expense.
http://en.wikipedia.org/wiki/Dot-com_bubble
After the crash things changed. Consider the crash good timing, because although it was entirely driven by bad business decisions, foolish investments, and irrational economics, there were positive technology developments going on behind the scenes. The founders of the web were completely aware that they had lost all control of their technology. They sought to solve this problem and set things straight by creating the World Wide Web Consortium (W3C). They invited experts and software companies to participate. Although solving many of the technology problems introduced to the web by marketing-driven motivations was a lost cause, many future problems could be avoided if the language were implemented in accordance with an agreed-upon standard. It was during this time that HTML 2 (the first standard form of HTML), HTML 3, and HTML 4 were written.
At the same time the W3C also began work on XML, which was never intended to be an HTML replacement. XML was created because SGML was too complex. A simple syntax based upon similar rules was needed. XML was immediately written off by marketing people and was immediately praised by data evangelists at Microsoft and IBM. Because the holy wars around XML were trivial, insignificant, and short-lived compared to the problems plaguing HTML, XML's development occurred at rocket speed. Almost immediately after XML was formed the first version of XML Schema was formed.
XML Schema was an extraordinary work that most people either choose to ignore or take for granted. An abstraction model for accessing the structure of HTML was also standardized based upon XML Schema, known as the Document Object Model (DOM). It is important to note that the DOM was initially developed by browser vendors to provide an API for JavaScript to access HTML, but the standard DOM released by the W3C had nothing to do with JavaScript directly. It quickly became obvious that many of the technology problems plaguing HTML could be solved by creating an XML compliant form of HTML. This is called XHTML. Unfortunately, the path of adoption from HTML to XHTML was introduced in a confused manner that is still not widely understood years after clarification finally occurred.
So, there was a crash, and leading up to this period of economic collapse there were some fantastic technology developments. The ultimate source of technology corruption, the web browsers, were finally just starting to innovate around adoption of the many fantastic technology solutions dreamed up at the W3C, but with the crash came an almost complete loss of development motivation from the browser vendors. At this time there was only really Netscape, IE, and Opera. Opera was not free software, so it was never widely adopted, and Netscape went under. This essentially left only IE, and Microsoft pulled all their developers off IE. Years later development on IE would be revived when competition arose from Firefox and when Opera adopted free licensing.
About the same time that browsers were coming back to life the W3C was moving forward with development of XHTML2. XHTML2 was an ambitious project and was not related to XHTML1, which created much confusion. The W3C was attempting to solve technology problems associated with HTML that had been allowed to fester for too long, and their intentions were valid and solid. Unfortunately, there was some contention in the XHTML2 working group. The failed communication on how and why to transition from HTML to XHTML, combined with the unrelated nature of XHTML2 and its infighting, made people worry.
The marketing interference that allowed the web to crash regressed with the web crash, but it did not die. It was reviving during this period as well. Let's not forget that marketing motivations don't give a damn about technology concerns. Marketing motivations are about instant gratification. All flavors of XHTML, especially XHTML2, were an abomination to instant gratification. XHTML2 would eventually be killed before it ever became a finished standard. This fear and disgust led to the establishment of a separate standards body whose interests were aligned with moving HTML forward in the nature of instant gratification silliness. This new group would call itself WHATWG and would carry the marketing torch forward.
The WHATWG was united because their motivations were simple, even if their visions of the technology were ambitious: essentially to make it easier for developers to make things pretty and interactive, and to reduce complexity around media integration. The WHATWG was also successful because the web had been contracting since the crash. There were fewer major players around and each had a specific set of priorities that were more and more in alignment.
The web is a media channel and its primary business is advertising. Web businesses that make money from advertising tend to be significantly larger than web businesses that make money from goods or services. As a result the priorities of the web would eventually become the priorities of media and advertising distribution. For instance, why did JavaScript become much faster in the browser? The answer is because Google, an advertising company, made it a priority to release a web browser that was significantly faster at processing JavaScript. To compete, other browsers would need to become 20 to 30 times faster to keep up. This is important because JavaScript is the primary means by which advertisement metrics are measured, which is the basis of Google's revenue.
Since HTML5 is a marketing-friendly specification it allows a lax syntax. Browser vendors are economically justified to spend more money writing more complex parsing mechanisms against sloppy markup, because it allows more rapid development by which media is published so as to allow deeper penetration of advertising. This is economically qualified because all of the 5 major web browsers available now are primarily funded from advertising revenue. Unfortunately, this is nothing but cost for anybody else who wants to write a parser, and it is limiting or harmful to any later interpretation of structured data. The result is a lack of regard for the technology and the rise of hidden costs with limits upon technology innovation within the given medium.
This is why HTML syntax continues to be shit. The only solution is to propose an alternate and technologically superior communication medium that technologically emphasizes a decentralization of contracting market concerns.
Upvotes: 4
Reputation: 579
For natural parsing the quotes aren't necessary in the first place.
Regarding case, HTML elements are reserved regardless of case; for example, you can't define your own DiV or Div.
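To see the quoting and case points concretely, here is a small sketch using Python's standard `html.parser` module. It is a lenient tokenizer, not a conforming HTML5 parser, and the class `StartTagCollector` and helper `first_start_tag` are names invented for this example. It feeds the three spellings from the question through the parser and compares what comes out.

```python
from html.parser import HTMLParser


class StartTagCollector(HTMLParser):
    """Record every start tag the parser reports (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        # html.parser hands over tag and attribute names already lower-cased,
        # and attribute values with any surrounding quotes removed.
        self.start_tags.append((tag, attrs))


def first_start_tag(markup):
    """Return (tag, attrs) for the first start tag found in the markup."""
    collector = StartTagCollector()
    collector.feed(markup)
    collector.close()
    return collector.start_tags[0]


variants = ['<div id="main">', '<DIV ID="main">', '<DIV id=main>']
parsed = [first_start_tag(v) for v in variants]

for variant, result in zip(variants, parsed):
    print(f"{variant!r} -> {result}")

# All three spellings reduce to the same tag name and attribute list.
print("identical:", all(p == parsed[0] for p in parsed))
```

A real browser does considerably more error recovery than this, but the normalization of case and quoting is the same idea.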
HTML is a markup language where speed and simplicity are greater priorities than consistency.
While arguable, this matters greatly to search engines; documents with quoted attributes and any kind of error are very expensive to process. It's funny: the quoted example in the HTML docs has 'be evil' in quotes, as if to say that not using quotes is not being evil.
Upvotes: 1
Reputation: 17002
Honestly, your question answers itself. "We have two different specs." Each spec addresses a different level of conformance, and they do so for a reason. As much as we might loathe the notion of "backwards compatibility," it's a burden we have to bear, and HTML5 is far better at maintaining it than XHTML5 will ever be.
Upvotes: 0
Reputation: 943537
Better that the spec allows it than that it forbids it, everyone does it anyway, and browsers have to error-correct.
XHTML never really took off, not least because MSIE never supported it (pretending it is HTML by sending a text/html content type notwithstanding).
Upvotes: 1