
Speaking of which, I don't understand why not. It seems like it would have been trivial to keep HTML5 true XML. I do not understand what the actual technical reason for not doing so was. Naively, it just seems like breaking compatibility out of disdain rather than making useful progress. Saving a couple of characters every once in a while does not justify the change, so I presume there must be a better reason?


There was XHTML, and HTML5 is a direct result of finding out that it was not the right solution. The main problem being solved was that browsers did not parse invalid plain HTML consistently, which XHTML addressed by requiring invalid XHTML to be rejected outright. This did not work. HTML5 solves it by defining the parsing rules such that, as far as the parser is concerned, there is no concept of a document being invalid: every sequence of bytes deterministically maps to one particular DOM tree. This feature essentially precluded basing HTML5 on either XML (simply impossible) or SGML (which might be possible, but is a redundant formalism; describing the syntax in prose makes more sense, as everybody is going to hand-craft the parser anyway).
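A minimal sketch of that difference, using only Python's standard library. `xml.etree.ElementTree` stands in for a strict XML parser, and `html.parser.HTMLParser` for a tolerant one. Note the assumption: `HTMLParser` is just a forgiving tokenizer, not an implementation of the HTML5 tree-construction rules, so it only illustrates "never reject the input", not the exact DOM a browser would build. The helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Typical tag soup: nothing is closed and a <p> starts inside an open <b>.
TAG_SOUP = "<p>Hello <b>world<p>Second paragraph"


def parse_as_xml(text):
    """Return the root tag name, or None if the text is not well-formed XML."""
    try:
        return ET.fromstring(text).tag
    except ET.ParseError:
        # XML-style handling: a malformed document is rejected outright.
        return None


class TagCollector(HTMLParser):
    """Records every start tag the tolerant tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def parse_leniently(text):
    """Return the start tags seen; malformed markup is not an error here."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.start_tags


print(parse_as_xml(TAG_SOUP))     # strict parser gives up
print(parse_leniently(TAG_SOUP))  # tolerant parser still reports the tags
```

The strict path has exactly two outcomes (a tree or a failure), while the tolerant path always produces something. HTML5 goes one step further than this sketch by specifying precisely what that something must be.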


They specified how tag soup gets forced into HTML5.

They could have just as well defined how tag soup gets forced into XHTML.


I felt XHTML had fairly limited adoption on the web, and in many cases web page authors seem to have preferred the »render tag soup« approach, which in most cases did the intended thing, over having to deal with XML namespaces, proper nesting and escaping, etc. Even so, HTML nowadays mostly seems to be authored as if it were XML, with every element painstakingly closed and often even with elements that need no closing written as self-closing.


Probably because XML would need to be extended quite a bit to accommodate all of the multimedia stuff, attributes without values or quotes, special names for certain characters, optional or disallowed closing tags and whatnot that's in HTML5.
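To make a few of those constructs concrete, here is a rough sketch that feeds some HTML-style snippets and their XHTML-style spellings to Python's standard `xml.etree.ElementTree` parser. The snippets and the helper name `is_well_formed_xml` are illustrative assumptions, not taken from any spec.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(text):
    """True if a stock XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# (description, HTML-style spelling, XHTML-style spelling)
CASES = [
    ("attribute without value", "<input disabled/>", '<input disabled="disabled"/>'),
    ("attribute without quotes", "<a href=x>y</a>", '<a href="x">y</a>'),
    ("named character", "<p>&nbsp;</p>", "<p>&#160;</p>"),
    ("optional closing tag", "<ul><li>one<li>two</ul>", "<ul><li>one</li><li>two</li></ul>"),
    ("disallowed closing tag", "<p>line<br>break</p>", "<p>line<br/>break</p>"),
]

for description, html_style, xhtml_style in CASES:
    print(description, is_well_formed_xml(html_style), is_well_formed_xml(xhtml_style))
```

Each HTML-style spelling is rejected by the plain XML parser, while the XHTML-style one is accepted. The named character fails only because no DTD declaring `nbsp` is loaded here; XHTML documents normally get such names from their DTD.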

I think cramming both the layout design conveniences and the strictness of XML data transfer into the same standard would be quite bulky at best. In practice we'd likely see a lot of nasty security issues in implementations and so on.



