Re: In any case...
... why should anybody be publishing malformed HTML? Haven't they heard of HTML Tidy?
Yes, but most distributions carry a very old version that croaks horribly on the most common things naïve writers throw at it: broken HTML input (AKA "tag soup") and HTML5 tags :|
Getting from "shitty tagsoup" to "well-formed HTML" is harder than it looks at first ;)
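
To make that concrete, here is a rough sketch of the "obvious" fix: push open tags on a stack and close whatever is left open. It is not what HTML Tidy does internally; it only leans on Python's standard html.parser, and the names NaiveBalancer, balance and VOID are made up for the illustration. It also silently drops comments and doctypes and would wrongly escape script/style bodies, which is already a hint at how much is missing.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag (a subset, chosen for this sketch).
VOID = {"br", "hr", "img", "input", "link", "meta"}


class NaiveBalancer(HTMLParser):
    """Hypothetical helper: re-emit markup with every open element closed."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # pieces of the repaired markup
        self.stack = []  # names of elements that are currently open

    def handle_starttag(self, tag, attrs):
        rendered = ""
        for name, value in attrs:
            # Valueless attributes arrive as None; emit them as empty strings.
            rendered += ' {}="{}"'.format(name, escape(value or "", quote=True))
        self.out.append("<{}{}>".format(tag, rendered))
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag with no matching start tag: drop it
        # Close everything that was opened after the matching start tag.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append("</{}>".format(open_tag))
            if open_tag == tag:
                break

    def handle_data(self, data):
        # Character references were decoded by the parser, so re-escape text.
        self.out.append(escape(data, quote=False))

    def finish(self):
        self.close()  # flush any text the parser is still buffering
        while self.stack:
            self.out.append("</{}>".format(self.stack.pop()))
        return "".join(self.out)


def balance(markup):
    parser = NaiveBalancer()
    parser.feed(markup)
    return parser.finish()


if __name__ == "__main__":
    print(balance("<p><b>bold <i>both</b> text"))
    print(balance("<p>one<p>two"))
```

The output is balanced, but that is about all you can say for it. For "<p>one<p>two" it nests the second paragraph inside the first, whereas the HTML5 parsing rules close the first paragraph implicitly, and for the misnested <b>/<i> case a browser would reopen the <i> after closing the <b>. Balanced tags are the easy part; producing the document the author meant is where it gets hard.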