Do you operate or use a free website hosting service? Would you like to help those who do?
Last night I taught my teenage niece Denika how to write web pages. Wanting to be a good influence, I taught her to write hand-coded valid XHTML, to use stylesheets, and to use the W3C HTML Validation Service.
Imagine our surprise when we uploaded Denika's page to a free hosting service, only to find that the advertising markup inserted by the host caused an otherwise compliant XHTML Strict document to fail validation!
When I got home I reported the problem to the www-validator list in "Free hosting services append invalid HTML ads". And today I wrote the Free Hosting Service HTML Validation Test Page.
What I'd like you to do is copy my page (it is under the GNU Free Documentation License) to a free web hosting service of your choice. You can find an extensive list of free hosts at http://www.freewebspace.net/. Please notify me when you've posted a copy, and I will add it to the list of test pages.
Then click the validation link in your new copy of the page, and the W3C will validate the page that contains the link. (Try clicking the link here, to see how this article validates!) There is another link that will validate the original document off the server at LinuxQuality.
The page itself contains what I hope is a clear explanation of why the page was placed at a hosting service's site, why it's important for everyone's markup to validate, and how readers can get started in learning to make their own markup compliant with W3 Consortium standards. It also has instructions that anyone coming across the page can use to place a copy at some other free hosting site.
Now you may point out that you get what you pay for, so people who care that their documents validate can pay for commercial hosting. But many people get their start in web publishing by using the free hosting services, and many people cannot afford to pay for hosting, such as the poor, students, and people from developing countries. Also, the free hosts offer a measure of anonymity to people who might be endangered by what they publish, such as people working for political change or to advance human rights.
Here is the first test page that I have placed:
You will see that it doesn't validate.
I'm also a big fan of writing validating pages, but one has to acknowledge that the W3C design of XML was pretty dumb in deciding that there can be only one root element in a document. If more than one root element were allowed, one could append or prepend data to a document and still have it be valid. And don't even get me started on the & bit. Note that this is also why XML is bad for log files, which one only appends to.
XML-related stuff is batting 0.000 with me tonight. First I have complaints about RSS 1.0 and now this.
So yes, providers that don't respect end-to-end are evil. But I need to let off some steam :)
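The single-root complaint above is easy to check. Here is a minimal sketch using Python's standard `xml.etree.ElementTree` parser. The `<log>` and `<entry>` element names are made up for illustration, and the check covers well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as one well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A log kept as a single XML document: one root wrapping every entry.
# (Hypothetical element names, for illustration only.)
log = "<log><entry>started</entry></log>"

# A blind append, the way a log writer or an ad-inserting host would do it.
appended = log + "<entry>stopped</entry>"

# A bare ampersand, as in a typical query-string URL, is also rejected.
bare_ampersand = '<log><entry url="http://example.com/?a=1&b=2"/></log>'

print(is_well_formed(log))             # True
print(is_well_formed(appended))        # False: content after the root element
print(is_well_formed(bare_ampersand))  # False: & must be written as &amp;
```

To append safely, a writer would have to reopen the file and insert the new entry before the closing `</log>` tag, which is exactly the extra work the comment is grumbling about.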
...sometimes these free hosting services don't adjust Content-Length to compensate for the added header.
For example, I was attempting to download a file (a steplist for a Dance Dance Revolution simulator) from a netfirms.com website. The file stopped halfway through the first set of steps when I tried to load it in Mozilla or use wget. Eventually I found out that I could use wget's --ignore-length option to get the whole file; without it, wget stopped once it retrieved Content-Length bytes.
The author didn't realize there was a problem, since M$IE ignores the Content-Length header (and therefore was able to retrieve the whole file), but Mozilla and wget actually obey the standard (and are therefore faced with hassles whenever hosting services skimp).
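The mismatch is easy to reproduce without a network. The sketch below is a simulation, not a capture of what netfirms.com actually sent: the banner bytes, the file contents, and the two `fetch_*` helper names are all assumptions. It models a host that inserts extra bytes ahead of the file but still advertises the original file's length.

```python
import io


def fetch_strict(stream: io.BytesIO, content_length: int) -> bytes:
    """Read exactly content_length bytes and stop, as a length-obeying client does."""
    return stream.read(content_length)


def fetch_ignoring_length(stream: io.BytesIO) -> bytes:
    """Read until the connection closes, ignoring the declared length."""
    return stream.read()


# Hypothetical file contents and host-inserted banner.
original = b"step 1\nstep 2\nstep 3\n"
banner = b"[ad]\n"

# The host sends banner + file but declares only the file's own length.
declared_length = len(original)
sent = banner + original

strict = fetch_strict(io.BytesIO(sent), declared_length)
lenient = fetch_ignoring_length(io.BytesIO(sent))

print(strict)   # cut short: the tail of the file is lost
print(lenient)  # banner plus the complete file
```

The strict client loses exactly as many bytes from the end of the file as the host inserted. That matches the report above: wget needed `--ignore-length` to get everything.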
The question boils down to, how do we get people to follow standards?
GeoCities is especially bad. They have this ad-square thing which they claim is less obtrusive than a regular banner ad. Some would disagree, but anyway, that is one of the ways that Yahoo tries to differentiate their service.
Trouble is, adsquare is made with DHTML. Not only is this appended to the end of the page after the "</html>"; it also uses scripts that don't work on Gecko-based browsers, so it ends up not only malformed, but also ugly and non-functional.
Tripod does a little better than some: they look for your "<body>" tag, rewrite it, and insert their banner right after that. The content that they insert still isn't valid XHTML, though.
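The difference between the two strategies can be sketched in a few lines. Everything here is hypothetical: the banner markup, the `ads.example` URL, and the helper names are invented, and the real GeoCities and Tripod insertions are more elaborate. The check is for XML well-formedness only, which is a weaker condition than XHTML validity.

```python
import re
import xml.etree.ElementTree as ET

PAGE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Test</title></head>"
    "<body><p>Hello</p></body>"
    "</html>"
)

# Hypothetical banners. The sloppy one has a bare & and an unclosed <img>.
SLOPPY_BANNER = '<a href="http://ads.example/?id=1&size=2"><img src="ad.gif"></a>'
CLEAN_BANNER = '<a href="http://ads.example/?id=1&amp;size=2"><img src="ad.gif" alt="ad" /></a>'


def is_well_formed(text: str) -> bool:
    """Return True if text parses as one well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def append_after_html(page: str, banner: str) -> str:
    """Append the banner after the closing </html> tag."""
    return page + banner


def insert_after_body(page: str, banner: str) -> str:
    """Insert the banner right after the opening <body> tag."""
    return re.sub(r"(<body[^>]*>)", lambda m: m.group(1) + banner, page, count=1)


print(is_well_formed(append_after_html(PAGE, CLEAN_BANNER)))   # False
print(is_well_formed(insert_after_body(PAGE, SLOPPY_BANNER)))  # False
print(is_well_formed(insert_after_body(PAGE, CLEAN_BANNER)))   # True
```

Appending after `</html>` cannot produce a well-formed document no matter how clean the banner is. Inserting inside `<body>` can, provided the host's own markup is well-formed. A well-formed result may still fail XHTML Strict validation, so this is a necessary check rather than a sufficient one.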