The Great LinTraining Cleanout

Posted 26 Jan 2003 at 16:29 UTC by dyork

Can you help me clean up the LinTraining database? I am looking for some good web-surfing volunteers to assist us in The Great LinTraining Cleanout on Wednesday, January 29, 2003 starting at 8pm Eastern US/Canada time.

The LinTraining site was established as a service to the Linux community by Dave Whitinger and me in 1999. It evolved out of the work I was doing with LPI, where people interested in Linux kept asking me where they could get training. Unfortunately, in the four years of its existence, much of the data has gotten quite stale and needs to be removed. WE NEED YOUR HELP to keep this site a strong resource for people who want Linux training. Help us help others come over into the Linux world!

The Great LinTraining Cleanout - Here is the plan. This coming Wednesday, January 29th, starting at 8pm Eastern US/Canada time, I'll get on the IRC channel #lintraining with as many other folks as can help. What we will do is simply this:

  1. Starting at the top of the country listing of training centers on the front page of LinTraining, each person will take a category (country/province/state), report to the channel that they are taking it, and start going through its listings. Ideally the person taking a category will be able to read the predominant language of its sites, so they can do more than just check whether the URL is alive.

  2. For each listing, the test will be really simple - can we get to the URL listed?
    1. If yes, the URL is accessible, then is there any mention of Linux training on the page referenced or easily found?
      • If yes, do nothing more and go on to the next entry.
      • If no, report it to the IRC channel as 'no Linux ref' with the center's name and location.
    2. If no, the URL is not accessible, report 'dead url' to the IRC channel with the center's name and location.


  3. Once a person has completed a country/state/province, they go on to the next one that no one else is doing and repeat step 2. (I will be tracking the countries in progress.)
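
For anyone who would like to pre-screen a category before the session, the test in step 2 can be roughly sketched in Python. This is only an illustration, not part of the LinTraining site: the function names are made up for the example, and a plain search for the word "linux" is a much cruder check than a person reading the page (it only looks at the page itself, not at pages "easily found" from it).

```python
import http.client
import urllib.request
from typing import Optional


def fetch_text(url: str, timeout: float = 10.0) -> Optional[str]:
    """Return the page body as text, or None if the URL cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are OSError subclasses;
        # ValueError covers malformed URLs.
        return None


def triage(page_text: Optional[str]) -> str:
    """Classify one listing the way step 2 describes.

    page_text is the fetched page, or None if the URL was not accessible.
    """
    if page_text is None:
        return "dead url"
    if "linux" in page_text.lower():
        return "ok"
    return "no Linux ref"


def report_line(name: str, location: str, page_text: Optional[str]) -> Optional[str]:
    """Build the line to post to the channel, or None if nothing needs reporting."""
    verdict = triage(page_text)
    if verdict == "ok":
        return None
    return f"{verdict}: {name} ({location})"
```

A listing would then be checked with something like report_line(name, location, fetch_text(url)). Anything the sketch flags still deserves a human look before it is reported.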

I will be logging all of the channel traffic and can then go back in the day or days afterward and delete the ones identified - or at least "unpost" their submission until it can be checked out further.

I think with a good number of folks we could clean up the database (or at least identify the potential problems) within a short amount of time. It should be a good bit of fun, too. So the only question is whether or not folks can be found to assist...

Are you interested in helping? Please drop me a note if you are. (Since I get so much spam, and haven't yet set up spam-filtering, please put "LinTraining Cleanout" in the subject line. Thanks.)

Why not use a robot?, posted 27 Jan 2003 at 15:20 UTC by chalst » (Master)

Robots can check for liveness, and for some keywords, which is about as sophisticated as what you are asking volunteers to do.

Re:Why not use a robot?, posted 29 Jan 2003 at 12:15 UTC by dyork » (Master)

chalst - you raise a good point. If all I wanted to do was check for dead links, a robot would by far be the best way to go... and you made me realize that I was oversimplifying the help I want.

Basically, the reason I don't want to use a robot is that robots are not intelligent enough. A well-meaning friend from OCLUG sent me a dump from a robot after seeing my posting. Of the 11 sites it checked, 4 were actually dead links, 6 were 302 redirects (URL change), and one link showed the exact reason I don't want to use a robot.

In this case, the specific URL the training center provided gave back a 404... but a simple deletion of the filename part of the URL in the location bar of my browser took me to their main site, where I did find the correct links to their training info. Now, I don't know of a robot intelligent enough to do that.... maybe they are out there... but I don't know of them.
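
The manual step described above, deleting the filename part of the URL and trying again, is simple enough to sketch. The following is a minimal illustration under the assumption that "the filename part" means everything after the last slash of the path; strip_filename is a name invented for this example, and a robot using it would still have to find the training links on the resulting page.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_filename(url: str) -> str:
    """Return the URL with the last path segment, query and fragment removed."""
    parts = urlsplit(url)
    # Keep everything up to and including the last "/" of the path.
    directory = parts.path.rsplit("/", 1)[0] + "/"
    return urlunsplit((parts.scheme, parts.netloc, directory, "", ""))
```

When the listed URL returns a 404, a checker could retry strip_filename(url) before declaring the listing dead.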

Now, could I go through once with a robot and get rid of the ones with dead hostnames? Sure... and maybe I will before tonight... that will at least give fewer URLs for humans to check.
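
A first pass for dead hostnames needs nothing more than a DNS lookup. Here is a minimal sketch, assuming Python's standard socket module; has_live_hostname and its resolve parameter are hypothetical names for this example, and a hostname that resolves says nothing about whether the page itself still exists.

```python
import socket
from urllib.parse import urlsplit


def has_live_hostname(url: str, resolve=socket.getaddrinfo) -> bool:
    """Return True if the hostname in url resolves, False otherwise.

    resolve is injectable so the function can be exercised without a network.
    """
    host = urlsplit(url).hostname
    if not host:
        return False
    try:
        resolve(host, None)
    except (socket.gaierror, UnicodeError):
        # gaierror: the name does not resolve; UnicodeError: invalid hostname.
        return False
    return True
```

Listings for which this returns False could be set aside up front, leaving only the URLs with live hostnames for people to check.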

The good news is that while this is the first "LinTraining Cleanout", it should also be the last. Dave (whitinger) and I will be implementing a new codebase for the site that will include automatic mailouts for quarterly updates, etc., and some other items that should make it so that the site won't get so out of date going forward.

Thanks for the comment.
