Older blog entries for robogato (starting at number 3)

6 Nov 2006 (updated 7 Nov 2006 at 06:15 UTC) »

Advogato Status Report

A new rev of mod_virgule code went live on Advogato today. We have a couple of new features in addition to the usual minor changes and bugfixes. I'll summarize them but see the changelog for details.

The trust metric cache is now loaded into an Apache thread-private memory pool so it can persist across hits. It no longer has to be loaded and parsed on every hit. Instead, it's loaded only when the cache is updated (usually once per hour). I'm not sure if this will produce any noticeable performance increase, but mod_virgule doesn't seem to thrash the hard disk quite so much now.
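For readers who want the general idea, here is a minimal Python sketch of a cache that stays in memory and is re-parsed only when the file on disk changes. It is not the mod_virgule code: the JSON file format, the modification-time check, and the names TrustCache, get, and loads are all assumptions made for illustration.

```python
import json
import os


class TrustCache:
    """Keep a parsed cache file in memory; re-parse only when the file changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None   # modification time of the copy we hold
        self._data = None    # parsed contents
        self.loads = 0       # how many times the file was actually parsed

    def get(self):
        # Assumption: a changed modification time means the cache was updated.
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            with open(self.path) as f:
                self._data = json.load(f)
            self._mtime = mtime
            self.loads += 1
        return self._data
```

Every hit still costs one stat() call, but the expensive read and parse happen only after an update.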

Most of the major issues with gcc 4.x seem to be fixed. There are still loads of warnings due to char vs xmlChar mismatches. Explicit casts are needed to fix these and I'll continue adding them as I get time.

Hits on non-existent projects now return a 404 instead of a 200 so that search engines will stop pounding all sorts of bogus project URLs that have accumulated over the years. We get over 25k hits per month on one non-existent project, so this is eating a small but measurable amount of our bandwidth.

I've got about 75% of the coding done for blog syndication. It's now possible to specify a feed URL in a user profile. The feeds are aggregated hourly but are not appended to the Advogato blogs yet. I got bogged down (blogged down?) in the minor differences between RSS, Atom, and mod_virgule diary formats. They all use different formats for dates, of course. I'm hoping to have a few more hours available this week to finish it up. I'll probably test it for a few days on robots.net and then take it live here once I'm fairly sure it's stable.
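To give a feel for the date problem, here is a small Python sketch that normalizes the two feed formats to UTC. It assumes RSS 2.0 dates follow the RFC 822 style and Atom dates follow RFC 3339. The function names are made up for this example, and the mod_virgule diary date format is not covered.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_rss_date(text):
    """Parse an RFC 822 style date such as 'Mon, 06 Nov 2006 06:15:00 GMT'."""
    dt = parsedate_to_datetime(text)
    if dt.tzinfo is None:
        # Assumption: treat a date with no usable zone information as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def parse_atom_date(text):
    """Parse an RFC 3339 date such as '2006-11-06T06:15:00Z'."""
    if text.endswith(("Z", "z")):
        # Older Python versions do not accept the 'Z' suffix in fromisoformat().
        text = text[:-1] + "+00:00"
    dt = datetime.fromisoformat(text)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)
```

Once both formats are reduced to one timezone-aware value, entries from different feeds can be sorted and compared directly.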

Advogato.org is now set up over at Technorati. Well, maybe. It seems to take several attempts to get anything set up at Technorati. Their view of our RSS feed is still hosed but it should clear up after the next couple of articles are posted. Technorati syndication might generate a little more human traffic to our stories here. Now we just need to work on posting some interesting, original content like in the good ol' days. Maybe advogato the cat could be induced to bring back the Advogato's Number editorials (hint hint).

rillian asked about switching the recentlog to "as posted" mode from the current "unique" mode where only one post per user is allowed. He made the point that the Planet aggregators do this and it seems to be the preferred method by most readers. With the spam problem under control, I didn't see any reason not to make the switch. So, as of 3 November, any blog posts made should show up in recentlog until they scroll off naturally according to the date.
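The difference between the two modes can be sketched in a few lines of Python. This is only an illustration: the tuple layout, the mode names as strings, and the recentlog function are assumptions, not the mod_virgule implementation.

```python
def recentlog(posts, mode="as_posted"):
    """Return the posts to display.

    posts is a list of (user, date, title) tuples, newest first.
    """
    if mode == "as_posted":
        # Every post appears, in the order it was posted.
        return list(posts)
    if mode == "unique":
        # Only the newest post per user appears.
        seen = set()
        shown = []
        for post in posts:
            user = post[0]
            if user not in seen:
                seen.add(user)
                shown.append(post)
        return shown
    raise ValueError(f"unknown mode: {mode}")
```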

26 Oct 2006 (updated 26 Oct 2006 at 01:09 UTC) »

Advogato Status Report

A new rev of mod_virgule code went live on Advogato today. There were only minor changes and bugfixes. See the changelog for details.

Most of the code changes this week were aimed at getting a clean compile on gcc 4, Apache 2.2, and the newest version of the Apache APR libs. Here and there mod_virgule still relies on some of the deprecated Apache 1.3 compatibility code that's being dropped from the newest Apache libs. There's still more work to do here but it's getting close.

The rate of Advogato spammer account deletion has slowed to a trickle. Most of the easy-to-spot spammers are gone. I did run across a few today, however: Pramod, nulledphpscritps, Phat, JohnH, bekka, Zorro and a few more if you follow the inbound certs on those.

The removal of all those accounts has had two side effects. One is that thousands of certifications issued by the spammer accounts have also been removed. That seems to have contributed to shorter run times for trust metric computations. The other side effect is that search engine robots are hitting a lot of account pages that no longer exist. Mod_virgule didn't handle this quite the way it should. It displayed an error page saying the account wasn't there but it returned a result code of 200. So the search engines continue indexing the bad URL and continue hitting it every couple of days looking for updates. I've tweaked this so mod_virgule now returns a 404 when displaying the person not found error page. This should cause the search engine robots to eventually stop trying to hit all those dead accounts.
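The fix boils down to sending the error page with the right status code. Here is a minimal Python sketch of that idea. The account set, the page text, and the person_page function are hypothetical and stand in for the real mod_virgule handler.

```python
from html import escape
from http import HTTPStatus

# Hypothetical sample data standing in for the account database.
KNOWN_ACCOUNTS = {"raph", "rillian"}


def person_page(username, accounts=KNOWN_ACCOUNTS):
    """Return (status, body) for a request to a person page."""
    name = escape(username)
    if username not in accounts:
        # Still show a friendly error page, but tell robots the URL is dead.
        return HTTPStatus.NOT_FOUND, f"<p>The person {name} was not found.</p>"
    return HTTPStatus.OK, f"<h1>Personal info for {name}</h1>"
```

A human visitor sees the same error page either way. Only the status code changes, and that is the part search engine robots act on.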

18 Oct 2006 (updated 4 Aug 2009 at 04:24 UTC) »

View This Profile Here

10 Oct 2006 (updated 10 Oct 2006 at 01:28 UTC) »

Advogato Status Report

It's been a little over a week since Advogato transitioned to new hardware and a new codebase. Overall the transition went pretty well considering the differences in the code. Minor bugfixes are ongoing. Please excuse the mess during the transition period!

In case you haven't noticed, the much requested password reminder feature is working and has already been used several times. Maybe that means we'll be seeing some long lost friends posting again?

Bandwidth and DoS issues seem to be coming under control. One strange problem now solved was a Microsoft proxy server that was hitting a single Advogato diary 5 times per second. Most likely just a misconfigured RSS program of some kind. Once the source of the problem was identified, an email to their NOC took care of it. The new, paged person index also seems to have reduced bandwidth usage somewhat.

Spam, spam, spam. It's still with us but there's light at the end of the tunnel.

To reduce the attraction of Advogato to spammers, blog entries posted by untrusted users now have nofollow attributes included in all links. Further, links posted by untrusted users in their account profile pages are suppressed altogether. A note has been added to the new accounts page so potential spammers know their links are going to be worthless in search engines.
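As a rough illustration of the nofollow idea, the Python sketch below rewrites anchor tags in a post from an untrusted user. It uses regular expressions as a shortcut, which is an assumption made to keep the example short; a real implementation would more likely do this inside its HTML parser. The names add_nofollow and render_post are invented for the example.

```python
import re

_A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
_REL_ATTR = re.compile(r"\srel\s*=\s*(\"[^\"]*\"|'[^']*'|[^\s>]+)", re.IGNORECASE)


def add_nofollow(html):
    """Force rel="nofollow" on every <a> tag, replacing any existing rel."""
    def fix(match):
        attrs = _REL_ATTR.sub("", match.group(1))
        return f'<a rel="nofollow"{attrs}>'
    return _A_TAG.sub(fix, html)


def render_post(html, trusted):
    """Leave posts by trusted users alone; mark links from everyone else."""
    return html if trusted else add_nofollow(html)
```

Dropping any rel attribute the poster supplied keeps a spammer from slipping in a value of their own.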

Two active groups of spammers have been identified so far. One was an SEO firm in New Delhi, India. They were using their own IP addresses to connect, so adding a few addresses to iptables has taken care of them (at least for the moment). The second group connects from random IPs in China and Korea. It will take a group effort to discourage them - how, you may ask?

By using Advogato's new spam rating system. It's based on a suggestion by lkcl and also on the system used over at craigslist. If you see a post in the recentlog that looks like spam, click through to that user's profile page. If you're certified as Apprentice or higher, you'll see two new things at the top of the page: a spam score and a "flag this account as spam" link. Click the link and you'll add to the account's spam score. If you're an Apprentice, you'll add one point to their spam score. A Journeyer adds two points. A Master adds three points. When an account's spam score reaches a preset threshold (currently set at 10 points), the account is automatically deleted. This system only applies to untrusted Observer accounts, of course. If a user is certified by a trusted user, it's assumed they aren't a spammer. There are several spammers currently listed in the recentlog, so let's give it a try.
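The scoring rules above can be sketched in Python as follows. The point values and the threshold of 10 come from the description above. The dictionaries, the flag_as_spam function, and its return value are assumptions for illustration only, and the sketch does not address whether one user may flag the same account twice.

```python
SPAM_POINTS = {"Apprentice": 1, "Journeyer": 2, "Master": 3}
SPAM_THRESHOLD = 10


def flag_as_spam(accounts, scores, flagger, target):
    """Add the flagger's points to the target's spam score.

    accounts maps username -> certification level.
    scores maps username -> current spam score.
    Returns True if the target account was deleted by this flag.
    """
    points = SPAM_POINTS.get(accounts.get(flagger))
    if points is None:
        return False  # Observers and unknown users cannot flag.
    if accounts.get(target) != "Observer":
        return False  # Only untrusted Observer accounts can be flagged.
    scores[target] = scores.get(target, 0) + points
    if scores[target] >= SPAM_THRESHOLD:
        del accounts[target]
        del scores[target]
        return True
    return False
```

With these weights, three Masters and one Apprentice are enough to remove an account (3 + 3 + 3 + 1 = 10), while ten Apprentices would be needed on their own.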

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!