Older blog entries for robogato (starting at number 28)

The URL rendering bug that redi spotted has been fixed, I think. It looks like it was an artifact of the Apache APR 1.3 to 2.0 upgrade that had gone unnoticed for quite a while. If anyone spots any other URL issues in the project section, let me know.

Advogato Status Report

A new rev of mod_virgule code is live on Advogato. See the changelog for the details.

Aside from the usual minor bugfixes and tweaks, there are two new features you may have noticed already.

New certification indicators: A visual indication is now added to trust certifications that are less than 30 days old. This should make it easier to spot new certs on the user profiles. You can check this out on your own user profile if you've certified anyone, or been certified by anyone, in the last 30 days.

Article lists: Ever wonder how many Advogato articles you've posted? Or wanted to read other articles by a particular poster? Each user profile now includes a reverse chronological list of the 10 most recent articles posted by that user. For users who are more prolific, there is a link to a separate page that includes a complete listing of all articles posted by that user.

In addition to providing a new way to explore Advogato's articles, this should provide another direct route for search engine robots to find the static links to the articles.

11 Jul 2007 (updated 11 Jul 2007 at 20:40 UTC) »
Advogato Status Report

New mod_virgule code is live on Advogato. See the changelog for the details.

More minor bug fixes. The aggregator should now do a better job of rejecting dupes from feeds that retroactively alter the post date on blog entries. The no_cache and no_local_copy flags in the Apache request records are now set for logouts to prevent browsers from caching old logout results and to prevent the server from sending a 304. This was preventing some Galeon users (and possibly users of other browsers) from logging out.
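As a rough illustration of the dupe problem, here is a minimal Python sketch. It is not mod_virgule's actual C code, and the changelog does not say how the aggregator identifies a duplicate. The sketch assumes each feed entry carries a stable identifier and keys on that identifier instead of the post date, so a feed that rewrites its dates does not produce a second copy. The names FeedEntry and accept_entry are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedEntry:
    entry_id: str  # stable identifier from the feed (assumed to exist)
    title: str
    posted: str    # post date as reported by the feed; may change later


def accept_entry(seen_ids: set, entry: FeedEntry) -> bool:
    """Return True if the entry is new, recording its id when it is."""
    if entry.entry_id in seen_ids:
        return False
    seen_ids.add(entry.entry_id)
    return True


seen = set()
first = FeedEntry("tag:example.org,2007:1", "Hello", "2007-07-01")
redated = FeedEntry("tag:example.org,2007:1", "Hello", "2007-07-10")
print(accept_entry(seen, first))    # True
print(accept_entry(seen, redated))  # False: same id, different date
```

Keying on the date alone would treat the re-dated entry as a new post, which is the failure described above.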

I replaced the social bookmarking test links on the article pages with a fully functional social bookmarking tool, linked from the standard "share this" icon. The share link is now available on project and profile pages as well as on articles. If someone has a favorite social bookmarking service that's not listed yet, let me know and I'll add it.

Time has been a scarce resource for me lately, so progress through the ToDo list has been slower. More updates as time allows and, as always, patches are welcome.

Social Networking

Google sponsored a CMU project last year to study and reinvent online social networking. The result was Socialstream, a design concept based on the idea of a Unified Social Network (USN). A lot of what they came up with sounds similar to what the semantic web folks are working on with OpenID and formats such as FOAF and DOAP. Basically, they're suggesting that social network sites standardize on a data sharing format that would allow them to easily interact with each other and become part of a larger network of sites.

The project also did some interesting research, ranging from social networking theory and taxonomy to identifying common complaints about social networking sites and desirable features. They also researched who uses social networks and broke down the results into archetypical user types. The researchers also created a video demo of the Socialstream concept site. Some of the ideas they mention are already in Advogato or are on the ToDo list. I think there are plenty of other ideas here we can incorporate into Advogato as well.

Trust/Authority Metrics

Someone pointed out a link to an article by Michael Jensen in the Chronicle Review: The New Metrics of Scholarly Authority. It talks a lot about Web 2.0 authority models. It mentions the Google PageRank system but, oddly, leaves out any mention of the mod_virgule trust metrics implemented on Advogato. Still, it's an interesting read.

Advogato Status Report

A new rev of mod_virgule code is live on Advogato. See the changelog for the details.

Mostly minor stuff. Setting a project staff relation to none now consistently removes the relation from your user profile. Thanks to Gary Benson for noticing the bug. I upgraded the server from CentOS 4.4 to 4.5. This was just a maintenance update and shouldn't cause any changes. We're having another wave of account spam lately but the new flagging system has largely controlled it. One of the spammers discovered a way of circumventing the code which strips anchor tags posted in the notes field of untrusted accounts. I've fixed the bug that allowed this.

GPL v3 Release Party in Dallas?

The GPLv3 is supposed to be released on 29 June. I saw joolean mention a GPLv3 release party in Brooklyn and figured, why not here in Dallas too? If there are any other Advogatoans in the DFW area who'd like to get together to celebrate the release of the new and improved GPL, let me know.

Trust Metric Growing Pains

The good news is that Advogato is growing again. The bad news is that this is bringing to light some issues with the trust metrics. First, there is a growing number of new users who have multiple certs but are still rated as Observer. Second, there was the related incident with user OpenSpecies. Many people thought his blog posts looked spammy and flagged him as spam. Other users trusted him at Apprentice or Journeyer level but even with six or seven certs he never acquired enough gato-juice to reach Apprentice level. Because he stayed at Observer level, his account was always at risk of being classified as spam. This happened once, resulting in the decision to increase the spam score required to delete an account. I reinstated his account from a backup. A few months later it had been flagged as spam enough times to get deleted again. I restored it; however, OpenSpecies opted to move elsewhere and requested the account be permanently deleted.

The lack of gato-juice available for certifying people can be traced back to an issue with the trust metric seed users. Of the four original seed users, only raph is actively visiting Advogato and certifying users. Federico has visited in the last year but no longer certifies any users. Miguel hasn't visited in many years and only certified a handful of users. Alan has certified many users but no longer seems to be an active user himself (hopefully I'm wrong about that). This means there are really only two seeds and almost all the trust flowing to new users through certification is at best several generations removed from them.

To improve the situation, I'm going to add a few new seed users. This will need to be done gradually so that we can make sure it fixes the problem without resulting in cert inflation. My criteria for selecting new seed users will be: 1) Must be currently rated as a master by at least one of the original seed users 2) Must be rated as master by other non-seed users 3) Must be an active Advogato user who visits the site regularly and has posted at least one article 4) Must be reasonably well known within the community and have occasion to meet and interact with many other Free Software developers in person.

I talked with Raph about possible ways of handling this: elections, nominations, automated selection by the trust metric itself, or just picking someone. Eventually, I think it would be interesting to have the trust metric select new seeds automatically as needed but that will take more time for testing and experimenting than I've got right now. So, initially I've opted for picking someone who meets the qualifications to save time. Our first new seed is: mako. By a handy coincidence, he's traveling to several European conferences over the next few weeks, giving him a chance to meet more people who may need certifying.

This is one of several things that I think should start pumping some new life into the trust metrics. Another issue I'm looking at is what to do with inactive users who have become stagnant sources in the trust metric network flow. These include users who will not return for one reason or another such as ettore, sisob or lilo. Trust passing through these nodes is essentially unchangeable, which is a problem because trust in the real world is dynamic. Sometimes we trust a person today that we didn't yesterday. Sometimes we no longer trust someone that we trusted in the past. If enough certs become stagnant and cannot be removed, this tends to make the trust metrics inaccurate. One way of dealing with this is to identify users who are inactive and expire their outbound certs automatically after enough time has elapsed. The tricky part is deciding how long a user has to go without visiting the site before being considered inactive. DV, for example, is an active user yet has gone for as much as a year between logins. Federico, one of our seed users, hasn't logged in for seven months. Right now, I'm thinking that exceeding one year without a login is a pretty good indication of inactivity.
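The expiry idea can be sketched in a few lines of Python. This is only an illustration of the rule being considered here, not code from mod_virgule: the one-year threshold comes from the paragraph above, while the data layout (tuples of issuer, subject, level and a dictionary of last logins) and the function names are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Assumed threshold: more than one year without a login counts as inactive.
INACTIVITY_LIMIT = timedelta(days=365)


def is_inactive(last_login: datetime, now: datetime) -> bool:
    """True when the gap since the last login exceeds the limit."""
    return now - last_login > INACTIVITY_LIMIT


def active_certs(certs, last_logins, now):
    """Keep only certs whose issuer is still considered active.

    certs: list of (issuer, subject, level) tuples (hypothetical layout).
    last_logins: dict mapping issuer name to last login datetime.
    """
    return [c for c in certs if not is_inactive(last_logins[c[0]], now)]


now = datetime(2007, 6, 20)
logins = {"alice": datetime(2006, 7, 1), "bob": datetime(2004, 1, 1)}
certs = [("alice", "carol", "Journeyer"), ("bob", "carol", "Master")]
print(active_certs(certs, logins, now))  # [('alice', 'carol', 'Journeyer')]
```

A user who logs in once every eleven months, like alice in the example, keeps all outbound certs. Only the long-absent issuer's certs drop out of the trust computation.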

Advogato buzz

Advogato showed up on a list of social network site statistics at the X2iN blog: Social Network Marketing, the Sky is the Limit.

Advogato's founder Raph Levien will be giving a talk titled Advogato: Lessons Learned at 6:30 PM on Monday, June 25 as part of Google's Open Source Developers @ Google series. The talk will be at Google's Mountain View campus. Guests are welcome and should sign in at Building 43.

Advogato Status Report

A new rev of mod_virgule code is live today on Advogato. See the changelog for the details.

The mod_virgule config.xml file now supports a list of authorized "editors". Article posting privileges can be limited to these editors. Don't worry, this feature isn't intended for Advogato, where all certified members will continue to be able to post articles. It will be used on robots.net. In the past robots.net was configured such that only the users who were trust metric seeds could post stories. As robots.net has grown, the need has arisen to make a clear distinction between the list of trust metric seed users and the article editors. I think this feature will be useful on other sites that use mod_virgule as well.

I've tweaked the HTML layout of the diary entries, replacing the older style markup with divs. At the request of trs80, the div wrappers on each diary entry now include the username as a second class. While not needed for CSS, this additional class designator can be used by screen-scrapers to easily identify the author of each entry in the recentlog. Screen-scraping aggregators can use this as part of a dupe-control mechanism. This same username as class convention is used on many Planet sites, so it should make Advogato's recentlog more easily parsable by existing Planet scrapers. The fun part was the slight difference between legal mod_virgule usernames and legal CSS1/2 class names. This prompted the creation of a new utility function, virgule_force_legal_css_name(). Supplied with an arbitrary string of text, this function will return a properly escaped CSS1 class name.
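To show the kind of mismatch involved, here is a small Python sketch of one conservative way to turn an arbitrary string into a safe class name. It is not a port of virgule_force_legal_css_name() and does not use CSS escape sequences. Instead it assumes a simpler, lossy rule: keep ASCII letters, digits, and hyphens, replace everything else with a hyphen, and add a prefix when the result does not start with a letter. The function name and the "u-" prefix are hypothetical.

```python
import re


def force_legal_css_name(name: str) -> str:
    """Return a class name made only of ASCII letters, digits, and hyphens
    that starts with a letter. Lossy: distinct inputs can collide."""
    cleaned = re.sub(r"[^A-Za-z0-9-]", "-", name)
    if not cleaned or not cleaned[0].isalpha():
        cleaned = "u-" + cleaned  # hypothetical prefix for awkward names
    return cleaned


print(force_legal_css_name("trs80"))     # trs80
print(force_legal_css_name("3dfx_fan"))  # u-3dfx-fan
```

Because the mapping is lossy, a scraper that relies on it for dupe control could confuse two usernames that differ only in the replaced characters. Proper escaping avoids that at the cost of less readable class names.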

More good Advogato buzz

Andrey Golub of Milan IN recently discovered Advogato and gave a nice mention in his blog. He also added Advogato to Milan-IN's listing of Online Social Networking Platforms. Perhaps this will bring a few other new Advogato members from the Italian free software community our way.

Dan York also gave us a great mention in his blog. He's an Advogatoan from way back who left Advogato for LiveJournal during the extended Advogato server outage back in 2004. He was writing to commemorate his 7th year of blogging and rediscovered Advogato in the process. His entry summarizes the recent changes on Advogato and suggests dyork may be making an appearance in our recentlog again soon.

During a recent discussion on the Extreme Programming mailing list about the possibility of a certification mechanism for XP programmers, Martijn Meijering suggested that a community trust metric system similar to Advogato's might be a desirable alternative to certification based on traditional knowledge-based testing.

Advogato Status Report

A new rev of mod_virgule code went live today on Advogato. See the changelog for the details.

I improved the ATOM feed handling of the aggregator code. Feeds that include only a <summary> tag and no <content> tag are now handled correctly. Also, feeds that include an <updated> tag but no <published> tag are handled correctly. Both these variations, while technically legal according to RFC 4287, seem to be very rare in the real world (not to mention a bit odd). Why include the datestamp of the last update but not the original publication date? Why include the full content of the blog but call it the summary instead of the content? Both these weird-but-legal annoyances were apparently generated by a "django powered" site. Not sure if that means the problem stems from django or just how it was used in this case.
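The two fallbacks are easy to picture with a short Python sketch using the standard library's ElementTree. This is only an illustration of the behavior described above, not the aggregator's C code. The function name entry_body_and_date and the sample entry are made up for the example.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def entry_body_and_date(entry: ET.Element):
    """Return (body_text, date_text) for an Atom <entry>.

    Falls back to <summary> when <content> is absent and to <updated>
    when <published> is absent.
    """
    body = entry.find(ATOM + "content")
    if body is None:
        body = entry.find(ATOM + "summary")
    date = entry.find(ATOM + "published")
    if date is None:
        date = entry.find(ATOM + "updated")
    return (
        body.text if body is not None else None,
        date.text if date is not None else None,
    )


sample = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <updated>2007-05-01T12:00:00Z</updated>
  <summary type="html">&lt;p&gt;Full post text&lt;/p&gt;</summary>
</entry>"""
print(entry_body_and_date(ET.fromstring(sample)))
```

The explicit "is None" checks matter here, because an ElementTree element with no children is falsy even when it exists.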

The last few sections of mod_virgule still using hard-coded pages now use templates. This allowed another nasty chunk of hard-coded, site specific markup to be removed from the mod_virgule codebase. It was nice to see the code and binary get smaller for a change! Even though you probably won't notice any huge change in how the site looks, this is a major milestone for mod_virgule. It's finally possible to use it for a new website without having to modify the C source to remove Advogato or robots.net specific HTML. A few more changes are needed to group all the templates together with the CSS files to create an easily themable layout.

Despite the report that Advogato has failed, things continue to look better each month. We've set new records for user logins three months running (at least since I started keeping records six months ago). More than 70 Advogato users have returned to the recentlog via blog aggregation so far. The founding gato himself even stopped by this week to post an article on the new browser wars.

Advogato got a positive mention in a recent comparison of sites for software developers in John Manoogian's blog Inventing What's Next.

10 Apr 2007 (updated 10 Apr 2007 at 21:28 UTC) »

Advogato Status Report

New mod_virgule code went live today on Advogato. See the changelog for the details.

I've refactored some of the page rendering code to simplify the problem of pre-rendering page content for use in template-based pages. This should make the job of converting mod_virgule's hard-coded pages to template-based pages as easy as swapping out three or four lines of code. All the profile pages are now template based, as are the project pages. The new header has been added to all these pages. There are still a handful of hard-coded forms and form result pages. They're next up on the ToDo list.

You may have noticed some experimental social bookmarking links I've added to the article headers. Three social bookmarking sites are supported: Digg, del.icio.us, and Reddit. If you have an account at one or more of these services, try it out and let me know if it works for you. I'd like to get some feedback on this idea. Would you like to see additional bookmarking services included? Which ones? Also, would you like to see this idea extended to blog entries as well as articles? If this turns out to be a handy feature, I may encapsulate all the bookmark icons in some sort of little popup window, something like Alex King's "share this". Then we'd just have one little icon instead of a whole string of them - probably the emerging social bookmarking icon.

Advogato Status Report

New mod_virgule code is live today on Advogato. See the changelog for the details. No new release yet, though. I'm hoping I'll find time to finish up a couple of additional things before the next release.

The feed aggregator can now handle RSS/ATOM feeds that include the blog content as unescaped XHTML within the feed XML tree instead of as escaped content within a single XML node. This seems like a risky approach since the slightest markup error in the blog's XHTML renders the whole feed invalid and unparsable. Worse, the particular ATOM feed that brought this problem to light, generated by blogger, appears to randomly alternate between the two methods. One post is carried as normal escaped content within the entry node and the next is shoved in as an unescaped tree of XHTML tags. But who am I to argue with blogger? If it exists in the wild and doesn't appear to violate the standards, I'll try to make mod_virgule handle it correctly.
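For readers who haven't run into the two encodings, here is a minimal Python sketch of handling both. It covers only the Atom case and is not how mod_virgule does it. In Atom (RFC 4287) the unescaped form is marked with type="xhtml" and wraps the markup in a single XHTML div, while the escaped form carries the markup as text. The function name content_markup is hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def content_markup(entry: ET.Element) -> str:
    """Return an Atom entry's content as a markup string for either encoding."""
    content = entry.find(ATOM + "content")
    if content is None:
        return ""
    if content.get("type") == "xhtml":
        # Unescaped form: the markup is a subtree of child elements,
        # so serialize each child back to a string.
        return "".join(ET.tostring(child, encoding="unicode") for child in content)
    # Escaped form: the XML parser has already unescaped the text node.
    return content.text or ""


escaped = ET.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom">'
    '<content type="html">&lt;p&gt;Hello&lt;/p&gt;</content></entry>'
)
print(content_markup(escaped))  # <p>Hello</p>
```

Note that ElementTree adds namespace prefixes when it serializes the XHTML subtree, so the unescaped branch returns equivalent markup and not a byte-for-byte copy of what the feed contained.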

I've added support for the foaf:mbox_sha1sum field in the FOAF files output by mod_virgule. This field is an SHA-1 hash of the user email address. It's used as an identifier by some FOAF applications. There is also a group working on a SpamAssassin plugin and email whitelist database that will use trust metrics and FOAF data collected from community sites like Advogato. The email field in the user profile used to be optional, so if you're an old time Advogato user, check your profile and make sure your email address is included. Actually, everyone ought to make sure their email address is current, just in case you need to use the password reminder some day.
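For anyone curious how the value is computed, the FOAF vocabulary defines mbox_sha1sum as the SHA-1 digest of the mailbox URI, including the mailto: prefix, written as hexadecimal. A minimal Python sketch follows. Stripping surrounding whitespace is an assumption of this example, and whether mod_virgule normalizes the address in any way is not stated here.

```python
import hashlib


def mbox_sha1sum(email: str) -> str:
    """SHA-1 hex digest of the mailto: URI for an email address."""
    uri = "mailto:" + email.strip()  # strip() is an assumption of this sketch
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


digest = mbox_sha1sum("user@example.org")
print(len(digest))  # 40
```

Because the hash is taken over the exact string, two applications only agree on the identifier if they hash the address in the same form, for example with the same letter case.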

Blog (diary) pages are now template based rather than hard coded HTML generated by mod_virgule. The blog page template includes the new page header.

Barbara Irwin of the Victoria Linux Users Group emailed to let us know they've added Advogato to the Loads of Linux Links (LOLL) directory. The LOLL directory looks like an interesting collection of Linux links. Check it out.

Google turned down Advogato's Summer of Code mentor application. While disappointing, this didn't come as a total shock. There's no official organization behind mod_virgule, it's a very small project, and it still seems to be viewed as dead or dying by a few people. That's okay, maybe next year. In the meantime, I'm going to continue working to bring mod_virgule up to date.

There are several badly needed features that are going to require some major code refactoring and code cleanup. One of the Summer of Code ideas was directly related to this. The existing code base desperately needs improved commenting and documentation. I'd really like to see the comments normalized to Doxygen style and comments added to all the currently uncommented sections of the code. Having better comments and documentation would really help with future refactoring of the code and would also lower the barrier for new developers who need to understand how mod_virgule works. Any volunteers? Adding and rewriting code comments doesn't require extensive programming skill (though you will need to be able to read and understand some less than beautiful C code).

There are other SoC mod_virgule ideas that I'd still like to see someone help with. Even without Google funding, it's still good experience and might even be fun. If you think you might be interested in helping out, take a look at the ideas list and let me know.

Advogato Status Report

A new rev of mod_virgule went live yesterday on Advogato. See the changelog for the details.

With all the articles being posted lately, the need to edit an article to correct mistakes and typos resurfaced. The article code is a bit scary and looks way overdue for a complete rewrite. But until then, I've added one more kludge to allow editing. Articles are now editable by the author for a period of 30 days after they're posted. (If you can't fix your typos in 30 days, you probably never will!) Articles that have been edited will include a revision date in the article header.

Otherwise, mostly small changes this time around. The much maligned certification dialog text inherited from robots.net has been toned down to something more minimal. I made a few very minor security enhancements to the new accounts page. A CSS clear:both style was added to the recentlog post headers. This fixes the bug that allowed floated images in a post to overlap the next post. I've migrated a few more pages to the new header style.

I made a few minor tweaks to the profile pages to help control bandwidth wastage and security problems. Untrusted users no longer have RSS feeds or FOAF RDF support on their user profiles. This is to prevent abuse by spammers but will also help cut down on bandwidth slightly. The biggest change is that RSS feeds don't exist until an account has at least one diary entry. This removes about 9,000 RSS feeds that were empty (but still being checked several times an hour by a hundred different aggregators).

I've banned a misbehaving web robot, named VoilaBot, used by a French search engine. Despite retrieving our robots.txt file several thousand times per day, it appears to ignore it. This robot was using gigabits of our bandwidth (up to 10% of the total so far this month). We get no inbound traffic from this search engine in return (which isn't surprising since Advogato isn't a French language site).

I've also banned several other robots that appeared to be harvesting email addresses for spammers. One of these had an agent string only one character different than pipeman's XML-RPC client. A typo on my part blocked him for a few hours. Sorry about that.

Google Summer of Code Mentor Application

I filed a mentor application in Advogato's name for the 2007 Google Summer of Code. If Google accepts it, I'm hoping maybe we can recruit a student or two to help with some of the mod_virgule work.

18 Feb 2007 (updated 18 Feb 2007 at 14:41 UTC) »
Advogato Status Report

A new rev of mod_virgule code went live today. See the changelog for the details. Lots of minor bug fixes and a couple of more interesting changes. Even one hardware note: I've doubled the RAM on the server from 1GB to 2GB.

Long Lost Trust Certifications Restored

You may have noticed some additional inbound or outbound trust certifications on your page or slight changes in your certification level this week thanks to some repairs done to the XML datastore. This would be a good time to go through your certs and make sure you've certified everyone you want to and no one you don't want to.

Over the 8 years Advogato has been online, it has suffered through several semi-catastrophic events including disk failures and power supply failures. There was also a mod_virgule bug triggered under disk-full conditions that truncated many user account profiles a year or so ago. The result of these past catastrophes was the complete loss of a few user profiles and minor corruption of many others. Usually, enough of the profile XML file remained (or could be restored) to allow a user to log in but some or all of the trust metric certifications and other data were lost. For a while the corrupt profiles could cause mod_virgule to segfault during a trust metric update (that bug has been fixed for a while). The most noticeable side-effect is missing or incorrect certs on the profile page.

One of the interesting things about the way the trust certs are stored in the XML database is that each cert is recorded in the profile of both the issuer and subject. This means it's possible to reconstruct a lost cert provided one of the two records still exists. Well, I finally got a chance to write some code to do that. I've written a new mod_virgule function to analyze the user profiles, find these sorts of problems, and repair them when possible. In addition to restoring lost certs, the new code also looks for invalid XML, missing profiles, certs to or from non-existent accounts, and a few other forms of corruption that are known to occur occasionally.
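The reconstruction idea can be sketched in Python. This is an illustration of the principle only, not the new mod_virgule function: it models the outbound and inbound records as nested dictionaries, which is an assumption made for the example, and it skips the checks for invalid XML and non-existent accounts. The name repair_certs is hypothetical.

```python
def repair_certs(outbound, inbound):
    """Copy each cert found on only one side over to the other side, in place.

    outbound: {issuer: {subject: level}}, as stored in the issuer's profile.
    inbound:  {subject: {issuer: level}}, as stored in the subject's profile.
    Returns (restored_outbound, restored_inbound) counts.
    """
    restored_out = restored_in = 0
    # A cert recorded by the subject but missing from the issuer's profile.
    for subject, issuers in list(inbound.items()):
        for issuer, level in issuers.items():
            if subject not in outbound.setdefault(issuer, {}):
                outbound[issuer][subject] = level
                restored_out += 1
    # A cert recorded by the issuer but missing from the subject's profile.
    for issuer, subjects in list(outbound.items()):
        for subject, level in subjects.items():
            if issuer not in inbound.setdefault(subject, {}):
                inbound[subject][issuer] = level
                restored_in += 1
    return restored_out, restored_in


outbound = {"alice": {"bob": "Journeyer"}, "bob": {}}
inbound = {"bob": {}, "carol": {"alice": "Master"}}
print(repair_certs(outbound, inbound))  # (1, 1)
```

In the example, alice's cert of carol survives only in carol's inbound record and her cert of bob survives only in her own outbound record, so one record is restored on each side.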

The result?

1115 missing outbound certs records restored
1264 missing inbound certs records restored
17 other misc profile corruption problems fixed

One side effect of all this is that all those missing certs will now be included in the trust metric computations again. So there have probably been a few changes in certification levels.

Consistent Page Headers on the way

One persistent category of Advogato complaints I get is about the inconsistent page layout. Some pages have menus at the top, other pages have the menu at the bottom. Sometimes the menu is centered, sometimes it's right justified. Most pages don't have a logo or even the name of site on them, which makes it confusing if you arrive from a search engine anywhere but the index page. On the other hand I feel like I have to balance the need for an updated, consistent page layout with Advogato's historically minimal design. So I'll try to take things slow and not make any major changes overnight. I've created a standard page header and page layout that should address the consistency issues without drastically altering the appearance of the site.

Over time, I'll try to get the new header on every page so the site begins to look a little more consistent. There are still a few pages with hard-coded HTML generated by mod_virgule. Making these remaining pages template-based will require code changes. One other nice result of finally getting the last few parts of mod_virgule fully template-based is that we should be able to purge the last non-standard HTML and maybe even bring the site up to full XHTML standards compliance.

As part of the page header improvements, I've converted the Advogato logo from GIF to PNG. The new logo has the same dimensions but the filesize is about 20% smaller, saving us a little bandwidth. I've also added a Google Coop AJAX-based search widget to provide a site search function, another frequent request. The new layout can be seen on the people page and a few other pages so far. You may also notice some new stats on the people page - this is another handy use of the new user account analysis code.

Advogato Articles

I was pleased to get all the emails and comments on my GNU/FSF news summary. I'd still like to find a volunteer who's willing to put together a summary like this every month.

I was also very pleased to see other new articles posted by mjg59, fxn, and lkcl. The ACPI article got picked up by Linux Today and generated more hits than any other article in the last several months. If we could generate a few articles like that every month, we'd be well on the way to making Advogato a more interesting and useful site.

PyCon and Advogato

PyCon is coming to Dallas, where the Advogato site is hosted. Is anyone up for some type of Advogato get-together during the conference (Feb 23-25)? If you'll be in town and want to meet some fellow Advogato users, email me and we'll work out the details.
