robogato is currently certified at Master level.

Name: RoboGato
Member since: 2006-10-04 20:50:40
Last Login: 2014-09-16 21:47:45

FOAF RDF Share This

Homepage: http://www.advogato.org/

Notes:

Cat? Machine? More than a fusion of the two? Who knows. Actually, this is just an alter-ego for me. I use this account to access Advogato admin functions and post site status updates. As many of you know, Raph has handed off maintenance of Advogato and mod_virgule to me. Initially, I see this as a triage situation. DoS prevention, performance issues, and long-standing bugs will have the highest priority. Once things are under control, I'll try to start working on the massive ToDo list of features that are needed to make Advogato competitive with other modern community networking sites. Along the way, maybe we can improve the article quality and general usefulness of the site. With your help, we should be able to put Advogato on the cutting edge again.

Articles Posted by robogato

Complete list of articles by robogato

Recent blog entries by robogato

Syndication: RSS 2.0

As you probably noticed we're under attack by spammers again. Heavy account creation and blog spamming wiped out the recentlog. It's partially recovered and should be back to normal in another few hours. Account creation is off for now so that should prevent further spamming but the site may be slow due to the heavy traffic generated by the spammers. Looks like a botnet or multiple proxies being used. If anyone's interested in doing a little research on their own, here are a few of the many IPs from which the spam is originating: 173.208.47.67, 218.186.17.251, 190.212.92.132, 99.129.227.221, 86.122.20.133, 61.140.173.221, 67.72.247.233, 176.9.33.251, 110.4.89.20, 122.177.153.205, 66.56.158.67, 72.64.98.16, 79.141.172.14.

More Minor Security Updates

I declared an Advogato hacking day today and got a little more work done on our security ToDo list. I've added a set of cryptographic nonce functions to generate tokens for email verification and CSRF prevention. The tokens have configurable expiration times. The new code replaces the hard-coded token generation used by the original cookie functions.

I also added a generic email function that can be used for account verification. This replaced the hard-coded part of the password recovery email function.

I was able to get the CSRF token code integrated with the account creation forms. It's tested and live. Hopefully this will knock out a few more of our automated account spammers including the commercial Incansoft spamming tools. I've still got a little more work to do before I can turn on the email verification but we're nearly there.

12 Sep 2011 (updated 12 Sep 2011 at 22:29 UTC) »

Status Update

Advogato has been under a sustained attack from spammers since 11:00 UTC Sunday. The attack is originating from a botnet of at least several hundred nodes with world wide distribution. The attack is automated and creates 10 to 20 new user accounts with large, spam-filled blog posts every minute. I discovered the attack around two hours after it started and immediately turned off new account creation.

Mod_virgule buffers the 100 most recent new accounts for display in the "recent people joining" box on the front page. The attackers had blown past that number pretty quickly, requiring me to use the web server logs to track down and remove the bad accounts. Once removed, it left the recent accounts buffer completely empty. It will fill up again once I'm able to turn new account creation back on.

I spent a while Sunday logging and blocking IPs for individual nodes of the attacking botnet but basically gave up after blocking the first hundred or so. With account creation off, the attackers fail to create accounts and what we're left with is a low-level DDoS attack. The bandwidth being used isn't disabling and hopefully the attacker will give up once they realize no new accounts are being created.

Other Fun

The switch to the libxml2 HTML parser solved a lot of internal problems but as some of you have noticed, it introduced a new one. Libxml2 "thinks" in XML and when it comes across a set of HTML tags with no content, such as <em></em> it turns that into a self-closing tag: <em /> which is great if you're viewing the result with an XML parser but most browser HTML parsers can't parse certain tags as self-closing and see the tag as an open with no corresponding close. This has the effect of including all the subsequent markup on the page inside the offending tag, usually terminating display of the page.

It looks like only a handful of tags produce this effect, so it should be possible to filter them out. It may be possible to drop empty tag pairs before parsing or convert them back to open/close pairs.

Redi: in theory yes but the mod_virgule codebase is scary mix of HTML 4 (and earlier), XHTML, and XML. Throw in the random markup coming in from syndicated blogs and the resulting tag soup is very difficult to normalize without breaking something. However, incoming blog markup was previously being normalized to XHTML by libxml2 and I'm thinking now, we may have to switch that to HTML 4 to force the open/close tags. The function you mention produces different output depending on what markup type is specified on the tree (or on the individual node). So, parse the blog, walk the tree forcing it all to HTML 4, then ask libxml2 to export it. Maybe... I'm doing some work on the code today, so I'll let you know.

Another Update: I've got some code changes in that might (or might not) help with the broken tag problem. We'll have to see if any incoming blog posts break anything over the next day or so. Nothing new on the spam attack, it's still going strong. I'm going to look at implementing a few more security features in the code that might allow us to turn account creation back on without waiting for the attack to subside.

2 Jun 2011 (updated 3 Jun 2011 at 19:32 UTC) »

Robogato Returns

We had a bad hardware crash recently and, as I was restoring Advogato to new hardware, I realized that it's been too long since I've devoted any significant time to improving the code around here. I took advantage of the downtime caused by the crash to make some final tweaks to the long-awaited libxml2 based HTML parser and made it live. It fixes a lot of the rendering problems already and will fix more once I make a few more tweaks.

I'm also working on improving security in general and making account creation by spammers harder in particular. I had a nice email exchange with dkg about the subject awhile back. He took a look at the code and provided a laundry list of things that needed fixing or improving. I'm working on those now. The first change just went live this week - mod_virgule now requires the POST method for submitted forms. This minor change already stopped a couple of our automated account spammers who were creating accounts with GETs. Only the dumbest spammers were doing that I'd think. Using POST isn't much harder. More changes to come.

If you're wondering what caused the increase in spam accounts we've been seeing for the last year, here's a possible contributor: Incansoft, apparently a purveyor of web-based spam tools, added an Advogato attack to a spamming tool they sell called Web20Bot (sorry, not going to link to it but you can google it). Web20Bot will create phony account profiles containing your backlink spam on 20 websites including Advogato.org, squidoo.com, wordpress.com, blogger.com, tumblr.com, and livejournal.com. They claim Web20Bot handles email verification and captchas, so working out a defense may be interesting. I doubt any of their spam lasts more than 48 hours around here anyway but it would be nice to make life harder for them. (incidentally, if someone were to come up with a copy of this thing so we could analyze it, that might be cool - maybe we could help other sites being attacked by it too).

Update: Thanks for pointing out those issues, Redi. I've fixed the diary edit problem, it should not have been checking for a POST. The <person>, <project>, and <wiki> tags were special cases in the old HTML handler. If one is broken, all three probably are. I'll get on that now. It will take me a little while to track down the problem. <proj> was deprecated in favor of <project> way back in the Raph days but the code checking for <proj> wasn't dropped until this most recent update. I didn't realize anyone still used it. I can add it back in.

Update 2: Ok, found the problem. The old tag handlers output directly to the apache buffer while the new handlers modify the XML tree, which is rendered to the buffer later. I need to modify or replace the handlers for those three tags. I'll try to get to it today if time allows.

Update 3: I think the special tag issue is fixed now, let's try this code for a day or so and see if any problems show up.

<person> test: redi

<proj> test: mod_virgule

<project> test: mod_virgule

<wiki> test: WikiPedia:Advogato.org

Watch for Spammers

If you're wondering about the source of the recent increase in phony users signing up for Advogato accounts, I think I've found it. A number of Russian SEO/spammer blogs are discussing a list of websites that seem to be highly trusted by Google based on the ratio of pages in the main Google index to the supplemental Google index. Advogato is #16 on the list. (I'd provide some links but giving them links from Advogato is the last thing we should do. If you're curious you should be able to find them using a site like Technorati to find blogs that have linked to Advogato in the last few weeks.)

A side effect has been a big bandwidth hit. I thought at first we'd been slashdotted. But the main result is a rash of SEO spammers signing up for Advogato accounts and trying to find some way to get backlinks to their link farms and spam sites. Average survival time for their profiles has been less than 48 hours so probably nothing to worry about but everyone should take a look at the "recent people joining" list and flag anyone who looks like spam. Hopefully it will die down in a week or two.

32 older entries...

 

Others have certified robogato as follows:

  • trs80 certified robogato as Master
  • fxn certified robogato as Master
  • mchirico certified robogato as Master
  • atai certified robogato as Master
  • Omnifarious certified robogato as Master
  • wainstead certified robogato as Master
  • mako certified robogato as Journeyer
  • airlover certified robogato as Apprentice
  • phaulyx certified robogato as Master
  • MartySchrader certified robogato as Master
  • eldeguzman certified robogato as Master

[ Certification disabled because you're not logged in. ]

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!

X
Share this page