Older blog entries for robbat2 (starting at number 4)

OLS2008 - "Issues in Linux Mirroring: Or, BitTorrent Considered Harmful"

One talk I was really interested in was John Hawley's "Issues in Linux Mirroring: Or, BitTorrent Considered Harmful", presented from the perspective of the kernel.org mirrors.

This paper was really interesting for me, both as the Gentoo releng infra liaison (I get the bits from releng onto the mirrors) and because I work for isoHunt, since he was complaining about BitTorrent.

Before the actual material about BitTorrent, he had some harsh words about distributions, their space usage, and their lack of co-ordination. Multiple major distributions doing their releases in the same week really only hurts the distributions themselves, because the mirrors get saturated by users. Two major distros between them use up fully half of the 5.5TiB at kernel.org, and having them both push new material at the same time just blows out the cache, even with stupid amounts of memory. (Comments were made about Mark Shuttleworth having the money to buy some boxes with TiB of RAM for kernel.org.) Co-ordination between distributions is needed to resolve this issue, and the audience discussion suggested we should try the distributions@freedesktop list first, and if that's too much noise, start up a list at kernel.org instead.

Moving on to BitTorrent, he noted that in large Linux torrent swarms, the net effect of the standard tracker balancing algorithms is that a few slow peers joining greatly slows down the whole swarm (based on analysis of the tracker used by Fedora for the F8 release). If mirrors are seeding, in many cases it will still be faster for a mirror to provide content to a given user than for other client peers to do so. If the objective is to move content as fast as possible, this is what's needed, as opposed to the normal BT objective of balancing total bandwidth usage.

On what distributions can do with BitTorrent to make life easy for mirrors, he had several complaints about the level of manual interaction needed. I responded by describing the Gentoo structure of symlink trees under experimental, which lets mirrors run torrents easily and also powers the HTTP seeding additions to the BitTorrent protocol.

On rtorrent (libtorrent), he complained that it wasn't using sendfile at all, which had a large negative performance impact and should be tackled upstream.
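For anybody unfamiliar with sendfile: it lets the kernel copy file data straight to a socket instead of bouncing it through userspace buffers. The sketch below is in Python rather than rtorrent's C++, purely to illustrate the idea. The helper name `send_file_zero_copy` is made up for this example, and `socket.sendfile()` quietly falls back to plain `send()` on platforms where `os.sendfile()` isn't available.

```python
import socket


def send_file_zero_copy(sock: socket.socket, path: str) -> int:
    """Send the whole file at `path` over a connected stream socket.

    Hypothetical helper for illustration only. socket.sendfile() uses
    os.sendfile() where the platform supports it, so the file contents
    are not copied into this process first.
    Returns the number of bytes sent.
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)
```

A seeding client would call something like this for each block it serves from disk, rather than doing a read() followed by a send().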

The BitTorrent community also needs to look at tweaking the peer selection in the announce protocol, so that trackers hand out a smarter selection of fast peers, where fast means either local (look at a BGP looking-glass for clues) or a designated fast mirror.
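As a rough sketch of what "smarter selection" could look like on the tracker side (my own illustration, not anything presented in the talk): rank designated mirrors first, then peers in the same AS as the requester, then everybody else. The `Peer` fields, the `is_mirror` flag and the idea of having an AS number per peer are all assumptions for the example. A real tracker would also want some randomisation within each group.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Peer:
    addr: str
    asn: int                 # assumed: AS number looked up from BGP data
    is_mirror: bool = False  # assumed: flag for a designated fast mirror


def select_peers(requester_asn: int, peers: List[Peer], limit: int = 50) -> List[Peer]:
    """Return up to `limit` peers, preferring mirrors, then same-AS peers."""
    def rank(peer: Peer) -> int:
        if peer.is_mirror:
            return 0
        if peer.asn == requester_asn:
            return 1
        return 2

    # sorted() is stable, so peers keep their original order within a rank.
    return sorted(peers, key=rank)[:limit]
```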

Lastly, he noted that the trackers seem to be badly run. As somebody from isoHunt, I offered to post my own work on running effective trackers to the inter-distro discussion.

Syndicated 2008-07-26 19:34:19 from Move along, nothing to read

2008 conference season: OLS2008 and OSCON2008/FOSSCoach

- In 2006, I went to MySQL UC, and OSCON.
- In 2007, I went to the Vancouver PHP conference and LWE-SF.
- For 2008, I went to MySQL UC, and I'm going to be at OLS2008 in Ottawa next week, July 21st through 27th.

I'll have the entire Sunday free in Ottawa (my flight home is in the evening, and the conference itself ends on Saturday). If anybody wants to hang out or go sight-seeing, that would be cool.

Additionally, if you're interested in PGP keysigning, or CACert assurances, you should seek me out with some ID. This applies doubly to all Gentoo developers with the upcoming tree-signing work.

While I'm not going to OSCON since it conflicts with OLS, my friend Zak Greant (really I mean it, he lives just up the street from me!) is going to OSCON, and putting on a totally free mini-conference within it: FOSSCoach. If you're just trying to get a start in open source from a beginner's perspective, and would like to be more than just a user, it should be worth checking out. (I meant to hype it a while ago, but was too busy).

Syndicated 2008-07-19 21:00:48 from Move along, nothing to read

Crummy Stats on the Gentoo 2008.0 release

Ok, so this isn't a full one week period yet, but I'm going to be out tonight probably, so 8 hours ahead of time is close enough. These also don't account for anybody who went and picked a specific mirror manually. I could do a much better job, but this is just a quick scrape of the numbers. There are many pitfalls in them, so they are more for interest than serious statistics.

Downloads by bouncer product (no arch breakdown)

gentoo-2008.0-livecd (x86,amd64)             72518
gentoo-2008.0-minimal                        26543
gentoo-2008.0-universal (hppa,ppc,sparc64)    2925
gentoo-2008.0-packagecd (sparc64)              385

'completed' from torrent tracker

livecd-i686-installer-2008.0-r1      975
livecd-i686-installer-2008.0         867
install-x86-minimal-2008.0           681
livecd-amd64-installer-2008.0-r1     451
livecd-amd64-installer-2008.0        373
install-amd64-minimal-2008.0         353
install-powerpc-universal-2008.0      69
install-powerpc-minimal-2008.0        61
install-alpha-minimal-2008.0          48
install-ia64-minimal-2008.0           46
install-hppa-universal-2008.0         42
install-sparc64-universal-2008.0      29
packages-sparc64-2008.0               28
install-sparc64-minimal-2008.0        28
install-hppa-minimal-2008.0           28

www node traffic

The two machines that exclusively serve the www.gentoo.org vhost normally do 6-9GiB/day in HTTP traffic. On the day of the release they jumped to 21GiB, and on the second day they did 14GiB.

Syndicated 2008-07-12 23:34:40 from Move along, nothing to read

Tree-signing in Gentoo and recent research into Package Manager Security

So on Slashdot today, there was a link to the latest research into package manager security. Specifically, their focus was on defeating signed packages by use of malicious mirrors and replay attacks of signed content. A malicious mirror can also record the source of client requests, and possibly withhold specific security updates (by serving an older tree that doesn't contain them).

This plays into some of my long-ongoing tree-signing research in Gentoo. The GLEPs with the exception of 02 and 03 have been mailed to the GLEP editors as well as the portage-dev mailing list, and will be going to the gentoo-dev mailing list after the GLEP editors have reviewed them.

For dealing with the new issues raised by Cappos et al, we at Gentoo are really lucky to have our own infra-maintained, hardened rotation of mirrors at rsync://rsync.gentoo.org/, in addition to the community mirrors at rsync://rsync$N.$CC.gentoo.org/. Nobody using just the infra-maintained mirrors would be vulnerable to the new attacks described by Cappos (barring MITM attacks), but those using a community-maintained mirror could be.

Using the main mirrors for the new signing work will enable us to deliver the new MetaManifests reliably via our own infrastructure, even when the user has a community mirror for their actual tree content. The actual changes to the GLEP for this weren't very big at all: just a timestamp header inside the signed area, plus distributing the MetaManifests via a trusted medium.
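To show why a timestamp inside the signed area helps against replay, here is a minimal sketch of a client-side freshness check. It is not the format from the GLEP drafts: the `TIMESTAMP` header name, the ISO-style date format and the one-day `max_age` are all assumptions for the example, and it presumes the signature over the body has already been verified elsewhere.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_timestamp(signed_body: str) -> Optional[datetime]:
    """Find a 'TIMESTAMP <UTC time>' line (assumed format) in the signed body."""
    for line in signed_body.splitlines():
        if line.startswith("TIMESTAMP "):
            value = line.split(" ", 1)[1].strip()
            parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
            return parsed.replace(tzinfo=timezone.utc)
    return None


def is_fresh(signed_body: str, now: datetime,
             max_age: timedelta = timedelta(days=1)) -> bool:
    """Reject content with no timestamp, a future timestamp, or one that is too old."""
    stamp = parse_timestamp(signed_body)
    if stamp is None:
        return False
    age = now - stamp
    return timedelta(0) <= age <= max_age
```

A mirror replaying an old but validly signed tree would fail this check once the signed timestamp is older than `max_age`, even though the signature itself still verifies.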

As a minor side note on the infra-maintained rsync.gentoo.org rotation, this would be a good time to consider sponsoring a box to Gentoo for that purpose. Each of the 5 existing boxes in the rotation does 50-65GiB of traffic every day, averaging up to about 6.5Mbit/sec over a 24-hour period. These boxes are bandwidth, memory and CPU intensive, however they don't hit disk very hard (we serve the trees directly from memory). The requirements are 4GiB RAM, 2+ 64-bit processors (single core or dual core is fine), and ~16GiB of disk (optional: software RAID1 is nice for avoiding downtime, and fancy fast disks aren't needed). We need a serial console or KVM to install it securely: you just boot the box to a livecd and get the access details to infra, and we install it from there with our own stage4 tarball that links into cfengine. The machine continues to be owned by the sponsor, in your data centre.
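For anybody wanting to check the bandwidth figure, the conversion from GiB per day to an average bit rate is simple enough to sketch. The function name is just for this example, and it assumes Mbit means 10^6 bits.

```python
GIB = 2 ** 30                    # bytes in one GiB
SECONDS_PER_DAY = 24 * 60 * 60


def average_mbit_per_sec(gib_per_day: float) -> float:
    """Average rate in Mbit/sec (10^6 bits) for a daily volume given in GiB."""
    bits_per_day = gib_per_day * GIB * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


for volume in (50, 65):
    print(f"{volume} GiB/day is about {average_mbit_per_sec(volume):.2f} Mbit/sec")
```

This works out to roughly 5.0 Mbit/sec at the low end and roughly 6.5 Mbit/sec at the high end, so the 6.5Mbit/sec figure corresponds to the busier days.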

Syndicated 2008-07-11 01:59:02 from Move along, nothing to read

Gentoo infra staff over the recent history of Gentoo

So while working on a cleanup of machines, I looked at the history of the infra pages in CVS, and I noticed that infra has had a lot of developers over the years.

I'm probably missing a few who never made it to the list or predated it, but I think it's a good start. I've also listed what they did, either from the webpage or from memory; again, apologies if I got it wrong.

I'd like to thank all of those who put work into infra in the past, but have since retired from Gentoo:

  • Alex Howells (astinus) - Mirrors, DNS
  • Jeffrey Forman (jforman) - Mirrors, DNS, Bugzilla, sysadmin, lots
  • Andrea Barisani (lcars) - Lists, LDAP, mail, sysadmin, lots
  • Kyle England (kengland) - sysadmin, cfengine
  • Lars Weiler (pylon) - CVS, SVN, overlays
  • Robert Coie (rac) - Forums, DBA
  • Jon Portnoy (avenj) - Mirrors
  • Sascha Schwabbauer (cybersystem) - Mail, Jabber
  • Tim Haynes (piglet) - Mirrors
  • Corey Shields (cshields) - LOTS
  • Rob Holland (tigger) - sysadmin
  • Benjamin Coles (sj7trunks) - Bugzilla, sysadmin
  • Michael Cummings (mcummings) - sysadmin
  • David Olsen (lude) - Mirrors
  • Albert Hopkins (marduk) - packages.g.o
  • Luca Mercuri (siggy) - www
  • Andrew D. Fant (jfmuggs) - backups, www
  • Curtis Napier (curtis119) - www, torrents
  • ???? (little_bob) - nagios

I'm not forgetting our current infra team; I hope to do a followup about them sometime soon too.

Syndicated 2008-07-05 01:53:42 from Move along, nothing to read
