Older blog entries for robbat2 (starting at number 19)

Gentoo release statistics as of 2009/10/09 23h59 UTC

solar was asking about release statistics, so I grabbed the current data from Bouncer. The nearly 34k releases for 10.0 is just in the 5 days that it's been out. I included the various architetures that were part of each released 'product', to make some degree of comparision possible.

What Hits Arches
installcd-minimum 228561 alpha,amd64,hppa,ia64,ppc,ppc64,sparc64,x86
installcd-universal 374388 alpha,amd64,hppa,ppc,sparc64,x86
packagecd 162537 alpha,amd64,ppc,ppc64,sparc64,x86

livecd 242422 x86
minimal 287496 alpha,amd64,hppa,ia64,ppc,ppc64,sparc64,x86
packagecd 42572 amd64,ppc-g4,ppc-ppc,sparc64
packagecd-32ul 10909 ppc64
packagecd-64ul 2981 ppc64
universal 111359 alpha,amd64,hppa,ppc,ppc64,sparc64

livecd 307481 amd64,x86
minimal 330505 alpha,amd64,hppa,ia64,ppc,ppc64,sparc64,x86
packagecd 39118 ppc,ppc-g3,ppc-g4,ppc64,ppc64-g5
universal 122280 alpha,hppa,ppc,ppc64,sparc64

bt-http-seed 72980 ALL
livecd 411958 amd64,x86
minimal 496943 alpha,amd64,hppa,ia64,ppc,ppc64,sparc64,x86
packagecd 27593 ppc-g4,sparc64
universal 137554 hppa,ppc,ppc64,sparc64

livecd 19426 amd64,ppc64,x86
livedvd 4 amd64,x86
minimal 14069 alpha,amd64,hppa,ia64,ppc64,sparc64,x86
universal 1745 ppc64,sparc64

livecd 37771 amd64,x86
livedvd 17842 amd64,x86
minimal 55745 alpha,amd64,hppa,ia64,ppc,sparc64,x86
universal 3142 ppc,sparc64

livecd 477934 amd64,x86
minimal 406531 alpha,amd64,hppa,ia64,ppc,sparc64,x86
packagecd 12308 sparc64
universal 83600 hppa,ppc,sparc64

livedvd 4870 amd64,x86

livedvd 33703 amd64,x86

livedvd 0 amd64,x86

  • 2008.* has the LiveDVD's pulled from mirrors due to size complaints.
  • bt-http-seed was an (failed) experiment with a set of mirror URLs for trying to load-balance Bittorrent's HTTP seeding
  • Bouncer really needs replacing, but there's nothing really good to do so that I'm aware of. mod_sentry isn't nice. Other suggestions welcome. Should support products, architectures within products, seperate check/serve URLs, detailed hit recording for analysis.

Syndicated 2009-10-10 05:53:20 from Move along, nothing to read

Visualizing Gentoo profiles

To add a new USE flag, that's globally enabled for all Linux profiles, what's the minimum set of profiles that need to change? Deprecated profiles must be handled as well, for users that need to migrate still.

I ran into this today, while working on the USE=modules changes for linux-mod.eclass.

As an attempt to solve this, I munged up some GraphViz work to show profile inheritance, pictures as the end. Both sets have the trailing profiles "/desktop", "/developer", "/server" turned off for the 2008.0 and 10.0 releases, to cut down on the noise.

Graphs and script for download.

My answers as to which profiles:

  • default-linux
  • default/linux
  • base
  • embedded

Odd observations

  • Several Prefix profiles (linux/{amd64,ia64,x86} link to 2008.0 profiles explicitly instead of the generic architecture)
  • default/linux does not bring in base. Some profiles at a glance neglect this and might not have base brought in at all.
  • "embedded" is all alone in the tree.

Thumbnail of one graph

Question for any skilled GraphViz users:

If all nodes in a given subgroup/cluster have an edge going to a single destination node, is there any way to get graphviz to replace them with a single fat edge from cluster to destination node?

Syndicated 2009-09-21 10:31:00 from Move along, nothing to read

Heatwaves lead to hardware failures

So for our Vancouver heatwave (I noted 39C away from the water today, in the shade!), it's finally claimed some of my computer hardware. Most annoying, the battery backup unit (BBU) in the newer fileserver, and 1.5 of the disks of the RAID1 array in the old server...

My website and personal email will be offline for a day or two while I ensure my backups are up to date, and redeploy to the newer fileserver (after I buy a new BBU tomorrow).

Syndicated 2009-07-30 13:01:10 from Move along, nothing to read

new fortune-mod-gentoo-dev release

I really need to get back to writing in this blog. In the meantime, I scoured my email for the last 2 years of fortune submissions that I hadn't compiled together yet, and make a release. Go forth and amuse yourselves with it.

Syndicated 2009-03-05 11:27:22 from Move along, nothing to read

gentoo mirror stats: master distfiles distribution.

Now for the second set of statistics. These aren't directly useful to mirrors in estimating their traffic, but instead gives a good overview of how our mirroring setup works internally, and now much traffic is involved in the fan-out stage. Distfiles are the main content moved around by this system, but it is also used for the other directories for releases, experimental and snapshots.

A very quick overview of the existing setup:

  1. Developer uploads new distfile directly to dev.gentoo.org.
  2. The master-distfiles box pulls from dev.gentoo.org hourly.
  3. The master-distfiles box checks every ebuild, and downloads missing distfiles from their primary URI if they do not exist. The daily distfile report is also created at this point.
  4. Every hour, the cluster master of ftp.osuosl.org pulls the latest content from master-distfiles. (Averages 240MB/day of traffic).
  5. The OSL FTP cluster master (in Oregon) pushes to it's slave locations in Atlanta and Chicago.
  6. All distfiles mirrors pick up their content from one of the FTP nodes - Internet2-connected hosts are directed via DNS to an Internet2-connected slave for performance.

Each of the distfiles mirrors has about 140-160MB of upstream traffic every day (including both the new files and the rsync overhead for scanning). If there are no files changed, the rsync traffic for a directory scan is 1-2MB. While this isn't a lot of traffic, it's very spiky, as mirrors tend to be on fast links.

The new weekly builds from the Release Engineering team will probably be adding another 1.3GB per week, staggered as one arch per day.

I got a small subset of the logs from the OSU FTP cluster for processing some of these statistics. They cover the 24 hour period of 2008/08/07 UTC. It does not have data of which traffic went via Internet2, and I've grouped the sources by country code (using IP::Country::Fast from CPAN).

CC OutBytesCount, [Notes]
South America
AR 1498379141
BR 1498405221
== 299678436 2
AT 3202290562
BA 1498404221
BE 1464739661
BG 2199886072
CH 1496743121
CZ 8062803705
DE 149092997310
DK 2295154041
EE 1360037741
ES 4493037003
FI 1387115261
FR 7996356615
GB 3960190613
GR 4172227743 [1]
IS 1360037741
LV 1499118641
NL 4519136003
NO 1499088261
PL 6957242141
PT 2840207112
RO 3668540933
SE 4496643343
SK 1498405681
== 8683670590 55
AU 2974020902
JP 4493696853
KR 4509289423
RU 1972457562
SG 1356810941
TH 1358357761
TW 4927311704
== 2159194513 16
North America:
CA 7429692847
US 317491485824
== 3917884142 31
Middle East:
IL 1935272832
KW 1497725501
== 343299833 3

Grand Total:
== 15403727514 bytes 107

[1] One Greek mirror was excluded from the traffic and counts, as this was their catchup sync with 7Gb of traffic after some hardware-related downtime.

As a bit of analysis, I think that more than half of our mirrors (Europe, Middle East, RU) would benefit from having a box to sync against in Europe.

Syndicated 2008-12-16 22:50:31 from Move along, nothing to read

gentoo mirrors stats: a rsync.gentoo.org box

I was doing some statistics about Gentoo mirrors to see about future plans, and thought that the indirect crowd that read my blog via the various aggregators might be interested in numbers.

These are the traffic for boobie.gentoo.org, which is a newer box in the official rsync.gentoo.org box directly maintained by the Infrastructure team. Hardware specs are 2x Xeon 3050 @2.13Ghz, 4GB RAM. Disk is mostly irrelevant - the rsync workload is served purely from RAM (tail-packing reiserfs, backed via loop device pointing to a file on tmpfs).

Inbound traffic is spiky, but does not exceed 10Mbit by more than a little bit - we can the inbound rsyncs from the rsync1 master to 10Mbit. Outbound traffic varies between 4Mbit and 9Mbit, with an average around 6-7Mbit.

Date InBytes InBPS OutBytes OutBPS
2008-12-01 2451035341 28368 59523455410 688928
2008-12-02 2325176854 26911 54877643699 635157
2008-12-03 2167829249 25090 50850785431 588550
2008-12-04 2227342435 25779 50823673845 588236
2008-12-05 2182014214 25254 50558268814 585165
2008-12-06 2039468435 23604 47476164351 549492
2008-12-07 1906528455 22066 50327689263 582496
2008-12-08 2127792797 24627 52759944753 610647
2008-12-09 2327731419 26941 56661069093 655799
2008-12-10 2246262570 25998 52107127647 603091
2008-12-11 2302572673 26650 53602727876 620401
2008-12-12 2077185312 24041 47108235487 545234
2008-12-13 2162193709 25025 50807583749 588050
2008-12-14 1698766788 19661 43678479520 505537
2008-12-15 2370132609 27432 58353939353 675392

Syndicated 2008-12-16 21:37:02 from Move along, nothing to read

I'm a mac... vs. *NIX

Many thanks to [info]logik for this work of brilliance. Posted with permission, and slightly reformatted here.

A stoner, takes a puff of his joint and says, "Hi, I'm a mac!".
The poorly dressed wannabe bank teller beside him says, "... and I'm a PC."

The door nearby blows in and a heavily armed tactical team storms the room,
throwing both of them to the floor, barrels of MP5k's against their skulls.

Someone yells, "AREA CLEAR!"
The lieutenant comes in after them, smoking a cigar, surveying the area.
"I'm Solaris,
the sergeant over there is BSD (You remember your daddy mac?),
the pretty boy with the M14, he's Linux,
and the guy toting the M60... That there is HPUX.
Now, shut the fuck up, both of you.
We've had about enough of your 'Bill and Ted Get a Computer' bullshit.
Keep it up, and we're gonna do the same thing to you that we did to OS2, got it?"

Syndicated 2008-11-25 09:29:13 from Move along, nothing to read

Gentoo recruiting randomness

As a recent random time-waster, I went and read all of the bugs in the "Recruitment" product of the Gentoo Bugzilla. In doing so, I found twelve developers (ebuild or other) that weren't listed in our LDAP or historical tracking at all. I added them back now, I have gentoo-core announcements from when several of them joined as well that I double-checked.

The "lost" developers
  • pihta - bug 20756
  • ct - bug 22211
  • srcerer - bug 23184 (retire date approximate)
  • fede2 - bug 25464
  • vlaci - bug 31795
  • teval - bug 36753
  • mccabemt - bug 43029
  • rip7 - bug 46353
  • twk-b - bug 53723
  • dj-submerge - bug 57051
  • little_bob - bug 69742
  • ruth - bug 70469
Other LDAP changes from my review:
  • svyatogor - bug 20756 - updated join date for original docs work, he had commit rights two years before his previously stated join date
  • archaelus - bug 30835 - data fixup
  • apokorny - bug 70188 - add join date
Further plans:

There are 92 developers without join dates. We need to find join dates for them via BugZilla and CVS/SVN. Also audit all join dates for every other developer. Lastly, discover and capture retirement dates for every past developer.

Present statistics: 673 developers total. 247 active, 426 retired.

Syndicated 2008-11-24 09:05:57 from Move along, nothing to read

long-term ccache statistics for a portage-dedicated instance

Migrating data and cleaning up my old desktop display head machine, I decided to check out my ccache statistics. This is a very old cache, having first started 2006-01-13. The oldest item in the present cache is 2008-01-12, but the statistics are valid for the entire period. hits 229k and 834k misses = approximately 21% hit rate. This wasn't any crazy repeated compiling of my own code, just a dedicated ccache directory for Portage to use.

cache hit                         228637
cache miss                        834113
called for link                   100293
multiple source files                526
compile failed                     20645
ccache internal error                 14
preprocessor error                 12425
cache file missing                     9
bad compiler arguments                 1
not a C/C++ file                   39097
autoconf compile/link             183802
unsupported compiler option        34481
no input file                      96690
files in cache                    204344
cache size                           1.8 Gbytes
max cache size                       2.0 Gbytes

Syndicated 2008-09-14 07:53:25 from Move along, nothing to read

10 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!