Older blog entries for robbat2 (starting at number 13)

I'm a mac... vs. *NIX

Many thanks to [info]logik for this work of brilliance. Posted with permission, and slightly reformatted here.

A stoner, takes a puff of his joint and says, "Hi, I'm a mac!".
The poorly dressed wannabe bank teller beside him says, "... and I'm a PC."

The door nearby blows in and a heavily armed tactical team storms the room,
throwing both of them to the floor, barrels of MP5k's against their skulls.

Someone yells, "AREA CLEAR!"
The lieutenant comes in after them, smoking a cigar, surveying the area.
"I'm Solaris,
the sergeant over there is BSD (You remember your daddy mac?),
the pretty boy with the M14, he's Linux,
and the guy toting the M60... That there is HPUX.
Now, shut the fuck up, both of you.
We've had about enough of your 'Bill and Ted Get a Computer' bullshit.
Keep it up, and we're gonna do the same thing to you that we did to OS2, got it?"

Syndicated 2008-11-25 09:29:13 from Move along, nothing to read

Gentoo recruiting randomness

As a recent random time-waster, I went and read all of the bugs in the "Recruitment" product of the Gentoo Bugzilla. In doing so, I found twelve developers (ebuild or other) that weren't listed in our LDAP or historical tracking at all. I added them back now, I have gentoo-core announcements from when several of them joined as well that I double-checked.

The "lost" developers
  • pihta - bug 20756
  • ct - bug 22211
  • srcerer - bug 23184 (retire date approximate)
  • fede2 - bug 25464
  • vlaci - bug 31795
  • teval - bug 36753
  • mccabemt - bug 43029
  • rip7 - bug 46353
  • twk-b - bug 53723
  • dj-submerge - bug 57051
  • little_bob - bug 69742
  • ruth - bug 70469
Other LDAP changes from my review:
  • svyatogor - bug 20756 - updated join date for original docs work, he had commit rights two years before his previously stated join date
  • archaelus - bug 30835 - data fixup
  • apokorny - bug 70188 - add join date
Further plans:

There are 92 developers without join dates. We need to find join dates for them via BugZilla and CVS/SVN. Also audit all join dates for every other developer. Lastly, discover and capture retirement dates for every past developer.

Present statistics: 673 developers total. 247 active, 426 retired.

Syndicated 2008-11-24 09:05:57 from Move along, nothing to read

long-term ccache statistics for a portage-dedicated instance

Migrating data and cleaning up my old desktop display head machine, I decided to check out my ccache statistics. This is a very old cache, having first started 2006-01-13. The oldest item in the present cache is 2008-01-12, but the statistics are valid for the entire period. hits 229k and 834k misses = approximately 21% hit rate. This wasn't any crazy repeated compiling of my own code, just a dedicated ccache directory for Portage to use.

cache hit                         228637
cache miss                        834113
called for link                   100293
multiple source files                526
compile failed                     20645
ccache internal error                 14
preprocessor error                 12425
cache file missing                     9
bad compiler arguments                 1
not a C/C++ file                   39097
autoconf compile/link             183802
unsupported compiler option        34481
no input file                      96690
files in cache                    204344
cache size                           1.8 Gbytes
max cache size                       2.0 Gbytes

Syndicated 2008-09-14 07:53:25 from Move along, nothing to read

Linux MD RAID devices and moving spares to missing slots

Setting up the storage on my new machine, I just ran into something really interesting, what seems to be deliberate usable and useful, but completely undocumented functionality in the MD RAID layer.

It's possible to create RAID devices with the initial array having 'missing' slots, and then add the devices for those missing slots later. RAID1 lets you have one or more, RAID5 only one, RAID6 one or two, RAID10 up to half of the total. That functionality is documented in both the Documentation/md.txt of the kernel, as well as the manpage for mdadm.

What isn't documented is when you later add devices, how to get them to take up the 'missing' slots, rather than remain as spares. Nothing in md(7), mdadm(8), or Documentation/md.txt. Nothing I tried with mdadm could do it either, leaving only the sysfs interface for the RAID device.

Documentation/md.txt does describe the sysfs interface in detail, but seems to have some omissions and outdated material - the code has moved on, but the documentation hasn't caught up yet.

So, below the jump, I present my small HOWTO on creating a RAID10 with missing devices and how to later add them properly.

MD with missing devices HOWTO

We're going to create /dev/md10 as a RAID10, starting with two missing devices. In the example here, I use 4 loopback devices of 512MiB each: /dev/loop[1-4], but you should just substitute your real devices.

# mdadm --create /dev/md10 --level 10 -n 4 /dev/loop1 missing /dev/loop3 missing -x 0
mdadm: array /dev/md10 started.
# cat /proc/mdstat 
Personalities : [raid1] [raid10] [raid0] [raid6] [raid5] [raid4] 
md10 : active raid10 loop3[2] loop1[0]
      1048448 blocks 64K chunks 2 near-copies [4/2] [U_U_]
# mdadm --manage --add /dev/md10 /dev/loop2 /dev/loop4
mdadm: added /dev/loop2
mdadm: added /dev/loop4
# cat /proc/mdstat 
Personalities : [raid1] [raid10] [raid0] [raid6] [raid5] [raid4] 
md10 : active raid10 loop4[4](S) loop2[5](S) loop3[2] loop1[0]
      1048448 blocks 64K chunks 2 near-copies [4/2] [U_U_]

Now notice that the two new devices have been added as spares [denoted by the "(S)"], and that the array remains degraded [denoted by the underscores in the "[U_U_]"]. Now it's time to break out the sysfs interface.

# cd /sys/block/md10/md/
# grep . dev-loop*/{slot,state}
dev-loop1/slot:0
dev-loop2/slot:none
dev-loop3/slot:2
dev-loop4/slot:none
dev-loop1/state:in_sync
dev-loop2/state:spare
dev-loop3/state:in_sync
dev-loop4/state:spare

Now a short foray into explaining how MD-raid sees component devices. For an array with N devices total, there are slots numbered from 0 to N-1. If all the devices are present, there are no empty slots. The presence or absence of a device in a slot is noted by the display from /proc/mdstat: [U_U_]. That shows we have a devices in slots 0 and 2, and nothing in slots 1 and 3. The mdstat output does include slot numbers after each device in the listing line: md10 : active raid10 loop4[4](S) loop2[5](S) loop3[2] loop1[0]. loop4 and loop2 are in slots 4 and 5, both spare. loop3 and loop1 are in slots 0 and 2. The slot numbers that are greater than the device numbers seem to be extraneous, I'm not sure if they are just an mdadm abstraction, or in the kernel internals only.

Now we want to fix up the array. We want to promote both spares to the missing slots. This is the first item that Documentation/md.txt is really wrong it. The description for the slot sysfs node contains: "This can only be set while assembling an array." This is actually wrong, we CAN write to it and fix our array.

# echo 1 >dev-loop2/slot
# echo 3 >dev-loop4/slot
# grep . dev-loop*/slot
dev-loop1/slot:0
dev-loop2/slot:1
dev-loop3/slot:2
dev-loop4/slot:3
# cat /proc/mdstat
Personalities : [raid1] [raid10] [raid0] [raid6] [raid5] [raid4] 
md10 : active raid10 loop4[4] loop2[5] loop3[2] loop1[0]
      1048448 blocks 64K chunks 2 near-copies [4/2] [U_U_]

The slot numbers have changed in the mdstat output and the sysfs, but they no longer match at all. The spare marker "(S)" has also vanished. Now we can follow the sysfs docmentation, and force a rebuild using the sync_action node.

In theory, the mdadm daemon, if running, should have detected that the array was degraded and had valid spares, but I don't know why it didn't. Perhaps another bug to trace down later.

# echo repair >sync_action 
(wait a moment)
# cat /proc/mdstat
Personalities : [raid1] [raid10] [raid0] [raid6] [raid5] [raid4] 
md10 : active raid10 loop4[4] loop2[5] loop3[2] loop1[0]
      1048448 blocks 64K chunks 2 near-copies [4/2] [U_U_]
      [=============>.......]  recovery = 65.6% (344064/524224) finish=0.1min speed=22937K/sec

The slot numbers still aren't what we set them to, but the array is busy rebuilding still.

# cat /proc/mdstat 
Personalities : [raid1] [raid10] [raid0] [raid6] [raid5] [raid4] 
md10 : active raid10 loop4[3] loop2[1] loop3[2] loop1[0]
      1048448 blocks 64K chunks 2 near-copies [4/4] [UUUU]

Now that the rebuild is complete, the slot numbers have flipped to their correct values.

Bonus: regular maintenance ideas

While we can regularly check individual disks with the daemon part of smartmontools, issuing short and long disk tests, there is also a way to check entire arrays for consistency.

The only way of doing it with mdadm is to force a rebuild, but that isn't really a nice proposition if it picks a disk that was about to fail as one of the 'good' disks. sysfs to the rescue again, there is a non-destructive way to test an array, and only promote to repair mode if there is an issue.

# echo check >sync_action 
(wait a moment)
# cat /proc/mdstat
Personalities : [raid1] [raid10] [raid0] [raid6] [raid5] [raid4] 
md10 : active raid10 loop4[3] loop2[1] loop3[2] loop1[0]
      1048448 blocks 64K chunks 2 near-copies [4/4] [UUUU]
      [============>........]  check = 62.8% (660224/1048448) finish=0.0min speed=110037K/sec

Either make a cronjob to do it, or put the functionality in mdadm. You can safely issue the check command to multiple md devices at once, the kernel will ensure that it doesn't check array that share the same disks.

Syndicated 2008-09-08 01:15:39 from Move along, nothing to read

Apparently non-existent, but quite real parts: Analog Devices AD2000B

Edit 2008/09/16:

Code fixed now, no specs available yet See my patches here.

Edit 2008/09/05:

A private source that I inquired of indicates that the AD2000B part was only a special run of the AD1989B part. There shouldn't be any functional differences. On the side of a spec sheet, the AD1989B specs should be available "shortly" from Analog Devices.

Original posting:

So in more details to follow, I picked up hardware for a new workstation to replace my G5. The only part of the hardware that isn't working yet, is the digital audio (SPDIF/Toslink) output. My motherboard is an Asus P5Q-Premium, and the specifications claim to have "ADI® AD2000B 8-Channel High Definition Audio CODEC" as the audio chip. This chip is apparently the successor to the AD1988B chip. The analog audio part works fine, just that I use optical to overcome an interference issue on the run between my computers and my actual working area of my desk (with a small digital decoder and stereo speakers).

Digging around in the ALSA drivers, it just seems I need to find a different set of controls to toggle the digital lines to be outputs or enabled - and that this data would be in the public datasheet, just like previous versions of the chip. I submitted a technical request to Asus a few days ago, with no response yet. I also contacted Analog Devices directly. Their customer support referred me to their application engineers, whom I phoned, and they then proceeded to deny the existence of the chip, and I quote: "It's not in my system, we don't manufacture it." That's really interesting, because I've got it on my motherboard!

Either the divisions of Analog Devices aren't talking, or Asus is using chips from a 3rd party that's ripping off Analog Device's trademark amongst other things.

Here's the text off the chip:

AD2000BX
14??793.1
#0816 0.3
SINGAPORE

I tried to take a photo, but it's really annoying and hard to read, without dis-assembling my machine, which I'd prefer not to do at this point.

However, I did find another photo on the web, of the same area from a review of the motherboard. The Analog Devices logo is also clearly visible after the 'BX' portion of the text. From the photo I could make out:

AD2000BX
1383055.1
#0808 0.2
SINGAPORE

If I had to make a guess about it, the chip is AD2000BX, the second line is the serial number, the third is the year and week of manufacturer, plus the revision of the chip, and the last line is the manufacture location.

If you're from Asus or Analog Devices, and you're reading this, where's the datasheet for the chip? Is it a real ADI part? I simply want the public datasheet like the rest of models so that I can fix digital audio output in Linux myself, and contribute it back to the ALSA project.

P.S. The upstream ALSA bug is here. There's no downstream Gentoo bug.

Syndicated 2008-09-03 21:25:10 from Move along, nothing to read

Jeeves IRC replacement now alive - Willikins

This is a copy+paste from my email to the gentoo-dev mailing list, simply because some developers and users follow the RSS feeds rather than read email. If you want the bot in your channel and you are a channel founder/lead op, please respond on the thread in the mailing list

Hi folks,

Sorry that it's taken this long to get completed, but the Jeeves
replacement, Willikins, is finally 99% done, and ready to join lots of
channels.

Getting the bot out there
-------------------------
If you would like to have the new bot in your #gentoo-* channel, would
each channel founder/leader please respond to this thread, stating the
channel name, and that they are the contact for any problems/troubles.

Bug reports
-----------
Please open a bug in the Gentoo Infrastructure product, using the
'Other' component, and assign it directly to me.

Custom bot functionality:
-------------------------
Here's all the functionality that we have assembled, beyond the standard
rbot stuff.
Bugzilla
========
!bug [ZILLA] ID
Looks up bug #ID in the per-channel default or specified bugzilla.

!bugstats [ZILLA]
Totals of bugs per the bugzilla 'status' field.

!archstats [ZILLA] [STATUS] [RESO]
Totals of bugs per architecture, optionally with some specific set of
status or resolution values, comma delimited.

status = OPEN, DONE, UNCONFIRMED,NEW,ASSIGNED,REOPENED, RESOLVED, VERIFIED, CLOSED
Reso = FIXED, INVALID, WONTFIX, LATER, REMIND, DUPLICATE, WORKSFORME,
       CANTFIX, NEEDINFO, TEST-REQUEST, UPSTREAM
zilla = gentoo xine sourcemage redhat mozilla kernel fdo abisource
        apache kde gnome
If you want another bugzilla, file a bug.

Gentoo-specific
===============
!meta [-v] [CAT/]PACKAGE
Print the metadata and optionally herd members for a given package.

!changelog [CAT/]PACKAGE
Changelog stats for a package

!devaway list
List all away developers.

!devaway DEVNAME
Display .away message for a single developer.

!herd HERD
Show herd members

!expn NAME
Show the expansion of any public Gentoo mail alias

!glsa GLSAID
Shows the title and external IDS for any given GLSA ID.

!earch [CAT/]PACKAGE
Earch output for a given package

!rdep [CAT/]PACKAGE
Reverse RDEPEND for a given package

!ddep
Reverse DEPEND for a given package

What isn't supported yet
------------------------
1. !glsa -s TEXT
This used to search for GLSAs that matched that string in their title or
external IDS.

2. New bug announcements
Jeeves used to announce brand new bugs to #gentoo-bugs as well as
targeted channels or users, depending on the product, component,
assignee, cc and a number of other factors (deeply nested if/else
trees). The old implementation had this in code entirely, and it would
be nice to avoid having to modify the code whatsoever, and instead have
some domain-specific language for doing this.

Source availability
-------------------
Gentoo specific:
http://git.overlays.gentoo.org/gitweb/?p=proj/rbot-gentoo.git
Bugzilla support:
http://git.overlays.gentoo.org/gitweb/?p=proj/rbot-bugzilla.git
(flameeyes has his own tree as well, but he's been sick lately, so it
was lagging behind my development)

Right now, if you want to run your own instance of the bot, you will
need the latest Git tree of the rBot itself, as upstream only fixed the
last remaining issue a couple of hours ago.

Thanks to
---------
solar:
Running the old Jeeves Eggdrop till now, and helping to document all of
the Eggdrop functionality we used.

flameeyes:
Bugzilla plugin development

halcy0n:
Gentoo-specific stuff

tango_, jsn-:
(rbot upstream developers) For fixing the bugs as I found them :-).

Syndicated 2008-08-06 21:30:36 from Move along, nothing to read

SSH ControlMaster for Gentoo CVS

Cardoe was complaining that repeatedly hitting the Gentoo CVS server was too slow, and it turned out he wasn't using SSH ControlMaster at all. Other developers have blogged about it before, but here is a quick reminder how.

Without ControlMaster, running "time ssh robbat2@cvs.gentoo.org w" shows a turnaround of 1.9 seconds. With ControlMaster, It's more in the range of 0.07-0.09 seconds :-).

~/.ssh/config:
Host master-cvs.gentoo.org
    HostName cvs.gentoo.org
    User robbat2
    ControlMaster yes
    ControlPath ~/.ssh/master-%l-%h-%p-%r.sock
Host cvs.gentoo.org
    ControlMaster no 
    ControlPath ~/.ssh/master-%l-%h-%p-%r.sock
    BatchMode yes
Setup Usage:
ssh -f -n -N master-cvs.gentoo.org

Now just do anything like you would normally. For security, you should probably close the ControlMaster session if you're going away from your machine for a long time. It would be nice to detect the loss of the ControlMaster and re-initiate it always at the start of a sequence.

Syndicated 2008-08-05 22:03:21 from Move along, nothing to read

OLS Day -1: Wireless mini-summit

On Tuesday for OLS2008, I attended the wireless mini-summit. In past years, fellow Gentoo developer dsd has attended, and was remember by some of the attendees. I'm not so much involved with wireless stuff these days, but I have a good grasp of it from back when I worked at Net-Conex doing point-to-point links using Airaya WirelessGRID 802.11a gear doing 108Mbit w/ AES256, plus personal experimentation.

The vendors (Intel, Marvel, Broadcom, Atheros, Ralink, Nokia and others) and major distributions (Fedora/RH, Ubuntu, Debian, Suse) were present, but I was the only attendee from the smaller distros. Also present were some of the other core wireless developers, incl. Johannes Berg.

Most of the talk focused on 802.11 stack and driver issues, with a presentation about WiMax from Intel.

One of the really interesting things was the work from Luis R. Rodriguez, on the new Central Regulatory Domain Agent (CRDA). There were some large questions from the Intel crowd about API and interaction, but the general concept was very well received. The support for signing the domain file is probably going to not be used for the most part, as there are too many other places to subvert usage of the data even if the file is signed. 802.11d and 802.11h are mostly considered as useless as apparently no regulatory agencies have signed off on them.

Another interesting discussion came out of the discussion on power management. Stuff on the usage of the CARRIER interface flag. It's apparently quite inconsistent, and the UP/DOWN status on some wireless devices has large implications. Some devices go totally away on DOWN, and need firmware loaded on UP. In some, the power consumption in reset state prior to loading firmware is lower than any other powered-down state. Multiple power levels may be added later to try and allow devices to define what states are best/available for their power saving. Implications of firmware loss and DOWN state on associations to APs, esp. when some parts of WPA are in play. This all also sucks with some DHCP clients as they perform DOWN on release or failure, which loses the firmware - such behavior from userspace really needs to be stamped out.

For lunch, we went to a buffet resturant, Tuckers. I was a little dubious of this at first, as buffet is really not my thing, however I can say that it was quite decent, esp. their roast beef carvery, with some nice whole-grained mustard. The salads weren't so great, but overall I think I'd eat there again in a group if there was sufficient group demand.

On the way out of lunch, I ran into a cute girl (I'll call her A) with a rubber-spiked laptop bag, and started chatting to her. As she was an Ottawa resident, she was prepared with an umbrella for the torrential summer rain that started during lunch. Sharing her umbrella we returned to the hotel conference rooms, splitting up thereafter as she was in the virtualization mini-summit.

Post-lunch, resuming the wireless mini-summit, we discussed more issues about the CRDA, core mac80211 development, and then breakout sessions on power-management and ??? (I can't remember what the other side was, even though I was in it).

For dinner, I took a clear walk out via parliament, and a very long way, full route. Ended up at "Elgin Street Freehouse" for dinner, had an Indian-fusion twist on steak, and did manage to find virgin Mojitos successfully. Nice 5km walk for exploring.

Syndicated 2008-07-28 18:47:13 from Move along, nothing to read

OLS2008 - "Issues in Linux Mirroring: Or, BitTorrent Considered Harmful"

As one talk I was really interested in, I went to John Hawley's talk entitled "Issues in Linux Mirroring: Or, BitTorrent Considered Harmful", as seen from the perspective of the kernel.org mirrors.

This paper was really interesting for me, both as the Gentoo releng infra liasion (I get the bits from releng onto the mirrors), as well as working for IsoHunt, since he was complaining about BitTorrent.

Before the actual material about BitTorrent, he had some harsh words about distributions and space usage, and the lack of co-ordination. Having multiple major distributions doing their releases in the same week really only hurt themselves, because the mirrors get saturated by users. Between two major distros, they use up fully half of the 5.5TiB at kernel.org, and having them doing new material at the same time just blows out the cache, even with stupid amounts of memory. (Comments were made about Mark Shuttleworth having the money to buy some boxes with TiB of RAM for kernel.org). Co-ordination between distributions is needed to resolve this issue, and the audience discussion suggested we should try the distributions@freedesktop list first, and if that's too much noise, start up a list at kernel.org instead.

Moving onto BitTorrent, he noted that in large Linux torrent swarms, the standard tracker balancing algorithms end up with a net effect that a few slow peers joining greatly slow down the swarm speed at present (based on analysis of the tracker used by Fedora for the F8 release). If mirror are performing seeding, in many cases, it will still be faster for the mirror to provide content for a given user than other client peers. If the objective is to move content as fast as possible, this is needed vs. the normal BT objective of balancing total bandwidth usage.

Issues for distributions in handling bittorrent to make life easy for mirrors, he had several complaints about the level of manual interaction needed, to which I responded with the Gentoo structure of symlink trees under experimental, which is used for mirrors to run torrents easily, as well as powering the HTTP seeding additions to the BitTorrent protocol.

In using rtorrent(libtorrent), he complained that it wasn't using sendfile at all, which had a large negative performance impact, should be tackled upstream.

The BitTorrent community also needs to look at tweaking the peer decision protocol in the announce protocols, to hand out a smarter selection of fast peers. Where fast is local (look at BGP looking-glass for clues) or is a designated fast mirror that should be used as a fast peer.

Lastly, he noted that the trackers seem to be badly run, as somebody from isoHunt, I offered to post up my own work on running effective trackers to the inter-distro discussion.

Syndicated 2008-07-26 19:34:19 from Move along, nothing to read

4 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!