Older blog entries for avriettea (starting at number 203)

the show goes on

I'm receiving ideas from myriad places containing the traditional wisdom of how to treat this sort of head injury. It's not a conventional injury, for one. It's got some features of damaged bits, but other parts seem to be fine. Still others are really worrisome, such as the intermittent hallucinations. It looks like I'll be doing inpatient stuff for a day or so (maybe 3-4) on an EKG and EEG, and getting my MRI redone. It would be so nice if they found something. Of course, "something" is going to be like "You have a four centimeter mass on your prefrontal cortex. While it's probably operable, there's a margin of morbidity."

Fortunately for all concerned, it's never lupus. So, we're safe there.

Syndicated 2007-08-01 23:41:00 (Updated 2007-08-01 23:52:40) from Alex J. Avriette

The iTunes disk

I've been ruminating on this iTunes disk notion. We already have this silly iDisk thing, which is a wrapper around WebDAV. At one point, we were able to mount ftp connections as volumes (although I think this has been turned off by default; it was a pain in the ass with the early copies of Safari, but I just don't use ftp anymore – or Safari, for that matter). Anyways, why not have an "iTunes Disk" functionality? It's really not that hard, as most of the code is written already. (It's frequently easier to scale existing code up and out than to start from scratch, though sometimes it isn't. I'd bet that the QA and vetting process for such a feature would take longer than the coding.)

Live preview and columns

The Finder has column display, so hierarchies are not hard to figure out. There's live previewing in the Finder already, both for small icons and "click-hovering" (in column display, tap the icon once, and it shows up enormous in the rightmost column – you can actually play m4p and m4b files from here, if iTunes has the right credentials!). All they'd have to do is make it pretty (making it simple/elegant is not very hard, as the users are already using the Finder, and the display in iTunes wouldn't change). The functionality is there.

Browsing through iTunes data, including rendering of data through iTunes tokens (so they're not decoupled)

What reason would they have for not extending this any further? There are other examples of this, such as the "burn folder" functionality. Furthermore, Microsoft has kind of taken the idea and run with it in Vista (and to a lesser extent, XP). There's a whole lot of stuff you can do in explorer.exe without really launching an application. Isn't this usually the other way around? I'm not saying Vista has a slicker-n-guano human interface, just that there are features that seem sensible to have which aren't in MacOS (to be fair, there's a lot of garbage I don't want in Windows).

What seems to be missing from the equation, though, is a performance-tuned backend. We've now hit twenty times the storage of the original iPod, and we're still using the same stuff. I wholeheartedly agree with the notion of having it be hacker-friendly, and XML is a reasonable way to do that. However, when you provide somebody with XML, they have to either write their own lexer/parser or grab an API from somebody. Why not, as a couple people suggested to me, use something like SQLite? I wouldn't be having these irritating "seek...seek...seek" times between book parts, gapless CDs, and so on. Seems to me that SQL is fairly agnostic as interfaces go, much as XML is an agnostic data exchange format. So what would be wrong with providing either APIs (I am sure they would make sure the snake is kitted out) or SQL hooks via their ODBC stuff, or even AppleShare hooks? If you really, really want XML, there's no reason you couldn't have a select_as_xml() function built into the API. I suppose you could probably spit out an iTunes library on an interval, or when things were changed (changes tend to be infrequent and come in groups), for applications that insist on frobbing the XML.
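To make the select_as_xml() idea a bit more concrete, here is a minimal sketch in perl. Everything in it is an assumption for illustration: the function body, the <tracks>/<track> element names, and the sample rows are made up, and the output is a flat XML document, not the real iTunes library format. The rows stand in for whatever a SQL query against the backend would return, and the column names are assumed to be valid XML element names.

```perl
use strict;
use warnings;

# Escape the five characters that are special in XML text.
sub xml_escape {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',  '<' => '&lt;',   '>' => '&gt;',
        '"' => '&quot;', "'" => '&apos;',
    );
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Hypothetical select_as_xml(): render a list of rows (hash refs, one per
# track) as a flat XML document. Column names become element names.
sub select_as_xml {
    my (@rows) = @_;
    my $xml = "<tracks>\n";
    for my $row (@rows) {
        $xml .= "  <track>\n";
        for my $column (sort keys %$row) {
            $xml .= sprintf "    <%s>%s</%s>\n",
                $column, xml_escape($row->{$column}), $column;
        }
        $xml .= "  </track>\n";
    }
    return $xml . "</tracks>\n";
}

# Invented sample rows standing in for a query result.
my @rows = (
    { name => 'Time',         album => 'Dark Side of the Moon' },
    { name => 'Example Song', genre => 'R&B' },
);
print select_as_xml(@rows);
```

The point is only that XML becomes one more output format of the query layer, so applications that insist on XML keep working while everything else talks SQL.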

As I was mulling all this over, it occurred to me that it's not unlike the NeXT vision (that is, plugins to the Finder, rather than heavy applications), and of course OS X is full of NeXT relics. So why wouldn't they have done this already? I could write software that generated a library for iTunes to read, but I have no idea how to handle writes, which of course the software would have to do.

Syndicated 2007-07-31 15:37:00 (Updated 2007-07-31 16:43:57) from Alex J. Avriette

Getting serious about media


Apple has been courting the music devotee for quite some time. While some of us may think this goes back to, say, the first generation iPod, it of course goes much further. Step back even further and you see AltiVec, which was essentially a means to render floats on the proc (because back then, graphics cards weren't the hyperactive reality-engine-in-a-chip that they are today). Further yet, we have the 'av' models (660av, 840av, 6100/8100av, etc), which were the company's first efforts at producing a real graphics machine for consumers (the other choice for the 'prosumer' market was spending a couple hundred grand on a deskside machine). We can even extrapolate and point out that QuickTime was a step in this direction, as well (and I had it on my SE/30, to give you an idea of how far back this goes).

And yet, Apple has no serious approach for storage. You can now get 2TB of storage in your Mac, built-to-order. In fact, as soon as the deal with Dell/Alienware expires, Hitachi will be selling you 1TB drives for your Mac, pushing that number up to 4TB, in the chassis alone.

What about the Xserve RAID? The problem you'll encounter here is precisely the same problem you'll have with 4TB in your tower. Because Apple is taking a bottom-up approach, adapting consumer hardware to professional uses, they run into ugly issues like the OS (or the controller) sleeping the drives when they're not in use – even when they're part of a volume group. I have in front of me five 250GB FireWire 800 drives. Unfortunately, even if I were to make a RAID 5 out of them, I expose myself to substantial risk of data corruption. I could instead go with RAID 0, but the problem there is of course that whatever risk I mitigate by switching from 5 to 0 is offset by the increased risk of failure, since RAID 0 has no redundancy at all.

The other problem that bothers me is the absolute, glaring failure of Apple to actually support the two-percent (you could call this hyperparetotic if you like, although it might be more applicable to ask Benford for a reacharound) folks in the media market. Because their product lines encourage people to expand their storage needs at a rate much faster than the rest of the consumer market, and because the baseline of what they consider normal keeps rising (Apple's selling 80GB iPods, and I reckon we'll make it to 100 or 120GB before Apple changes the form factor in some way), that hyper-extended 2% (it becomes much more dilute if we extend out to the traditional 80/20 Pareto principle) will be consuming seemingly exponentially larger storage real estate, and is going to need novel ways to manage it. How plausible is this? Well, they've managed to get most of us carrying around accelerometers. How far away could it be that we begin to understand logical versus physical volumes and volume groups? (the bad news: iSCSI is already taken, they'll need a new product name)

We wouldn't need novel forms of storage if everyone understood how to manage an FCAL loop or could trouble themselves to memorize what RAID levels 0, 1, 5, 10, and 15 are. But, we do need that technology, and Apple is in a unique place to provide it to a segment of the market that doesn't have a problem spending a thousand or two more for a laptop, every year.

Let me change directions a bit here. It's very unusual to find a database that is greater in size than a terabyte. Further, the data contained therein is generally smaller than its footprint by a factor of six to ten. So it's fair to say that for most individuals – indeed most organizations – their data footprint is smaller than 250GB, and probably smaller than 100GB. However, when running through indices for data that large, when we want subsecond response times, most organizations that are serious about data (Oracle, SGI, and RHAT, for example) realize that the filesystem very much gets in the way.

Consider this. When I moved all my iTunes data off of the primary filesystem and into a logical partition therein, I essentially said to the operating system that I didn't care too much about the niceties of file systems. Instead, I wanted portability, scalability, and containment. But why not add to that performance? What are the needs of your average iTunes user? It seems like an obvious answer at first, but really, it's quite complicated.

  • Performance (prefetch; no gaps between gapless tracks or video segments)
  • Redundancy (durability; with "thousands of songs" in my pocket, at $1-$20 per each, I don't want them to disappear)
  • Containment (protection from commingling; Sandy's media should remain separate from mine)
  • Portability (the ability to move media from one machine or device to another; not necessarily the ability to duplicate or "share")
  • Scalability (reformatting or upgrading devices is inherently dangerous; I would like the ability to simply add storage when I run out of room, be it with a physical or logical device, or expanding a logical device)

But we don't have anything that addresses even one of these items from Apple. My current inventory looks to be about 30,000 music items (this includes iTunes U and music videos), 200 movies, 250 TV items, 300 podcasts, 200 book items (misleading, as most of them are split into 3-5 pieces). All told, it's about 300GB (incidentally, the library is about 3 million [logical] words, or about 60MB of XML... more on this in a minute). What I'm getting at is that all of these things have different needs.

Colons? Never! We're running Unix under the hood!

Apple has given us the ability to denote where a library lives. That's half the code you need to support multiple libraries. Of course, from that point, you're two-thirds of the way to defining kinds of libraries in multiple locations. I could specify ten gigs for gapless music, a hundred gigs for television, a hundred gigs for "normal" music, and twenty-five gigs for low-quality audio like iTunes U and podcasts. Because the hardware requirements of all four of those types of media are different, why not specify them in different places (e.g., on different physical volumes)? Moreover, why not specify them as separate libraries so that I could take parts with me and leave others at home? Do I really need to keep two hundred movies on me? Probably not. Of course, I can come up with smart playlists, but I can't manage them manually without spending substantial time on just that. It becomes necessary to have the software understand them in some way. Even extremely rudimentary functionality in this regard would be a significant improvement.

The other thing here is the filesystem getting in the way. Oracle and other database vendors get past this problem by having what they call "raw mode." Essentially, the database owns the physical disk, rather than having the operating system format it and manage it. Why would you want to do this, right? The answer is simple. The database has its own set of users and advanced ACLs. Why does it need to have the operating system managing permissions on that disk? Just let "oracle" own it. Since the system administrator is making sure nobody's looking up Oracle's skirt, and Oracle is making sure that the data it's sending out is going to the right, vetted people, Oracle gets the benefit of not having to ask the filesystem for permission to do everything.

Consider the 4MB "song" versus the 750+MB movie or 350MB television show. Among songs, we have albums like Dark Side of the Moon, at 100MB, but with individual components ranging from 2MB to 20MB. DSOTM in particular is intended to be one single piece rather than ten individual pieces. So, when we're at 1.95MB on track A, we need to be reading .25MB into track B. This is governed by the filesystem. With RAIDs, we can set the "block size" to optimize this process. The notion is that for bigger files, we don't want to have to go back to the disk a bunch of times to read a file that's a gig in length. So we set the block size to very high numbers like 64MB or even higher. The corollary to this of course is that for very small files, we don't want to sit there waiting to read 64MB when the file itself is 1MB in length. In this case, we can set block sizes down as small as a few KB.
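The tradeoff is easy to put numbers on. The following is a back-of-the-envelope sketch, not a model of any real RAID controller: it assumes every trip to the disk fetches exactly one whole block, and the helper names (reads_needed, overread_bytes) are made up for the illustration. File sizes are the rough figures from above; the block sizes are the two extremes.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

my $KB = 1024;
my $MB = 1024 * $KB;

# Reads needed to pull in a whole file, assuming one block per trip to disk.
sub reads_needed {
    my ($file_bytes, $block_bytes) = @_;
    return ceil($file_bytes / $block_bytes);
}

# Bytes fetched past the end of the file because reads come in whole blocks.
sub overread_bytes {
    my ($file_bytes, $block_bytes) = @_;
    return reads_needed($file_bytes, $block_bytes) * $block_bytes - $file_bytes;
}

my @files  = ( [ song => 4 * $MB ], [ 'tv show' => 350 * $MB ], [ movie => 750 * $MB ] );
my @blocks = ( 4 * $KB, 64 * $MB );

for my $file (@files) {
    my ($label, $size) = @$file;
    for my $block (@blocks) {
        printf "%-8s %6d KB blocks: %6d reads, %5.1f MB over-read\n",
            $label, $block / $KB, reads_needed($size, $block),
            overread_bytes($size, $block) / $MB;
    }
}
```

Under those assumptions the 750MB movie takes 192,000 reads at 4KB blocks but only 12 at 64MB, while the 4MB song at 64MB blocks is a single read that drags 60MB of unwanted data along with it. That is the argument for putting different kinds of media on differently tuned volumes.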

Apple is in a unique position to help these customers out. First, Apple has been culturing a userbase (in the "imma growin me sum pleghs" sense of the word culture, not as in "high" culture) of people with enormous storage needs, who spend lots of money on their products, and whose storage needs are generally easily grouped into a few narrow categories. How hard would these things be to understand?

  • An "iTunes Disk" function in the preferences (or in Disk Utility.app), or "Let iTunes manage this disk".
  • A "multiple libraries" function under Advanced... .
  • Different types of media storage by location.
  • "Add volume to library" to extend the available logical space.
  • "Mirror my data here" for one-click redundancy.
  • iTunes Prosumer Edition (or iTunes Pro, or iTunes Enterprise Edition, etc).

So getting back to how this helps the consumer, and how it helps Apple, let me say that Apple's great strength is their ability to abstract away complicated ideas behind simple interfaces through the use of clever algorithms and other tricks of software and hardware. Of late, they've even had a problem bringing their employees up to speed, technically. Because they're not hiring PhD's to work at the local "genius bar," they have to not only explain technical concepts to their employees in a manner that is technical, but not too technical, but of course also explain those products in lay terms for their consumers.

It seems to me that Apple has the opportunity to mine their own customer base for new customers. That's pretty win-win. Of course, for every product Apple brings to market, they manage to fuck up two that are already available and abort three more for lack of management and vision.

Syndicated 2007-07-25 19:11:00 (Updated 2007-07-25 21:34:10) from Alex J. Avriette

quick update

if you're wondering, hey, wtf happened to alex? why hasn't he emailed/called/whatever? well, I'm not going to explain it in public, but now would be a good time to get in touch with me. basically, I've had some head trauma, and my memory is hosed (as far as I can tell) from May until now, and continues to be an issue going forward (that is, forming new memories). nominally, I augment my memory with the use of e-mail and live calendaring. however, we've also had some issues with our ISP, so I've been offline for a little while too.

anyways, I hate to be a burden, but go ahead and get in touch.

Syndicated 2007-07-24 21:43:00 (Updated 2007-07-24 21:47:24) from Alex J. Avriette

I'm not dead yet...

I am not dead. It's been a rough week or two here. Rough enough I got an EEG this morning (and a CT Friday). As far as I can tell, it looked pretty normal, but then, I'm not a neurologist. If you want additional details, you know how to reach me.

In other news, there seems to have been a spike in the rifle market. The Army is expecting to spend $16k on each of them. I mean, that number could include disposing of the busted ones, but the soldier's trained. So there's no salary or anything in there. Schoomaker actually said that they really needed to replace all their M16's with the M4 (which is really the M16A4 Carbine). Presumably the SDM-R will retain the longer barrel, and here's to hoping that the XM107 and the M1 are still in the field. There are just some things that a 5.56 can't do.

Syndicated 2007-07-09 23:17:00 (Updated 2007-07-09 23:37:26) from Alex J. Avriette

Video games

Since my Xbox 360 died, I haven't been much on video games. But we bought a used PS2 to play Katamari Damacy, and as an impulse buy, I picked up Final Fantasy X. Now, ages and ages ago, I wasted hours on this game, in its first or second iteration, on an original Game Boy. I find now, literally two decades later, that Square and the FF games are still the black hole of time and productivity they've always been. Ick.

Syndicated 2007-07-03 17:59:00 (Updated 2007-07-03 18:01:58) from Alex J. Avriette

TDMA status

Progress on Net::TDMA (although I suspect this is not the right place for it) was interrupted of late by paying customers. Paying for perl, even. Well... paying for data; I chose the perl part of it. I suppose I could have done all the munging and scraping in ruby, but I don't know enough about ruby's equivalent of LWP (or whether there is one). I also chose XML::Dumper for storing and transferring the data. This way, I figure, the customer has the data and can do with it what they want, or I can take the same data, xml2pl it, and stuff it into MySQL or something else sufficiently facile.

So, work on the TDMA stuff resumes today, along with interviews.

Syndicated 2007-07-02 12:36:00 (Updated 2007-07-02 12:45:24) from Alex J. Avriette

Learned a new perl trick today

So I was parsing html, which is always kind of an icky job. But perl has this great regex engine I can employ to do the parsing for me. The problem with the regex engine is it's very difficult to debug a bad expression. I remember being confounded for hours by them in the past.


@bottle{qw{ upc_code year name varietal size }} = $bottling =~ m{
(\d+)</a></td> # this is the UPC code
<td>([^<]+)</td> # this is the year
<td>([^<]+)</td> # this should be the name
<td>([^<]+)</td> # this is the varietal
<td>([^<]+)</td> # bottle size
}x;

Luckily, perl gives us the /x modifier to regexes. So in this case you can see a very simple expression, but I'm sure you can imagine much more complicated expressions. If we want to see where the expression is broken, we can just do this:


@bottle{qw{ upc_code year name varietal size }} = $bottling =~ m{
(\d+)</a></td> # this is the UPC code
# <td>([^<]+)</td> # this is the year
# <td>([^<]+)</td> # this should be the name
# <td>([^<]+)</td> # this is the varietal
# <td>([^<]+)</td> # bottle size
}x;


and run it each time, opening up another little piece of the expression each time. This way we can "walk" down the expression finding where we goofed. In the case above, there was just a line that needed a \s* (which is easy to forget about when using /x!).
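For the curious, here is roughly what the repaired expression looks like when the cells sit on separate lines. This is a sketch under assumptions: the sample row (markup and values alike) is invented, and I've put a \s* after every cell rather than only on the one line that actually needed it. The reason it's needed at all is that under /x literal whitespace in the pattern is ignored, so a newline between two cells in the pattern matches nothing in the data.

```perl
use strict;
use warnings;

# Invented sample row; the real page's markup is not shown in this post.
my $bottling = <<'HTML';
<td><a href="#">012345678905</a></td>
<td>2004</td>
<td>Example Cellars</td>
<td>Zinfandel</td>
<td>750ml</td>
HTML

my %bottle;
@bottle{qw{ upc_code year name varietal size }} = $bottling =~ m{
    (\d+)</a></td>    \s*  # this is the UPC code
    <td>([^<]+)</td>  \s*  # this is the year
    <td>([^<]+)</td>  \s*  # this should be the name
    <td>([^<]+)</td>  \s*  # this is the varietal
    <td>([^<]+)</td>       # bottle size
}x;

print "$_: $bottle{$_}\n" for qw{ upc_code year name varietal size };
```

Take any one of the \s* out and the whole match fails on this input, which leaves every slot in the hash slice undefined. That's exactly the kind of breakage the comment-out-and-walk trick finds quickly.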

Syndicated 2007-06-30 21:00:00 (Updated 2007-06-30 21:16:40) from Alex J. Avriette

Just for the record

Trusting companies to do anything decent is a bad idea. Their entire purpose is to make you do as much work as humanly possible while simultaneously paying you the least amount they can before you will leave or otherwise stop working.

I can't believe I did it again. Why did I forget that? This time, we got really burned. This time it wasn't just fucking around with my income, it was affecting my marriage, my friends, and even my cars. So I come home to everything being a complete wreck after a month of being locked in a cell and working sixty+ hour weeks. Now I get to put my life back in order, one piece at a time.

Syndicated 2007-06-28 19:12:00 (Updated 2007-06-28 19:18:49) from Alex J. Avriette

Internet backup solutions


An internet backup provider, whose name rhymes with "posie" (the 'pocket full of posies' line in the children's nursery rhyme refers to the scabs associated with the black plague, so it fits), has spammed this site.

That doesn't really bother me. The first one was actually helpful, but was posted from a misleading address. It suggested that I use this posie service to back up my stuff. Well, that's fine for a gig, or a few hundred megs, or whatever. But what we're talking about here is the transport of several hundreds of gigs of data across the fancy interweb to posie.

This might work if I had SDSL still, and had a T1 to the house. This would also work if I took my personal computers to work and used the giant pipe at work to upload my data to posie. But I don't have a T1 or an OC3. I have cable – that I am borrowing from a neighbor and is thus intermittent. So that means I get about 600kbyte/s down, but we're locked at 384kbit/s (note I said bits) on the way out.
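To put a number on it, here is a quick sketch of the arithmetic. The assumptions are mine: roughly 300GB to move (counted as decimal gigabytes), an uplink of 384 kilobits per second, a link that never drops, and zero protocol overhead. The upload_days helper is made up for the purpose.

```perl
use strict;
use warnings;

# Days needed to push a given number of bytes through an uplink,
# assuming a perfectly steady link and no protocol overhead.
sub upload_days {
    my ($bytes, $bits_per_second) = @_;
    my $seconds = $bytes * 8 / $bits_per_second;
    return $seconds / 86_400;
}

my $library_bytes = 300e9;     # assumed: about 300GB of media
my $uplink_bps    = 384_000;   # assumed: 384 kbit/s upstream

printf "%.1f days\n", upload_days($library_bytes, $uplink_bps);
```

That comes out to a bit over 72 days of continuous uploading for the first backup, on a borrowed connection that is anything but continuous.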

But the marketing drone doesn't really understand these sorts of things. First, don't spam. Second, don't spam twice. Third, think and understand the situation before you start talking. Lastly, if you're going to use a pseudonym, like bonnie, for some marketing firm, please let me know that you are with a marketing firm so that I may become aware of the fact that you're trying to sell me something, and you might not always be telling the truth.

So, bonnie, yours are the first comments I've deleted in a long time. Unmoderated comments, I think, are the way to go, but it's shit like this that tempts me the other way.

By the by, I'm not going to link to the remote storage company, because that would contribute to their google page rank. My wife will also never recommend their service to a customer again (she works on the retail side of Apple).

The name of the advertising firm, however, for those who wish to blacklist or whichever, is Starline Marketing. These are the sorts of people I put on black lists for spam and the like. Perhaps google should implement procmail in blogger for comments (or mail for gmail, or...)

Syndicated 2007-06-27 04:21:00 (Updated 2007-06-27 04:45:46) from Alex J. Avriette
