Older blog entries for dgatwood (starting at number 7)

Phase 2 of the pitch correction software is complete. The correction table appears to be built correctly. I have a few more corrections to make in my pitch shift code: when shifting the pitch upwards, I either need to move the data into a separate buffer to avoid overwriting data that still needs to be read, or I need to walk the array in the opposite direction. I'll decide which is more efficient tonight and go with it.

Also, I still need to write or borrow code to rip the AIFF headers off the input file and write them to the output file.

Once those two things are fixed, this thing will be ready for its first test. I expect it to sound like crap, since this is a first cut, and the bulk of it was hacked together in a single evening... but it should be entertaining, anyway.

Future improvements:

1. Better handling of boundaries between transform blocks.
2. During the shift, use a weighted average of the two nearest array elements to generate the new value, instead of simply truncating the double to an integer index.
3. Make the transform block narrower to minimize aliasing and latency.
4. Rewrite main() to take arguments for the various parameters instead of passing hard-coded values to the subroutines.
5. Experiment with different FFT implementations to try to minimize latency.
6. Once it works suitably, rewrite as a VST plugin.
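
A minimal sketch of the separate-buffer shift with the weighted average from item 2, under some loud assumptions: the spectrum is treated as a plain array of real magnitudes (a real FFT output is complex), `ratio` is defined here as new frequency divided by old frequency, and each output slot reads from index `i / ratio`. The names `sample_lerp` and `shift_bins` are hypothetical, not taken from the actual program.

```c
#include <stddef.h>

/* Hypothetical helper: read a fractional position from an array by
 * blending the two nearest elements. Positions outside the array read 0. */
static double sample_lerp(const double *in, size_t n, double pos)
{
    if (n == 0 || pos < 0.0 || pos > (double)(n - 1))
        return 0.0;
    size_t lo = (size_t)pos;          /* index just below pos */
    double frac = pos - (double)lo;   /* weight of the upper neighbor */
    if (lo + 1 >= n)                  /* pos is exactly the last index */
        return in[lo];
    return (1.0 - frac) * in[lo] + frac * in[lo + 1];
}

/* Shift in[] by ratio (assumed: new frequency / old frequency) into a
 * separate out[], so no input slot is overwritten before it is read. */
static void shift_bins(const double *in, double *out, size_t n, double ratio)
{
    for (size_t i = 0; i < n; i++)
        out[i] = sample_lerp(in, n, (double)i / ratio);
}
```

Writing into a second buffer costs one extra array per block, but it sidesteps the question of which direction to walk the array for upward versus downward shifts.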

I've taken a break from kernel hacking to do something more pressing. I got annoyed with the cost of software for automatic vocal pitch correction (on the order of $350-$700). It's simple enough that there's no reason for it to be so expensive that only people who do this stuff professionally can afford it.

Don't get me wrong, I have no problem with paying for well-written software that's complex in nature. I'd pay $350 for an audio editing program. But paying $350 for a mere audio filter? I can buy a stand-alone hardware solution for $500, and unlike software, if it doesn't work, I have the right to send it back or sue. There's no way I'd pay that kind of money for a piece of filtering software, even if I -had- that much to spend on it (which I don't -- I'm doing audio recording entirely for fun in my spare time as a hobby).

To make matters worse, none of the companies appeared to have a Mac OS X version, making their software completely useless to me. I contacted one of them, and they suggested they might have one by the end of the year. Thanks, but no thanks. If I'm going to pay that kind of money for something that does so little, it darn well better have been Mac OS X native three months ago.

So, in desperation, last night, I pulled together FFT routines and fundamental frequency detection code, and I wrote a lot of glue code, including a first cut at an arbitrary frequency shifter. With the exception of generating the frequency table and actually doing the file I/O, I essentially have a first cut at the software in a little over three hours of coding, assuming my reasoning was sound and I didn't make any stupid mistakes in the math.

The basic idea is this:

1. Take a chunk of size 2^k for some k.
2. Detect the fundamental frequency.
3. Look in a table to determine the frequency of the nearest note.
4. Apply an FFT to the chunk.
5. For each frequency-domain array slot, multiply the array index by the new frequency and divide by the old to get the new index.
6. For each slot, copy the value stored at the new index in the input array into the slot specified by the old index in the output array.
7. Apply an inverse FFT to the chunk.

I'm also working on some clever tricks to avoid glitches at the FFT chunk boundaries with as little aliasing as possible, if my theories hold. The real trick is to build up the table. There are three parameters:

1. Base frequency (e.g. A = 440),
2. Temperament (equal tempered, C major, D minor, whatever), and
3. Compression ratio.

I'll probably only do equal temperament in the first cut to save time. The third parameter is perhaps the most interesting. The idea is to assume that the singer is pretty close to on-pitch, i.e. not a tone-deaf singer. Instead of hard-locking the voice to an exact pitch, it would instead move it "closer" to the right pitch.

Within a certain range of the target note, it will lock the voice to the right pitch. Beyond that range, the corrected pitch will move away from the note, slowly at first and then faster as the input gets closer to a semitone away, with the semitone itself being unaltered. The exact opposite happens as the input gets closer to the next note up or down.

The implications of this are that a slide, glissando, etc. will end up sounding more rapid, but should otherwise sound reasonable, despite the modifications. Similarly, intentional vibrato will be reduced, but generally not eliminated. The idea is to end up with a voice that doesn't sound like it has been artificially "forced" to pitch, i.e. with a certain amount of pitch variation, but that still sounds basically in tune.
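
One possible shape for the compression curve described above, as a sketch only. The quadratic ramp, the use of cents as the unit, and the names `compress_cents`, `lock`, and `half` are all assumptions chosen for illustration; the description only fixes the endpoints (locked near the note, unaltered at the boundary) and the "slow, then faster" behavior in between.

```c
#include <math.h>

/* Hypothetical compression curve. d is the detected deviation from the
 * nearest note, lock is the half-width of the hard-lock zone, and half is
 * the deviation at which the pitch is left unaltered (0 <= lock < half).
 * All values are in cents. Returns the deviation kept after correction. */
static double compress_cents(double d, double lock, double half)
{
    double a = fabs(d);
    if (a <= lock)
        return 0.0;                 /* inside the lock zone: snap to the note */
    if (a >= half)
        return d;                   /* at or past the boundary: unaltered */
    double t = (a - lock) / (half - lock);   /* 0..1 across the soft zone */
    return copysign(half * t * t, d);        /* slow start, faster near the edge */
}
```

With a lock zone of 10 cents and a boundary at 50 cents, a note that is 30 cents sharp would come out 12.5 cents sharp: still audibly moving, but pulled most of the way toward the target.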

Barring any nasty bugs, I expect to have a first cut (equal temperament only) finished this weekend at the latest. Wish me luck.

Well, I never got around to trying the updated code. Other projects (non-software) got in the way. I'm working on recording a song I wrote about a year ago. I'm basically done recording at this point except for patching a few places here and there, but I'm still heavily editing. That's occupying every evening until I finish it. Once that's done, I'll get back to the Nubus driver.

I recently encountered a handful of old patches for the MkLinux kernel that I don't think I have seen before, though a couple of them looked very familiar. We're due for a new kernel release and a new Linux server release ASAP, in part because of a nasty bug in the 2.0.xx kernel when dealing with OpenSSH's privilege separation.

I'm hoping I can get the Nubus framework in place and working before the next release, even if I don't have time to actually bring up any drivers in it. Of course, the drivers are the easy part. It only takes about an hour to port PCI ethernet drivers from NetBSD. I'd expect the Nubus drivers to be even easier. We'll see. :-)

Latest status... after talking with the NetBSD guys, I came up with a massive code restructuring for the offending function that cuts the number of accesses down from hundreds to... umm... four one-byte reads per slot. Should shave seconds from NetBSD's boot, and hopefully it will make the MkLinux version actually work. I'll try it this weekend.

The VM issues turned out to be nastier than I thought. It would wedge consistently with the hardware mapped. I then tried leaving the RAM backing that chunk of address space, and it -still- hung. Hmm. Changed it to allocate memory in a wired fashion and the hang disappeared. Turned back on the mapping of hardware addresses over that space and the hang came back.

Hmm. Rebuilt the kernel with a debug build so I could see where things were going wrong in the path through the VM system, since clearly something wasn't getting set up correctly in the page tables.

Odd, the debug kernel goes right through the problem spot, eventually generating a panic from calling certain VM routines without being at splvm. D'oh! Wrapped the code with s=splvm() and splx(s).

At last count, it correctly probes empty slots. When it tries to probe a full slot, as best I can tell, it's wedging whenever it hits the byte lane where the actual Nubus declaration ROM occurs. Probing different cards yields different, repeatable hangs.

What's odd is that it is successfully reading from a given address, then wedges on about the fifth or sixth access to that address. At this point, I'm starting to wonder if there's something wrong with the probe code port itself.

I've pinged the netbsd-mac68k mailing list to ask the developers if they've ever seen anything similar on the original platform. Failing that, I'm going to have to hand-compare the code to the original mac68k code, and possibly hand-compare it to the Linux nubus code as well to get a second perspective from code that is known to work on PPC machines to some extent.

Now getting back to drawing my comic strip for tomorrow....

For some reason, the memory mapping problem isn't behaving like it should. One of two things should have always happened: either a successful access or a machine check exception (which would be caught by the trap handler and my setjmp usage and treated as an unsuccessful probe). Instead, the kernel is just wedging. Strangest thing.
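
For reference, the expected probe behavior can be illustrated with a user-space simulation. This is not the kernel code: `fake_bus_read` is a hypothetical stand-in in which a NULL pointer plays the role of an address that raises a machine check, and the `longjmp` call plays the role of the trap handler unwinding back to the probe.

```c
#include <setjmp.h>
#include <stddef.h>

static jmp_buf probe_env;           /* where a faulting probe resumes */

/* Stand-in for a hardware read. In this simulation, NULL is the "bad"
 * address, and longjmp does what the trap handler would do. */
static unsigned char fake_bus_read(const unsigned char *addr)
{
    if (addr == NULL)
        longjmp(probe_env, 1);
    return *addr;
}

/* Returns 1 and stores the byte on success, 0 if the access "faulted". */
static int probe_byte(const unsigned char *addr, unsigned char *value)
{
    if (setjmp(probe_env) != 0)
        return 0;                   /* arrived here via longjmp: probe failed */
    *value = fake_bus_read(addr);
    return 1;
}
```

In this model there are only two outcomes, a returned byte or a failed probe. A hang corresponds to neither path being taken, which is why the wedge is so surprising.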

So my code was actually asking the VM system for a range without backing so that it wouldn't leak memory, but I'd never seen code written quite like that before. However, on further digging, the PCI code did it just with a straight allocation followed by a pmap_map. Tried using -that exact code- and got the same results.

At this point, it seems that either the hardware is in some way failing to generate the correct exception, the exception isn't getting handled (because of interrupts being off?), or the hardware isn't configured correctly for probing. I'm going to try reconfiguring BART and see if that helps.... :-|

Wow, I'd only had an account for all of... maybe five minutes when I got my first cert. Nice to know I'm certifiable.
:-D

As of last night, I finished writing the additional VM support routines for mapping and unmapping I/O addresses into the kernel's address space as needed for Nubus support in MkLinux.

In the process, I found that there's a really nasty bug in the pmap system (the bottom half of the VM system, i.e. the part that directly manages page tables and the processor's MMU) in MkLinux. Basically, the specs require that a sync instruction (or sync and tlbsync in 604/604e-based designs) be issued prior to leaving the critical section after a TLB flush. In the MkLinux implementation, the sync and tlbsync instructions occur AFTER releasing the lock, leaving the potential for serious corruption problems in the kernel on SMP systems.
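
The ordering issue can be shown with a toy trace. Everything here is a hypothetical stand-in: in a real pmap the four helpers would be a spinlock acquire and release plus the PowerPC `tlbie`, `tlbsync`, and `sync` instructions, and none of these function names come from the MkLinux source.

```c
#include <string.h>

static char trace[64];              /* records the order of operations */

static void step(const char *name)
{
    strcat(trace, name);
    strcat(trace, " ");
}

/* Hypothetical stand-ins for the lock and the TLB instructions. */
static void tlb_lock(void)       { step("lock"); }
static void tlb_unlock(void)     { step("unlock"); }
static void tlb_invalidate(void) { step("tlbie"); }
static void tlb_sync(void)       { step("tlbsync"); step("sync"); }

/* The ordering the specs call for: synchronize before leaving the
 * critical section, not after releasing the lock. */
static void flush_entry(void)
{
    trace[0] = '\0';
    tlb_lock();
    tlb_invalidate();
    tlb_sync();
    tlb_unlock();
}
```

The bug described above amounts to swapping the last two calls in `flush_entry`, which lets another processor enter the critical section before the first flush has completed.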

The code basically calls a Mach VM routine to allocate a chunk of virtual address space large enough for the region, then calls pmap routines directly to add the mappings. It then obtains locked access to the PTEs for the region, marks them as WIMG_IO, and invalidates the TLB entries as it goes. It's the most horrible thing I've ever had to do in my life, and I feel dirty for having written it, but the VM system doesn't allow you to freely map and unmap hardware into the kernel's address space.
