23 Jan 2001 (updated 23 Jan 2001 at 18:32 UTC)
Cool. I got the Sphinx-II "continuous audio" module working
with select(2) in Perl. So I can get rid of this
horrible bit of code in my speech I/O framework, since I
don't have to fork off a blocking process to do speech
recognition anymore:
# ARGH ... POE has messed with %SIG
@SIG{keys %SIG} = ('DEFAULT') x keys %SIG;
The Sphinx-II "non-blocking" utterance processing interface
is kind of broken... it processes only a single frame of
data at a time, which is way less than the amount typically
available to read from the audio device, and there's no
explicit function to flush unprocessed frames (though you
can just call uttproc_rawdata() with an empty
buffer and blocking turned on :-). Fortunately, recognition is so
much faster than real time that there's absolutely no reason
to use non-blocking mode, even in a single-threaded program.
Once the 0.3 release happens I will volunteer to take a
hacksaw to all the redundant and poorly-designed interfaces
in Sphinx-II, fix them up, and properly document them.
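
For illustration, here is a rough sketch of the single-threaded pattern described above: wait on the audio handle, read whatever is available, and pass the whole buffer to one blocking recognizer call. It is not the actual framework code. process_block() is a hypothetical stand-in for a blocking call such as uttproc_rawdata() and only counts samples. A pipe stands in for the audio device so the sketch runs anywhere, and the 16-bit sample size is an assumption.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical stand-in for a blocking recognizer call; it only counts
# the samples it is handed (assuming 16-bit samples).
my $samples_seen = 0;
sub process_block {
    my ($raw) = @_;
    $samples_seen += length($raw) / 2;
}

# Wait up to $timeout seconds for $fh to become readable, then read
# whatever is available and hand it over in a single call.
sub pump_audio {
    my ($fh, $timeout) = @_;
    my @ready = IO::Select->new($fh)->can_read($timeout);
    return 0 unless @ready;
    my $n = sysread($fh, my $buf, 65536);
    return 0 unless $n;          # undef on error, 0 on end of file
    process_block($buf);
    return $n;
}

# A pipe stands in for the audio device.
pipe(my $reader, my $writer) or die "pipe: $!";
syswrite($writer, pack('s*', (0) x 4096));   # 4096 silent samples

my $bytes = pump_audio($reader, 1);
print "read $bytes bytes, $samples_seen samples\n";
```

Since the recognizer call blocks until the whole buffer is processed, there are no leftover frames to flush between reads.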
In other news, I'm learning POE
incrementally ... I've just taken the first step towards
using it as more than just a convenient wrapper for
select(2). Namely, I've taken my random collection
of states handling the Festival server, Sphinx, audio I/O,
and "dialog management" (such as it is - currently this is
just "Hello World" and repeating back what the user says),
and split them up into multiple sessions. Soon I'll take a
stab at packaging them up into actual components.
And I just discovered that the ALSA emulation of the OSS
interface is not quite bug-compatible. In ALSA,
select(2) on PCM devices (including
/dev/dsp) works as expected. With the kernel OSS
drivers, you have to call read(2) on /dev/dsp before
you can select it for reading, and if you start writing to
it, even if your sound card is capable of full-duplex, you
will no longer be woken up on read. Total fucking brain
damage. Sigh...
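
A minimal sketch of the read-before-select workaround for the kernel OSS drivers, under some assumptions: wait_readable() is a hypothetical helper, the 4096-byte priming read is an arbitrary size, and a pipe again stands in for /dev/dsp so the example runs without a sound card. With ALSA's OSS emulation the priming read should not be needed, going by the behaviour described above.

```perl
use strict;
use warnings;
use IO::Select;

# Wait until $fh is readable. When $prime is true, do one sysread first,
# which is the workaround for drivers that will not report readability
# until a read has been issued. Returns the primed data (possibly empty)
# and the number of handles select reported as ready (0 or 1).
sub wait_readable {
    my ($fh, $prime, $timeout) = @_;
    my $primed = '';
    if ($prime) {
        defined(sysread($fh, $primed, 4096)) or die "sysread: $!";
    }
    my @ready = IO::Select->new($fh)->can_read($timeout);
    return ($primed, scalar @ready);
}

# Stand-in for the device. On a real system this would be something like
# open(my $dsp, '<', '/dev/dsp'), which is not attempted here.
pipe(my $reader, my $writer) or die "pipe: $!";
syswrite($writer, "\0" x 8192);

my ($primed, $ready) = wait_readable($reader, 1, 1);
printf "primed %d bytes, ready=%d\n", length($primed), $ready;
```

The primed data is real audio, so a caller would need to pass it on to the recognizer rather than throw it away.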