Regarding the article posted recently on next-generation
GUI
design...
I've been involved in MIT's small contribution to Gnome
development, mostly as a tester and commentator. When I talk to
people outside the project (and sometimes inside as well), I often
hear comments like "ah, so it's just like Windows, then?"
On a fundamental level, this is an apt description.
Though there
is a lot of technical innovation and tremendous leapfrogging
of
Microsoft's technology underway now, that development is
what you
could roughly call "incremental." The user interface is
fundamentally
limited to a keyboard and a pointing device.
PDA interfaces seem like a radical departure from the
traditional
PC desktop interface. But really, it's just a degenerate
case that
requires an extensive reworking of the same paradigm. Shift
emphasis
to the pointer, simplify and modularize interface components
so
they'll fit on a small device. It doesn't really change the
fundamental mode of interaction with the technology - point, tap,
nudge, poke.
I'm really happy to see this incremental improvement happening,
and it does take a lot of engineering skill and effort; it makes
a big difference at the end of the day.
We need something radically different for there to be a
real
"revolution" in GUI design. Or perhaps it's more of a
revolution
out of GUIs, and the original author of the article
is really
speaking of a mini-revolution that might be useful in the
existing
paradigm.
It seems to be quite obvious what the next step up from
GUIs should
be - voice and natural language interfaces. Humans have a
built-in
capacity for extremely high-bandwidth, highly expressive
communication
using natural language, especially speech. Natural language
interfaces have the potential to be much more user-friendly,
fluid,
and efficient than conventional GUIs. Alas, it's much, much
harder to
process English commands properly than it is to respond to
pokes and
prods and taps of labelled buttons.
Unfortunately, most applications demand higher performance from
NLP (Natural Language Processing) than current production systems
can deliver. But the technology is improving every day, helped
along by advances in hardware performance, by techniques for
managing complexity, and by the sheer brute-force accumulation of
working knowledge of the problem.
This process would, ironically, be helped along a great
deal if we
had better user interfaces with which to design our NLP
systems.
Along those lines, I have two particular projects in mind.
I'm
currently studying NLP at MIT, but I don't have plans for
grad school.
Maybe I'll end up in a position to actually do serious
research in
this area, but it's more likely it will be a hobby. Maybe
I'm
completely in left field and this will take centuries of
slow progress
to realize, maybe some company will come out with products
to do this
in the next 20 years, or maybe I'll invent something nifty
in my
copious free time. In any case, even the simplest of these
systems
can do fun and amazing things.
1.) Implementing an NL command and control
interface.
The sci-fi vision of a computer that responds to
natural-language
requests - "Computer, what is the time?" or "Computer, play
some
Bach." - is extremely captivating and represents an obvious
paradigm under which to create an extremely intuitive interface.
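Even the simplest version of this idea can be sketched today. Below is a minimal, hypothetical command-and-control dispatcher in Python: regex keyword matching stands in for real natural-language understanding, and the command patterns and responses are invented for illustration, not taken from any actual system.

```python
import re
from datetime import datetime

# A toy command-and-control dispatcher: each pattern maps a
# spoken-style request to an action.  Real NLP would replace these
# regexes; the commands here are illustrative assumptions.
COMMANDS = [
    (re.compile(r"what(?:'s| is) the time", re.I),
     lambda m: "The time is " + datetime.now().strftime("%H:%M")),
    (re.compile(r"play (?:some )?(\w+)", re.I),
     lambda m: "Playing music by " + m.group(1).capitalize()),
]

def handle(utterance):
    """Dispatch a natural-language request to the first matching action."""
    for pattern, action in COMMANDS:
        m = pattern.search(utterance)
        if m:
            return action(m)
    return "Sorry, I didn't understand that."

print(handle("Computer, play some Bach"))   # Playing music by Bach
print(handle("Computer, what is the time?"))
```

The interesting (and hard) part is everything this sketch omits: resolving ambiguity, handling paraphrase, and maintaining dialogue context rather than matching isolated keywords.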
2.) Implementing an NL programming interface.
It is tempting to try to equate machine languages with
natural
languages. (When asked what languages I speak, I often tell
people I
speak Perl fluently.) Of course, they are very different --
but both
are extremely powerful. It is certainly possible, for instance,
to describe, in English, what markup and style I want to use in
my HTML document, in addition to actually dictating the content
itself. Using
a sort of batch processing approach (and the solution to
problem #1)
one could in a sense translate the description into the
desired
document. (Of course, a more efficient and user-friendly
approach
might be to convey the desired markup through an interactive
dialog,
which adds another layer of complexity.)
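To make the batch-processing idea concrete, here is a toy translator from English-like markup directives to HTML. The directive vocabulary ("heading:", "paragraph:", and so on) is an assumption invented for this sketch; a real system would have to parse free-form English rather than a fixed grammar.

```python
# A toy "batch" translator from English-like directives to HTML.
# The directive names are illustrative assumptions, not a real grammar.
RULES = {
    "heading": lambda text: "<h1>%s</h1>" % text,
    "subheading": lambda text: "<h2>%s</h2>" % text,
    "paragraph": lambda text: "<p>%s</p>" % text,
    "emphasize": lambda text: "<em>%s</em>" % text,
}

def translate(description):
    """Turn lines like 'heading: My Title' into the corresponding markup."""
    html = []
    for line in description.strip().splitlines():
        directive, _, text = line.partition(":")
        rule = RULES.get(directive.strip().lower())
        html.append(rule(text.strip()) if rule else line.strip())
    return "\n".join(html)

doc = """
heading: On GUI Design
paragraph: Voice interfaces may be the next step.
"""
print(translate(doc))
```

The gap between this and the real goal is exactly the gap between a controlled vocabulary and natural language - which is why the interactive-dialog approach mentioned above adds a whole extra layer of complexity.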
But what about arbitrary code in a Turing-complete
language? Why
can't I write a description of exactly what I want my
program to do,
and have that "translated" from English into Perl? Assuming
this
requires a semantic representation of the desired
functionality, would
this not also make it possible to generate a
performance-optimized C
version, say? Would this not speed up the software
engineering
process quite considerably? Would it reduce debugging to a
process of
refining the program's specification, perhaps in an
interactive
dialog? A prerequisite to answering those questions would
be figuring
out at what level of abstraction the English specification
would need
to be. Can the user merely describe the desired mappings from
input to output, or perhaps the screens and buttons the user
sees, along with some notion of how they interconnect and what
functionality they represent?
Or would the programmer have to speak in pseudocode, at the
level of
functions and variables? How do knowledge representation
and
reasoning interplay with NLP in this application?