The Perfect User Interface

Posted 2 Feb 2001 at 11:00 UTC by maphew

The computer User Interface is something I spend a lot of time thinking about, and even more time cussing and fretting over. :) I have encountered a number of interfaces that I wish were more widely used. This document is my own small attempt to help that happen. I'm very interested in your thoughts on the matter.

The home page for this doc is There are a couple of graphics to break the layout. I'll warn you in advance, this is pretty Windows centric since that's where the bulk of my gui experience is. I've been using Unices for a couple of years, but mostly from remote terms. My home linux box doesn't have enough muscle to run X comfortably.

  Unlike what the title suggests, I don't know what the Perfect Interface is or might be. However I do have some distinct ideas of what it should include (in no particular order):

  • shortcut keys for all commands (where feasible)
  • direct dynamic customizability
  • collapsing defining/configuring modes and display/reporting modes
  • context sensitive menus
  • context sensitive position dependent keyboard shortcuts (analogous key arrays)
  • always hot command-line integrated with the gui
  • a facade proposal

  Shortcut keys for all commands, wherever possible and it makes sense. The best examples of this I have encountered are exemplified by the pixel magic programs, PhotoShop and The Gimp. In these programs all of the primary (and most of the secondary and tertiary) tools can be quickly activated with a simple keystroke, and not some tendon-stretching chorded combination either. A simple 'z' or 'b' to select the Zoom or paintBrush tools, not Alt-Z or Ctrl-Shift-B.

  PhotoShop is slightly superior (for the moment) in that it has temporary tool assignment. You can have a paint brush tool active, and if you push and hold the space bar, the active tool becomes the hand (for panning). The moment the space bar is released, the paint brush (or whatever the previous tool was) becomes active again. Alt while Zooming changes to Zoom Out, Ctrl to Zoom In. To be fair, Gimp also uses temporary tool assignment in places, but I picked PhotoShop as the prime example because, at this moment in time, it is more complete and far reaching throughout the program.
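  For the programmers in the audience, temporary tool assignment can be sketched in a few lines. This is only an illustration of the idea, not how PhotoShop or the Gimp actually implement it; the class name, the tool names, and the key mapping are all made up for the example.

```python
class ToolState:
    """Tracks the deliberately selected tool plus an optional temporary override."""

    def __init__(self, active="brush"):
        self.active = active    # the tool the user explicitly picked
        self.override = None    # a tool borrowed while a key is held down

    def key_down(self, key, temporary_tools):
        # Holding a mapped key borrows a tool without forgetting the old one.
        if key in temporary_tools:
            self.override = temporary_tools[key]

    def key_up(self, key, temporary_tools):
        # Releasing the same key hands control back to the previous tool.
        if temporary_tools.get(key) == self.override:
            self.override = None

    @property
    def current(self):
        return self.override or self.active


# Hypothetical mapping: hold the space bar to get the hand (pan) tool.
TEMPORARY_TOOLS = {"space": "hand"}

state = ToolState(active="brush")
state.key_down("space", TEMPORARY_TOOLS)
held = state.current        # the hand tool while space is down
state.key_up("space", TEMPORARY_TOOLS)
released = state.current    # back to the brush
print(held, released)
```

  The point of keeping `active` and `override` separate is that the program never has to remember "what was I using before?"; the previous tool simply was never replaced.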

  Gimp fans, don't despair, because it has something I've never seen in another program: dynamic key binding.  To assign a new keyboard shortcut to a command, open the menu as you normally would, select but don't activate the command, press the desired key, and voila! the task is done. No messing with config files, no wading through many layers of "customize this" dialogs, just direct action on the object you are modifying. If config files and layers of dialogs are your cup of tea, they're there too. It's glorious. The Gimp's dynamic key assignment is a perfect example you can see-right-now of the next element: collapsing interfaces.

  It is time for the programming paradigm of code > compile > results to fade to the background. Wherever possible we should be able to directly interact with our data. The symbolic image is a potter throwing away cumbersome utensils and placing their hands directly on the clay.

  An example of how C>C>R gets in the way is customizing the desktop. Currently the standard approach to adjust your colour scheme is to open a simulation desktop which does not resemble yours at all, select an element to change, pick the new colour, slap the apply button, move the fake model desktop out of the way, then open a few windows and see if you like the change. Most of the time the alteration is not quite what you envisioned and the process must be repeated several times.

  A better approach would be to enter configuration mode, have a colour and an editing tool palette open up, directly click on the characteristic to be modified, title bars for instance, and then select the new colour from the palette and apply it to the real desktop. For machines with sufficient hardware, most of them these days, skip the apply stage altogether and have realtime preview.

  Getting back to keyboards for a moment, there is a window manager for linux (whose name escapes me at the moment) which allows the keyboard to direct the mouse cursor. Slap a key to go into m-cursor control mode and arrow your way to freedom. Moving too slow? Hold the shift key and the cursor will gallop along. I hate having to abandon the keyboard just to click an icon and then come back to the keys. That being said, it sucks just as much having to move back to the keyboard for one stroke when a click or two would do.

  On to context sensitive menus. Sometimes I really think computer evolution is devolution. If an alt key modifies the menu choices, that fact should be displayed.

  To be sure, WordPerfect had something like this at least 3 years earlier, and who knows how old the idea really is. My main question here is, why didn't this incredibly useful convention cross over into the graphical user interface? I miss it.

  Terminate also had something else which was common in the era and has since fallen into disuse, a quick reference keystroke screen available at anytime and from anywhere.

  And before I finally allow Terminate to rest... a cool feature I still miss everyday: assigning usernames and passwords to shortcut keys. Logging in to Slashdot or Dejanews? Ctrl-F1, Enter. done. Sure Internet Explorer and Mozilla have 'remember this login' but you have to have already filled out the form, on that server, before you can use it. Actually, since the topic is the Perfect Interface, username and password is a bad idea anyway **link to useit** and shouldn't be there. Not that that matters since we all use cypherpunks, cypherpunks anyway right? Anyway the same feature could be adapted to any frequently filled field.

   By now you may have decided I'm a keyboard bigot. I'm not

   Dan Kaminzky's Analogous Key Arrays {Hmm, for some reason Dan appears to have password locked his site, or, he hasn't paid his netpedia bill, or something. Anybody know a mirror?} Now this feature I haven't actually tried yet, but I've read the description and I love the concept. Basically you always use the same shortcut keys, the numeric keypad for instance, but what they do depends on what feature set is active or what mode you are in. And, you don't have to remember them all because the gui buttons give you the context. If the toolbar is a 3x3 matrix, the center button will always be activated by typing 5, the center left by 4, etc. Use the -/+ keys (or whatever) to move the button group in focus up or down. Sorry I can't be more descriptive, it's been a while since I read the document.
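   Since I can't point you at the original document, here is a rough sketch of how I understand the keypad-to-toolbar mapping. It assumes the usual keypad layout (7-8-9 on the top row, 1-2-3 on the bottom); the `KeyArray` class and the button names are invented for illustration and are not taken from Dan's description.

```python
def keypad_to_cell(digit):
    """Map a keypad digit 1-9 to (row, col) in a 3x3 grid, with row 0 at the top."""
    if not 1 <= digit <= 9:
        raise ValueError("keypad digit must be between 1 and 9")
    row = 2 - (digit - 1) // 3   # 7-8-9 is the top row on a keypad
    col = (digit - 1) % 3
    return row, col


class KeyArray:
    """A stack of 3x3 button groups; one group has keyboard focus at a time."""

    def __init__(self, groups):
        self.groups = groups
        self.focus = 0

    def move_focus(self, step):
        # Clamp so that "+" and "-" stop at the first and last group.
        self.focus = max(0, min(len(self.groups) - 1, self.focus + step))

    def press(self, digit):
        row, col = keypad_to_cell(digit)
        return self.groups[self.focus][row][col]


# Two made-up toolbars; the same keys mean different things in each.
draw = [["line", "arc", "circle"],
        ["move", "select", "copy"],
        ["undo", "zoom", "redo"]]
edit = [["cut", "paste", "trim"],
        ["rotate", "scale", "mirror"],
        ["erase", "extend", "join"]]

keys = KeyArray([draw, edit])
first = keys.press(5)    # center button of the draw group
keys.move_focus(+1)      # the "+" key moves focus to the next group
second = keys.press(5)   # center button of the edit group
left = keys.press(4)     # center-left button of the edit group
print(first, second, left)
```

   The keys never change; only the group in focus does, so the on-screen buttons act as the legend for whatever the keypad currently means.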

   One of my all time favourite interfaces is sported by the various versions of AutoCAD going back at least to 1987 (when I first encountered it). AutoCAD, being a drafting program, is inherently graphical and its graphic user interface was in use a long time before Microsoft jumped on the bandwagon. And yet, not only did AutoCAD have a command line, it was essential. You could get around without it, pointing and clicking at layered menus, but not very well. Now the shoe is on the other foot, the command line is there, but unless you have a strong keyboard predisposition you're not as likely to use it.

  Some other programs have adopted an integrated command line, but most of the ones I've seen have missed one of the most important elements - the principle of being always hot. Any keypress, excepting an Alt chord, is automatically sent to the command line. If you have to grab the mouse and click on the command area in order to type you negate most of the reason for having a command line in the first place.

  Another important element of the AutoCAD command line is that any (well almost any) gui command is echoed there. This means it can be used as a learning device. Can't remember the name of that command? Find its toolbar button, push it, and you're off doing whatever you want to do and you can see the text-mode command you could use next time as an alternate route.
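  The echo behaviour falls out almost for free if toolbar buttons and typed commands go through the same dispatcher. The following is a minimal sketch under that assumption; it is not AutoCAD's actual design, and the class and command names are hypothetical.

```python
class CommandLine:
    """A single dispatcher shared by the keyboard and the gui."""

    def __init__(self):
        self.commands = {}   # command name -> function to call
        self.history = []    # everything echoed in the command area

    def register(self, name, func):
        self.commands[name] = func

    def run(self, name):
        # Echo first, so the user sees the text-mode name of what just ran.
        self.history.append(name)
        return self.commands[name]()


class ToolbarButton:
    """A gui button that knows only the name of the command it triggers."""

    def __init__(self, command_line, command_name):
        self.command_line = command_line
        self.command_name = command_name

    def click(self):
        # Clicking is just another way of typing the command.
        return self.command_line.run(self.command_name)


cli = CommandLine()
cli.register("line", lambda: "drawing a line")
cli.register("zoom", lambda: "zooming")

button = ToolbarButton(cli, "zoom")
clicked = button.click()   # gui route
typed = cli.run("line")    # keyboard route
print(cli.history)
```

  Because the button holds a command name rather than its own private handler, anything you can click you can also type, and the history doubles as the learning device described above.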

  Now the realism part. Everybody's an expert, there are N thousand Perfect User Interface proposals **google link**, and nobody really likes anybody else's. Also there are thousands of legacy applications which are in use everyday but nobody is working on anymore. So here's my solution: The Facade. Graphical Skins are wonderfully popular, you can even buy them for your cell phone. I'd like to take the idea a little further, don't stop with changing the program's look and feel, carry right on into changing how it works.

  The Facade, as I envision it, is a small lightweight program which stores your personal preferences and customizations and applies them to any program you use (at your discretion of course). Found yourself having to use yet another graphics program which hasn't seen the light of one stroke tool changes? No problem, have Facade intercept your keyboard commands and translate them into what the program expects. Don't want to look at their inefficient and incomplete menu layout? Have Facade cover their toolbar with yours.

   It will take a lot of work to build the translation table for any particular program and nobody will want to do it more than once. This means the translation tables would have to be sharable among users to spread the work around. The shared tables could be called, you guessed it, faces. And since you are going to all the trouble of fixing that dumb program anyway, you might as well cover that face with a skin, right? And since all the developers reading this will see the light and want the perfect ui we need an API so the face and skin can be tightly integrated and bonded with the program. Might as well call that the skull, eh?
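   To make the translation table idea a little more concrete, here is a sketch of what a sharable face might look like. No Facade exists yet, so everything here is an assumption: the JSON layout, the program name "ExamplePaint", and the function names are invented for illustration. Actually intercepting keystrokes is platform specific and is left out.

```python
import json

# A hypothetical face: my one-stroke keys on the left, what the program expects on the right.
FACE_TEXT = """
{
  "program": "ExamplePaint",
  "keys": {"z": "Alt-Z", "b": "Ctrl-Shift-B"}
}
"""


def load_face(text):
    """Parse a shared face and return (program name, key translation table)."""
    face = json.loads(text)
    return face["program"], face["keys"]


def translate(key, table):
    # Keys with no entry pass through untouched, so a partial face is still usable.
    return table.get(key, key)


program, table = load_face(FACE_TEXT)
zoom = translate("z", table)     # remapped to the program's chord
other = translate("q", table)    # not in the face, passed through as typed
print(program, zoom, other)
```

   Storing faces as plain text is what makes them easy to pass around: one person does the tedious mapping for a program and everybody else just downloads the file.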

  Feedback on how to improve this essay is welcome, a working facade, more so. :)

Keyboard controls mouse is standard XFree stuff, posted 2 Feb 2001 at 14:35 UTC by Tv » (Journeyer)

Try this: ctrl-shift-numlock (you hear a quick beep-beep), now use the keypad keys. Tapping shift while scrolling makes it speed up, 5 is mouse1 etc. Press c-s-numlock again to deactivate.

good ideas, but nothing new, posted 2 Feb 2001 at 20:05 UTC by dutky » (Journeyer)

Much of what you say is fairly good, but is also mostly covered by current research and writing in Human Computer Interfaces and user interface design. It also seems that you are not familiar with much of the existing work on HCI and interface design. I would suggest that you check out Bruce Tognazzini's web page and browse through the user interface books in your local large/technical bookstore.

Shortcut Keys

Yes, but I actually like those "tendon-stretching chorded combination" shortcuts. I think that the use of some meta-keys to represent sweeping mode changes is well established (e.g. Shift-manipulate for multiple select or constrained draw/move/modify) and usefully extends the shortcut vocabulary for specific applications. (Which is to say that I don't think the system standard shortcuts should use chorded combinations. Applications should be free to define the behavior of chorded meta-key combinations.)

Direct Dynamic Customization (DDC?)

I've never used the Gimp for any substantial work, but this feature sounds both intriguing and infuriating. If I had to put up with this feature I would like a way to turn it off completely, once I have defined my shortcuts, if only to prevent the feature from tripping me up in daily use. I would also like to see a good management interface for the entire list of defined shortcuts, since there is a complex issue of shortcut redefinition, which is not adequately handled in any current UI with which I am familiar.

Collapsing Modes

This is called progressive disclosure among HCI folk and user interface designers. I'm a bit confused by how you connect this with coding and compilation, however. It seems that what you really want is more immediate feedback than most customization tools currently provide. Better feedback is also a well established trope in HCI/UI design.

Context Sensitive Menus

The main complaint against context sensitive menus is that they are completely invisible to beginning users, and are thus fairly mysterious. Clearly, however, any battle that may have been fought against contextual menus has been lost. The mysteriousness of contextual menus is increased when entirely different menus appear depending upon what your mouse pointer is positioned over when you summon the menu. To a user who doesn't clearly understand the application's metaphors, the reason for the different contextual menus can be pretty unclear.

Context Sensitive Shortcuts

If contextual menus are confusing, contextual shortcuts are doubly so. I'd want to be able to turn this feature off on any program I had to use.

Hot Command Lines

I've never understood why people liked a CAD package whose primary interface was the keyboard, but I'm an old Mac-guy. There were some development tools from Apple that did similar things (Apple unix for the Mac II and the old Macintosh Programmer's Workshop environment) and I never thought they were all that useful.

the Facade proposal

I really like parts of this idea. I would like to see a layer between the user interface and the rest of the program logic that would be editable by end users. This would basically be a scripting language in which the entire UI would be written. The UI script would determine the UI's appearance and behavior, and would generate a consistent set of messages that would be sent to the program logic back end. Users could edit the UI scripts in any desired manner, and the back end logic would be none the wiser that anything had changed. Some of this was possible with programs written using NeXT's interface builder, but I have never seen this in any other commercial offering.

Ultimately, what I really want in a UI is twofold:

First, I want a nice, consistent UI on programs I have never used before. Most common tasks should be performed in the same way as other programs with which I am already familiar (having standard open file, save new file, page setup, and print dialogs, along with standardized behaviors for menus, text edit boxes, etc. is a really good start) and any really new types of tasks should try to model the ways of doing the new task after ways of doing similar tasks in other programs (I should not have to operate scroll bars via left and right mouse buttons in one program but then have little arrows and page regions in another).

Second, I would like to be able to edit the user interface to my tastes, once I am comfortable with the program and know how I use it on a regular basis. I should be able to move menu items and change shortcut keys to my heart's content. This also means that any UI feature that I don't like I should be able to turn off.

superficial changes?, posted 2 Feb 2001 at 21:07 UTC by RyanMuldoon » (Journeyer)

A lot of what you mention seems to not really solve many problems: my take on what you say is that we should have better keyboard shortcuts, and users should be able to change the UI to suit them better. To some extent, in the GNOME world, this can already happen. GTK is what has the dynamic keyboard shortcuts, not just the Gimp. Also, programs that do their UI with glade are already prepared for users to modify the UI (which is represented in XML) if they really want to. But this doesn't really change anything about HCI. You still have to be the one directing all the operations on your computer. I'm more interested in looking at applications of Information Agents - little daemons that can take care of things for me, and just give me reports if something important happens. The Anti-Mac Interface (which there was an advogato article about several months ago) has a lot of interesting ideas that break free of the normal WIMP UI. I am interested in seeing how we can be more expressive in communicating with our computers. Giving computers more information about the objects we manipulate should hopefully let them do some of the drudgery that we have to put up with. And, when we do actively try to perform some operation, the computer will be better able to help us. (A simple example: if all music files had more metadata, like including the mood, or date recorded, etc, we could ask our music player to play all "relaxing music"). The trick is to figure out ways to give the computer this extra information without being an overly burdensome task for the user. One possibility is having the computer pay attention to what files you use together, and build up relation graphs. So at the very least, you could open a file, and ask your computer for all similar files - it would then traverse the graph for what files are most used with that file. I'm sure more clever things could be done than this though.
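The relation graph idea can be sketched quite simply, assuming the computer already records which files were open together in each working session. How that record is collected is the hard part and is not shown; the file names and function names below are made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_graph(sessions):
    """Count how often each pair of files was used in the same session."""
    graph = defaultdict(Counter)
    for files in sessions:
        # Each unordered pair in a session strengthens the link both ways.
        for a, b in combinations(sorted(set(files)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def similar_files(graph, name, limit=3):
    """Return the files most often used together with `name`, strongest first."""
    return [other for other, _ in graph[name].most_common(limit)]


# Hypothetical usage log: one list of file names per working session.
sessions = [
    ["report.txt", "chart.png"],
    ["report.txt", "chart.png", "notes.txt"],
    ["notes.txt", "todo.txt"],
]

graph = build_graph(sessions)
related = similar_files(graph, "report.txt")
print(related)
```

This only looks at direct neighbours in the graph; walking further out (files used with the files used with this one) would be a natural next step.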

The Facade?!, posted 2 Feb 2001 at 23:55 UTC by Malx » (Journeyer)

We have it already!

You could use TCL/TK - it is pretty much what you need.
If you think it is too old - use the Mozilla application platform.
There JavaScript and XML are the languages for the interface. You can look at the mozilla-browser skin examples. You can edit them freely if you know JS+XML.
All the real components are made in C/C++ - so you only need to add a core function (or just a plugin) and then access it through JS/XML.

What I really want to see in new interfaces:

  • First of all - it must minimize the key strokes and mouse moves needed to accomplish a task (especially for repetitive tasks). I think for the WIN UI it is a problem to get any automation to work :)
  • Second - an APP must use its window space fully! If I make the window smaller it must leave only the useful info there (and it must never try to fit in all that garbage of menus). Look at the Icon-style applications for WindowMaker
  • In addition to the previous - one must think of using the same space for different data - you could use color/size/shape to represent different data in the same place on the screen!!! at the same time. It is easier to change your focus than to change the image by clicking or pressing keys.

So unix shell (text) interfaces are still the best. :-)
Also I like the Key+Mouse button press/release shortcuts of GIMP very much. And GTK+ dynamic SC.

Is this a good way to go?, posted 3 Feb 2001 at 00:38 UTC by kjk » (Journeyer)

I think that most of your proposals would actually make for a worse UI. My key argument is that a way to a better UI is by simplifying things as much as possible not by adding new features that make programs more complicated to use. Emacs, Vi, Linux, command line, they are all great for power users. They'll make you more efficient if you're willing to spend considerable time learning how to use them properly. Most people, however, don't have this time and just want to have their work done with the least amount of hassle. As far as specifics go:

Shortcuts for all commands should only be an option for power users, turned off by default. It's good for people who spend a lot of time in one program, (say, Gimp) but if I use Gimp as a novice I don't want strange things to happen just because I pressed a random key.

Context-sensitivity: again, the value it might bring by making operation more efficient can be outweighed by the fact that it makes things unpredictable for novices (something changes and you don't really know why).

The facade: limitless configurability is also bad for most users. Consistency has much more value. This has been witnessed in practice (Windows/Mac have more appeal for most users than Linux because, among other things, they are more consistent) and there is common-sense thinking behind it as well (would you like street signs to look different in different cities? why should programs/UIs be different?)

Now before you cry out loud please note that I acknowledge that those ideas are very good for power users so it's great if your program targets power users. For everyone else (and there are many more non-power users than power users) a good UI should first and foremost strive for simplicity not feature-city (as Alan Cooper said "no matter how great your UI is it would be better if there was less of it"). I found "The Inmates Are Running the Asylum" by Alan Cooper to have a great discussion of the issues involved. His main point is that design (including UI design) should be a user-centered process and not an implementation of arbitrary ideas (that's why "perfect UI" can't exist; what's perfect for a secretary is not perfect for a seasoned programmer; different sets of requirements will lead to different trade-offs in designing UI). Some of the ideas presented have a reason behind them (e.g., shortcuts in Gimp are a time-saver for power users) but some (e.g. facade) will bring more problems than possible benefits.

2 kjk, posted 4 Feb 2001 at 00:13 UTC by Malx » (Journeyer)

There is one thing you have not thought of.
It is the transition from novice to power user. It would be impossible for a novice to become a power user in your case of turning features on/off.

Re: is this a good way to go?, posted 4 Feb 2001 at 23:12 UTC by cdent » (Master)


I think that most of your proposals would actually make for a worse UI. My key argument is that a way to a better UI is by simplifying things as much as possible not by adding new features that make programs more complicated to use. Emacs, Vi, Linux, command line, they are all great for power users. They'll make you more efficient if you're willing to spend considerable time learning how to use them properly. Most people, however, don't have this time and just want to have their work done with the least amount of hassle.
I agree that this may be true for entry level users, but I think you encounter a significant problem if you simplify things: expressiveness. A simple interface means that the story that you can tell the computer has to be relatively simple. A simple interface gives you a grammar like this:
Make Dick run. Make Jane run.
A more complex interface (admittedly with a steeper learning curve) would allow more detailed and expressive interactions:
Have Dick move quickly in a random but southerly direction. Have Jane prevaricate about the bush. If Dick encounters Jane, have them have a conversation using <filename> as the idea source.

A well designed interface would provide both and would optionally (defaulting to on) teach the user the more complex grammar as they go. The context-sensitive "tips" that some programs have are an effort to do this, but they haven't quite made it all the way there.

Ideally a workstation environment would adapt to you, sort of like a parent talking to a child. As skills increased, so would the available vocabulary and the complexity of the grammar. How to do that? I don't know, I'll leave that as an exercise for the reader and go do some reading myself. There's probably a lot that can be learned from foreign language instructors, as that's really what we are talking about here: learning new languages.

Speaking of learning, posted 6 Feb 2001 at 06:46 UTC by kjk » (Journeyer)

Speaking of learning (not only languages): have you read N. Stephenson's "The Diamond Age"? I was really impressed by the idea he presented there which, I think, boils down to what you describe: an ultimate teaching environment that adapts to your level of knowledge and guides you to higher levels in the most efficient way. Granted, we don't yet have the technology to build such a device, if it's possible at all (with the biggest problem, more philosophical than technological, being: who should decide what you should learn/be taught?).

2kjk, posted 7 Feb 2001 at 23:57 UTC by Malx » (Journeyer)

No. How to find it in network? (also how to get your e-mail? :)

initial response to comments, posted 8 Feb 2001 at 00:01 UTC by maphew » (Apprentice)

Thank you everybody for your comments. I'll respond in more depth once I've had a chance to ponder them more thoroughly and for now restrict myself to a couple of general and necessarily shallow notes.

dutky: With the exception of the facade, there isn't supposed to be anything new, rather a collection of Best of Breed features. My ideal UI would be something which I could throw my whole body into, not just a few finger and wrist movements; large movements as well as subtle twitches should be possible. Something akin to that will probably develop sooner or later, but my essay is purposely restricted to things which could be accomplished right now, with technology already demonstrated and in use somewhere in a more or less popular fashion. I like graphics apps which use the keyboard because I have two hands and want to use both of them.

Malx: I'm unclear how Tcl/Tk or Mozilla could bring new interface functions to legacy, possibly closed source, apps. Please elucidate.

kjk: Agreed, a method of dealing with the complexity of limitless customization and concomitant confusion is a must. To my way of thinking the facade would deal with this using templates. Most users will just use the default template aligned towards their history: Apple, Emacs, Windows, VI or modes, etc. and the user will have to explicitly flip a toggle somewhere to allow them to enter into customization mode. The default state, a previously saved session, or an "official" template downloaded from [program].com / could be activated at any time.

Basically think of Facade as cascading style sheets applied to applications and user interaction instead of web pages and looks.


closed source, posted 9 Feb 2001 at 00:34 UTC by Malx » (Journeyer)

I think it would be the same way as you now use libraries or COM components.
TCL/TK uses TCL only as glue for binary modules.
In the same way, Mozilla uses JavaScript to call external binary plugins, which do the real job.

But! It is against the MS way of programming :)
There you make the UI first and then put the functionality into it.........
