A ZUI is a dynamic interface. It presents the user with a canvas that is larger than the viewing area, and it is on this canvas that items are placed: the placement is arbitrary and may be determined by the users, by the system, or by both. The user can scroll their viewing window across this canvas to view different items, much as with any canvas that is too large for the viewing area. The ZUI differs from a normal canvas in that the user may zoom in or out on a particular item if they so wish, much like a telescopic lens on a camera. Though some systems are designed to limit the amount of zooming (Plaisant et al., 1998), the general idea is that a ZUI has infinite resolution (Bederson and Hollan, 1994): the user may zoom in or out as far as they desire, and will never be stopped except by system limitations.
ZUIs are similar to 3D interfaces in that the user may travel along three dimensions: x, y, and z. However, the ZUI differs from a 3D interface in that movement along the z-axis is restricted to a single kind of travel: zooming in or out. When a user is first presented with the canvas, the chances are that any items of interest they see will not be at the correct resolution: for example, a short web page may be visible only as a small dot, with no text or graphics properly discernible. If the user wishes to examine such an item in greater detail, all they do is zoom in to take a closer look.
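The pan-and-zoom model described above can be sketched as a tiny viewport abstraction. This is a hypothetical illustration rather than code from any ZUI toolkit; the `Viewport` class and its fields are assumptions made for the sketch.

```python
# Hypothetical sketch of the core ZUI viewport state: a centre point and a
# scale factor. Panning moves along x and y; zooming moves along z.

class Viewport:
    def __init__(self, cx=0.0, cy=0.0, scale=1.0):
        self.cx = cx        # canvas x-coordinate at the centre of the screen
        self.cy = cy        # canvas y-coordinate at the centre of the screen
        self.scale = scale  # screen pixels per canvas unit

    def pan(self, dx, dy):
        # Scroll across the canvas (x/y travel).
        self.cx += dx
        self.cy += dy

    def zoom(self, factor):
        # Travel along the z-axis: factor > 1 zooms in, factor < 1 zooms out.
        # Nothing bounds the scale, mirroring the "infinite resolution" idea.
        self.scale *= factor

v = Viewport()
v.zoom(2.0)   # zoom in
v.pan(10, 5)  # scroll the view
print(v.cx, v.cy, v.scale)  # 10.0 5.0 2.0
```

Note that only the scale changes under zooming; the unbounded multiplication is exactly what makes desert fog (discussed below) possible.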
Ben Bederson has recently released an updated toolkit called Piccolo (XXX), and I recommend that the curious investigate the demo further.
Perceptive readers will already have questions forming about the feasibility of such an interface: how can a person find something, how do they know it is there, what happens when they get lost, and so on. No reasonable proponent would extol the ZUI as the be-all and end-all of interfaces, but there are solutions to most of these problems, as shall be illustrated.
Desert Fog and Context
Desert fog describes a state in which the user of a ZUI has manipulated the interface such that they have no "landmarks" or cues from which to work out where they are (Jul & Furnas, 1998). This can occur in one of three ways: the user has zoomed in to a point where no information is displayed on the monitor; the user has zoomed out to a point where every piece of information on the monitor is too small to be visible; or the user has simply focused on a part of the canvas where there is no information. Without any cues, users have been observed to wander around at random, something predicted by the model of information foraging (Pirolli & Card, 1999) due to a complete lack of scent. Potentially, this is a worse situation than in most orthodox interfaces, where the user can at least often infer the context of their operations by looking at what is on screen. With desert fog, there is nothing on screen to aid this inference.
However, several solutions present themselves. In one example, Jul & Furnas extended view-navigation theory (Furnas, 1997). Briefly, view-navigation theory examines what makes an "information structure navigable". It sets out two sets of requirements: traversal requirements (a navigable structure must have short paths between nodes, and every view must have a small number of links leading out of it); and navigation requirements (a navigable structure must present accurate cues to all nodes, and each link out of a view must need only a small amount of information to determine where it leads).
ZUIs show strong evidence of meeting the traversal requirements: the most direct path between any two elements is a straight line through the three-dimensional space, and each view has only one outlink to other parts of the structure. However, ZUIs fall somewhat short on the navigation requirements in a desert fog situation: no cues are available for any other part of the structure beyond the fact that it is not presently being viewed, and while the second requirement is strictly met (a small amount of outlink information), in practice this information is of no use. The essential problem is that spatial multiscale environments will always contain cues to views leading into desert fog. In theory, this means that a few moments of carelessness are enough to get lost.
To resolve this issue, Jul and Furnas proposed providing better cues for what they termed the interesting parts of the structure: those parts with content from which the user's context may be accurately inferred. Presenting cues for information beyond the perceivable resolution would remove one of the ways of getting lost, for a user could no longer zoom out until all information was too small to resolve: something would always remain visible, and therefore the correct action to take when lost would simply be to zoom out.
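The "zoom out until something resolves" recovery can be sketched concretely. This is a hypothetical illustration: the item representation, the 4-pixel cue threshold, and the function names are all assumptions, not part of any published ZUI system.

```python
# Hypothetical sketch of desert-fog recovery: halve the scale until at least
# one item both falls inside the view and is large enough on screen to act
# as a cue. Items are (x, y, size) tuples in canvas units.

MIN_CUE_PX = 4.0  # assumed smallest on-screen size that still reads as a cue

def visible_cues(items, cx, cy, scale, view_w, view_h):
    # An item is a cue if it lies within the viewport and renders at a
    # usable pixel size at the current scale.
    half_w, half_h = view_w / (2 * scale), view_h / (2 * scale)
    return [
        (x, y, size) for x, y, size in items
        if abs(x - cx) <= half_w and abs(y - cy) <= half_h
        and size * scale >= MIN_CUE_PX
    ]

def zoom_out_until_cue(items, cx, cy, scale, view_w=800, view_h=600):
    # Keep zooming out (halving the scale) until a cue resolves; the floor
    # guard stops the loop on a truly empty canvas.
    while scale > 1e-9 and not visible_cues(items, cx, cy, scale, view_w, view_h):
        scale /= 2
    return scale

# Lost deep in empty space, with one document at (100, 50) of size 10:
print(zoom_out_until_cue([(100, 50, 10)], 0, 0, 64.0))  # 4.0
```

The weakness the text identifies is visible in the loop itself: without cues beyond the perceivable resolution, nothing tells the user *which* direction to pan, only that zooming out will eventually work.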
In terms of task efficiency, this method may not be realistic. If a user has somehow spent a few hours accidentally zooming in, it would take them a comparable time to zoom back out until something was visible. Plaisant et al. (1998) dealt with this problem by allowing users to temporarily "return" to an overall position which displayed their current location on-screen. Their method could be further extended by allowing the user to select a new location, one that contained some kind of contextual information, though in the above case of having zoomed in for a long time, any information at the same resolution would have no visible cues as to its content.
Again, this could be resolved by presenting context-dependent information (such as tooltips) to the user upon request, but it could be argued that this is one step away from a simple interface, and one step closer to the orthodox GUI. Careful thought would need to be applied to the implementation too, because tooltips as they currently exist obscure the information underneath them. Making the tooltip fade after a certain period, or rendering it transparent, are not elegant solutions. An argument may be made for presenting any message to the user in a different part of the screen, but this produces its own share of issues: users would need to visually locate the message (which would appear further from the mouse cursor than a tooltip would), and then locate the cursor again afterwards. This issue may be minor, but the loss of screen space is not. Suggestions such as large transparent text presented across the field of vision are not complete either: messages may still obscure important information, visual location is again necessary, and the presentation of complex information may be confusing.
Pook et al. (2000) also noted the loss of context that can easily occur within ZUIs, but instead proposed in-place contextual aids (a context layer) which transparently overlapped the user's viewing area, along with a hierarchy tree in a separate frame.
The ZUI may also present problems when searching for particular information. If an item is placed out of resolution in a cluttered structure, it will be hard for the user to find, and indeed the process will likely be slower than searching through a long list of textual items (an orthodox equivalent) due to the need to pan and zoom throughout the structure. Raskin (1991) proposed a simple text-based search to locate items of interest, but this method may be compromised in two situations. Firstly, if the required information is not textual, it is hard to search for; some search systems have been constructed that rely upon graphical search criteria, but they are difficult to use. Secondly, the searcher may not be able to articulate what they are looking for: I have often searched for information that I cannot recall, but that I recognise when I see it. The second point is an issue for all search systems and not unique to ZUIs, but the first does have solutions. Raskin proposed the effective labelling of items to ensure that required information can be located: even if what is wanted cannot be searched for directly, the labels can. This is still not a complete solution, though. In the real world, items are often unlabelled or labelled erroneously, and meta-data may not be entered correctly. In this situation, an "invisible" item could simply stay out of sight and out of search, and many users would simply assume that their required information is not present, which could lead to a problem in task completion.
One possible solution would be to enforce correct labelling. For small systems, forcing users to enter meta-data from a constrained subset of terms would be feasible, but in larger systems this method may simply not scale: imagine trying to do this for the World Wide Web!
Personal anecdotal experience suggests caution here. Many workers are motivated not by a desire for accuracy but by expediency, and slackness would manifest as obscure or missing labels, with a consequent detrimental effect on performance.
One more solution would be to ensure that information is placed on the canvas properly. Conveniently, if the information space were used efficiently, desert fog would be less likely to occur, as there would be fewer blank spaces devoid of contextual cues. Furnas & Bederson (1995) developed space-scale diagrams to give designers the opportunity to see which documents would be visible at particular resolutions (refer to the study for more details). If the system were given a hierarchical ordering of the documents, these diagrams could be used in reverse to place each document in a location suitable for viewing. For example, document A might be considered very important: it contains commonly accessed information and should be one of the first documents encountered. Document B, on the other hand, might contain something obscure and relatively unimportant. A space-scale diagram would allow the system to calculate where (in terms of resolution) document A should be placed so as to be visible at the starting point (a misnomer, strictly, as there is no real starting point in a ZUI), whereas document B would be better placed small and out of the way, possibly only as a minor adjunct to a more important document (and thus visible only when zoomed in closely). However, this relies upon the documents actually having a hierarchy in the first place, as well as assuming that sufficient free space is available at all resolutions. The system could juggle the positions of other items should there not be enough free space, but changing the positions of documents may confound a user's spatial navigation.
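The placement arithmetic behind this idea can be made concrete. The sketch below is a loose, hypothetical rendering of the space-scale reasoning, not Furnas & Bederson's actual formalism; the pixel thresholds and function names are assumptions.

```python
# Hypothetical sketch of space-scale placement: given a document's canvas
# size, compute the range of zoom scales at which it is legible, and the
# inverse problem of how large to draw it so it resolves at a chosen scale.

MIN_LEGIBLE_PX = 50.0   # assumed: smaller on screen than this is unreadable
MAX_USEFUL_PX = 2000.0  # assumed: larger and the view shows only a fragment

def legible_scale_range(doc_size):
    """Return (low, high) scales at which a document of `doc_size` canvas
    units renders between MIN_LEGIBLE_PX and MAX_USEFUL_PX on screen."""
    return MIN_LEGIBLE_PX / doc_size, MAX_USEFUL_PX / doc_size

def size_for_scale(target_scale):
    """How large (in canvas units) to draw a document so that it is just
    legible at a given scale, e.g. the overview scale."""
    return MIN_LEGIBLE_PX / target_scale

# Important document A should be readable at the overview (scale 1.0):
print(size_for_scale(1.0))    # 50.0 canvas units
# Obscure document B need only resolve when zoomed in 100x:
print(size_for_scale(100.0))  # 0.5 canvas units
```

The inverse function is where the "reversed diagram" idea bites: the hierarchy rank chooses the target scale, and the target scale dictates the drawn size.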
However, another possible solution exists to this conundrum. Searching for information is usually in response to a particular information need.
Once a relevant document is found, the information is used and the need disappears. Research has shown that many people can make accurate relevance judgements about documents given only a small part of their information (Mochizuki & Okumura, 2000; Tombros & Sanderson, 1998; Salmoni & Payne, 2002), so it may not be necessary for the full information to be presented to the user. Short abstracts, even those produced with simple existing techniques, allow a better than chance level of discrimination. Indeed, while no abstract or summary will be as effective as the full document in terms of accuracy of judgement, the speed of judgement is greatly improved (Mochizuki & Okumura, 2000). Curiously, Mochizuki & Okumura found that the title alone allowed a level of discrimination equal to that of the full text, though their method allowed participants to examine the full text at any time throughout the experiment, so this result should be treated with caution (as the authors acknowledge). Salmoni & Payne (2002), however, found that author-generated summaries appeared to provide the best cues to document relevance. Some automatically generated snippets appeared to provide an incorrect context for the participant, though again some information appears to be better than none.
The provision of a small part of the information would provide the following advantages:
- Less information is presented to the user resulting in less time to complete the search task;
- Less visual clutter meaning that information from more documents may be presented on-screen at any one time;
- Fewer resources are required of the system, as less information has to be shown.
This raises the question of how the summaries could be used to access the full document. The World Wide Web (Berners-Lee et al., 1994) showed that hyperlinks are a readily understood means of navigation, and a hyperlinked summary would be simple to implement and use. However, it should be remembered that this idea has yet to be tested in any way.
In line with modern ("kewl") applications, a sustained period of focus by the mouse cursor might signal the computer to open the complete document: hover the mouse over a summary and, within a moment, the complete document zooms into focus.
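The hover-to-open behaviour amounts to a dwell timer. Below is a minimal, hypothetical sketch of that logic, decoupled from any GUI toolkit; the `HoverTracker` class, its method names, and the 0.8 second threshold are all assumptions that would need tuning in user tests.

```python
# Hypothetical sketch of hover-to-open: if the cursor dwells on the same
# summary for longer than a threshold, signal that the full document
# should be zoomed into view.

import time

DWELL_SECONDS = 0.8  # assumed threshold

class HoverTracker:
    def __init__(self):
        self.target = None  # summary currently under the cursor
        self.since = None   # when the cursor arrived over it

    def on_move(self, target, now=None):
        """Call on each mouse move with the summary under the cursor (or
        None). Returns the target to open once the dwell threshold passes."""
        now = time.monotonic() if now is None else now
        if target != self.target:
            self.target, self.since = target, now  # cursor moved: restart
            return None
        if target is not None and now - self.since >= DWELL_SECONDS:
            self.since = float("inf")  # fire at most once per dwell
            return target
        return None

t = HoverTracker()
t.on_move("summary-1", now=0.0)          # arrive over a summary
print(t.on_move("summary-1", now=1.0))   # summary-1 (dwell long enough)
```

Firing only once per dwell matters: otherwise every subsequent mouse event would re-trigger the zoom animation.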
And in Practice?
Since the idea was first published, much research has been performed on ZUIs and how well they fare in task-based evaluations. Hornbæk, Bederson and Plaisant (2001) compared task-based performance on a ZUI against an overview+detail interface. The overview+detail interface simply provided an overview of a map (shown in the corner of the display) alongside the main window, which showed part of the canvas in detail. Like the ZUI, it was concerned with information visualisation. There were two tasks: a navigation task (finding a well-described map object) and a browsing task (scanning for objects that fulfilled a set of criteria). No difference was found between the two interfaces in terms of task completion accuracy, but twenty-six participants preferred the overview+detail interface whereas only six preferred the ZUI. Comments from users included "It is easier to keep track of where I am" for the overview+detail interface. However, task completion time was lower for the ZUI. A recall task was also administered to test the participants' memory of the maps' structure and content, and again the ZUI was found to allow better recall.
It may be argued that these types of task do not reflect the range of tasks that a typical user would perform in an everyday situation. Probably the biggest challenge facing researchers is to devise a set of tasks suitable for comparing a ZUI against an orthodox WIMP interface. The tasks employed in the study above relied upon the navigation of information that was primarily visual in nature (maps); on a text-based task, an interface that is more textually oriented may prove more usable.
For visual information, ZUI's do appear to have benefits in that they reduce task completion time while not increasing the errors committed. Further careful research would cast some empirical light upon this matter.
The work above treated the ZUI as a single-user system. Little work has been performed on the use of a ZUI for collaborative work. Given the changes in how computers are used, such research is necessary to understand whether a ZUI is suitable for this style of working.
While he did not examine a ZUI, Carl Gutwin examined collaborative applications using a visual interface, the radar view: this interface contained an overview of the entire workspace in a small frame, while the work itself was performed in a larger frame focused upon a small part of the workspace. This interface relied, like the ZUI, upon a visual style, but otherwise differed greatly. For example, a ZUI canvas may be extended infinitely in either direction along the z-axis, whereas the radar view has a finite canvas; zooming on a radar view may or may not be possible, depending upon the implementation.
All interfaces designed to support collaborative work should facilitate collaborative awareness. To do this, the interface should enhance a user's knowledge of who else is working, what they are doing, and how they are doing it (Huahai, 1999). In practical terms this may be problematic, for the effective communication of another user's context can be difficult: many of the subtle interpersonal signals exchanged by people in face-to-face cooperative work are extremely difficult to formalise, and therefore difficult to handle in a purely computer-based system (Gutwin, 1998).
Groupware tends to limit or reduce the effectiveness of collaborative work in three different ways (after Gutwin, 1998):
- Reducing the users' perception of the workspace;
- Inhibiting the expressivity of the users' physical bodies;
- Limiting the ways in which artifacts may be used.
However, a computer-based groupware system presents its own advantages against each of these limits. While groupware may reduce the expressivity of artifacts, the creation and use of artifacts may extend beyond what is possible in the real world; and while a person might find themselves relatively unexpressive when using groupware compared with a face-to-face situation, a groupware system may for that very reason communicate only the important aspects of a user, thereby reducing possible confusion.
It appears that a groupware system will never be as capable as a face-to-face environment if it is used merely to simulate one. However, groupware may allow new, hitherto unconsidered possibilities of task-related communication, should full advantage be taken of its peculiar nature.
When compared against interfaces based on the desktop metaphor, collaborative work may be enhanced in some respects and inhibited in others. Presenting a group of workers with a common canvas would provide a direct analogue of the shared directories found in the desktop metaphor. However, while a single worker may use their natural spatial abilities to relocate information encountered earlier, communicating that location to a co-worker may require a different solution to those used currently.
The desktop metaphor allows a user to specify the location of a file by noting its drive and the directory it is located in. Sending a series of coordinates to another user may be too abstract to be usable, though this is purely conjecture. However, other solutions exist.
One example, taken from the web browser interface, would be bookmarks. I would recommend the term flags for the ZUI, as this should be a metaphor readily understood by most users. Once a required location is reached, the user could mark it with a flag, perhaps colour-coded to denote the user who placed it (or, if there are many users, carrying a name, a picture, or a descriptive label). If another user required this location, they could simply examine a list of available flags, using search criteria to narrow a long list down to something manageable. Selecting one would bring the user to the flag's location.
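A shared flag store of this kind is straightforward to sketch. The following is a hypothetical illustration; the `Flag` fields, the `FlagStore` API, and the sample data are all assumptions made for the example.

```python
# Hypothetical sketch of a shared "flag" store: each flag records a canvas
# location (including the zoom scale at which it was placed), the user who
# placed it, and a free-text label that search criteria can match against.

from dataclasses import dataclass

@dataclass
class Flag:
    x: float
    y: float
    scale: float  # zoom level at which the flag was placed
    owner: str
    label: str

class FlagStore:
    def __init__(self):
        self.flags = []

    def place(self, x, y, scale, owner, label):
        self.flags.append(Flag(x, y, scale, owner, label))

    def search(self, text=None, owner=None):
        """Narrow a long flag list down to something manageable."""
        return [
            f for f in self.flags
            if (text is None or text.lower() in f.label.lower())
            and (owner is None or f.owner == owner)
        ]

store = FlagStore()
store.place(120, -40, 8.0, "alice", "Q3 budget figures")
store.place(5, 5, 1.0, "bob", "meeting notes")
print(store.search(text="budget")[0].owner)  # alice
```

Storing the scale alongside x and y is the ZUI-specific twist: selecting a flag must restore all three axes of travel, not just the planar position.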
Another solution would be to indicate which areas of the canvas are most commonly used. Locations at which users spend a lot of time would be presented differently from other locations, though the difference should be noticeable rather than distracting (simple experimentation would ascertain the correct characteristics). An example would be a slightly darker shade of background colour.
A more interactive solution would be to allow users to relocate other users. For example, if user A were at an interesting location and wanted user B to view the same information, she would command the system to send the coordinates to user B's computer, which would alter the display automatically. To lessen disorientation, it may be advisable to employ a fast animation showing the new location relative to the old one. Clearly, this would need user B's permission before being carried out, but that is a minor implementation detail.
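The relocation animation can be sketched as simple interpolation between two viewport states. This is a hypothetical illustration (the function name and frame count are assumptions); the one substantive choice shown is interpolating the scale geometrically rather than linearly, so the zoom speed feels uniform along the z-axis.

```python
# Hypothetical sketch of animating user B's view to user A's location:
# interpolate the centre linearly and the scale geometrically. Each view
# state is a (cx, cy, scale) tuple.

def animate_relocation(start, end, steps=10):
    """Yield intermediate (cx, cy, scale) frames from `start` to `end`."""
    (x0, y0, s0), (x1, y1, s1) = start, end
    for i in range(1, steps + 1):
        t = i / steps
        yield (
            x0 + (x1 - x0) * t,          # linear pan
            y0 + (y1 - y0) * t,
            s0 * (s1 / s0) ** t,         # geometric zoom
        )

frames = list(animate_relocation((0, 0, 1.0), (100, 50, 16.0), steps=4))
print(frames[-1])  # (100.0, 50.0, 16.0)
```

With a linear scale interpolation, most of the perceived zooming would be crammed into the first few frames; the geometric form spreads it evenly, which is why it is the conventional choice for zoom animations.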
In total, the problems associated with navigation through a ZUI may be characterised as three types: start-state problems, where the user is not aware of where they currently are; goal-state problems, where the user does not know how to reach a particular location; and overload problems, in which the user is presented with too much information. Goal-state problems may be further divided into those where the user knows which location they need to reach but does not know how to get there ("how do I get there?", or navigation problems), and those where the user does not know where they should be at all ("where is it?", or location identification problems).
The start-state problems were highlighted by Jul and Furnas (1998), who used the term desert fog to describe them. As stated earlier, the simplest solution is for the user to zoom out until they recognise their current location. However, this may not be practical, for two reasons: it may take time, and the user may be sufficiently disorientated as to forget where they were. In the latter case, the user has simply exchanged a start-state problem for a goal-state one.
Allowing the user to zoom out as far as they wish and then quickly return to their original location is one answer ("zoom out and spring back"); another would be to allow the user to spring out to a "top view" of the canvas indicating their current location. For example, the user would press a pre-defined key and be presented with the entire canvas (see the problems with infinite resolution above) with a small animated square showing where they were. Animation would be beneficial here, as it would allow identification of the current location with minimal disruption to working memory. Releasing the key would zoom the user back to their original location. If the original location were too small to be visible from the top view, a form of animation could be used to indicate its distance.
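The "zoom out and spring back" mechanic reduces to saving and restoring viewport state around the overview jump. The sketch below is hypothetical; the `SpringBack` class and its key-press/key-release method names are assumptions.

```python
# Hypothetical sketch of "zoom out and spring back": push the current view
# when the overview key is pressed, pop it to restore the view on release.
# Each view state is a (cx, cy, scale) tuple.

class SpringBack:
    def __init__(self):
        self._saved = []

    def jump_to_overview(self, current_view, overview):
        """On key press: remember the current view, return the overview."""
        self._saved.append(current_view)
        return overview

    def spring_back(self, fallback):
        """On key release: restore the saved view (or a fallback)."""
        return self._saved.pop() if self._saved else fallback

sb = SpringBack()
deep_view = (42.0, -7.0, 256.0)           # deep zoom somewhere on the canvas
shown = sb.jump_to_overview(deep_view, (0.0, 0.0, 1.0))
print(shown)                              # (0.0, 0.0, 1.0)
print(sb.spring_back((0.0, 0.0, 1.0)))    # (42.0, -7.0, 256.0)
```

Because the original view is restored from the stack rather than recomputed, no time is lost zooming back in, which answers the efficiency objection raised against manual zooming out.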
The way to solve navigation problems depends somewhat upon the task, i.e., how the user became aware of their required location. If another user has requested their presence at a certain location, that user can transmit the location directly, possibly even having the system perform the move (which should be faster and more accurate than a human operator could achieve). However, if a user wishes to revisit a location encountered previously, the solution depends upon whether they flagged it. If they did, the user simply has to identify the correct flag and use it to guide their navigation. If not, a history facility might be useful: Pook et al. (2000) proposed a history mechanism, but it has yet to be tested under realistic circumstances.
It might also be asked under what circumstances a location should be consigned to a history list, as a simple navigation could pass through a great many points: it is suggested that a location be recorded only when a user spends more than a specified time there (say, more than ten seconds).
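The dwell-filtered history suggested above can be sketched directly. This is a hypothetical illustration of the suggestion in the text, not Pook et al.'s mechanism; the `DwellHistory` class and its method names are assumptions, while the ten-second figure is the one the text proposes.

```python
# Hypothetical sketch of dwell-filtered history: a view is recorded only
# when the user stays at it longer than a threshold, so transient positions
# passed through during navigation are not logged.

DWELL_THRESHOLD = 10.0  # seconds, the figure suggested in the text

class DwellHistory:
    def __init__(self):
        self.entries = []
        self._view = None
        self._arrived = None

    def on_view_change(self, view, now):
        """Call whenever the view changes; `now` is a timestamp in seconds.
        The *previous* view is recorded if it was held long enough."""
        if self._view is not None and now - self._arrived >= DWELL_THRESHOLD:
            self.entries.append(self._view)
        self._view, self._arrived = view, now

h = DwellHistory()
h.on_view_change("A", now=0.0)
h.on_view_change("B", now=3.0)    # A was held only 3 s: not recorded
h.on_view_change("C", now=20.0)   # B was held 17 s: recorded
print(h.entries)  # ['B']
```

Recording on departure rather than arrival is deliberate: only then is the dwell time known.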
Location identification problems also depend somewhat upon the user's circumstances. If the user is able to locate the information with a search, this would probably be the best solution. Other solutions include labelling, and summaries of documents sufficient for the user to correctly infer content (thus allowing more information to be scanned in a given time).
Overload problems are trickier. Although the ZUI idea is flexible, once a user is accustomed to a particular layout, changing it may actually disrupt their performance. Hence, temporarily reducing the amount of information presented may not be advisable if good performance is to be maintained, as it alters the cues that a user relies on to navigate (imagine walking to work and finding that the streets have changed their locations!).
As mentioned earlier, presenting users only with summaries of documents reduces the amount of information that a person has to sort through, though this may be criticised as increasing the system's capacity for error (i.e., reducing accuracy when scanning documents for relevant content). Though little pertinent work has been done on non-textual documents, research has shown that users' performance with summaries is not much worse than when the full document is accessible, and yet is faster; see Mochizuki and Okumura (2000) for one example.
The best possible solution would be to introduce (and enforce!) meta-data searches. Clearly the meta-data would benefit from being constrained, so as to reduce the errors arising from synonymy, polysemy, and the other issues which plague WWW search engines. Searches over accurate meta-data would allow a textual search through all items regardless of their medium (sound, graphics, text, multimedia), thus reducing the amount of information a user must search through before finding their required document.
The above findings and proposals may not provide an adequate solution. Many findings within the field of human-computer interaction are counter-intuitive, and even the best-designed study may fall prey to interactions between elements which, when tested alone, are perfectly fine (Cutrell, Czerwinski and Horvitz, 2001).
In isolation, the proposed solutions above may work well. When combined, however, the complexity of the interface increases somewhat, which may have detrimental effects on a user's performance. Task-based evaluation would probably be the best way to test this, with care taken to ensure that the choice of tasks is relevant to the intended use of the interface.
The extensions to the ZUI described above do appear to have an important role to play, as they help to combat areas of confusion and error for the user, but they also make the interface more complex than the plain ZUI. However, there are few of these extensions, probably fewer than would be needed in an interface based upon the desktop metaphor.
ZUIs are a radical departure from the desktop metaphor currently dominant in graphical user interfaces. Some research shows that task performance time is reduced when a ZUI is used, but this is not an unqualified statement, as the range of tasks typically performed by users has yet to be tested to any great extent. In addition, the use of a ZUI with multiple applications has also yet to be tested.
Despite this, ZUIs appear to offer a novel way for people to navigate complex information structures, though much consideration must be given to the areas in which users' spatial navigation skills fail. While the base ZUI is an elegant solution that relies upon humans' natural ability for spatial navigation, like any complex system it would need further extension to augment our abilities and counter existing drawbacks. Such extensions would not in themselves guarantee a faultless system, and important trade-offs would need to be considered before finalising a design.
- Piccolo: [http://www.cs.umd.edu/hcil/piccolo/index.shtml]
- The Humane Interface: [http://humane.sourceforge.net/the/zoom.html]
- DENIM: [http://guir.berkeley.edu/projects/denim/]
- 2Goto: [http://www.2goto.com/]
Bederson, B.B., & Hollan, J.D. (1994) Pad++: A Zoomable Graphical Interface. CHI'94, short paper.
Berners-Lee, T., Cailliau, R., Luotonen, A., Nielsen, H.F., and Secret, A. (1994) The World-Wide Web. Communications of the ACM, 37(8), 76-82.
Cutrell, E., Czerwinski, M., & Horvitz, E. (2001) Notification, disruption and memory: Effects of messaging interruptions on memory and performance. In Hirose, M. (Ed) Proceedings of Human-Computer Interaction -- Interact 2001, Tokyo, IOS Press, copyright IFIP, 2001, 263-269.
Furnas, G.W. (1997) Effective View-Navigation, Human Factors in Computing Systems. CHI'97 Conference Proceedings, New York, NY: ACM Press, 367-374.
Furnas, G., & Bederson, B.B. (1995) Space-Scale Diagrams: Understanding Multiscale Interfaces, Proceedings of ACM SIGCHI'95.
Gutwin, C. (1998) Workspace Awareness in Real-Time Distributed Groupware. Ph.D. Thesis submitted to the Department of Computer Science, Calgary, Alberta, Canada.
Hornbæk, K., Bederson, B.B., & Plaisant, C. (2001) Navigation Patterns and Usability of Overview+Detail and Zoomable User Interfaces for Maps. ACM Transactions on Computer-Human Interaction (TOCHI), 9 (4).
Huahai, Y. (1999) Collaborative Applications of Zoomable User Interface. Collaboratory for Research on Electronic Work, School of Information, University of Michigan, 1999.
Jul, S. & Furnas, G.W. (1998) Critical Zones in Desert Fog: Aids to Multiscale Navigation. In UIST '98, pages 97-106, San Francisco, CA, USA, Nov. 1998. ACM Press.
Mochizuki, H., and Okumura, M. (2000) A Comparison of Summarisation Methods Based on Task-based Evaluation. Proceedings of LREC 2000, pp. 633-639.
Plaisant, C., Mushlin, R., Snyder, A., Li, J., Heller, D., and Shneiderman, B. (1998) LifeLines: Using Visualisations to Enhance Navigation and Analysis of Patient Records. In Proceedings of the 1998 American Medical Informatics Association Annual Fall Symposium, pages 76-80.
Pook, S., Lecolinet, E., Vaysseix, G., and Barillot, E. (2000) Context and Interaction in Zoomable User Interfaces.
Salmoni, A.J., and Payne, S.J. (2002) Inferences of Content from Search Engine Summaries and Headings. HCI2002 Proceedings, vol 2.
Tombros, A., and Sanderson, M. (1998) Advantages of Query Biased Summaries in Information Retrieval. Research and Design in Information Retrieval, 2-10.