ACM Multimedia 95 - Electronic Proceedings
November 5-9, 1995
San Francisco, California

A Confederation of Tools for Capturing and Accessing Collaborative Activity

Scott Minneman
Xerox Palo Alto Research Center (PARC)
3333 Coyote Hill Road
Palo Alto, California 94304
415.812.4353
minneman@parc.xerox.com
http://sandbox.xerox.com/minneman/slm.html

Steve Harrison, Bill Janssen, Thomas Moran
Xerox PARC
{harrison, janssen, moran}@parc.xerox.com

Gordon Kurtenbach
Now at: Alias Research
gkurtenbach@alias.com

Ian Smith
Now at: GVU Center, Georgia Tech
iansmith@cc.gatech.edu
http://www.cc.gatech.edu/gvu/people/Phd/Ian.Smith.html


Abstract

This paper presents a confederation of tools, called Coral, that combine to support the real-time capture of and subsequent access to informal collaborative activities. The tools provide the means to initiate digital multimedia recordings, a variety of methods to index those recordings, and ways to retrieve the indexed material in other settings. The current system emerged from a convergence of the WhereWereWe multimedia work, the Tivoli LiveBoard application, and the Inter-Language Unification distributed-object programming infrastructure. We are working with a specific user community and application domain, which has helped us shape a particular, demonstrably useful configuration of tools and gain extensive real-world experience with them. This domain involves frequent discussion and decision-making meetings and later access of the captured records of those meetings to produce accurate documentation. Several aspects of Coral--the application tools, the architecture of the confederation, and the multimedia infrastructure--are described.




KEYWORDS: activity capture, digital audio and video, CSCW, real-time indexing, content- and context-based indexing and retrieval, usability, user interfaces, distributed multimedia systems

Introduction

Much of the work of groups, even in such orderly settings as structured meetings, takes the form of casual interaction--the give-and-take of conversational exchanges whereby a group comes to a shared understanding of the technical, process, and relational facets of their work [Minneman, 1991]. This casual activity is poorly supported by most computational tools, which tend to focus on the outcomes of such activity, while ignoring much of how the group arrived at those outcomes. Furthermore, many attempts to gather such information end up formalizing the activity, making the participants conform to a way of working that suits the computer rather than supporting their natural work practices.

Collecting audio, video, and computing recordings of a setting of human interaction provides a rich, revisitable source of records of group process; but these records are unwieldy: the benefits of the data are often overshadowed by tedious sequential access. Random-access digital video and audio (instantaneous seek times), pen-based computing (informal interaction), and signal analysis (speaker identification, scene change detection), can be combined to provide users with heretofore unfathomable capabilities; but functional systems require careful design work. Therein lies our challenge--designing tools that let people work with these rich, time-based media in facile ways--helping rather than hindering their interactions and extracting indices from the structure of their natural activity, rather than imposing regularity upon their process.

Our group was particularly well poised for taking on this challenge, because three of our ongoing projects provided critical capabilities: (1) WhereWereWe [Minneman and Harrison, 1993] offered digital audio, video, and computing streams capture, indexing, and playback; (2) the Tivoli application [Pedersen et al., 1993] furnished digital whiteboard functionality on a large, pen-based electronic display device called the LiveBoard(1) [Elrod et al., 1992]; and (3) the Inter-Language Unification (ILU) project [Janssen, 1994] added a powerful distributed-object programming facility. These efforts have been converging to provide a confederation of tools for unobtrusively capturing time-based records(2) of group activity, indexing the recordings, and accessing the indexed material for browsing, searching, and reexperiencing the original activity.


Activity Capture and Access

We have a set of subtly intermingled goals. Our first goal is to support the natural communicative and interactive activities that people engage in during the course of collaborative work. Ideally, we want to provide tools that are immediately useful in informal collaborative settings; at minimum, the tools must not inhibit or distort the natural activities. The next goal is to capture records of the activities. We do this with both the support applications and with unobtrusive multimedia recording. The third goal is to provide ways for the captured materials to be indexed. Our final goal is to provide access to the indexed, captured records so the original activities can be revisited and used as effective resources for further work.

Support. Because we want tools to support informal activity, we have limited ourselves thus far to simple, generic tools for notes and shared representations. Our LiveBoard technology provides an informal shared workspace; its whiteboard metaphor is quickly accessible to users. This application provides several advantages over physical whiteboards--editing, printing, saving and retrieving, multiple pages, etc.--that make it useful by itself.(3) Also, we utilize laptop computers for notetaking, which is becoming standard practice for many people. We also provide ways for the different devices to send information to each other. It is possible to provide more elaborate meeting tools, but these tend to impose a constraining structure on user activities. Instead, we have focused on simple tools that provide a basic level of support, but in addition serve as capture devices.

Capture. Multimedia capture of activity first involves initiating and coordinating the recording apparatus for a variety of media (audio, video, text, program logs). Our current audio and video media recording is workstation-based, using Sun Sparcstation built-in audio and prototype video digitizing hardware.(4) We have more platform flexibility in tools for capturing computing records, having developed capture tools on various UNIX systems, PCs, and Macs. In particular, the support tools mentioned above produce time-stamped records of their behaviors. And we are developing an architecture, based on the use of network distributed objects, for the uniform treatment of records of diverse timestream data.

Indexing. Indices are meaningful (or at least heuristically useful) pointers into the captured multimedia records, providing the means for users to randomly access those records. We are exploring a variety of methods for creating indices--let us consider them in four broad classes. First, there are intentional annotations, which are indices that participants create during an activity for the purpose of marking particular time points or segments of activity. A prime example of this is sequential notetaking. A participant, taking the role of "scribe," takes brief notes on the activities as they progress. What is crucial for us here is not just what a note contains, but also when it is created, making the note an index (while a single-person scribe is common practice, there can, of course, be multiple notetakers). Second, there are side-effect indices. These are activities whose primary purpose is not indexing, but which provide indices because they are automatically timestamped and logged. An example is switching pages on the LiveBoard. The purpose of switching to another page is to see other material or to start a new page. This may indicate a topic switch, and thus is a potentially useful index into the overall activity. In fact, in our work every event on the LiveBoard is a potential index. Third, there are derived indices, which are produced by automated analyses of detailed timestream records. For example, signal analysis of audio/video records can produce speaker identification indices and scene change indices. Finally, there are post hoc indices, produced by anyone who later accesses the activity records--an intentional annotation, but after the fact. Indices are often easier to make when reflecting on the activities rather than in the heat of the moment. Although we have explored all of these methods of indexing, we have concentrated on intentional and side-effect indexing in the work reported here.

Access. Tools to support the access of captured and indexed records are a ripe area for research. The tools must support the user in finding the records of the desired session, assembling the indices in a comprehensible format, controlling the playback of the multimedia records, and in creating new multimedia artifacts from the captured materials. There is great potential for new tools in this arena. For example, we have discovered the need for making the computational tools into players, so that the state of those tools can be seen in coordination with the playback of audio and video. Another example is a timeline tool for presenting diverse indices. Further, the access tools should allow the user to add further annotations and indices. The accessing activity should, itself, be revisitable. In the work reported here, we have developed a very simple, yet quite useful, environment for access. We are currently exploring new tools.

In the remainder of this paper, we first describe our particular application domain, demonstrating how these kinds of tools can be effective and useful for users. Then we describe the Coral architecture and two particular applications in more detail. We then consider a broader range of uses for these tools and describe how these are being explored. Finally, we address some of the lessons we've learned over the course of this effort about tools, infrastructure, and uses.


Description of the Application Domain and the Work Settings

To ground the development of our tools, we have focused on supporting a specific, real-work domain--the process of assessing and managing intellectual property at PARC. Researchers at PARC are encouraged to submit invention proposals (IPs) to report novel ideas. An IP is a 5 to 10 page (or longer) document describing an invention, its technical details, related art, its current state of implementation, etc. One of the principal activities in the intellectual property process is regular meetings of technical people to assess the submitted IPs. One of the several different technical panels meets each week. The purpose of these meetings is to determine if the submitted IPs are technically sound and sufficiently useful to spend the legal resources for creating and filing patents from them. This requires a detailed technical discussion of the IPs to come to a shared understanding of them and to build a consensus on what to do with them (e.g., patent, defensively publish, or hold as a trade secret).

There is a manager of the intellectual property processes; let us call him Ron. One of his duties is to schedule and moderate these meetings. Another duty is to document the technical assessments, the issues raised, and the decisions made at these meetings. This documentation is important--it gives feedback to the inventors; it informs the research management and the patent attorneys; and it becomes part of the corporate legal records.

Traditionally, Ron wrote these documents from his handwritten notes taken during the meetings. His problem in doing this was not only the great number of meetings to document, but also the diversity of the technologies under discussion. He is knowledgeable in some arenas (having been a PARC researcher himself) but a complete novice in others. This meant that he often couldn't immediately assimilate comments made during the meeting into his notes and had to subsequently consult with those present at the meeting to help create accurate documentation.

Supporting the Work Settings

It should be clear from this description that tools to capture the content of these meetings and to access the discussion could be very helpful for producing the required documentation. Ron was enthusiastic about exploring new ways to improve the process. After preparing an initial suite of our technologies, we intervened to support the IP work process, starting at the beginning of 1994. We gradually introduced additional functionality and iteratively refined the tools. By the end of the year, a set of fairly stable tools and practices was reached.(5) It is this stable configuration that we describe here.

There are two different kinds of settings involved in this process--the capture setting, which is the meeting with its discussions, and the access setting, where the captured meeting materials are "salvaged" to produce the required documentation.

The capture setting is shown in Figure 1. This photo depicts a mock-up of an assessment meeting. The 4 to 10 meeting participants sit around a table facing each other. They bring hardcopies of the IPs, which they have read beforehand and which they use during the meeting. There is a LiveBoard that is prepared with the meeting's agenda. Microphones on the table capture the audio, which is digitized and stored. Ron uses a laptop computer on the table to type notes during the meeting. Thus, tools are in place to capture three streams of activity: audio, LiveBoard interaction, and text.


Figure 1. An activity capture setting. The microphone, camera,
LiveBoard, and laptop capture the audio, video, scribbling, and
textual notetaking activities of the meeting.

A meeting proceeds as follows. Recently submitted IPs form the agenda, and the IPs are dealt with sequentially. This partitions the meeting into natural segments of about 10-30 minutes each. Tivoli has been prepared ahead of time with a page for collecting notes for each IP. Thus the activity of switching Tivoli pages produces indices of these IP segments.

During each IP segment there are two kinds of activity, discussion and conclusion. Discussion activity takes place across the table and involves all the participants, who are interacting with each other. This is the most critical activity, and we want our technology to be non-intrusive. For example, we do not require that the participants focus on the LiveBoard. Ron takes notes during the discussion on the laptop; he acts as a recorder, only occasionally participating in the discussion himself. Ron's notes are not totally private, however. We found it useful to "beam" his notes from the laptop to the LiveBoard as he takes them. Participants tend not to orient towards these beamed notes, but rather monitor them "out of the corner of their eyes" to make sure their contributions are being noted by Ron.

Ron brings the discussion activity to a close and initiates the conclusion activity of the IP segment. This activity involves making a decision on how to handle the IP and noting any associated action items. This activity is different from the open discussion in that Ron stands at the LiveBoard, where he marks the rating and disposition of the IP and handwrites the action items. Although there is discussion during this part, the participants are more focused on the board and on Ron than on each other, because they all want to see and make certain they concur with the conclusions.

Later, Ron documents the discussions and conclusions reached in each meeting (this may be days or weeks afterwards). He does this in the access setting, which is simply an office with a Sun workstation and audio/video playback devices, pictured in Figure 2. We call this particular configuration the "Salvage Station."


Figure 2. An access setting. The Salvage Station consists of a workstation,
monitor, and speaker. The workstation contains the LiveBoard display,
playback controls, and an editor for creating documentation.

Creating the documentation in this setting involves a careful review of the captured materials. Ron summarizes each IP discussion in about a page of text. His typed notes from the meeting are often cryptic, serving more as indices into the recordings than as substantive summaries in themselves.(6) The Salvage Station, shown as a screen shot in Figure 3, provides him with playback controls, a Tivoli application showing the same pages as were created on the LiveBoard in the meeting, and a text editor to create the documentation. Every mark or note that was made on or beamed to the LiveBoard serves as a time index into the recordings. Thus, in order to get access to the recorded materials for a given IP, he buttons the Tivoli application to display the page containing the notes for that IP. Then by simply touching a mark or note on the displayed page he causes the media to play from the time when that mark or note was made in the meeting. This gives him meaningful random access into the recorded material. He listens and relistens to some portions of the recordings until he understands the significance of what was said. Although he may alternate between listening and typing the summary documentation, he often does both simultaneously, listening only for important points that he may have missed the first time around.


Figure 3. Salvage Station screen showing Tivoli, a text editor (buffers
of meeting notes and summary), and a simple timeline controller.


Results of Using the Tools in This Domain

Our work in supporting this application domain is ongoing, and we are continuing to develop new tools. However, even at the stage of development described here, we have several qualitative indications that the tools were successful. First, we were able to support the meetings without disrupting them or forcing changes in practices. Second, the meeting participants commented favorably when they saw that the documentation was qualitatively improved by Ron's having access to the captured materials. Third, the meetings proceed more freely, because the participants seem to have greater confidence that their contributions will not be lost. Fourth, Ron is pleased to have the capability to produce better documentation. The trade-off is that it takes him longer to produce this more thorough documentation; but this situation should improve as we refine our access applications.


The Coral Architecture

Coral is not a close-knit system; rather, it is a loosely coupled confederation of tools that work together through protocols of communication. We decided early on that the most practical way to evolve the system was to construct a set of applications that share access to a distributed multimedia infrastructure and that export interfaces to each other to permit various kinds of cooperative action. For example, a text editor that participates in this milieu might, by virtue of its connections to WhereWereWe and Tivoli, be making annotated events in a database that will later serve as pointers back into a multimedia recording, driving the current page of the LiveBoard display when the text editor is flipped from page to page, and "beaming" segments of the editor's text onto the screen of the electronic whiteboard for shared discussion or review.

Each of the tools in the Coral confederation is fully functional without one or more of the others (as we know from various weeks where we were still sorting out bugs in the individual pieces), but the combination of the tools makes for a more powerful union. Thus, the "system design" is emergent, based on developing a shared infrastructure and protocols for exporting functionality to neighboring tools. Toward this end, major efforts were focused on the design of a suitable application programmer's interface (API) to the WhereWereWe multimedia system resources and similar APIs to other tools (e.g., Tivoli's beaming functionality), which are exported with a distributed object protocol.

Inter-Language Unification (ILU) and WhereWereWe

The tools described here, and WhereWereWe in particular, rely heavily on the Inter-Language Unification (ILU) project for their implementation [Janssen, 1994]. ILU is a distributed object system that lets users easily build objects which exist in one address space on a particular machine but have "proxy" objects that exist in the same or different address spaces on the same or other machines in a network. Either the real or proxy objects may be implemented in any of the following languages: C++, C, Python, Common Lisp, Fortran, tcl,(7) or Modula-3. The tools under discussion currently make use of only the C++, Python, and Modula-3 language bindings.

The proxy objects are operated on by "client" programs which can assume these proxies to be performing their respective functions, although in reality the proxy makes a remote procedure call to the real object (the "server" object), which actually carries out the operation. For example, a WhereWereWe client may create an ILU object from the API (e.g., a video Player object), and not concern itself with the details of the fact that in reality this object is communicating with the WhereWereWe server to carry out its functions. Furthermore, the object may be transparently shared--WhereWereWe allows multiple proxies to represent the same server object to facilitate resource sharing. For instance, multiple users may share a single video Recorder object, each potentially having control over the state of the device (e.g., pause and resume, frame rate), thus reducing hardware resource demands for compression and storage. Most of these extensions are transparent to typical ILU API users, and the simpler nomenclature of clients and servers will be retained throughout this paper.
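
To make the client/proxy relationship concrete, here is a minimal, self-contained Python sketch of the pattern. Nothing in it is ILU code; the class names and the play operation are invented for illustration, and a real proxy would forward each call over the network rather than to a local object.

    # Hypothetical illustration of the proxy pattern described above.
    class PlayerServer:
        """The 'real' object, living in the server's address space."""
        def play(self, start_time):
            print("server: playing stream from t=%.1f s" % start_time)

    class PlayerProxy:
        """The surrogate seen by clients; it forwards each call to the server."""
        def __init__(self, server):
            self._server = server   # in ILU this would be a network connection
        def play(self, start_time):
            # In a real system this call becomes a remote procedure call.
            return self._server.play(start_time)

    # Client code only ever touches the proxy.
    remote_player = PlayerProxy(PlayerServer())
    remote_player.play(120.0)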


WhereWereWe Model

The application programmer can see seven basic abstractions in the WhereWereWe API. These are all objects, which belong to the following classes: Session, Stream, Event, Player, Recorder, Data, and Notifier. Most applications need more than one of these classes, but few (if any) need all of them. The classes can be broken into three groups.

Sessions, Streams, and Events are classes of objects that are used to do naming in WhereWereWe. Sessions are named collections of Streams, which correspond to semantic occasions, such as "Project meeting from October 15". Streams are media data that can be played back, such as audio, video, or a program activity log. Events are "occurrences" that happen at some point or interval in a Stream. This association with a Stream is purely for the convenience of retrieval, but is one natural way of thinking about the relationship between Events and Streams. Also, each of these three classes supports a property list on each instance of the class, so that application programmers may associate arbitrary application-specific data with each object.

Players, Recorders, and Data objects are used to convert Streams into other forms. A Recorder both creates a new Stream object and takes responsibility for storing the data associated with that Stream (often this is simply a disk file which can be replayed later). A Player displays for the user the data of a previously recorded Stream. A Data object converts a Stream's recorded data into a raw form that a processing application can use as input for its algorithms.

A Notifier object is used by client applications that need to stay informed of the status of ongoing playback or recording activities.

It should be noted here that WhereWereWe can be thought of as "glue" that allows index-making and browsing activity for stream data. It does not attempt to provide general media playback services, but rather provides an infrastructure into which such services can be inserted and utilized in a uniform way. WhereWereWe, at present, has built-in drivers for digital audio and video in one format.(8) WhereWereWe also provides a limited mechanism for additional drivers to be installed and used with no changes to the client software and minimal changes to the server software.

These WhereWereWe API elements may be combined in many useful configurations.
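
As a rough illustration of one such configuration, the following Python sketch shows how a capture client might use several of these abstractions together. The class and method names are assumptions made for this example (Player, Data, and Notifier objects are omitted); they are not the actual WhereWereWe API.

    import time

    class Event:
        """An occurrence at some point or interval in a Stream."""
        def __init__(self, start, duration=0.0, properties=None):
            self.start, self.duration = start, duration
            self.properties = properties or {}   # arbitrary application data

    class Stream:
        """A body of recorded, playable media data (audio, video, a log)."""
        def __init__(self, kind):
            self.kind, self.events = kind, []

    class Session:
        """A named collection of Streams for one semantic occasion."""
        def __init__(self, name):
            self.name, self.streams = name, []

    class Recorder:
        """Creates a new Stream and takes responsibility for storing its data."""
        def __init__(self, session, kind):
            self.stream = Stream(kind)
            session.streams.append(self.stream)

    # A meeting-capture client might do roughly this:
    session = Session("IP panel meeting, October 15")
    audio = Recorder(session, "audio")        # begins capturing the audio stream
    board = Recorder(session, "tivoli-log")   # captures the LiveBoard activity log

    # A notetaking tool submits a zero-length Event as an intentional index.
    board.stream.events.append(
        Event(start=time.time(), properties={"note": "decision: file patent"}))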


Description of Two Primary Tools

The initial implementation of the WhereWereWe infrastructure was completed in September 1993. Starting about midway through the development, and still continuing, a number of applications have been constructed or modified to take advantage of the facilities it offers. To explain what makes the WhereWereWe API useful to application writers, we will discuss its use with a pen-based application (Tivoli) for indexing from casual notetaking activity, and an Emacs mode we wrote to gain experience with textual annotation.(9)

Tivoli

Tivoli is the large-scale, pen-based electronic whiteboard application running on the LiveBoard to support publicly-viewed image manipulation in meetings.(10) While its many electronic whiteboard features have been documented elsewhere [Pedersen et al., 1993; Moran et al., 1995], a number of extensions facilitate seamless marking of activity for later retrieval, enable easy replay of multimedia records, and provide participants with a sense of the relation between the notetaking and recording functionality. Figure 4 shows a typical Tivoli screen layout in our application domain; this one is for use in rating invention proposals.


Figure 4. Typical Tivoli screen shot, showing an application-specific
form, pen-drawn strokes, keyboard text, and clock objects.

Tivoli has within it a stand-alone history mechanism that allows an "infinite undo" of drawing and editing operations; this history facility also gave us a leg up on allowing the drawing/editing process to be replayed. Late in 1993, Tivoli was extended to use the WhereWereWe API and become a marking and browsing application. Very little needed to be done to extend Tivoli to support indexing, as its history was already retaining timestamps of drawing and editing operations. The application was modified to write that information into the files that it retained about sessions where audio and/or video were recorded. Tivoli was eventually further modified to produce other timing indices, but considerable utility as a side-effect indexer accrued from simply tying into its existing history mechanism.

In addition to the indexing functionality outlined above, the Tivoli application can be used to drive the various WhereWereWe resources for playback. Since each stroke drawn on the LiveBoard is timestamped, it is possible to select a stroke and have WhereWereWe and Tivoli replay all of the recordings made at that time. Thus, the user can utilize a Tivoli page's display as an interface, answering questions such as "what was Joe saying when I jotted this down?" or "what's this all about?"(11) The strokes themselves thus constitute an important index into the activity.
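
The following sketch illustrates this stroke-to-playback mapping in Python. The stroke records and the player interface are stand-ins; Tivoli's actual data structures and the WhereWereWe Player objects differ in detail.

    strokes = [
        {"id": 1, "made_at": 312.4},   # seconds into the session when drawn
        {"id": 2, "made_at": 980.1},
    ]

    class FakePlayer:
        def seek(self, t):
            print("seek media to t=%.1f s" % t)
        def play(self):
            print("playing")

    def on_stroke_selected(stroke_id, players):
        """Replay every recorded stream from the moment the stroke was made."""
        t = next(s["made_at"] for s in strokes if s["id"] == stroke_id)
        for p in players:
            p.seek(t)
            p.play()

    on_stroke_selected(2, [FakePlayer()])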

The user is also presented with a simple timeline interface to the playback functions. This timeline panel offers the user several kinds of control that are not available with the direct selection interface described above: gross controls, such as going to a general portion of the recording (say, "2/3 of the way in") or starting and stopping the session playback, as well as finer-grained controls, such as hopping forward or backward several seconds to catch an unintelligible utterance.

Oftentimes, much of the graphical activity that produces side-effect indices comes after the event it refers to--making notes about a point that was made, or sketching a suggested solution. We added a feature that blurs the boundary between Tivoli as a side-effect indexer and as an intentional indexer. Users can insert a sort of temporal bookmark, a graphical object whose primary purpose is to initiate later replay from its creation time. These graphical objects display as a clock face (showing the time that they were created, often allowing users to see the progression of a discussion) and often further serve as bullets in list items (Fig. 4). These clockmarks have become very popular graphical elements, strewn throughout the pages of Tivoli.


WEmacs

GNU Emacs is a popular text editor [Stallman, 1993], which we have extended to interact with WhereWereWe. Over the course of our project, we have used two methods for interfacing it to the infrastructure and to other tools. First, the internal Emacs Lisp interpreter was extended to allow calls to ILU objects, thus permitting Emacs Lisp to create the necessary WhereWereWe objects directly. Later, in order to be compatible with more versions of the editor, we moved the mechanism for ILU connection into a Python subprocess that sits beneath the editor to communicate with WhereWereWe and Tivoli.

A simple interface, especially designed for portable computers, has been built for notetaking with an Emacs connected to WhereWereWe (this has been dubbed WEmacs). In this interface, the user can "make a note" about something in the current Session(12) by typing a particular keystroke (currently Tab). WEmacs generates an Event whose start time is the current time and whose duration is zero; this Event is now in the WhereWereWe database for later use by browsers. WEmacs currently represents this Event as a distinguished line of characters (a "timestripe") in the buffer. Whenever an additional timestamp is indicated, the editor submits the region between it and the previous one as an annotation (which goes onto the event's property list) on the prior Event. Submitting the Event does not preclude further changes; saving also parses the entire buffer and updates the annotations (as well as saving enough other pertinent information so that the Session can be revisited later). Playback works similarly; WEmacs' algorithm looks back for the previous Event string and begins play from the corresponding Event.
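
The notetaking logic can be summarized with a short Python sketch. The real implementation is split between Emacs Lisp and the Python subprocess and submits Events to the WhereWereWe database; the function names and the in-memory event list below are simplifications invented for this example.

    import time

    events = []   # (start_time, annotation) pairs, oldest first

    def on_tab_pressed(text_since_last_stripe):
        """Close out the previous note and open a new zero-length Event."""
        if events:
            start, _ = events[-1]
            events[-1] = (start, text_since_last_stripe)   # annotate the prior Event
        events.append((time.time(), ""))                   # new Event, duration zero

    def on_playback_requested(index, player):
        """Begin playback from the Event whose timestripe precedes the cursor."""
        start, _ = events[index]
        player.seek(start)
        player.play()

    on_tab_pressed("")                                    # first timestripe of the session
    on_tab_pressed("Joe: related art in the '92 paper")  # annotates the first Event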


Figure 5. Emacs' WhereWereWe mode, showing the timestamping
character strings and user annotations.

WEmacs also interacts directly with Tivoli, using an ILU interface exported for external control of the application. When WEmacs is in automatic update mode, each time the user enters a tab character, in addition to submitting the previous annotation to the WhereWereWe database, the event is "beamed" to the LiveBoard, appearing on the current Tivoli page, further denoted by one of the clock objects (set to the event's initiation time from its timestripe) discussed above. If necessary, Tivoli scrolls to assure that the beamed text is visible.

The WEmacs user is also provided with other controls of the Tivoli application. Scrolling and page changing are available to the seated user; this can be very convenient for reflecting (even initiating) agenda progress in meeting settings, or when preparing to go to the board for pen-based drawing activity.

WEmacs also functions in the retrieval setting, both as a text editor for refining meeting notes, and as a method for controlling the session appliances.


Discussion: A Broader Range of Uses

Focusing on a particular domain was useful for grounding our work: It allowed us to assemble and evolve a set of really usable and useful tools. It demanded that we keep our participants satisfied. It required us to make our systems robust enough to be continuously operational. And it gave us insights into the subtleties of collaborative processes and the effects that tools (even seemingly unobtrusive ones) can have on them. But the other side of the coin is that we can become too narrowly focused. We believe that these kinds of tools are applicable to a much broader range of collaborative situations and uses.

In our domain, only one person was taking notes; and thus all the notetaking indices reflected that one person's point of view. In many situations, such as less constrained discussions or brainstorming, it may be important to see the activity from multiple points of view. This suggests a multiplicity of various kinds of devices that give each individual participant the capability to take notes and create indices.

Another limitation that seemed reasonable in our domain was that we focused only on recording audio, under the assumption that most of the content of the activity was carried in the audio. This is not true in many situations, such as engineering design, where physical artifacts play a crucial role in the activities. Even in our domain, there are many indexical actions, such as pointing to a particular point in a document during the discussion, that we do not capture. While the WhereWereWe infrastructure supports video capture and playback, the storage requirements were deemed prohibitive for the many hours of recording we anticipated. We look forward to improved support for video, and foresee interesting challenges in effectively utilizing video in these multi-participant settings.

In our domain, indices are created manually at a fairly coarse grain--notes are taken at one per minute at the fastest--and long periods of time can go by without any indices being created. More complete and finer indices can be produced by analyzing the audio to identify the speakers, which is also important since people often tend to orient to who said what.

It is on the access side that perhaps the broader opportunities lie. In our domain, there is a particularly demanding access task--generate a detailed summary of the content of the discussion. As we have noted, this is a costly process. But users might access captured materials for quite different reasons. One may want to quickly "skim" a session to be reminded of what transpired (e.g., in preparation for another meeting). Another may want to find information germane to a new IP they're considering. These suggest tools geared for skimming (e.g., Arons' Speech Skimmer [1993]) or searching.

One crucial variable that affects the nature of the access task is whether the person was present at the captured activity or not. One who was at the captured session can rely on a myriad of remembered cues (e.g., the interesting part happens right after John left the meeting). The person who was not there is "flying blind" and will place greater reliance on the captured indices. This variety has numerous user interface implications, and needs to be explored.

Thus far, textual documentation has been the dominant product of the captured activity. But there is a wide variety of different kinds of multimedia documentation of collaborative activities. Creating a detailed textual summary is a demanding task, whereas creating a few pointers to the highlights of an activity might be much easier to do, and may be just as useful in many situations. The spectrum of possibilities needs to be explored.

Although these opportunities for further use are attractive, they must be approached with a sensitivity to issues of security, access, and context. The users supported by the current system speak freely in the capture setting, knowing that only Ron has later access to their comments. If a change to this situation is being considered, we must respect the privacy of our users, and make clear the range of reuse that is possible. We also believe that material of this sort could be damaging (and/or useless) if it is decontextualized, so we face further challenges as we attempt to broaden the scope of this work.


Other Tools

A number of other tools have been prototyped or developed by project members to fit into the Coral architecture. Many are being evaluated for use in the invention proposal review meetings. Some fit well enough together with the other activity capture applications and the needs of the users that they will probably become part of the routine suite of tools; others may only influence design of new tools or result in modification of existing ones.

Pedals

Our existing range of clients currently requires a high level of buy-in to the indexing activity; often, there are interesting events that require no further elaboration to be useful for later retrieval (or where later browsing will immediately reveal why they were made).(13) The Pedals application aims to provide an extremely simple interface--a single switch--for creating indices. Pedals uses an external microcontroller to read digital inputs (analog sensing is available, but, as yet, unused) and trigger the submission of labelled Events in the database. We've recently begun exploring applications of this very general form of marking--ranging from a simplified means to allow every meeting participant to indicate events of interest (e.g., a topic they'd like to discuss with a colleague, a new invention idea) to using switches to instrument a capture setting (e.g., detect a copier door opening).
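
A Pedals-style client can be very small. The sketch below simulates the switch input and keeps events in a local list; a real client would poll the microcontroller and submit labelled Events through the WhereWereWe API.

    import time

    events = []

    def switch_closed(label="pedal"):
        """Called when the switch closes; records a labelled, zero-length event."""
        events.append({"label": label, "time": time.time()})

    # Simulated input: three presses during a meeting.
    for _ in range(3):
        switch_closed("interesting-point")
        time.sleep(0.01)

    print(len(events), "events captured")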


Speaker Identification

Speaker segmentation [Wilcox et al., 1994] is a method of deriving indices from a collected audio stream. Utilizing hidden Markov models, the technique can provide an indication of who was speaking at various points in the audio record of an event. This is now being integrated into a timeline representation of the activity capture records [Kimber et al., 1995]. This should prove particularly useful in settings where few intentional or side-effect indices are available, or for locating a particular contribution (e.g., John's comment on prior art).


PARCTab Marking

A laptop computer can be an obtrusive and inappropriate device in many settings--many users may be happier with jotting a quick note on a personal data assistant (PDA). As part of the shakedown of the WhereWereWe server and API, a simple marking client was written for the PARCTab [Adams et al., 1993], a networked PDA. This is a dedicated WhereWereWe application, wholly written for the purpose of marking digital video and audio. It uses a different ILU stubber, for the Modula-3 language, whereas the others use C++ and Python.

The MarkTab application uses a unistroke alphabet recognizer [Goldberg and Richardson, 1993] to allow free text entry on the Tab's touch-sensitive screen, and the Tab's simple 3-button arrangement for control. The user can think of the MarkTab application as a recipe box of 3x5 cards, one card for each event--cards that may additionally be sorted (at creation time) into categories. One of the buttons on the Tab cycles through the categories, another signals an event and gives the user a blank card of the category that was selected when the event button was depressed. The user then uses the stylus to jot down, with unistrokes, an arbitrarily detailed textual annotation about the event in question and indicates (with a soft button) that she is done.


Marquee

Text isn't always the most intuitive or fitting means for making personal annotations. Marquee [Weber and Poon, 1994] is a pen-based video logging tool that enables users to correlate personal notes and keywords with a videotape during recording. The Marquee log consists of a scrolling note-taking area which is divided into a series of timezones. Once a timezone is created, by drawing a line across the tablet, the user may then make notes appropriate to the time of the event. In addition, the user is provided with a facile way of categorizing the material by applying keywords to the timezone.

Marquee was built to index analog video tape; it was modified to use WhereWereWe as its indexing and streams capture and replay subsystem. When the modified Marquee starts up, it creates or joins the recorders for the streams of a WhereWereWe session, and then begins operation in its notetaking role. When it needs a "timestamp" (where it previously would retrieve a videotape time code), Marquee now asks WhereWereWe what the current time is(14) and records that instead of the SMPTE time code information. When Marquee is put in "review mode" it creates or joins a set of players and instructs them to seek to points in absolute time that it noted previously. For simplicity of the transition from analog to WhereWereWe, Marquee maintains its own ink database; the switch to events would not be difficult.
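
The timestamp substitution amounts to replacing a tape-relative code with an absolute time from the recording infrastructure. The sketch below contrasts the two paths; the "current session time" call is a stand-in for the query Marquee makes to WhereWereWe.

    import time

    def smpte_to_seconds(code, fps=30):
        """Old path: convert an 'HH:MM:SS:FF' videotape code to seconds."""
        hh, mm, ss, ff = (int(x) for x in code.split(":"))
        return hh * 3600 + mm * 60 + ss + ff / float(fps)

    def current_session_time():
        """New path: ask the infrastructure for the current absolute time."""
        return time.time()   # the real call goes to the WhereWereWe server

    print(smpte_to_seconds("00:12:03:15"))   # 723.5
    print(current_session_time())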


Discussion: Infrastructure and Tools

As outlined earlier, Coral grew out of merging three existing projects--Tivoli, WhereWereWe, and ILU--in a domain where a natural opportunity presented itself. The WhereWereWe infrastructure had been designed to support the addition of new client applications, and the Tivoli program was recognized as an application whose history mechanism was well-suited for producing index marks. As it turned out, it was not difficult to hook WhereWereWe and the Tivoli code together using the ILU distributed object system; Tivoli presented no particular complexities, nor was this application a particularly demanding one from the point of view of WhereWereWe. In fact, the simplicity of that initial hookup was deceptive; keeping the system on-line for regular weekly use highlighted areas where we were depending on software that is less than 100% reliable, and numerous shortcomings were revealed over the course of the last 18 months of heavy use.

Coral pushes on numerous aspects of software engineering where the state of the world leaves something to be desired. However, we feel that we are well poised to take advantage of improvements in these related areas--distributed systems, operating systems, object-oriented databases, multimedia compression and network transport, and others.

For example, the infrastructure could benefit greatly from synchronization primitives provided by the underlying operating system. If WhereWereWe had more control over the timing of its media streams, better playback synchronization performance would result, especially for streams that need to stay synchronized for long periods of time and/or cross several pause and resume boundaries.

Another example is in the area of distributed object storage. WhereWereWe currently implements its own object persistence atop a standard relational database. This method is completely ad hoc, and the performance of our initial stab at the problem suffers from our not knowing the characteristics of our eventual use (we do a form of lazy evaluation that often ends up resulting in many more database hits than would have been necessary). While we've now had enough experience that we're poised to do a better job with a second implementation, it is very clear that a system which focused its efforts on providing fast and efficient persistent object storage and retrieval would be of considerable value.

Perhaps the most successful implementation decision in Coral was to develop it such that minimal buy-in was required for an application to begin participating in the Coral framework. A program that wished to become an indexing client simply needed to locate a master WhereWereWe object and submit Events. Simple dedicated indexing clients can be written in less than a page of Python; piggybacking on Tivoli or Emacs requires minimal initial programming investment. This meant that, after minimal modifications, programs could participate in activity capture settings without major interruption to their own research agendas.(15) Coral's basis as a loose confederation has proved to be very powerful, because applications can choose to participate at a variety of levels.
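
The sketch below gives the flavor of such a minimal indexing client. The master object and its methods are placeholders standing in for what a real client would locate through ILU; they are not the actual WhereWereWe interface.

    import time

    class FakeMaster:
        """Stand-in for the master WhereWereWe object a real client locates via ILU."""
        def current_session(self):
            return []   # a real client would get back a Session object
        def submit_event(self, session, start, note):
            session.append({"start": start, "note": note})

    master = FakeMaster()
    session = master.current_session()

    # The whole job of the client: stamp and submit Events as things happen.
    master.submit_event(session, time.time(), "demo started")
    master.submit_event(session, time.time(), "question from the audience")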

Generalizing and extending the infrastructure turned out to be more difficult than desired. For example, the inclusion of new media types, perhaps higher quality audio or video, required that WhereWereWe be recompiled. While not fundamentally a big deal, this ran counter to the spirit of a collection of loosely connected elements. We have since redesigned and repartitioned the Coral infrastructure to include the notion of independent media servers which implement and serve all media-specific functionality, using something like the MIME types mechanism to determine what media server is needed for a particular stream, and a broker to connect to an existing instance or launch a new one.

Time is a slippery quantity in the WhereWereWe internals, the API, and in many of the application programs. Although the use of absolute time makes many problems simpler, the complexity of time in the tools does not vanish. Indexing applications used during playback, creating post hoc indices, obviously create marks whose creation time is not coincident with the time they mark--both times need to be retained, but their representation is problematic. Further, it is clear that client applications may need the ability to access the future with events, as they may need to begin the event generation process before the actual event occurs. While the application programmer can easily reference these quantities using absolute time and the current WhereWereWe API, supporting these capture and access concepts is a significant implementation challenge.
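
One candidate representation, sketched below, keeps the Event's start at the moment being marked and puts the moment the mark was made on the property list. This is only an illustration of the two-times problem, not the representation WhereWereWe has settled on.

    import time

    def make_post_hoc_index(marked_time, note):
        """Build an index whose target time differs from its creation time."""
        return {
            "start": marked_time,            # point in the session being marked
            "duration": 0.0,
            "properties": {
                "note": note,
                "created_at": time.time(),   # when the reviewer actually made the mark
            },
        }

    print(make_post_hoc_index(marked_time=754.2, note="decision announced here"))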

While Coral's confederation approach has worked well for getting a suite of diverse applications working together, it has resulted in a few problems. Applications have the opportunity to stay blissfully unaware that they are participating in an activity capture setting. We need to provide lightweight ways to keep the tools adequately coordinated. For example, WEmacs beams text up onto the current Tivoli page, submits it to WhereWereWe as an event, and retains a local copy in its buffer. Modifications made in any one of those locations are not necessarily reflected in the others. Our current suite of applications has evolved a set of ad hoc interfaces for portions of this functionality (e.g., WEmacs to Tivoli for beaming does not go through the WhereWereWe infrastructure). We are working on an extended notification system--one that includes events--that will help with some of these difficulties, but a general solution to this problem remains a major challenge.

At the level of applications, we are still gaining experience with various types of functionality and their numerous interactions. WEmacs and Tivoli offer a solid start in using multimedia capture in simple capture settings, but are both somewhat lacking in the access setting. If hooking Tivoli to WhereWereWe spotlighted how any program with a time-based history is already 90% of a marking client, then writing access applications is revealing how everything is potentially a stream. If we want Tivoli or WEmacs to look the way it did when a particular utterance was made, then the best way to have that happen is for Tivoli or WEmacs to act as players. Tivoli has already been augmented with some of this functionality, working in both a playback mode, which animates the exact appearance and construction of past states, and a "bouncing-ball" mode, where a cursor points to the area where drawing or editing was happening.

Once more and more of the functionality of the capture and access tools is exported via recorder and player interfaces, we gain a uniformity that can be exploited to solve other interface problems. Currently, using the suite of tools for review is plagued by a variety of applications, each of which may want to control the playback of assorted multimedia streams and of each other. This coordination has been the source of many of the ad hoc inter-process communication paths described above, or of compromises in user interface generality. Once these programs all appear as players, they can then more easily be gathered into composite players and uniformly handled.
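
The following sketch shows what such a composite might look like once every tool exports a uniform player interface; the interface itself (seek, play, pause) is invented here for illustration.

    class CompositePlayer:
        """Fans each playback operation out to all of its member players."""
        def __init__(self, members):
            self.members = list(members)
        def seek(self, t):
            for m in self.members:
                m.seek(t)
        def play(self):
            for m in self.members:
                m.play()
        def pause(self):
            for m in self.members:
                m.pause()

    class PrintPlayer:
        """Stand-in for an audio player, or for Tivoli/WEmacs acting as a player."""
        def __init__(self, name): self.name = name
        def seek(self, t): print(self.name, "seek", t)
        def play(self): print(self.name, "play")
        def pause(self): print(self.name, "pause")

    review = CompositePlayer([PrintPlayer("audio"), PrintPlayer("tivoli")])
    review.seek(93.0)
    review.play()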

The unification of tools into composite streams quickly gathers other potential uses. Recording the activity of a composite stream, i.e., its constitution and the messages it distributed to its member objects, will allow us to play back a playback session. This is potentially a crucial notion when a user wants to review the accessing done by another user (e.g., seeing what a close colleague found interesting in a recorded seminar). These situations quickly bring up the time and past- vs. present-event subtleties discussed above.

We currently have minimal query support; application programmers end up writing code to sift through all the Events for a Session in order to find those that they want to represent. As we shift to a greater focus on accessing, we will need finer-grained query support for getting subsets of events and sessions. Furthermore, we will need to devise formalisms for formulating and performing temporal queries.
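
The sifting that application writers currently do by hand looks roughly like the Python below (with plain dictionaries standing in for Event objects); finer-grained query support would move this selection, and eventually temporal queries, into the infrastructure.

    events = [
        {"start": 120.0, "properties": {"source": "wemacs", "note": "IP 9417"}},
        {"start": 305.5, "properties": {"source": "tivoli"}},
        {"start": 410.0, "properties": {"source": "wemacs", "note": "action item"}},
    ]

    def events_from(source, within=None):
        """Select a Session's Events by originating tool and optional time window."""
        lo, hi = within if within else (float("-inf"), float("inf"))
        return [e for e in events
                if e["properties"].get("source") == source and lo <= e["start"] <= hi]

    print(events_from("wemacs", within=(0, 300)))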

Although video is a supported datatype in the infrastructure and current suite of tools, we have had little chance to adequately explore its utility. The low resolution and framerate of our current video offering leave much to be desired, particularly in settings where documents or detailed physical objects are of interest. On the other hand, the sheer size of video streams is, in large part, the root of our inexperience with video, so improved quality will need to be balanced against the costs of transmission and storage. We are improving the quality and reliability of the video datatype in this current reworking of our infrastructure, and expect to be using it more prevalently in the near future.

The Coral architecture and the particular tools described here have already proven remarkably flexible, and are proving their utility in regular use. As the examples illustrate, the WhereWereWe API, coupled with ILU, makes it relatively painless to explore multimedia recording, playback, and indexing in a variety of settings. The Coral suite of tools is providing us with a foundation for interesting applications and has supplied invaluable fodder for our current infrastructure revision efforts. We think there are further gains that can be made, particularly in areas involving automatic and semiautomatic indexing, studies of use and refinement of the applications, and in browsing tools for timestream data.


Related Work

The work has its historical roots in CoLab [Stefik et al., 1987] and Media Space [Stults, 1986]. From the former came a focus on meeting support tools and from the latter a focus on multimedia communications environments. What has emerged is neither the intersection nor union of those two projects; in fact, few multimedia systems have aimed at recovering casual information from everyday work settings. Most research in multimedia systems has concentrated on either real-time interaction or static, authored, multimedia documents.(16) A couple of projects have superficially similar motivations, but tackle other aspects of the problem. There are electronic meeting rooms to enhance decision making, multimedia systems for instructional presentation or usability testing, systems that augment human memory recall, video-on-demand systems, video conferences, and hypermedia systems with audio and video to organize records of informal activity.

Electronic Meeting Rooms. Early electronic meeting rooms, such as the EDS Capture Lab [Mantei, 1989], attempt to provide computer support for meeting process. The Capture Lab and our project both support and extend some otherwise paper or whiteboard-based activity using computational tools. However, the Capture Lab was focused on decision-making and more formal meeting process, while our project involves making records from informal aspects of meeting room activity.

More recent meeting-room systems that have included multimedia focus on making and accessing recordings of technical presentations; e.g., Bellcore's STREAMS [Cruz and Hill, 1994] is aimed directly at this application. Importantly, these tend to be monolithic systems with a clearly defined model of use which all tools buy into: there is a speaker/audience model of setting, they are integral with a multimedia telecommunications system, and notetaking is a purely private activity outside the scope of the system. Consistent with the focus on presentation, such systems provide individuals with means of locating and displaying meetings in remote or post-facto settings. In contrast, our system does not distinguish or privilege particular users' activity and integrates through a confederation strategy.

Memory Aids. One way in which the recordings and notes are employed is to improve recollections of the meeting; a couple of systems have tackled this problem area directly. The IBM We-Met system [Wolf et al., 1992] started down this path; it was followed by HP's Filochat [Whittaker et al., 1994], which used a pen-based computer and digital audio recording to provide a single user with a means to take notes in a meeting and, by selecting the handwritten note, replay the recording made when the note was taken. Although discussing many issues common to our effort, the Filochat work, with its emphasis on personal use, excludes many aspects that arise when that same functionality becomes collaborative and is offered as a network service.

Pepys [Newman et al., 1991] kept an automatic diary of offices visited and colleagues encountered using a network of sensors and communicating identification badges; it did not employ recording--in our parlance, it created events but not streams. Although the system demonstrably stimulated recall of some events, remembering is but one step in recovering content from casual activity.

Xcapture [Hindus et al., 1993] is a short-term memory device; by constantly rerecording the last few minutes of audio, it is possible to replay something that was just uttered. This scheme obviates the need for marking but requires immediate action on the part of the user.

Video on demand. Video on demand systems allow a user to select a video clip (perhaps a long clip, like a movie) and have the video, audio, and perhaps supporting documents, be instantly available for viewing [Rangan et al., 1992; Rowe and Smith, 1992]. The data usually can be played back at various speeds and with random access. These systems concentrate on allowing synchronous access to data recorded at a previous time.

Teleconferencing. Video conferencing systems tend to be modelled on telephony; they give the user the capability to conduct a face-to-face type interaction with a user at a remote location [Fish et al., 1990; Watabe, 1990; Ahuja and Ensor, 1992]. These systems do not usually allow the user to review the session; the data is not stored in the system. These systems focus on connecting people who are synchronized in time.

Multimedia documents. Multimedia document systems focus on the construction, layout, and retrieval of mixed media documents, especially those containing video or high-resolution images [Buchanan and Zellweger, 1992; Hardman et al., 1993]. These systems focus on the presentation of previously constructed data, allowing asynchronous communication between author and reader. Of particular note in this category is Raison d'Etre [Carroll et al., 1994]. This system did not augment the capture of activity, but rather organized fragments of recorded video using an issue-based hypermedia framework. The source material consisted of video recordings of interviews with members of a design project that were then manually segmented and categorized. Thus, segment retrieval was conceptualized as a pre-structured (rather than emergent) activity, organized around the content rather than augmented by indices drawn from the activity itself.


Summary

This work has demonstrated that users can reap considerable benefit from appropriately designed and deployed activity capture and access technologies. Working closely with a set of motivated users has done much to hone our notions of what such systems might do. Further experiences of how these users' work practices and our technologies have coevolved over the 18 months we have been working together will be reported elsewhere.

The Coral confederation of applications resulted from a mixture of top-down and bottom-up development; the flexible approach permits expedient changes to serve the needs of our users, while supporting a smooth transition from prototype to architectural changes. The confederation approach has served us well over the course of the project, but elements of the system are currently being redesigned to better support the uses and demands that have emerged from our experiences with real applications and actual users. In particular, the move to media servers will permit easier exploration of new datatypes, and an improved notification system should ease application coordination.

We are indeed shifting some attention from the capture setting to the range of accessing that might be useful for a population of users. A wide range of scenarios surface here, from looking over a meeting that one missed to searching for a remembered comment to maintaining a group notebook. These applications take further advantage of the network and multi-user aspects of the infrastructure, allowing us to investigate the power of merging information from multiple users' marking activity and derived indices.

Activity capture and access via the recording of time-based data has turned out to be an extremely rich area with diverse research threads--speech signal processing, pen-based user interfaces, distributed object systems, real-time multimedia indexing, and so on. The niche of near-synchronous and pre-narrative multimedia has proven to hold opportunities for both novel applications and truly useful functionality.


Acknowledgments

Thanks to: Chuck Hebel and the many TAP members who have participated in the work described herein; our colleagues--Sara Bly, Dan Swinehart, Bryan Lyles, Don Kimber, Karon Weber, and members of the Collaborative Systems Area at PARC--for working with us as the ideas and systems in this paper were developed and implemented; and Tom Rodriguez and Victoria Bellotti, who also provided helpful advice on the work and on earlier drafts of this paper. We acknowledge the Collaborative Computing group at Sun Microsystems Inc. for providing their prototype digital video hardware, associated software, and support.


References

Adams, N., R. Gold, B. Schilit, M. Tso, and R. Want, "An Infrared Network for Mobile Computers", Proceedings of the USENIX Symposium on Mobile and Location-independent Computing, pp. 41-52, August 1993.
Ahuja, S., and J. Ensor, "Coordination and Control of Multimedia Conferencing", IEEE Communications Magazine, Vol. 30, No. 5, pp. 38-43, May 1992.
Arons, B., "Interactively Skimming Recorded Speech", Proceedings of the UIST '93 Symposium on User Interface Software and Technology, November 1993.
Buchanan, C., and P. Zellweger, "Scheduling Multimedia Documents Using Temporal Constraints", Proceedings of the Third International Workshop on Network and Operating System Support For Digital Audio and Video, November 1992.
Carroll, J., S. Alpert, J. Karat, M. Van Dusen, and M. Rosson, "Raison d'Etre: Capturing Design History and Rationale in Multimedia Narratives", Proceedings of the CHI '94 Conference on Human Factors in Computing Systems, pp. 192-197, April 1994.
Cruz, G., and R. Hill, "Capturing and Playing Multimedia Events with STREAMS", Proceedings of the Second ACM International Conference on Multimedia, pp. 193-200, October, 1994.
Elrod, S., R. Bruce, et al., "LiveBoard: A Large Interactive Display Supporting Group Meetings, Presentations, and Remote Collaboration", Proceedings of the CHI '92 Conference on Human Factors in Computing Systems, April 1992.
Fish, R., R. Kraut, and B. Chalfonte, "The Video Window System In Informal Communications", Proceedings of the Conference On Computer-Supported Cooperative Work, pp. 1-11, October 1990.
Goldberg, D. and C. Richardson, "Touch-Typing with a Stylus", Proceedings of the INTERCHI '93 Conference on Human Factors in Computing Systems, pp. 80-87, April 1993.
Hardman, L., G. van Rossum, and D. Bulterman, "Structured Multimedia Authoring", Proceedings of the First ACM Conference On Multimedia, pp. 283-289, August 1993.
Hindus, D., C. Schmandt, and C. Horner, "Capturing, Structuring, and Representing Ubiquitous Audio", ACM Transactions on Information Systems, Vol. 11, No. 4, pp. 376-400, October 1993.
Janssen, B., ILU Manual, Xerox Technical Report ISTL-CSA-94-01-02, January 1994.
Kimber, D., L. Wilcox, F. Chen, and T. Moran, "Speaker Segmentation for Browsing Recorded Audio", Proceedings of the CHI'95 Conference on Human Factors in Computing Systems, pp. 212-213, May 1995.
Mantei, M., "Observation of Executives Using a Computer Supported Meeting Environment", International Journal of Decision Support Systems, pp. 153-166, June 1989.
Minneman, S., The Social Construction of a Technical Reality: empirical studies of group engineering design practice, Ph.D. Dissertation, Stanford University, 1991.
Minneman, S., and S. Harrison, "Where Were We: making and using near-synchronous, pre-narrative video", Proceedings of the First ACM Conference On Multimedia, pp. 207-214, August 1993.
Moran, T., P. Chiu, W. van Melle, and G. Kurtenbach, "Implicit Structures for Pen-Based Systems Within a Freeform Interaction Paradigm", Proceedings of the CHI'95 Conference on Human Factors in Computing Systems, pp. 487-494, May 1995.
Newman, W., M. Eldridge, and M. Lamming, "Pepys: Generating Autobiographies by Automatic Tracking", Rank Xerox Technical Report, EPC-91-106, 1991.
Pedersen, E., K. McCall, T. Moran, and F. Halasz, "Tivoli: An Electronic Whiteboard for Informal Workgroup Meetings", Proceedings of the INTERCHI '93 Conference on Human Factors in Computing Systems, pp. 391-398, April 1993.
Rangan, P., H. Vin, and S. Ramanathan, "Designing an On-Demand Multimedia Service", IEEE Communications Magazine, Vol. 30, No. 7, July 1992.
Rowe, L., and B. Smith, "A Continuous Media Player", Proceedings of the Third International Workshop on Network and Operating System Support For Digital Audio and Video, November 1992.
Stallman, R., GNU Emacs Manual, Ed. 7, Ver. 18, September 1992.
Stefik, M., G. Foster, D. Bobrow, K. Kahn, S. Lanning, and L. Suchman, "Beyond the Chalkboard: Computer Support for Collaboration and Problem-Solving in Meetings", Communications of the ACM, Vol. 30, No. 1, pp. 32-47, 1987.
Stults, R., Media Space, Xerox Palo Alto Research Center Technical Report, 1986.
Watabe, K., S. Sakata, K. Maeno, H. Fukuoka, and T. Ohmori, "Distributed Multiparty Desktop Conferencing System: MERMAID", Proceedings of the Conference on Computer-Supported Cooperative Work, pp. 27-38, October 1990.
Weber, K., and A. Poon, "Marquee: A Tool for Real-Time Video Logging", Proceedings of the CHI '94 Conference on Human Factors in Computing Systems, pp. 58-64, April 1994.
Whittaker, S., P. Hyland, and M. Wiley, "Filochat: Handwritten Notes Provide Access to Recorded Conversations", Proceedings of the CHI '94 Conference on Human Factors in Computing Systems, pp. 192-197, April 1994.
Wilcox, L., D. Kimber, and F. Chen, "Audio Indexing Using Speaker Identification", Automatic Systems for the Identification and Inspection of Humans, Proceedings of SPIE 2277, pp. 149-157, July 1994.
Wolf, C., J. Rhyne, and L. Briggs, "Communication and Information Retrieval with a Pen-based Meeting Support Tool", Proceedings of the Conference on Computer-Supported Cooperative Work, pp. 322-329, November 1992.


Endnotes

(1)
The LiveBoard hardware currently used in this project is made by LiveWorks, Inc., a Xerox Company.
(2)
Time-based records include audio, video, and those computing records where a temporal element makes sense, such as program activity logs, email, presentation slides, and document versions.
(3)
The LiveBoard also supports shared use among distributed sites; we do not utilize this capability in the work reported here.
(4)
The video hardware employed in the work came from a collaborative research agreement with Sun Labs Inc.; the so-called DIME boards use the Intel i750 chipset and produce moderate-quality video at reduced frame rates. The quality of both the audio and the video leaves much to be desired; improvements to each are underway.
(5)
The intervention process and techniques will be described in a future paper.
(6)
Ron's notes have undergone a progression from discussion summaries to descriptions of activity (e.g., "Jeff's comments about prior art.") as he's gained confidence in the recordings. Ironically, this has actually permitted him to focus more of his attention on the discussion itself. The utility of the tools occasionally shows up in explicit discourse (e.g., "No reason to write that down; I'll pull it off the audio.")
(7)
The tcl ILU bindings do not currently support the construction of servers.
(8)
The audio format is Sun SPARCstation 8 kHz mu-law; the video format is Intel's RTV 2.0 (variable resolution and frame rate, tuned to available network and storage resources).
(9)
The API has also been used in a number of other applications, some of which will be discussed later in this paper.
(10)
It is the research version of the Windows application called "Meeting Board" that is shipped with the PC-based LiveBoard.
(11)
WhereWereWe's name derives from its ability to perform this retrieval operation while the activity is still being recorded; letting groups get distracted, then return to earlier points in their discussion and restart: "Where were we?"
(12)
The user specifies the name of the Session at WEmacs startup time. This manual step will be unnecessary when we fix the Session initializer to communicate this piece of information to WEmacs directly.
(13)
The clockmarks in Tivoli may be used in this way, but are limited to one user at a time and can interrupt the flow of a meeting.
(14)
WhereWereWe provides a service that allows client programs to request the current absolute time from the server. This facility is provided for clients running on platforms that are not participating in various clock synchronization protocols (such as NTP); a sketch of how a client might use such a service appears after these endnotes.
(15)
On the other hand, participation in Coral can bring up interesting research and interface issues. Considerable effort was devoted to making the Tivoli application an intuitive tool for scribbling and playback.
(16)
Furthermore, we believe that much of the emerging multimedia infrastructure work is missing the mark on particular classes of multimedia applications, particularly those where the delay between the production and use of the multimedia streams is short (near-synchronous), those where simultaneous reading and writing of a multimedia recording is important (akin, in the analog world, to reading from and writing to different places on the tape), and systems facile enough to support using unproduced video as a conversational prop (pre-narrative) [Minneman and Harrison, 1993].
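
As mentioned in endnote 14, clients that are not running a clock synchronization protocol can ask the WhereWereWe server for the current absolute time. The following is a minimal sketch, under assumed names, of how such a client might turn one request into a usable clock offset; the request_server_time callable and the halved round-trip estimate are illustrative, not the actual WhereWereWe interface:

    # Illustrative sketch (not the actual WhereWereWe API): estimate the
    # offset between the local clock and the server's absolute time.
    import time

    def estimate_clock_offset(request_server_time):
        # request_server_time is a hypothetical callable that returns the
        # server's absolute time in seconds since the epoch.
        t_send = time.time()
        server_time = request_server_time()
        t_recv = time.time()
        # Assume the server read its clock at the midpoint of the round trip.
        estimated_server_now = server_time + (t_recv - t_send) / 2.0
        return estimated_server_now - t_recv

A client can then timestamp its index marks as time.time() plus the returned offset, keeping them roughly aligned with the server's recording timeline.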