CS295J/Contributions for class 12

A predictive, fully integrated model of user workflow that encompasses low-level tasks, working spheres, communication chains, interruptions, and multi-tasking.

Owner: Andrew Bragdon

    1. Traditionally, software design and usability testing are focused on low-level task performance. However, prior work (Gonzales et al.) provides strong empirical evidence that users also work at a higher, working-sphere level. Su et al. develop a predictive model of task switching based on communication chains. Our model will specifically identify and predict key aspects of higher-level information work behaviors, such as task switching. We will conduct initial exploratory studies to test specific instances of this high-level hypothesis. We will then use the refined model to identify specific predictions for the outcome of a formal, ecologically valid study involving a complex, non-trivial application.
    2. Impact: Current information work systems are almost always designed around the individual task, and because they do not take into account the larger workflow context, these systems arguably do not properly model the way users work. By establishing a predictive model for user workflow based on individual task items, larger goal-oriented working spheres, multi-tasking behavior, and communication chains, developers will be able to design computing systems that properly model users and thus significantly increase worker productivity in the United States and around the world.
    3. 3-week Feasibility Study: To ascertain the feasibility of this project, we will conduct an initial pilot test to investigate the core idea: a predictive model of user workflow. We will spend one week studying the real workflow of several people through job shadowing. We will then create two systems designed to help a user accomplish a simple information work task. One system will be designed to take the larger workflow into account (experimental group), while the other will not (control group). In a synthetic environment, participants will perform a controlled series of tasks while receiving interruptions at controlled times. If the two groups perform roughly the same, we will need to reassess this avenue of research. However, if the two groups perform differently, then the pilot test will have lent support to our approach and core hypothesis.
    4. Risks/Costs: Risk will be an important factor in this research, and thus a core goal of our research agenda will be to manage it. The most effective way to do so will be to compartmentalize the risk by conducting, in parallel, the empirical investigations that will form the basis for the model in the separate areas of low-level tasks, working spheres, communication chains, interruptions, and multi-tasking. While one experiment may become bogged down in details, the others will be able to advance sufficiently to contribute to a strong core model, even if one or two facets encounter setbacks during the course of the research agenda. The primary cost drivers will be the preliminary empirical evaluations, the final system implementation, and the final experiments designed to test the original hypothesis. The cost will cover support for both Ph.D. and Master's students, as well as full-time research staff. Projected cost: $1.5 million over three years.

A mixed-initiative system for interface design

Owner: Eric

Proposal Overview

We propose a framework for interface evaluation and recommendation that integrates behavioral models and design guidelines from both cognitive science and HCI. Our framework behaves like a committee of specialized experts, where each expert provides its own assessment of the interface, given its particular knowledge of HCI or cognitive science. For example, an expert may provide an evaluation based on the GOMS method, Fitts's law, Maeda's design principles, or cognitive models of learning and memory. An aggregator collects all of these assessments, weights each expert's opinion according to its past accuracy, and outputs to the developer a merged evaluation score and a weighted set of recommendations.
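
As a rough illustration of this architecture (a sketch under assumed names, not a committed design), the committee and aggregator might be structured as follows:

```python
# Sketch of the committee-of-experts architecture described above.
# All class, method, and field names are hypothetical, not an existing API.
from dataclasses import dataclass


@dataclass
class Assessment:
    score: float                # the expert's evaluation score for the interface
    recommendations: list[str]  # the expert's suggested improvements


class Expert:
    """One committee member, e.g. a GOMS, Fitts's-law, or design-guideline evaluator."""
    name = "expert"

    def assess(self, interface) -> Assessment:
        raise NotImplementedError


class Aggregator:
    """Merges expert assessments, weighting each expert by its past accuracy."""

    def __init__(self, experts: list[Expert]):
        self.experts = experts
        # Start from uniform weights; they are revised as accuracy data accumulates.
        self.weights = {e.name: 1.0 for e in experts}

    def evaluate(self, interface):
        assessments = {e.name: e.assess(interface) for e in self.experts}
        total = sum(self.weights.values())
        # Merged score: weighted average of the individual expert scores.
        merged_score = sum(self.weights[n] * a.score
                           for n, a in assessments.items()) / total
        # Each recommendation inherits the (normalized) weight of its expert.
        weighted_recs = sorted(
            ((self.weights[n] / total, rec)
             for n, a in assessments.items() for rec in a.recommendations),
            reverse=True)
        return merged_score, weighted_recs
```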

Different users have different abilities and interface preferences. For example, a user at NASA probably cares more about interface accuracy than speed. By passing this information to our committee of experts, we can create interfaces that are tuned to maximize the utility of a particular user type.
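
For example, a hypothetical utility function for such an accuracy-critical user type might look like the following; the metric names and weights are assumptions chosen for illustration:

```python
# Hypothetical utility function for the accuracy-critical NASA user type above.
def nasa_utility(metrics: dict[str, float]) -> float:
    # Penalize errors heavily and slowness only mildly.
    return -(100.0 * metrics["error_rate"] + 0.1 * metrics["completion_time_s"])
```

Passing a different utility function, say one dominated by completion time, would steer the committee toward speed-oriented recommendations instead.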

We evaluate our framework through a series of user studies. Interfaces passed to our committee of experts receive evaluation scores on a number of different dimensions, such as time, accuracy, and ease of use for novices versus experts. We can compare these predicted scores to the actual scores observed in user studies to evaluate performance. The aggregator can retroactively weight the experts' opinions to determine which weighting would have given the best predictions of user behavior for the given interface, and observe whether that weighting generalizes to other interface evaluations.
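
One plausible way to realize this retroactive weighting, offered here as an assumption rather than a fixed design, is a multiplicative-weights update that shrinks the weight of experts whose predicted scores diverged from the observed study score:

```python
import math

def update_weights(weights: dict[str, float],
                   predicted: dict[str, float],  # per-expert predicted score
                   observed: float,              # score measured in the user study
                   eta: float = 0.5) -> dict[str, float]:
    # Shrink each expert's weight exponentially in its squared prediction error.
    new = {name: w * math.exp(-eta * (predicted[name] - observed) ** 2)
           for name, w in weights.items()}
    total = sum(new.values())
    return {name: w / total for name, w in new.items()}  # renormalize to sum to 1
```

Weights fitted on one interface's study data can then be reused in the aggregator on a new interface to check whether they generalize.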

Inputs

  • The task the user is trying to accomplish
  • The GUI they are using to perform this task
  • The utility a user gets for values of different performance metrics (time, cognitive load, fatigue, etc.)
  • The predicted and/or actual trace of a user using this GUI

Outputs

  • An evaluation of the GUI, in terms of the individual metric values (e.g. time, cognitive load) and the overall utility of these values, as expressed by the utility function. (Both the inputs and outputs are sketched as data structures after this list.)
  • Suggested improvements for the GUI, in two forms:
    • Immediate transformations that can be automatically applied to the GUI
    • Higher-level suggestions/guidelines that a developer would have to apply manually
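
To make these inputs and outputs concrete, the sketch below expresses them as data structures; all field names are hypothetical rather than a fixed API:

```python
# Hypothetical data structures for the framework's inputs and outputs.
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class EvaluationRequest:                          # the framework's inputs
    task: str                                     # what the user is trying to accomplish
    gui: object                                   # abstract GUI description (see Contributions)
    utility: Callable[[dict[str, float]], float]  # maps metric values to overall utility
    trace: Optional[list[str]] = None             # predicted and/or actual user trace


@dataclass
class EvaluationResult:                           # the framework's outputs
    metrics: dict[str, float]                     # e.g. {"time_s": 9.4, "cognitive_load": 0.3}
    utility: float                                # overall utility of those metric values
    auto_transformations: list[str] = field(default_factory=list)  # applied automatically
    guidelines: list[str] = field(default_factory=list)            # for the developer
```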

Contributions

  • The creation of a language for abstractly representing user interfaces in terms of the layout of graphical components and the functional relationships between these components (a minimal sketch appears after this list).
  • A system for generating interaction histories within user interfaces to facilitate individual and collaborative scientific discovery, and to enable researchers to more easily document and analyze user behavior.
  • A system that takes user traces and creates a GOMS model that decomposes user actions into various cognitive, perceptual, and motor control tasks.
  • The development of other evaluation methods using various cognitive/HCI models and guidelines.
  • A design tool that can provide a designer with recommendations for interface improvements. These recommendations can be made for a specific type of user or for the average user, as expressed by a utility function.
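
As a minimal sketch of what the representation language in the first contribution might look like (all type and relation names here are our assumptions), an interface could be described as a layout tree of components plus a set of functional links between them:

```python
# Hypothetical sketch of the abstract UI-representation language.
from dataclasses import dataclass, field


@dataclass
class Component:
    kind: str                                     # e.g. "button", "text_field", "menu"
    name: str
    children: list["Component"] = field(default_factory=list)  # containment = layout


@dataclass
class FunctionalLink:
    source: str     # name of the component that triggers the relation
    target: str     # name of the component affected by it
    relation: str   # e.g. "enables", "populates", "submits"


@dataclass
class InterfaceSpec:
    root: Component            # layout expressed as a containment tree
    links: list[FunctionalLink]


# Example: a search window whose button submits the contents of the query field.
search_ui = InterfaceSpec(
    root=Component("window", "search", children=[
        Component("text_field", "query"),
        Component("button", "go"),
    ]),
    links=[FunctionalLink(source="go", target="query", relation="submits")],
)
```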