CS295J/Research proposal (draft 2)

From VrlWiki

Introduction

We propose a framework for interface evaluation and recommendation that integrates behavioral models and design guidelines from both cognitive science and HCI. Our framework behaves like a committee of specialized experts, where each expert provides its own assessment of the interface based on its particular knowledge of HCI or cognitive science. For example, an expert may provide an evaluation based on the GOMS method, Fitts's law, Maeda's design principles, or cognitive models of learning and memory. An aggregator collects these assessments, weights each expert's opinion, and outputs to the developer a merged evaluation score and a weighted set of recommendations.
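As a first sketch, the committee could be organized as independent modules sharing a common evaluation interface. All class and method names below (Expert, Aggregator, evaluate) are illustrative assumptions, not a committed API:

```python
class Expert:
    """One evaluation module, e.g. a Fitts's law or GOMS expert."""

    def __init__(self, name, weight=1.0):
        self.name = name
        self.weight = weight

    def evaluate(self, interface):
        """Return (score, recommendations) for the interface."""
        raise NotImplementedError


class Aggregator:
    """Collects expert assessments and merges them by weight."""

    def __init__(self, experts):
        self.experts = experts

    def evaluate(self, interface):
        total_weight = sum(e.weight for e in self.experts)
        merged_score = 0.0
        weighted_recs = []
        for e in self.experts:
            score, recs = e.evaluate(interface)
            merged_score += e.weight * score
            weighted_recs.extend((e.weight, r) for r in recs)
        # Recommendations from higher-weight experts come first.
        weighted_recs.sort(key=lambda wr: -wr[0])
        return merged_score / total_weight, [r for _, r in weighted_recs]
```

Concrete modules would then subclass Expert and implement evaluate, while the weighting scheme itself remains an open design question (see the aggregator section below).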

Systematic methods of estimating human performance with computer interfaces are used only sparingly despite their obvious benefits, largely because of the overhead involved in applying them. To test an interface, both manual modeling systems like the GOMS variants and user simulations like those based on ACT-R/PM and EPIC require detailed pseudo-code descriptions of the user's workflow with the application interface. Any change to the interface then requires extensive changes to the pseudo-code, a major problem given the trial-and-error nature of interface design. Updating the models themselves is even more complicated: even an expert in CPM-GOMS, for example, cannot necessarily adapt it to take into account results from new cognitive research.

Our proposal makes automatic interface evaluation easier to use in several ways. First, we propose to divide the input to the system into three separate parts: functionality, user traces, and interface. By separating the functionality from the interface, even radical interface changes require updating only that part of the input. The user traces are also defined over the functionality, so they too translate across different interfaces. Second, the parallel modular architecture lowers the "entry cost" of using the tool. The system includes a broad array of evaluation modules, some very simple and some more complex. The simpler modules use only a subset of the input that a system like GOMS or ACT-R would require. This means that while more input will still lead to better output, interface designers can get minimal evaluations with only minimal information. For example, a visual search module may not require any functionality or user traces to determine whether all interface elements are distinct enough to be easy to find. Finally, a parallel modular architecture is much easier to augment with relevant cognitive and design evaluations.
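One way to make the three-part separation concrete is to represent each part as its own data structure, with user traces defined only over the functionality. The class and field names below are illustrative assumptions:

```python
from dataclasses import dataclass


@dataclass
class Functionality:
    """Abstract operations the application supports, independent of layout."""
    operations: list  # e.g. ["open_file", "save_file", "search"]


@dataclass
class Interface:
    """Concrete widgets and layout, mapped onto the functionality."""
    widgets: dict  # operation name -> widget description (position, size, ...)


@dataclass
class UserTraces:
    """Sequences of operations expressed over the functionality, so they
    carry over unchanged when the interface is redesigned."""
    traces: list  # each trace is a list of operation names
```

Under this split, redesigning the interface means replacing only the Interface object; the Functionality and UserTraces objects are reused as-is.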

Overview of Contributions

Note: For reference, this is the aggregate set of contributions from last week. Maybe we can edit/add/remove from this as needed.

  • Design and user-study evaluation of novel techniques for collecting and filtering user traces with respect to user goals.
  • Extensible, low-cost architecture for integrating pupil-tracking, muscle-activity monitoring, and auditory recognition with user traces in existing applications.
  • System for isolating cognitive, perceptual, and motor tasks from an interface design to generate CPM-GOMS models for analysis.
  • Design and quantitative evaluation of semi-automated techniques for extracting critical paths from an existing CPM-GOMS model.
  • Novel algorithm for analyzing and optimizing critical paths based on established research in cognitive science.
  • A design tool that can provide a designer with recommendations for interface improvements, based on a unified matrix of cognitive principles and heuristic design guidelines.
  • The creation of a language for abstractly representing user interfaces in terms of the layout of graphical components and the functional relationships between these components.
  • A system for generating interaction histories within user interfaces to facilitate individual and collaborative scientific discovery, and to enable researchers to more easily document and analyze user behavior.
  • A system that takes user traces and creates a GOMS model that decomposes user actions into various cognitive, perceptual, and motor control tasks.
  • The development of other evaluation methods using various cognitive/HCI models and guidelines.
  • A design tool that can provide a designer with recommendations for interface improvements. These recommendations can be made for a specific type of user or for the average user, as expressed by a utility function.

Background / Related Work

Methodology

TODO: Add some intro paragraph here.

Collecting User Traces

Given an interface, our first step is to run users on the interface and log these user interactions. We want to log actions at a sufficiently low level so that a GOMS model can be generated from the data. When possible, we'd also like to log data using additional sensing technologies, such as pupil-tracking, muscle-activity monitoring and auditory recognition; this information will help to analyze the explicit contributions of perception, cognition and motor skills with respect to user performance. In addition to specific user traces, many modules could use a transition probability matrix based on interaction predictions.
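A transition probability matrix of the kind mentioned above could be estimated from logged traces by counting adjacent pairs of actions. A minimal sketch, assuming traces have already been reduced to sequences of action names:

```python
from collections import defaultdict


def transition_matrix(traces):
    """Estimate P(next action | current action) from logged user traces.

    `traces` is a list of sequences of action names; each adjacent pair
    of actions counts as one observed transition."""
    counts = defaultdict(lambda: defaultdict(int))
    for trace in traces:
        for current, nxt in zip(trace, trace[1:]):
            counts[current][nxt] += 1
    return {a: {b: n / sum(nexts.values()) for b, n in nexts.items()}
            for a, nexts in counts.items()}
```

Modules that only need aggregate behavior (rather than full traces) could consume this matrix directly.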

Generalizing User Traces

Evaluation and Recommendation via Modules

This section describes the aggregator, which takes the output of multiple independent modules and aggregates the results to provide (1) an evaluation and (2) recommendations for the user interface. We should explain how the aggregator weights the output of different modules (this could be based on the historical performance of each module, or perhaps on E.J.'s cognitive/HCI guidelines).
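One concrete option for weighting by historical performance is a multiplicative-weights style update, where modules with larger past prediction error lose influence. This is only one possible scheme; the function name and the assumption that errors are normalized to [0, 1] are ours:

```python
def update_weights(weights, errors, learning_rate=0.5):
    """Multiplicative-weights update for the aggregator.

    `weights` maps module name to current weight; `errors` maps module
    name to its normalized prediction error in [0, 1] for the latest
    round. Modules with larger error lose influence; the weights are
    then renormalized to sum to 1."""
    updated = {m: w * (1.0 - learning_rate * errors[m])
               for m, w in weights.items()}
    total = sum(updated.values())
    return {m: w / total for m, w in updated.items()}
```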

Sample Modules

CPM-GOMS

This module will provide interface evaluations and suggestions based on a CPM-GOMS model of cognition for the given interface. It will provide a quantitative, predictive, cognition-based parameterization of usability. From empirically collected data, we will examine user trajectories through the model (critical paths), highlight bottlenecks within the interface, and suggest alterations to the interface that induce more nearly optimal user trajectories.
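The critical-path analysis at the core of CPM-GOMS reduces to finding the longest path through a dependency graph of cognitive, perceptual, and motor operators. A minimal sketch, assuming operator durations and precedence constraints have already been extracted from the model:

```python
def critical_path(durations, deps):
    """Longest (critical) path through a task dependency graph.

    `durations` maps each operator to its duration (e.g. in ms);
    `deps` maps an operator to the operators that must finish first.
    Assumes the graph is acyclic."""
    finish, pred = {}, {}

    def finish_time(op):
        # Memoized earliest finish time; remembers the latest-finishing
        # predecessor so the critical path can be traced back afterwards.
        if op not in finish:
            start, pred[op] = 0, None
            for d in deps.get(op, []):
                if finish_time(d) > start:
                    start, pred[op] = finish_time(d), d
            finish[op] = start + durations[op]
        return finish[op]

    last = max(durations, key=finish_time)  # operator that finishes last
    path = []
    while last is not None:
        path.append(last)
        last = pred[last]
    return list(reversed(path)), max(finish.values())
```

Operators on the returned path are the bottlenecks: shortening any of them shortens the predicted task time, while off-path operators have slack.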

HCI Guidelines

This section could include an example or two of established design guidelines that could easily be implemented as modules.

Fitts's Law

This simple module will use Fitts's Law to provide interface evaluations and recommendations.
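As a sketch, the module could compute predicted pointing times with the Shannon formulation of Fitts's law, MT = a + b log2(D/W + 1). The constants a and b below are placeholder values; in practice they are regression coefficients fit to observed user data:

```python
import math


def fitts_time(distance, width, a=0.1, b=0.15):
    """Predicted pointing time in seconds via the Shannon formulation:
    MT = a + b * log2(D / W + 1).

    `distance` is the distance to the target and `width` its size along
    the axis of motion; a and b here are placeholder constants."""
    return a + b * math.log2(distance / width + 1)
```

Recommendations would follow directly: targets with high predicted times are candidates for enlarging or moving closer to the likely cursor position.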

Affordances

This simple module will provide interface evaluations and recommendations based on perceived affordances and if possible a comparison to actual affordances.

Interruptions

While most usability testing focuses on low-level task performance, previous work suggests that users also organize their activity at a higher level, in terms of working spheres. This module attempts to evaluate a given interface with respect to these higher-level considerations, such as task switching.

Working Memory Load

This module measures how much information the user needs to retain in memory while interacting with the interface and makes suggestions for improvements.
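As a sketch, the module could compare the number of items a user must hold at each interaction step against a conservative working-memory capacity estimate (around four chunks, following Cowan's review of working-memory capacity). The step representation below is an assumption:

```python
def memory_load_warnings(steps, capacity=4):
    """Flag interaction steps that exceed a working-memory budget.

    `steps` maps a step name to the list of items the user must hold in
    memory at that step; `capacity` is a conservative chunk estimate
    (around 4 chunks)."""
    return [name for name, items in steps.items() if len(items) > capacity]
```

Flagged steps would generate recommendations such as displaying the needed information on screen instead of requiring recall.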

Automaticity of Interaction

This module measures how easily interaction with the interface becomes automatic with experience and makes suggestions for improvements.

Preliminary Results

Initially, let's make up some fictional (but reasonable) preliminary results that we'd like to see and think we can accomplish before submitting the proposal.

Criticisms

Any criticisms or questions we have regarding the proposal can go here.