CS295J/Research proposal (draft 2)
Introduction
- Owners: Adam Darlow, Eric Sodomka
We propose a framework for interface evaluation and recommendation that integrates behavioral models and design guidelines from both cognitive science and HCI. Our framework behaves like a committee of specialized experts, where each expert provides its own assessment of the interface based on its particular knowledge of HCI or cognitive science. For example, an expert may provide an evaluation based on the GOMS method, Fitts's law, Maeda's design principles, or cognitive models of learning and memory. An aggregator collects these assessments, weights each expert's opinion, and outputs to the developer a merged evaluation score and a weighted set of recommendations.
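As a rough illustration of this committee-of-experts architecture, the sketch below shows one possible shape for an expert module and its output. The class names, the Assessment fields, and the visual-search heuristic are all our own assumptions, not parts of any existing tool.

```python
# Hypothetical sketch of the expert-module interface (names are illustrative).
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Assessment:
    """One expert's opinion about an interface."""
    module_name: str
    scores: Dict[str, float]            # e.g. {"time": 12.3, "cognitive_load": 0.4}
    recommendations: List[str] = field(default_factory=list)


class ExpertModule:
    """Base class for evaluation modules; each expert uses only the inputs it needs."""
    name = "abstract"

    def evaluate(self, interface, functionality=None, traces=None) -> Assessment:
        raise NotImplementedError


class VisualSearchModule(ExpertModule):
    """A very simple expert that needs only the interface description."""
    name = "visual_search"

    def evaluate(self, interface, functionality=None, traces=None) -> Assessment:
        # Flag interface elements whose labels are duplicated and therefore
        # hard to locate by visual search.
        labels = [element["label"] for element in interface["elements"]]
        duplicates = sorted({l for l in labels if labels.count(l) > 1})
        score = 1.0 - len(duplicates) / max(len(labels), 1)
        recs = [f"Rename duplicate element label '{l}'" for l in duplicates]
        return Assessment(self.name, {"findability": score}, recs)
```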
Systematic methods for estimating human performance with computer interfaces are used only sparingly despite their obvious benefits, because of the overhead involved in implementing them. To test an interface, both manual coding systems like the GOMS variations and user simulations like those based on ACT-R/PM and EPIC require detailed pseudo-code descriptions of the user's workflow with the application interface. Any change to the interface then requires extensive changes to the pseudo-code, a major problem given the trial-and-error nature of interface design. Updating the models themselves is even more complicated: even an expert in CPM-GOMS, for example, cannot necessarily adapt it to take into account results from new cognitive research.
Our proposal makes automatic interface evaluation easier to use in several ways. First, we propose to divide the input to the system into three separate parts: functionality, user traces, and interface. By separating the functionality from the interface, even radical interface changes will require updating only that part of the input. The user traces are also defined over the functionality, so they too translate across different interfaces. Second, the parallel modular architecture lowers the "entry cost" of using the tool. The system includes a broad array of evaluation modules, some very simple and some more complex. The simpler modules use only a subset of the input that a system like GOMS or ACT-R would require. This means that while more input will still lead to better output, interface designers can get minimal evaluations from minimal information. For example, a visual search module may not require any functionality or user traces in order to determine whether all interface elements are distinct enough to be easy to find. Finally, a parallel modular architecture is much easier to augment with relevant cognitive and design evaluations.
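To make the three-part input concrete, here is a minimal sketch, under our own assumptions, of how functionality, interface, and user traces might be represented separately; the field names and the example application are purely illustrative.

```python
# Hypothetical sketch of the three separated inputs.

# Functionality: what the application can do, independent of any interface.
functionality = {"actions": ["open_file", "save_file", "search_text"]}

# Interface: how each functional action is exposed by one particular GUI.
interface = {
    "bindings": {
        "open_file":   {"widget": "menu_item", "path": ["File", "Open"]},
        "save_file":   {"widget": "toolbar_button", "position": (40, 8), "size": (24, 24)},
        "search_text": {"widget": "text_field", "shortcut": "Ctrl+F"},
    }
}

# User traces: sequences of functional actions rather than widget-level events,
# so they remain valid when the interface changes.
user_traces = [
    ["open_file", "search_text", "save_file"],
    ["open_file", "save_file"],
]
```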
Background / Related Work
Each person should add the background related to their specific aims.
- Steven Ellis - Cognitive models of HCI, including GOMS variations and ACT-R
- EJ - Design Guidelines
- Jon - Perception and Action
- Andrew - Multiple task environments
- Gideon - Cognition and dual systems
- Ian - Interface design process
- Trevor - User trace collection methods (especially any eye-tracking, EEG, ... you want to suggest using)
Specific Aims and Contributions (to be separated later)
See the flowchart for a visual overview of our aims.
Inputs
Also passed as input is the utility function to optimize over. This utility function is a weighting of various performance metrics (time, cognitive load, fatigue, etc.), where the weighting expresses the importance of a particular dimension to the user. The utility function is used by the aggregator to provide evaluations and recommendations appropriate to user preferences.
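As a rough sketch, the utility function might be a weighted sum over normalized per-dimension scores; the weights, metric names, and normalization convention below are illustrative assumptions, not fixed parts of the proposal.

```python
# Hypothetical sketch of the utility function passed to the aggregator.
# Weights express how much this user cares about each performance dimension.
weights = {"time": 0.5, "cognitive_load": 0.3, "fatigue": 0.2}


def utility(metric_scores, weights):
    """Weighted sum of per-dimension scores (assumed normalized to [0, 1], 1 = best)."""
    return sum(weights[dim] * metric_scores.get(dim, 0.0) for dim in weights)


# Example: an interface that is fast but somewhat tiring to use.
print(utility({"time": 0.9, "cognitive_load": 0.7, "fatigue": 0.4}, weights))  # 0.74
```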
Outputs
As output, the aggregator will provide an evaluation of the interface and a set of recommended improvements. Evaluations are expressed both in terms of the individual utility function components (e.g. time, fatigue, cognitive load) and in terms of the overall utility for this interface (as defined by the utility function). These evaluations are given in the form of an efficiency curve, since the utility received on each dimension can change as the user becomes more accustomed to the interface.
Suggested improvements to the GUI are also output. These suggestions are meant to optimize the utility function that was input to the system; if a user values accuracy over speed, for example, the suggested interface changes will reflect that preference.
Collecting User Traces
- Owner: Trevor O'Brien
Given an interface, our first step is to run users on the interface and log their interactions. We want to log actions at a sufficiently low level that a GOMS model can be generated from the data. When possible, we'd also like to log data from additional sensing technologies, such as pupil tracking, muscle-activity monitoring, and auditory recognition; this information will help to analyze the separate contributions of perception, cognition, and motor skills to user performance.
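A minimal sketch of what such low-level logging might look like, assuming a simple timestamped event record per interaction; the field names and event types are hypothetical.

```python
# Hypothetical sketch of a low-level, timestamped event log from which
# keystroke-level (GOMS) operators could later be reconstructed.
import json
import time


def log_event(log_file, event_type, target, **details):
    """Append one interaction event (click, keypress, gaze sample, ...)."""
    record = {"t": time.time(), "type": event_type, "target": target, **details}
    log_file.write(json.dumps(record) + "\n")


with open("trace.jsonl", "w") as f:
    log_event(f, "click", "File>Open", button="left")
    log_event(f, "keypress", "search_field", key="f", modifiers=["ctrl"])
    log_event(f, "gaze", "toolbar", x=412, y=16)   # from an eye tracker, if available
```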
Generalizing User Traces
- Owner: Trevor O'Brien
The user traces that are collected are tied to a specific interface. In order to use them with different interfaces to the same application, they should be generalized so that they are based only on the functional description of the application and the user's goal hierarchy. This abstracts away interface-specific actions such as accessing a menu.
In addition to specific user traces, many modules could use a transition probability matrix estimated from the traces, which predicts how users move between functional actions; a minimal sketch of such an estimate is given below.
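The sketch treats each generalized trace as a sequence of functional action names; the action names are illustrative.

```python
# Minimal sketch: estimate transition probabilities between functional actions
# from generalized user traces (each trace is a list of action names).
from collections import Counter, defaultdict


def transition_matrix(traces):
    counts = defaultdict(Counter)
    for trace in traces:
        for current, nxt in zip(trace, trace[1:]):
            counts[current][nxt] += 1
    # Normalize each row so that outgoing probabilities sum to 1.
    return {a: {b: n / sum(row.values()) for b, n in row.items()}
            for a, row in counts.items()}


traces = [["open_file", "search_text", "save_file"],
          ["open_file", "save_file"]]
print(transition_matrix(traces))
# {'open_file': {'search_text': 0.5, 'save_file': 0.5}, 'search_text': {'save_file': 1.0}}
```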
Parallel Framework for Evaluation Modules
- Owner: Adam Darlow, Eric Sodomka
This section will describe in more detail the inputs, outputs and architecture that were presented in the introduction.
Evaluation and Recommendation via Modules
- Owner: E J Kalafarski
This section describes the aggregator, which takes the output of multiple independent modules and aggregates the results to provide (1) an evaluation and (2) recommendations for the user interface. We should explain how the aggregator weights the output of different modules (this could be based on historical performance of each module, or perhaps based on E.J.'s cognitive/HCI guidelines).
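As one concrete possibility, the sketch below weights each module by its historical prediction accuracy, as suggested above; the function names, error measure, and normalization are our own assumptions.

```python
# Hypothetical sketch: weight modules by historical prediction accuracy, then
# aggregate their overall scores into a single evaluation.
def module_weights(prediction_errors):
    """prediction_errors maps module name -> mean absolute error on past interfaces.
    Modules that predicted observed performance more accurately get more weight;
    weights are normalized to sum to 1."""
    raw = {name: 1.0 / (1.0 + err) for name, err in prediction_errors.items()}
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}


def aggregate(overall_scores, weights):
    """Weighted average of per-module overall scores.
    overall_scores: list of (module_name, score) pairs."""
    return sum(weights[name] * score for name, score in overall_scores)


weights = module_weights({"fitts": 0.10, "cpm_goms": 0.05, "wm_load": 0.30})
print(aggregate([("fitts", 0.8), ("cpm_goms", 0.6), ("wm_load", 0.9)], weights))
```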
Sample Modules
CPM-GOMS
- Owners: Steven Ellis
This module will provide interface evaluations and suggestions based on a CPM-GOMS model of cognition for the given interface. It will provide a quantitative, predictive, cognition-based parameterization of usability. From empirically collected data, user trajectories through the model will be examined to identify critical paths, highlighting bottlenecks within the interface and suggesting alterations that would induce better user trajectories.
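Since CPM-GOMS predicts task time from the critical path through a schedule of perceptual, cognitive, and motor operators, the module's core computation might resemble the sketch below; the operator names, durations, and dependencies are illustrative, not taken from any real model.

```python
# Minimal sketch: critical-path computation over a small dependency graph of
# CPM-GOMS-style operators. Durations (ms) and dependencies are illustrative.
operators = {
    # name: (duration_ms, names of operators that must finish first)
    "perceive_target":   (100, []),
    "initiate_movement": (50,  []),                  # can start in parallel
    "move_cursor":       (300, ["initiate_movement"]),
    "verify_target":     (100, ["perceive_target"]),
    "click":             (100, ["move_cursor", "verify_target"]),
}


def predicted_task_time(operators):
    """Length of the longest dependency chain (the critical path), in ms."""
    finish = {}

    def finish_time(name):
        if name not in finish:
            duration, deps = operators[name]
            finish[name] = duration + max((finish_time(d) for d in deps), default=0)
        return finish[name]

    return max(finish_time(name) for name in operators)


print(predicted_task_time(operators))  # 450 ms; the motor chain is the bottleneck here
```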
HCI Guidelines
- Owner: E J Kalafarski
This section could include an example or two of established design guidelines that could easily be implemented as modules.
Fitts's Law
- Owner: Jon Ericson
This simple module will use Fitts's Law to predict pointing times from target distances and sizes, and will provide interface evaluations and recommendations based on those predictions.
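A minimal sketch of such a module, using the Shannon formulation MT = a + b * log2(D/W + 1); the constants a and b and the example target are illustrative, not empirically fitted.

```python
# Fitts's Law module sketch (Shannon formulation). Constants are illustrative.
import math


def fitts_time(distance, width, a=0.1, b=0.15):
    """Predicted movement time in seconds for a target of the given width
    at the given distance (both in the same units, e.g. pixels)."""
    return a + b * math.log2(distance / width + 1)


# Example: a 24 px toolbar button 600 px from the typical cursor position.
print(round(fitts_time(distance=600, width=24), 2))  # 0.81 s
# A recommendation step might flag targets whose predicted time exceeds a
# threshold and suggest enlarging them or moving them closer.
```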
Affordances
- Owner: Jon Ericson
This simple module will provide interface evaluations and recommendations based on perceived affordances and, if possible, a comparison to actual affordances.
Interruptions
- Owner: Andrew Bragdon
While most usability testing focuses on low-level task performance, previous work suggests that users also operate at a higher, working-sphere level. This module attempts to evaluate a given interface with respect to these higher-level considerations, such as task switching.
Working Memory Load
- Owner: Gideon Goldin
This module measures how much information the user needs to retain in memory while interacting with the interface and makes suggestions for improvements.
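One crude way such a module might estimate memory load is sketched below, under our own assumptions: annotate a generalized trace with the items each step displays and the items the user must recall, and count how many displayed-but-no-longer-visible items are still needed later. The step structure and item names are hypothetical.

```python
# Crude sketch: estimate working-memory load from an annotated, generalized trace.
# Items displayed earlier but no longer on screen count toward memory load until
# the step at which they must be recalled.
steps = [
    {"shows": {"confirmation_code"}, "needs": set()},
    {"shows": set(),                 "needs": set()},
    {"shows": set(),                 "needs": {"confirmation_code"}},
]


def max_memory_load(steps):
    pending = set()   # items seen earlier that are still needed at a later step
    peak = 0
    for i, step in enumerate(steps):
        needed_later = set().union(*(s["needs"] for s in steps[i + 1:]))
        pending |= step["shows"] & needed_later   # must be remembered from now on
        pending -= step["needs"]                  # recalled items are used up here
        peak = max(peak, len(pending))
    return peak


print(max_memory_load(steps))  # 1: the code must be held in memory for two steps
```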
Automaticity of Interaction
- Owner: Gideon Goldin
This module measures how easily interaction with the interface becomes automatic with experience and makes suggestions for improvements.
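One standard model the module could rely on is the power law of practice, T_n = T_1 * n^(-alpha), which also connects to the efficiency curves described under Outputs; the sketch below fits alpha to per-trial completion times, using illustrative data.

```python
# Sketch: fit the power law of practice, T_n = T1 * n**(-alpha), to per-trial
# task completion times; a larger fitted alpha means the interaction becomes
# automatic more quickly. The trial times below are illustrative only.
import math


def fit_power_law(trial_times):
    """Least-squares fit of log T = log T1 - alpha * log n; returns (T1, alpha)."""
    xs = [math.log(n) for n in range(1, len(trial_times) + 1)]
    ys = [math.log(t) for t in trial_times]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return math.exp(mean_y - slope * mean_x), -slope


times = [12.0, 9.1, 7.9, 7.2, 6.7, 6.3]        # seconds per repeated trial
t1, alpha = fit_power_law(times)
print(round(t1, 1), round(alpha, 2))           # roughly 11.8 and 0.36
```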
Integration into the Design Process
- Owner: Ian Spector
This section outlines the process of designing a user interface and describes at which stages, and how, our proposed system fits into that process.
Preliminary Results
Each person should come up with a single paragraph describing fictional (or not) preliminary results pertaining to the specific aims and contributions they own.
[Criticisms]
- Owner: Andrew Bragdon
Any criticisms or questions we have regarding the proposal can go here.