Nascent Papers


Steve

[updated 11/10/11]

Combining Topic Modeling with Crowdsourcing for Image Description

  • 19:00, 15 November 2011 (EST)
  • Based on a conversation w/ Rebecca, who is building a black box that can take in an image and output a distribution of likely noun phrases (or sentences extracted from training captions) that describe the image. She's got her topic modeler trained on images of shoes!
  • Thinking about integrating Turk with the topic modeling stuff. The TM stuff is good at labeling "expert" features of the shoes, but does not get basic things right, like color or when these shoes might be worn. These are things Turkers could easily do. Can we figure out what the "expert" terminology is versus normal terminology? Can we integrate that with a language model for better caption generation?
  • Check out "Im2Text" paper by Tamara Berg (NIPS 2011)

A Visual Survey of Evaluation Methods

Expected Contribution(s):

  1. A survey and visualization of evaluation methods for visualization tools.
  2. I'm imagining a tool for people who don't know what kind of evaluation to do with a new tool. I think it would be awesome to read in a .bib file of related papers and somehow summarize (visualize?) the kinds of evaluations done by peers, as a suggestion for an evaluation method. (A rough back-end sketch follows this list.)
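
A rough sketch of what the back end of that idea might look like: scan each .bib entry's title/abstract for evaluation-method keywords and tally them. The keyword lists, the field parsing, and the file name are all placeholders, not a validated taxonomy.

    # Minimal sketch: tally evaluation methods mentioned in a .bib file.
    # The keyword lists below are illustrative placeholders.
    import re
    from collections import Counter

    EVAL_KEYWORDS = {
        "user study":       ["user study", "participants", "within-subjects"],
        "case study":       ["case study", "domain expert"],
        "quantitative":     ["accuracy", "completion time", "error rate"],
        "heuristic/expert": ["heuristic evaluation", "expert review"],
        "crowdsourced":     ["mechanical turk", "crowdsourc"],
    }

    def parse_bib(path):
        """Very rough .bib field extraction (title and abstract only)."""
        text = open(path, encoding="utf-8").read()
        for entry in re.split(r"@\w+\s*\{", text)[1:]:
            fields = {k.lower(): v for k, v in
                      re.findall(r'(\w+)\s*=\s*[{"](.+?)[}"]\s*,?\s*\n', entry, re.DOTALL)}
            yield (fields.get("title", "") + " " + fields.get("abstract", "")).lower()

    def summarize(path):
        counts = Counter()
        for blob in parse_bib(path):
            for method, keywords in EVAL_KEYWORDS.items():
                if any(k in blob for k in keywords):
                    counts[method] += 1
        return counts

    if __name__ == "__main__":
        print(summarize("related_work.bib"))   # hypothetical input file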

Domain-effects on Visual Thinking and Communication

  • Experiment: How does an individual's academic discipline (e.g., math, physics, bio) affect the way s/he chooses representations to explain basic scientific phenomena?

Expected Contribution(s):

Crowdsourcing Visualization Edits and Quality Assessment

Expected Contribution(s):

  1. An interaction pattern and infrastructure for soliciting and verifying critiques of 2D static visualizations. We'll have a pattern like Find-Fix-Verify described in the Soylent paper (which proposes crowdsourcing for text revision) tailored for visualization/graphics. (I think there are some big differences between text and graphics revision that might call for a different pattern.)
  2. (What should the evaluation look like here? In Soylent, the evaluation is specific to the types of edits it makes available; for instance, the results explain how successful crowdsourced edits are at shortening text.)

Abstract: We introduce an interaction pattern for soliciting and integrating crowdsourced human feedback for data visualization. Human feedback in the visualization design process often helps in building tools that are aesthetically pleasing and effective for data analysis. We hypothesize that even non-experts on the web can make significant contributions to visualization design. With a new framework X, we demonstrate that workers on Mechanical Turk can effectively guide designs by completing several kinds of tasks: selecting the preferred image in a side-by-side comparison of generated visualizations; answering a quantitative image analysis question with known ground truth; providing qualitative feedback about a visualization; and reviewing (another worker's) criticism of a visualization and answering "Do you agree with this?"
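
Not framework X itself (still TBD), but a minimal sketch of how the four worker-task types named in the abstract might be represented, plus a simple agreement gate for the "do you agree with this critique?" step. All names and thresholds here are hypothetical.

    # Sketch of the four worker-task types from the abstract, plus a simple
    # agreement gate for the "do you agree with this critique?" step.
    # All names here are hypothetical; the real framework ("X") is TBD.
    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class Task:
        kind: str                    # "compare" | "quantitative" | "qualitative" | "review"
        stimulus: str                # path/URL of the visualization image(s)
        ground_truth: Optional[float] = None   # only for quantitative tasks
        critique: Optional[str] = None         # only for review tasks
        responses: list = field(default_factory=list)

    def verify_critique(task: Task, min_votes: int = 5, agree_frac: float = 0.6) -> bool:
        """Accept a critique once enough reviewers have answered and most agree."""
        assert task.kind == "review"
        if len(task.responses) < min_votes:
            return False
        agree = sum(1 for r in task.responses if r == "agree")
        return agree / len(task.responses) >= agree_frac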

Modeling the Visual Attention Spotlight for Vector Field Visualization Using Natural Image Descriptors

Expected Contribution(s):

  1. A method for learning 2D ROIs (bounding boxes) in the image plane that are interesting to users. We are learning the low-level properties (thinking Gist descriptors, textons) of interesting ROIs by having users manually select "interesting" ones in training, given the domain and analysis task, and by capturing the ROIs that users attend to using eye-tracking. The assumption is that these interesting areas, when encoded by descriptors, cluster nicely for all users, for a particular visualization tool.
  2. An algorithm for identifying interesting ROIs (scale/rotation invariant) in a new image, using this training data. We integrate this into the visualization tool to highlight areas of interest. (See the sketch after this list.)
  3. A quantitative evaluation for an identification task (e.g., finding fiber bundle crossings) comparing the original and highlighting versions of a tractography application.
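
A rough sketch of the train/apply loop behind contributions 1-2, assuming some descriptor function (Gist, textons, or similar) is available; compute_descriptor below is a stand-in, and k-means is just one plausible clustering choice.

    # Sketch: cluster descriptors of user-selected "interesting" ROIs, then
    # score candidate ROIs in a new image by distance to the nearest cluster.
    # compute_descriptor() is a placeholder for Gist/texton features.
    import numpy as np
    from sklearn.cluster import KMeans

    def compute_descriptor(image, box):
        """Placeholder: feature vector for ROI `box` = (x, y, w, h)."""
        x, y, w, h = box
        patch = image[y:y + h, x:x + w]
        # a coarse intensity histogram stands in for Gist/textons here
        # (assumes image values in [0, 1])
        return np.histogram(patch, bins=32, range=(0, 1))[0].astype(float)

    def train_roi_model(images, boxes_per_image, n_clusters=8):
        feats = np.array([compute_descriptor(img, b)
                          for img, boxes in zip(images, boxes_per_image)
                          for b in boxes])
        return KMeans(n_clusters=n_clusters, n_init=10).fit(feats)

    def score_roi(model, image, box):
        """Lower score = closer to a learned 'interesting' cluster."""
        f = compute_descriptor(image, box).reshape(1, -1)
        return model.transform(f).min()   # distance to nearest cluster center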

Abstract: TBD

Evaluating Analysts' Bias Toward Beauty in Visual Analytics

Expected Contribution(s):

  1. A design-space analysis of visual and interaction aesthetics in analytics tools, and their relationship to perceived ease of use, perceived usefulness, and analysis behaviors.
  2. A quantitative evaluation of time spent, insights found, etc., when analysts are given a small set of tools that vary in aesthetics to analyze data.

Expected Results:

  1. We expect to find a couple of interesting detrimental effects of beautiful applications:
    1. they are "trusted" more: insights found with them are cross-referenced less than insights found with amateur-looking applications,
    2. they are used for longer time periods than "ugly" ones, even when users are told that the ugly tools provide different features and may contain different insights.
  2. Users felt more confident in their analyses using beautiful applications.

Abstract: TBD

Hypotheses / Random Ideas

  1. We can predict what tasks users are solving with visualizations by looking at interaction histories, eye-tracking data, etc.
  2. We can learn interaction sequences from a visualization and its datasets that can then repeat/complete analysis tasks on a new dataset (i.e., we can guide an analyst through high-level interactions with a visualization).
  3. We can predict an analyst's susceptibility to cognitive "nudges" by learning and comparing behaviors with previous susceptible analysts.
  4. Analysts spend more time with attractive analytics tools than with ugly ones
  5. Analysts feel more confident in the analyses they produce when using attractive/conforming tools versus ugly/unconventional ones
  6. Technology acceptance model (TAM) extends to visualization / visual analytics
    1. Is trust a factor? Is that part of perceived usefulness or perceived ease of use?
  7. More saccades with more uncertain data
    1. Does the analyst spend more mental effort trying to build a narrative of what’s going on in the data when s/he is aware of uncertainty in the vis?
  8. Viewing other analyses or past sessions from a visual tool will improve confidence in my own analysis or in the tool. (Not just the effect of training.)
    1. How do we communicate interesting findings in visualizations?
    2. If users took screenshots of interesting/persuasive views, can we learn what makes a good expository view?
    3. Check out CHI ‘11 paper - “The Impact of Social Information on Visual Judgments”
  9. We can make a useful dataset from images on wiki pages (scrape them) and learn something about how people do expository/explanatory visualization/narrative construction
  10. We can evaluate what makes a "high rated" visualization on ManyEyes.
    1. It’s easy to visually tell the difference between the “high rated” sorted list of visualizations there, and just the “most current”. I’m not sure the topic of the visualization is even that critical -- I could probably classify to a first approximation just by looking at the title syntax and the distribution of the visualization data.
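
A minimal sketch of the experiment in 10.1 (titles alone, bag-of-words, any off-the-shelf classifier); it assumes the ManyEyes titles and their high-rated/most-recent labels have already been scraped into two lists.

    # Sketch for idea 10.1: can title text alone separate "high rated" from
    # "most recent" ManyEyes visualizations?  The model choice is arbitrary.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline

    def title_classifier_accuracy(titles, labels):
        """labels: 1 = from the high-rated list, 0 = from the most-recent list."""
        clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)),
                            LogisticRegression(max_iter=1000))
        return cross_val_score(clf, titles, labels, cv=5).mean()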


Jadrian

Curve noise statistics

An upgraded version of my ISMRM abstract.

  • Present original findings (expanded to range over FA + MD)
    • Potentially also investigate multiple algorithms: nonlinear DT fitting, multi-DT, Q-ball, other tractography algorithms, etc.
  • Explain results with first-order model
  • Validate the model (repeat experiment 2) on more "realistic" data with ground truth. Some options:
    • Brain-based computational phantom (constructed like so: T1+HARDI scan, 2-tensor fit, tractography, filter curves for T1 termination criteria, lowpass curves, synthesize DWIs from curves)
    • Popular phantom (tractography cup?)
    • HARDI atlas
    • Generate synthetic HARDI scans from 2-tensor fit of averaged LARDI scans for tons of healthy normals
  • Prescribe a simple pipeline for deriving the noise equation for any combination of algorithms
  • Maybe also consider writing up the application to PICo: probability of connection to a cortical ROI is the integral of the uncertainty PDF over that subcortical surface. But the benefit from this technique (gain in precision) may be minimal, as symmetric PDFs on either side of an ROI boundary would tend to contribute the same on either side; results might be about the same without the PDF spreading, just because you've got a lot of curves.
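
In symbols, as I read that bullet (R is the ROI surface, p_i the endpoint-uncertainty PDF of curve i, and the aggregate estimate averages over the N curves):

    P_i(\text{connect to } R) = \int_{R} p_i(\mathbf{x}) \, \mathrm{d}A,
    \qquad
    P(\text{connect to } R) \approx \frac{1}{N} \sum_{i=1}^{N} P_i(\text{connect to } R)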


Diffusion model parsimony

Compare multiple diffusion models (single-tensor, multi-tensor, higher-order tensor, higher-order ODF [e.g. QBI], multi-compartment biological [e.g. CHARMED]) in terms of their "parsimony" on real data. Quantify parsimony with chi-squared goodness-of-fit on top of the probability integral transform for Rician noise. This will hopefully demonstrate that most models under- or over-fit, and that the multi-compartment models are juuuuuust right. Come to think of it, a Goldilocks reference would make for a snappy title. See dhl/cad/jadrian email from 2011-12-15 "a quickie Dagstuhl thought experiment".
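
A minimal sketch of that goodness-of-fit machinery, assuming the model fit already gives, per measurement, the predicted noiseless signal and a noise-level estimate sigma; scipy's Rician distribution supplies the CDF for the PIT. (Degrees-of-freedom corrections for fitted parameters are ignored here.)

    # Sketch: probability integral transform (PIT) of DWI measurements under a
    # fitted diffusion model with Rician noise, followed by a chi-squared test
    # for uniformity.  Assumes `measured`, `predicted`, `sigma` are arrays /
    # scalars already produced by the model fit.
    import numpy as np
    from scipy.stats import rice, chisquare

    def pit_uniformity_pvalue(measured, predicted, sigma, n_bins=20):
        """Small p-value => the model's predicted signals + Rician noise do not
        explain the measured DWI intensities (under- or over-fitting)."""
        # Rician with underlying signal A and noise sigma: rice(b=A/sigma, scale=sigma)
        u = rice.cdf(measured, predicted / sigma, scale=sigma)   # PIT values in [0, 1]
        observed, _ = np.histogram(u, bins=n_bins, range=(0.0, 1.0))
        expected = np.full(n_bins, len(u) / n_bins)
        return chisquare(observed, expected).pvalue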

Could potentially extend this to a "constructive" algorithm: something that picks the degrees of freedom for an RBF or other higher-order diffusion model.


Improved Hough-Peak--Finding by Topological Persistence

Put the toy-problem initialization technique on a solid theoretical basis. Rather than blurring the histogram, use topological persistence to disregard "sub-peaks".
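
Roughly what that sweep could look like for a 1-D Hough accumulator; this is the standard persistence-by-merging idea, with the persistence threshold left to the caller.

    # Sketch: find peaks of a 1-D histogram by topological persistence.
    # Bins are visited from highest to lowest; a peak "dies" when its component
    # merges into one with a taller peak, and persistence = birth - death.
    def persistent_peaks(hist, min_persistence=0.0):
        order = sorted(range(len(hist)), key=lambda i: hist[i], reverse=True)
        comp = {}              # bin index -> parent in the union-find forest
        peak_of = {}           # root -> index of the component's highest bin
        persistence = {}       # peak index -> persistence (None = still alive)

        def find(i):
            while comp[i] != i:
                comp[i] = comp[comp[i]]
                i = comp[i]
            return i

        for i in order:
            roots = {find(j) for j in (i - 1, i + 1) if j in comp}
            if not roots:                          # a new local maximum is born
                comp[i] = i
                peak_of[i] = i
                persistence[i] = None
            else:
                # attach i to the neighbor component with the tallest peak
                best = max(roots, key=lambda r: hist[peak_of[r]])
                comp[i] = best
                for r in roots - {best}:           # the shorter peak dies here
                    persistence[peak_of[r]] = hist[peak_of[r]] - hist[i]
                    comp[r] = best
        # surviving peaks persist down to the global minimum
        for p, val in persistence.items():
            if val is None:
                persistence[p] = hist[p] - min(hist)
        return {p: v for p, v in persistence.items() if v >= min_persistence}

persistent_peaks(accumulator, min_persistence=...) then returns the peak bins that survive, without ever blurring the histogram.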


Feature Identification in Multispectral Images Using the Hough Transform

Think about a "support" function for prospective line segments, with Gaussian falloff on the image plane. Finding an application area for this one might be hard.
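
One concrete reading of that support function (endpoints of the candidate segment plus a per-pixel edge-strength map as inputs; sigma sets the falloff). For multispectral data, edge_strength could combine per-band gradients, but that part is left open.

    # Sketch: "support" of a candidate line segment = edge strength summed over
    # the image, weighted by a Gaussian falloff in distance from the segment.
    import numpy as np

    def segment_support(edge_strength, p0, p1, sigma=1.5):
        """edge_strength: 2-D array; p0, p1: (x, y) endpoints of the segment."""
        h, w = edge_strength.shape
        xs, ys = np.meshgrid(np.arange(w), np.arange(h))
        d = np.stack([xs - p0[0], ys - p0[1]], axis=-1).astype(float)
        seg = np.array(p1, dtype=float) - np.array(p0, dtype=float)
        t = np.clip((d @ seg) / (seg @ seg), 0.0, 1.0)      # projection onto segment
        closest_x = p0[0] + t * seg[0]
        closest_y = p0[1] + t * seg[1]
        dist2 = (xs - closest_x) ** 2 + (ys - closest_y) ** 2
        weights = np.exp(-dist2 / (2.0 * sigma ** 2))
        return float((weights * edge_strength).sum())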


Toy Problem / Probability Integral Transform

Write up the toy problem as an example use of the PIT in model testing for DWIs. Or maybe piggyback on Leemans/Sijbers MRM'05 synthetic DWIs from tractography, using PIT to get p-values for synthetic images. This may not be worth its own paper; maybe better to hold off on PIT stuff until it can be put in context of my own "real-problem" work.


Curve clustering based on chi-squared reconstruction error from a boundary model

I've already written up an outline of this paper. Clustering proceeds by building a boundary model of clusters/bundles, and the objective function is chi-squared goodness-of-fit derived from the tractography PDFs.


Automatic Shape-Sensitive Curve Clustering (§4.1.2)

Distance measure definition, distributed clustering algorithm, spectral clustering refinement, comparison to other techniques. Base on q-ball and DTI tractography from DTK; use Mori's atlas for ground truth? Moberts et al.'s ground truth clusterings? [moberts/van_wijk/vilanova might have a ground truth. Song Zhang had a clustering ground truth paper. Cagatay played with this at some point. bang not huge, how big is the buck? but I could be convinced -- not sure this is a vis paper? but it could be. Application of curve similarity to other areas (bat flight trajectories) would be convincing at Vis too.]
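
One way to prototype the pipeline before the real distance measure is defined: symmetric mean closest-point distance between streamlines, a Gaussian affinity, and off-the-shelf spectral clustering. Every choice here (distance, kernel scale, cluster count) is a placeholder.

    # Sketch: pairwise curve distance (symmetric mean closest-point distance)
    # plus spectral clustering of the resulting affinity matrix.
    # Curves are (n_i x 3) arrays of points along each tractography streamline.
    import numpy as np
    from scipy.spatial.distance import cdist
    from sklearn.cluster import SpectralClustering

    def curve_distance(a, b):
        d = cdist(a, b)                          # all pairwise point distances
        return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

    def cluster_curves(curves, n_clusters=10, scale=5.0):
        n = len(curves)
        dist = np.zeros((n, n))
        for i in range(n):
            for j in range(i + 1, n):
                dist[i, j] = dist[j, i] = curve_distance(curves[i], curves[j])
        affinity = np.exp(-(dist / scale) ** 2)  # Gaussian kernel on distances
        labels = SpectralClustering(n_clusters=n_clusters,
                                    affinity="precomputed").fit_predict(affinity)
        return labels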


An Orientation-Aware Boundary Surface Representation of Space Curve Clusters

  • Contributions:
    1. "Natural" representation of clusters of curves, useful for higher-level operations. [I think that "natural" is not descriptive enough. I would try to refine that. Is the orientation a tangency constraint? There's something about it that seems important, but "orientation aware" doesn't quite seem like what it is... I would give an example or two of the kind of higher-level operations you mean.]
    2. Superior results versus naive algorithms (and previously published work to solve the same problem? "Competitors" to consider include Gordon's crease surfaces, flow critical surfaces, etc. [dhl mentions Ken Joy's work...] [define "superior". Faster? more accurate? If you mean more "natural" then this may be the same thing as above]).
  • Proofs:
    1. Prose argument that contrasts cross-section-based boundary surfaces to other representations of curve clusters. A bunch of curves is a mess and does not lend itself to operations on the cluster volume (smoothing, joining, etc.). Median curves have no width. Rasterization creates surface artifacts and loses orientation information. Alpha shapes lose orientation information. [nice!]
    2. Alpha shapes is the main "competitor". Expected results show topological defects resulting from a global choice of alpha. Run both algorithms on phantom and real data and discuss features. [I suspect that there are some other shrink-wrap approaches that might be competitors, unless alpha-shapes are always superior]


A Sparse, Volumetric Representation of Space Curve Clusters (§4.1.4)

The benefit of the initial form of the macrostructure model is its ability to reconstruct its input curves. The evaluation on this should be relatively easy, as there is no comparison to other techniques. A manual clustering is acceptable but automatic clustering that implies some bound on reconstruction error would probably be better. Good for Vis, SIGGRAPH, EG, EV, I3D, ISMRM. [but why does anyone care? This could be a way of establishing the minimal information required to represent brain datasets... but again, does anyone care about that? Must find related work. dhl suggests that this may be "importance filtering" for curves, but what's the benefit?]


Automatic Tractography Repair / "Healing a Broken Tractogram"

ISMRM poster / Neuroimage paper / TVCG paper? Demonstrate the utility of the cluster boundary surface algorithm for repairing broken curves (§4.1.5).

  • Title, Authors: "Healing a Broken DTI Tractogram with Curve Cluster Boundary Surfaces". Jadrian Miles and David H. Laidlaw.
  • Contributions
    1. A slicing-based (or alpha-shape contraction--based) algorithm for generating a cluster boundary surface with orientation, spreading, and medial axis metadata.
    2. An algorithm for "sampling" novel un-broken curves from the cluster boundary surface model ("sparsifying")
    3. An algorithm for extrapolating the curve cluster along its axis ("lengthening")
    4. An algorithm for joining axially-aligned curve clusters based on extrapolation and refinement against underlying DWIs ("bridging")
    5. Maybe others: "fattening", "smoothing"
  • Proofs of Contributions
    1. Demonstration (with figures) of the boundary surface algorithm on synthetic and real-world tractography data.
    2. Compare the proposed algorithm (which uses only the boundary surface, not tractography curves) against one that uses barycentric coordinates of the tractography curves that form a triangle about the seed point to propagate an interpolated curve. This comparison should result in an error measure.
    3. Demonstration (with figures) of the boundary extrapolation algorithm. I'm currently unaware of any competitors for this.
    4. Compare against local extrapolation models: tensor deflection, linear extrapolation, cubic spline extrapolation, smoothed Bezier extrapolation. Also compare QBI tractography on HARDI data against our algorithm on angular-subsampled DTI data. (A toy sketch of the linear/spline baselines appears after this outline.)
  • Figures
    1. Fig
    2. Fig
    3. Fig
  • Related work
  • Abstract
  • Conclusions
  • Methods
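
For the simpler baselines in proof 4, a toy comparison might start like this (linear vs. cubic-spline extrapolation of a streamline, parameterized by arc length); tensor deflection and the refinement against the underlying DWIs are out of scope for this sketch.

    # Sketch: extrapolate a 3-D streamline beyond its last point, linearly and
    # with a cubic spline, as two of the baseline "lengthening" methods.
    # `curve` is an (n x 3) array of points; `length` is in the same units.
    import numpy as np
    from scipy.interpolate import CubicSpline

    def arc_length_param(curve):
        steps = np.linalg.norm(np.diff(curve, axis=0), axis=1)
        return np.concatenate([[0.0], np.cumsum(steps)])

    def extrapolate_linear(curve, length, n_steps=10):
        direction = curve[-1] - curve[-2]
        direction /= np.linalg.norm(direction)
        ts = np.linspace(0.0, length, n_steps)[:, None]
        return curve[-1] + ts * direction

    def extrapolate_spline(curve, length, n_steps=10):
        s = arc_length_param(curve)
        spline = CubicSpline(s, curve, axis=0, extrapolate=True)
        return spline(np.linspace(s[-1], s[-1] + length, n_steps))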


Automatic Tractography-Based DW-MRI Segmentation

Using DTI/QBI, automatic clustering, and simple macrostructure adjustment (dilation, splitting, merging, bridging gaps, §4.1.5), segment the WM and compare to some ground truth. Mori's atlas?

[The above two could each be ISMRM posters or talks, and should be quickly followed up by an MRM paper.]


A ??? (§4.2.1)

Generating synthetic images from the macrostructure model. Who would care about this? Possible improvement over Leemans et al. due to gap-filling? [use data matching to support that the model is good, and wave hands about the usefulness of the higher-level model -- has a bigger-bang feel than the clustering one]

[check out Ken Joy's multi-material volume representation stuff (last author) -- tvcg, I believe or maybe TOG]


Hua

[Last edited: May 11, 2012]

Random Ideas

Can we apply a genetic algorithm to automatically explore different design spaces?

Applying a causation model to visual analytics: as a metric for evaluation (does the user's judgment conform to the underlying causation model?), and explicitly displaying a causation model to guide the user's reasoning.

Computational modeling of the effectiveness of visual interface elements

Mathematically characterizing the visual analytics design and interaction space

Focus + context algorithm for circular layouts

Applying correlation to visual analytics: guiding the display of visual elements (as variables in the DOI function); finding good ways to visualize correlation in complex datasets (not just correlation of 2D/nD points); and asking whether correlation can be used to "compress" the visualization of data (display the same amount of information using less screen real estate).

Eni

A Comparative Study of Cognitive Functioning and Fiber Tract Integrity


Contributions

1. Establish a relationship between diffusion metrics and measurements of working memory and motor control.

i. Statistically comparing working memory test results with several quantitative tractography metrics of structural integrity in the SLF and the fornix (a minimal correlation sketch follows this list)

ii. Comparing motor control test results with the same metrics in the SLF and the fornix



2. The paper draws on the relationship between axial or radial diffusivity and the nature of axonal damage to infer the prevalent type of damage in tracts affected by CADASIL.
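
The statistical comparison in contribution 1 could start out as simply as this (one rank correlation per metric, per tract); data loading and multiple-comparison correction are left out, and the metric names are only examples.

    # Sketch: correlate each quantitative tractography metric with a cognitive
    # score, per tract.  `metrics` maps metric name -> per-subject values;
    # `score` is the per-subject n-back (or motor control) result, same order.
    from scipy.stats import spearmanr

    def correlate_metrics(metrics, score):
        results = {}
        for name, values in metrics.items():
            rho, p = spearmanr(values, score)
            results[name] = (rho, p)
        return results

    # e.g. correlate_metrics({"FA": fa_slf, "NTWLad": ntwlad_slf}, nback_scores)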


Results

1.

i. Working memory test results should correlate with most metrics in the SLF. They should not, however, correlate as strongly with the metrics in the fornix, since CADASIL affects the SLF more drastically than it affects the fornix.

ii. In the SLF, motor control test results should not correlate with the metrics as highly as the working memory test results do.


2. Among the metrics, there should be a few that correlate more significantly than others with the cognitive test results. These correlations should provide clues about the type of damage that CADASIL causes. They will also potentially highlight the most effective metrics for assessing white matter integrity in CADASIL patients.


Tentative Abstract

In this study we examine the interdependence of cognitive test results and white matter integrity, as assessed by quantitative tractography metrics. Different fiber tracts in the brain are related to different cognitive functions, e.g., the SLF is related to working memory. Therefore atrophy in a certain tract leads to impaired performance in functions controlled by that tract. Knowing that CADASIL causes severe injury in the SLF, but not the fornix, we chose to compare the structural integrity of these tracts, as measured by several quantitative tractography metrics, against working memory test results and motor control test results. The n-back test was used to test working memory and the X-test was used to test motor control. The results of the n-back correlated well with the metrics in the SLF. They did not correlate with the metrics in the fornix. Among the metrics, NTWLad showed the highest correlation with the n-back test results, suggesting that there is more axonal loss than demyelination in the tract. The results of the X-test did not correlate significantly with any of the metrics in the SLF. These results confirm that there is a strong relationship between performance on cognitive tests and white matter health, as measured by quantitative tractography. Furthermore, they shed some light on the nature of axonal damage caused by CADASIL.

Trevor

Tentative Title(s):

Extracting Semantic Content from Interaction Histories in 3D, Time-varying Visualizations
Interaction Histories for Collaboration, Search, and Prediction in 3D, Time-Varying Visualizations


Contributions:

  1. Introduces a generalizable framework for automatically generating sharable, editable, searchable interaction histories in time-varying 3D applications.
  2. Demonstrates the utility of Relational Markov Models (RMMs) in extracting semantic information from interaction histories, useful for prediction and automation in scientific exploration. (A baseline sketch follows this list.)
  3. Contributes the technical implementation details (software itself? open source project?) for applying said methods in pre-existing applications.
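
Contribution 2 could be prototyped with a plain first-order Markov model over interaction events before moving to full RMMs (which add relational structure over typed objects); the event names in the comments are made up.

    # Sketch: first-order Markov model over interaction events, as a baseline
    # for the RMM-based prediction in contribution 2.  Events are plain strings
    # (e.g. "rotate", "scrub_time", "select_bone"); an RMM would replace these
    # with relational states over typed objects.
    from collections import Counter, defaultdict

    class InteractionModel:
        def __init__(self):
            self.transitions = defaultdict(Counter)

        def train(self, histories):
            """histories: iterable of event sequences (lists of strings)."""
            for seq in histories:
                for prev, nxt in zip(seq, seq[1:]):
                    self.transitions[prev][nxt] += 1

        def predict_next(self, event, k=3):
            """Most likely next interactions after `event`, for auto-suggestion."""
            counts = self.transitions[event]
            total = sum(counts.values()) or 1
            return [(e, c / total) for e, c in counts.most_common(k)]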

Results:

  1. Techniques were applied in 3 existing applications: Animal kinematics from CT & X-ray, bat flight kinematics from light capture, and (__?? brain stuff, wrist stuff??, maybe infovis stuff like proteomics??___)
  2. User evaluation of history generation matched user-defined histories in X% of cases. (Fully-automated, semi-automated, manual)
  3. Collect data on collaboration? Anecdotal evidence on how tools were used for collaboration? (Need to get on this quickly, with new data sets that are actively being explored. Talk to Beth, Sharon.)
  4. User study on task completion times with tools versus without tools.
  5. Relational models evaluated against survey data, i.e., the user was trying to uncover this in a series of interactions, and the system interpreted the interactions as this or that.

(Need to think more about how to make the previous two bullets objective and measurable.)

Abstract: TBD.

References:

Why interaction is more powerful than algorithms
Relational Markov Models and their Application to Adaptive Web Navigation
Visualizing Interaction History on a Collaborative Web Server
Distributed Cognition: Toward a New Foundation for Human-Computer Interaction Research




Çağatay

Coloring 3d line fields using Boy’s real projective plane immersion

Abstract:

It’s often useful to visualize a line field, a function that sends each point P of the plane or of space to a line through P; such fields arise in the study of tensor fields, where the principal eigendirection at each point determines a line (but not a vector, since if v is an eigenvector, so is −v). To visualize such a field, we often assign a color to each line; thus we consider the coloring of line fields as a mapping from the real projective plane (RP2) to color space. Ideally, such a coloring scheme should be smooth and one-to-one, so that the color uniquely identifies the line; unfortunately, there is no such mapping. We introduce Boy’s surface, an immersion of the projective plane in 3D, as a model for coloring line fields, and show results from its application in visualizing orientation in diffusion tensor fields. This coloring method is smooth and one-to-one except on a set of measure zero (the double curve of Boy’s surface).


Andy Forsberg, Jian, DHL

Vis '09 - 3D vector visualization methods paper / study

Abstract:

TBD.