Statistically Sound Visual Data Analysis
Overview
This page documents our research effort towards developing visual interfaces that help people perform statistically sound visual data analysis.
Proposals
- BIGDATA: F: Statistically Sound and Computationally Efficient Massive Data Analysis - Sample Complexity, Uniform Convergence, and False Discovery Rate
References
Vision and use case papers
- Towards Sustainable Insights (or why polygamy is bad for you): outlines a few examples of false discovery issues in existing visual data exploration (e.g. Vizdom) and visual recommendation tools (e.g. SeeDB)
Interactive control of false discovery rate during data exploration
- Controlling False Discoveries during Interactive Data Exploration: reports case study results from applying a new procedure for controlling the false discovery rate (FDR) during data exploration. The interface receives little emphasis so far, though it sounds like Emanuel is working on that. The case study could inspire ways in which visualizations help users better understand the FDR control procedure.
- Graphical approaches for multiple comparison procedures using weighted Bonferroni, Simes, or parametric tests: presents a graphical representation to illustrate test procedures that control family-wise error rate (FWER)
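As a point of orientation for the two error rates mentioned above, the sketch below contrasts two textbook baselines: Bonferroni correction, which controls FWER, and the Benjamini-Hochberg (BH) step-up procedure, which controls FDR. Neither is the procedure proposed in the papers listed here; the function names and the p-values are made up for illustration. Note that BH assumes all p-values are available up front, which is one reason interactive exploration calls for a different kind of procedure.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H_i when p_i <= alpha / m (controls the family-wise error rate)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the false discovery rate).

    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis whose rank is at most k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from eight comparisons made during one exploration session.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
fwer_decisions = bonferroni(p_values)
fdr_decisions = benjamini_hochberg(p_values)
print("Bonferroni rejections:", sum(fwer_decisions))
print("BH rejections:", sum(fdr_decisions))
```

On these made-up values BH rejects one more hypothesis than Bonferroni, which illustrates the usual trade-off: FDR control is less conservative than FWER control when many hypotheses are tested.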
Previous work on visualizations and statistical analysis
- Bayesian reasoning
- Improving Bayesian Reasoning: The Effects of Phrasing, Visualization, and Spatial Ability, Ottley et al., 2016
- Assessing the Effect of Visualizations on Bayesian Reasoning through Crowdsourcing, Micallef, Dragicevic, and Fekete, 2012
- Visual estimation of coefficients of statistical models
- Judging correlation from scatterplots and parallel coordinate plots, Li, Martens, and van Wijk, 2008. This paper reports experimental results on how well people can assess correlation under varying sample sizes and visualization methods (scatterplots vs. parallel coordinate plots). The results suggest that the accuracy of statistical judgements varies across visualization designs, which motivates careful analysis and experimentation when designing visualizations to facilitate statistical data analysis.
- Visualization of statistical parameters
- Visual Encodings of Temporal Uncertainty: A Comparative User Study: evaluates six methods for visualizing temporal uncertainty, three of which are for statistical uncertainty.
- Integrating statistics and visualizations for exploratory analysis
- Integrating Statistics and Visualization: Case Studies of Gaining Clarity during Exploratory Data Analysis: "statistics" here refers specifically to network metrics, e.g. node degree, betweenness, and closeness. The paper focuses on using these statistics to help with navigation (e.g. filtering and zooming based on network metric values).