Statistically Sound Visual Data Analysis
Overview
This page documents our research effort towards developing visual interfaces that help people perform statistically sound data analysis.
Proposals
- BIGDATA: F: Statistically Sound and Computationally Efficient Massive Data Analysis - Sample Complexity, Uniform Convergence, and False Discovery Rate
References
Vision and use case papers
- Towards Sustainable Insights (or why polygamy is bad for you): outlines a few examples of false discovery issues in existing visual data exploration (e.g. Vizdom) and visual recommendation tools (e.g. SeeDB)
Interactive control of false discovery rate during data exploration
- Controlling False Discoveries during Interactive Data Exploration: the paper reports case study results from applying a new procedure for controlling the false discovery rate (FDR) during data exploration. There is not much emphasis on the interface yet, but it sounds like Emanuel is working on that. The case study could inspire ways in which visualizations help users better understand the FDR control procedure.
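For readers new to FDR control, the following is a minimal sketch of the classic Benjamini-Hochberg procedure. It is not the procedure from the paper above; it is a standard batch method that assumes all p-values are known in advance, which is the assumption that breaks down when hypotheses arrive one at a time during interactive exploration. The function name and the example p-values are illustrative only.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Batch Benjamini-Hochberg: requires every p-value up front.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject the max_k hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Made-up p-values, e.g. one per chart a user inspected in a session.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p_values, alpha=0.05))
```

With these made-up values, five tests fall below an uncorrected 0.05 threshold, but only the two smallest survive the correction. That gap is the kind of effect an interface could make visible to users.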
Previous work on visualizations and statistical analysis
- Bayesian reasoning
- Improving Bayesian Reasoning: The Effects of Phrasing, Visualization, and Spatial Ability, Ottley et al., 2016
- Assessing the Effect of Visualizations on Bayesian Reasoning through Crowdsourcing, Micallef, Dragicevic, and Fekete, 2012
- Visual estimation of coefficients of statistical models
- Judging correlation from scatterplots and parallel coordinate plots, Li, Martens, and van Wijk, 2008. This paper reports experimental results on how well people can assess correlation across varying sample sizes and visualization methods (scatterplots vs. parallel coordinate plots). The results suggest that the accuracy of statistical judgements varies across visualization designs, which motivates careful analysis and experimentation when designing visualizations to support statistical data analysis.