MPO 581 Class number 9/27   Feb 16, 2011

Topic: Exploratory multivariate data analysis: colorize, go wild (HW2)

Loose ends from last class:

Teams and homework leaders all assigned!?    (edit amongst yourselves)
Testable questions compilation taking shape: add to it, for course brownie points.


Today's material:

Homework 1: a few comments  


HW2 - Multivariate Mayhem!

Exploratory data analysis should use as much of our sensory and cognitive capability as possible. We are the best pattern-finding machines in the universe! No algorithm is going to beat our brains, so why not run everything past your eyeballs first? Later, you can use statistics and transformations to quantify the patterns, and double-check whether they are demonstrably, objectively real, significant, and worth the attention of others.

Statistical graphics for Multivariate Data (1, old site)

How many variables can we squeeze into a plot? How usefully?
The first 2 (or 3) are easy: spatial dimensions in a scatter plot.

One or two more dimensions can be added: color and plotting symbol size (Zuidema/Kandisnki).
Or try animated!
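
A minimal Python sketch of the color-plus-symbol-size idea, assuming numpy and matplotlib are installed. The function names (scale_to_sizes, four_variable_scatter), the size limits, and the synthetic data are made up for illustration; swap in your own arrays.

```python
import numpy as np


def scale_to_sizes(values, smin=10.0, smax=200.0):
    """Linearly map values onto marker areas between smin and smax (points^2)."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # All values equal: use a mid-sized marker everywhere.
        return np.full(values.shape, 0.5 * (smin + smax))
    return smin + (smax - smin) * (values - values.min()) / span


def four_variable_scatter(x, y, color_var, size_var):
    """Scatter x vs. y, with a 3rd variable as color and a 4th as marker size."""
    # Imported here so scale_to_sizes can be used without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    points = ax.scatter(x, y, c=color_var, s=scale_to_sizes(size_var),
                        cmap="viridis", alpha=0.7)
    fig.colorbar(points, ax=ax, label="3rd variable (color)")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    return fig, ax


# Synthetic stand-in data (assumption: not from any course dataset).
rng = np.random.default_rng(0)
x, y, c, s = rng.normal(size=(4, 100))
sizes = scale_to_sizes(s)
```

Calling four_variable_scatter(x, y, c, s) and then saving or showing the figure puts four variables on one page. Whether the eye can actually read dot size quantitatively is part of the "how usefully?" question.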

Histograms can be color-weighted by a 3rd variable (Khairoutdinov & Randall).
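
One possible reading of "color-weighted", sketched in Python: bar height is the count in each bin, and bar color is the mean of a third variable among the points in that bin. This is an assumption for illustration, not necessarily how the Khairoutdinov & Randall figure was made, and the function names are hypothetical.

```python
import numpy as np


def binned_counts_and_means(x, z, bins=10):
    """Histogram x; also return the mean of z among the points in each x-bin."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    counts, edges = np.histogram(x, bins=bins)
    z_sums, _ = np.histogram(x, bins=edges, weights=z)
    # Empty bins get NaN instead of triggering a divide-by-zero.
    means = np.full(counts.shape, np.nan)
    filled = counts > 0
    means[filled] = z_sums[filled] / counts[filled]
    return counts, means, edges


def plot_color_weighted_hist(x, z, bins=10):
    """Bar height = count in each x-bin; bar color = mean of z in that bin."""
    # Imported here so the binning helper works without matplotlib.
    import matplotlib.pyplot as plt
    from matplotlib import cm, colors

    counts, means, edges = binned_counts_and_means(x, z, bins)
    cmap = plt.get_cmap("viridis")
    norm = colors.Normalize(vmin=np.nanmin(means), vmax=np.nanmax(means))
    fig, ax = plt.subplots()
    ax.bar(edges[:-1], counts, width=np.diff(edges), align="edge",
           color=cmap(norm(means)))
    fig.colorbar(cm.ScalarMappable(norm=norm, cmap=cmap), ax=ax,
                 label="mean of 3rd variable")
    return fig, ax


# Tiny hand-checkable example with two bins, [0, 1) and [1, 2].
counts, means, edges = binned_counts_and_means(
    x=[0.1, 0.2, 1.5, 1.7], z=[10.0, 20.0, 1.0, 3.0], bins=[0.0, 1.0, 2.0])
print(counts, means)  # counts are 2 and 2; means are 15 and 2
```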

There are some creative attempts like Chernoff faces or glyph plots to use our brain's specialized face-recognition capabilities.

Interactive Java Tools For Exploring High Dimensional Data
    More by same guy: http://www.stat.tamu.edu/~west/applets/

A little more creative: http://www.datavis.ca/gallery/bright-ideas.php




General colorization advice:
http://www.dataspora.com/blog/how-to-color-multivariate-data/

Loose ends
http://www.webdesignerdepot.com/2009/06/50-great-examples-of-data-visualization/
http://www.trinity.edu/rjensen/352wpvisual/000datavisualization.htm

Outright data art: "photographer" Chris Jordan




Open questions, assignments, and loose ends for next class:

Testable questions about today's material:

Datasets for HW2:

Exploratory:
Start with simple time series, or 1D histograms if the order of the data points is too arbitrary.
Are there any bimodal/multimodal variables?
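
Eyeballing the histogram is the real test for multimodality, but a crude counter of histogram peaks can flag candidates when there are many variables. The Python sketch below is only a heuristic (the function name is made up): it counts bins strictly taller than both neighbors, so noisy histograms produce spurious peaks and flat-topped peaks are missed. The choice of bins matters a lot.

```python
import numpy as np


def count_histogram_peaks(values, bins=10):
    """Count histogram bins that are strictly taller than both neighbors."""
    counts, _ = np.histogram(values, bins=bins)
    padded = np.concatenate(([0], counts, [0]))  # lets edge bins count as peaks
    middle = padded[1:-1]
    is_peak = (middle > padded[:-2]) & (middle > padded[2:])
    return int(is_peak.sum())


# Hand-built samples: repeat each integer value a chosen number of times.
edges = np.arange(0.5, 8.5)  # unit-wide bins centered on 1..7
two_humps = np.repeat([1, 2, 3, 4, 5, 6, 7], [2, 8, 3, 1, 4, 9, 2])
one_hump = np.repeat([1, 2, 3, 4, 5, 6, 7], [1, 3, 6, 9, 6, 3, 1])

print(count_histogram_peaks(two_humps, bins=edges))  # 2
print(count_histogram_peaks(one_hump, bins=edges))   # 1
```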

Next make multidimensional scatter plots, scatterplot matrix pages,
3D scatter plots, and/or multidimensional histograms.  Be systematic
at this stage: scatterplot everything against everything else. Anything
interesting?
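
"Everything against everything else" can be done mechanically. Here is a Python sketch, assuming numpy and matplotlib; the variable names T, q, u, w and the random data are placeholders, and both function names are hypothetical.

```python
from itertools import combinations

import numpy as np


def variable_pairs(names):
    """Every unordered pair of variable names: n*(n-1)/2 pairs for n variables."""
    return list(combinations(names, 2))


def scatterplot_matrix(data):
    """Grid of scatter plots for a dict {name: 1-D array}; histograms on the diagonal."""
    # Imported here so variable_pairs works without matplotlib.
    import matplotlib.pyplot as plt

    names = list(data)
    n = len(names)
    fig, axes = plt.subplots(n, n, figsize=(2.5 * n, 2.5 * n), squeeze=False)
    for i, row_name in enumerate(names):
        for j, col_name in enumerate(names):
            ax = axes[i, j]
            if i == j:
                ax.hist(data[row_name], bins=20)
            else:
                ax.scatter(data[col_name], data[row_name], s=4)
            if i == n - 1:
                ax.set_xlabel(col_name)
            if j == 0:
                ax.set_ylabel(row_name)
    return fig, axes


# Placeholder data (assumption: four made-up variables of 200 points each).
rng = np.random.default_rng(1)
data = {name: rng.normal(size=200) for name in ["T", "q", "u", "w"]}
pairs = variable_pairs(data)
print(len(pairs))  # 6 pairs for 4 variables
```

The pair count grows as n*(n-1)/2, so with many variables the matrix page gets crowded fast; looping over pairs and saving one figure per pair is an alternative.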

How many different variables can you squeeze onto one plot? x-y-colorfill-overlay or x-y-dotcolor-dotsize...more?

Matlab resources of interest:
http://www.mathworks.com/products/statistics/demos.html?file=/products/demos/shipping/stats/mvplotdemo.html#2
scatterhist
hist3 -- for 2D histograms

IDL:
hist_2d
http://www.idlcoyote.com/tips/scatter3d.html

Python:
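A rough Python counterpart to hist3 and hist_2d is numpy.histogram2d. The sketch below uses synthetic data (an assumption, not a course dataset) and only computes the counts.

```python
import numpy as np

# Synthetic, correlated stand-in data.
rng = np.random.default_rng(2)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(scale=0.5, size=1000)

# counts[i, j] = number of points in x-bin i and y-bin j.
counts, x_edges, y_edges = np.histogram2d(x, y, bins=(20, 15))

print(counts.shape)       # (20, 15)
print(int(counts.sum()))  # 1000: every point lands in some bin
```

To display it with matplotlib, pcolormesh(x_edges, y_edges, counts.T) works; the transpose is needed because pcolormesh expects rows to run along y. Matplotlib's scatter (with its c and s arguments) and its mplot3d toolkit cover the multidimensional and 3D scatter plots.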