MPO 581 Class number 9/27 Feb 16, 2011
Topic: Exploratory multivariate colorized go wild HW2
Loose ends from last class:
Teams and homework leaders all assigned!? (edit amongst yourselves)
Testable questions compilation taking shape: add to it, for course brownie points.
Today's material:
Homework 1: a few comments
HW2 - Multivariate Mayhem!
- the antidote to my stern lectures about the Virtues of black and white for Honest Presentation of True Distributions (where eyeball integrals add up to the total) of geophysical Stuff (quantities whose total or average means something).
Exploratory data analysis should use
as much of our sensory and cognitive capability as possible. We are the
best pattern finding machines in the universe! No algorithm is going to
beat our brains, so why not run everything under your eyeballs first?
Later, you can use statistics and transformations to quantify patterns,
and double check whether they are demonstrably, objectively real,
significant, and worth the attention of others.
Statistical graphics for Multivariate Data 1 old site
How many variables can we squeeze into a plot? How usefully?
The first 2 (or 3) are easy: spatial dimensions in a scatter plot.
One or two more dimensions can be added:
color and plotting symbol size. Zuidema/Kandisnki
Or try animated!
Histograms can be color-weighted by a 3rd variable (Khairoutdinov & Randall)
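One possible reading of "color-weighted" is a 2D histogram in which each bin's color is set by the mean of a third variable, while the bin counts are kept separately. The sketch below assumes that reading; the function name `binned_mean` and its arguments are hypothetical, and it uses only the Python standard library.

```python
import math

def binned_mean(x, y, z, nx, ny, x_range, y_range):
    """Return (counts, means) on an ny-by-nx grid.

    counts[j][i] is the number of (x, y) points in bin (i, j);
    means[j][i] is the mean of z over those points (NaN for empty bins).
    """
    x0, x1 = x_range
    y0, y1 = y_range
    counts = [[0] * nx for _ in range(ny)]
    sums = [[0.0] * nx for _ in range(ny)]
    for xi, yi, zi in zip(x, y, z):
        if not (x0 <= xi <= x1 and y0 <= yi <= y1):
            continue  # skip points outside the histogram range
        # Points exactly on the upper edge go into the last bin
        i = min(int((xi - x0) / (x1 - x0) * nx), nx - 1)
        j = min(int((yi - y0) / (y1 - y0) * ny), ny - 1)
        counts[j][i] += 1
        sums[j][i] += zi
    means = [
        [sums[j][i] / counts[j][i] if counts[j][i] else math.nan
         for i in range(nx)]
        for j in range(ny)
    ]
    return counts, means
```

The `means` grid could then be handed to a color-fill routine, with `counts` drawn as contours or used to fade sparsely populated bins.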
There are some creative attempts like Chernoff faces or
glyph plots to use our brain's specialized face-recognition
capabilities.
Interactive Java Tools For Exploring High Dimensional Data
More by same guy: http://www.stat.tamu.edu/~west/applets/
A little more creative http://www.datavis.ca/gallery/bright-ideas.php
General colorization advice:
http://www.dataspora.com/blog/how-to-color-multivariate-data/
Loose ends
http://www.webdesignerdepot.com/2009/06/50-great-examples-of-data-visualization/
http://www.trinity.edu/rjensen/352wpvisual/000datavisualization.htm
Outright data art: "photographer" Chris Jordan
Open questions, assignments, and loose ends for next class:
Testable questions about today's material:
Datasets for HW2:
- Anything else you care to look at with at least 5 colocated
variables
Exploratory:
Start with simple time series, or 1D histograms if the datapoint order
is too arbitrary.
Are there any bimodal/multimodal variables?
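After looking at the 1D histograms by eye, a crude numerical cross-check for more than one mode is to count local maxima in the binned counts. This is only a sketch: the helper names `histogram` and `count_peaks` are hypothetical, and the answer depends strongly on the number of bins and on noise, so it does not replace the eyeball test.

```python
def histogram(values, nbins):
    """Equal-width histogram counts between min(values) and max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / nbins or 1.0  # avoid zero width if all values are equal
    counts = [0] * nbins
    for v in values:
        k = min(int((v - lo) / width), nbins - 1)  # top edge goes in last bin
        counts[k] += 1
    return counts

def count_peaks(counts, min_count=1):
    """Count bins that are higher than the bin to the left and at least as
    high as the bin to the right (so a flat-topped peak is counted once)."""
    padded = [0] + list(counts) + [0]
    peaks = 0
    for k in range(1, len(padded) - 1):
        c = padded[k]
        if c >= min_count and c > padded[k - 1] and c >= padded[k + 1]:
            peaks += 1
    return peaks
```

Raising `min_count` ignores peaks made of only a few points; trying several values of `nbins` shows how robust a second mode is.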
Next make multidimensional scatter plots, scatterplot matrix pages,
3D scatter plots, and/or multidimensional histograms. Be systematic
at this stage: scatterplot everything against everything else. Anything
interesting?
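Being systematic means visiting every pair of variables. A minimal sketch of that loop, assuming the data sit in a dict of equal-length lists (the names `pearson` and `rank_pairs` are hypothetical): it lists all pairs and orders them by the size of the linear correlation, which is only a way to decide which scatter plots to look at first. Nonlinear patterns can have near-zero correlation, so every pair still deserves a look.

```python
from itertools import combinations
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return math.nan
    return cov / math.sqrt(var_a * var_b)

def rank_pairs(data):
    """data: dict mapping variable name -> list of values.

    Returns [(name_a, name_b, r), ...] for every pair of variables,
    sorted by |r| from largest to smallest (NaN treated as 0).
    """
    pairs = [(a, b, pearson(data[a], data[b])) for a, b in combinations(data, 2)]
    return sorted(
        pairs,
        key=lambda t: 0.0 if math.isnan(t[2]) else abs(t[2]),
        reverse=True,
    )
```

With 5 variables this gives 10 pairs, which is one scatterplot matrix page.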
How many different variables can you squeeze onto one plot?
x-y-colorfill-overlay or x-y-dotcolor-dotsize...more?
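The x-y-dotcolor-dotsize combination can be sketched in Python with matplotlib's `scatter`, whose `c` and `s` arguments set the color and area of each dot. The helper names `scale_sizes` and `plot_four_variables`, the size limits, and the `viridis` colormap are illustrative assumptions, not part of the assignment.

```python
def scale_sizes(values, s_min=10.0, s_max=200.0):
    """Map values linearly onto marker areas between s_min and s_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 * (s_min + s_max)] * len(values)  # constant variable
    return [s_min + (v - lo) / (hi - lo) * (s_max - s_min) for v in values]

def plot_four_variables(x, y, color_var, size_var,
                        labels=("x", "y", "color", "size")):
    """Scatter plot carrying four variables: position, dot color, dot size."""
    # Imported here so scale_sizes can be used without matplotlib installed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    sc = ax.scatter(x, y, c=color_var, s=scale_sizes(size_var),
                    cmap="viridis", alpha=0.7)
    ax.set_xlabel(labels[0])
    ax.set_ylabel(labels[1])
    fig.colorbar(sc, ax=ax, label=labels[2])
    ax.set_title(f"dot size ~ {labels[3]}")
    return fig
```

A fifth variable could be carried by the marker shape, by plotting each category with its own `scatter` call and `marker` argument.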
Matlab resources of interest:
http://www.mathworks.com/products/statistics/demos.html?file=/products/demos/shipping/stats/mvplotdemo.html#2
scatterhist
hist3 -- for 2D histograms
IDL:
hist_2d
http://www.idlcoyote.com/tips/scatter3d.html
Python: