MPO 581 Class number 7/27 Feb 9, 2011
Topic: The (rest of the) Most Interesting 5% of Statistics
Loose ends from last class:
- Teams and homework leaders all assigned!? (edit amongst yourselves)
- Testable questions compilation taking shape: add to it, for course brownie points.
Today's material: The (rest of the) Most Interesting 5% of Statistics (trying very hard not to become a statistics course)
Outline - without content
- What is a statistic, what is statistics?
- Subdividing the very large field of statistics (in 2 axes)
  - parametric vs. non-parametric; exploratory to confirmatory (or beyond: predictive).
- Distributions: of Total Stuff, over Bins (infinitesimal bins, in the continuous case)
- Probability as the Total Stuff (always adds up to 1)
- Probability distributions
Outline again - with content
....
4. Probability as the Total Stuff (always adds up to 1)
Probability -- it always adds up to 1 over all Bins, because one reality exists!
- "Frequentist"
interpretation:
probability
=
frequency
of
occurrence.
So
straightforward
and direct, for technical work on
quantified systems, which is basically what we are doing in a Data
Analysis.
- "Bayesian
probability is one of the different interpretations of the concept
of probability and belongs to the category of
evidential probabilities. The Bayesian interpretation of probability
can be seen as an extension of logic that enables
reasoning with uncertain statements. To evaluate the probability of a hypothesis,
the
Bayesian
probabilist
specifies
some
prior
probability,
which is
then updated in the light of new relevant data. The Bayesian
interpretation provides a standard set of procedures and formulae to
perform this calculation. Bayesian probability interprets the concept
of probability as "a measure of a state of
knowledge",[1]
in contrast to interpreting it as a frequency or a "propensity" of some phenomenon."
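To make the "prior probability updated in the light of new data" idea concrete, here is a minimal sketch (in Python/numpy, not the course's IDL; the two-hypothesis coin example and its numbers are invented for illustration):

    import numpy as np

    # Two hypotheses about a coin: H0 = fair (p_heads = 0.5), H1 = loaded (p_heads = 0.8).
    p_heads = np.array([0.5, 0.8])

    # Prior probability over the hypotheses (adds up to 1, as always).
    prior = np.array([0.9, 0.1])

    # New relevant data: 7 heads in 10 flips.
    heads, flips = 7, 10

    # Likelihood of the data under each hypothesis (binomial shape; the common factor cancels).
    likelihood = p_heads**heads * (1 - p_heads)**(flips - heads)

    # Bayes' rule: posterior is proportional to likelihood * prior, renormalized to total 1.
    posterior = likelihood * prior / np.sum(likelihood * prior)

    print("prior    :", prior)
    print("posterior:", posterior)   # belief shifts toward the loaded-coin hypothesis

The frequentist reading of the same numbers would simply report 7/10 as the observed frequency; the Bayesian machinery instead keeps the whole state of knowledge as a probability distribution over hypotheses.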
5. Probability distributions
Probability distribution function or probability density function (PDF): The 1 of total probability is spread over one or more dimensions. Or we could say the Unity of probability is subdivided or decomposed into Bins or slices or pieces, like pie (or variance, or Error). A PDF is non-negative (PDF ≥ 0) everywhere. Its value is also called "likelihood" (as in MLE, search above). I think it makes more sense to think of likelihood as a probability density. The acronym "PDF" is nicely ambiguous.
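A quick numerical check of both properties, as a hedged sketch in Python/numpy (the standard normal is just a convenient stand-in PDF):

    import numpy as np

    x = np.linspace(-8.0, 8.0, 4001)              # a grid of Bin-variable values
    dx = x[1] - x[0]                              # width of each (nearly infinitesimal) bin
    pdf = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)  # standard normal density

    print(pdf.min() >= 0)      # True: a PDF is non-negative everywhere
    print(np.sum(pdf * dx))    # ~1.0: the Total Stuff (probability) summed over all bins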
Cumulative distribution function (CDF): In probability theory and statistics, the cumulative distribution function (CDF), or just distribution function, describes the probability that a real-valued random variable X with a given probability distribution will be found at a value less than or equal to x. Intuitively, it is the "area so far" function of the probability distribution.
PDFs are solutions to stochastic differential equations. The PDF represents an equilibrium between "fluxes" in and out of "bins" of probability, or in other words an equilibrium between processes that change other values of the Bin variable so that they fall into a given bin, minus processes that change values within the bin to other values which fall in other bins. Animations from an intuitive case like money exchanges might help (a toy sketch also follows the equation below): http://www.physics.umd.edu/~yakovenk/econophysics/. The Fokker-Planck equation is a profound thing to stare at some day; it is the time-dependent equation governing the "probability flux divergence" that these animations are solutions of:
∂(probability)/∂t = -∇•(probability flux)
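Here is a hedged toy version of such an animation in Python/numpy (a simple random re-splitting rule between pairs of agents, invented for illustration and not necessarily the rule used in the linked animations): start everyone with the same amount of money, let random pairs exchange, and the histogram of money over bins relaxes toward a stable equilibrium shape in which the fluxes in and out of each bin balance.

    import numpy as np

    rng = np.random.default_rng(0)

    n_agents, n_exchanges = 5000, 200000
    money = np.full(n_agents, 100.0)              # everyone starts with the same "Stuff"

    for _ in range(n_exchanges):
        i, j = rng.integers(n_agents, size=2)     # pick a random pair of agents
        if i == j:
            continue
        pool = money[i] + money[j]
        frac = rng.random()                       # re-split their pooled money at random
        money[i], money[j] = frac * pool, (1 - frac) * pool

    # Histogram of money over bins: an estimate of the equilibrium PDF the flux balance selects.
    density, edges = np.histogram(money, bins=50, density=True)
    print(density[:5])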
Key PDFs worth knowing: (The Most Interesting 5% of Statistics).
a) Profound, fundamental, yet simply arising from symmetries or other non-statements: The Normal distribution embodies a null hypothesis about the Nature behind the population your data came from: an adding machine fed by a random number generator. A null hypothesis is basically the most boring (or conservative) possible interpretation of your data (heck, my random number generator could make that). Can you prove otherwise: that your data tell an interesting story about their source? That is the challenge of statistics to scientific claims.
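A hedged sketch of that null hypothesis in Python/numpy (the sample sizes are arbitrary): an "adding machine fed by a random number generator" produces data whose histogram is well described by a Normal PDF.

    import numpy as np

    rng = np.random.default_rng(0)

    # The adding machine: each data value is the sum of 50 independent random numbers.
    data = rng.uniform(-1, 1, size=(100000, 50)).sum(axis=1)

    counts, edges = np.histogram(data, bins=60, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])

    # Compare against the Normal PDF with the same mean and standard deviation.
    mu, sigma = data.mean(), data.std()
    normal = np.exp(-((centers - mu) / sigma)**2 / 2) / (sigma * np.sqrt(2 * np.pi))
    print(np.max(np.abs(counts - normal)))   # small: the boring null describes these data well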
b) What we would see in limited samples, if the above were true:
How to build distributions of Stuff over Bins
- There are 2 physical quantities involved: your Stuff, and your Bin variable.
- There is a special, important kind of Stuff variable: number or frequency.
  - Frequency is equal to probability in the frequentist view
  - (assuming the sample is an unbiased one!)
A histogram can be built as the derivative of the CDF of your Stuff. I find this an instructive line of reasoning and will demo it. IDL code here. But more typically, we just use a built-in histogram function, and work with the values in each bin: sum them, average them, etc.
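The linked IDL demo is not reproduced here; what follows is a hedged Python/numpy sketch of the same line of reasoning, with made-up sample data: build the empirical CDF of the sampled values, then difference it across bin edges, and you recover exactly what the built-in histogram function gives.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=10000)                       # the sampled Bin-variable values

    edges = np.linspace(-4, 4, 41)                   # bin edges

    # Empirical CDF evaluated at each bin edge: fraction of samples <= that edge.
    xs = np.sort(x)
    cdf_at_edges = np.searchsorted(xs, edges, side="right") / x.size

    # Histogram as the (discrete) derivative of the CDF: probability landing in each bin.
    hist_from_cdf = np.diff(cdf_at_edges)

    # Same thing from the built-in histogram function, normalized by the total count.
    hist_builtin = np.histogram(x, bins=edges)[0] / x.size

    print(np.allclose(hist_from_cdf, hist_builtin))  # True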
Histogram (Wikipedia)
Histogram bin width optimization (MIT, has demo applet)
- Make a sampling frequency distribution of your samples
  - this is just a simple histogram of the Bin quantity array, divided by its total().
  - It will be flat, if your Bins are lat or lon or time in a lat-lon-time gridded fields dataset.
- Total up your Stuff variable for all the values that fell in each bin.
  - Total up the bins and you get Total Stuff.
- umm, normalize again? confused, this part needs a rewrite with examples (class 7); see the sketch below.
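Here is a hedged Python/numpy sketch of the recipe above, with invented latitude/rainfall data standing in for the Bin variable and the Stuff. It shows the sampling frequency distribution, the total Stuff per bin (which sums to Total Stuff), and one sensible answer to the "normalize again?" question: dividing each bin's total by its sample count gives the average Stuff per bin.

    import numpy as np

    rng = np.random.default_rng(0)

    # Made-up data: "Stuff" (say, rainfall) sampled at "Bin variable" values (say, latitude).
    lat = rng.uniform(-90, 90, size=20000)                        # Bin variable
    rain = np.exp(-(lat / 30.0)**2) + 0.1 * rng.random(20000)     # Stuff

    edges = np.linspace(-90, 90, 19)                 # 10-degree latitude bins
    which_bin = np.digitize(lat, edges) - 1          # bin index for every sample
    counts = np.histogram(lat, bins=edges)[0]

    # Sampling frequency distribution: histogram of the Bin quantity, divided by its total.
    sampling_freq = counts / counts.sum()            # ~flat, since latitude is sampled uniformly here
    print(sampling_freq.round(3))

    # Total Stuff in each bin; the grand total over the bins is Total Stuff.
    stuff_total = np.array([rain[which_bin == k].sum() for k in range(len(edges) - 1)])
    print(np.isclose(stuff_total.sum(), rain.sum()))  # True

    # "Normalize again": divide each bin's total by its count to get the average Stuff per bin.
    stuff_mean = stuff_total / counts
    print(stuff_mean.round(2))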
Open questions, assignments, and loose ends for next class:
Testable questions about today's material: