|
Nonparametric: data-driven statistics
|
Parametric (analytic) statistics: uses assumptions
from probability theory
|
Exploratory,
Descriptive,
Characterizing
the data
↓ |
- Data summarization
- mean, variance = stdev^2, skew, kurtosis...
- Histogram
|
- Probability distributions
- implies processes
- linear, additive (+/-) processes <-> Gaussian (Normal)
- like momentum in gas molecules => exp(-V^2/T)
- A conserved positive quantity is exchanged
symmetrically among interchangeable (arbitrarily labeled or defined)
interacting subsystems <-> Exponential
- like KE in gas molecules => exp(-KE/T) = exp(-V^2/T)
- Umm hey those are the same...
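
Writing out the two exponents shows why they coincide. This is a sketch under assumptions not stated in the notes: a single velocity component V, particle mass m, and temperature T measured in energy units (Boltzmann's constant absorbed into T).

```latex
\begin{align}
  \mathrm{KE} &= \tfrac{1}{2}\, m V^{2}, \\
  \exp\!\left(-\frac{\mathrm{KE}}{T}\right)
    &= \exp\!\left(-\frac{m V^{2}}{2T}\right).
\end{align}
```

So the exponential weight in energy is a Gaussian in V with variance T/m. One caveat: the probability density of KE itself also picks up a Jacobian factor from the change of variable, so it is the exponential factors that are identical, not the full densities.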
|
↓ (has more
and more)
|
- Transform a dimension (independent variable or domain)
- Variance is spread over degrees of freedom (d.o.f.)
- many cross-cuts of d.o.f. possible
- e.g. space, time, spacetime
- partition by scale (regrid) --->
|
- Transforms using specified basis functions
- <---Fourier analysis (sines and cosines)
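
To make the idea of spreading variance over a sine and cosine basis concrete, here is a minimal sketch using a naive discrete Fourier transform from the standard library. The function name `dft`, the test signal, and the normalization are assumptions chosen for illustration. A real analysis would normally use an FFT library.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

n = 32
# Hypothetical test signal: exactly 3 cycles of a sine over the record.
signal = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]

spectrum = dft(signal)
# Power per frequency, normalized so the total equals the mean square of the signal.
power = [abs(c) ** 2 / n ** 2 for c in spectrum]

mean = sum(signal) / n
variance = sum((s - mean) ** 2 for s in signal) / n

# Dominant frequency index among the positive frequencies.
peak = max(range(1, n // 2), key=lambda k: power[k])
```

With this normalization, the power summed over all nonzero frequencies equals the variance of the signal (Parseval's relation), so the spectrum is literally a partition of variance by scale.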
|
↓ (science
ideas)
|
- Transform values
(e.g. value --> rank)
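
A value-to-rank transform might look like the following sketch. The function name `rank_transform` and the convention of giving tied values their average rank are assumptions, since the notes do not specify how ties are handled.

```python
def rank_transform(values):
    """Replace each value by its 1-based rank; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

ranks = rank_transform([10, 30, 20, 30])
```

Ranks depend only on ordering, so any monotonic rescaling of the values leaves them unchanged.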
|
- Transform to Gaussian so that Gaussian-based theorems can be used
- or use rank statistics (there is theory for these) or other approaches
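
One common way to transform values toward a Gaussian is a rank-based normal-scores transform, sketched below. This is one possible choice, not necessarily the one the notes have in mind. The function name `normal_scores` and the plotting position (r - 0.5)/n are assumptions, and ties are simply broken by input order.

```python
from statistics import NormalDist

def normal_scores(values):
    """Map each value to the standard Normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    std_normal = NormalDist()                 # mean 0, stdev 1
    for r, i in enumerate(order, start=1):
        # (r - 0.5) / n lies strictly inside (0, 1), as inv_cdf requires.
        scores[i] = std_normal.inv_cdf((r - 0.5) / n)
    return scores

# Hypothetical, strongly skewed sample.
skewed = [1, 2, 4, 8, 16, 32, 64]
z = normal_scores(skewed)
```

The output keeps the ordering of the input but is symmetric about zero by construction.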
|
↓ (driving
it)
|
- Associations (correlation, covariance)
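
For reference, a minimal sketch of sample covariance and Pearson correlation follows. The function names and the n - 1 normalization are assumptions for this example.

```python
import math

def covariance(x, y):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))

# Hypothetical data: y is an exact linear function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
cov_xy = covariance(x, y)
r_xy = correlation(x, y)
```

Correlation is dimensionless and bounded by -1 and 1, while covariance carries the units of both variables.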
|
- Information theory (mutual information, entropy)
- data compression, communication engineering
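
Entropy and mutual information can be estimated from discrete samples by simple counting, as in this sketch. The function names and the plug-in (frequency-based) estimator are assumptions. Continuous data would first need binning, which is not shown.

```python
import math
from collections import Counter

def entropy(symbols):
    """Plug-in Shannon entropy in bits from observed frequencies."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), using paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

# Hypothetical binary sequences.
xs = [0, 0, 1, 1]
same = [0, 0, 1, 1]        # identical to xs: shares all of its information
unrelated = [0, 1, 0, 1]   # every (x, y) pair occurs equally often

mi_same = mutual_information(xs, same)
mi_unrelated = mutual_information(xs, unrelated)
```

Unlike correlation, mutual information does not assume a linear relationship between the two variables.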
|
↓ (and
more)
|
- Signal or event detection & isolation
|
- Extreme value theories (events)
- Signal processing
- "System identification"
|
↓ (sophistication)
|
- Confidence or robustness checks
- e.g. subdividing sample
- data denial experiments
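
A simple robustness check by subdividing the sample can be sketched as follows: split the data at random into parts and see whether a statistic of interest (here the mean) is stable across them. The function name `subsample_means`, the synthetic data, and the number of parts are assumptions for illustration.

```python
import random
import statistics

def subsample_means(data, n_parts, seed=0):
    """Randomly split data into n_parts subsets and return each subset's mean."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    parts = [shuffled[i::n_parts] for i in range(n_parts)]
    return [statistics.fmean(p) for p in parts]

# Hypothetical data: 1000 draws from a Normal with mean 10 and stdev 2.
gen = random.Random(1)
data = [gen.gauss(10.0, 2.0) for _ in range(1000)]

means = subsample_means(data, 4)
spread = max(means) - min(means)   # small spread suggests a stable estimate
```

If the spread between subsets is comparable to the effect being claimed, the result should be treated with caution.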
|
- Formal hypothesis testing
- e.g. t-test p-value below 5% ==> "statistically significant"
- (based on small-sample theory for Normal populations)
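
A one-sample t statistic can be computed as in this sketch. The function name `one_sample_t`, the data, and the null value `mu0` are hypothetical. Because the standard library has no Student-t distribution, the test compares against the tabulated two-sided 5% critical value for 9 degrees of freedom (about 2.262) rather than computing a p-value.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)   # stdev uses n - 1
    return (mean - mu0) / std_err, n - 1

# Hypothetical sample of 10 measurements.
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 5.9, 5.4, 5.5]
t_stat, dof = one_sample_t(sample, mu0=5.0)

# Two-sided 5% critical value for 9 degrees of freedom, from standard t tables.
T_CRIT_DOF9 = 2.262
significant = abs(t_stat) > T_CRIT_DOF9
```

The test assumes the sample is drawn independently from a Normal population. The 5% threshold is a convention, not a property of the data.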
FORMAL SCIENCE
|
|
|
- Monte Carlo Synthetic data generation
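
As a minimal sketch of Monte Carlo synthetic data generation, the code below draws many synthetic samples from an assumed Normal population and looks at the resulting distribution of sample means. The function name `synthetic_means` and all parameter values are assumptions for this example.

```python
import math
import random
import statistics

def synthetic_means(n_samples, sample_size, mu=0.0, sigma=1.0, seed=0):
    """Means of n_samples synthetic Normal(mu, sigma) samples of a given size."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

means = synthetic_means(n_samples=2000, sample_size=25)

# Empirical spread of the sample mean versus the theoretical sigma / sqrt(n).
empirical = statistics.stdev(means)
theoretical = 1.0 / math.sqrt(25)
```

The same approach works when no analytic theory is available: replace the generator with any process that mimics the null hypothesis and read significance levels off the synthetic distribution.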
|
Inferential or
Confirmatory
(and on to Predictive) |
- Statistical modeling or forecasting ------------------>
|
- Statistical forecast evaluation
APPLICATIONS
|