Classical Statistics

Classical and Bayesian inference

Classical and Bayesian inference treat uncertainty differently.

"In the classical approach to statistical inference, parameters are regarded as fixed, but unknown. A parameter is estimated using data. The resulting parameter estimate is subject to uncertainty resulting from random variation in the data, known as sampling variability. This variability would become apparent if successive samples of the same size were to be drawn. Thus, the methods of classical inference are typically interpreted in terms of repeated sampling." M347 OU

"In the Bayesian approach to statistical inference, parameters are also regarded as fixed, but unknown. However, in Bayesian inference, uncertainty about the possible value of a parameter is represented by a probability distribution for the parameter and, as such, the parameter is treated as if it is a random variable. Before observing the data, the probability distribution representing all available information regarding the possible value of a parameter is known as the ‘prior distribution’. After observing the data, the information regarding the parameter is updated and represented by a ‘posterior distribution’. This posterior distribution is then used to estimate the parameter and to quantify the uncertainty regarding the parameter’s value."

The choice of parameter is generally made depending on what value is of scientific interest.

Statistical Inference
A probability model is written $$f(x|\theta)$$, where f is the pdf of a continuous distribution or the pmf of a discrete distribution, and $$\theta$$ is the parameter.

Parameter estimation
$$\Omega$$ is known as the parameter space, the range of all values that the parameter can take.

Parameter Space
The parameter $$\theta$$ of a probability model $$f(x|\theta)$$ takes values in the parameter space $$\Omega$$.

(@todo condense)

$$\Omega = \mathbb{R}$$

$$\Omega = \mathbb{R}^+$$

$$\Omega = (0,1)$$

The probability model might suggest a natural way of estimating the parameter, such as a sample proportion or a sample mean.

"When the quantity being estimated can be expressed in terms of the moments of the distribution (like the mean or the variance,) an estimate of this type is called a moment estimate" M347 (@todo rewrite)

(@todo is there scope to come up with a table of natural, or typical, estimators?)

If $$\theta$$ is a population characteristic such as a mean or a proportion, then the corresponding sample value provides an estimate of $$\theta$$.
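As a minimal sketch of these natural estimates, the snippet below computes a sample proportion and a sample mean. The data values and the names `trials`, `heights`, `p_hat` and `mu_hat` are made up for illustration and do not come from the course text.

```python
from statistics import mean

# Hypothetical data: 1 = success, 0 = failure in ten Bernoulli trials.
trials = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]

# Hypothetical measurements from a population with unknown mean.
heights = [1.62, 1.75, 1.68, 1.80, 1.71]

# The sample proportion is the natural estimate of the population proportion.
p_hat = sum(trials) / len(trials)

# The sample mean is the natural estimate of the population mean.
mu_hat = mean(heights)

print(f"estimated proportion: {p_hat:.2f}")
print(f"estimated mean: {mu_hat:.3f}")
```

Both are moment estimates in the sense quoted above, since each is the sample version of a population mean.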

Point estimation
Consider a random sample of n observations $$x_1, x_2, \ldots, x_n$$.

The model is that $$x_1, x_2, \ldots, x_n$$ are observations on n random variables $$X_1, X_2, \ldots, X_n$$, which are independent and identically distributed.

Point estimation is the process of using the observed values $$x_1, x_2, \ldots, x_n$$ to estimate the parameter $$\theta$$.

Point Estimator functions
A point estimator, denoted $$\hat{\theta}$$, is a function of the random variables $$X_1, X_2, \ldots, X_n$$ underlying the data, but not of the parameter.

The value that the estimator takes for a particular sample $$x_1, x_2, \ldots, x_n$$ is a point estimate. This numerical value is also denoted $$\hat{\theta}$$.
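The distinction between an estimator (a function) and an estimate (a number) can be sketched in code. This is an illustrative simulation only: the function name `theta_hat`, the seed, and the choice of a normal population with mean 5 are assumptions, not part of the notes.

```python
import random
from statistics import mean


def theta_hat(sample):
    """Point estimator: a function of the data only (here, the sample mean)."""
    return mean(sample)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
true_mu = 5.0           # assumed for the simulation; unknown in practice

# Two independent samples of the same size from the same population.
sample_1 = [rng.gauss(true_mu, 2.0) for _ in range(30)]
sample_2 = [rng.gauss(true_mu, 2.0) for _ in range(30)]

# Applying the same estimator to each sample gives two different estimates.
estimate_1 = theta_hat(sample_1)
estimate_2 = theta_hat(sample_2)

print(f"estimate from sample 1: {estimate_1:.3f}")
print(f"estimate from sample 2: {estimate_2:.3f}")
```

Note that `theta_hat` never receives `true_mu`: the estimator depends on the data, not on the parameter.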

Sampling distribution
The probability distribution of a point estimator is called its sampling distribution.

There can be more than one estimator for a given parameter. For example, because the normal distribution is symmetric about its mean $$\mu$$, the sample median can be used in place of the sample mean to estimate $$\mu$$.

Thus there can be more than one "valid" estimator of $$\mu$$.
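A rough way to see a sampling distribution is to simulate repeated sampling. The sketch below draws many samples from a standard normal population and compares the sample mean and the sample median as estimators of $$\mu$$. The seed, sample size and number of repetitions are arbitrary choices for illustration.

```python
import random
from statistics import mean, median, pvariance

rng = random.Random(1)  # fixed seed so the sketch is reproducible
mu, sigma = 0.0, 1.0    # assumed population parameters for the simulation
n, reps = 25, 2000      # sample size and number of repeated samples

means, medians = [], []
for _ in range(reps):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    means.append(mean(sample))
    medians.append(median(sample))

# Both estimators centre on mu, but their sampling distributions differ in spread.
print(f"average of sample means:    {mean(means):.4f}")
print(f"average of sample medians:  {mean(medians):.4f}")
print(f"variance of sample means:   {pvariance(means):.4f}")
print(f"variance of sample medians: {pvariance(medians):.4f}")
```

In this setting the sample medians should come out more spread than the sample means, which is one reason the mean is usually preferred for normal data.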

Confidence intervals
$$\hat{\theta}$$ provides a point estimate of $$\theta$$.

However, it is also useful to quantify the uncertainty in that estimate, which arises from the variability of the sampling distribution of the estimator.

An interval estimator, called a confidence interval, does this.

It represents a range of plausible values for the parameter.

The width of the interval reflects the variance of the sampling distribution.
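As a hedged illustration, the sketch below builds an approximate confidence interval for a population mean using a normal approximation (estimate plus or minus a quantile times the standard error). The helper `normal_ci` is a hypothetical name, and the normal approximation is an assumption suited to larger samples; for small samples from a normal population a t-based interval would normally be used instead.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def normal_ci(sample, level=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(sample)
    centre = mean(sample)
    std_error = stdev(sample) / sqrt(n)        # estimated sd of the sample mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return centre - z * std_error, centre + z * std_error


rng = random.Random(2)  # fixed seed so the sketch is reproducible

# Simulated data from an assumed population with mean 10 and sd 3.
small = [rng.gauss(10.0, 3.0) for _ in range(50)]
large = [rng.gauss(10.0, 3.0) for _ in range(5000)]

lo_s, hi_s = normal_ci(small)
lo_l, hi_l = normal_ci(large)

print(f"n = 50:   ({lo_s:.3f}, {hi_s:.3f}), width {hi_s - lo_s:.3f}")
print(f"n = 5000: ({lo_l:.3f}, {hi_l:.3f}), width {hi_l - lo_l:.3f}")
```

The larger sample gives a narrower interval because the sampling distribution of the mean has smaller variance.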

Fixed level tests
rejection region

significance level

one-sample t-test

giving a summary of a test
null hypothesis

alternative hypothesis

test statistic T

Quantiles and symmetry
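Some distributions, such as the standard normal and t-distributions, are symmetric about zero, so each lower quantile is the negative of the corresponding upper quantile.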

Likelihood function
The likelihood summarizes what information is available about a parameter $$\theta$$ from a particular sample of data $$x_1, x_2, \ldots, x_n$$.

It is a useful concept in both classical and Bayesian statistics. Likelihood and probability are similar but distinct concepts.

The likelihood measures how well each candidate parameter value explains the observed values of the random variable. Choosing the value of $$\theta$$ that maximises the likelihood therefore gives a natural estimate of the parameter.

"'probabilities are associated with random variables, likelihoods are associated with parameters.' M347 @todo"

For an observed, and therefore fixed, value of x, the likelihood function is $$L(\theta) = f(x|\theta)$$. For a discrete model this is the probability of observing this particular value (or values) of x given the parameter value $$\theta$$; for a continuous model it is the corresponding probability density.

Larger values of $$L(\theta)$$ indicate parameter values under which the observed value $$X=x$$ is more probable.

For a pmf or pdf, $$\theta$$ is fixed and x varies; for the likelihood function, x is fixed and $$\theta$$ varies.
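To make the idea of "x fixed, $$\theta$$ varying" concrete, here is a small sketch that evaluates a likelihood function on a grid. It assumes a Bernoulli model with independent observations, so the likelihood is the product of the individual pmf values. The data and the names `likelihood`, `grid` and `best_theta` are illustrative only.

```python
# Hypothetical observed data: 7 successes in 10 independent Bernoulli trials.
x = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]


def likelihood(theta, data):
    """L(theta): product of f(x_i | theta) under a Bernoulli(theta) model."""
    value = 1.0
    for xi in data:
        value *= theta if xi == 1 else (1.0 - theta)
    return value


# The data stay fixed; theta varies over a grid in the parameter space (0, 1).
grid = [i / 100 for i in range(1, 100)]
values = [likelihood(theta, x) for theta in grid]

# The grid point at which the likelihood is largest.
best_theta = max(grid, key=lambda theta: likelihood(theta, x))

print(f"L(0.5) = {likelihood(0.5, x):.6f}")
print(f"L(0.7) = {likelihood(0.7, x):.6f}")
print(f"theta with largest likelihood on the grid: {best_theta}")
```

With 7 successes in 10 trials, the grid maximum falls at 0.7, the sample proportion.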

Maximum likelihood estimation
MLE