Point estimation

Point estimation is the estimation of the value of the parameters of a statistical model by a single number, a point estimate, for each parameter. @todo

It is assumed that a set of observations $$x_1, x_2, \ldots, x_n$$ arises as $$n$$ independent observations from some known distribution with unknown parameter $$\theta$$.

A point estimate is the observed value of a point estimator: an estimate of the value of an unknown parameter of a statistical model.

A point estimator is a function of the random variables $$X_1, X_2, \ldots, X_n$$.

Point Estimator
a point estimator is a rule (a function of the sample) which, applied to the observed values, yields a point estimate
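As an illustration (the data here are made up), the sample mean is a point estimator of the population mean, and applying it to observed values gives a point estimate:

```python
# A hypothetical sample: n = 5 observed values x_1, ..., x_5
observations = [4.2, 3.9, 5.1, 4.4, 4.7]

# The sample mean is a point estimator of the population mean;
# evaluated at the observed values it yields a point estimate.
def sample_mean(xs):
    return sum(xs) / len(xs)

estimate = sample_mean(observations)
print(estimate)  # a single number: the point estimate
```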

Maximum likelihood estimation
MLE is a method for obtaining a point estimator $$\hat{\theta}$$ for the unknown parameter $$\theta$$.

This method obtains the maximum likelihood estimator of an unknown parameter using calculus.

$$f(x|\theta)$$ denotes the probability density function associated with $$X$$, or the pmf if $$X$$ is discrete.

Likelihood function
@todo

referred to as the likelihood for short.

the likelihood function gives a measure of how likely a value of $$\theta$$ is, given that it is known that the sample $$X_1, X_2, \ldots, X_n$$ has the values $$x_1, x_2, \ldots, x_n$$

$$L(\theta) = f(x_1|\theta) \times f(x_2|\theta) \times \cdots \times f(x_n|\theta)$$

$$ = \prod_{i=1}^n f(x_i|\theta)$$
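The product above can be computed directly. A minimal sketch, assuming an Exponential($$\theta$$) model with density $$f(x|\theta) = \theta e^{-\theta x}$$ and made-up data:

```python
import math

# Made-up sample from an assumed Exponential(theta) model,
# with density f(x | theta) = theta * exp(-theta * x).
data = [0.5, 1.2, 0.3, 0.8]

def density(x, theta):
    return theta * math.exp(-theta * x)

def likelihood(theta, xs):
    # L(theta) = product over i of f(x_i | theta)
    L = 1.0
    for x in xs:
        L *= density(x, theta)
    return L

print(likelihood(1.0, data))  # exp(-sum(data)) when theta = 1
```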

$$L(\theta)$$ is a function on the parameter space, not of the $$x_i$$, which are treated as fixed.

the best estimate of $$\theta$$ is taken to be the value that maximises the likelihood function

If the likelihood is differentiable then, as with any differentiable function, its stationary points can be determined using calculus, by solving

$$ {d \over d\theta } L(\theta) = L'(\theta)=0$$

so you are looking to determine which of these stationary points are maxima, which are minima, and which are saddle points, by taking the second derivative:

$$ {d^2 \over d\theta^2 } L(\theta) = L''(\theta)$$

If $$L''(\hat{\theta}) < 0$$, then $$\hat{\theta}$$ corresponds to a maximum.
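As a standard worked example (not from these notes): for $$n$$ independent observations from an Exponential($$\theta$$) distribution with density $$f(x|\theta) = \theta e^{-\theta x}$$,

$$L(\theta) = \prod_{i=1}^n \theta e^{-\theta x_i} = \theta^n e^{-\theta \sum_{i=1}^n x_i}$$

$$L'(\theta) = \theta^{n-1} e^{-\theta \sum_{i=1}^n x_i}\left(n - \theta \sum_{i=1}^n x_i\right) = 0 \implies \hat{\theta} = \frac{n}{\sum_{i=1}^n x_i} = \frac{1}{\bar{x}}$$

Checking that $$L''(\hat{\theta}) < 0$$ confirms this stationary point is a maximum.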

Log-likelihood function
The log-likelihood is similarly defined as

$$l(\theta) = \log L(\theta)$$, where $$\log$$ is to base $$e$$

the maximum likelihood estimator, i.e. the maximiser of $$L(\theta)$$, is also the maximiser of the log-likelihood $$l(\theta) = \log L(\theta)$$

It is usually easier to work with the log-likelihood because the log converts the product of terms into a sum, and sums are easier to work with.
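There is also a practical reason to prefer the sum: on a computer, a product of many small densities underflows to zero, while the corresponding sum of logs remains a perfectly usable finite number. A small sketch with made-up data:

```python
import math

# Standard normal density, evaluated at the made-up value x = 0
# repeated n = 1000 times.
def std_normal_pdf(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

n = 1000
xs = [0.0] * n

likelihood = 1.0
for x in xs:
    likelihood *= std_normal_pdf(x)   # underflows to 0.0

log_likelihood = sum(math.log(std_normal_pdf(x)) for x in xs)

print(likelihood)       # 0.0 (floating-point underflow)
print(log_likelihood)   # -(n/2) * log(2*pi), a finite number
```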

Maximum log-likelihood estimation
$$l(\theta) = \log L(\theta)$$

the log-likelihood is also maximised at the same point as the likelihood

If the likelihood function is not differentiable over its range, then the MLE cannot be found using calculus.

Proof of why the likelihood and the log-likelihood have the same maximiser:

Since $$\log$$ is a strictly increasing function, $$L(\theta_1) > L(\theta_2)$$ if and only if $$\log L(\theta_1) > \log L(\theta_2)$$, so the maximiser of $$L(\theta)$$ is also the maximiser of $$l(\theta) = \log L(\theta)$$. Moreover,

$$l(\theta) = \log L(\theta) = \log \prod_{i=1}^n f(x_i|\theta) = \sum_{i=1}^n \log f(x_i|\theta)$$
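This can be checked numerically: over a grid of candidate $$\theta$$ values, the likelihood and the log-likelihood peak at the same point (a sketch, using made-up exponential data):

```python
import math

# Made-up data, assumed to come from an Exponential(theta) model.
data = [0.5, 1.2, 0.3, 0.8]

def likelihood(theta):
    return math.prod(theta * math.exp(-theta * x) for x in data)

def log_likelihood(theta):
    return sum(math.log(theta) - theta * x for x in data)

grid = [0.1 * k for k in range(1, 51)]   # theta from 0.1 to 5.0
argmax_L = max(grid, key=likelihood)
argmax_l = max(grid, key=log_likelihood)
print(argmax_L, argmax_l)  # both grids peak at the same theta
```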

Estimating Normal parameters
@todo
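Presumably the @todo above is intended to cover the standard result; as a sketch, for i.i.d. Normal($$\mu$$, $$\sigma^2$$) data the MLEs are the sample mean and the sample variance with divisor $$n$$ (not $$n-1$$, so it is a biased variance estimator):

```python
# Sketch of the standard result for i.i.d. Normal(mu, sigma^2) data:
# mu_hat     = sample mean
# sigma2_hat = sum((x - mu_hat)^2) / n   (divisor n, not n - 1)
def normal_mle(xs):
    n = len(xs)
    mu_hat = sum(xs) / n
    sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / n
    return mu_hat, sigma2_hat

mu_hat, sigma2_hat = normal_mle([2.0, 4.0, 6.0])
print(mu_hat, sigma2_hat)  # 4.0 and 8/3
```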

The Cramer-Rao lower bound
the minimum possible variance achievable by any unbiased estimator is given by the Cramér–Rao lower bound, often abbreviated to CRLB
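For reference (the standard statement, assuming the usual regularity conditions on $$f$$), for an unbiased estimator $$\hat{\theta}$$ based on $$n$$ i.i.d. observations the bound reads

$$\operatorname{Var}(\hat{\theta}) \ge \frac{1}{n I(\theta)}, \qquad I(\theta) = E\!\left[\left(\frac{\partial}{\partial\theta} \log f(X|\theta)\right)^2\right]$$

where $$I(\theta)$$ is the Fisher information of a single observation.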

Unbiasedness
Minimum variance unbiased estimator