Estimator

In statistics, given a parametric model, an estimator is a function of the known sample data that is used to estimate an unknown parameter; an estimate is the result from the actual application of the function to a particular set of data. Many different estimators are possible for any given parameter. Some criterion is used to choose between the estimators, although it is often the case that a criterion cannot be used to clearly pick one estimator over another.

Context and definitions

Context

  1. (\Omega,\mathcal{A},P) is a probability space,
  2. (\mathcal{X},\Sigma) is a measurable space (the state space),
  3. Θ is a parameter space of dimension p\in\mathbb{N}^*,
  4. (Γ,S) is a measurable space,
  5. \gamma:\Theta\rightarrow\Gamma is a projection,
  6. \mathcal{F}(\Sigma) is the set of all probability distributions on (\mathcal{X},\Sigma).

For example,

  1. (\Omega,\mathcal{A},P) is any probability space,
  2. (\mathcal{X},\Sigma)=(\mathbb{R},\mathcal{B}),
  3. \Theta=\mathbb{R}\times\mathbb{R}^+,
  4. (\Gamma,S)=(\mathbb{R},\mathcal{B}),
  5. \gamma:\mathbb{R}\times\mathbb{R}^+\rightarrow\mathbb{R} is defined by γ(x,y) = x.


Estimator

If, for every n\in\mathbb{N}^*, T_n:(\mathcal{X}^n,\Sigma^n)\rightarrow(\gamma(\Theta),S) is a measurable function, then each T_n is called an estimator, (T_n)_{n\in\mathbb{N}^*} is called an estimating sequence, and any value T_n(X_1,\cdots,X_n) is called an estimate.

Continuing the example of the previous paragraph, suppose now that (X_n)_{n\in\mathbb{N}^*} is a sequence of random variables X_n:(\Omega,\mathcal{A})\rightarrow(\mathcal{X},\Sigma) that are iid with distribution F\in\mathcal{F}(\Sigma). Then T_n(X_1,\cdots,X_n):=\frac{X_1+\cdots+X_n}{n} defines an estimator. It is called the sample mean, and (T_n)_{n\in\mathbb{N}^*} is an estimating sequence for the expected value of any distribution for which the integral

\int_\Omega X\,dP

(with X distributed according to F) is defined.
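As a minimal sketch of the sample mean as a function of the data, the following Python snippet may help. The function name sample_mean, the seed, and the simulated normal data with mean 2 are assumptions chosen for illustration; they are not part of the definition above.

```python
import random

def sample_mean(xs):
    """T_n(x_1, ..., x_n) = (x_1 + ... + x_n) / n."""
    if not xs:
        raise ValueError("need at least one observation")
    return sum(xs) / len(xs)

# Hypothetical data: n = 1000 iid draws from a normal distribution with mean 2.
rng = random.Random(0)
sample = [rng.gauss(2.0, 1.0) for _ in range(1000)]

# Applying the estimator (a function) to the data yields one estimate (a number).
estimate = sample_mean(sample)
print(estimate)
```

The function is the estimator; the printed number is the estimate obtained from this particular sample.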


Desirable properties

The definition of an estimator is not very restrictive. Indeed, instead of the sample mean, we could have chosen T_n(X_1,\cdots,X_n):=X_1-X_2 to estimate the mean of the distribution F. This would be a very bad choice: the expected value of T_n is 0 whatever the mean of F is (provided that mean is finite), so T_n carries no information about it. We therefore need ways of assessing the quality of an estimator.

For an estimator T_n of the parameter θ,

  • the error is T_n − θ,
  • the bias is defined as the expected value of the error: B(T_n) := \operatorname{E}(T_n(X_1,\cdots,X_n) - \theta),
  • and the mean squared error is given by \operatorname{MSE}(T_n) = \operatorname{E}[(T_n(X_1,\cdots,X_n) - \theta)^2].
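The definitions above can be approximated by simulation. The sketch below is only an illustration under stated assumptions: the data are simulated from a normal distribution with a known mean theta = 3 (so that the error can be computed), and the helper names bias_and_mse and first_difference are hypothetical. It compares the sample mean with the poor estimator X_1 − X_2 mentioned earlier.

```python
import random

def sample_mean(xs):
    return sum(xs) / len(xs)

def first_difference(xs):
    # The poor estimator from the text: T_n = X_1 - X_2 (needs n >= 2).
    return xs[0] - xs[1]

def bias_and_mse(estimator, theta, n, trials, rng):
    """Monte Carlo approximations of E(T_n - theta) and E[(T_n - theta)^2]."""
    errors = []
    for _ in range(trials):
        xs = [rng.gauss(theta, 1.0) for _ in range(n)]
        errors.append(estimator(xs) - theta)
    bias = sum(errors) / trials
    mse = sum(e * e for e in errors) / trials
    return bias, mse

rng = random.Random(1)
theta = 3.0  # true mean, known here only because the data are simulated
bias_mean, mse_mean = bias_and_mse(sample_mean, theta, n=20, trials=5000, rng=rng)
bias_diff, mse_diff = bias_and_mse(first_difference, theta, n=20, trials=5000, rng=rng)
print("sample mean:", bias_mean, mse_mean)
print("X_1 - X_2:  ", bias_diff, mse_diff)
```

Under these assumptions the sample mean should show a bias close to 0 and a small mean squared error, while X_1 − X_2 should show a bias close to −theta and a much larger mean squared error.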

An estimator whose bias is 0 is called unbiased. Unbiasedness is a natural first requirement, though not always an essential one.

The following equality holds: \operatorname{MSE}(T_n) = \operatorname{var}(T_n) + (B(T_n))^2, i.e. mean squared error = variance + square of bias, where var(X) is the variance of X and E(X) is the expected value of X.
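For readers who want the missing step, the decomposition can be derived by adding and subtracting \operatorname{E}(T_n) inside the square. The sketch below assumes that T_n has finite variance and writes T_n for T_n(X_1,\cdots,X_n).

```latex
\begin{aligned}
\operatorname{MSE}(T_n)
  &= \operatorname{E}\bigl[(T_n - \theta)^2\bigr] \\
  &= \operatorname{E}\bigl[\bigl((T_n - \operatorname{E}(T_n)) + (\operatorname{E}(T_n) - \theta)\bigr)^2\bigr] \\
  &= \operatorname{E}\bigl[(T_n - \operatorname{E}(T_n))^2\bigr]
     + 2\,(\operatorname{E}(T_n) - \theta)\,\operatorname{E}\bigl[T_n - \operatorname{E}(T_n)\bigr]
     + (\operatorname{E}(T_n) - \theta)^2 \\
  &= \operatorname{var}(T_n) + 0 + (B(T_n))^2 .
\end{aligned}
```

The cross term vanishes because \operatorname{E}[T_n - \operatorname{E}(T_n)] = 0, and \operatorname{E}(T_n) - \theta is exactly the bias.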

The standard deviation of an estimator of θ (the square root of its variance), or an estimate of that standard deviation, is called the standard error of the estimator.


Unbiased estimators

The first quality we might expect from a useful estimator is for its expected value to be the quantity being estimated. Such an estimator is called an unbiased estimator. \widehat{\theta} is an unbiased estimator of θ iff B(\widehat{\theta}) = 0 for all θ, or, equivalently, iff \operatorname{E}(\widehat{\theta}) = \theta for all θ.
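As a worked example, the sample mean T_n from the earlier example is unbiased for the expected value of F. The symbol \mu := \operatorname{E}(X_1) is notation introduced here for illustration, and the computation assumes that \mu is finite.

```latex
\operatorname{E}(T_n)
  = \operatorname{E}\!\left(\frac{X_1+\cdots+X_n}{n}\right)
  = \frac{1}{n}\sum_{i=1}^{n}\operatorname{E}(X_i)
  = \frac{1}{n}\cdot n\mu
  = \mu .
```

Only linearity of the expected value and the identical distribution of the X_i are used; independence is not needed for this step.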

Consistency

A consistent estimator is an estimator that converges in probability to the quantity being estimated as the sample size grows.

An estimator t_n (where n is the sample size) is a consistent estimator for parameter θ if and only if, for all ε > 0, no matter how small, we have

\lim_{n\to\infty}{\rm Prob}\left\{\left|t_n-\theta\right|<\epsilon\right\}=1.

It is called strongly consistent if it converges almost surely to the true value.
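The probability in the definition of consistency can be approximated by simulation. The following sketch is an illustration only: it assumes Uniform(0, 1) data (so θ = 0.5 is the true mean), takes the sample mean as t_n, and uses a hypothetical helper named coverage together with arbitrary choices of ε, sample sizes, and number of trials.

```python
import random

def coverage(n, eps, trials, rng, theta=0.5):
    """Fraction of simulated samples of size n with |t_n - theta| < eps."""
    hits = 0
    for _ in range(trials):
        # t_n is the sample mean of n draws from Uniform(0, 1).
        t_n = sum(rng.random() for _ in range(n)) / n
        if abs(t_n - theta) < eps:
            hits += 1
    return hits / trials

rng = random.Random(2)
eps = 0.05
results = {n: coverage(n, eps, trials=200, rng=rng) for n in (5, 50, 2000)}
for n, frac in results.items():
    print(n, frac)
```

Under these assumptions the printed fractions should move towards 1 as n grows, which is what the limit in the definition describes. A simulation of this kind suggests consistency but does not prove it.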

Efficiency

The quality of an estimator is generally judged by its mean squared error.

However, occasionally one chooses the unbiased estimator with the lowest variance. Efficient estimators are those that have the lowest possible variance among all unbiased estimators. In some cases, a biased estimator may have a uniformly smaller mean squared error than does any unbiased estimator. For that and other reasons, it is sometimes preferable not to limit oneself to unbiased estimators; see bias (statistics). Concerning such "best unbiased estimators", see also Cramér-Rao inequality, Gauss-Markov theorem, Lehmann-Scheffé theorem, Rao-Blackwell theorem.
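A standard illustration of a biased estimator with smaller mean squared error concerns the variance of normally distributed data, a parameter not treated elsewhere in this article. The sketch below assumes iid normal samples and compares two estimators of the variance: the sum of squared deviations divided by n − 1 (unbiased) and divided by n + 1 (biased). The function names and simulation settings are hypothetical. It shows only that one biased estimator beats the usual unbiased one in this setting, which is weaker than the general statement above.

```python
import random

def scaled_sum_of_squares(xs, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / divisor

def mse_of_variance_estimators(n, sigma2, trials, rng):
    """Monte Carlo MSE of the divisor n - 1 and divisor n + 1 estimators."""
    se_unbiased = 0.0  # accumulated squared error, divisor n - 1 (unbiased)
    se_biased = 0.0    # accumulated squared error, divisor n + 1 (biased)
    sigma = sigma2 ** 0.5
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        se_unbiased += (scaled_sum_of_squares(xs, n - 1) - sigma2) ** 2
        se_biased += (scaled_sum_of_squares(xs, n + 1) - sigma2) ** 2
    return se_unbiased / trials, se_biased / trials

rng = random.Random(3)
mse_unbiased, mse_biased = mse_of_variance_estimators(
    n=5, sigma2=1.0, trials=20000, rng=rng
)
print(mse_unbiased, mse_biased)
```

For normal data the theoretical values are 2σ⁴/(n − 1) and 2σ⁴/(n + 1), so with n = 5 and σ² = 1 the simulated values should be near 0.5 and 0.33 respectively.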

Other properties

Often, estimators are subject to restrictions (restricted estimators).


Asymptotic value of an estimating sequence

The law of large numbers states that the sample mean converges to the mean of the sampled distribution (when that mean exists). This is what we would expect: the estimate gets better as we have more values to consider. The limit of an estimating sequence as the sample size grows to infinity is called its asymptotic value.

More precisely, suppose that, for every n\in\mathbb{N}^*, T_n:(\mathcal{X}^n,\Sigma^n)\rightarrow(\gamma(\Theta),S) is an estimator, and that there exists T(F)\in\Gamma such that, for all \epsilon>0, \lim_{n\rightarrow\infty}P(\{\omega:|T_n(X_1,\cdots,X_n)-T(F)|<\epsilon\})=1 (taking \Gamma\subseteq\mathbb{R}, as in the example). Then T:\mathcal{F}(\Sigma)\rightarrow\Gamma is called the asymptotic value of (T_n)_{n\in\mathbb{N}^*}.

Types of estimators

Several types of estimators exist, each corresponding to a different view of the problem.


Maximum likelihood estimators

Following the notation of the previous example and supposing the random variables are discrete with common probability function p, the likelihood of a sample is simply the probability of observing that particular sample: l=P((X_1,\cdots,X_n)=(x_1,\cdots,x_n))=\prod_{i=1}^n p(x_i), where the product form follows from independence. We could also, having observed a particular sample, consider the likelihood as a function of a parameter of the model. For example, if we toss two coins and get (Heads, Tails), the likelihood is p(1 − p) (supposing the probability of Heads is p and the probability of Tails is 1 − p). This is clearly a function of p, which happens to be the parameter of the Bernoulli distribution we used.

More generally:

  • For a discrete distribution p with parameter θ, the likelihood function is defined by:

l(\theta_0;x_1,\cdots,x_n):=\prod_{i=1}^n p(x_i|\theta=\theta_0).

  • For continuous distributions with parameter θ and density f(. | θ), the likelihood function is defined by:

l(\theta_0;x_1,\cdots,x_n):=\prod_{i=1}^nf(x_i|\theta=\theta_0).
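The discrete definition can be sketched in Python for the Bernoulli case used in the coin example. The function name bernoulli_likelihood and the coding of Heads as 1 and Tails as 0 are assumptions made for illustration.

```python
def bernoulli_likelihood(theta0, xs):
    """l(theta0; x_1, ..., x_n) = prod_i p(x_i | theta = theta0) for 0/1 data."""
    likelihood = 1.0
    for x in xs:
        # p(1 | theta0) = theta0 and p(0 | theta0) = 1 - theta0.
        likelihood *= theta0 if x == 1 else 1.0 - theta0
    return likelihood

# The two-coin example: (Heads, Tails) coded as (1, 0) gives p * (1 - p).
for p in (0.2, 0.5, 0.8):
    print(p, bernoulli_likelihood(p, [1, 0]))
```

The sample is held fixed and only the parameter value changes, which is the sense in which the likelihood is a function of the parameter.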

If we do not know the value of the parameter θ, we might want to find the one that is the most likely given a specific sample.

If l(\theta;x_1,\cdots,x_n) is the likelihood function in a given parametric model, then a maximum likelihood estimator T_n of the parameter θ is any value in Θ satisfying l(T_n;x_1,\cdots,x_n)=\max_{\theta\in\Theta}l(\theta;x_1,\cdots,x_n).
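The maximisation can be sketched numerically for a Bernoulli sample. The grid search below is a simple stand-in for the exact maximisation, not a general-purpose method; the names grid_mle and bernoulli_likelihood, the grid size, and the sample of tosses are all hypothetical.

```python
def bernoulli_likelihood(theta, xs):
    """Likelihood of 0/1 data xs under a Bernoulli model with parameter theta."""
    likelihood = 1.0
    for x in xs:
        likelihood *= theta if x == 1 else 1.0 - theta
    return likelihood

def grid_mle(xs, grid_size=1001):
    """Approximate the maximiser over theta in [0, 1] on a regular grid."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda theta: bernoulli_likelihood(theta, xs))

sample = [1, 0, 1, 1, 0, 1, 1, 0]  # hypothetical tosses: 5 heads, 3 tails
theta_hat = grid_mle(sample)
print(theta_hat, sum(sample) / len(sample))
```

For the Bernoulli model the maximum likelihood estimator is known to be the sample proportion of ones, so the grid result should agree with it up to the grid resolution. For long samples the product underflows, and one would usually maximise the logarithm of the likelihood instead.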

Bayes estimators

Minimax estimators

Interval estimators

References

See also

Books and lecture notes

G. Saporta, Probabilités, Analyse des Données et Statistiques, Editions TECHNIP, 1990.

Prof. R. Viertl, Angewandte Statistik, lecture notes from the Technical University of Vienna, 2004.

Prof. K. Felsenstein, Theorie statistischer Schätzung, lecture notes from the Technical University of Vienna, 2004.
