STA260 Lecture 16
MVUE
Blackwell
Exponential Family
Family of distributions
f_X(x; θ) = exp(−c(θ)T(x) + d(θ) + S(x)) for all x ∈ S, and 0 otherwise.
The natural / canonical form is slightly different.
If you want to show something is in the form of an exponential family, you can show it in the natural form or the above.
Here θ can mean multiple parameters.
General Exponential Family:
A distribution function is a member of the general exponential family if it can be expressed in the form as follows.
Notation: e^x = exp(x).
f_X(x; θ) = exp(−c(θ)T(x) + d(θ) + S(x))
c(θ) is a function of the parameters.
T(x) is a function of the data; it is a Sufficient Statistic.
d(θ) is a function of the parameters.
S(x) is a function of the data.
Note:
T(x) = 1 and T(x) = 0 are not valid sufficient statistics, because a constant captures no information about the data.
Sufficient Statistic
x = exp(ln(x)) is a useful trick.
Example:
Show Exponential distribution with parameter β is a member of the GEF (General Exponential Family).
f_X(x; θ) = exp(−c(θ)T(x) + d(θ) + S(x))
#tk remember this for test.
f_X(x; β) = (1/β) e^{−x/β}
f_X(x; β) = (1/β) exp(−x/β)
f_X(x; β) = exp(ln(1/β)) exp(−(1/β)·x)
f_X(x; β) = exp(−(1/β)·x + ln(1/β) + 0)
T(x) = x is non-trivial.
−c(β) = −1/β, so c(β) = 1/β.
d(β) = ln(1/β)
S(x) = 0
S(x) = 0 is trivial, but only T(x) is required to be non-trivial, so this is fine.
So it is a member of the GEF.
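As a quick sanity check (my own, not from the lecture; the helper names `exp_pdf` and `gef_form` are made up), the identification above can be verified numerically:

```python
import math

# Verify that the Exponential(β) density equals exp(-c(β)·T(x) + d(β) + S(x))
# with T(x) = x, c(β) = 1/β, d(β) = ln(1/β), S(x) = 0.
def exp_pdf(x, beta):
    return (1.0 / beta) * math.exp(-x / beta)

def gef_form(x, beta):
    c = 1.0 / beta             # c(β)
    T = x                      # T(x), the sufficient statistic
    d = math.log(1.0 / beta)   # d(β)
    S = 0.0                    # S(x), trivial here
    return math.exp(-c * T + d + S)

for x in [0.1, 1.0, 2.5]:
    assert math.isclose(exp_pdf(x, beta=2.0), gef_form(x, beta=2.0))
```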
Example:
Binomial
Show that the Binomial distribution with parameters p is a member of the GEF.
f_X(x; θ) = exp(−c(θ)T(x) + d(θ) + S(x))
It's only by parameter p
f_X(x; p) = C(n, x) p^x (1 − p)^{n − x}, writing C(n, x) for the binomial coefficient.
f_X(x; p) = exp(ln(C(n, x) p^x (1 − p)^{n − x}))
f_X(x; p) = exp(ln C(n, x) + x ln p + (n − x) ln(1 − p))
Using (n − x) ln(1 − p) = n ln(1 − p) − x ln(1 − p):
f_X(x; p) = exp(ln C(n, x) + x ln p − x ln(1 − p) + n ln(1 − p))
f_X(x; p) = exp(ln C(n, x) + x(ln p − ln(1 − p)) + n ln(1 − p))
To match exp(−c(p)T(x) + d(p) + S(x)), the term x(ln p − ln(1 − p)) must equal −c(p)T(x). Taking T(x) = x forces −c(p) = ln p − ln(1 − p), i.e. c(p) = ln(1 − p) − ln p = ln((1 − p)/p). (Equivalently, T(x) = −x with c(p) = ln p − ln(1 − p) works; the factorization is not unique.)
f_X(x; p) = exp(−x ln((1 − p)/p) + n ln(1 − p) + ln C(n, x))
T(x) = x
c(p) = ln((1 − p)/p)
d(p) = n ln(1 − p)
S(x) = ln C(n, x)
So it is a member of the GEF.
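A numeric check of the binomial identification (illustrative helper names; T(x) = x with c(p) = ln((1 − p)/p) is the convention assumed here):

```python
import math

# Check the Binomial(n, p) pmf against exp(-c(p)·T(x) + d(p) + S(x))
# with T(x) = x, c(p) = ln((1-p)/p), d(p) = n·ln(1-p), S(x) = ln C(n, x).
def binom_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def gef_form(x, n, p):
    c = math.log((1 - p) / p)        # c(p)
    T = x                            # T(x)
    d = n * math.log(1 - p)          # d(p)
    S = math.log(math.comb(n, x))    # S(x)
    return math.exp(-c * T + d + S)

for x in range(11):
    assert math.isclose(binom_pmf(x, 10, 0.3), gef_form(x, 10, 0.3))
```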
Example:
Normal distribution with known variance σ² and unknown mean μ.
Show that it is a member of the GEF.
f_X(x; θ) = exp(−c(θ)T(x) + d(θ) + S(x))
f_X(x; μ) = (1/√(2πσ²)) e^{−(x − μ)²/(2σ²)}
f_X(x; μ) = (1/√(2πσ²)) exp(−(x − μ)²/(2σ²))
f_X(x; μ) = exp(ln(1/√(2πσ²))) exp(−(x − μ)²/(2σ²))
f_X(x; μ) = exp(−(x − μ)²/(2σ²) − (1/2) ln(2πσ²))
Expanding the square:
−(x − μ)²/(2σ²) = −(x² − 2μx + μ²)/(2σ²) = −x²/(2σ²) + μx/σ² − μ²/(2σ²)
f_X(x; μ) = exp((μ/σ²)·x − μ²/(2σ²) − x²/(2σ²) − (1/2) ln(2πσ²))
T(x) = x
−c(μ) = μ/σ², so c(μ) = −μ/σ².
d(μ) = −μ²/(2σ²)
S(x) = −x²/(2σ²) − (1/2) ln(2πσ²)
So it is a member of the GEF.
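The signs here are easy to get wrong, so a numeric check helps (helper names are made up; the decomposition assumed is T(x) = x, c(μ) = −μ/σ², d(μ) = −μ²/(2σ²)):

```python
import math

# Check the N(μ, σ²) density (σ² known) against exp(-c(μ)·T(x) + d(μ) + S(x)).
def norm_pdf(x, mu, sigma2):
    return (1.0 / math.sqrt(2 * math.pi * sigma2)) * math.exp(-(x - mu)**2 / (2 * sigma2))

def gef_form(x, mu, sigma2):
    c = -mu / sigma2                     # c(μ)
    T = x                                # T(x)
    d = -mu**2 / (2 * sigma2)            # d(μ)
    S = -x**2 / (2 * sigma2) - 0.5 * math.log(2 * math.pi * sigma2)  # S(x)
    return math.exp(-c * T + d + S)

for x in [-1.0, 0.0, 2.0]:
    assert math.isclose(norm_pdf(x, 1.5, 4.0), gef_form(x, 1.5, 4.0))
```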
Example:
Poisson
Show that the Poisson distribution with parameter λ is a member of the GEF.
f_X(x; θ) = exp(−c(θ)T(x) + d(θ) + S(x))
f_X(x; λ) = e^{−λ} λ^x / x!
f_X(x; λ) = exp(−λ) exp(ln(λ^x)) exp(−ln(x!))
f_X(x; λ) = exp(−λ) exp(x ln λ) exp(−ln x!)
f_X(x; λ) = exp(−λ + x ln λ − ln x!)
f_X(x; λ) = exp((−ln λ)(−x) − λ − ln x!)
T(x) = −x
−c(λ) = −ln λ, so c(λ) = ln λ.
d(λ) = −λ
S(x) = −ln x!
(Equivalently, T(x) = x with c(λ) = −ln λ.) So it is a member of the GEF.
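A numeric check of the Poisson identification, keeping the notes' sign choice T(x) = −x (helper names are illustrative):

```python
import math

# Check the Poisson(λ) pmf against exp(-c(λ)·T(x) + d(λ) + S(x))
# with T(x) = -x, c(λ) = ln λ, d(λ) = -λ, S(x) = -ln x!.
def pois_pmf(x, lam):
    return math.exp(-lam) * lam**x / math.factorial(x)

def gef_form(x, lam):
    c = math.log(lam)                  # c(λ)
    T = -x                             # T(x) = -x per the notes' sign choice
    d = -lam                           # d(λ)
    S = -math.log(math.factorial(x))   # S(x)
    return math.exp(-c * T + d + S)

for x in range(6):
    assert math.isclose(pois_pmf(x, 2.3), gef_form(x, 2.3))
```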
Why is uniform not GEF?
f_X(x; θ) = 1/θ for 0 ≤ x ≤ θ, and 0 otherwise.
Inside the support we can write 1/θ = exp(ln(θ^{−1})) = exp(−ln(θ)·1 + 0 + 0), which only gives the trivial T(x) = 1.
The obstacle is the support: it depends on θ, so the indicator 1{0 ≤ x ≤ θ} ties the parameter and the data together and cannot be split into the form c(θ)T(x) + d(θ) + S(x). Since no non-trivial sufficient statistic T(x) appears in the required form, Uniform(0, θ) is not a member of the GEF.
Minimum Variance Unbiased Estimator (MVUE)
An unbiased estimator θ̂ is the MVUE for θ if Var(θ̂) ≤ Var(θ̃) for every unbiased estimator θ̃ of θ.
So it's the best unbiased estimator, in terms of variance.
Cramér-Rao Lower Bound
Let X₁, …, Xₙ be a random sample from a common distribution with parameter θ.
An estimator ĝ(X₁, …, Xₙ) is an MVUE for the parameter θ if:
ĝ is unbiased for θ.
For any other unbiased estimator ĥ(X₁, …, Xₙ), Var(ĝ(X₁, …, Xₙ)) ≤ Var(ĥ(X₁, …, Xₙ)).
It is the most efficient unbiased estimator: ĝ has the minimum variance.
Definition is in the name.
Found using the Rao-Blackwell Theorem.
Properties:
For any random variable T and function g, E[g(T) | T] = g(T).
Rao-Blackwell Theorem:
Let θ̂ be an unbiased estimator of θ with Var(θ̂) < ∞. If T is a sufficient statistic for θ and θ̂* = E[θ̂ | T], then:
E[θ̂*] = θ
Var(θ̂*) ≤ Var(θ̂)
Sufficiency is what makes θ̂* a usable estimator: the conditional distribution of θ̂ given T does not depend on θ, so θ̂* can be computed from the data alone.
Review from STA256
Let X, Y be random variables on a sample space.
Law of total expectation: E[X] = E[E[X | Y]]
#tk remember this for finals.
Law of total variance: Var(X) = Var(E[X | Y]) + E[Var(X | Y)]
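Both laws can be sanity-checked by simulation. A Monte Carlo sketch under an assumed toy model (Y ~ Bernoulli(1/2), X | Y ~ N(Y, 1); the setup and names are mine, not from the lecture):

```python
import random

# Toy model: Y ~ Bernoulli(1/2), X | Y ~ N(Y, 1).
random.seed(0)
N = 200_000
xs = []
for _ in range(N):
    y = 1.0 if random.random() < 0.5 else 0.0
    xs.append(random.gauss(y, 1.0))

mean_x = sum(xs) / N
var_x = sum((x - mean_x) ** 2 for x in xs) / N

# Law of total expectation: E[X] = E[E[X|Y]] = E[Y] = 0.5
# Law of total variance:    Var(X) = Var(E[X|Y]) + E[Var(X|Y)] = Var(Y) + 1 = 1.25
assert abs(mean_x - 0.5) < 0.05
assert abs(var_x - 1.25) < 0.05
```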
Proof:
E[θ̂*] = E[E[θ̂ | T]]
By the law of total expectation, E[θ̂*] = E[θ̂].
Since θ̂ is unbiased, E[θ̂*] = E[θ̂] = θ.
By the law of total variance:
Var(θ̂) = Var(E[θ̂ | T]) + E[Var(θ̂ | T)]
Since E[Var(θ̂ | T)] ≥ 0:
Var(θ̂) ≥ Var(E[θ̂ | T])
Var(θ̂) ≥ Var(θ̂*)
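To see the variance reduction concretely, here is a simulation sketch under an assumed Bernoulli(p) model (the setup and names are mine, not from the lecture): θ̂ = X₁ is unbiased for p, T = ΣXᵢ is sufficient, and θ̂* = E[X₁ | T] = T/n = X̄.

```python
import random

# X₁, …, Xₙ iid Bernoulli(p). θ̂ = X₁ is unbiased for p; T = ΣXᵢ is
# sufficient; Rao-Blackwellizing gives θ̂* = E[X₁ | T] = T/n = X̄.
random.seed(1)
n, p, reps = 10, 0.3, 100_000
raw, rb = [], []
for _ in range(reps):
    sample = [1 if random.random() < p else 0 for _ in range(n)]
    raw.append(sample[0])        # θ̂ = X₁
    rb.append(sum(sample) / n)   # θ̂* = X̄

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / len(v)

# Both are unbiased for p; the conditioned estimator has far smaller variance
# (theory: p(1-p) = 0.21 vs p(1-p)/n = 0.021).
assert abs(mean(raw) - p) < 0.01
assert abs(mean(rb) - p) < 0.01
assert var(rb) < var(raw)
```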