STA260 Lecture 16
MVUE
Blackwell
Exponential Family
Family of distributions
f_X(x; θ) = exp(−c(θ)T(x) + d(θ) + S(x)) for all x ∈ S, and 0 otherwise.
The natural / canonical form is slightly different.
If you want to show something is in the form of an exponential family, you can show it in the natural form or the above.
Here θ can mean multiple parameters.
General Exponential Family:
A distribution function is a member of the general exponential family if it can be expressed in the form as follows.
Notation: e^x = exp(x).
f_X(x; θ) = exp(−c(θ)T(x) + d(θ) + S(x))
c(θ) is a function of the parameters.
T(x) is a function of the data; it is a Sufficient Statistic.
d(θ) is a function of the parameters.
S(x) is a function of the data.
Note:
T(x) = 1 and T(x) = 0 are not valid sufficient statistics, because a constant captures no information about the data.
Sufficient Statistic
x = exp(ln(x)) is a useful trick.
Example:
Show Exponential distribution with parameter β is a member of the GEF (General Exponential Family).
f_X(x; θ) = exp(−c(θ)T(x) + d(θ) + S(x))
#tk remember this for test.
f_X(x; β) = (1/β) e^{−x/β}
f_X(x; β) = (1/β) exp(−x/β)
f_X(x; β) = exp(ln(1/β)) exp(−(1/β)·x)
f_X(x; β) = exp(−(1/β)·x + ln(1/β) + 0)
T(x) = x is non-trivial.
−c(β) = −1/β, so c(β) = 1/β.
d(β) = ln(1/β)
S(x) = 0
S(x) = 0 is trivial, but only T(x) is required to be non-trivial, so this is fine.
So it is a member of the GEF.
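As a quick sanity check (my own, not from the lecture; the helper names `exp_pdf` and `gef_form` are made up), the identification above can be verified numerically:

```python
import math

# Verify that the Exponential(β) density equals exp(-c(β)·T(x) + d(β) + S(x))
# with T(x) = x, c(β) = 1/β, d(β) = ln(1/β), S(x) = 0.
def exp_pdf(x, beta):
    return (1.0 / beta) * math.exp(-x / beta)

def gef_form(x, beta):
    c = 1.0 / beta             # c(β)
    T = x                      # T(x), the sufficient statistic
    d = math.log(1.0 / beta)   # d(β)
    S = 0.0                    # S(x), trivial here
    return math.exp(-c * T + d + S)

for x in [0.1, 1.0, 2.5]:
    assert math.isclose(exp_pdf(x, beta=2.0), gef_form(x, beta=2.0))
```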
Example:
Binomial
Show that the Binomial distribution with parameters p is a member of the GEF.
f_X(x; θ) = exp(−c(θ)T(x) + d(θ) + S(x))
It's only by parameter p
f_X(x; p) = C(n, x) p^x (1 − p)^{n − x}, writing C(n, x) for the binomial coefficient.
f_X(x; p) = exp(ln(C(n, x) p^x (1 − p)^{n − x}))
f_X(x; p) = exp(ln C(n, x) + x ln p + (n − x) ln(1 − p))
Using (n − x) ln(1 − p) = n ln(1 − p) − x ln(1 − p):
f_X(x; p) = exp(ln C(n, x) + x ln p − x ln(1 − p) + n ln(1 − p))
f_X(x; p) = exp(ln C(n, x) + x(ln p − ln(1 − p)) + n ln(1 − p))
To match exp(−c(p)T(x) + d(p) + S(x)), the term x(ln p − ln(1 − p)) must equal −c(p)T(x). Taking T(x) = x forces −c(p) = ln p − ln(1 − p), i.e. c(p) = ln(1 − p) − ln p = ln((1 − p)/p). (Equivalently, T(x) = −x with c(p) = ln p − ln(1 − p) works; the factorization is not unique.)
f_X(x; p) = exp(−x ln((1 − p)/p) + n ln(1 − p) + ln C(n, x))
T(x) = x
c(p) = ln((1 − p)/p)
d(p) = n ln(1 − p)
S(x) = ln C(n, x)
So it is a member of the GEF.
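A numeric check of the binomial identification (illustrative helper names; T(x) = x with c(p) = ln((1 − p)/p) is the convention assumed here):

```python
import math

# Check the Binomial(n, p) pmf against exp(-c(p)·T(x) + d(p) + S(x))
# with T(x) = x, c(p) = ln((1-p)/p), d(p) = n·ln(1-p), S(x) = ln C(n, x).
def binom_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def gef_form(x, n, p):
    c = math.log((1 - p) / p)        # c(p)
    T = x                            # T(x)
    d = n * math.log(1 - p)          # d(p)
    S = math.log(math.comb(n, x))    # S(x)
    return math.exp(-c * T + d + S)

for x in range(11):
    assert math.isclose(binom_pmf(x, 10, 0.3), gef_form(x, 10, 0.3))
```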
Example:
Normal distribution with known variance σ² and unknown mean μ.
Show that it is a member of the GEF.
f_X(x; θ) = exp(−c(θ)T(x) + d(θ) + S(x))
f_X(x; μ) = (1/√(2πσ²)) e^{−(x − μ)²/(2σ²)}
f_X(x; μ) = (1/√(2πσ²)) exp(−(x − μ)²/(2σ²))
f_X(x; μ) = exp(ln(1/√(2πσ²))) exp(−(x − μ)²/(2σ²))
f_X(x; μ) = exp(−(x − μ)²/(2σ²) − (1/2) ln(2πσ²))
Expanding the square:
−(x − μ)²/(2σ²) = −(x² − 2μx + μ²)/(2σ²) = −x²/(2σ²) + μx/σ² − μ²/(2σ²)
f_X(x; μ) = exp((μ/σ²)·x − μ²/(2σ²) − x²/(2σ²) − (1/2) ln(2πσ²))
T(x) = x
−c(μ) = μ/σ², so c(μ) = −μ/σ².
d(μ) = −μ²/(2σ²)
S(x) = −x²/(2σ²) − (1/2) ln(2πσ²)
So it is a member of the GEF.
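The signs here are easy to get wrong, so a numeric check helps (helper names are made up; the decomposition assumed is T(x) = x, c(μ) = −μ/σ², d(μ) = −μ²/(2σ²)):

```python
import math

# Check the N(μ, σ²) density (σ² known) against exp(-c(μ)·T(x) + d(μ) + S(x)).
def norm_pdf(x, mu, sigma2):
    return (1.0 / math.sqrt(2 * math.pi * sigma2)) * math.exp(-(x - mu)**2 / (2 * sigma2))

def gef_form(x, mu, sigma2):
    c = -mu / sigma2                     # c(μ)
    T = x                                # T(x)
    d = -mu**2 / (2 * sigma2)            # d(μ)
    S = -x**2 / (2 * sigma2) - 0.5 * math.log(2 * math.pi * sigma2)  # S(x)
    return math.exp(-c * T + d + S)

for x in [-1.0, 0.0, 2.0]:
    assert math.isclose(norm_pdf(x, 1.5, 4.0), gef_form(x, 1.5, 4.0))
```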
Example:
Poisson
Show that the Poisson distribution with parameter λ is a member of the GEF.
f_X(x; θ) = exp(−c(θ)T(x) + d(θ) + S(x))
f_X(x; λ) = e^{−λ} λ^x / x!
f_X(x; λ) = exp(−λ) exp(ln(λ^x)) exp(−ln(x!))
f_X(x; λ) = exp(−λ) exp(x ln λ) exp(−ln x!)
f_X(x; λ) = exp(−λ + x ln λ − ln x!)
f_X(x; λ) = exp((−ln λ)(−x) − λ − ln x!)
T(x) = −x
−c(λ) = −ln λ, so c(λ) = ln λ.
d(λ) = −λ
S(x) = −ln x!
(Equivalently, T(x) = x with c(λ) = −ln λ.) So it is a member of the GEF.
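A numeric check of the Poisson identification, keeping the notes' sign choice T(x) = −x (helper names are illustrative):

```python
import math

# Check the Poisson(λ) pmf against exp(-c(λ)·T(x) + d(λ) + S(x))
# with T(x) = -x, c(λ) = ln λ, d(λ) = -λ, S(x) = -ln x!.
def pois_pmf(x, lam):
    return math.exp(-lam) * lam**x / math.factorial(x)

def gef_form(x, lam):
    c = math.log(lam)                  # c(λ)
    T = -x                             # T(x) = -x per the notes' sign choice
    d = -lam                           # d(λ)
    S = -math.log(math.factorial(x))   # S(x)
    return math.exp(-c * T + d + S)

for x in range(6):
    assert math.isclose(pois_pmf(x, 2.3), gef_form(x, 2.3))
```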
Why is uniform not GEF?
f_X(x; θ) = 1/θ for 0 ≤ x ≤ θ, and 0 otherwise.
Inside the support we can write 1/θ = exp(ln(θ^{−1})) = exp(−ln(θ)·1 + 0 + 0), which only gives the trivial T(x) = 1.
The obstacle is the support: it depends on θ, so the indicator 1{0 ≤ x ≤ θ} ties the parameter and the data together and cannot be split into the form c(θ)T(x) + d(θ) + S(x). Since no non-trivial sufficient statistic T(x) appears in the required form, Uniform(0, θ) is not a member of the GEF.
Minimum Variance Unbiased Estimator (MVUE)
An unbiased estimator θ̂ is the MVUE for θ if Var(θ̂) ≤ Var(θ̃) for every unbiased estimator θ̃ of θ.
So it's the best unbiased estimator, in terms of variance.
Cramér-Rao Lower Bound
Let X₁, …, Xₙ be a random sample from a common distribution with parameter θ.
An estimator ĝ(X₁, …, Xₙ) is an MVUE for the parameter θ if:
ĝ is unbiased for θ.
For any other unbiased estimator ĥ(X₁, …, Xₙ), Var(ĝ(X₁, …, Xₙ)) ≤ Var(ĥ(X₁, …, Xₙ)).
It is the most efficient unbiased estimator: ĝ has the minimum variance.
Definition is in the name.
Found using the Rao-Blackwell Theorem.
Properties:
For any random variable T and function g, E[g(T) | T] = g(T).
Rao-Blackwell Theorem:
Let θ̂ be an unbiased estimator of θ with Var(θ̂) < ∞. If T is a sufficient statistic for θ and θ̂* = E[θ̂ | T], then:
E[θ̂*] = θ
Var(θ̂*) ≤ Var(θ̂)
Sufficiency is what makes θ̂* a usable estimator: the conditional distribution of θ̂ given T does not depend on θ, so θ̂* can be computed from the data alone.
Review from STA256
Let X, Y be random variables on a sample space.
Law of total expectation: E[X] = E[E[X | Y]]
#tk remember this for finals.
Law of total variance: Var(X) = Var(E[X | Y]) + E[Var(X | Y)]
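Both laws can be sanity-checked by simulation. A Monte Carlo sketch under an assumed toy model (Y ~ Bernoulli(1/2), X | Y ~ N(Y, 1); the setup and names are mine, not from the lecture):

```python
import random

# Toy model: Y ~ Bernoulli(1/2), X | Y ~ N(Y, 1).
random.seed(0)
N = 200_000
xs = []
for _ in range(N):
    y = 1.0 if random.random() < 0.5 else 0.0
    xs.append(random.gauss(y, 1.0))

mean_x = sum(xs) / N
var_x = sum((x - mean_x) ** 2 for x in xs) / N

# Law of total expectation: E[X] = E[E[X|Y]] = E[Y] = 0.5
# Law of total variance:    Var(X) = Var(E[X|Y]) + E[Var(X|Y)] = Var(Y) + 1 = 1.25
assert abs(mean_x - 0.5) < 0.05
assert abs(var_x - 1.25) < 0.05
```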
Proof:
E[θ̂*] = E[E[θ̂ | T]]
By the law of total expectation, E[θ̂*] = E[θ̂].
Since θ̂ is unbiased, E[θ̂*] = E[θ̂] = θ.
By the law of total variance:
Var(θ̂) = Var(E[θ̂ | T]) + E[Var(θ̂ | T)]
Since E[Var(θ̂ | T)] ≥ 0:
Var(θ̂) ≥ Var(E[θ̂ | T])
Var(θ̂) ≥ Var(θ̂*)
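To see the variance reduction concretely, here is a simulation sketch under an assumed Bernoulli(p) model (the setup and names are mine, not from the lecture): θ̂ = X₁ is unbiased for p, T = ΣXᵢ is sufficient, and θ̂* = E[X₁ | T] = T/n = X̄.

```python
import random

# X₁, …, Xₙ iid Bernoulli(p). θ̂ = X₁ is unbiased for p; T = ΣXᵢ is
# sufficient; Rao-Blackwellizing gives θ̂* = E[X₁ | T] = T/n = X̄.
random.seed(1)
n, p, reps = 10, 0.3, 100_000
raw, rb = [], []
for _ in range(reps):
    sample = [1 if random.random() < p else 0 for _ in range(n)]
    raw.append(sample[0])        # θ̂ = X₁
    rb.append(sum(sample) / n)   # θ̂* = X̄

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / len(v)

# Both are unbiased for p; the conditioned estimator has far smaller variance
# (theory: p(1-p) = 0.21 vs p(1-p)/n = 0.021).
assert abs(mean(raw) - p) < 0.01
assert abs(mean(rb) - p) < 0.01
assert var(rb) < var(raw)
```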