STA260 Lecture 11
-
Completed Notes Status
- Completed insertions: 7
- Ambiguities left unresolved: none
-
Lecture Summary
- Central objective: Extend Maximum Likelihood Estimation theory by establishing asymptotic normality of MLEs via Fisher Information, then introduce Bayesian Inference as an alternative paradigm that updates prior beliefs with data to produce posterior distributions.
- Key concepts:
- Fisher Information: Quantifies the precision of parameter estimation; high Fisher Information indicates steep likelihood curvature and precise estimates, while low Fisher Information indicates flat curvature and imprecise estimates[1].
- Consistent Estimator: Under regularity conditions, the MLE $\hat{\theta}_n$ converges in distribution to $N\!\left(\theta, \frac{1}{n I(\theta)}\right)$ as sample size $n \to \infty$, enabling asymptotic inference.
- Bayesian Inference: Uses Bayes' Theorem to combine prior $\pi(\theta)$ and likelihood $L(\theta \mid x)$ into posterior $\pi(\theta \mid x) \propto L(\theta \mid x)\,\pi(\theta)$, eliminating the intractable marginal likelihood constant $m(x) = \int L(\theta \mid x)\,\pi(\theta)\,d\theta$[2][3].
- Conjugate Prior: When the prior and posterior belong to the same distributional family, computation simplifies; for Poisson data, the Gamma Distribution is the Conjugate Prior, yielding a gamma posterior[4][5][6].
- Connections:
- The asymptotic variance $\frac{1}{n I(\theta)}$ mirrors the variance structure $\frac{\sigma^2}{n}$ in the Central Limit Theorem, showing that Fisher Information acts as an "effective precision" parameter.
- Bayesian Inference reinterprets parameters as random variables updated by data, contrasting with frequentist MLE, which treats parameters as fixed unknowns.
- Conjugate priors enable closed-form posteriors, bypassing numerical integration of the marginal likelihood.
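The gamma-Poisson conjugate update summarised above can be sketched in a few lines of code. This is a minimal illustration: the prior parameters and the data are made up for the example, not taken from the lecture.

```python
# Minimal sketch of the gamma-Poisson conjugate update. With prior
# lambda ~ Gamma(alpha, beta) (shape/rate parameterisation) and iid
# Poisson(lambda) observations x_1..x_n, the posterior is
# Gamma(alpha + sum(x), beta + n) -- no integration required.

def gamma_poisson_posterior(alpha, beta, data):
    """Return the (shape, rate) pair of the gamma posterior."""
    return alpha + sum(data), beta + len(data)

alpha0, beta0 = 2.0, 1.0        # prior Gamma(2, 1), prior mean 2 (illustrative)
data = [3, 1, 4, 2, 5]          # observed Poisson counts (made up)

a_post, b_post = gamma_poisson_posterior(alpha0, beta0, data)
print((a_post, b_post))         # (17.0, 6.0)
print(a_post / b_post)          # posterior mean of lambda, about 2.833
```

The closed form is exactly why conjugacy matters: the update is two additions, with the normalising constant never computed.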
-
TK Resolutions
- #tk: Practice calculating Fisher Information using the first derivative method (variance of the score function) in addition to the second derivative method.
- Answer: The first derivative method computes $I(\theta) = \mathrm{Var}\!\left(\frac{\partial}{\partial\theta} \log f(X; \theta)\right)$. For [pages/Poisson Distribution|Poisson], the score is $\frac{\partial}{\partial\lambda} \log f(X; \lambda) = \frac{X}{\lambda} - 1$. Then $\mathrm{Var}\!\left(\frac{X}{\lambda} - 1\right) = \frac{\mathrm{Var}(X)}{\lambda^2} = \frac{\lambda}{\lambda^2} = \frac{1}{\lambda}$, matching the second derivative result[1:1][7].
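The score-variance calculation above can be sanity-checked by Monte Carlo: simulate Poisson draws, compute the score for each, and compare the sample variance to $1/\lambda$. This is a sketch; the sampler, seed, and sample size are arbitrary choices for illustration.

```python
import math
import random

# Sketch: estimate the Poisson Fisher Information as the variance of the
# score d/dlam log f(X; lam) = X/lam - 1, which should approach
# I(lam) = 1/lam. The value lam = 4 and the sample size are illustrative.
random.seed(0)
lam = 4.0

def poisson_draw(lam):
    """Knuth's product-of-uniforms Poisson sampler (fine for small lam)."""
    threshold, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= threshold:
            return k
        k += 1

scores = [poisson_draw(lam) / lam - 1.0 for _ in range(200_000)]
mean = sum(scores) / len(scores)
var = sum((s - mean) ** 2 for s in scores) / (len(scores) - 1)
print(var)   # close to 1/lam = 0.25
```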
-
Practice Questions
- Remember/Understand:
- Define Fisher Information in terms of the second derivative of the log-likelihood and explain why high Fisher Information corresponds to precise parameter estimation.
- State Bayes' Theorem for parameter estimation and identify the roles of prior, likelihood, marginal likelihood, and posterior.
- What does it mean for an estimator to be consistent, and why is consistency a desirable property of MLEs?
- Apply/Analyse:
- Given $X_1, \dots, X_n \overset{\text{iid}}{\sim} \text{Poisson}(\lambda)$, calculate the Fisher Information using both the second derivative method and the variance of the score function.
- For $X \sim \text{Binomial}(n, p)$ with prior $p \sim \text{Beta}(\alpha, \beta)$, derive the posterior distribution and verify that the Beta Distribution is a Conjugate Prior for the Binomial Distribution.
- Using Theorem 10, find the asymptotic distribution of the MLE $\hat{\lambda} = \bar{X}$ for $X_1, \dots, X_n \overset{\text{iid}}{\sim} \text{Poisson}(\lambda)$.
- Evaluate/Create:
- Compare and contrast the frequentist MLE approach and the Bayesian approach in terms of parameter interpretation, incorporation of prior knowledge, and computational complexity.
- Design a simulation study to verify that the asymptotic normality result $\sqrt{n}\,(\hat{\lambda} - \lambda) \xrightarrow{d} N(0, \lambda)$ holds for Poisson MLEs under various sample sizes and parameter values.
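One way to prototype the simulation-study question above: standardise each replicated MLE and check that the resulting values have mean near 0 and variance near 1. This is a sketch only; the choices of $\lambda$, sample size, replication count, and seed are arbitrary.

```python
import math
import random

# Sketch of the simulation study: check that
#   z = sqrt(n) * (lam_hat - lam) / sqrt(lam)
# looks standard normal, where lam_hat = sample mean is the Poisson MLE.
# lam, n, and reps are illustrative choices, not from the lecture.
random.seed(1)
lam, n, reps = 3.0, 400, 2000

def poisson_draw(lam):
    """Knuth's product-of-uniforms Poisson sampler."""
    threshold, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= threshold:
            return k
        k += 1

zs = []
for _ in range(reps):
    xbar = sum(poisson_draw(lam) for _ in range(n)) / n   # the MLE lam_hat
    zs.append(math.sqrt(n) * (xbar - lam) / math.sqrt(lam))

z_mean = sum(zs) / reps
z_var = sum((z - z_mean) ** 2 for z in zs) / (reps - 1)
print(z_mean, z_var)   # should be near 0 and 1 respectively
```

A fuller study would repeat this over a grid of $n$ and $\lambda$ values and add a normality check (e.g. a QQ plot) on the standardised values.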
-
Challenging Concepts
- Fisher Information:
- Why it's challenging: The equivalence of the two definitions ($I(\theta) = -E\!\left[\frac{\partial^2}{\partial\theta^2} \log f(X; \theta)\right]$ and $I(\theta) = \mathrm{Var}\!\left(\frac{\partial}{\partial\theta} \log f(X; \theta)\right)$) is non-obvious and requires integration by parts and regularity conditions; interpreting Fisher Information as curvature of the log-likelihood is abstract.
- Study strategy: Work through the proof of equivalence using the Leibniz rule and practice computing Fisher Information for multiple distributions (exponential, normal, geometric) using both methods to build intuition[1:2][7:1].
- Posterior Distribution proportionality:
- Why it's challenging: Understanding why the marginal likelihood $m(x) = \int L(\theta \mid x)\,\pi(\theta)\,d\theta$ can be discarded as a normalising constant, and how to recognise distributional kernels to identify posteriors, requires pattern recognition and algebraic manipulation.
- Study strategy: Memorise kernels of common distributions (gamma, beta, normal); practice "proportionality gymnastics" by working through multiple Conjugate Prior examples (beta-binomial, gamma-Poisson, normal-normal) and explicitly identifying terms that depend on $\theta$ versus data-only constants[3:1][6:1].
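The "proportionality gymnastics" for the beta-binomial case can be verified numerically: normalise the posterior kernel by brute force and compare it to the exact Beta density. Everything below (prior, data, grid size) is an illustrative assumption, not material from the lecture.

```python
import math

# Sketch: a Binomial(n, p) likelihood times a Beta(a, b) prior. Dropping
# every factor free of p leaves the kernel
#   p**(a + x - 1) * (1 - p)**(b + n - x - 1),
# which is the kernel of Beta(a + x, b + n - x). We confirm by normalising
# the kernel numerically and comparing to the exact Beta pdf.

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_pdf(p, a, b):
    return math.exp((a - 1) * math.log(p) + (b - 1) * math.log(1 - p) - log_beta(a, b))

a, b = 2.0, 3.0          # prior Beta(2, 3) (made up)
n, x = 10, 7             # 7 successes in 10 trials (made up)

kernel = lambda p: p ** (a + x - 1) * (1 - p) ** (b + n - x - 1)
grid = [i / 10000 for i in range(1, 10000)]
Z = sum(kernel(p) for p in grid) / 10000    # crude Riemann normaliser

post_at_half = kernel(0.5) / Z
print(post_at_half, beta_pdf(0.5, a + x, b + n - x))   # should agree closely
```

The point of the exercise: the normaliser $Z$ we computed numerically is exactly the marginal-likelihood constant that conjugacy lets you skip.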
-
Action Plan
- Immediate review actions:
- Practice and application:
- Deep dive study:
- Verification and integration:
-
Footnotes