STA260 Lecture 11
-
Completed Notes Status
- Completed insertions: 7
- Ambiguities left unresolved: none
-
Lecture Summary
- Central objective: Extend Maximum Likelihood Estimation theory by establishing asymptotic normality of MLEs via Fisher Information, then introduce Bayesian Inference as an alternative paradigm that updates prior beliefs with data to produce posterior distributions.
- Key concepts:
- Fisher Information: Quantifies the precision of parameter estimation; high Fisher Information indicates steep likelihood curvature and precise estimates, while low Fisher Information indicates flat curvature and imprecise estimates[1].
- Consistent Estimator: Under regularity conditions, the MLE $\hat{\theta}_n$ converges in distribution to $N\!\left(\theta, \frac{1}{n I(\theta)}\right)$ as sample size $n \to \infty$, enabling asymptotic inference.
- Bayesian Inference: Uses Bayes' Theorem to combine prior $\pi(\theta)$ and likelihood $L(\theta \mid x)$ into posterior $\pi(\theta \mid x) \propto L(\theta \mid x)\,\pi(\theta)$, eliminating the intractable marginal likelihood constant $m(x) = \int L(\theta \mid x)\,\pi(\theta)\,d\theta$[2][3].
- Conjugate Prior: When the prior and posterior belong to the same distributional family, computation simplifies; for Poisson data, the Gamma Distribution is the Conjugate Prior, yielding a gamma posterior[4][5][6].
- Connections:
- The asymptotic variance $\frac{1}{n I(\theta)}$ mirrors the variance structure $\frac{\sigma^2}{n}$ in the Central Limit Theorem, showing that Fisher Information acts as an "effective precision" parameter.
- Bayesian Inference reinterprets parameters as random variables updated by data, contrasting with frequentist MLE, which treats parameters as fixed unknowns.
- Conjugate priors enable closed-form posteriors, bypassing numerical integration of the marginal likelihood.
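The gamma-Poisson conjugate update summarised above can be sketched in a few lines of code. This is a minimal illustration: the prior parameters and the data are made up for the example, not taken from the lecture.

```python
# Minimal sketch of the gamma-Poisson conjugate update. With prior
# lambda ~ Gamma(alpha, beta) (shape/rate parameterisation) and iid
# Poisson(lambda) observations x_1..x_n, the posterior is
# Gamma(alpha + sum(x), beta + n) -- no integration required.

def gamma_poisson_posterior(alpha, beta, data):
    """Return the (shape, rate) pair of the gamma posterior."""
    return alpha + sum(data), beta + len(data)

alpha0, beta0 = 2.0, 1.0        # prior Gamma(2, 1), prior mean 2 (illustrative)
data = [3, 1, 4, 2, 5]          # observed Poisson counts (made up)

a_post, b_post = gamma_poisson_posterior(alpha0, beta0, data)
print((a_post, b_post))         # (17.0, 6.0)
print(a_post / b_post)          # posterior mean of lambda, about 2.833
```

The closed form is exactly why conjugacy matters: the update is two additions, with the normalising constant never computed.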
-
TK Resolutions
- #tk: Practice calculating Fisher Information using the first derivative method (variance of the score function) in addition to the second derivative method.
- Answer: The first derivative method computes $I(\theta) = \mathrm{Var}\!\left(\frac{\partial}{\partial\theta} \log f(X; \theta)\right)$. For [pages/Poisson Distribution|Poisson], the score is $\frac{\partial}{\partial\lambda} \log f(X; \lambda) = \frac{X}{\lambda} - 1$. Then $\mathrm{Var}\!\left(\frac{X}{\lambda} - 1\right) = \frac{\mathrm{Var}(X)}{\lambda^2} = \frac{\lambda}{\lambda^2} = \frac{1}{\lambda}$, matching the second derivative result[1:1][7].
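The score-variance calculation above can be sanity-checked by Monte Carlo: simulate Poisson draws, compute the score for each, and compare the sample variance to $1/\lambda$. This is a sketch; the sampler, seed, and sample size are arbitrary choices for illustration.

```python
import math
import random

# Sketch: estimate the Poisson Fisher Information as the variance of the
# score d/dlam log f(X; lam) = X/lam - 1, which should approach
# I(lam) = 1/lam. The value lam = 4 and the sample size are illustrative.
random.seed(0)
lam = 4.0

def poisson_draw(lam):
    """Knuth's product-of-uniforms Poisson sampler (fine for small lam)."""
    threshold, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= threshold:
            return k
        k += 1

scores = [poisson_draw(lam) / lam - 1.0 for _ in range(200_000)]
mean = sum(scores) / len(scores)
var = sum((s - mean) ** 2 for s in scores) / (len(scores) - 1)
print(var)   # close to 1/lam = 0.25
```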
-
Practice Questions
- Remember/Understand:
- Define Fisher Information in terms of the second derivative of the log-likelihood and explain why high Fisher Information corresponds to precise parameter estimation.
- State Bayes' Theorem for parameter estimation and identify the roles of prior, likelihood, marginal likelihood, and posterior.
- What does it mean for an estimator to be consistent, and why is consistency a desirable property of MLEs?
- Apply/Analyse:
- Given $X_1, \dots, X_n \overset{\text{iid}}{\sim} \text{Poisson}(\lambda)$, calculate the Fisher Information using both the second derivative method and the variance of the score function.
- For $X \sim \text{Binomial}(n, p)$ with prior $p \sim \text{Beta}(\alpha, \beta)$, derive the posterior distribution and verify that the Beta Distribution is a Conjugate Prior for the Binomial Distribution.
- Using Theorem 10, find the asymptotic distribution of the MLE $\hat{\lambda} = \bar{X}$ for $X_1, \dots, X_n \overset{\text{iid}}{\sim} \text{Poisson}(\lambda)$.
- Evaluate/Create:
- Compare and contrast the frequentist MLE approach and the Bayesian approach in terms of parameter interpretation, incorporation of prior knowledge, and computational complexity.
- Design a simulation study to verify that the asymptotic normality result $\sqrt{n}\,(\hat{\lambda} - \lambda) \xrightarrow{d} N(0, \lambda)$ holds for Poisson MLEs under various sample sizes and parameter values.
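One way to prototype the simulation-study question above: standardise each replicated MLE and check that the resulting values have mean near 0 and variance near 1. This is a sketch only; the choices of $\lambda$, sample size, replication count, and seed are arbitrary.

```python
import math
import random

# Sketch of the simulation study: check that
#   z = sqrt(n) * (lam_hat - lam) / sqrt(lam)
# looks standard normal, where lam_hat = sample mean is the Poisson MLE.
# lam, n, and reps are illustrative choices, not from the lecture.
random.seed(1)
lam, n, reps = 3.0, 400, 2000

def poisson_draw(lam):
    """Knuth's product-of-uniforms Poisson sampler."""
    threshold, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= threshold:
            return k
        k += 1

zs = []
for _ in range(reps):
    xbar = sum(poisson_draw(lam) for _ in range(n)) / n   # the MLE lam_hat
    zs.append(math.sqrt(n) * (xbar - lam) / math.sqrt(lam))

z_mean = sum(zs) / reps
z_var = sum((z - z_mean) ** 2 for z in zs) / (reps - 1)
print(z_mean, z_var)   # should be near 0 and 1 respectively
```

A fuller study would repeat this over a grid of $n$ and $\lambda$ values and add a normality check (e.g. a QQ plot) on the standardised values.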
-
Challenging Concepts
- Fisher Information:
- Why it's challenging: The equivalence of the two definitions ($I(\theta) = -E\!\left[\frac{\partial^2}{\partial\theta^2} \log f(X; \theta)\right]$ and $I(\theta) = \mathrm{Var}\!\left(\frac{\partial}{\partial\theta} \log f(X; \theta)\right)$) is non-obvious and requires integration by parts and regularity conditions; interpreting Fisher Information as curvature of the log-likelihood is abstract.
- Study strategy: Work through the proof of equivalence using the Leibniz rule and practice computing Fisher Information for multiple distributions (exponential, normal, geometric) using both methods to build intuition[1:2][7:1].
- Posterior Distribution proportionality:
- Why it's challenging: Understanding why the marginal likelihood $m(x) = \int L(\theta \mid x)\,\pi(\theta)\,d\theta$ can be discarded as a normalising constant, and how to recognise distributional kernels to identify posteriors, requires pattern recognition and algebraic manipulation.
- Study strategy: Memorise kernels of common distributions (gamma, beta, normal); practice "proportionality gymnastics" by working through multiple Conjugate Prior examples (beta-binomial, gamma-Poisson, normal-normal) and explicitly identifying terms that depend on $\theta$ versus data-only constants[3:1][6:1].
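The "proportionality gymnastics" for the beta-binomial case can be verified numerically: normalise the posterior kernel by brute force and compare it to the exact Beta density. Everything below (prior, data, grid size) is an illustrative assumption, not material from the lecture.

```python
import math

# Sketch: a Binomial(n, p) likelihood times a Beta(a, b) prior. Dropping
# every factor free of p leaves the kernel
#   p**(a + x - 1) * (1 - p)**(b + n - x - 1),
# which is the kernel of Beta(a + x, b + n - x). We confirm by normalising
# the kernel numerically and comparing to the exact Beta pdf.

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_pdf(p, a, b):
    return math.exp((a - 1) * math.log(p) + (b - 1) * math.log(1 - p) - log_beta(a, b))

a, b = 2.0, 3.0          # prior Beta(2, 3) (made up)
n, x = 10, 7             # 7 successes in 10 trials (made up)

kernel = lambda p: p ** (a + x - 1) * (1 - p) ** (b + n - x - 1)
grid = [i / 10000 for i in range(1, 10000)]
Z = sum(kernel(p) for p in grid) / 10000    # crude Riemann normaliser

post_at_half = kernel(0.5) / Z
print(post_at_half, beta_pdf(0.5, a + x, b + n - x))   # should agree closely
```

The point of the exercise: the normaliser $Z$ we computed numerically is exactly the marginal-likelihood constant that conjugacy lets you skip.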
-
Action Plan
- Immediate review actions:
- Practice and application:
- Deep dive study:
- Verification and integration:
-
Footnotes