STA260 Lecture 03

STA260 Lecture 03 Raw

Example Problem 1: Probability Calculation

Example Problem 2: Sample Size Determination

Review: Convergence Concepts

Example Problem 3: CLT Application

Responses to #tk Flags

#tk Item 1: Linear Combinations of Normal Variables

Context: "If XiN(μi,σi2) Then i=1naiXiN(i=1naiμi,i=1nai2σi2) #tk expected to remember"

Explanation:
This property is fundamental to statistical theory involving normal distributions. It states that any linear combination of independent normal random variables is itself normally distributed.

There are three key components to remember for this formula:

  1. Normality: The sum of normal variables remains normal. It does not change shape to some other distribution.
  2. Mean (Linearity of Expectation): The expected value operator is linear: $E\!\left[\sum a_i X_i\right] = \sum a_i E[X_i] = \sum a_i \mu_i$.
  3. Variance (Independence): The variance operator is not linear in the same way. When the variables are independent, the variance of a sum is the sum of the variances. Crucially, constants pull out squared: $\operatorname{Var}\!\left(\sum a_i X_i\right) = \sum \operatorname{Var}(a_i X_i) = \sum a_i^2 \operatorname{Var}(X_i) = \sum a_i^2 \sigma_i^2$.

Relevance to Lecture: This explains why the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is normal. Here, each $a_i = \frac{1}{n}$, so $\bar{X} \sim N(\mu, \sigma^2/n)$ when the $X_i$ are i.i.d. $N(\mu, \sigma^2)$.
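
As a quick sanity check on this property, here is a minimal simulation sketch. The coefficients and parameters ($a_1 = 0.5$, $a_2 = 3$, $X_1 \sim N(1, 2^2)$, $X_2 \sim N(-2, 1.5^2)$) are arbitrary illustration values, not numbers from the lecture.

```python
import numpy as np

rng = np.random.default_rng(42)

# Arbitrary illustration values (not from the lecture):
# X1 ~ N(1, 2^2), X2 ~ N(-2, 1.5^2), independent, with coefficients a1 = 0.5, a2 = 3.
a = np.array([0.5, 3.0])
mu = np.array([1.0, -2.0])
sd = np.array([2.0, 1.5])

samples = rng.normal(loc=mu, scale=sd, size=(200_000, 2))  # independent normal draws
combo = samples @ a                                        # a1*X1 + a2*X2

print("empirical mean:", combo.mean(), "  theory:", a @ mu)
print("empirical var: ", combo.var(),  "  theory:", (a**2) @ sd**2)
```

The empirical mean and variance should land close to $\sum a_i \mu_i$ and $\sum a_i^2 \sigma_i^2$, and a histogram of `combo` would look normal.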


Lecture Summary

Main Thesis:
This lecture establishes the foundational machinery for statistical inference by demonstrating how to calculate probabilities for sample means using the Central Limit Theorem (CLT) and the Standard Normal Distribution. It connects the theoretical concept of Convergence in Distribution to the practical application of approximating probabilities for large samples, even when the underlying population distribution is unknown.

Key Concepts:

  1. Sampling Distribution of $\bar{X}$: If the population is normal, $\bar{X}$ is exactly normal. If the population is non-normal but $n$ is large, $\bar{X}$ is approximately normal (CLT) with mean $\mu$ and variance $\sigma^2/n$.
  2. Standardization (Z-score): To compute probabilities, any normal sample mean $\bar{X}$ can be transformed into a standard normal variable $Z$ using the formula $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$.
  3. Sample Size Determination: We can reverse-engineer the probability formula to find the minimum sample size $n$ required to ensure the sample mean falls within a specific margin of error with a desired confidence level (e.g., 95%); see the sketch after this list.
  4. Convergence in Distribution: The formal mathematical definition involves the limit of Cumulative Distribution Functions (CDFs). Specifically, $X_n \xrightarrow{D} X$ means the CDF of $X_n$ approaches the CDF of the target distribution $X$ (at every continuity point of that CDF) as $n \to \infty$.
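
A minimal numerical sketch of items 2 and 3, using SciPy's standard normal CDF and quantile functions. All the numbers here ($\mu = 50$, $\sigma = 12$, a threshold of 52, a margin of 2 units, 95% confidence) are made-up illustration values, not figures from the lecture.

```python
from math import ceil, sqrt
from scipy.stats import norm

mu, sigma = 50.0, 12.0          # hypothetical population parameters

# Item 2: P(Xbar > 52) for n = 36, via Z = (xbar - mu) / (sigma / sqrt(n)).
n = 36
z = (52.0 - mu) / (sigma / sqrt(n))
print("P(Xbar > 52) ≈", norm.sf(z))            # sf = 1 - CDF

# Item 3: smallest n with P(|Xbar - mu| > 2) <= 0.05,
# i.e. z_{0.025} * sigma / sqrt(n) <= 2  =>  n >= (z_{0.025} * sigma / 2)^2.
z_crit = norm.ppf(0.975)                        # ≈ 1.96
print("minimum n =", ceil((z_crit * sigma / 2.0) ** 2))
```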

Practice Questions

Remember/Understand Level

  1. Define Convergence in Distribution. What specifically must converge for a sequence of random variables to converge in distribution?
  2. State the parameters. If $X \sim N(\mu, \sigma^2)$ and we take a sample of size $n$, what are the mean and variance of the sampling distribution of $\bar{X}$?
  3. Explain the Standard Error. What is the difference between $\sigma$ and $\sigma/\sqrt{n}$? When do you use one versus the other?

Apply/Analyze Level

  1. Calculate Probability. A factory produces bolts with a mean length of 10 cm and standard deviation of 0.2 cm. If you sample 25 bolts, what is the probability that the average length is greater than 10.05 cm?
  2. Determine Sample Size. You want to estimate a population mean. You know $\sigma = 15$. How large a sample is needed so that the probability of your estimate being off by more than 2 units is only 0.01? (Hint: $P(|Z| > z) = 0.01$)
  3. Linear Combinations. Let $X \sim N(10, 4)$ and $Y \sim N(5, 9)$ be independent. What is the distribution of $W = 2X - Y$?

Evaluate/Create Level

  1. Evaluate Assumptions. In the example problem where we calculated the probability of serving 100 customers in 2 hours, we assumed the service times were independent. What would happen to our estimate if the service times were positively correlated (e.g., a slow server implies the next customer is also served slowly)? Would the variance of the sum be higher or lower?

Challenging Concepts to Review

Concept 1: Standard Deviation Vs. Standard Error

Why it's challenging: Students often confuse $\sigma$ (the variability of a single observation) with $\sigma/\sqrt{n}$ (the variability of the sample average).
Study strategy: Visualize the difference. $\sigma$ is how "wide" the population bell curve is. $\sigma/\sqrt{n}$ is how "wide" the distribution of averages is. As $n$ gets huge, the average gets very precise, so the bell curve for $\bar{X}$ gets extremely narrow (approaches a spike at $\mu$), while the population curve stays the same.
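
A short simulation sketch of this picture (the normal population and its SD $\sigma = 4$ are arbitrary choices): the spread of individual observations stays at $\sigma$, while the spread of $\bar{X}$ shrinks like $\sigma/\sqrt{n}$.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 4.0                                    # hypothetical population SD

for n in (1, 10, 100, 1000):
    # 100,000 simulated samples of size n, each reduced to its sample mean
    xbars = rng.normal(loc=0.0, scale=sigma, size=(100_000, n)).mean(axis=1)
    print(f"n={n:5d}  sd of xbar = {xbars.std():.4f}   sigma/sqrt(n) = {sigma/np.sqrt(n):.4f}")
```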

Concept 2: The Variance of Linear Combinations

Why it's challenging: It is intuitive to add means ($E[X+Y] = E[X] + E[Y]$), but unintuitive that variances also add ($\operatorname{Var}(X - Y) = \operatorname{Var}(X) + \operatorname{Var}(Y)$) even when subtracting variables, provided they are independent. Also, remembering to square the constants ($a^2$) is a common source of error.
Study strategy: Use the definition of variance: $\operatorname{Var}(aX) = E[(aX - a\mu)^2] = E[a^2 (X - \mu)^2] = a^2 \operatorname{Var}(X)$. Work through the proof once to see where the square comes from.
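
A worked version of that proof, written out once (here $\mu = E[X]$, and the second line assumes $X$ and $Y$ are independent):

```latex
\begin{aligned}
\operatorname{Var}(aX)
  &= E\big[(aX - a\mu)^2\big]
   = E\big[a^2 (X - \mu)^2\big]
   = a^2 \operatorname{Var}(X), \\
\operatorname{Var}(X - Y)
  &= 1^2 \operatorname{Var}(X) + (-1)^2 \operatorname{Var}(Y)
   = \operatorname{Var}(X) + \operatorname{Var}(Y).
\end{aligned}
```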

Concept 3: Convergence in Distribution

Why it's challenging: It is a limiting concept involving functions (CDFs) rather than single numbers. It is more abstract than "the numbers get closer."
Study strategy: Think of it graphically. Imagine the graph of the CDF for $n = 1$, $n = 2$, $n = 10$. Watch how the curve changes shape until it perfectly overlaps the curve of the Normal CDF. It's about the shape stabilizing.
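
For a numerical version of the same picture, the sketch below (using an Exponential(1) population, an arbitrary non-normal choice with $\mu = \sigma = 1$) evaluates the empirical CDF of the standardized sample mean at one point and watches it approach the standard normal CDF $\Phi$ as $n$ grows.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
x = 1.0                                        # point at which both CDFs are evaluated

for n in (1, 2, 10, 50):
    # Sample means of an Exponential(1) population (mu = sigma = 1), standardized.
    means = rng.exponential(scale=1.0, size=(50_000, n)).mean(axis=1)
    z = (means - 1.0) / (1.0 / np.sqrt(n))
    print(f"n={n:3d}  P(Z_n <= {x}) ≈ {(z <= x).mean():.3f}   Phi({x}) = {norm.cdf(x):.3f}")
```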

Your Action Plan

Immediate Review Actions

Practice and Application

Deep Dive Study

Verification and Integration