Book 2. Quantitative Analysis

FRM Part 1

QA 5. Sample Moments

Presented by: Sudhanshu

Module 1. Estimating Mean, Variance and Standard Deviation

Module 2. Estimating Moments of The Distribution

Module 1. Estimating Mean, Variance and Standard Deviation

Topic 1. Sample Mean, Variance, and Standard Deviation  

Topic 2. Population and Sample Moments

Topic 3. Variance and Standard Deviation

Topic 4. Point Estimates and Estimators

Topic 5. Biased Estimators

Topic 6. Best Linear Unbiased Estimator

Topic 1.  Sample Mean, Variance, and Standard Deviation

  • Sample Mean (\hat{\mu}): Estimated by dividing the sum of all values in a sample by the number of observations.
    • Used to make inferences about the population mean.
    • Formula: \hat{\mu}=\frac{\sum_{i=1}^n X_i}{n}
  • The sum of deviations from the mean is always zero: \sum_{i=1}^n\left(X_i-\hat{\mu}\right)=0
  • Sample Variance: Deviations from the mean are squared to estimate variance.
    • Biased sample variance estimator (\hat{\sigma}^2): \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n (X_i -\hat{\mu})^2
    • Unbiased variance estimator: s^2=\frac{1}{n-1}\sum_{i=1}^n (X_i -\hat{\mu})^2
  • For exams, always compute the unbiased variance by dividing by (n − 1) unless otherwise instructed.
  • Sample Standard Deviation: The square root of the variance.
    • Measures the extent of dispersion around the mean.
  • Note that: s^2=\frac{n}{n-1} \hat{\sigma}^2 (see the Python sketch below).
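These relationships can be checked with a minimal Python sketch (pure Python, no libraries; the three hypothetical returns happen to match practice question Q1 below):

```python
# Minimal sketch of the sample-moment formulas above; the three returns
# are the same hypothetical sample used in practice question Q1 below.
returns = [0.12, 0.25, -0.01]
n = len(returns)

mu_hat = sum(returns) / n                    # sample mean
dev = [x - mu_hat for x in returns]          # deviations sum to ~0

var_biased = sum(d ** 2 for d in dev) / n          # divides by n
var_unbiased = sum(d ** 2 for d in dev) / (n - 1)  # divides by n - 1
std_unbiased = var_unbiased ** 0.5

# Check the relationship s^2 = n / (n - 1) * sigma_hat^2
assert abs(var_unbiased - n / (n - 1) * var_biased) < 1e-12

print(mu_hat, var_biased, var_unbiased, std_unbiased)
# ~0.12, ~0.0113, ~0.0169, ~0.13
```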

Practice Questions: Q1

Q1. A risk manager gathers the following sample data to analyze annual returns for an asset: 12%, 25%, and –1%. He wants to compute the best unbiased estimator of the true population mean and standard deviation. The manager’s estimate of the standard deviation for this asset should be closest to:

A. 0.0111.

B. 0.0133.

C. 0.1054.

D. 0.1300.

Practice Questions: Q1 Answer

Explanation: D is correct.

The calculations for the sample mean and sample variance are shown in the following table:

Year    Return    Deviation from Mean    Squared Deviation
1        0.12      0.00                   0.0000
2        0.25      0.13                   0.0169
3       −0.01     −0.13                   0.0169
Sum      0.36                             0.0338
The sum of all observations of returns for the asset is 0.36. Dividing this by the number of observations, 3, results in an unbiased estimate of the mean of 0.12. The third column subtracts the mean from the actual return for each year. The last column squares these deviations from the mean. The sum of the squared deviations is equal to 0.0338, and dividing this by 2 (n − 1 rather than the number of observations, for an unbiased estimate) results in an estimated variance of 0.0169. The standard deviation is then 0.13 (computed as the square root of the variance).

Topic 2. Population and Sample Moments

  • Population Mean (μ):
    • The first moment of the distribution of data.
    • Computed by summing all observed values in the population and dividing by the number of observations in the population (N).
    • Formula: \mu=\frac{\sum_{i=1}^N X_i}{N}
  • Unique for a given population; usually unknown because not every value of the random variable is observable.
  • Sample Mean (\hat{\mu}):
    • An estimate of the true population mean based on observable data from a sample.
    • A larger sample size generally leads to an estimate closer to the true population mean.
    • Formula: \hat{\mu}=\frac{\sum_{i=1}^n X_i}{n}
  • Arithmetic Mean: The arithmetic mean is the sum of the observed values divided by the number of observations.
  • The population mean and sample mean are both examples of arithmetic means.
  • Arithmetic Mean Properties:
    • Applies to all interval and ratio data sets.
    • Considers all data values in its computation.
    • A data set has only one arithmetic mean (it is unique).

Topic 3. Variance and Standard Deviation

  • Variance of the Mean Estimator:
    • Depends on the variance of the sample data and the number of observations.
    • Formula: Var[\hat{\mu}]=\frac{\sigma^2}{n}
  • Decreases as the sample size (n) increases, so the mean estimate gets closer to the true population mean (see the simulation sketch below).
  • Variance of a Random Variable (\sigma^2):
    • Defined as \sigma^2=Var[X]=E[(X-E[X])^2]
  • Estimator of the Second Moment (the sample variance):
    • \hat{\sigma}^2=\frac{1}{n} \sum_{i=1}^n\left(X_i-\hat{\mu}\right)^2
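A quick Monte Carlo sketch, assuming NumPy is available and using an arbitrary normal population, illustrates that the variance of the mean estimator behaves like σ²/n:

```python
import numpy as np

# Monte Carlo check that Var[mu_hat] shrinks like sigma^2 / n
# (normal population with arbitrary sigma^2 = 4; sample sizes are arbitrary).
rng = np.random.default_rng(42)
sigma2, trials = 4.0, 10_000

for n in (10, 100, 1000):
    samples = rng.normal(loc=0.0, scale=sigma2 ** 0.5, size=(trials, n))
    mu_hats = samples.mean(axis=1)            # one mean estimate per trial
    print(n, mu_hats.var(), sigma2 / n)       # empirical vs. theoretical variance
```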

Topic 4. Point Estimates and Estimators

  • Point Estimate: A single (sample) value used to estimate an unknown population parameter.
  • Estimator: The formula used to compute a point estimate.
  • Notation: The "hat" notation (e.g., \hat{\mu}) denotes that the formula is an estimator used to estimate the true unknown parameter (μ).
  • Observed sample data X_i is used in place of random data from the population.
  • Mean Estimator: A formula that transforms observed sample data into an estimate of the true population mean.

Topic 5. Biased Estimators

  • Bias of an Estimator: Measures the difference between the expected value of the estimator (E[\hat{\theta}]) and the true population value (θ).
    • Formula: Bias(\hat{\theta})=E[\hat{\theta}]-\theta
  • Sample Mean: An unbiased estimator because its expected value equals the true population mean (E[\hat{\mu}]=\mu). This also implies that \operatorname{Bias}(\hat{\mu})=E[\hat{\mu}]-\mu=\mu-\mu=0
  • Sample Variance: A biased estimator when computed by dividing by n because it systematically underestimates the population variance, especially for small sample sizes.
    • Expected value of the (biased) sample variance: \mathrm{E}\left[\hat{\sigma}^2\right]=\sigma^2-\frac{\sigma^2}{n}=\frac{n-1}{n} \sigma^2
    • The bias of the sample variance is therefore: \operatorname{Bias}\left(\hat{\sigma}^2\right)=E\left[\hat{\sigma}^2\right]-\sigma^2=\frac{n-1}{n} \sigma^2-\sigma^2=-\frac{\sigma^2}{n}
  • An unbiased estimator of the population variance is obtained by rescaling (a simulation sketch follows below):
    • s^2=\frac{n}{n-1} \hat{\sigma}^2=\frac{1}{n-1} \sum_{i=1}^n\left(X_i-\hat{\mu}\right)^2
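A short simulation sketch (NumPy assumed; the normal population and the sample size n = 5 are arbitrary choices) showing the downward bias of the 1/n estimator and how the n − 1 version removes it:

```python
import numpy as np

# Simulation of the downward bias of the 1/n variance estimator
# (normal population with sigma^2 = 1 and n = 5; both choices are arbitrary).
rng = np.random.default_rng(0)
n, trials, sigma2 = 5, 200_000, 1.0

samples = rng.normal(0.0, sigma2 ** 0.5, size=(trials, n))
var_biased = samples.var(axis=1, ddof=0)      # divides by n
var_unbiased = samples.var(axis=1, ddof=1)    # divides by n - 1

print(var_biased.mean())     # close to (n - 1)/n * sigma^2 = 0.8
print(var_unbiased.mean())   # close to sigma^2 = 1.0
```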

Topic 6. Best Linear Unbiased Estimator

  • Definition: The Best Linear Unbiased Estimator (BLUE) is the best estimator for the population mean because it has the minimum variance among all linear unbiased estimators.
  • Sample Mean as BLUE: When data is i.i.d., the sample mean is considered BLUE.
  • Linear Estimators of the Mean: Computed as \hat{\mu}=\sum_{i=1}^n w_i X_i, where the w_i are weights (e.g., w_i=1/n for equally likely observations) that do not depend on the X_i.

  • Unbiased Property: An estimator is unbiased if its expected value equals the parameter being estimated.
  • Note: Nonlinear estimators (e.g., maximum likelihood estimators) might be more accurate but are often biased in finite samples. A small comparison of linear unbiased estimators follows below.
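The BLUE property can be illustrated with a small sketch (NumPy assumed; the unequal weights are an arbitrary illustration, not from the reading): any weights that sum to one give an unbiased linear estimator, but equal weights give the lowest variance.

```python
import numpy as np

# Compare two linear unbiased estimators of the mean: equal weights (the
# sample mean) versus unequal weights that still sum to one. Both are
# unbiased; the equal-weight estimator has the smaller variance (BLUE).
rng = np.random.default_rng(1)
n, trials = 10, 100_000
samples = rng.normal(loc=5.0, scale=2.0, size=(trials, n))   # i.i.d. data

w_equal = np.full(n, 1 / n)
w_uneven = np.linspace(1.0, 3.0, n)
w_uneven /= w_uneven.sum()                    # weights sum to 1, so still unbiased

est_equal = samples @ w_equal
est_uneven = samples @ w_uneven

print(est_equal.mean(), est_uneven.mean())    # both close to 5 (unbiased)
print(est_equal.var(), est_uneven.var())      # equal weights give the smaller variance
```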

Practice Questions: Q2

Q2. The sample mean is an unbiased estimator of the population mean because the:

A. sampling distribution of the sample mean is normal.

B. expected value of the sample mean is equal to the population mean.

C. sample mean provides a more accurate estimate of the population mean as the sample size increases.

D. sampling distribution of the sample mean has the smallest variance of any other unbiased estimators of the population mean.

Practice Questions: Q2 Answer

Explanation: B is correct.

The sample mean is an unbiased estimator of the population mean, because the expected value of the sample mean is equal to the population mean. The best linear unbiased estimator (BLUE) is the best estimator of the population mean available because it has the minimum variance of any linear unbiased estimator. 

Module 2. Estimating Moments of the Distribution

Topic 1. Law of Large Numbers (LLN)  

Topic 2. Central Limit Theorem (CLT)

Topic 3. Skewness

Topic 4. Kurtosis

Topic 5. Median and Quantile Estimates

Topic 6. Mean of Two Random Variables

Topic 7. Covariance and Correlation Between Random Variables

Topic 8. Coskewness

Topic 9. Cokurtosis

Topic 1. Law of Large Numbers (LLN)

  • Application: If LLN applies to estimators, they are considered consistent.
  • Properties of a Consistent Estimator:
    • As sample size increases, the finite sample bias approaches zero.
    • As sample size increases, the variance of the estimator approaches zero.
  • Importance: Consistency ensures that estimates from large samples have minimal deviations from the true population mean, leading to better estimates of the true population distribution.
  • Assumption: the mean is finite. (A short simulation of consistency follows below.)
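A brief sketch of consistency (NumPy assumed; the uniform population is an arbitrary choice): the running sample mean of a single long i.i.d. sample settles toward the true mean as n grows.

```python
import numpy as np

# Running sample mean of one long i.i.d. draw from a uniform(0, 1) population
# (true mean 0.5): the estimate settles toward the true value as n grows.
rng = np.random.default_rng(7)
x = rng.uniform(0.0, 1.0, size=100_000)
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)

for n in (10, 1_000, 100_000):
    print(n, running_mean[n - 1])             # drifts toward 0.5
```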

Topic 2. Central Limit Theorem (CLT)

  • Statement: For simple random samples of size n from a population with mean μ and finite variance \sigma^2, the sampling distribution of the sample mean (\hat{\mu}) approaches a normal probability distribution with mean μ and variance \sigma^2/n as the sample size becomes large.
  • Assumptions: Requires only that the mean and variance are finite (beyond LLN's finite-mean requirement).
    • Does not require assumptions about the distribution of the random variables of the population.
  • Usefulness: Allows application of the normal distribution for hypothesis testing and constructing confidence intervals.
    • Inferences about the population mean can be made from the sample mean, regardless of the population's distribution, given a sufficiently large sample size (typically n ≥ 30).
  • Key Properties (illustrated in the sketch below):
    • For sufficiently large n, the sampling distribution of sample means will be approximately normal.
    • The mean of the population and the mean of the distribution of all possible sample means are equal.
    • The variance of the distribution of sample means is \sigma^2/n, approaching zero as n increases.
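A small CLT sketch (NumPy assumed; the exponential population, n = 50, and the trial count are arbitrary choices): even though the population is skewed, the sampling distribution of the mean behaves approximately normally with mean μ and variance σ²/n.

```python
import numpy as np

# Sampling distribution of the mean for a clearly non-normal (exponential)
# population with mean = variance = 1; n = 50 and the trial count are arbitrary.
rng = np.random.default_rng(3)
mu, sigma2 = 1.0, 1.0
n, trials = 50, 100_000

sample_means = rng.exponential(scale=1.0, size=(trials, n)).mean(axis=1)

print(sample_means.mean())     # close to mu = 1.0
print(sample_means.var())      # close to sigma^2 / n = 0.02
# Coverage of a normal-theory 95% interval, justified only by the CLT:
print(np.mean(np.abs(sample_means - mu) < 1.96 * (sigma2 / n) ** 0.5))   # ~0.95
```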

Practice Questions: Q3

Q3. A junior analyst is assigned to estimate the first and second moments for an investment. Sample data was gathered that is assumed to represent the random data of the true population. Which of the following statements best describe the assumptions that are required to apply the central limit theorem (CLT) in estimating
moments of this data set?

A. Only the variance is finite.

B. Both the mean and variance are finite.

C. The random variables are normally distributed.

D. The mean is finite and the random variables are normally distributed.

Practice Questions: Q3 Answer

Explanation: B is correct.

The CLT requires that the mean and variance are finite. The CLT does not require assumptions about the distribution of the random variables of the population.

Topic 3. Skewness

  • Definition: The standardized third central moment of the distribution, indicating the extent to which the data distribution is not symmetric around its mean.
  • Calculation: Skewness(X)=\frac{E[(X-E[X])^3]}{E[(X-E[X])^2]^{3/2}}=\frac{\mu_3}{\sigma^3}
  • The skewness estimator (standardized third moment) is calculated as (see the sketch below): \frac{\frac{1}{n} \sum_{i=1}^n\left(X_i-\hat{\mu}\right)^3}{\hat{\sigma}^3}
  • Nonsymmetrical Distributions: Result from outliers.
    • Positively Skewed (Skewed Right): Characterized by many outliers in the upper (right) tail, pulling the mean upward. Mean > Median > Mode.
    • Negatively Skewed (Skewed Left): Has a disproportionately large amount of outliers in the lower (left) tail, pulling the mean downward. Mean < Median < Mode.
  • Symmetrical Distribution: Mean = Median = Mode.
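The skewness estimator can be sketched directly from the formula above (NumPy assumed; the lognormal and normal samples are arbitrary illustrations of skewed and symmetric data):

```python
import numpy as np

# Sample skewness: the standardized third central moment, computed directly
# from the estimator above (simulated, illustrative data).
def sample_skewness(x):
    x = np.asarray(x, dtype=float)
    mu_hat = x.mean()
    sigma_hat = x.std(ddof=0)                 # 1/n (biased) estimator
    return np.mean((x - mu_hat) ** 3) / sigma_hat ** 3

rng = np.random.default_rng(11)
print(sample_skewness(rng.lognormal(size=100_000)))   # positive: right-skewed
print(sample_skewness(rng.normal(size=100_000)))      # near 0: symmetric
```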

Topic 4. Kurtosis

  • Definition: The standardized fourth central moment of the distribution, referring to how fat or thin the tails are in the data distribution.
  • Calculation: Kurtosis(X)=\frac{E[(X-E[X])^4]}{E[(X-E[X])^2]^2}=\frac{\mu_4}{\sigma^4}
  • The kurtosis estimator (standardized fourth moment) is calculated as (see the sketch below): \frac{\frac{1}{n} \sum_{i=1}^n\left(X_i-\hat{\mu}\right)^4}{\hat{\sigma}^4}
  • Normal Distribution: Has a kurtosis of 3 (excess kurtosis of 0).
  • Types of Kurtosis:
    • Leptokurtic: Distributions with kurtosis > 3 (positive excess kurtosis) and heavier/fatter tails than a normal distribution, implying a greater probability of extreme deviations from the mean.
    • Platykurtic: Distributions with thinner tails than a normal distribution.
  • Importance in Risk Management: Securities returns often exhibit skewness and kurtosis. Modeling returns with an assumed normal distribution can underestimate the potential for large, negative outcomes. Risk managers focus on tail distributions as that's where risk lies. Greater positive kurtosis and more negative skew indicate increased risk.
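A matching sketch for the kurtosis estimator (NumPy assumed; the Student's t sample is an arbitrary heavy-tailed illustration):

```python
import numpy as np

# Sample kurtosis: the standardized fourth central moment. A normal sample
# should be close to 3; a fat-tailed sample (Student's t, df = 5) sits well above 3.
def sample_kurtosis(x):
    x = np.asarray(x, dtype=float)
    mu_hat = x.mean()
    sigma_hat = x.std(ddof=0)
    return np.mean((x - mu_hat) ** 4) / sigma_hat ** 4

rng = np.random.default_rng(21)
print(sample_kurtosis(rng.normal(size=200_000)))             # ~3 (excess ~0)
print(sample_kurtosis(rng.standard_t(df=5, size=200_000)))   # > 3 (leptokurtic)
```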

Practice Questions: Q4

Q4. A distribution of returns that has a greater percentage of extremely large deviations from the mean:

A. is positively skewed.

B. is a symmetric distribution.

C. has positive excess kurtosis.

D. has negative excess kurtosis.

Practice Questions: Q4 Answer

Explanation: C is correct.

A distribution that has a greater percentage of extremely large deviations from the mean will be leptokurtic and will exhibit excess kurtosis (positive). The distribution will have fatter tails than a normal distribution.

Topic 5. Median and Quantile Estimates

  • Median: The 50th percentile or midpoint of a data set when data is ordered.
    • Similar to the mean in measuring central tendency.
    • For symmetrical data, mean and median are the same.
    • Robustness to Outliers: The median is a better measure of central tendency than the mean when extreme values (outliers) are present, as it is not affected by them.
  • Estimating Median: Sort data in ascending or descending order.
    • If sample size is odd, the median is the middle observation.
    • If sample size is even, the median is the arithmetic mean of the two middle observations.
  • Quantiles (e.g., Quartiles): The 25th and 75th percentiles (the first and third quartiles) are commonly reported.
    • Estimated by sorting the data and finding the value at location α × n. If that location is not an integer, average the points immediately above and below.
    • Interquartile Range (IQR): A measure of dispersion around the median (the range from the 25th to the 75th percentile), useful for judging distribution symmetry and tail weight (a short example follows below).
  • Properties of Quantiles: Easy to interpret as they have the same units as the sample data. Robust to outliers or extreme values.
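A short sketch of the median rule and its robustness to outliers (NumPy used only for the mean and a quantile cross-check; the data are hypothetical):

```python
import numpy as np

# Median per the rule in the text: the middle value when n is odd, the average
# of the two middle values when n is even. Data below is hypothetical.
def median_estimate(x):
    x = sorted(x)
    n = len(x)
    mid = n // 2
    return x[mid] if n % 2 == 1 else 0.5 * (x[mid - 1] + x[mid])

data = [12.0, -3.0, 7.0, 5.0, 40.0, 2.0]      # even n, with one large outlier
print(median_estimate(data))                  # 6.0 -- unaffected by the 40.0 outlier
print(np.mean(data))                          # 10.5 -- pulled up by the outlier

# Interquartile range from the 25th and 75th percentiles (numpy's default
# interpolation may differ slightly from the alpha * n exam convention).
q25, q75 = np.quantile(data, [0.25, 0.75])
print(q75 - q25)                              # dispersion around the median
```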

Topic 6. Mean of Two Random Variables

  • Estimation: The means of two random variables (X_i and Y_i) are estimated in the same way as for a single variable, by summing all observed values and dividing by the number of observations (n):
    • \hat{\mu}_X=\frac{1}{n} \sum_{i=1}^n X_i \text{ and } \hat{\mu}_Y=\frac{1}{n} \sum_{i=1}^n Y_i
  • CLT Application: If the data for both variables is i.i.d., the Central Limit Theorem applies to each estimator individually.
  • Bivariate Mean Estimate: If treated as a bivariate mean estimate, the 2 × 1 vector of mean estimators \hat{\mu}=\left[\begin{array}{l} \hat{\mu}_{X} \\ \hat{\mu}_{Y} \end{array}\right] is asymptotically normally distributed if the multivariate random variable is i.i.d.

Topic 7. Covariance and Correlation Between Random Variables

  • Covariance:
    • A statistical measure of the degree to which two variables move together.
    • Captures the linear relationship between variables.
    • Positive Covariance: Variables tend to move in the same direction.
    • Negative Covariance: Variables tend to move in opposite directions.
    • Formula for asset returns: Cov(R_X, R_Y)=E[(R_X-E[R_X])(R_Y-E[R_Y])] = E[R_X R_Y]-E[R_X] \times E[R_Y]
  • The sample covariance estimator (see the sketch below): \text{sample } \operatorname{Cov}_{XY}=\frac{\sum_{i=1}^n\left(X_i-\hat{\mu}_X\right)\left(Y_i-\hat{\mu}_Y\right)}{n-1}
  • Its value is sensitive to the scale of the variables, ranges from negative to positive infinity, and is expressed in squared units (e.g., returns squared), which makes it difficult to interpret.
  • Correlation Coefficient:
    • Converts covariance into a more interpretable measure.
    • Measures the strength and direction of a linear relationship between two variables.
    • Ranges from -1 (perfect negative correlation) to +1 (perfect positive correlation).
  • Formula for correlation coefficient: \operatorname{Corr}(X, Y)=\frac{\operatorname{Cov}(X, Y)}{\sigma(X) \sigma(Y)}
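A minimal sketch of the sample covariance and correlation estimators (NumPy assumed; the two return series are hypothetical), cross-checked against NumPy's built-ins:

```python
import numpy as np

# Sample covariance (n - 1 denominator) and correlation for two hypothetical
# return series, cross-checked against numpy's built-in estimators.
x = np.array([0.04, -0.02, 0.01, 0.03, -0.01])
y = np.array([0.02, -0.03, 0.00, 0.05, -0.02])
n = x.size

cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)
corr_xy = cov_xy / (x.std(ddof=1) * y.std(ddof=1))

print(cov_xy, np.cov(x, y)[0, 1])             # same value (np.cov uses n - 1 by default)
print(corr_xy, np.corrcoef(x, y)[0, 1])       # same value, bounded by -1 and +1
```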

Practice Questions: Q5

Q5. The correlation of returns between Stocks A and B is 0.50. The covariance between these two securities is 0.0043, and the standard deviation of the return of Stock B is 26%. The variance of returns for Stock A is:

A. 0.0331.

B. 0.0011.

C. 0.2656.

D. 0.0112.

Practice Questions: Q5 Answer

Explanation: B is correct.

\begin{aligned} & \operatorname{Corr}\left(\mathrm{R}_{\mathrm{A}}, \mathrm{R}_{\mathrm{B}}\right)=\frac{\operatorname{Cov}\left(\mathrm{R}_{\mathrm{A}}, \mathrm{R}_{\mathrm{B}}\right)}{\left[\sigma\left(\mathrm{R}_{\mathrm{A}}\right)\right]\left[\sigma\left(\mathrm{R}_{\mathrm{B}}\right)\right]} \\ & \sigma^2\left(\mathrm{R}_{\mathrm{A}}\right)=\left[\frac{\operatorname{Cov}\left(\mathrm{R}_{\mathrm{A}}, \mathrm{R}_{\mathrm{B}}\right)}{\left[\sigma\left(\mathrm{R}_{\mathrm{B}}\right)\right] \operatorname{Corr}\left(\mathrm{R}_{\mathrm{A}}, \mathrm{R}_{\mathrm{B}}\right)}\right]^2=\left[\frac{0.0043}{(0.26)(0.5)}\right]^2=0.0331^2=0.0011 \end{aligned}

Practice Questions: Q6

Q6. Consider the following probability matrix:

Probability    Return on Stock A    Return on Stock B
0.4            −10%                 50%
0.3            10%                  20%
0.3            30%                  −30%

The covariance between Stocks A and B is closest to:

A. −0.160.

B. −0.055.

C. 0.004.

D. 0.020.

Practice Questions: Q6 Answer

Explanation: B is correct.

First compute the expected returns: E[R_A] = 0.4(−0.1) + 0.3(0.1) + 0.3(0.3) = 0.08 and E[R_B] = 0.4(0.5) + 0.3(0.2) + 0.3(−0.3) = 0.17. Then:

\begin{aligned} & \operatorname{Cov}\left(\mathrm{R}_{\mathrm{A}}, \mathrm{R}_{\mathrm{B}}\right)=0.4(-0.1-0.08)(0.5-0.17)+0.3(0.1-0.08)(0.2-0.17) \\ & +0.3(0.3-0.08)(-0.3-0.17)=-0.0546 \end{aligned}

Topic 8. Coskewness

  • The third cross central moment for pairs of random variables.

  • Dividing by the variance of one variable and the standard deviation of the other variable standardizes the cross third moment.

  • The two coskewness measures are computed as:

    • \begin{aligned} s(X, X, Y)&=\frac{E\left[(X-E[X])^2(Y-E[Y])\right]}{\sigma_X^2 \sigma_Y} \\ s(X, Y, Y)&=\frac{E\left[(X-E[X])(Y-E[Y])^2\right]}{\sigma_X \sigma_Y^2} \end{aligned}

  • Measures the likelihood of large directional movements for one variable when the other variable is large.

  • Zero if there is no relationship between the sign of one variable and large moves in the other.

  • Always zero in a bivariate normal sample due to symmetrical and normal distribution.

  • Coskewness is estimated by replacing the expectations with sample moments, for example (see the sketch below):

    • \hat{s}(X, X, Y)=\frac{\frac{1}{n} \sum_{i=1}^n\left(X_i-\hat{\mu}_X\right)^2\left(Y_i-\hat{\mu}_Y\right)}{\hat{\sigma}_X^2 \hat{\sigma}_Y}
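A sketch of the coskewness estimator (NumPy assumed; the simulated bivariate normal data and its 0.6 correlation are arbitrary choices). As stated above, coskewness of a bivariate normal sample should be approximately zero:

```python
import numpy as np

# Sample coskewness s_hat(X, X, Y) following the estimator above. For
# simulated bivariate normal data it should be approximately zero.
def coskewness_xxy(x, y):
    dx, dy = x - x.mean(), y - y.mean()
    return np.mean(dx ** 2 * dy) / (x.std(ddof=0) ** 2 * y.std(ddof=0))

rng = np.random.default_rng(5)
cov = [[1.0, 0.6], [0.6, 1.0]]                      # correlation of 0.6 (arbitrary)
xy = rng.multivariate_normal([0.0, 0.0], cov, size=100_000)
print(coskewness_xxy(xy[:, 0], xy[:, 1]))           # near 0 for bivariate normal data
```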

Topic 9. Cokurtosis

  • The fourth cross central moment for pairs of random variables.
  • The three cokurtosis measures are computed as (estimated in the sketch below):
    • \begin{aligned} & k(X, X, Y, Y)=\frac{E\left[(X-E[X])^2(Y-E[Y])^2\right]}{\sigma_X^2 \sigma_Y^2} \\ & k(X, X, X, Y)=\frac{E\left[(X-E[X])^3(Y-E[Y])\right]}{\sigma_X^3 \sigma_Y} \\ & k(X, Y, Y, Y)=\frac{E\left[(X-E[X])(Y-E[Y])^3\right]}{\sigma_X \sigma_Y^3} \end{aligned}
  • Computed using combinations of powers that add to 4 (e.g., k(X,X,Y,Y) for symmetric case; k(X,X,X,Y) and k(X,Y,Y,Y) for asymmetric cases).
  • Symmetrical Case k(X,X,Y,Y): Indicates the sensitivity of the magnitude of one series to the magnitude of the other. Will be large if both series are large in magnitude simultaneously.
  • Asymmetrical Cases: Indicate the agreement of return signs when the power 3 return is large in magnitude.
  • The cokurtosis of a bivariate normal depends on the correlation.
    • Symmetric case: The correlation ranges between −1 and +1 and the cokurtosis ranges between 1 and 3, with the smallest value of 1 occurring when the correlation is equal to zero.
    • Asymmetric case:  Cokurtosis ranges from −3 to +3 and is a linear relationship that is upward sloping as the correlation increases from −1 to +1.
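A sketch of the cokurtosis estimators on simulated bivariate normal data (NumPy assumed; the correlations chosen are arbitrary). For a bivariate normal, the symmetric measure works out to 1 + 2ρ² and the asymmetric measures to 3ρ, consistent with the ranges stated above:

```python
import numpy as np

# Sample cokurtosis for simulated bivariate normal data at several correlations.
# The symmetric measure stays between 1 and 3 (minimum 1 at rho = 0) and the
# asymmetric measure moves linearly between -3 and +3, as described above.
def cokurtosis(x, y, px, py):
    dx, dy = x - x.mean(), y - y.mean()
    return np.mean(dx ** px * dy ** py) / (x.std(ddof=0) ** px * y.std(ddof=0) ** py)

rng = np.random.default_rng(9)
for rho in (-0.9, 0.0, 0.9):
    xy = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=200_000)
    x, y = xy[:, 0], xy[:, 1]
    print(rho,
          round(cokurtosis(x, y, 2, 2), 2),         # symmetric case k(X, X, Y, Y)
          round(cokurtosis(x, y, 3, 1), 2))         # asymmetric case k(X, X, X, Y)
```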


Practice Questions: Q7

Q7. An analyst is graphing the cokurtosis and correlation for a pair of bivariate random variables that are normally distributed. For the symmetrical case of the three cokurtosis measures, k(X,X,Y,Y), cokurtosis is graphed on the y-axis and correlation is graphed on the x-axis between –1 and +1. The shape of this graph should be best described as a(n):

A. upward linear graph ranging in cokurtosis values between –3 and +3.

B. downward linear graph ranging in cokurtosis values between –1 and +1.

C. symmetrical curved graph with the maximum cokurtosis of 3 when the correlation is 0.

D. symmetrical curved graph with the minimum cokurtosis of 1 when the correlation is 0.

Practice Questions: Q7 Answer

Explanation: D is correct.

For the symmetric case k(X,X,Y,Y), the graph is a symmetrical curve with a minimum cokurtosis of 1 when the correlation is 0 and a value of 3 when the correlation is ±1. The graph would be an upward-sloping linear relationship for the other two asymmetric cases of cokurtosis, k(X,Y,Y,Y) and k(X,X,X,Y).

QA 5. Sample Moments

By Prateek Yadav
