Topic 1. Relative Assumptions of Single and Multiple Regression
Topic 2. Interpreting Multiple Regression Coefficients
Topic 3. Interpreting Multiple Regression Results
General Form of Multiple Regression Model:
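In common textbook notation, a model with k independent variables can be written as follows. The symbols are an assumption: some texts write the intercept as b_0 or \beta_0 rather than \alpha.

```latex
Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i
```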
Assumptions of Single Regression (modified for multiple Xs):
Additional Assumption for Multiple Regression: X variables are not perfectly correlated (i.e., they are not perfectly linearly dependent). Each X variable should have some variation not fully explained by other X variables.
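One way to see what perfect linear dependence means is to check the rank of the design matrix. The sketch below uses made-up data in which a hypothetical third variable x3 is an exact linear combination of x1 and x2; the variable names and values are illustrative assumptions, not part of the material above.

```python
import numpy as np

# Hypothetical data: x3 is an exact linear combination of x1 and x2.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
x3 = 2.0 * x1 - x2  # carries no variation beyond what x1 and x2 explain

ones = np.ones_like(x1)                      # intercept column
X_ok = np.column_stack([ones, x1, x2])       # 3 columns, no exact dependence
X_bad = np.column_stack([ones, x1, x2, x3])  # 4 columns, one is redundant

rank_ok = np.linalg.matrix_rank(X_ok)
rank_bad = np.linalg.matrix_rank(X_bad)

print("columns:", X_ok.shape[1], "rank:", rank_ok)
print("columns:", X_bad.shape[1], "rank:", rank_bad)
```

When the rank is lower than the number of columns, the coefficients cannot be uniquely estimated, which is why the assumption rules out perfectly correlated X variables.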
Q1. Which of the following is not an assumption of single regression?
A. There are no outliers in the data.
B. The variance of the independent variables is greater than zero.
C. Independent variables are not perfectly correlated.
D. The variance of the residuals is constant (homoskedastic).
Explanation: C is correct.
This is an assumption for multiple regression and not for single regression.
For a multiple regression, the interpretation of a slope coefficient is that it captures the change in the dependent variable for a one-unit change in that independent variable, holding the other independent variables constant.
As a result, these are sometimes called partial slope coefficients.
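The "holding the other independent variables constant" interpretation can be sketched numerically. The example below fits a two-variable regression on made-up, noise-free data (generated from y = 1 + 2*x1 - 3*x2, an assumption chosen purely for illustration) and checks that raising x1 by one unit while x2 stays fixed changes the prediction by exactly the x1 coefficient.

```python
import numpy as np

# Hypothetical, noise-free data generated from y = 1 + 2*x1 - 3*x2.
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0, 2.0])
y = 1.0 + 2.0 * x1 - 3.0 * x2

# Design matrix with an intercept column, then an ordinary least squares fit.
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, b1, b2 = coef


def predict(x1_value, x2_value):
    """Fitted value of y for the given x1 and x2."""
    return intercept + b1 * x1_value + b2 * x2_value


# Raise x1 by one unit while x2 is held at the same value.
change = predict(3.0, 1.5) - predict(2.0, 1.5)
print("partial slope on x1:", b1)
print("change in prediction:", change)
```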
Ordinary Least Squares (OLS) Estimation Process (Stepwise):
Q2. Multiple regression was used to explain stock returns using the following variables:
Dependent variable: RET = annual stock returns (%)
Independent variables: MKT = market capitalization / $1.0 million
IND = industry quartile ranking (IND = 4 is the highest ranking)
FORT = Fortune 500 indicator (FORT = 1 if the stock is that of a Fortune 500 firm, FORT = 0 otherwise)
The regression results are presented in the following table.
Based on the results in the table, which of the following most accurately represents the regression equation?
A. 0.43 + 3.09(MKT) + 2.61(IND) + 1.70(FORT).
B. 0.681 + 0.021(MKT) + 0.04(IND) + 0.139(FORT).
C. 0.522 + 0.0460(MKT) + 0.7102(IND) + 0.9(FORT).
D. 1.21 + 0.015(MKT) + 0.2725(IND) + 0.5281(FORT).
Explanation: C is correct.
The coefficients column contains the regression parameters.
Q3. Multiple regression was used to explain stock returns using the following variables:
Dependent variable: RET = annual stock returns (%)
Independent variables: MKT = market capitalization / $1.0 million
IND = industry quartile ranking (IND = 4 is the highest ranking)
FORT = Fortune 500 indicator (FORT = 1 if the stock is that of a Fortune 500 firm, FORT = 0 otherwise)
The regression results are presented in the following table.
The expected amount of the stock return attributable to it being a Fortune 500 stock is closest to:
A. 0.522.
B. 0.046.
C. 0.710.
D. 0.900.
Explanation: D is correct.
The regression equation is 0.522 + 0.0460(MKT) + 0.7102(IND) + 0.9(FORT). The coefficient on FORT is the amount of the return attributable to the stock of a Fortune 500 firm.
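As a quick check of this reasoning, the sketch below evaluates the regression equation for the same stock with FORT = 1 and FORT = 0. The MKT and IND inputs are hypothetical values chosen only for illustration; the coefficients are the ones quoted in the explanation.

```python
def predicted_return(mkt, ind, fort):
    """Predicted annual return (%) from the fitted equation in the explanation."""
    return 0.522 + 0.0460 * mkt + 0.7102 * ind + 0.9 * fort


# Hypothetical stock characteristics (not taken from the question).
mkt, ind = 500.0, 3

# Same stock, with and without Fortune 500 membership.
fortune_effect = predicted_return(mkt, ind, 1) - predicted_return(mkt, ind, 0)
print(round(fortune_effect, 4))
```

Whatever values are chosen for MKT and IND, they cancel in the difference, so only the FORT coefficient remains.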
Scenario 1: Single Independent Variable
Scenario 2: Adding a Second Independent Variable
Example: Three-Factor Model
Topic 1. Goodness-of-Fit Measures for Single and Multiple Regressions
Topic 2. Coefficient of Determination
Topic 3. Adjusted R²
Topic 4. Joint Hypothesis Tests and Confidence Intervals
Topic 5. The F-Test
To overcome the problem of overestimating the impact of additional variables, researchers recommend adjusting for the number of independent variables.
Formula for Adjusted R²:
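The standard textbook definition, with n observations and k independent variables, is adjusted R² = 1 − (1 − R²) × (n − 1) / (n − k − 1). A minimal sketch of the calculation follows; the function name and the sample figures are assumptions made for illustration only.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k independent variables."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than estimated parameters")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical case: a third variable lifts R-squared only slightly.
n = 30
adj_two_vars = adjusted_r2(0.600, n, k=2)
adj_three_vars = adjusted_r2(0.602, n, k=3)

print(round(adj_two_vars, 4))
print(round(adj_three_vars, 4))
```

In this made-up case the unadjusted measure rises from 0.600 to 0.602, yet the adjusted measure falls, because the small gain in fit does not offset the lost degree of freedom.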
Key Characteristics of Adjusted R²:
Q4. When interpreting the R² and adjusted R² measures for a multiple regression, which of the following statements incorrectly reflects a pitfall that could lead to invalid conclusions?
A. The R² measure does not provide evidence that the most or least appropriate independent variables have been selected.
B. If the R² is high, we have to assume that we have found all relevant independent variables.
C. If adding an additional independent variable to the regression improves the R², this variable is not necessarily statistically significant.
D. The R² measure may be spurious, meaning that the independent variables may show a high R²; however, they are not the exact cause of the movement in the dependent variable.
Explanation: B is correct.
If the R² is high, we cannot assume that we have found all relevant independent variables. Omitted variables may still exist, which would improve the regression results further.
The magnitude of coefficients does not indicate the importance of independent variables. Hypothesis testing is needed to determine if variables significantly contribute to explaining the dependent variable.
t-statistic for Individual Coefficients:
Testing Statistical Significance (H0: b_j = 0 versus HA: b_j ≠ 0):
Confidence Interval for a Regression Coefficient:
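The usual calculations behind these three items can be sketched as follows: the t-statistic is the estimated coefficient minus its hypothesized value, divided by its standard error, and the confidence interval is the estimate plus or minus the critical t-value times the standard error. All numbers below are hypothetical, and the critical value is passed in as an assumption; in practice it is read from a t-table for the chosen significance level with n − k − 1 degrees of freedom.

```python
def t_statistic(estimate, std_error, hypothesized=0.0):
    """t-statistic for a single regression coefficient."""
    return (estimate - hypothesized) / std_error


def confidence_interval(estimate, std_error, critical_t):
    """Two-sided confidence interval for a regression coefficient."""
    half_width = critical_t * std_error
    return estimate - half_width, estimate + half_width


# Hypothetical inputs, for illustration only.
b_hat = 0.50       # estimated slope coefficient
se = 0.20          # standard error of the estimate
critical_t = 2.0   # assumed two-tailed critical value

t = t_statistic(b_hat, se)
lower, upper = confidence_interval(b_hat, se, critical_t)
reject_null = abs(t) > critical_t

print("t-statistic:", t)
print("interval:", (round(lower, 4), round(upper, 4)))
print("reject the null of a zero coefficient:", reject_null)
```

The two approaches agree: the null of a zero coefficient is rejected exactly when zero lies outside the confidence interval built with the same critical value.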
Q5. Phil Ohlmer estimates a cross-sectional regression in order to predict price-to-earnings ratios (P/E) with fundamental variables that are related to P/E, including dividend payout ratio (DPO), growth rate (G), and beta (B). In addition, all 50 stocks in the sample come from two industries, electric utilities or biotechnology. He defines the following dummy variable:
IND = 1 if the stock is in the biotechnology industry; IND = 0 if the stock is in the electric utilities industry.
The results of his regression are shown in the following table.
Based on these results, it would be most appropriate to conclude that:
A. biotechnology industry P/Es are statistically significantly larger than electric utilities industry P/Es.
B. electric utilities P/Es are statistically significantly larger than biotechnology industry P/Es, holding DPO, G, and B constant.
C. biotechnology industry P/Es are statistically significantly larger than electric utilities industry P/Es, holding DPO, G, and B constant.
D. the dummy variable does not display statistical significance.
Explanation: C is correct.
The t-statistic tests the null that industry P/Es are equal. The dummy variable is significant and positive, and the dummy variable is defined as being equal to one for biotechnology stocks, which means that biotechnology P/Es are statistically significantly larger than electric utility P/Es. Remember, however, this is only accurate if we hold the other independent variables in the model constant.
Q6. Phil Ohlmer estimates a cross-sectional regression in order to predict price-to-earnings ratios (P/E) with fundamental variables that are related to P/E, including dividend payout ratio (DPO), growth rate (G), and beta (B). In addition, all 50 stocks in the sample come from two industries, electric utilities or biotechnology. He defines the following dummy variable:
IND = 1 if the stock is in the biotechnology industry; IND = 0 if the stock is in the electric utilities industry.
The results of his regression are shown in the following table.
Ohlmer is valuing a biotechnology stock with a dividend payout ratio of 0.00, a beta of 1.50, and an expected earnings growth rate of 0.14. The predicted P/E on the basis of the values of the explanatory variables for the company is closest to:
A. 7.7.
B. 15.7.
C. 17.2.
D. 11.3.
Explanation: B is correct.
Note that IND = 1 because the stock is in the biotech industry. Predicted P/E = 6.75 + (8.00 × 1) + (4.00 × 0.00) + (12.35 × 0.14) − (0.50 × 1.5) = 15.7.
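The arithmetic above can be reproduced with a short sketch. The coefficients and inputs are the ones used in the explanation; the dictionary layout and names are illustrative choices.

```python
# Coefficients as used in the explanation above.
coefficients = {"intercept": 6.75, "IND": 8.00, "DPO": 4.00, "G": 12.35, "B": -0.50}

# Biotechnology stock (IND = 1) with the characteristics given in the question.
stock = {"IND": 1, "DPO": 0.00, "G": 0.14, "B": 1.50}

# Intercept plus each coefficient times the matching input value.
predicted_pe = coefficients["intercept"] + sum(
    coefficients[name] * value for name, value in stock.items()
)
print(round(predicted_pe, 1))
```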