Instructions: Review the data file and answer the questions. Your answers should be submitted in the form of a Word document. The Word document should include ALL your answers including any charts, tables, or graphs required. Answer questions with complete sentences and using appropriate units and notation.You need to submit the Excel files that supports your work (but they will not be graded, and answers will not be searched within the file; make sure to include everything in your Word document.)
Data
Consider the data in the file House Prices of recent sales of single-family houses in Springfield. The variables are:
- Price: Price at which house was eventually sold
- SqFt: Floor area in square feet
- Bedrooms: Number of bedrooms
- Bathrooms: Number of bathrooms
- Offers: Number of offers made on the house prior to the accepted offer
- Brick: Whether the construction is primarily brick or not (yes or no)
- Neighborhood: One of the three neighborhoods in Springfield (east, west or north)
Questions
This assignment extends the analysis of house prices in Springfield that was conducted in Individual Assignment 1.
1. Create a regression model for Price using ALL the quantitative as well as qualitative (categorical) variables in the spreadsheet.
Use “North” as the base (or reference) category for the Neighborhood variable, and “No Brick” as the base category for exterior construction material variable, Brick.
Refer to this model as Model 1. Include a screenshot of the model output.
2. According to Model 1 estimated above, by how much does the average price in the East differs from the average price of a similar house in the North?
3. According to Model 1, by how much does the average price in the West differs from the average price of a similar house in the East?
4. Define the “brick premium” as the average amount by which the price of a brick house exceeds the price of a similar house made without brick.
According to Model 1, what is the brick premium in Springfield?
5. Suppose it is conjectured that the brick premium varies by neighborhood. To account for this conjecture, we augment Model 1 with interaction terms as follows:
Price = a + b1SqFt + b2Bedrooms + b3Bathrooms + b4Offers + b5Brick + b6East + b7West + b8Brick*East + b9Brick*West
Call this model as Model 2. Estimate this model with StatTools. Include an screenshot of the model output.
6. Provide an economic interpretation of b8 in Model 2.
7. Based in Model 2, what is the brick premium in the:
a. North? $__________
b. East? $__________
c. West? $__________
8. Run the following regression as Model 3:
log(Price) = a + b1SqFt + b2Bedrooms + b3Bathrooms + b4Offers + b5Brick + b6East + b7West
(Recall that, “log” refers to natural logarithms.)
Include an screenshot of the model output.
9. From the output for Model 3, what is the value of b7? Provide an economic interpretation for b7.
10. Suppose a homeowner adds an extension to her house in the form of a 400 sq ft. bedroom. According to Model 3, what is the increase in the predicted selling price of her house? (Be careful with units.)
11. Use Model 1, 2 and 3, to estimate the price of a 1720 sq ft. brick house in the West that has 3 bedrooms, 2 bathrooms and has had 1 offer made on it.
Price Estimate (in $) | |
Model 1 | |
Model 2 | |
Model 3 |
12. Interpret the coefficient of determination for Model 2.
13. Which one of these three models would you pick for estimating Price? Concisely justify your answer.
14. Check the assumptions in Model 1. Paste the following residual plots: Residuals vs Fitted Values, Residuals vs Order Index and Histogram of Residuals, and comment on each of the regression assumptions.
p.s I have attached the excel file