Date: 2024-03-24 05:01

ECF5410 - Take Home Exercise 3

Rev. 2023-03-17

Follow the instructions below and turn in both your code and your results:

1. Load the mathpnl data set, which comes from Leslie Papke, contains data at the school-district level, and was featured in the Wooldridge (2010) textbook.

Tip: Install the wooldridge package and run mathpnl <- wooldridge::mathpnl to save the dataframe. You may also want to pipe (%>%) this into as_tibble().

We are only going to be working with a few variables.

- distid: the district identifier (our “individual” for fixed effects)

- year: the year the data is from

- math4: the percentage of 4th grade students who are “satisfactory” or better in math

- expp: expenditure per pupil

- lunch: the percentage of students eligible for free lunch

- intid: this will be used to help plotting in Q5

2. Panel data is often described as “N by T”, where N is the number of different individuals and T is the number of time periods. Write code that outputs what N and T are in this data.

Tip: You can count the unique values of distid and of year by using distinct() together with nrow(), or by using count().
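The idea behind N and T can be sketched on a toy panel. This is a minimal illustration in base R, assuming a small made-up data frame named toy (hypothetical, not the mathpnl data); with dplyr loaded, distinct() followed by nrow() would give the same counts.

```r
# Toy panel (hypothetical data, not mathpnl): 3 districts observed in 2 years
toy <- data.frame(
  distid = c(1, 1, 2, 2, 3, 3),
  year   = c(1992, 1993, 1992, 1993, 1992, 1993)
)

# N is the number of distinct individuals, T the number of distinct periods
N <- length(unique(toy$distid))
T_periods <- length(unique(toy$year))  # named T_periods to avoid masking T (TRUE)

cat("N =", N, " T =", T_periods, "\n")
```

The variable is called T_periods rather than T because T is a built-in shorthand for TRUE in R.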

3. A balanced panel is one in which each individual shows up in every single time period. You can check whether a data set is a balanced panel by seeing whether the number of unique time periods each individual ID shows up in is the same as the number of unique time periods, or whether the number of unique individual IDs in each time period is the same as the total number of unique individual IDs.

Take a second to think about why these procedures check whether this is a balanced panel.

Then, check whether this data set is a balanced panel.

Tip: We can use distinct() for N & T and then use table() for a cross tabulation.

Tip2: Please do not output the whole cross-tab into your document - it will be too long.
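The balance check described above can be sketched on a deliberately unbalanced toy panel. This is only an illustration in base R, assuming a made-up data frame toy (hypothetical, not the mathpnl data); it summarises the cross-tab instead of printing it in full.

```r
# Toy panel (hypothetical data): district 3 is missing the year 1993
toy <- data.frame(
  distid = c(1, 1, 2, 2, 3),
  year   = c(1992, 1993, 1992, 1993, 1992)
)

# Keep one row per distid-year pair, then cross-tabulate the pairs
pairs <- unique(toy[, c("distid", "year")])
xtab  <- table(pairs$distid, pairs$year)

# Balanced means every district appears exactly once in every year
is_balanced <- all(xtab == 1)

# Compact summary: how many districts appear in 1 period, 2 periods, ...
periods_per_id <- rowSums(xtab)
print(table(periods_per_id))
print(is_balanced)
```

Summarising with table(periods_per_id) keeps the output short even when there are hundreds of districts.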

4. Create a scatter plot with lunch on the x-axis & math4 on the y-axis. What does the relationship look like? Is it intuitive?

5. Now create another plot with distid mapped to the colour aesthetic. What can you see now?

Tip: Given the size of the data set, we won’t be able to draw any insights from a plot of all districts, so filter your data such that intid == 9 before passing it on to ggplot().

Tip2: Because distid is numeric, ggplot() will treat it as a continuous variable. Provide distid as a factor instead.

6. Given the new plot, does the relationship apply to the majority of the distids? Explain.

7. Run an OLS regression, with no fixed effects, of math4 on expp and lunch. Store the results as m1.

8. Modify the model in step 7 to include fixed effects for distid “by hand”. That is, subtract out the within-distid mean of math4, expp, and lunch, creating new variables math4_demean, expp_demean, and lunch_demean, and re-estimate the model using those variables, storing the result as m2.
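Why subtracting within-distid means amounts to including fixed effects can be seen from a short derivation. The model below is an assumption for illustration (the exercise does not write it out): i indexes districts, t indexes years, and \alpha_i is a district-specific intercept.

```latex
\begin{align}
% Assumed model with a district fixed effect \alpha_i
\text{math4}_{it} &= \beta_1\,\text{expp}_{it} + \beta_2\,\text{lunch}_{it}
                     + \alpha_i + \varepsilon_{it} \\
% Average over the years in which district i is observed
\overline{\text{math4}}_{i} &= \beta_1\,\overline{\text{expp}}_{i}
                     + \beta_2\,\overline{\text{lunch}}_{i}
                     + \alpha_i + \bar{\varepsilon}_{i} \\
% Subtract the second line from the first: \alpha_i cancels
\text{math4}_{it} - \overline{\text{math4}}_{i}
  &= \beta_1\left(\text{expp}_{it} - \overline{\text{expp}}_{i}\right)
   + \beta_2\left(\text{lunch}_{it} - \overline{\text{lunch}}_{i}\right)
   + \left(\varepsilon_{it} - \bar{\varepsilon}_{i}\right)
\end{align}
```

Because \alpha_i does not vary over time, it equals its own within-district mean and drops out, so a regression on the demeaned variables estimates the same \beta_1 and \beta_2.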

9. Next run an OLS regression using dummy variables for each distid. Save this as m3.

Tip: Again, because distid is numeric, use it in lm() as a factor, i.e. factor(distid).
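The link between demeaning “by hand” and using dummy variables can be checked on simulated data. This is a minimal sketch in base R, assuming a made-up panel toy with columns id, x, and y (all hypothetical, not the mathpnl variables); it is not a solution to the exercise itself.

```r
# Hypothetical toy panel: 3 units observed in 4 periods each
set.seed(1)
toy <- data.frame(id = rep(1:3, each = 4))
toy$x <- rnorm(12) + toy$id                  # x is correlated with the unit
toy$y <- 2 * toy$x + 5 * toy$id + rnorm(12)  # unit effect enters y directly

# Within transformation: ave() returns each unit's mean, repeated per row
toy$x_demean <- toy$x - ave(toy$x, toy$id)
toy$y_demean <- toy$y - ave(toy$y, toy$id)

# Regression on demeaned variables vs. regression with one dummy per unit
fit_within <- lm(y_demean ~ x_demean, data = toy)
fit_dummy  <- lm(y ~ x + factor(id), data = toy)

# The slope on x should agree across the two fits
slope_within <- unname(coef(fit_within)["x_demean"])
slope_dummy  <- unname(coef(fit_dummy)["x"])
print(c(within = slope_within, dummy = slope_dummy))
```

The two approaches give the same slope estimate; only the way the unit-specific intercepts are handled differs.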

10. Now we will use a specially designed function to estimate a model with fixed effects. Use feols() from the fixest package to estimate the model from step 7, but with fixed effects for distid. Save the result as m4.

11. Using msummary(), make a regression table including m1 through m4 so you can compare them all.

Write down two interesting things you notice from the table. There are multiple possible answers here.

Tip: As there are a lot of dummy variables in m3, provide msummary() with the argument coef_omit = "distid" to remove them from the regression table.

Submit a PDF (knitted R Markdown) document with your answers and code on Moodle by THU, March 30, 9:00 AM.

moodle/week 4/Take Home Exercise




