Q1. The following chart and accompanying text are reproduced from the “Why Australia: Benchmark Report 2021” prepared by the Australian Trade Commission, Australian Government.

“A big spender on research and development: Australia’s annual gross domestic expenditure on research and development (GERD) reached A$34 billion in 2018–19. This places Australia alongside the UK, Singapore and France as one of the highest spenders on research and development (R&D). Australia’s trend in R&D is upwards. GERD rose by around 7% per year from 2000–01 to 2018–19 and it now represents 1.8% of Australian GDP. This creates a pool of skilled researchers who are globally competitive.”

(i) Discuss the purpose of Chart 1 and critically assess how well it meets this purpose. (3 marks)
(ii) Would you recommend that the data be presented in a different way? If not, why not? If so, explain how you would present the data in a better way. (2 marks)

Q2. Use the data “htv” from the “wooldridge” package in R to answer this question. The data set includes information on wages, education, parents’ education, and several other variables for 1,230 working men in 1991.

(i) Estimate the regression model

$$ educ = \beta_0 + \beta_1 motheduc + \beta_2 fatheduc + \beta_3 abil + \beta_4 abil^2 + u $$

by OLS and report the results in equation form. How much sample variation in educ is explained? Interpret the coefficient on motheduc. (1 mark)
(ii) Find the value of abil, call it abil*, where educ is minimized, holding other factors fixed. (1 mark)
(iii) Test the null hypothesis that educ is linearly related to abil against the alternative that the relationship is quadratic. (1 mark)
(iv) Test H0: $\beta_1 = \beta_2$ against a two-sided alternative. (1 mark)
(v) Add the two college tuition variables tuit17 and tuit18 to the regression and determine whether they are jointly statistically significant. (1 mark)

Q3. This problem focuses on the impact of a computer-assisted learning program (cal) on educational outcomes. Under the program, children in grade 4 are offered two hours of shared computer time per week, during which they play games that involve solving math problems whose level of difficulty adapts to their ability to solve them. The data file “baroda.dta” contains the data. Use the “read_dta” function from the “haven” package to read the file in R. Observations are at the child level. “cal” indicates whether the child was selected for the cal program. Implementation of the program was intended to be randomised among children in grade 4. The main outcome of interest is whether the intervention resulted in an improvement in math test scores. Performance in math was measured by pre_mathnorm before implementation and by post_mathnorm after the intervention. The test scores have been normalised to be standardised variables, as indicated by the variable names.

Note of caution: the program is implemented only in grade 4 (grade is measured by the variable “std”).

(i) Discuss the potential sources of selection bias and the direction of the bias for such an education program. (1 mark)
(ii) Using the standardised variables for test scores in math, check whether the randomisation has performed well. (1 mark)
(iii) Estimate the ATE of the program in math. Can we interpret the effect as causal? (1 mark)
(iv) Estimate the effect of the program on whether children improved their math scores relative to what would have been expected given their initial scores. In order to do this, estimate a specification in which the dependent variable is the improvement in math scores and in which you control for initial math scores. Why would you want to do this? What can you conclude with respect to the likely effect of the program on math outcomes? (1 mark)
(v) Using a logit regression, estimate the propensity score of program participation based on pre-intervention math scores. Estimate the effect of the program on improving math scores, adjusting for the propensity score of participation. How does your estimate compare to the one obtained in (iv)? (1 mark)
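
Setup note: a minimal sketch of loading the two data sets named in Q2 and Q3. It assumes the wooldridge and haven packages are already installed and that baroda.dta sits in the current working directory; the object names baroda_path and baroda are illustrative only and are not prescribed by the questions.

```r
# Q2: the htv data ship with the wooldridge package (assumed installed).
if (requireNamespace("wooldridge", quietly = TRUE)) {
  data("htv", package = "wooldridge")
  str(htv)  # inspect the variable names before specifying the regression
}

# Q3: read the Stata file with haven::read_dta.
# Assumption: baroda.dta is in the current working directory.
baroda_path <- "baroda.dta"
if (requireNamespace("haven", quietly = TRUE) && file.exists(baroda_path)) {
  baroda <- haven::read_dta(baroda_path)
  str(baroda)  # confirm the treatment, grade and test-score variables are present
}
```

Each step is skipped quietly if the package or the file is missing, so the snippet can be run before the environment is fully set up.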