Question 3 [50 Marks]
Before answering this question, the ‘survival’ package should be loaded into R using the following code:
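The loading code is not reproduced in this copy of the paper. A minimal sketch, assuming the package is already installed (the install.packages() call is left commented out for that reason):

```r
# Assumption: 'survival' is already installed on this machine.
# install.packages("survival")  # uncomment if the package is missing

# Attach the package so that Surv(), survfit() and coxph() are available.
library(survival)
```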
A recently developed drug, MediCo, has been used over the past 12 months to treat a potentially fatal disease. MediCo was approved by a medical regulator last year following initial trials and mortality data has been collected over the past 12 months, since approval, to continue reviewing the drug’s effectiveness. This mortality data has been compared with mortality data collected from infected patients, prior to MediCo’s approval, who were NOT administered the drug. It is suspected that gender may be a significant covariate on the mortality rate of infected patients.
The ‘CBTDATA_0321.csv’ file contains the combined mortality data from this investigation for 4,400 infected patients. The file contains the following five variables:
Unique patient identifier (integers 1, 2, …, 4,400)
Drug indicator (1 = received drug, 0 = did not receive drug)
Gender indicator (1 = female, 0 = male)
Status indicator (1 = death due to disease, 0 = censoring event occurred)
Duration in days at which death/censoring occurred (integers with a range of 1–365, with 1 = first day of the investigation and 365 = last day of the investigation).
Before answering this question, the ‘CBTDATA_0321.csv’ file should be loaded into R and assigned to a data frame called mortalitydata.
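One way this loading step might look is sketched below. It assumes the file sits in the current R working directory; the data_path variable is a hypothetical helper, not part of the paper, and the column names in the file are not stated here, so none are assumed.

```r
# Hypothetical helper: path to the data file. This assumes the file is in
# the current working directory; adjust the path if it is stored elsewhere.
data_path <- "CBTDATA_0321.csv"

if (file.exists(data_path)) {
  # Read the CSV into the data frame name required by the question.
  mortalitydata <- read.csv(data_path)

  # Sanity check: the paper describes 4,400 patients and five variables.
  print(dim(mortalitydata))
  str(mortalitydata)
} else {
  message("File not found: ", data_path, ". Check getwd() and the file location.")
}
```

Checking dim() and str() straight after loading is a quick way to confirm that the row count and the five variables match the description above before starting on the parts of the question.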
a) Plot the Kaplan–Meier survival function estimate for all patients, together with its two-sided 99.5% confidence interval. (8 marks)
b) Determine, using the estimated survival function from part (a), the probability that a patient survived from the beginning of the investigation to the end of the investigation. (4 marks)
c) Evaluate the appropriateness of the probability value, calculated in part (b), for assessing MediCo’s effectiveness. (4 marks)
d) Plot, on a single graph, four Kaplan–Meier survival function estimates without any confidence intervals, where each estimate represents one of the four possible patient group combinations of drug and gender. You should use separate colours to identify each survival function. (9 marks)
e) Estimate a Cox proportional hazards model with death as the event of interest using the two covariates, drug and gender, with no interaction term, pasting your results into your answer script. (8 marks)
f) Comment on the results produced in part (e) with reference to the effects of the two covariates, drug and gender, on the mortality rate. (5 marks)
g) Update the Cox proportional hazards model in part (e) to include an interaction term between drug and gender, pasting your results into your answer script.
h) Analyse the effectiveness of MediCo, commenting on any differences between males and females.
END OF QUESTION PAPER