Clinical trials can be planned to the very last detail, but that doesn’t prevent people from losing touch with the study, moving abroad, or never experiencing the expected event. That event could be the curing of a disease, platelet counts falling below a certain threshold, or, in undesirable circumstances, death. Whenever a subject does not experience the event during the follow-up period, the observation is classed as censored, and methods such as survival analysis should be used.
Using R’s survival library, it is possible to conduct very in-depth survival analyses with a great deal of flexibility. The only downside to conducting this analysis in R is that the default graphics can look very basic, which, whilst fine for a journal article, does not lend itself well to presentations and posters. Step in the brilliant survminer package, which combines the analytical scope of R with the polished graphics of ggplot2.
In this tutorial, we will use both the survival library and the survminer library to produce Kaplan-Meier plots and to run and interpret log-rank tests. Solutions are available here.
Exercise 1
Load the lung data set from the survival library and convert the status column to a factor.
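As a hint rather than a definitive solution, the step could be sketched as follows. The new column name `status_f` and the labels are illustrative assumptions; the coding of `status` (1 = censored, 2 = dead) follows the data set's documentation.

```r
library(survival)

# Take a local copy of the lung data shipped with the survival package
lung <- survival::lung

# Store the factor in a new (hypothetical) column so the numeric coding
# stays available; Surv() handles a factor event indicator differently
lung$status_f <- factor(lung$status,
                        levels = c(1, 2),
                        labels = c("censored", "dead"))

# Quick check of the new factor
table(lung$status_f)
```

Keeping the original numeric column alongside the factor is a design choice made here so that later calls such as `Surv(time, status == 2)` keep working unchanged.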
Exercise 2
Calculate the percentage of censored observations.
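One possible way to approach this, assuming the documented coding where `status == 1` marks a censored observation, is to take the mean of a logical vector. The variable name `pct_censored` is just an illustration.

```r
library(survival)

lung <- survival::lung

# status == 1 means censored; the mean of a logical vector is a proportion
pct_censored <- mean(lung$status == 1) * 100

# Report the percentage rounded to one decimal place
round(pct_censored, 1)
```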
Exercise 3
Create a basic survival object exploring the occurrence of events.
Exercise 4
Print this object and plot it to graphically investigate this.
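A minimal sketch of Exercises 3 and 4, not necessarily the approach taken in the official solutions: build a `Surv` object, fit an intercept-only Kaplan-Meier curve with `survfit`, then print and plot it with base graphics. The object names `surv_obj` and `km_fit` are assumptions made for illustration.

```r
library(survival)

lung <- survival::lung

# Pair each follow-up time with an event indicator (TRUE = death observed)
surv_obj <- Surv(time = lung$time, event = lung$status == 2)

# Intercept-only fit: one Kaplan-Meier curve for the whole cohort
km_fit <- survfit(surv_obj ~ 1)

# Printed summary: number of subjects, events, and median survival
print(km_fit)

# Base-graphics plot of the estimated survival curve with confidence bands
plot(km_fit, xlab = "Days", ylab = "Survival probability")
```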
Exercise 5
Now install and load the survminer library and plot your survival object using a ggplot2-based graphic.
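A hedged sketch of how this might look with survminer's `ggsurvplot` function. It assumes the package is already installed (the install line is left commented out) and refits the model with a formula and a `data` argument, since `ggsurvplot` is given the data frame as well as the fit. The axis labels are illustrative choices.

```r
# install.packages("survminer")  # run once if the package is not installed
library(survival)
library(survminer)

lung <- survival::lung

# Fit with a formula and data argument so ggsurvplot can access the data
km_fit <- survfit(Surv(time, status == 2) ~ 1, data = lung)

# ggplot2-based Kaplan-Meier plot with a confidence band
km_plot <- ggsurvplot(km_fit,
                      data = lung,
                      conf.int = TRUE,
                      xlab = "Days",
                      ylab = "Survival probability")
print(km_plot)
```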
Exercise 6
Create a new survival object, stratifying the survival times now by gender.
Exercise 7
Plot this new gender-stratified survival object and comment on your observations.
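One way the stratified fit from Exercise 6 and its plot could be sketched, assuming the stratification variable is the data set's `sex` column (documented as 1 = male, 2 = female). The colours and legend text are illustrative choices.

```r
library(survival)

lung <- survival::lung

# One Kaplan-Meier curve per level of sex (1 = male, 2 = female)
fit_sex <- survfit(Surv(time, status == 2) ~ sex, data = lung)
print(fit_sex)

# Base-graphics plot; curves are drawn in the order of the strata
plot(fit_sex, col = c("blue", "red"),
     xlab = "Days", ylab = "Survival probability")
legend("topright",
       legend = c("Male (sex = 1)", "Female (sex = 2)"),
       col = c("blue", "red"), lty = 1)
```

The same `fit_sex` object can also be passed to survminer's `ggsurvplot(fit_sex, data = lung)` if a ggplot2-based figure is preferred.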
Exercise 8
Form a set of hypotheses to formally test whether survival times differ between males and females.
Exercise 9
Run a log-rank test and interpret the resulting p-value in terms of the hypotheses formed in Exercise 8.
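A minimal sketch of the test using `survdiff` from the survival package, which performs a log-rank test by default. The p-value is recomputed from the chi-squared statistic here so the sketch does not depend on which components a given package version stores; the names `lr_test` and `p_value` are assumptions.

```r
library(survival)

lung <- survival::lung

# Log-rank test comparing the survival curves of the two sex groups
lr_test <- survdiff(Surv(time, status == 2) ~ sex, data = lung)
print(lr_test)

# Degrees of freedom = number of groups minus one
df <- length(lr_test$n) - 1

# Upper-tail probability of the chi-squared statistic
p_value <- pchisq(lr_test$chisq, df = df, lower.tail = FALSE)
p_value
```

A small p-value would count as evidence against a null hypothesis of identical survival curves in the two groups.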
Exercise 10
Investigate how patients’ ability to carry out daily activities affects their survival times. How many patients scored a 3, and what could be done with those individuals?
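A possible starting point, under the assumption that the score meant here is the ECOG performance score stored in the `ph.ecog` column (the exercise does not name the column). The sketch tabulates the scores and then, as one option among several, excludes the sparse category before fitting; `lung_sub` and `fit_ecog` are illustrative names.

```r
library(survival)

lung <- survival::lung

# Assumption: "daily standard of life" refers to ph.ecog (ECOG score)
ecog_counts <- table(lung$ph.ecog, useNA = "ifany")
print(ecog_counts)

# One option: drop missing scores and the sparse score-3 category
lung_sub <- subset(lung, !is.na(ph.ecog) & ph.ecog < 3)

# Kaplan-Meier curves stratified by the remaining ECOG scores
fit_ecog <- survfit(Surv(time, status == 2) ~ ph.ecog, data = lung_sub)
print(fit_ecog)
```

An alternative to dropping the sparse category would be to merge it into the neighbouring one, for example with `pmin(lung$ph.ecog, 2)`; which choice is appropriate depends on the analysis.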