DATA MINING
Desktop Survival Guide by Graham Williams

Lung
> l.survreg <- survreg(l.Surv ~ age, data=lung)
> summary(l.survreg)
Call:
survreg(formula = l.Surv ~ age, data = lung)
              Value Std. Error     z        p
(Intercept)  6.8871     0.4466 15.42 1.20e-53
age         -0.0136     0.0070 -1.94 5.19e-02
Log(scale)  -0.2761     0.0624 -4.43 9.61e-06
Scale= 0.759
Weibull distribution
Loglik(model)= -1151.9 Loglik(intercept only)= -1153.9
Chisq= 3.91 on 1 degrees of freedom, p= 0.048
Number of Newton-Raphson Iterations: 5
n= 228
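The Chisq line in the summary can be checked by hand. The following is a minimal sketch in base R, assuming the statistic is the usual likelihood ratio test against the intercept-only model; the numbers are copied from the output above, and the variable names are illustrative only.

```r
# Values copied from the summary output above.
loglik_model     <- -1151.9
loglik_intercept <- -1153.9
chisq_reported   <- 3.91
df               <- 1

# Likelihood ratio statistic from the (rounded) printed log-likelihoods.
chisq_from_loglik <- 2 * (loglik_model - loglik_intercept)

# Upper-tail chi-squared probability for the reported statistic.
p_value <- pchisq(chisq_reported, df = df, lower.tail = FALSE)

print(chisq_from_loglik)
print(round(p_value, 3))
```

The statistic recomputed from the printed log-likelihoods differs slightly from the reported 3.91, which is consistent with the log-likelihoods having been rounded to one decimal place.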
> l.pred <- predict(l.survreg, lung)
> l.pred.q <- predict(l.survreg, lung, type="quantile")
> result <- cbind(data.frame(lung$time, l.pred), l.pred.q)
> names(result) <- c("Actual", "Predicted", "Lower", "Upper")
> head(result)
  Actual Predicted    Lower    Upper
1    306  357.8476 64.88637 673.7982
2    455  388.2918 70.40663 731.1220
3   1010  457.1705 82.89600 860.8152
4    210  450.9914 81.77557 849.1803
5    883  432.9505 78.50432 815.2108
6   1022  357.8476 64.88637 673.7982
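The Lower and Upper columns are the 0.1 and 0.9 quantiles of the fitted survival time, since predict with type="quantile" uses p=c(0.1, 0.9) when p is not given. As a rough check, the sketch below reproduces them in base R from the Predicted column and the Log(scale) estimate, using the Weibull quantile formula under the survreg parameterisation. The helper weibull_quantile is a name introduced here for illustration, and the inputs are copied from the output above.

```r
# Log(scale) from the summary, and the first three Predicted values.
log_scale <- -0.2761
scale     <- exp(log_scale)   # about 0.759, as printed by summary()
predicted <- c(357.8476, 388.2918, 457.1705)

# Quantile of the fitted Weibull survival time at probability p
# (illustrative helper, not part of the survival package).
weibull_quantile <- function(predicted, scale, p) {
  predicted * (-log(1 - p))^scale
}

lower <- weibull_quantile(predicted, scale, 0.1)
upper <- weibull_quantile(predicted, scale, 0.9)
print(round(cbind(predicted, lower, upper), 2))
```

With the survival package loaded, other quantiles could be requested directly, for example predict(l.survreg, lung, type="quantile", p=0.5) for the predicted median.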
Plot time against age, with the plotting symbol and colour indicating status, then draw a line of the predicted survival time by age:
> ord <- order(lung$age)
> age_ord <- lung$age[ord]
> pred_ord <- l.pred[ord]
> with(lung, plot(age, time, pch=status, col=4-status))
> lines(age_ord, pred_ord, col=4)
> legend("topleft", title="Status", c("Survived", "Died", "Predicted"),
pch=c(1, 2, -1), lty=c(0,0,1), col=c(3,2,4))
You cannot plot a survreg object directly, but you can overlay the fitted curve on a Kaplan-Meier plot using the coefficients from the regression, which also gives a hint on how to interpret them:
> l.survreg.weibull <- survreg(l.Surv ~ 1, data=lung, dist='weibull')
> plot(survfit(l.Surv~1, data=lung))
> curve(exp(-(exp(-l.survreg.weibull$coef[1]) * x)^(1/l.survreg.weibull$scale)),
col="red", add=TRUE)
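The expression passed to curve is the Weibull survival function written in terms of the survreg intercept and scale. The sketch below, in base R only, wraps that expression in a function and compares it with pweibull using shape = 1/scale and scale = exp(intercept). The function name surv_from_survreg and the parameter values are hypothetical, chosen for illustration; they are not the fitted values from lung.

```r
# Survival function in the survreg parameterisation
# (illustrative helper mirroring the curve() expression above).
surv_from_survreg <- function(t, intercept, scale) {
  exp(-(exp(-intercept) * t)^(1 / scale))
}

# Hypothetical parameter values, for illustration only.
intercept <- 6.0
scale     <- 0.75
t         <- c(100, 365, 1000)

s_manual   <- surv_from_survreg(t, intercept, scale)

# The same curve using R's own Weibull parameterisation.
s_pweibull <- pweibull(t, shape = 1 / scale, scale = exp(intercept),
                       lower.tail = FALSE)

print(all.equal(s_manual, s_pweibull))
```

This shows how to read the output: the intercept is the log of the Weibull scale parameter, and the survreg scale is the reciprocal of the Weibull shape parameter.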
In another example, we fit a parametric survival model with a Weibull distribution for time to death, using a strata term to fit a different shape parameter for each gender.
> l.survreg <- survreg(l.Surv ~ ph.ecog + age + strata(sex), lung)
> print(l.survreg)
Call:
survreg(formula = l.Surv ~ ph.ecog + age + strata(sex), data = lung)
Coefficients:
(Intercept) ph.ecog age
6.73234505 -0.32443043 -0.00580889
Scale:
sex=1 sex=2
0.7834211 0.6547830
Loglik(model)= -1137.3 Loglik(intercept only)= -1146.2
Chisq= 17.8 on 2 degrees of freedom, p= 0.00014
n=227 (1 observation deleted due to missingness)
> summary(l.survreg)
Call:
survreg(formula = l.Surv ~ ph.ecog + age + strata(sex), data = lung)
               Value Std. Error      z        p
(Intercept)  6.73235    0.42396 15.880 8.75e-57
ph.ecog     -0.32443    0.08649 -3.751 1.76e-04
age         -0.00581    0.00693 -0.838 4.02e-01
sex=1       -0.24408    0.07920 -3.082 2.06e-03
sex=2       -0.42345    0.10669 -3.969 7.22e-05
Scale:
sex=1 sex=2
0.783 0.655
Weibull distribution
Loglik(model)= -1137.3 Loglik(intercept only)= -1146.2
Chisq= 17.8 on 2 degrees of freedom, p= 0.00014
Number of Newton-Raphson Iterations: 5
n=227 (1 observation deleted due to missingness)
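The sex=1 and sex=2 rows of the coefficient table are on the log(scale) metric, which is why they differ from the values printed under Scale. A small base R sketch, using the numbers copied from the summary above, recovers the per-stratum scale and the corresponding Weibull shape (taken here as 1/scale). The variable names are illustrative only.

```r
# Log(scale) estimates for each stratum, copied from the summary.
log_scale <- c("sex=1" = -0.24408, "sex=2" = -0.42345)

# Back-transform to the scale reported under "Scale:".
scale <- exp(log_scale)

# Weibull shape parameter for each stratum.
shape <- 1 / scale

print(round(rbind(scale, shape), 3))
```

The second stratum has the smaller scale and therefore the larger shape parameter.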