Togaware DATA MINING
Desktop Survival Guide
by Graham Williams
Google

Decision Tree

The rpart function can also be used with a Surv object as the target. Here we build a decision tree.



> library(rpart)
> l.rpart <- rpart(l.Surv ~ age + sex + ph.ecog + ph.karno + pat.karno + 
                   meal.cal + wt.loss, data=lung)
> l.rpart



n= 228 

node), split, n, deviance, yval
      * denotes terminal node

 1) root 228 307.928100 1.0000000  
   2) ph.ecog< 1.5 176 223.146500 0.8607050  
     4) sex>=1.5 69  79.663710 0.6067812  
       8) pat.karno>=85 41  38.552160 0.4954413 *
       9) pat.karno< 85 28  38.961790 0.7912681  
        18) age< 52.5 7   8.297501 0.3610969 *
        19) age>=52.5 21  26.638230 1.0017890 *
     5) sex< 1.5 107 134.504400 1.0669060  
      10) pat.karno>=65 98 123.426900 1.0085900  
        20) age< 64.5 53  63.983660 0.8433300  
          40) ph.ecog< 0.5 20  22.029470 0.6019784 *
          41) ph.ecog>=0.5 33  38.488990 1.0838490 *
        21) age>=64.5 45  56.194870 1.2707780  
          42) wt.loss< 7.5 21  22.817280 0.9491270 *
          43) wt.loss>=7.5 24  29.796110 1.7206300 *
      11) pat.karno< 65 9   7.162132 2.0298660 *
   3) ph.ecog>=1.5 52  70.309440 1.7041700  
     6) wt.loss>=21 15  18.094340 0.8363971 *
     7) wt.loss< 21 37  38.904150 2.5294840  
      14) sex>=1.5 16   8.883448 1.8424400 *
      15) sex< 1.5 21  27.485660 3.1026240  
        30) age< 59 7   4.587466 1.7378480 *
        31) age>=59 14  17.993450 4.3012260 *

The summary command will provide fuller details of the resulting model.



> summary(l.rpart)

The usual predict can be used to obtain the predictions of the estimated rate. Todo: What is actually being predicted here? These are compared to the same predictions from the coxph model.



> l.rpart.pred <- predict(l.rpart, lung)
> results <- cbind(l.coxph.age.results, rpart=l.rpart.pred)
> head(results)



  time status        lp     risk  expected       sex
1  306      2 0.2096146 1.233203 0.8280030 0.2096146
2  455      2 0.2096146 1.233203 1.3880051 0.2096146
3 1010      1 0.2096146 1.233203 3.5060567 0.2096146
4  210      2 0.2096146 1.233203 0.5185439 0.2096146
5  883      2 0.2096146 1.233203 3.5060567 0.2096146
6 1022      1 0.2096146 1.233203 3.5060567 0.2096146
      rpart
1 0.9491270
2 1.7206297
3 0.6019784
4 2.0298656
5 0.6019784
6 0.9491270

Also, we can generate a plot:



> plot(l.rpart)
> text(l.rpart, use.n=TRUE)

Image dmsurvivor-r:survival:rpart_lung_plot



Copyright © 2004-2010 Togaware Pty Ltd
Support further development through the purchase of the PDF version of the book.
The PDF version is a formatted comprehensive draft book (with over 800 pages).
Brought to you by Togaware. This page generated: Sunday, 22 August 2010