The rpart result contains only a root node

user3172776 · Jan 8, 2014

In my dataset, Leakage takes two values, 1 and 0. Only about 300 rows have the value 1; the rest of the 569378 rows are 0. I suspect this imbalance is why the rpart result contains only a root node.

How can I solve this?

fm.pipe <- Leakage ~ PipeAge + PipePressure

> printcp(CART.fit)

Regression tree:
rpart(formula = fm.pipe, data = Data)

Variables actually used in tree construction:
character(0)

Root node error: 299.84/569378 = 0.00052661

n= 569378 

         CP nsplit rel error xerror xstd
1 0.0033246      0         1      0    0

Answer

Jean V. Adams · Jan 8, 2014

There may not be a way to "solve" this, if the independent variables do not provide enough information to grow the tree. See, for example, the help for rpart.control: "Any split that does not decrease the overall lack of fit by a factor of cp is not attempted." You could try loosening the control parameters, but there's no guarantee that will result in the tree growing beyond a root.

library(rpart)
CART.fit <- rpart(formula = fm.pipe, data = Data, control = rpart.control(minsplit = 2, minbucket = 1, cp = 0.001))
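One more thing that may be worth trying: the printcp output says "Regression tree", which means rpart treated the numeric 0/1 Leakage as a continuous response (method "anova"), with the default cp of 0.01. The sketch below converts the response to a factor so a classification tree is fitted, and sets equal class priors so the rare class is not swamped. It is only an illustration under stated assumptions: the data are simulated, the relationship between PipeAge and Leakage is invented, and the prior and cp values are arbitrary choices, not recommendations. Whether the tree grows on the real data still depends on how informative the predictors are.

```r
library(rpart)

set.seed(1)
n <- 20000

# Simulated stand-in for the real pipe data (hypothetical values)
Data <- data.frame(
  PipeAge      = runif(n, 0, 60),
  PipePressure = runif(n, 2, 10)
)

# Assumed relationship: leaks are rare and become more likely with age
p <- plogis(-9 + 0.08 * Data$PipeAge)
Data$Leakage <- rbinom(n, 1, p)

fm.pipe <- Leakage ~ PipeAge + PipePressure

# Numeric response: rpart defaults to method = "anova" (a regression tree)
fit.default <- rpart(fm.pipe, data = Data)

# Factor response: classification tree, with equal priors for the two
# classes and a smaller complexity parameter than the default 0.01
Data.class <- transform(Data, Leakage = factor(Leakage))
fit.class <- rpart(
  fm.pipe,
  data    = Data.class,
  method  = "class",
  parms   = list(prior = c(0.5, 0.5)),
  control = rpart.control(cp = 0.001)
)

printcp(fit.class)
```

Comparing printcp(fit.default) with printcp(fit.class) shows whether changing the method and priors makes any difference for a given dataset. Note that equal priors change the predicted class probabilities, so they should not be read as actual leak rates.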