How to interpret lm() coefficient estimates when using bs() function for splines

Question 1

r regression lm spline bspline

PDG · May 21, 2016 · Viewed 9k times · Source

Answer

I would expect a -1 coefficient for the first part and a +1 coefficient for the second part.

I think your question is really about what a B-spline function is. To understand the meaning of the coefficients, you need to know what the basis functions of your spline are. See the following:

library(splines)
x <- seq(-5, 5, length = 100)
b <- bs(x, degree = 1, knots = 0)  ## returns a basis matrix
str(b)  ## check structure
b1 <- b[, 1]  ## basis 1
b2 <- b[, 2]  ## basis 2
par(mfrow = c(1, 2))
plot(x, b1, type = "l", main = "basis 1: b1")
plot(x, b2, type = "l", main = "basis 2: b2")

Note:

B-splines of degree 1 are tent functions, as you can see from b1;
B-splines of degree 1 are scaled so that their values lie in [0, 1];
a knot of a B-spline of degree 1 is where it bends;
B-splines of degree 1 have compact support: each one is non-zero over (no more than) three adjacent knots.
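To make these notes concrete, here is a small check that the two columns returned by bs() agree with hand-written tent functions. The closed forms b1_manual and b2_manual are my own illustration for knots at -5, 0 and 5; they are not part of the splines package.

```r
library(splines)

x <- seq(-5, 5, length.out = 101)
b <- bs(x, degree = 1, knots = 0)

# Hand-written tent functions for knots at -5, 0, 5 (illustrative names)
b1_manual <- pmax(0, 1 - abs(x) / 5)   # rises to 1 at the knot x = 0, then falls
b2_manual <- pmax(0, x / 5)            # zero on [-5, 0], rises to 1 at x = 5

all.equal(as.numeric(b[, 1]), b1_manual)   # compare with bs() column 1
all.equal(as.numeric(b[, 2]), b2_manual)   # compare with bs() column 2
range(b)                                   # values stay within [0, 1]
```

Note that bs() drops the first basis function by default (intercept = FALSE), which is why only two columns come back here.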

You can get the (recursive) expression of B-splines from Definition of B-spline. A B-spline of degree 0 is the most basic class, while

a B-spline of degree 1 is a linear combination of B-splines of degree 0
a B-spline of degree 2 is a linear combination of B-splines of degree 1
a B-spline of degree 3 is a linear combination of B-splines of degree 2
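If you want to see the recursion at work, below is a rough sketch of the standard Cox-de Boor recursion in R. The function bspline_rec and the knot vector kn are hypothetical names I made up for this illustration; bs() does not compute its basis this way internally as far as I know. The degree-0 pieces use half-open intervals, so the right endpoint x = 5 is left out of the comparison.

```r
library(splines)

# Cox-de Boor recursion: i-th B-spline of degree d on knot vector kn
# (hypothetical helper, written only for illustration)
bspline_rec <- function(x, i, d, kn) {
  if (d == 0) {
    # degree 0: indicator function of the half-open interval [kn[i], kn[i + 1])
    return(as.numeric(kn[i] <= x & x < kn[i + 1]))
  }
  # weights are set to 0 when repeated knots would give a zero denominator
  w1 <- if (kn[i + d] > kn[i]) (x - kn[i]) / (kn[i + d] - kn[i]) else 0
  w2 <- if (kn[i + d + 1] > kn[i + 1]) {
    (kn[i + d + 1] - x) / (kn[i + d + 1] - kn[i + 1])
  } else 0
  w1 * bspline_rec(x, i, d - 1, kn) + w2 * bspline_rec(x, i + 1, d - 1, kn)
}

kn <- c(-5, -5, 0, 5, 5)        # boundary knots repeated, interior knot at 0
x  <- seq(-5, 4.9, by = 0.1)    # stop short of the right boundary

tent <- bspline_rec(x, 2, 1, kn)   # should match b1
ramp <- bspline_rec(x, 3, 1, kn)   # should match b2

B <- bs(x, degree = 1, knots = 0, Boundary.knots = c(-5, 5))
all.equal(tent, as.numeric(B[, 1]))
all.equal(ramp, as.numeric(B[, 2]))
```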

(Sorry, I was getting off-topic...)

Your linear regression using B-splines:

y ~ bs(x, degree = 1, knots = 0)

is just doing:

y ~ b1 + b2
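You can confirm this equivalence yourself. The sketch below uses simulated noisy V-shaped data (my assumption, since the original data were not posted) and fits the model both ways; only the coefficient names should differ.

```r
library(splines)

set.seed(1)                                   # hypothetical data, not the asker's
x <- seq(-5, 5, length.out = 11)
y <- abs(x) + rnorm(length(x), sd = 0.3)      # noisy symmetric V

b  <- bs(x, degree = 1, knots = 0)
b1 <- b[, 1]
b2 <- b[, 2]

fit_bs     <- lm(y ~ bs(x, degree = 1, knots = 0))   # spline term in the formula
fit_manual <- lm(y ~ b1 + b2)                        # basis columns as regressors

# Side-by-side estimates: intercept, b1, b2
cbind(bs = unname(coef(fit_bs)), manual = unname(coef(fit_manual)))
all.equal(unname(coef(fit_bs)), unname(coef(fit_manual)))
```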

Now you should be able to understand what the coefficients you get mean: on top of the intercept, the spline function is:

-5.12079 * b1 - 0.05545 * b2

In the summary table:

Coefficients:
                                 Estimate Std. Error t value Pr(>|t|)  
(Intercept)                       4.93821    0.16117  30.639 1.40e-09 ***
bs(x, degree = 1, knots = c(0))1 -5.12079    0.24026 -21.313 2.47e-08 ***
bs(x, degree = 1, knots = c(0))2 -0.05545    0.21701  -0.256    0.805

You might wonder why the coefficient of b2 is not significant. Well, compare your y and b1: your y is a symmetric V shape, while b1 is an inverted symmetric V shape. If you multiply b1 by -1, scale it by 5 (this explains the coefficient of about -5 for b1), and shift it up by the intercept of about 5, what do you get? A good match, right? So there is no need for b2.

However, if your y is asymmetric, running through (-5,5) to (0,0) and then to (5,10), you will notice that the coefficients for b1 and b2 are both significant. I think the other answer already gave you such an example.
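Here is a quick sketch of that asymmetric case with simulated data (again hypothetical, with a noise level I picked arbitrarily). Without noise this shape is exactly 5 - 5 * b1 + 5 * b2, so the estimates should land near 5, -5 and 5, and the b2 term is now needed.

```r
library(splines)

set.seed(42)                                   # hypothetical data, not the asker's
x <- seq(-5, 5, length.out = 101)
y_true <- ifelse(x < 0, -x, 2 * x)             # (-5,5) -> (0,0) -> (5,10)
y <- y_true + rnorm(length(x), sd = 0.2)

fit <- lm(y ~ bs(x, degree = 1, knots = 0))

# Estimate, Std. Error, t value, Pr(>|t|) for intercept, b1, b2
round(coef(summary(fit)), 4)
```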

Reparametrization of fitted B-spline to piecewise polynomial is demonstrated here: Reparametrize fitted regression spline as piece-wise polynomials and export polynomial coefficients.
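For this particular degree-1 fit you can do the conversion by hand. On [-5, 0] we have b1 = (x + 5) / 5 and b2 = 0; on [0, 5] we have b1 = 1 - x / 5 and b2 = x / 5. The sketch below plugs in the estimates from the summary table; the variable names are mine.

```r
# Estimates copied from the summary table above
a  <- 4.93821     # (Intercept)
c1 <- -5.12079    # coefficient of b1
c2 <- -0.05545    # coefficient of b2

h <- 5            # distance from each boundary knot (-5 and 5) to the knot at 0

slope_left  <- c1 / h          # slope of the fit on [-5, 0]
slope_right <- (c2 - c1) / h   # slope of the fit on [0, 5]
value_at_0  <- a + c1          # fitted value at the knot, where b1 = 1 and b2 = 0

round(c(slope_left = slope_left, slope_right = slope_right,
        value_at_0 = value_at_0), 5)
```

The two slopes come out at about -1.02 and +1.01, which are the -1 and +1 you expected. They also agree with the manual fit quoted in the question: its x coefficient is -1.02416, x plus z gives -1.02416 + 2.03723 = 1.01307, and its intercept is -0.18258.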

Question 2

I'm using a set of points which go from (-5,5) to (0,0) and (5,5) in a "symmetric V-shape". I'm fitting a model with lm() and the bs() function to fit a "V-shape" spline:

lm(formula = y ~ bs(x, degree = 1, knots = c(0)))

I get the "V-shape" when I predict outcomes by predict() and draw the prediction line. But when I look at the model estimates coef(), I see estimates that I don't expect.

Coefficients:
                                 Estimate Std. Error t value Pr(>|t|)  
(Intercept)                       4.93821    0.16117  30.639 1.40e-09 ***
bs(x, degree = 1, knots = c(0))1 -5.12079    0.24026 -21.313 2.47e-08 ***
bs(x, degree = 1, knots = c(0))2 -0.05545    0.21701  -0.256    0.805

I would expect a -1 coefficient for the first part and a +1 coefficient for the second part. Must I interpret the estimates in a different way?

If I put the knot into the lm() model manually, then I get these coefficients:

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.18258    0.13558  -1.347    0.215    
x           -1.02416    0.04805 -21.313 2.47e-08 ***
z            2.03723    0.08575  23.759 1.05e-08 ***

That's more like it. Relative to x, the change at z (the knot) gives a slope of about +1.

I want to understand how to interpret the bs() result. I've checked: the predictions from the manual model and the bs() model are exactly the same.
