sm.density.compare(): displaying multiple density estimations in a single plot

Mike C picture Mike C · Jun 22, 2016 · Viewed 8.9k times · Source

I am trying to overlay three different density plots in R to create one plot displaying all three lines (superimpose). I have the sm package installed/loaded but I have tried using it with my data to no avail. I created three individual data plots simply using density() and plotting the values. My code looks like this:

library(sm)

set.seed(0)
x <- rnorm(100, 0, 1)
y <- rnorm(126, 0.3, 1.2)
z <- rnorm(93, -0.5, 0.7)
dx <- density(x)
dy <- density(y)
dz <- density(z)

plot(dx)
plot(dy)
plot(dz)

But when I try using the sm.density.compare() to overlay the graphs:

sm.density.compare(dx,dy,model="equal")

I get an error that says:

Error in sm.density.compare(dx, dy, model = "equal") :
sm.density.compare can handle only 1-d data trace:

Anyone know how I can fix this? I've researched quite a lot but with no success. I am fairly new to R and could really use the help.

Answer

李哲源 picture 李哲源 · Jun 23, 2016

If you want to use sm.density.compare(), then do not use density().

sm.density.compare() itself is doing density estimation. Specifically, it is doing density estimation on grouped data, so that you can plot density of different groups on the same graph.

Here is what you really need to do:

## three groups, each of length: length(x), length(y), length(z)
group.index <- rep(1:3, c(length(x), length(y), length(z)))
## collect data together and use sm.density.compare()
den <- sm.density.compare(c(x,y,z), group = group.index, model = "equal")
## plot will be generated automatically

den3

When using model = "equal", sm.density.compare() has returned values. Have a look at str(den):

List of 4
 $ p          : num 0
 $ estimaate  : num [1:3, 1:50] 2.37e-07 3.81e-06 6.06e-10 2.17e-06 2.26e-05 ...
 $ eval.points: num [1:50] -4.12 -3.94 -3.76 -3.58 -3.4 ...
 $ h          : num 0.376

h contains bandwidth used for all density estimation, eval.points contains estimation points, while estimaate is a matrix, of density estimation values. (Adrian has a typo here, it should be "estimate", not "estimaate", LOL).

All functions from sm package, starting with prefix sm. accept optional arguments ..., passed to sm.options. Have a read on ?sm.options, and you will find you have full control on the colour display, line type and line width, bandwidth selection method, etc.

Reference band will only be added to data of two groups. I.e., for pairwise comparison, sm.density.compare() can do more. For example:

den2 <- sm.density.compare(c(x,y), group = rep(1:2, c(length(x), length(y))),
                           model = "equal")

den2

> str(den2)
List of 6
 $ p          : num 0.22
 $ estimate   : num [1:2, 1:50] 4.92e-06 2.70e-05 2.51e-05 1.00e-04 1.09e-04 ...
 $ eval.points: num [1:50] -4.12 -3.94 -3.76 -3.58 -3.4 ...
 $ upper      : num [1:50] 0.00328 0.00373 0.00459 0.00614 0.00886 ...
 $ lower      : num [1:50] 0 0 0 0 0 ...
 $ h          : num 0.44

where lower and upper gives the bound of reference band / confidence region.


If you use density(), then do not use sm.density.compare()

## set universal estimation range
xlim <- range(x, y, z)
dx <- density(x, from = xlim[1], to = xlim[2], n = 200)
dy <- density(y, from = xlim[1], to = xlim[2], n = 200)
dz <- density(z, from = xlim[1], to = xlim[2], n = 200)

In this situation, density estimation for each group is done independently. Each "density" object is a list, for example:

> str(dx)
List of 7
 $ x        : num [1:200] -2.64 -2.61 -2.58 -2.55 -2.52 ...
 $ y        : num [1:200] 0.023 0.026 0.0291 0.0323 0.0356 ...
 $ bw       : num 0.31
 $ n        : int 100
 $ call     : language density.default(x = x, n = 200, from = xlim[1], to = xlim[2])
 $ data.name: chr "x"
 $ has.na   : logi FALSE
 - attr(*, "class")= chr "density"

x is evaluation points, y is estimated density, bw is bandwidth used. You will see, that dx$bw, dy$bw and dz$bw are different, due to independent estimation. However, you can manually specify a universal bw when calling density(), by using argument bw. See ?density, and I will give no example here.

Now, to overlay those density plot, you need to do yourself.

## set global plotting range
ylim <- range(dx$y, dy$y, dz$y)
## make plot
plot(dx$x, dx$y, col = 1, lwd = 2, type = "l", xlim = xlim, ylim = ylim)
lines(dy$x, dy$y, col = 2, lwd = 2)
lines(dz$x, dz$y, col = 3, lwd = 2)

do it yourself