Correlation Corrplot Configuration

Corintho picture Corintho · Sep 25, 2013 · Viewed 12.5k times · Source

I am newbie in R scripts :-)

I need build a correlation matrix and I´am trying to configurate some parameters to adapt the graph. I am using the corrplot package.

I Built a corrplot matrix this way:

corrplot(cor(d1[,2:14], d1[,2:14]), method=c("color"),
         bg = "white", addgrid.col = "gray50", 
         tl.cex=1, type="lower", tl.col = "black", 
         col = colorRampPalette(c("red","white","blue"))(100))

I need show the values of correlation in the lower matrix inside the color matrix that I built. How i can do that?

Is it possible exclude the main diagonal from the lower matrix? In this diagonl always we have the perfect correlation.

The other doubt - I want to show the significant values for the correlation using stars instead of squares. like (*, , *). Is it possible?

Can you help me guys?

Answer

blmoore picture blmoore · Sep 25, 2013

With a bit of hackery you can do this in a very similar R package, corrgram. This one allows you to easily define your own panel functions, and helpfully makes theirs easy to view as templates. Here's the some code and figure produced:

set.seed(42)
library(corrgram)

# This panel adds significance starts, or NS for not significant
panel.signif <-  function (x, y, corr = NULL, col.regions, digits = 2, cex.cor, 
                           ...) {
  usr <- par("usr")
  on.exit(par(usr))
  par(usr = c(0, 1, 0, 1))
  results <- cor.test(x, y, alternative = "two.sided")
  est <- results$p.value
  stars <- ifelse(est < 5e-4, "***", 
                  ifelse(est < 5e-3, "**", 
                         ifelse(est < 5e-2, "*", "NS")))
  cex.cor <- 0.4/strwidth(stars)
  text(0.5, 0.5, stars, cex = cex.cor)
}

# This panel combines edits the "shade" panel from the package
# to overlay the correlation value as requested
panel.shadeNtext <- function (x, y, corr = NULL, col.regions, ...) 
{
  if (is.null(corr)) 
    corr <- cor(x, y, use = "pair")
  ncol <- 14
  pal <- col.regions(ncol)
  col.ind <- as.numeric(cut(corr, breaks = seq(from = -1, to = 1, 
                                               length = ncol + 1), include.lowest = TRUE))
  usr <- par("usr")
  rect(usr[1], usr[3], usr[2], usr[4], col = pal[col.ind], 
       border = NA)
  box(col = "lightgray")
  on.exit(par(usr))
  par(usr = c(0, 1, 0, 1))
  r <- formatC(corr, digits = 2, format = "f")
  cex.cor <- .8/strwidth("-X.xx")
  text(0.5, 0.5, r, cex = cex.cor)
}

# Generate some sample data
sample.data <- matrix(rnorm(100), ncol=10)

# Call the corrgram function with the new panel functions
# NB: call on the data, not the correlation matrix
corrgram(sample.data, type="data", lower.panel=panel.shadeNtext, 
         upper.panel=panel.signif)

enter image description here

The code isn't very clean, as it's mostly patched together functions from the package, but it should give you a good start to get the plot you want. Possibly you can take a similar approach with the corrplot package too.

update: Here's a version with stars and cor on the same triangle:

panel.shadeNtext <- function (x, y, corr = NULL, col.regions, ...) 
{
  corr <- cor(x, y, use = "pair")
  results <- cor.test(x, y, alternative = "two.sided")
  est <- results$p.value
  stars <- ifelse(est < 5e-4, "***", 
                  ifelse(est < 5e-3, "**", 
                         ifelse(est < 5e-2, "*", "")))
  ncol <- 14
  pal <- col.regions(ncol)
  col.ind <- as.numeric(cut(corr, breaks = seq(from = -1, to = 1, 
                                               length = ncol + 1), include.lowest = TRUE))
  usr <- par("usr")
  rect(usr[1], usr[3], usr[2], usr[4], col = pal[col.ind], 
       border = NA)
  box(col = "lightgray")
  on.exit(par(usr))
  par(usr = c(0, 1, 0, 1))
  r <- formatC(corr, digits = 2, format = "f")
  cex.cor <- .8/strwidth("-X.xx")
  fonts <- ifelse(stars != "", 2,1)
  # option 1: stars:
  text(0.5, 0.4, paste0(r,"\n", stars), cex = cex.cor)
  # option 2: bolding:
  #text(0.5, 0.5, r, cex = cex.cor, font=fonts)
}

# Generate some sample data
sample.data <- matrix(rnorm(100), ncol=10)

# Call the corrgram function with the new panel functions
# NB: call on the data, not the correlation matrix
corrgram(sample.data, type="data", lower.panel=panel.shadeNtext, 
         upper.panel=NULL)

enter image description here

Also commented out is another way of showing significance, it'll bold those below a threshold rather than using stars. Might be clearer that way, depending on what you want to show.