Circular Stacked Bar Plot in R

ben picture ben · Jul 21, 2014 · Viewed 6.9k times · Source

I came across this awesome and relatively straightforward package seen here that can create beautiful normalized stacked bar plots in polar form like so. I hope to create a similar plot but that is not normalized and can instead have raw values as input.

On his blog, he indicates that someone made a un-normalized version of his code that can produce plots like this: enter image description here

This is almost exactly what I need but I can't figure out how to stack the bars to produce a graph like so (sorry for quality): enter image description here

Here is some toy data that is a subset of the real data I will use and that follows his input format:

family  item    score   value
Group 1 Disease 1   Genetics    1
Group 1 Disease 1   EMR 8
Group 1 Disease 1   Pubmed  10
Group 1 Disease 2   Genetics    1
Group 1 Disease 2   EMR 21
Group 1 Disease 2   Pubmed  4
Group 1 Disease 3   Genetics    0
Group 1 Disease 3   EMR 2
Group 1 Disease 3   Pubmed  0
Group 2 Disease 4   Genetics    4
Group 2 Disease 4   EMR 72
Group 2 Disease 4   Pubmed  16
Group 3 Disease 5   Genetics    2
Group 3 Disease 5   EMR 19
Group 3 Disease 5   Pubmed  7
Group 3 Disease 6   Genetics    2
Group 3 Disease 6   EMR 12
Group 3 Disease 6   Pubmed  6
Group 4 Disease 7   Genetics    0
Group 4 Disease 7   EMR 11
Group 4 Disease 7   Pubmed  4

A direct link to the package code that he openly provides can be found here.

Thanks so much, Ben

edit: Here is what I've tried-

I went into the code and replaced:

# histograms
p<-ggplot(df)+geom_rect(
        aes(
            xmin=xmin,
            xmax=xmax,
            ymin=ymin,
            ymax=ymax,
            fill=score)
        )

with:

# histograms
p<-ggplot(df)+ 
        geom_bar(stat="identity", position="stack", aes(x=item, y=value,fill=score))

I did this because as far as I’ve been able to understand there is no easy way of generating a stacked bar using geom_rect and when I try the above code outwith the context of the polarBarChart script it will plot stacked bar plots, but starting from the center generating out instead of coming from the out going in. Additionally, when I use this piece of code within the polarBarChart script I get the following error:

“Error: Discrete value supplied to continuous scale” 

and no output

Answer

Christie Haskell Marsh picture Christie Haskell Marsh · Jul 25, 2014

In order to make this work you do have to use geom_rect(). It just isn't possible to modify geom_bar() to do what you need as a polar geom_bar() creates a rose plot. Therefore, in order to have the data plotted inwards rather than outwards, geom_rect() is the only option (that I'm aware of for ggplot2).

I'll highlight the changes that I made first, display the plot, and then at the end I'll include the entire function as modified.

I modified the block of code that computes xmin, xmax, ymin, and ymax as follows:

xmin was:

xmin <- (indexScore - 1) * (binSize + spaceBar) + (indexItem - 1) * (spaceItem + M * (binSize + spaceBar)) + (indexFamily - 1) * (spaceFamily - spaceItem)

xmin is now:

xmin <- (binSize + spaceBar) + (indexItem - 1) * (spaceItem + (binSize + spaceBar)) + (indexFamily - 1) * (spaceFamily - spaceItem)

I removed (indexScore-1) * and M * as these are what position the bars for each score next to each other. In each item we want them at the same x location.

ymin was:

ymin <- affine(1)

ymin is now:

df<-df[with(df, order(family,item,value)), ] df<-ddply(df,.(item),mutate,ymin=c(1,ymax[1:(length(ymax)-1)]))

We want the ymin for each bar in each item to start at the ymax of the bar that is before it. In order to accomplish this I first ordered the data frame so that in each item the order of the values is from lowest to highest. Then, for each item, I set ymin to 1 for the lowest value, and then to the ymax of the previous bar for all other values.

I also made some ascetic changes. In the family labels section I changed y=1.2 to y=1.7 because your item labels are long so the family labels were consequently on top of them. I also added hjust=0.5 to center them and vjust=0 so they aren't so close to the item labels. As a result, this line:

p<-p+ylim(0,outerRadius+0.2)

Is now:

p<-p+ylim(0,outerRadius+0.7)

So the labels fit within the plot region.

Lastly, this line:

familyLabelsDF<-aggregate(xmin~family,data=df,FUN=function(s) mean(s+binSize))

is now:

familyLabelsDF<-aggregate(xmin~family,data=df,FUN=function(s) mean(s+binSize/2))

This makes it so the family labels are centered in each group.

Here is what it looks like:

enter image description here

And here is the entire function (latest version see GitHub):

## =============================================================================
## Polar BarChart
## Original Polar Histogram by Christophe Ladroue
## Source: http://chrisladroue.com/2012/02/polar-histogram-pretty-and-useful/
## Modified from original by Christos Hatzis 3.22.2012 (CH)
## Modified from modified by Christie Haskell 7.25.2014 (CHR)
## =============================================================================
polarBarChart <-
  function(
    df,
    binSize=1,
    spaceBar=0.05,
    spaceItem=0.2,
    spaceFamily=1.2,
    innerRadius=0.3,
    outerRadius=1,
    nguides=3,
    guides=pretty(range(c(0, df$value)), n=nguides, min.n=2),
    alphaStart=-0.3,
    circleProportion=0.8,
    direction="inwards",
    familyLabels=TRUE,
    itemSize=3,
    legLabels=NULL,
    legTitle="Source"){

    require(ggplot2)
    require(plyr)

    # ordering
    df<-arrange(df,family,item,score)

    # family and item indices
    df$indexFamily <- as.integer(factor(df$family))
    df$indexItem <- with(df, as.integer(factor(item, levels=item[!duplicated(item)])))        
    df$indexScore <- as.integer(factor(df$score))

    df<-arrange(df,family,item,score)

    # define the bins

    vMax <- max(df$value)

    guides <- guides[guides < vMax]
    df$value <- df$value/vMax

    # linear projection  
    affine<-switch(direction,
                   'inwards'= function(y) (outerRadius-innerRadius)*y+innerRadius,
                   'outwards'=function(y) (outerRadius-innerRadius)*(1-y)+innerRadius,
                   stop(paste("Unknown direction")))

    df<-within(df, {
      xmin <- (binSize + spaceBar) + 
        (indexItem - 1) * (spaceItem + (binSize + spaceBar)) +
        (indexFamily - 1) * (spaceFamily - spaceItem)
      xmax <- xmin + binSize
      ymax <- affine(1 - value)
    }
    )

    df<-df[with(df, order(family,item,value)), ]
    df<-ddply(df,.(item),mutate,ymin=c(1,ymax[1:(length(ymax)-1)]))

    # build the guides
    guidesDF<-data.frame(
      xmin=rep(df$xmin,length(guides)),
      y=rep(guides/vMax,1,each=nrow(df)))

    guidesDF<-within(guidesDF,{
      xend<-xmin+binSize+spaceBar
      y<-affine(1-y)
    })


    # Building the ggplot object

    totalLength<-tail(df$xmin+binSize+spaceBar+spaceFamily,1)/circleProportion-0

    # histograms
    p<-ggplot(df)+geom_rect(
      aes(
        xmin=xmin,
        xmax=xmax,
        ymin=ymin,
        ymax=ymax,
        fill=score)
    )

    # guides  
    p<-p+geom_segment(
      aes(
        x=xmin,
        xend=xend,
        y=y,
        yend=y),
      colour="white",
      data=guidesDF)

    # label for guides
    guideLabels<-data.frame(
      x=0,
      y=affine(1-guides/vMax),
      label=guides
    )

    p<-p+geom_text(
      aes(x=x,y=y,label=label),
      data=guideLabels,
      angle=-alphaStart*180/pi,
      hjust=1,
      size=4)

    # item labels
    readableAngle<-function(x){
      angle<-x*(-360/totalLength)-alphaStart*180/pi+90
      angle+ifelse(sign(cos(angle*pi/180))+sign(sin(angle*pi/180))==-2,180,0)
    }
    readableJustification<-function(x){
      angle<-x*(-360/totalLength)-alphaStart*180/pi+90
      ifelse(sign(cos(angle*pi/180))+sign(sin(angle*pi/180))==-2,1,0)
    }

    dfItemLabels<-ddply(df,.(item),summarize,xmin=xmin[1])
    dfItemLabels<-within(dfItemLabels,{
      x <- xmin +  (binSize + spaceBar)/2
      angle <- readableAngle(xmin +  (binSize + spaceBar)/2)
      hjust <- readableJustification(xmin +  (binSize + spaceBar)/2)
    })

    p<-p+geom_text(
      aes(
        x=x,
        label=item,
        angle=angle,
        hjust=hjust),
      y=1.02,
      size=itemSize,
      vjust=0.5,
      data=dfItemLabels)

    # family labels
    if(familyLabels){
      #     familyLabelsDF<-ddply(df,.(family),summarise,x=mean(xmin+binSize),angle=mean(xmin+binSize)*(-360/totalLength)-alphaStart*180/pi)
      familyLabelsDF<-aggregate(xmin~family,data=df,FUN=function(s) mean(s+binSize/2))
      familyLabelsDF<-within(familyLabelsDF,{
        x<-xmin
        angle<-xmin*(-360/totalLength)-alphaStart*180/pi
      })

      p<-p+geom_text(
        aes(
          x=x,
          label=family,
          angle=angle),
        data=familyLabelsDF,
        hjust=0.5,
        vjust=0,
        y=1.7)
    }  

    # empty background and remove guide lines, ticks and labels
    p<-p+opts(
      panel.background=theme_blank(),
      axis.title.x=theme_blank(),
      axis.title.y=theme_blank(),
      panel.grid.major=theme_blank(),
      panel.grid.minor=theme_blank(),
      axis.text.x=theme_blank(),
      axis.text.y=theme_blank(),
      axis.ticks=theme_blank()
    )

    # x and y limits
    p<-p+xlim(0,tail(df$xmin+binSize+spaceFamily,1)/circleProportion)
    p<-p+ylim(0,outerRadius+0.7)

    # project to polar coordinates
    p<-p+coord_polar(start=alphaStart)

    # nice colour scale
    if(is.null(legLabels)) legLabels <- levels(df$score)
    names(legLabels) <- levels(df$score)
    p<-p+scale_fill_brewer(name=legTitle, palette='Set1',type='qual', labels=legLabels)

    p
  }