R barplot axis scaling

bontus picture bontus · Nov 27, 2011 · Viewed 28.8k times · Source

I want to plot a barplot of some data with some x-axis labels but so far I just keep running into the same problem, as the axis scaling is completely off limits and therefore my labels are wrongly positioned below the bars. The most simple example I can think of:

x = c(1:81)
barplot(x)
axis(side=1,at=c(0,20,40,60,80),labels=c(20,40,60,80,100))

As you can see, the x-axis does not stretch along the whole plot but stops somewhere in between. It seems to me as if the problem is quite simple, but I somehow I am not able to fix it and I could not find any solution so far :(

Any help is greatly appreciated.

Answer

Ben Bolker picture Ben Bolker · Nov 27, 2011

The problem is that barplot is really designed for plotting categorical, not numeric data, and as such it pretty much does its own thing in terms of setting up the horizontal axis scale. The main way to get around this is to recover the actual x-positions of the bar midpoints by saving the results of barplot to a variable, but as you can see below I haven't come up with an elegant way of doing what you want in base graphics. Maybe someone else can do better.

x = c(1:81)
b <- barplot(x)
## axis(side=1,at=c(0,20,40,60,80),labels=c(20,40,60,80,100))
head(b)

You can see here that the actual midpoint locations are 0.7, 1.9, 3.1, ... -- not 1, 2, 3 ...

This is pretty quick, if you don't want to extend the axis from 0 to 100:

b <- barplot(x)
axis(side=1,at=b[c(20,40,60,80)],labels=seq(20,80,by=20))

This is my best shot at doing it in base graphics:

b <- barplot(x,xlim=c(0,120))
bdiff <- diff(b)[1]
axis(side=1,at=c(b[1]-bdiff,b[c(20,40,60,80)],b[81]+19*bdiff),
              labels=seq(0,100,by=20))

You can try this, but the bars aren't as pretty:

plot(x,type="h",lwd=4,col="gray",xlim=c(0,100))

Or in ggplot:

library(ggplot2)
d <- data.frame(x=1:81)
ggplot(d,aes(x=x,y=x))+geom_bar(stat="identity",fill="lightblue",
                   colour="gray")+xlim(c(0,100))

Most statistical graphics nerds will tell you that graphing quantitative (x,y) data is better done with points or lines rather than bars (non-data-ink, Tufte, blah blah blah :-) )