Bubble Chart with ggplot2

Roman · Nov 5, 2014 · Viewed 34.3k times

I want to draw a bubble chart in R. The problem I run into is that both my x and my y axes are discrete, so many data points (bubbles) end up on the same coordinate. I would rather have them scattered around that coordinate, but still within a quadrant that makes clear which x/y combination each bubble belongs to.

I think it is best demonstrated by a little example. The following code should highlight the problem:

# Example data
library(ggplot2)
zz <- textConnection("Row PowerSource ProductSegment Price Model ManufacturingLocation Quantity
1 High SegmentA Low ModA LocationA 5000
2 Low SegmentB Low ModB LocationB 25000
3 High SegmentC Low ModC LocationC 15000
4 Low SegmentD High ModD LocationD 30000
5 High SegmentE High ModE LocationA 2500
6 Low SegmentA Low ModF LocationB 110000
7 High SegmentB Low ModG LocationC 20000
8 Low SegmentC Low ModH LocationD 3500
9 High SegmentD Low ModI LocationA 65500
10 Low SegmentE Low ModJ LocationB 145000
11 High SegmentA Low ModK LocationC 15000
12 Low SegmentB Low ModL LocationD 5000
13 High SegmentC Low ModM LocationA 26000
14 Low SegmentD Low ModN LocationB 14000
15 High SegmentE Mid ModO LocationC 75000
16 Low SegmentA High ModP LocationD 33000
17 High SegmentB Low ModQ LocationA 14000
18 Low SegmentC Mid ModR LocationB 33000
19 High SegmentD High ModS LocationC 95000
20 Low SegmentE Low ModT LocationD 4000
 ")
df2 <- read.table(zz, header= TRUE)
close(zz)
df2


ggplot(df2, aes(x = ManufacturingLocation, y = PowerSource, label = Model)) +
    geom_point(aes(size = Quantity, colour = Price)) + 
    geom_text(hjust = 1, size = 2) +
    scale_size(range = c(1,15)) +
    theme_bw()

How can I scatter the bubbles a little to show the different products in each category and their quantity?

(Apologies, I can't add an image at the moment because I have too little reputation.)

Answer

Docconcoct · Nov 6, 2014

As Tom Martens pointed out, adjusting alpha can reveal overlapping bubbles. The following alpha setting:

ggplot(df2, aes(x = ManufacturingLocation, y = PowerSource, label = Model)) +
    geom_point(aes(size = Quantity, colour = Price, alpha=.02)) + 
    geom_text(hjust = 1, size = 2) +
    scale_size(range = c(1,15)) +
    theme_bw()

results in:

[image]

Using geom_jitter instead of geom_point, combined with alpha:

ggplot(df2, aes(x = ManufacturingLocation, y = PowerSource, label = Model)) +
    geom_jitter(aes(size = Quantity, colour = Price, alpha=.02)) + 
    geom_text(hjust = 1, size = 2) +
    scale_size(range = c(1,15)) +
    theme_bw()

produces this:

[image]
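
If the default spread is too wide or too narrow, geom_jitter takes width and height arguments that cap the random offset. The following is a minimal sketch on a made-up data frame (toy and its values are placeholders, not the df2 from the question). It keeps every bubble within 0.25 axis units of its category position. Note that geom_text is left out here, because labels drawn without the same jitter would not sit on their bubbles.

```r
library(ggplot2)

# Small stand-in data set (hypothetical values, not the df2 from the question)
toy <- data.frame(
  ManufacturingLocation = rep(c("LocationA", "LocationB"), each = 4),
  PowerSource = rep(c("High", "Low"), times = 4),
  Quantity = c(5000, 25000, 15000, 30000, 2500, 110000, 20000, 3500)
)

# width/height cap the random offset at +/- 0.25 axis units, so every
# bubble stays inside the cell of its own x/y category
p_jit <- ggplot(toy, aes(x = ManufacturingLocation, y = PowerSource)) +
  geom_jitter(aes(size = Quantity), alpha = .2, width = 0.25, height = 0.25) +
  scale_size(range = c(1, 15)) +
  theme_bw()
p_jit
```

Discrete categories sit one unit apart on the axis, so any value below 0.5 keeps neighbouring cells from overlapping.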

EDIT: To avoid the artefact in the legend, alpha should be placed outside aes, where it is a fixed setting rather than a mapped variable:

ggplot(df2, aes(x = ManufacturingLocation, y = PowerSource, label = Model)) +
    geom_point(aes(size = Quantity, colour = Price),alpha=.2) +
    geom_text(hjust = 1, size = 2) +
    scale_size(range = c(1,15)) +
    theme_bw()

resulting in:

[image]

and:

ggplot(df2, aes(x = ManufacturingLocation, y = PowerSource, label = Model)) +
    geom_jitter(aes(size = Quantity, colour = Price),alpha=.2) +
    geom_text(hjust = 1, size = 2) +
    scale_size(range = c(1,15)) +
    theme_bw()

resulting in:

[image]
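
To see the difference between the two placements of alpha, one can compare what ggplot2 actually computes for each layer. This is only a sketch on a two-row made-up data frame (toy is a placeholder). It uses layer_data() to inspect the built layer: inside aes() the constant goes through the alpha scale, which is also what produces the legend entry, while outside aes() it is applied as given.

```r
library(ggplot2)

# Tiny stand-in data set (hypothetical values)
toy <- data.frame(x = c("A", "B"), y = c("High", "Low"), Quantity = c(10, 20))

# alpha inside aes(): treated as a data mapping, so it gets a scale
# (and with it a legend entry) and is rescaled before drawing
mapped <- ggplot(toy, aes(x, y)) +
  geom_point(aes(size = Quantity, alpha = .2))

# alpha outside aes(): a fixed setting of the layer, used as given
fixed <- ggplot(toy, aes(x, y)) +
  geom_point(aes(size = Quantity), alpha = .2)

layer_data(mapped, 1)$alpha  # value after passing through the alpha scale
layer_data(fixed, 1)$alpha   # the literal setting
```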

EDIT 2: So, this took a while to figure out.

I followed the example I linked to in my comment. I adjusted the code to suit your needs. First of all I created the jitter values outside of the plot:

df2$JitCoOr <- jitter(as.numeric(factor(df2$ManufacturingLocation)))
df2$JitCoOrPow <- jitter(as.numeric(factor(df2$PowerSource)))
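
As a rough illustration of what these two lines compute, here is the same idea on a short made-up vector (locations and the seed are placeholders). factor() assigns integer codes in alphabetical order of the levels, and jitter() with default arguments adds uniform noise of at most one fifth of the smallest gap between distinct values, which is 0.2 here.

```r
set.seed(1)  # arbitrary seed, only so the offsets are reproducible

locations <- c("LocationA", "LocationB", "LocationC", "LocationD", "LocationA")

# factor() orders levels alphabetically; as.numeric() returns the level codes
codes <- as.numeric(factor(locations))

# default jitter(): uniform noise within +/- (1/5 * smallest gap) = +/- 0.2
jittered <- jitter(codes)

codes
range(jittered - codes)

# the same alphabetical ordering applies to PowerSource: High is 1, Low is 2
levels(factor(c("Low", "High")))
```

Calling set.seed() before jitter() is optional, but without it the bubbles land in different places on every run.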

I then used those values as the x and y coordinates inside aes for both geom_point and geom_text. This jitters the bubbles and keeps each label on its bubble. However, it messed up the x and y axis labels, so I relabelled them. Because the jittered coordinates are numeric, the code below uses scale_x_continuous and scale_y_continuous rather than the discrete scales (recent ggplot2 versions reject numeric limits on a discrete scale). Note that factor() orders levels alphabetically, so 1 is High and 2 is Low. Here is the plot code:

ggplot(df2, aes(x = JitCoOr, y = JitCoOrPow)) +
    geom_point(aes(size = Quantity, colour = Price), alpha = .5) +
    geom_text(aes(label = Model)) +
    scale_size(range = c(1, 50)) +
    # limits extend 0.5 past the outer codes so jittered points are not dropped
    scale_y_continuous(breaks = 1:2, labels = c("High", "Low"), limits = c(0.5, 2.5)) +
    scale_x_continuous(breaks = 1:4, labels = c("Location A", "Location B", "Location C", "Location D"), limits = c(0.5, 4.5)) +
    labs(x = "ManufacturingLocation", y = "PowerSource") +
    theme_bw()

Which gives this output:

[image]

You can adjust the size of the bubbles via scale_size above. I exported this image with dimensions of 1000*800.
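
A different route, not the one used above, is to keep the discrete axes and share one seeded position_jitter() object between the point and text layers. This is a sketch under the assumption that the installed ggplot2 version supports the seed argument of position_jitter(); toy and the seed value 42 are placeholders. Because both layers reuse the same seed, they should receive the same offsets, so labels stay on their bubbles and the axis labels need no manual fixing.

```r
library(ggplot2)

# Small stand-in data set (hypothetical values, not the df2 from the question)
toy <- data.frame(
  ManufacturingLocation = rep(c("LocationA", "LocationB"), each = 3),
  PowerSource = rep(c("High", "Low", "High"), times = 2),
  Model = paste0("Mod", LETTERS[1:6]),
  Quantity = c(5000, 25000, 15000, 30000, 2500, 110000)
)

# One position object with a fixed seed (42 is arbitrary); reusing it makes
# both layers draw the same random offsets
jit <- position_jitter(width = 0.2, height = 0.2, seed = 42)

p_seed <- ggplot(toy, aes(x = ManufacturingLocation, y = PowerSource)) +
  geom_point(aes(size = Quantity), alpha = .5, position = jit) +
  geom_text(aes(label = Model), size = 2, position = jit) +
  scale_size(range = c(1, 15)) +
  theme_bw()
p_seed
```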

Regarding your request to add borders, I think it is unnecessary. It is very clear in this plot where the bubbles belong, and I think borders would make it look a bit ugly. However, if you still want borders, I'll have a look and see what I can do.