How to add a legend for two geom layers in one ggplot2 plot?

hpy · Oct 30, 2017

I've got a data frame that looks like this:

glimpse(spottingIntensityByMonth)
# Observations: 27
# Variables: 3
# $ yearMonth <dttm> 2015-05-01, 2015-06-01, 2015-07-01, 2015-08-01, 2015-09-01, 2015-10-01, 2...
# $ nClassificationsPerDayPerSpotter <dbl> 3.322581, 13.212500, 13.621701, 6.194700, 18.127778, 12.539589, 8.659722, ...
# $ nSpotters <int> 8, 8, 22, 28, 24, 22, 24, 27, 25, 29, 32, 32, 21, 14, 18, 13, 20, 19, 15, ...

I am trying to plot it with ggplot2 like so:

ggplot() + 
    geom_col(data = spottingIntensityByMonth, 
             mapping = aes(x = yearMonth, 
                           y = nClassificationsPerDayPerSpotter)
             ) + 
    xlab("Month of year") + 
    scale_y_continuous(name = "Daily classifications per Spotter") + 
    geom_line(data = spottingIntensityByMonth, 
              mapping = aes(x = yearMonth,
                            y = nSpotters)
              ) +
    theme_bw()

This produces a plot like so:

[Image: the resulting plot, columns plus a line, with no legend]

Now I want to add a legend that says what the line and columns mean. How do I do this? Thanks!

Answer

Z.Lin · Oct 31, 2017

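In ggplot2, legends are created automatically for aesthetics that are mapped inside aes(). Neither layer in your plot maps fill or color, so there is nothing to build a legend from. You can add such mappings as follows: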

ggplot(data = df, 
       mapping = aes(x = x)) + 

  # specify fill for bar / color for line inside aes(); you can use
  # whatever label you wish to appear in the legend
  geom_col(aes(y = y.bar, fill = "bar.label")) +
  geom_line(aes(y = y.line, color = "line.label")) +

  xlab("Month of year") + 
  scale_y_continuous(name = "Daily classifications per Spotter") + 

  # the labels must match what you specified above
  scale_fill_manual(name = "", values = c("bar.label" = "grey")) +
  scale_color_manual(name = "", values = c("line.label" = "black")) +

  theme_bw()

In the above example, I've also moved the data & common aesthetic mapping (x) to ggplot().

[Image: the resulting plot, now with a legend]
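
To see why the constants have to go inside aes(), here is a minimal sketch contrasting the two cases. It uses the same made-up data as the sample dataset in this answer; the object names p_set and p_mapped are hypothetical and only there for illustration.

```r
library(ggplot2)

# Made-up data, same shape as the sample dataset in this answer
set.seed(7)
df <- data.frame(
  x = 1:20,
  y.bar = rpois(20, lambda = 5),
  y.line = rpois(20, lambda = 10)
)

# Constants SET outside aes(): the layers are styled, but fill and color
# are not mapped aesthetics, so ggplot2 draws no legend for them
p_set <- ggplot(df, aes(x = x)) +
  geom_col(aes(y = y.bar), fill = "grey") +
  geom_line(aes(y = y.line), color = "black")

# Constants MAPPED inside aes(): each string is treated as a one-level
# variable, so each gets a scale and therefore a legend entry
p_mapped <- ggplot(df, aes(x = x)) +
  geom_col(aes(y = y.bar, fill = "bar.label")) +
  geom_line(aes(y = y.line, color = "line.label")) +
  scale_fill_manual(name = "", values = c("bar.label" = "grey")) +
  scale_color_manual(name = "", values = c("line.label" = "black"))
```

Printing p_set and p_mapped should give the same bars and line; only p_mapped should come with a legend.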

Sample dataset:

set.seed(7)
df <- data.frame(
  x = 1:20,
  y.bar = rpois(20, lambda = 5),
  y.line = rpois(20, lambda = 10)
)
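
Applied to the column names from the question, the same approach might look like the sketch below. The data frame here is a made-up stand-in (random values; only the column names and types follow the glimpse() output), and the legend labels are my own wording, so adjust both as needed.

```r
library(ggplot2)

# Hypothetical stand-in for spottingIntensityByMonth; the values are random
set.seed(7)
spottingIntensityByMonth <- data.frame(
  yearMonth = seq(as.POSIXct("2015-05-01", tz = "UTC"),
                  by = "month", length.out = 27),
  nClassificationsPerDayPerSpotter = runif(27, min = 3, max = 20),
  nSpotters = rpois(27, lambda = 20)
)

p <- ggplot(data = spottingIntensityByMonth,
            mapping = aes(x = yearMonth)) +

  # the strings mapped to fill / color become the legend labels
  geom_col(aes(y = nClassificationsPerDayPerSpotter,
               fill = "Classifications per day per spotter")) +
  geom_line(aes(y = nSpotters, color = "Number of spotters")) +

  xlab("Month of year") +
  scale_y_continuous(name = "Daily classifications per Spotter") +

  # the names in values must match the strings used in aes() above
  scale_fill_manual(name = "",
                    values = c("Classifications per day per spotter" = "grey")) +
  scale_color_manual(name = "",
                     values = c("Number of spotters" = "black")) +

  theme_bw()
```

Printing p draws the plot. Both layers share one y axis, so the axis title only describes the bars; the legend labels are what tell the bars and the line apart.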