I'm trying to plot the median values of some data on a density distribution using the ggplot2 R library. I would like to print the median values as text on top of the density plot.
You'll see what I mean with an example (using the "diamonds" default dataframe):
I'm printing three items: the density plot itself, a vertical line showing the median price of each cut, and a text label with that value. But, as you can see, the median prices overlap on the "y" axis (this aesthetic is mandatory in the geom_text() function).
Is there any way to dynamically assign a "y" value to each median price, so as to print them at different heights? For example, at the maximum density value of each "cut".
So far I've got this:
# load ggplot2 (it also provides the diamonds data set)
library(ggplot2)
# input dataframe
dia <- diamonds
# calculate the median of each numeric variable, per cut:
library(plyr)
dia_me <- ddply(dia, .(cut), numcolwise(median))
ggplot(dia, aes(x=price, y=..density.., color = cut, fill = cut), legend=TRUE) +
  labs(title="diamond price per cut") +
  geom_density(alpha = 0.2) +
  geom_vline(data=dia_me, aes(xintercept=price, colour=cut),
             linetype="dashed", size=0.5) +
  scale_x_log10() +
  geom_text(data = dia_me, aes(label = price, y=1, x=price))
(I'm assigning a constant value to the y aesthetic in the geom_text function because it's mandatory.)
This might be a start (though it's not very readable because of the colors). My idea was to add a 'y' position to the data frame that is used to draw the median lines. It's a bit arbitrary, but I wanted the y positions to lie between 0.2 and 1 so that they fit nicely on the plot, and I generated them with seq(). I then tried ordering them by median price, which didn't help much; that ordering is arbitrary too.
# scatter the y positions over the plot (evenly spaced between 0.2 and 1)
dia_me$y_pos <- seq(0.2, 1, length.out = nrow(dia_me))[order(dia_me$price, decreasing = TRUE)]
ggplot(dia, aes(x=price, y=..density.., color = cut, fill = cut), legend=TRUE) +
  labs(title="diamond price per cut") +
  geom_density(alpha = 0.2) +
  geom_vline(data=dia_me, aes(xintercept=price, colour=cut),
             linetype="dashed", size=0.5) +
  scale_x_log10() +
  geom_text(data = dia_me, aes(label = price, y=y_pos, x=price))
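If you would rather put each label at the peak of its own density curve, as the question suggests, one option is to estimate the densities yourself and take the maximum per cut. The following is only a rough sketch: the names dia_lab and y_pos are made up for illustration, and it assumes the density should be computed on log10(price), since scale_x_log10() transforms the data before geom_density() estimates the curve. stats::density() does not evaluate on exactly the same grid as geom_density(), so the heights should be close to the drawn peaks but may not match them exactly.

```r
library(ggplot2)

dia <- diamonds
cuts <- levels(dia$cut)

# One row per cut: the median price (label and x position) and the
# approximate peak height of the density of log10(price) (y position).
dia_lab <- data.frame(
  cut   = factor(cuts, levels = cuts, ordered = TRUE),
  price = sapply(cuts, function(k) median(dia$price[dia$cut == k])),
  y_pos = sapply(cuts, function(k) max(density(log10(dia$price[dia$cut == k]))$y))
)

p <- ggplot(dia, aes(x = price, color = cut, fill = cut)) +
  labs(title = "diamond price per cut") +
  geom_density(alpha = 0.2) +
  geom_vline(data = dia_lab, aes(xintercept = price, colour = cut),
             linetype = "dashed") +
  scale_x_log10() +
  # vjust nudges each label slightly above the height of its curve's peak
  geom_text(data = dia_lab, aes(label = price, x = price, y = y_pos),
            vjust = -0.5)
p
```

Note that the label sits at the median price on the x axis and at the peak height on the y axis, so it will not necessarily touch the curve; it just gives every cut its own height.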