Plotting summary statistics

Julio Diaz · Mar 7, 2011

For the following data set,

Genre   Amount
Comedy  10
Drama   30
Comedy  20
Action  20
Comedy  20
Drama   20

I want to construct a ggplot2 line graph where the x-axis is Genre and the y-axis is the sum of Amount within each Genre.

I have tried the following:

p = ggplot(test, aes(factor(Genre), Gross)) + geom_point()
p = ggplot(test, aes(factor(Genre), Gross)) + geom_line()
p = ggplot(test, aes(factor(Genre), sum(Gross))) + geom_line()

but to no avail.

Answer

juba · Mar 7, 2011

If you don't want to compute a new data frame before plotting, you can use stat_summary in ggplot2. For example, if your data set looks like this:

R> df <- data.frame(Genre=c("Comedy","Drama","Action","Comedy","Drama"),
R+                  Amount=c(10,30,40,10,20))
R> df
   Genre Amount
1 Comedy     10
2  Drama     30
3 Action     40
4 Comedy     10
5  Drama     20
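
As a reference for what the plot should show, the per-genre totals of this example can be computed directly. This is only a sketch using base R's aggregate, which is not part of the approach described here; the name totals is made up for illustration.

```r
# Same example data as above
df <- data.frame(Genre = c("Comedy", "Drama", "Action", "Comedy", "Drama"),
                 Amount = c(10, 30, 40, 10, 20))

# Sum Amount within each Genre; the result has one row per genre
# (Comedy: 10 + 10, Drama: 30 + 20, Action: 40)
totals <- aggregate(Amount ~ Genre, data = df, FUN = sum)
print(totals)
```

The summary-based plots should place one point per genre at these totals.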

You can use either qplot with a stat="summary" argument:

R> qplot(Genre, Amount, data=df, stat="summary", fun.y="sum")

Or add a stat_summary to a base ggplot graphic:

R> ggplot(df, aes(x=Genre, y=Amount)) + stat_summary(fun.y="sum", geom="point")
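
Since the question asks for a line graph, here is a minimal sketch of the same idea with a line geom. It assumes ggplot2 3.3.0 or later, where the fun.y argument was renamed to fun; on older versions, keep fun.y as shown above. The group = 1 aesthetic is an addition made here so that the line connects points across the discrete Genre axis.

```r
library(ggplot2)

# Same example data as above
df <- data.frame(Genre = c("Comedy", "Drama", "Action", "Comedy", "Drama"),
                 Amount = c(10, 30, 40, 10, 20))

# group = 1 places all genres in a single group so geom "line" can join them;
# fun = sum (ggplot2 >= 3.3.0) sums Amount at each distinct Genre
p <- ggplot(df, aes(x = Genre, y = Amount, group = 1)) +
  stat_summary(fun = sum, geom = "line") +
  stat_summary(fun = sum, geom = "point")
print(p)
```

Without group = 1, each genre forms its own group of one summarised point, and the line layer has nothing to connect.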