I am new to R and am having some difficulty plotting an exponential curve using ggplot2. My data are shown below.
DATA
X Y x y
1 0.6168111 37.20637 0.6168111 37.20637
2 0.5478698 24.17084 0.5478698 24.17084
3 0.6082697 26.21261 0.6082697 26.21261
4 0.6094899 26.14065 0.6094899 26.14065
5 0.6095040 38.56314 0.6095040 38.56314
6 0.6933108 36.78443 0.6933108 36.78443
7 0.5796637 27.82840 0.5796637 27.82840
8 0.4716866 30.63080 0.4716866 30.63080
9 0.5291792 29.78255 0.5291792 29.78255
10 1.2520000 33.12657 1.2520000 33.12657
11 1.2260000 31.81066 1.2260000 31.81066
12 1.2690000 59.91388 1.2690000 59.91388
13 1.2060000 49.92380 1.2060000 49.92380
14 1.0760000 29.67524 1.0760000 29.67524
15 0.9750000 27.43602 0.9750000 27.43602
16 1.1470000 35.34598 1.1470000 35.34598
17 1.1080000 32.75476 1.1080000 32.75476
18 0.8854048 26.20000 0.8854048 26.20000
19 0.8965901 31.80000 0.8965901 31.80000
20 0.6240262 31.50000 0.6240262 31.50000
21 0.7968513 40.20000 0.7968513 40.20000
22 0.8635455 30.90000 0.8635455 30.90000
23 0.7414680 31.50000 0.7414680 31.50000
24 0.8701420 30.80000 0.8701420 30.80000
25 0.7312760 28.90000 0.7312760 28.90000
26 1.7313667 49.70000 1.7313667 49.70000
27 1.5730064 35.00000 1.5730064 35.00000
28 2.0033461 33.10000 2.0033461 33.10000
29 1.4110183 34.90000 1.4110183 34.90000
30 1.5826836 50.50000 1.5826836 50.50000
31 1.8019046 39.80000 1.8019046 39.80000
32 1.4689220 33.30000 1.4689220 33.30000
33 1.7568460 33.10000 1.7568460 33.10000
34 1.4727440 37.90000 1.4727440 37.90000
35 0.8225826 24.90000 0.8225826 24.90000
36 0.6625028 32.30000 0.6625028 32.30000
37 0.5410429 30.10000 0.5410429 30.10000
38 0.7322787 28.70000 0.7322787 28.70000
39 0.6586351 29.80000 0.6586351 29.80000
40 0.3003746 29.70000 0.3003746 29.70000
41 0.3351484 25.10000 0.3351484 25.10000
42 0.3254572 24.20000 0.3254572 24.20000
43 0.3818777 24.90000 0.3818777 24.90000
44 0.3153609 30.10000 0.3153609 30.10000
When I fit several different models to this data, the model log(y) ~ x gives the best fit, based on a comparison of p-values.
CODE
linear.model <- lm(y ~ x, df)
log.model <- lm(log(y) ~ x, df)
exp.model <- lm(y ~ exp(x), df)
I would like to plot this data and this fit using ggplot and geom_smooth.
testPlot <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ exp(x), se = FALSE, color = 1) +
  geom_smooth(method = "lm", formula = log(y) ~ x, se = FALSE, color = 2)
The line plotted with the y ~ exp(x) model (black) appears correct, but the line for log(y) ~ x (red) does not give the expected result. How can I overlay the line for the log(y) ~ x model correctly?
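As rightly mentioned in the comments, log(y) only ranges from about 3.19 to 4.09, while y itself ranges from about 24 to 60, so the fitted values from log(y) ~ x sit far below the points when drawn on the y axis. I think you simply need to bring the fitted values back to the same scale as y with exp() before plotting, so try this: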
library(ggplot2)

df <- read.csv("test.csv")

linear.model <- lm(y ~ x, df)
log.model <- lm(log(y) ~ x, df)
exp.model <- lm(y ~ exp(x), df)

# Back-transform the log-scale fitted values to the scale of y
log.model.df <- data.frame(x = df$x,
                           y = exp(fitted(log.model)))

ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", aes(color = "Exp Model"), formula = y ~ exp(x), se = FALSE, linetype = 1) +
  # On ggplot2 >= 3.4.0, use linewidth = 1 instead of size = 1 for lines
  geom_line(data = log.model.df, aes(x, y, color = "Log Model"), size = 1, linetype = 2) +
  guides(color = guide_legend("Model Type"))
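
If you want to check the scale mismatch yourself, or draw the back-transformed curve on an evenly spaced grid rather than only at the observed x values, something like the following might help. It is only a sketch: the small data frame is an illustrative subset of six rows copied from the question so that it runs without test.csv, and the names grid and p are my own, not from the original code.

```r
library(ggplot2)

# Illustrative subset of the question's data (rows 1, 8, 12, 28, 30, 40),
# used here only so the sketch runs without test.csv
df <- data.frame(
  x = c(0.6168111, 0.4716866, 1.2690000, 2.0033461, 1.5826836, 0.3003746),
  y = c(37.20637, 30.63080, 59.91388, 33.10000, 50.50000, 29.70000)
)

log.model <- lm(log(y) ~ x, data = df)

# The fitted values are on the log scale, far below the raw y values
range(fitted(log.model))
range(df$y)

# Predict on an evenly spaced grid of x, then back-transform with exp()
grid <- data.frame(x = seq(min(df$x), max(df$x), length.out = 100))
grid$y <- exp(predict(log.model, newdata = grid))

# geom_line() picks up x and y from the grid via the inherited aes()
p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_line(data = grid, colour = "red")
print(p)
```

With the full data set, replace the small data frame with your own df; the rest should work unchanged as long as the columns are still called x and y.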