Plot a best fit line in R

londwwq1 · Aug 13, 2014 · Viewed 51.1k times

Right now I have a large data set in which the temperature goes up and down all the time. I want to smooth my data and plot a best fit line through all the temperatures.

Here is the data:

weather.data  
    date        mtemp   
1   2008-01-01  12.9        
2   2008-01-02  12.9        
3   2008-01-03  14.5        
4   2008-01-04  15.7            
5   2008-01-05  17.0        
6   2008-01-06  17.8    
7   2008-01-07  20.2        
8   2008-01-08  20.8        
9   2008-01-09  21.4        
10  2008-01-10  20.8        
11  2008-01-11  21.4        
12  2008-01-12  22.0        

and so on, until 2009-12-31.

My current graph looks like this, and I would like to fit a smoothed curve to the data, using either a running average or LOESS:

[Image: plot of the temperature data]

However, when I tried to fit it with the running average, it became like this:

[Image: plot after fitting the running average]

Here is my code.

plot(weather.data$date,weather.data$mtemp,ylim=c(0,30),type='l',col="orange")
par(new=TRUE)

Could anyone give me a hand?

Answer

nico · Aug 13, 2014

There are various options, depending on your actual data, how you want to smooth it, and why you want to smooth it.

I am showing examples using linear regression (first and second order) and local regression (LOESS). These may or may not be appropriate statistical models for your data, but that is difficult to tell without seeing it. In any case:

# Simulated data: a quadratic trend plus Gaussian noise
time <- 0:100
temp <- 20 + 0.01 * time^2 + 0.8 * time + rnorm(101, 0, 5)

# Generate first order linear model
lin.mod <- lm(temp ~ time)

# Generate second order linear model
lin.mod2 <- lm(temp ~ I(time^2) + time)

# Calculate local regression
loess.mod <- loess(temp ~ time)

# Predict the data (passing only the model runs the prediction
# on the data points used to generate the model itself)
pr.lm <- predict(lin.mod)
pr.lm2 <- predict(lin.mod2)
pr.loess <- predict(loess.mod)

par(mfrow=c(2,2))
plot(time, temp, "l", las=1, xlab="Time", ylab="Temperature")
lines(pr.lm~time, col="blue", lwd=2)

plot(time, temp, "l", las=1, xlab="Time", ylab="Temperature")
lines(pr.lm2~time, col="green", lwd=2)

plot(time, temp, "l", las=1, xlab="Time", ylab="Temperature")
lines(pr.loess~time, col="red", lwd=2)
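
The comment on predict() above implies a second way to call it: passing newdata evaluates the fitted curve at points of your choosing rather than at the original observations. Here is a minimal sketch of that, reusing the same simulated data. The set.seed() call, the grid data frame, and the span value of 0.5 are illustrative assumptions, not tuned for any real data set.

```r
set.seed(42)  # assumption: fixed seed so the simulated noise is reproducible
time <- 0:100
temp <- 20 + 0.01 * time^2 + 0.8 * time + rnorm(101, 0, 5)

# Fit the quadratic model and a LOESS curve. span (default 0.75) is the
# fraction of points used in each local fit; smaller values follow the
# data more closely.
lin.mod2 <- lm(temp ~ I(time^2) + time)
loess.mod <- loess(temp ~ time, span = 0.5)

# Evaluate both fits on a finer grid. The column name must match the
# predictor name used in the model formula.
grid <- data.frame(time = seq(0, 100, by = 0.5))
pr.lm2.grid <- predict(lin.mod2, newdata = grid)
pr.loess.grid <- predict(loess.mod, newdata = grid)

plot(time, temp, type = "l", las = 1, xlab = "Time", ylab = "Temperature")
lines(grid$time, pr.lm2.grid, col = "green", lwd = 2)
lines(grid$time, pr.loess.grid, col = "red", lwd = 2)
```

Note that loess() does not extrapolate by default, so grid points outside the range of the original data would come back as NA.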

Another option would be to use a moving average.

For instance:

library(zoo)
mov.avg <- rollmean(temp, 5, fill=NA)
plot(time, temp, "l")
lines(time, mov.avg, col="orange", lwd=2)
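
If you would rather not depend on zoo, a centered moving average can also be sketched in base R with stats::filter(). This is an alternative I am suggesting, assuming the same simulated data as above; the window size k is an arbitrary choice, and the result should match rollmean(temp, 5, fill=NA).

```r
set.seed(1)  # assumption: fixed seed so the simulated noise is reproducible
time <- 0:100
temp <- 20 + 0.01 * time^2 + 0.8 * time + rnorm(101, 0, 5)

# Centered k-point moving average: k equal weights of 1/k, and
# sides = 2 centers the window on each point
k <- 5
mov.avg <- stats::filter(temp, rep(1 / k, k), sides = 2)

# The first and last (k - 1) / 2 values are NA because the window
# does not fit entirely inside the data there
head(mov.avg, 3)

plot(time, temp, type = "l")
lines(time, as.numeric(mov.avg), col = "orange", lwd = 2)
```

A wider window gives a smoother line at the cost of more NA values at both ends.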

[Image: examples of smoothing]
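
To connect this back to the data in the question, here is a rough sketch using only the twelve rows shown there. It assumes the date column is stored as text (or a factor) and needs converting; the day column and the fit and smooth names are hypothetical helpers introduced for illustration. It uses lines() to add the smoothed curve, since par(new=TRUE) followed by a second plot() draws a new set of axes on top of the first, and the two curves then no longer share a scale.

```r
# The twelve rows shown in the question; in practice use the full data set
weather.data <- data.frame(
  date = c("2008-01-01", "2008-01-02", "2008-01-03", "2008-01-04",
           "2008-01-05", "2008-01-06", "2008-01-07", "2008-01-08",
           "2008-01-09", "2008-01-10", "2008-01-11", "2008-01-12"),
  mtemp = c(12.9, 12.9, 14.5, 15.7, 17.0, 17.8,
            20.2, 20.8, 21.4, 20.8, 21.4, 22.0)
)

# Assumption: date is text or a factor, so convert it to class Date
weather.data$date <- as.Date(as.character(weather.data$date))

# loess() needs a numeric predictor: use days since the first observation
weather.data$day <- as.numeric(weather.data$date - min(weather.data$date))

# Local regression on the day index, predicted at the observed days
fit <- loess(mtemp ~ day, data = weather.data)
weather.data$smooth <- predict(fit)

# Draw the raw series, then add the smoothed curve on the same axes
plot(weather.data$date, weather.data$mtemp, type = "l", col = "orange",
     ylim = c(0, 30), xlab = "Date", ylab = "Mean temperature")
lines(weather.data$date, weather.data$smooth, col = "blue", lwd = 2)
```

With two full years of daily values, the default span of 0.75 will probably smooth away most of the seasonal shape, so you may want to pass a smaller span to loess().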