Using "predict" in nls

Jarrod picture Jarrod · Dec 21, 2013 · Viewed 7.3k times · Source

I have data from the USGS Nation Water Data website. I am currently trying to plot and fit curves to the data to use in prediction for different measurements taken within the dataset (dissolved oxygen, pH, gage height and temp) all in relation to discharge rate. I used the "nls" command and I am using a book of equations to find which curve to use...for this example I have specifically used Schumacher’s equation (p.48 in the book).

Find the link to data:

curve book: http://www.for.gov.bc.ca/hfd/pubs/docs/bio/bio04.htm

data I used: http://waterdata.usgs.gov/mi/nwis/uv?referred_module=qw&search_station_nm=River%20Rouge%20at%20Detroit%20MI&search_station_nm_match_type=anywhere&index_pmcode_00065=1&index_pmcode_00060=1&index_pmcode_00300=1&index_pmcode_00400=1&index_pmcode_00095=1&index_pmcode_00010=1&group_key=NONE&sitefile_output_format=html_table&column_name=agency_cd&column_name=site_no&column_name=station_nm&range_selection=date_range&begin_date=2013-11-18&end_date=2013-12-18&format=html_table&date_format=YYYY-MM-DD&rdb_compression=file&list_of_search_criteria=search_station_nm,realtime_parameter_selection

My problem is that I CANNOT get nls to predict new values once I picked a curve coded it... I also can't quite figure out how to plot it...I'm guessing this can be with the residuals? In the code I have used "aggregate" to extract means of the listed measurements and the corresponding discharge rates, now I just need to get R to predict for me. I got as far as getting what I think are fitted values... but I'm not sure and I hit a wall with "?nls."

##Create new dataframes with means given date for each constituent
ph <- aggregate(Discharge~pH, data=River.Data, mean)

##pH models
pH <- ph$pH
disch <- ph$Discharge
phm <- nls(disch~exp(a+(b/pH)), data=ph, trace=T, start=list(a=-47.06 ,b=400.2))
newph<- data.frame(ph=c(3.0,4.0,5.0,6.0,7.0,8.0,9.0))
predict(phm, newdata=newph)

Answer

jlhoward picture jlhoward · Dec 21, 2013

Seems like you've already got your answer(??), but:

ph    <- aggregate(Discharge~pH, data=River.Data, mean)
phm   <- nls(Discharge~exp(a+(b/pH)), data=ph, trace=T, start=list(a=-47.06 ,b=400.2))
newph <- data.frame(pH=seq(3,9,by=0.1))
Discharge.pred <- predict(phm, newdata=newph)

plot(ph$pH, ph$Discharge, xlim=c(3,9), ylim=c(0,1000))
par(new=t)
plot(newph$pH,Discharge.pred, xlab="", ylab="", axes=F, xlim=c(3,9), ylim=c(0,1000), type="l")

The problem is that your data is for pH in [7.5,8.2] but you are trying to predict in [3,9]. The model you've chosen is not stable for pH that far outside the range.