SpatialPoints and SpatialPointsDataFrame

r sp
Stophface · Aug 26, 2015 · Viewed 13.3k times

I am working with the sp package in R and wonder when I would use SpatialPoints and when SpatialPointsDataFrame. It seems to me that there is not much difference.

Is the only difference that a SpatialPointsDataFrame can store additional attributes? If so, can I create a SpatialPointsDataFrame from an existing data.frame (if the coordinates already exist in it) without taking the detour of creating SpatialPoints first?

Answer

jlhoward · Aug 26, 2015

Both SpatialPoints and SpatialPointsDataFrame objects are S4 objects. It is true that the main structural difference is that the latter has an extra slot containing the attribute data. However, the practical differences are more significant. Here are a few examples, using the built-in meuse dataset from package sp, which contains geocoded contaminant data from the floodplain of the river Meuse.

library(sp)
data(meuse)
class(meuse)        # a data.frame
# [1] "data.frame"
head(meuse[,1:5])   # first 5 columns
#        x      y cadmium copper lead
# 1 181072 333611    11.7     85  299
# 2 181025 333558     8.6     81  277
# 3 181165 333537     6.5     68  199
# 4 181298 333484     2.6     81  116
# 5 181307 333330     2.8     48  117
# 6 181390 333260     3.0     61  137

coordinates(meuse) <- 1:2     # convert to spDF object; use first 2 columns (x, y) as coordinates
class(meuse)                  # now a SpatialPointsDataFrame
# [1] "SpatialPointsDataFrame"
# attr(,"package")
# [1] "sp"
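To address the question directly: the detour through SpatialPoints is optional. As a minimal sketch (the names `env`, `df`, `pts`, `spdf`, and `spdf2` are illustrative, and the data is loaded into a scratch environment so the `meuse` object above is left untouched), the two classes can be built and compared like this:

```r
library(sp)

# Load a fresh copy of the raw data.frame without overwriting `meuse`
env <- new.env()
data(meuse, package = "sp", envir = env)
df <- env$meuse

# Geometry only: coordinates, bounding box and CRS, but no attribute table
pts <- SpatialPoints(df[, c("x", "y")])
slotNames(pts)

# Geometry plus attributes, built in one step with the constructor
spdf <- SpatialPointsDataFrame(coords = df[, c("x", "y")],
                               data   = df[, c("cadmium", "copper", "lead")])
slotNames(spdf)                 # includes an additional "data" slot

# Equivalent shortcut: promote a data.frame in place via a formula
spdf2 <- df
coordinates(spdf2) <- ~ x + y
class(spdf2)
```

The formula form does the same job as `coordinates(meuse) <- 1:2` above, but names the coordinate columns explicitly instead of relying on their position.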

Even though meuse is now a SpatialPointsDataFrame, we can still index it as if it were a simple data.frame. Notice how we refer to the lead column of the attribute table with $, and how row and column indexing works just as it does for a data.frame.

meuse[meuse$lead>500,1:5]        # high lead
#         coordinates cadmium copper lead zinc elev
# 55 (179973, 332255)    12.0    117  654 1839 7.90
# 60 (180100, 332213)    10.9     90  541 1571 6.68
meuse[meuse$lead<40,1:5]         # low lead
#              coordinates cadmium copper lead zinc  elev
# 112 (180328, 331158)     0.4     20   39  113 9.717
# 161 (180201, 331160)     0.8     18   37  126 9.036
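The subsets returned above keep their spatial class, and the two parts of the object can still be pulled apart when needed. A small sketch, working on a fresh copy of the data (the names `env`, `spdf`, and `high` are illustrative):

```r
library(sp)

# Fresh copy of the data, converted under a separate name
env <- new.env()
data(meuse, package = "sp", envir = env)
spdf <- env$meuse
coordinates(spdf) <- ~ x + y

high <- spdf[spdf$lead > 500, ]   # row subset, data.frame style
class(high)                       # the subset is still a SpatialPointsDataFrame
coordinates(high)                 # matrix of x/y for the selected points
high@data[, c("lead", "zinc")]    # plain data.frame holding the attributes
```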

We can also use the plot method for SpatialPointsDataFrames to plot the data.

par(mfrow=c(1,2), mar=c(2,2,2,2))    # 1 X 2 grid of plots; narrow margins
plot(meuse, pch=20, main="Full Dataset", axes=TRUE)
plot(meuse, 
     bg=rev(heat.colors(5))[cut(meuse$lead,breaks=c(0,100,200,300,400,Inf),labels=FALSE)],
     col="grey",main="Lead Distribution", pch=21, axes=TRUE)
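The fill colours in the second plot come from binning the lead values with cut() and using the bin number as an index into a five-colour palette. A small sketch of that mapping, using a handful of made-up lead values (the vector `lead` and the names `breaks`, `pal`, and `idx` are illustrative only):

```r
breaks <- c(0, 100, 200, 300, 400, Inf)
pal    <- rev(heat.colors(5))          # pale yellow (low) to red (high)

lead <- c(39, 116, 299, 541)           # illustrative values, not the full dataset
idx  <- cut(lead, breaks = breaks, labels = FALSE)
idx
# [1] 1 2 3 5

pal[idx]                               # one fill colour per point
```

With labels=FALSE, cut() returns the integer bin codes rather than a factor, which is what makes the direct indexing into the palette work.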

And we can transform the coordinates into something more useful (lon/lat).

library(rgdal)    # provided the spTransform() methods in 2015; rgdal was retired from CRAN in 2023
proj4string(meuse) <- CRS("+init=epsg:28992")                   # set original projection
meuse <- spTransform(meuse, CRS("+proj=longlat +datum=WGS84"))  # transform to lon/lat
plot(meuse, pch=20, main="Full Dataset", axes=TRUE)
plot(meuse, 
     bg=rev(heat.colors(5))[cut(meuse$lead,breaks=c(0,100,200,300,400,Inf),labels=FALSE)],
     col="grey",main="Lead Distribution", pch=21, axes=TRUE)
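One way to check that the transformation did what was intended is to compare the object before and after. This is only a sketch under assumptions: it works on a fresh copy of the data (names `env`, `spdf`, and `ll` are illustrative) and assumes a backend for spTransform() is available, which in recent sp versions means the sf package being installed rather than rgdal.

```r
library(sp)

# Fresh copy of the data, converted under a separate name
env <- new.env()
data(meuse, package = "sp", envir = env)
spdf <- env$meuse
coordinates(spdf) <- ~ x + y

proj4string(spdf) <- CRS("+init=epsg:28992")   # declare the original projection
is.projected(spdf)                             # is the CRS planar (metres)?
bbox(spdf)                                     # extent in the original units

ll <- spTransform(spdf, CRS("+proj=longlat +datum=WGS84"))
is.projected(ll)                               # geographic CRS after the transform?
bbox(ll)                                       # extent in degrees lon/lat
```

Note that assigning proj4string only declares which coordinate system the numbers are already in; it is spTransform() that actually recomputes the coordinates.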

And finally a counter-example, overlaying the points onto a Google map:

library(ggmap)    # loads ggplot2 as well
map <- get_map(location=rowMeans(bbox(meuse)), zoom=13)   # get Google map
ggmap(map) + 
  geom_point(data=as.data.frame(meuse), aes(x,y,fill=lead), 
             color="grey70", size=3.5, shape=21)+
  scale_fill_gradientn(colours=rev(heat.colors(5)))

What we've done here, in essence, is convert meuse from a data.frame to a SpatialPointsDataFrame so that we could use spTransform(...) on the coordinates, then convert the result back to a data.frame so that we can use ggplot to overlay the points onto a Google map.
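The round trip itself can be sketched without the map, again on a fresh copy of the data (the names `env`, `df`, `spdf`, and `back` are illustrative):

```r
library(sp)

# Fresh copy of the raw data.frame
env <- new.env()
data(meuse, package = "sp", envir = env)
df <- env$meuse

spdf <- df
coordinates(spdf) <- ~ x + y          # data.frame -> SpatialPointsDataFrame

back <- as.data.frame(spdf)           # SpatialPointsDataFrame -> data.frame
class(back)
head(back[, c("x", "y", "lead")])     # coordinates are ordinary columns again
```

Because the coordinates come back as regular columns, the result can be passed to any function that expects a plain data.frame, which is what the ggmap call above relies on.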