GeoJSON and MongoDB: Is it worth it to store points as GeoJSON.Point?

nevi_me · Apr 21, 2013

Since version 2.3, MongoDB has become even more useful for handling and querying location data. MongoDB stores documents as BSON, so every document carries its own field names, which can lead to larger databases than a conventional RDBMS.

I used to store polylines and polygons as a series of indexed points, with an extra field representing the order of each line (I was doing this to ensure consistency as I use JavaScript, so points weren't always stored in their correct order). It was something like this:

polyline: [
  {
    point: [0,0],
    order: 0
  },
  {
    point: [0,1],
    order: 1
  }
]

Whereas now I use:

polyline: {
  type: 'LineString',
  coordinates: [
    [0,0],
    [1,0]
  ]
}

I've seen an improvement in the size of documents, as some polylines can have up to 500 points.
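Moving from the older layout to the LineString layout can be sketched roughly as below. This is only a sketch under assumptions: the old entries are taken to be objects with a `point` stored as `[lat, lng]` and a numeric `order`, and the helper name `toLineString` is made up for illustration.

```javascript
// Hypothetical helper: convert ordered [lat, lng] entries into a GeoJSON LineString.
function toLineString(entries) {
  const coordinates = entries
    .slice()                                      // copy so the input array is not mutated
    .sort((a, b) => a.order - b.order)            // restore the intended point order
    .map(({ point: [lat, lng] }) => [lng, lat]);  // GeoJSON positions are [lng, lat]
  return { type: "LineString", coordinates };
}

// Entries deliberately out of order, as could happen when they are stored separately.
const polyline = toLineString([
  { point: [0, 1], order: 1 },
  { point: [0, 0], order: 0 },
]);
```

Once converted, the `order` field is no longer needed, because the position of each pair inside `coordinates` carries the ordering.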

However, I'm wondering what the benefits of storing all my Point data as GeoJSON would be. I am discouraged by the increase in document size, as for example:

loc: [1,0]

is far more compact than

loc: {
  type: 'Point',
  coordinates: [0,1]
}

and would thus be easier to work with.
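To put a rough number on the difference, the BSON size of both shapes can be estimated by hand. The sketch below follows the BSON layout for documents, arrays, strings and doubles only, and assumes every number is stored as an 8-byte double (as the mongo shell does by default). The function names are hypothetical, and the figures ignore `_id` and any other fields.

```javascript
const utf8Length = (s) => new TextEncoder().encode(s).length;

// Size of a BSON value (without its type byte and key), for a few types only.
function bsonValueSize(value) {
  if (typeof value === "number") return 8;                        // assumed double
  if (typeof value === "string") return 4 + utf8Length(value) + 1; // int32 length + bytes + NUL
  if (Array.isArray(value)) {
    // A BSON array is a document keyed by "0", "1", ...
    return bsonDocSize(Object.fromEntries(value.map((v, i) => [String(i), v])));
  }
  if (value !== null && typeof value === "object") return bsonDocSize(value);
  throw new TypeError("type not covered by this sketch");
}

// Size of a BSON document: int32 total length + elements + trailing NUL.
function bsonDocSize(doc) {
  let size = 4 + 1;
  for (const [key, value] of Object.entries(doc)) {
    size += 1 + utf8Length(key) + 1 + bsonValueSize(value); // type byte + key + NUL + value
  }
  return size;
}

const legacySize = bsonDocSize({ loc: [1, 0] });
const geoJsonSize = bsonDocSize({ loc: { type: "Point", coordinates: [0, 1] } });
const extraBytes = geoJsonSize - legacySize;
```

Under these assumptions the GeoJSON form costs 34 extra bytes per point (71 versus 37 bytes for the whole one-field document), which works out to roughly 34 MB per million documents before indexes and storage overhead.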

My question is:

Is it better/recommended to store points as GeoJSON objects as opposed to a two-element array?

What I have considered is the following:

  • Size constraints: I could potentially have millions of documents with a location, which might impact the size of the collection, and potentially my pocket.
  • Consistency: It would be better to deal with every set of coordinates in lng, lat format, as opposed to sticking to lat, lng for points and lng, lat for all my other location features.
  • Convenience: If I grab a point, and use a $geoWithin or $geoIntersects with it, I wouldn't need to convert it to GeoJSON first before using it as a query parameter.
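The convenience point can be illustrated with a small sketch. It assumes a collection whose documents hold a polygon in a field called `area` (a made-up name) and shows that a stored GeoJSON point drops straight into a `$geoIntersects` filter, whereas a legacy `[lat, lng]` pair needs wrapping first. Both helper names are hypothetical. Note that `$geoWithin` with `$geometry` expects a Polygon or MultiPolygon, so a point would only be used this way with `$geoIntersects`.

```javascript
// Hypothetical helper: wrap a legacy [lat, lng] pair as a GeoJSON Point.
function toGeoJSONPoint([lat, lng]) {
  return { type: "Point", coordinates: [lng, lat] }; // GeoJSON order is [lng, lat]
}

// Hypothetical helper: build a filter for documents whose `field` intersects `geometry`.
function intersectsFilter(field, geometry) {
  return { [field]: { $geoIntersects: { $geometry: geometry } } };
}

// A point already stored as GeoJSON can be passed through unchanged...
const stored = { loc: { type: "Point", coordinates: [28.04, -26.2] } };
const filter = intersectsFilter("area", stored.loc);

// ...while a legacy pair has to be converted before it can be used.
const legacyFilter = intersectsFilter("area", toGeoJSONPoint([-26.2, 28.04]));
```

Either filter object would then be handed to `find()` on the relevant collection.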

What I am unsure of is:

  • Whether support for loc: [x,y] will be dropped in a future version of MongoDB
  • Any indexing benefits from 2dsphere as opposed to 2d
  • Whether any planned GeoJSON additions to MongoDB might result in the need for the consistency mentioned above.

I'd rather move to GeoJSON while my data is still manageable, than switch in future under a lot of strain.

May I kindly ask for a well thought-out answer? I won't select a correct answer soon, so that I can evaluate any responses.

I'm also not sure if SO is the right place to pose the question, so if DBA is a more appropriate place I'll move the question there. I chose SO because there's a lot of MongoDB related activity here.

Answer

whostolebenfrog · May 28, 2013

I would recommend using the new GeoJSON format. Whilst I don't believe that any announcement has been made about dropping support for the old format, the fact that they refer to it as legacy should be an indication of their opinion.

There are some indexing benefits to using 2dsphere rather than 2d.

  • Firstly, it actually calculates queries based on the Earth being a sphere. One disadvantage of a 2d index is that it doesn't account for this, meaning that you will have to handle the conversion yourself if you are interested in the actual area covered by a query rather than the raw lat/lngs.
  • More flexible compound indexes: a 2dsphere index can be combined with other fields, so if you want to do something like "get me 100 results from this area, most recent first", 2dsphere is the practical choice.
  • The ability to use $geoIntersects queries.
  • The $geoWithin queries that take a $geometry require that you use the GeoJSON format.
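To make the first point concrete, here is a rough illustration of why flat lat/lng arithmetic misleads. It uses the haversine formula with a mean Earth radius of 6371 km (an approximation chosen for this sketch, not something MongoDB exposes) to compare the ground distance covered by one degree of longitude at two latitudes.

```javascript
const EARTH_RADIUS_KM = 6371; // mean Earth radius, an approximation

function toRadians(degrees) {
  return (degrees * Math.PI) / 180;
}

// Great-circle distance in kilometres between two [lng, lat] positions.
function haversineKm([lng1, lat1], [lng2, lat2]) {
  const dLat = toRadians(lat2 - lat1);
  const dLng = toRadians(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// One degree of longitude, measured at the equator and at 60 degrees north.
const atEquator = haversineKm([0, 0], [1, 0]);
const at60North = haversineKm([0, 60], [1, 60]);
```

The same one-degree step covers about 111 km at the equator but only about half that at 60 degrees north, which is the kind of correction a flat index leaves to you.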

One other important thing to note is that you need to be sure the query you are using is supported by the index you use. For example, a $box query cannot use a 2dsphere index. MongoDB will not warn you, however: the query will simply fall back to a full collection scan and be very slow!
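One way around the $box case is to express the box as a GeoJSON Polygon and query with $geometry instead. The following is a sketch only: `boxToPolygon` is a hypothetical helper, `loc` is an assumed field name, and because GeoJSON polygon edges are interpreted on a sphere, the matched region will not be exactly the same as the flat box for large areas.

```javascript
// Hypothetical helper: turn a legacy box ([[minLng, minLat], [maxLng, maxLat]])
// into a GeoJSON Polygon with a single closed ring.
function boxToPolygon([[minLng, minLat], [maxLng, maxLat]]) {
  return {
    type: "Polygon",
    coordinates: [[
      [minLng, minLat],
      [maxLng, minLat],
      [maxLng, maxLat],
      [minLng, maxLat],
      [minLng, minLat], // first and last positions must match to close the ring
    ]],
  };
}

// Legacy form: this shape is only index-backed by a 2d index.
const boxFilter = { loc: { $geoWithin: { $box: [[0, 0], [1, 1]] } } };

// GeoJSON form: this shape can be served by a 2dsphere index.
const polygonFilter = {
  loc: { $geoWithin: { $geometry: boxToPolygon([[0, 0], [1, 1]]) } },
};
```

Running the query through `explain()` is a reasonable way to confirm which index, if any, was actually used.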

MongoDB provides a compatibility chart of which queries can be used with which index.