Geospatial features of MongoDB


In the past few articles I wrote about how to convert data from MySQL to MongoDB and why someone would want to do it. This time, I will be discussing the spectacular geospatial features of MongoDB.

I have worked with MySQL a lot and I like it. For a classic application where you have rigid data, you can implement a database backend knowing that you probably won't have to modify your schema. To me, however, MySQL lacks some great geospatial features that MongoDB is very well equipped with. In this article I will explain some of them.

There are many ways of indexing geospatial data in MongoDB but in this article I'm going to be discussing two of them - 2d and 2dsphere. But before we talk about the actual indexes, let's talk about what type of location data you can store in MongoDB and most importantly, how.

There are two surface types available in MongoDB, Spherical and Flat. Spherical allows you to calculate geometry over Earth-like spheres, whereas the Flat surface type uses the Euclidean plane. It's important to note that if your data is Spherical you need to use the 2dsphere index, not the 2d index, which is for the Flat surface type.

Now it's time to store our objects. MongoDB has support for 3 types of GeoJSON objects: Points, LineStrings and Polygons. If you recall, I mentioned in a previous post that we are going to be working with public transportation. This means that first a particular person's location needs to be detected, and then the public transportation stops in the nearby area can be listed.

For the sake of simplification I have ported all the stops into one, separate collection. Here's a sample query returning one stop so you can see how the document in the collection is structured:

db.stops.find().limit(1).pretty();
{
	"_id" : ObjectId("52275591f0a49d6b3b8b93ad"),
	"stopID" : 1,
	"name" : "Dulceri",
	"gps" : {
		"type" : "Point",
		"coordinates" : [
			12.5386254,
			41.8839509
		]
	}
}
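
To make the document shape concrete, here is a minimal sketch of how a latitude/longitude pair could be turned into the GeoJSON Point stored under the gps key. The helper name toGeoJsonPoint and its range checks are my own assumptions for illustration, not part of MongoDB; the sample values are taken from the stop shown above.

```javascript
// Hypothetical helper (not a MongoDB API): builds a GeoJSON Point from a
// latitude/longitude pair after checking that both values are in range.
function toGeoJsonPoint(latitude, longitude) {
  if (!Number.isFinite(latitude) || latitude < -90 || latitude > 90) {
    throw new RangeError('latitude must be between -90 and 90');
  }
  if (!Number.isFinite(longitude) || longitude < -180 || longitude > 180) {
    throw new RangeError('longitude must be between -180 and 180');
  }
  // GeoJSON positions are ordered [longitude, latitude].
  return { type: 'Point', coordinates: [longitude, latitude] };
}

// A document shaped like the sample stop above (without the _id).
const stop = {
  stopID: 1,
  name: 'Dulceri',
  gps: toGeoJsonPoint(41.8839509, 12.5386254),
};

console.log(JSON.stringify(stop.gps));
```

The range check is there mainly to catch swapped arguments early: a latitude above 90 usually means the two values were passed in the wrong order.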

Time to add an index on the gps key, namely, the 2dsphere index that supports calculations on a sphere.

// Older shells call this helper ensureIndex; newer ones only offer createIndex.
db.stops.createIndex({ gps: '2dsphere' });

Just for completeness, let's check if the index was indeed added:

db.stops.getIndexes();
[
	{
		v: 1,
		key: {
			_id: 1,
		},
		ns: 'test.stops',
		name: '_id_',
	},
	{
		v: 1,
		key: {
			gps: '2dsphere',
		},
		ns: 'test.stops',
		name: 'gps_2dsphere',
	},
];

First, let's get the user's location using the HTML5 Geolocation API and print out the latitude and longitude values:

if (navigator.geolocation) {
	console.log('Geolocation API is supported!');
	window.onload = function () {
		navigator.geolocation.getCurrentPosition(
			function (position) {
				document.getElementById('lat').innerHTML = position.coords.latitude;
				document.getElementById('lon').innerHTML = position.coords.longitude;
			},
			function (error) {
				alert('Error occurred. Error code: ' + error.code);
			}
		);
	};
} else {
	console.log('Geolocation API is not supported.');
}
Lat: <span id="lat"></span> | Lon: <span id="lon"></span>

Now that we have the user's location, a search can be initiated to locate the nearest public transportation stops. This means we need to query the collection with the $near operator:

db.stops.find({
	gps: {
		$near: {
			$geometry: { type: 'Point', coordinates: [<lon>, <lat>] },
			$maxDistance: <meters>
		}
	}
});
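
As a rough sketch of how the browser's position could feed this query, the helper below builds the filter document from a Geolocation-style coords object. The name buildNearQuery is hypothetical, and the code only constructs a plain object; it does not talk to a database. It places $maxDistance inside the $near document, which is the form the current MongoDB documentation describes.

```javascript
// Hypothetical helper: turns a coords object ({ latitude, longitude }, as
// provided by the Geolocation API) into a filter document for find().
function buildNearQuery(coords, maxMeters) {
  return {
    gps: {
      $near: {
        $geometry: {
          type: 'Point',
          // Longitude first, then latitude.
          coordinates: [coords.longitude, coords.latitude],
        },
        $maxDistance: maxMeters,
      },
    },
  };
}

// Example input: the Colosseum, with a 200 meter search radius.
const colosseum = { latitude: 41.890169, longitude: 12.492269 };
const nearQuery = buildNearQuery(colosseum, 200);

console.log(JSON.stringify(nearQuery, null, 2));
// In the mongo shell the object could then be passed to db.stops.find(nearQuery).
```

Keeping the coordinate swap in one place means the rest of the application can keep working with latitude and longitude properties as the browser reports them.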

See this in action - let's assume that our user is somewhere near the Colosseum and we want to find all stops within 200 meters of it - the Colosseum's GPS coordinates are lon: 12.492269, lat: 41.890169:

It's very important to note that coordinates are specified in a longitude - latitude format/order!

db.stops.find({
	gps: {
		$near: {
			$geometry: { type: 'Point', coordinates: [12.492269, 41.890169] },
			$maxDistance: 200
		}
	}
}).pretty();
{
	"_id" : ObjectId("52275592f0a49d6b3b8b96cd"),
	"stopID" : 70744,
	"name" : "Celio Vibenna",
	"gps" : {
		"type" : "Point",
		"coordinates" : [
			12.4928042,
			41.8893081
		]
	}
}
{
	"_id" : ObjectId("52275592f0a49d6b3b8b9704"),
	"stopID" : 70816,
	"name" : "Celio Vibenna",
	"gps" : {
		"type" : "Point",
		"coordinates" : [
			12.4926191,
			41.8891265
		]
	}
}
{
	"_id" : ObjectId("52275591f0a49d6b3b8b9584"),
	"stopID" : 70339,
	"name" : "Colosseo/Salvi",
	"gps" : {
		"type" : "Point",
		"coordinates" : [
			12.4935037,
			41.8908301
		]
	}
}
{
	"_id" : ObjectId("52275595f0a49d6b3b8bb1d9"),
	"stopID" : 79524,
	"name" : "salvi n.",
	"gps" : {
		"type" : "Point",
		"coordinates" : [
			12.492466647743,
			41.891388486684
		]
	}
}
{
	"_id" : ObjectId("52275591f0a49d6b3b8b9585"),
	"stopID" : 70340,
	"name" : "Colosseo",
	"gps" : {
		"type" : "Point",
		"coordinates" : [
			12.4914758,
			41.8912735
		]
	}
}
{
	"_id" : ObjectId("52275592f0a49d6b3b8b972d"),
	"stopID" : 70865,
	"name" : "colosseo/salvi n.",
	"gps" : {
		"type" : "Point",
		"coordinates" : [
			12.493945,
			41.88964
		]
	}
}
{
	"_id" : ObjectId("52275595f0a49d6b3b8bb553"),
	"stopID" : 90037,
	"name" : "Colosseo",
	"gps" : {
		"type" : "Point",
		"coordinates" : [
			12.4915001,
			41.8914366
		]
	}
}
{
	"_id" : ObjectId("52275592f0a49d6b3b8b975c"),
	"stopID" : 70940,
	"name" : "Colosseo",
	"gps" : {
		"type" : "Point",
		"coordinates" : [
			12.494191,
			41.889942
		]
	}
}
{
	"_id" : ObjectId("52275591f0a49d6b3b8b95fc"),
	"stopID" : 70479,
	"name" : "Colosseo",
	"gps" : {
		"type" : "Point",
		"coordinates" : [
			12.4906572,
			41.891276
		]
	}
}

As you can see, there are quite a few hits. This is great, but what if I want more diagnostic information, such as which stop is the closest and approximately how far away it is? Behold $geoNear, a stage of the aggregation framework. Let's rework the above query.

There is something important to note here with regard to the $geoNear stage. While reading the documentation I came across an example which looks like this:

db.places.aggregate([
	{
		$geoNear: {
			near: [40.724, -73.997],
			distanceField: 'dist.calculated',
			maxDistance: 0.008,
			query: { type: 'public' },
			includeLocs: 'dist.location',
			uniqueDocs: true,
			num: 5,
		},
	},
]);

So I put together a very similar query to run against my collection:

db.stops.aggregate([
	{
		$geoNear: {
			near: [12.492269, 41.890169],
			distanceField: 'distance',
			limit: 3,
		},
	},
]);

Only to realise that it fails with the following error message:

Error: Printing Stack Trace
		at printStackTrace (src/mongo/shell/utils.js:37:15)
		at DBCollection.aggregate (src/mongo/shell/collection.js:897:9)
		at (shell):1:10
Thu Sep 12 18:57:15.582 aggregate failed: {
	"errmsg" : "exception: geoNear command failed: { ns: \"test.stops\", errmsg: \"exception: geoNear on 2dsphere index requires spherical\", code: 16683, ok: 0.0 }",
	"code" : 16604,
	"ok" : 0
} at src/mongo/shell/collection.js:898

That's strange at first sight. The solution is to add the spherical: true option, which is required here because I'm using a 2dsphere index; by default it's set to false.

db.stops.aggregate([ { $geoNear: { near: [12.492269, 41.890169], distanceField: "distance", spherical: true, limit: 3 } } ]);

{
	"result" : [
		{
			"_id" : ObjectId("52275592f0a49d6b3b8b96cd"),
			"stopID" : 70744,
			"name" : "Celio Vibenna",
			"gps" : {
				"type" : "Point",
				"coordinates" : [
					12.4928042,
					41.8893081
				]
			},
			"distance" : 0.000016556852802056673
		},
		{
			"_id" : ObjectId("52275592f0a49d6b3b8b9704"),
			"stopID" : 70816,
			"name" : "Celio Vibenna",
			"gps" : {
				"type" : "Point",
				"coordinates" : [
					12.4926191,
					41.8891265
				]
			},
			"distance" : 0.000018754470809543685
		},
		{
			"_id" : ObjectId("52275591f0a49d6b3b8b9584"),
			"stopID" : 70339,
			"name" : "Colosseo/Salvi",
			"gps" : {
				"type" : "Point",
				"coordinates" : [
					12.4935037,
					41.8908301
				]
			},
			"distance" : 0.00001976050947245558
		}
	],
	"ok" : 1
}

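The query returns the 3 closest stops, each with an added distance key, and the result set is automatically ordered by distance, with the closest stop to the current location first. Because near was given as a plain coordinate pair together with spherical: true, the distances are expressed in radians; multiplying by the Earth's radius (roughly 6,378.1 km) turns the first value into about 106 meters.

To read more about the geo features in MongoDB I recommend this blog post.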
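
The distance values in this result are very small numbers because a plain coordinate pair combined with spherical: true yields radians. Below is a minimal sketch of how such a value could be converted to meters and cross-checked with a haversine calculation. The function names, the Earth radius constant and the pipeline builder are my own assumptions for illustration; the builder reflects my understanding that a GeoJSON point passed as near makes $geoNear report meters directly, and that newer servers expect a separate $limit stage.

```javascript
// Approximate equatorial radius of the Earth in meters. The exact value is
// an assumption of this sketch, not something the server returns.
const EARTH_RADIUS_METERS = 6378100;

// Converts an angular distance in radians to meters on the sphere.
function radiansToMeters(radians) {
  return radians * EARTH_RADIUS_METERS;
}

// Haversine central angle (in radians) between two [longitude, latitude] pairs.
function centralAngle([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * Math.asin(Math.sqrt(a));
}

// Hypothetical builder for a pipeline that uses a GeoJSON point for `near`
// and a separate $limit stage instead of the `limit` option.
function geoNearPipeline(longitude, latitude, limit) {
  return [
    {
      $geoNear: {
        near: { type: 'Point', coordinates: [longitude, latitude] },
        distanceField: 'distance',
        spherical: true,
      },
    },
    { $limit: limit },
  ];
}

// Distance reported for the closest stop in the result above.
const reported = 0.000016556852802056673;
console.log('reported distance in meters:', radiansToMeters(reported));

// Cross-check: recompute the angle from the two coordinate pairs.
const colosseumPoint = [12.492269, 41.890169];
const celioVibenna = [12.4928042, 41.8893081];
console.log('recomputed angle:', centralAngle(colosseumPoint, celioVibenna));

const pipeline = geoNearPipeline(12.492269, 41.890169, 3);
console.log(JSON.stringify(pipeline));
```

In the mongo shell the array returned by geoNearPipeline could be handed to db.stops.aggregate(); whether the limit belongs inside $geoNear or in its own stage depends on the server version you run.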