Skip to main content

Retrieve only queried element in an object array in MongoDB collection

3 min read

Older Article

This article was published 7 years ago. Some information may be outdated or no longer applicable.

Intro

Quick tip time. Let’s pull back only the matched element from an array of objects inside a MongoDB document.

Document-based NoSQL databases tend to store nested arrays of objects. Something like this:

{
  "name": "John",
  "cars": [{
    "make": "Ford",
    "colour": "red"
  }, {
    "make": "Mercedes",
    "colour": "blue"
  }]
}
{
  "name": "Susan",
  "cars": [{
    "make": "Fiat",
    "colour": "red"
  }, {
    "make": "BMW",
    "colour": "black"
  }]
}

Often the above is referred to as denormalisation since we place everything we know about a certain data entity into a single document object.

Each person has a cars property, which is an array of car objects, each carrying a make and a colour property.

Say we want documents where the car’s colour is “blue”. A standard find() hands us the entire document with every array item attached:

db.people.find({ 'cars.colour': 'blue' }).pretty();
{
  "_id" : ObjectId("5c4ab3bf996018528b1ef283"),
  "name" : "John",
  "cars" : [
    {
      "make" : "Ford",
      "colour" : "red"
    },
    {
      "make" : "Mercedes",
      "colour" : "blue"
    }
  ]
}

Throwing projection into the query still gives us both objects, not just the blue car we’re after:

db.people.find({ 'cars.colour': 'blue' }, { 'cars.colour': 1 }).pretty();
{
  "_id" : ObjectId("5c4ab3bf996018528b1ef283"),
  "cars" : [
    {
        "colour" : "red"
    },
    {
        "colour" : "blue"
    }
  ]
}

That might work sometimes. But we often need only the single matching object from the array (the blue car, nothing else).

To get there, we reach for MongoDB’s Aggregation. Think of it as a pipeline. Documents pass through multiple stages of transformation: grouping values, adding computed fields, even building virtual sub-objects.

Here’s the aggregate query:

db.people.aggregate([
  { $match: { 'cars.colour': 'blue' } },
  {
    $project: {
      cars: {
        $filter: {
          input: '$cars',
          as: 'cars',
          cond: { $eq: ['$$cars.colour', 'blue'] },
        },
      },
      _id: 0,
    },
  },
]);

The $match stage works exactly like before: grab documents where a car’s colour is “blue”. Then $project lets us pass documents to the next pipeline stage with specific fields (existing or computed). We rebuild the cars property with a condition in $filter: only keep cars with a blue colour. Notice the $$cars reference under cond, which points at the input field’s alias. The result is exactly what we wanted:

{ "cars": [{ "make": "Mercedes", "colour": "blue" }] }

Wondering how to wire this up in Node.js (or any other MongoDB SDK)? It’s the same query, dropped into a connection wrapper:

const MongoClient = require('mongodb').MongoClient;
const util = require('util');

(async function () {
  const url = 'mongodb://localhost:27017/your-db';
  const dbName = 'your-db';
  const client = new MongoClient(url, { useNewUrlParser: true });

  try {
    await client.connect();
    const db = client.db(dbName);
    const collection = db.collection('your-collection');
    const docs = await collection
      .aggregate([
        { $match: { 'cars.colour': 'blue' } },
        {
          $project: {
            cars: {
              $filter: {
                input: '$cars',
                as: 'cars',
                cond: { $eq: ['$$cars.colour', 'blue'] },
              },
            },
            _id: 0,
          },
        },
      ])
      .toArray();
    console.log(util.inspect(docs, { showHidden: false, depth: null }));
  } catch (err) {
    console.log(err.stack);
  }

  client.close();
})();

That’s all folks.