Securing a REST API

This post is 4 years old. (Or older!) Code samples may not work, screenshots may be missing and links could be broken. Although some of the content may be relevant please take it with a pinch of salt.

The source code for the example application discussed in this article - with setup instructions - is available on GitHub.

Application architecture

Applications using a three tiered architecture have been around for quite a while - LAMP stack anyone? Recently a great number of applications have started using JavaScript throughout the whole stack, which has added benefits for lean, agile and rapid application development.

A three tiered architecture is immensely useful because the presentation, application processing and data management functions are physically separated. In a usual setup the tiers communicate with each other over HTTP. Each tier has its own responsibility: the user interface is where you define your data views and your user workflow, the middle-tier is where you define the application logic and the business rules, and the database tier acts as persistent storage.

Using JavaScript throughout this stack really means that you have one data model (JSON) and one execution environment. This also means that you no longer need to worry about handling different data structures and mapping those data structures to a domain model in the application.

Custom REST APIs

The way these three tiered architectures work is that you define some service endpoints at the middle-tier level and based on those API endpoints, data is returned from a database and that data is transferred to the client (to a browser for example) over HTTP. This is fairly easy to do with Node.js and the MarkLogic Node.js Client API.

But what happens if you decide to secure an API endpoint so that only authenticated users can access it? Even though MarkLogic has a great security mechanism built into the database, you may want to exert more control over API endpoint access at the middle-tier and the browser tier.

MarkLogic also has a built-in security module which you can get familiar with by reading this Security Guide.

Authentication

Server-based authentication

Before we can get into the discussion of how to secure an API we need to understand the application architecture behind the API itself a bit more.

Think about classic, server-based authentication. Because the HTTP protocol is stateless there has to be a mechanism in place that stores user information, otherwise we would have to authenticate on each and every request that we make. User information is normally kept in a session on the server, the session is referenced by a cookie, and that information is serialised/deserialised on each request.

There are multiple issues with server-based authentication, including, but not limited to, scalability and complex session management.
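To make the session mechanism described above concrete, here is a minimal sketch of a server-side session store. It is not taken from the example application: the cookie name `sid` and the helper names are made up for illustration, and a real application would normally rely on a session library rather than a hand-rolled Map.

```javascript
const crypto = require('crypto');

// In-memory session store: session id -> user information.
// This state lives inside one server process, which is what makes scaling harder.
const sessions = new Map();

// Called after a successful login: remember the user and return the cookie value.
const createSession = (user) => {
  const sessionId = crypto.randomBytes(16).toString('hex');
  sessions.set(sessionId, { username: user.username, createdAt: Date.now() });
  return sessionId;
};

// Extract a named cookie from a raw Cookie header such as "theme=dark; sid=abc".
const readCookie = (cookieHeader, name) => {
  if (!cookieHeader) return undefined;
  for (const pair of cookieHeader.split(';')) {
    const [key, ...rest] = pair.trim().split('=');
    if (key === name) return rest.join('=');
  }
  return undefined;
};

// Called on every request: look the user up again based on the cookie.
const findUser = (cookieHeader) => {
  const sessionId = readCookie(cookieHeader, 'sid');
  return sessionId ? sessions.get(sessionId) : undefined;
};

const sid = createSession({ username: 'tamas' });
console.log(findUser(`theme=dark; sid=${sid}`)); // the session stored for 'tamas'
console.log(findUser('sid=unknown')); // undefined, no such session
```

Because the Map only exists in the memory of a single process, a second server behind a load balancer would not know about the session, which is one way the scalability problem shows up.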

Token based authentication

With token based authentication the server generates a token and does not need to keep any session state for it. Once a valid token is available it can be attached to any HTTP request as part of the header, and the actual authentication process becomes simple: each time a request is made, the server checks that it carries a valid token.

JWT

JWT (JSON Web Tokens - pronounced as 'jot') is an open standard that defines how information should be securely transmitted between two parties as a JSON object.

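A JWT consists of three parts, separated by dots (periods): the 'Header', the 'Payload' and the 'Signature'. The header and the payload are base64url-encoded JSON objects, and the signature is calculated over those two parts.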
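The three-part structure can be inspected with a few lines of Node.js. The sketch below is for illustration only: the helper names are made up, the sample token is assembled by hand with a placeholder signature, and decoding a token this way does not verify it.

```javascript
// base64url is base64 with '-' and '_' instead of '+' and '/', and without '=' padding.
const toBase64Url = (input) =>
  Buffer.from(input)
    .toString('base64')
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, '');

const fromBase64Url = (input) =>
  Buffer.from(input.replace(/-/g, '+').replace(/_/g, '/'), 'base64').toString('utf8');

// Split a token into its three dot-separated sections and decode the first two.
// Note: this only reads the token, it does NOT check the signature.
const decodeToken = (token) => {
  const parts = token.split('.');
  if (parts.length !== 3) throw new Error('A JWT must have exactly three parts');
  const [header, payload, signature] = parts;
  return {
    header: JSON.parse(fromBase64Url(header)),
    payload: JSON.parse(fromBase64Url(payload)),
    signature,
  };
};

// Hand-assembled sample token; the signature is just a placeholder string.
const sample = [
  toBase64Url(JSON.stringify({ alg: 'HS256', typ: 'JWT' })),
  toBase64Url(JSON.stringify({ username: 'tamas' })),
  'placeholder-signature',
].join('.');

console.log(decodeToken(sample));
```

Since anyone can decode the header and the payload like this, a JWT payload should not contain secrets such as passwords.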

Example

Let's take a look at a very simple example where we are going to be using two endpoints created using Node.js that will retrieve data from a MarkLogic database.

To set up the project please follow the setup instructions in the README file.

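// Express router and MarkLogic client setup.
// The connection details below are placeholders; use the values for your own environment.
const express = require('express');
const marklogic = require('marklogic');

const router = express.Router();
const qb = marklogic.queryBuilder;
const db = marklogic.createDatabaseClient({
  host: 'localhost',
  port: 8000,
  user: 'admin',
  password: 'admin',
  authType: 'digest',
});

// Return the names of the documents stored in the /character/ directory.
const characters = (req, res) => {
  db.documents
    .query(qb.where(qb.directory('/character/')).slice(0, 100))
    .result()
    .then((response) => {
      const characterNames = response.map((character) => character.content.name);
      res.json(characterNames);
    })
    .catch((error) => {
      console.log(error);
      // Always answer the request, otherwise the client waits until it times out.
      res.sendStatus(500);
    });
};

// Return the names of the documents stored in the /vehicle/ directory.
const vehicles = (req, res) => {
  db.documents
    .query(qb.where(qb.directory('/vehicle/')).slice(0, 100))
    .result()
    .then((response) => {
      const vehicleNames = response.map((vehicle) => vehicle.content.name);
      res.json(vehicleNames);
    })
    .catch((error) => {
      console.log(error);
      res.sendStatus(500);
    });
};

router.route('/api/characters').get(characters);
router.route('/api/vehicles').get(vehicles);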

This is a very straightforward setup to handle the incoming HTTP GET requests via our middle-tier. We can test it by using a tool such as Postman.

Securing an endpoint

Let's now secure one of the endpoints. In other words, let's restrict access to the /api/characters endpoint so that it only answers requests that send a valid JWT.

Creating a token

In order to access the secured endpoint a token needs to be sent with the HTTP request. As mentioned earlier a token has three parts: a 'Header', a 'Payload' and a 'Signature'. In a full-blown three tiered architecture the JWT would be created when a user registers or logs in from a UI. The created token is then stored somewhere on the client (most likely in session or local storage) and is used for all subsequent HTTP requests that require authentication.

For the purposes of our example we are going to be generating our token using a Node.js command line script, pretending that it has been created when someone logged in to our system.

The GitHub repository for this sample application contains a file that generates JWT tokens. All you need to do is run node create-token.js, which prints a token.

The script that generates the token simply inserts a JSON document into the MarkLogic database - mocking the behaviour of a user registration.

Note: Never store passwords in plain text in any document in the database. Hash them with a dedicated password hashing algorithm such as bcrypt, for example via the bcrypt-nodejs npm package.
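As a rough illustration of what hashing a password before storing it looks like, here is a sketch that swaps bcrypt for scrypt from Node's built-in crypto module, purely so that it runs without installing anything. The helper names and the "salt:hash" storage format are assumptions made for this example, not part of the sample application.

```javascript
const crypto = require('crypto');

// Hash a password with a random salt; store the returned string, never the password.
const hashPassword = (password) => {
  const salt = crypto.randomBytes(16);
  const hash = crypto.scryptSync(password, salt, 64);
  return `${salt.toString('hex')}:${hash.toString('hex')}`;
};

// Recompute the hash with the stored salt and compare the results in constant time.
const verifyPassword = (password, stored) => {
  const [saltHex, hashHex] = stored.split(':');
  const expected = Buffer.from(hashHex, 'hex');
  const actual = crypto.scryptSync(password, Buffer.from(saltHex, 'hex'), expected.length);
  return crypto.timingSafeEqual(actual, expected);
};

const stored = hashPassword('password');
console.log(verifyPassword('password', stored)); // true
console.log(verifyPassword('wrong', stored)); // false
```

The value returned by hashPassword() is what would go into the user document instead of the plain text password.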

Once the user has been "registered", we generate a token based on the user information - to be more precise, we are going to create a payload that contains the username.

When creating a token there are quite a few options that can be specified for the payload, including roles for example. In our example we also set the token to expire after one hour, so a token that belongs to a given user stops being valid an hour after it was issued.

Finally we also need a 'secret' to create the token.

Note: The secret for the token should never be stored inline in the code - ideally it should be stored as an environment variable and it should be a long, random string.
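To show what the secret and the expiration contribute, the sketch below builds and checks an HS256 token by hand using only Node's crypto module. It is meant as an illustration of the idea, not as a replacement for a maintained JWT library: the function names, the JWT_SECRET variable name and the fallback value are all assumptions made for this example.

```javascript
const crypto = require('crypto');

// JWT_SECRET is a hypothetical variable name; the fallback exists only so the sketch runs.
const secret = process.env.JWT_SECRET || 'demo-only-secret';

const base64Url = (data) =>
  Buffer.from(data).toString('base64').replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');

const fromBase64Url = (text) =>
  Buffer.from(text.replace(/-/g, '+').replace(/_/g, '/'), 'base64').toString('utf8');

// HMAC-SHA256 over "header.payload": this is where the secret comes in.
const hmac = (data, key) => base64Url(crypto.createHmac('sha256', key).update(data).digest());

// Build header.payload.signature, adding issued-at (iat) and expiry (exp) in seconds.
const sign = (claims, key, expiresInSeconds, now = Math.floor(Date.now() / 1000)) => {
  const header = base64Url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const payload = base64Url(JSON.stringify({ ...claims, iat: now, exp: now + expiresInSeconds }));
  return `${header}.${payload}.${hmac(`${header}.${payload}`, key)}`;
};

// Return the payload if the signature matches and the token has not expired yet.
const verify = (token, key, now = Math.floor(Date.now() / 1000)) => {
  const [header, payload, signature] = token.split('.');
  const expected = Buffer.from(hmac(`${header}.${payload}`, key));
  const received = Buffer.from(signature || '');
  if (expected.length !== received.length || !crypto.timingSafeEqual(expected, received)) {
    throw new Error('invalid signature');
  }
  const claims = JSON.parse(fromBase64Url(payload));
  if (claims.exp <= now) throw new Error('token expired');
  return claims;
};

const token = sign({ username: 'tamas' }, secret, 3600);
console.log(verify(token, secret).username); // 'tamas'
```

Without the secret nobody can produce a signature that verify() accepts, and once exp has passed the same token is rejected even though its signature is still correct. The article's own script, which follows, delegates these steps to jwt.sign().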

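// create-token.js: mock a user registration, then print a JWT for that user.
// jwt is assumed to be the jsonwebtoken npm package; connection details are placeholders.
const jwt = require('jsonwebtoken');
const marklogic = require('marklogic');

const db = marklogic.createDatabaseClient({
  host: 'localhost',
  port: 8000,
  user: 'admin',
  password: 'admin',
  authType: 'digest',
});

db.documents
  .write({
    uri: '/user/tamas.json',
    contentType: 'application/json',
    content: {
      name: 'tamas',
      // Plain text only to keep the demo short; hash real passwords (see the note above).
      password: 'password',
    },
  })
  .result()
  .then(() => db.documents.read('/user/tamas.json').result())
  .then((response) => {
    // Inline only for the demo; load the secret from an environment variable in real code.
    const secret = 's3cr3t';
    const expire = 3600; // a numeric expiresIn is interpreted as seconds, so one hour
    const tokenValue = { username: response[0].content.name };
    const token = jwt.sign(tokenValue, secret, { expiresIn: expire });
    console.log(token);
  })
  .catch((error) => {
    console.log(error);
  });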

Running the token generator script returns a token that looks similar to this: eyJhbGciOiJI[...]CJ9.eyJ1c2VybmF[...]4OTF9.gGiJD[...]Hg. Copy the token that is returned as it is going to be used soon.

Using the token

It's time to secure one of our endpoints. We would like to make sure that only people who have a valid token can access a certain resource in our Node.js middle tier application.

To achieve this we need to create a middleware in Express.js. A middleware is nothing more than a function that has access to the request and response objects. It can execute any code, make changes to those objects and also end the request-response cycle.

What we need is a middleware function that checks for the existence of a JWT and makes sure that the token itself is valid. If it's valid it should call the next middleware function (which is going to be the actual query to the database).

Believe it or not, creating such a function takes only a few lines:

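// jwt is assumed to be the jsonwebtoken npm package.
const jwt = require('jsonwebtoken');

const authenticate = (req, res, next) => {
  const authHeader = req.headers.authorization;
  if (authHeader) {
    // The header has the form "Bearer <token>", so the token is the second part.
    const token = authHeader.split(' ')[1];
    // The secret must be the same one that was used to sign the token.
    jwt.verify(token, 's3cr3t', (error, decoded) => {
      if (error) {
        // Invalid signature, malformed token or expired token.
        console.log(error);
        res.sendStatus(401);
      } else {
        // Make the username from the payload available to the next handler.
        req.username = decoded.username;
        next();
      }
    });
  } else {
    res.status(403).send({ message: 'No token provided.' });
  }
};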

First we make sure that the Authorization header arrives with the request; if it is missing we respond with a 403 status and a 'No token provided.' message. If it is present, we extract the JWT from it and run it through the jwt.verify() method. If the token is valid we add a new property to the request object (req.username) with the value of the username property that we encoded in the JWT and call next(); otherwise we send a 401 status back to the client.
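The three outcomes described above can be checked without starting a server by calling the middleware with hand-made request and response objects. The sketch below is an assumption-laden stand-in, not the application's code: makeAuthenticate, fakeVerify and makeRes are hypothetical helpers, and the fake verifier only imitates the callback shape of jwt.verify() so that the example has no dependencies.

```javascript
// Factory so the token check can be swapped out; verifyToken(token, callback) is a
// hypothetical stand-in with the same callback shape as jwt.verify() in the article.
const makeAuthenticate = (verifyToken) => (req, res, next) => {
  const authHeader = req.headers.authorization;
  if (!authHeader) {
    return res.status(403).send({ message: 'No token provided.' });
  }
  const token = authHeader.split(' ')[1];
  verifyToken(token, (error, decoded) => {
    if (error) return res.sendStatus(401);
    req.username = decoded.username;
    next();
  });
};

// Fake verifier: accepts exactly one token value.
const fakeVerify = (token, callback) =>
  token === 'good-token'
    ? callback(null, { username: 'tamas' })
    : callback(new Error('invalid token'));

// Minimal response double that records what the middleware did.
const makeRes = () => {
  const res = { statusCode: 200, body: undefined };
  res.status = (code) => { res.statusCode = code; return res; };
  res.send = (body) => { res.body = body; return res; };
  res.sendStatus = (code) => { res.statusCode = code; return res; };
  return res;
};

// Run the middleware once and report the outcome.
const run = (headers) => {
  const req = { headers };
  const res = makeRes();
  let nextCalled = false;
  makeAuthenticate(fakeVerify)(req, res, () => { nextCalled = true; });
  return { status: res.statusCode, nextCalled, username: req.username };
};

console.log(run({})); // status 403, next() not called
console.log(run({ authorization: 'Bearer bad-token' })); // status 401, next() not called
console.log(run({ authorization: 'Bearer good-token' })); // next() called, username 'tamas'
```

Only in the last case does the request reach the handler that queries the database.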

The final step is to add this middleware to the Express router:

router.route('/api/characters').get(authenticate, characters);

Now, when trying to run an HTTP GET request against the /api/characters endpoint a 'no token provided' message appears.

Sending our token in the Authorization header returns the data.

Remember, you need to add the Authorization header with the following structure: Authorization: Bearer [token] (the token is without the [ ] characters).
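For a client written in JavaScript, the header structure above could be produced and checked along these lines. This is only a sketch: authHeaders and extractBearerToken are made-up helper names, 'abc.def.ghi' is a placeholder token, and the URL in the comment assumes the middle-tier listens on localhost port 3000.

```javascript
// Build the headers for an authenticated request.
const authHeaders = (token) => ({ Authorization: `Bearer ${token}` });

// The server-side counterpart: accept only the "Bearer <token>" form.
const extractBearerToken = (headerValue) => {
  if (typeof headerValue !== 'string') return null;
  const [scheme, token, ...extra] = headerValue.trim().split(/\s+/);
  if (extra.length > 0 || !token || scheme.toLowerCase() !== 'bearer') return null;
  return token;
};

const headers = authHeaders('abc.def.ghi');
console.log(headers.Authorization); // Bearer abc.def.ghi
console.log(extractBearerToken(headers.Authorization)); // abc.def.ghi
console.log(extractBearerToken('Basic abc')); // null

// In a browser or a recent Node.js version the headers could be passed to fetch, e.g.:
// fetch('http://localhost:3000/api/characters', { headers: authHeaders(token) })
```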

Conclusion

Even though MarkLogic has some great built-in security options which allow for restricting access to documents, sometimes there may be a need to also secure REST API endpoints. In a three tiered architecture this can be challenging due to how authentication is done with Single Page Applications. JSON Web Tokens are one solution to this challenge.